Package | Description |
---|---|
org.apache.tika | Apache Tika. |
org.apache.tika.config | Tika configuration tools. |
org.apache.tika.detect | Media type detection. |
org.apache.tika.embedder | |
org.apache.tika.exception | Tika exception. |
org.apache.tika.extractor | Extraction of component documents. |
org.apache.tika.fork | Forked parser. |
org.apache.tika.io | IO utilities. |
org.apache.tika.language | Language detection. |
org.apache.tika.mime | Media type information. |
org.apache.tika.parser | Tika parsers. |
org.apache.tika.parser.external | External parser process. |
org.apache.tika.sax | SAX utilities. |
Modifier and Type | Method and Description |
---|---|
String | Tika.parseToString(File file) Parses the given file and returns the extracted text content. |
String | Tika.parseToString(InputStream stream) Parses the given document and returns the extracted text content. |
String | Tika.parseToString(InputStream stream, Metadata metadata) Parses the given document and returns the extracted text content. |
String | Tika.parseToString(InputStream stream, Metadata metadata, int maxLength) Parses the given document and returns the extracted text content. |
String | Tika.parseToString(URL url) Parses the resource at the given URL and returns the extracted text content. |
Constructor and Description |
---|
TikaConfig() Creates a default Tika configuration. |
TikaConfig(Document document) |
TikaConfig(Element element) |
TikaConfig(Element element, ClassLoader loader) |
TikaConfig(File file) |
TikaConfig(InputStream stream) |
TikaConfig(String file) |
TikaConfig(URL url) |
TikaConfig(URL url, ClassLoader loader) |
Constructor and Description |
---|
AutoDetectReader(InputStream stream) |
AutoDetectReader(InputStream stream, Metadata metadata) |
AutoDetectReader(InputStream stream, Metadata metadata, ServiceLoader loader) |
Modifier and Type | Method and Description |
---|---|
void | ExternalEmbedder.embed(Metadata metadata, InputStream inputStream, OutputStream outputStream, ParseContext context) Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler. |
void | Embedder.embed(Metadata metadata, InputStream originalStream, OutputStream outputStream, ParseContext context) Embeds related document metadata from the given metadata object into the given output stream. |
Modifier and Type | Class and Description |
---|---|
class | EncryptedDocumentException |
Modifier and Type | Method and Description |
---|---|
void | ParserContainerExtractor.extract(TikaInputStream stream, ContainerExtractor recurseExtractor, EmbeddedResourceHandler handler) |
void | ContainerExtractor.extract(TikaInputStream stream, ContainerExtractor recurseExtractor, EmbeddedResourceHandler handler) Processes a container file and extracts all the embedded resources from within it. |
Modifier and Type | Method and Description |
---|---|
void | ForkParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
Modifier and Type | Class and Description |
---|---|
static class | EndianUtils.BufferUnderrunException |
Modifier and Type | Method and Description |
---|---|
void | TemporaryResources.dispose() Calls the TemporaryResources.close() method and wraps the potential IOException into a TikaException for convenience when used within Tika. |
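The close-then-wrap behaviour described for TemporaryResources.dispose() can be sketched roughly as follows. This is a minimal stand-alone illustration, not Tika's source: SketchTikaException and SketchResources are hypothetical stand-ins, and the reverse-order cleanup is an assumption made for the example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of a close()/dispose() pairing; none of these types are Tika's own. */
final class DisposeSketch {

    /** Hypothetical stand-in for org.apache.tika.exception.TikaException. */
    static class SketchTikaException extends Exception {
        SketchTikaException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for a holder of temporary resources. */
    static class SketchResources implements Closeable {
        private final Deque<Closeable> resources = new ArrayDeque<>();

        void add(Closeable resource) {
            resources.push(resource);
        }

        /** Closes resources in reverse order of registration and rethrows the first failure. */
        @Override
        public void close() throws IOException {
            IOException first = null;
            while (!resources.isEmpty()) {
                try {
                    resources.pop().close();
                } catch (IOException e) {
                    if (first == null) {
                        first = e;
                    } else {
                        first.addSuppressed(e);
                    }
                }
            }
            if (first != null) {
                throw first;
            }
        }

        /** Same cleanup as close(), but reports failure as the library-level checked exception. */
        void dispose() throws SketchTikaException {
            try {
                close();
            } catch (IOException e) {
                throw new SketchTikaException("Failed to close temporary resources", e);
            }
        }
    }

    public static void main(String[] args) {
        SketchResources tmp = new SketchResources();
        tmp.add(() -> { throw new IOException("disk full"); });
        try {
            tmp.dispose();
        } catch (SketchTikaException e) {
            System.out.println(e.getMessage() + " (cause: " + e.getCause().getMessage() + ")");
        }
    }
}
```

The point of such a pairing is that a caller whose method already declares the library exception can clean up in one call without also declaring IOException.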
Modifier and Type | Method and Description |
---|---|
static LanguageProfilerBuilder | LanguageProfilerBuilder.create(String name, InputStream is, String encoding) Creates a new language profile from a text file, preferably a large one (5-10k lines). |
float | LanguageProfilerBuilder.getSimilarity(LanguageProfilerBuilder another) Calculates a score for how well two NGramProfiles match each other. |
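To make the idea of scoring two n-gram profiles concrete, here is a generic sketch that builds character n-gram counts and compares them with cosine similarity. It is not the formula LanguageProfilerBuilder uses, and the direction and scale of Tika's score may differ; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Generic n-gram profile comparison; an illustration only, not Tika's algorithm. */
final class NGramSimilaritySketch {

    /** Counts every overlapping character n-gram in the lower-cased text. */
    static Map<String, Integer> profile(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        String normalized = text.toLowerCase(Locale.ROOT);
        for (int i = 0; i + n <= normalized.length(); i++) {
            counts.merge(normalized.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    /** Cosine similarity of two count vectors: 1.0 for identical shape, 0.0 for no overlap. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            double value = entry.getValue();
            normA += value * value;
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += value * other;
            }
        }
        for (int value : b.values()) {
            normB += (double) value * value;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an empty profile matches nothing
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        Map<String, Integer> english = profile("the quick brown fox jumps over the lazy dog", 3);
        Map<String, Integer> sample = profile("the dog jumps over the fox", 3);
        System.out.println(cosine(english, sample));
    }
}
```

Larger training texts give more stable counts, which is consistent with the advice above to build profiles from files of several thousand lines.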
Modifier and Type | Class and Description |
---|---|
class | MimeTypeException A class to encapsulate MimeType related exceptions. |
Modifier and Type | Method and Description |
---|---|
SAXParser | ParseContext.getSAXParser() Returns the SAX parser specified in this parsing context. |
void | AbstractParser.parse(InputStream stream, ContentHandler handler, Metadata metadata) Deprecated. Use the Parser.parse(InputStream, ContentHandler, Metadata, ParseContext) method instead. |
void | AutoDetectParser.parse(InputStream stream, ContentHandler handler, Metadata metadata) |
void | ParserDecorator.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Delegates the method call to the decorated parser. |
void | ErrorParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
void | Parser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Parses a document stream into a sequence of XHTML SAX events. |
void | DelegatingParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Looks up the delegate parser from the parsing context and delegates the parse operation to it. |
void | ParserPostProcessor.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Forwards the call to the delegated parser and post-processes the results as described in the ParserPostProcessor class documentation. |
void | CryptoParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
void | NetworkParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
void | AutoDetectParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
void | CompositeParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Delegates the call to the matching component parser. |
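Several of the parse methods listed above are described as delegating: to a decorated parser, to a matching component parser, or to a delegate followed by post-processing. The sketch below shows those three shapes with deliberately simplified, hypothetical stand-in types. The real Parser.parse signature takes a SAX ContentHandler, a Metadata object and a ParseContext, and the lookup by a "Content-Type" entry is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Delegation shapes with stand-in types; not Tika's classes or signatures. */
final class DelegationSketch {

    /** Simplified stand-in for a parser: text goes to a StringBuilder, metadata to a Map. */
    interface SketchParser {
        void parse(InputStream stream, StringBuilder out, Map<String, String> metadata)
                throws IOException;
    }

    /** Decorator: forwards the call unchanged to the wrapped parser. */
    static class SketchDecorator implements SketchParser {
        private final SketchParser delegate;

        SketchDecorator(SketchParser delegate) {
            this.delegate = delegate;
        }

        @Override
        public void parse(InputStream stream, StringBuilder out, Map<String, String> metadata)
                throws IOException {
            delegate.parse(stream, out, metadata);
        }
    }

    /** Post-processor: delegates first, then derives extra metadata from the result. */
    static class SketchPostProcessor extends SketchDecorator {
        SketchPostProcessor(SketchParser delegate) {
            super(delegate);
        }

        @Override
        public void parse(InputStream stream, StringBuilder out, Map<String, String> metadata)
                throws IOException {
            super.parse(stream, out, metadata);
            metadata.put("length", Integer.toString(out.length()));
        }
    }

    /** Composite: picks the component registered for the declared content type. */
    static class SketchComposite implements SketchParser {
        private final Map<String, SketchParser> byType = new HashMap<>();
        private final SketchParser fallback;

        SketchComposite(SketchParser fallback) {
            this.fallback = fallback;
        }

        void register(String contentType, SketchParser parser) {
            byType.put(contentType, parser);
        }

        @Override
        public void parse(InputStream stream, StringBuilder out, Map<String, String> metadata)
                throws IOException {
            String type = metadata.get("Content-Type");
            byType.getOrDefault(type, fallback).parse(stream, out, metadata);
        }
    }

    /** Reads the whole stream as UTF-8 text. */
    static final SketchParser PLAIN_TEXT = (stream, out, metadata) ->
            out.append(new String(stream.readAllBytes(), StandardCharsets.UTF_8));

    /** Produces no output; used when no component matches. */
    static final SketchParser EMPTY = (stream, out, metadata) -> { };

    public static void main(String[] args) throws IOException {
        SketchComposite composite = new SketchComposite(EMPTY);
        composite.register("text/plain", PLAIN_TEXT);
        SketchParser parser = new SketchPostProcessor(composite);

        Map<String, String> metadata = new HashMap<>();
        metadata.put("Content-Type", "text/plain");
        StringBuilder out = new StringBuilder();
        parser.parse(new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)),
                out, metadata);
        System.out.println(out + " / length=" + metadata.get("length"));
    }
}
```

Because every layer exposes the same parse method and the same checked exceptions, a failure raised by the innermost parser propagates unchanged through each wrapper to the caller.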
Modifier and Type | Method and Description |
---|---|
static void | ExternalParsersFactory.attachExternalParsers(TikaConfig config) |
static List<ExternalParser> | ExternalParsersFactory.create() |
static List<ExternalParser> | ExternalParsersFactory.create(ServiceLoader loader) |
static List<ExternalParser> | ExternalParsersFactory.create(String filename, ServiceLoader loader) |
static List<ExternalParser> | ExternalParsersFactory.create(URL... urls) |
void | ExternalParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler. |
static List<ExternalParser> | ExternalParsersConfigReader.read(Document document) |
static List<ExternalParser> | ExternalParsersConfigReader.read(Element element) |
static List<ExternalParser> | ExternalParsersConfigReader.read(InputStream stream) |
Constructor and Description |
---|
CompositeExternalParser() |
CompositeExternalParser(MediaTypeRegistry registry) |
Modifier and Type | Method and Description |
---|---|
void | SecureContentHandler.throwIfCauseOf(SAXException e) Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb. |
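One way to implement a check like throwIfCauseOf is for the handler to throw a private SAXException subclass that remembers which handler instance raised it, so the conversion applies only to that instance's own aborts. The sketch below assumes that design. SketchSecureHandler and SketchTikaException are hypothetical stand-ins, and the simple output-size limit is an assumption rather than Tika's actual detection rule.

```java
import org.xml.sax.SAXException;

/** Instance-bound marker exception pattern; stand-in types, not Tika's implementation. */
final class SecureHandlerSketch {

    /** Hypothetical stand-in for org.apache.tika.exception.TikaException. */
    static class SketchTikaException extends Exception {
        SketchTikaException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical handler that aborts once output exceeds a fixed character budget. */
    static class SketchSecureHandler {
        private final long maxOutputChars;
        private long outputChars;

        SketchSecureHandler(long maxOutputChars) {
            this.maxOutputChars = maxOutputChars;
        }

        /** Inner (non-static) class, so each exception is tied to the handler that created it. */
        private final class SecureSAXException extends SAXException {
            SecureSAXException(String message) {
                super(message);
            }

            boolean isCausedBy(SketchSecureHandler handler) {
                return SketchSecureHandler.this == handler;
            }
        }

        /** Records emitted characters and aborts when the budget is exceeded. */
        void characters(int count) throws SAXException {
            outputChars += count;
            if (outputChars > maxOutputChars) {
                throw new SecureSAXException(
                        "Suspected zip bomb: " + outputChars + " characters of output");
            }
        }

        /** Converts only exceptions raised by this instance; all others are left alone. */
        void throwIfCauseOf(SAXException e) throws SketchTikaException {
            if (e instanceof SecureSAXException && ((SecureSAXException) e).isCausedBy(this)) {
                throw new SketchTikaException("Zip bomb detected!", e);
            }
        }
    }

    public static void main(String[] args) {
        SketchSecureHandler handler = new SketchSecureHandler(10);
        try {
            handler.characters(11);
        } catch (SAXException e) {
            try {
                handler.throwIfCauseOf(e);
            } catch (SketchTikaException converted) {
                System.out.println(converted.getMessage());
            }
        }
    }
}
```

A caller would typically catch SAXException around the parse call, invoke throwIfCauseOf on the handler it installed, and rethrow or wrap the original exception if the call returns normally.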
Copyright © 2007–2013 The Apache Software Foundation. All rights reserved.