com.lowagie.text.pdf.parser
public class PdfTextExtractor extends Object
Since: 2.1.4
Field Summary | |
---|---|
SimpleTextExtractingPdfContentStreamProcessor | extractionProcessor The processor that will extract the text. |
PdfReader | reader The PdfReader that holds the PDF file. |
Constructor Summary | |
---|---|
PdfTextExtractor(PdfReader reader) | Creates a new Text Extractor object. |
Method Summary | |
---|---|
byte[] | getContentBytesForPage(int pageNum) Gets the content stream of a page. |
String | getTextFromPage(int page) Gets the text from a page. |
Constructor Detail
PdfTextExtractor(PdfReader reader)
Parameters: reader - the reader with the PDF
Method Detail
getContentBytesForPage(int pageNum)
Parameters: pageNum - the number of the page whose content stream you want
Returns: a byte array with the content stream of the page
Throws: IOException
getTextFromPage(int page)
Parameters: page - the page number of the page
Returns: a String with the content as plain text (without PDF syntax)
Throws: IOException
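
Because getTextFromPage(int page) works on one page at a time, extracting a whole document means looping over the page numbers. The sketch below shows one way to do that. PageTextJoiner and PageTextSource are hypothetical names introduced for illustration and are not part of iText; the sketch also assumes page numbers start at 1, as they do elsewhere in PdfReader. The interface stands in for the extractor so the example runs without the library on the classpath.

```java
import java.io.IOException;

/** Hypothetical helper (not part of iText) that joins the text of all pages. */
final class PageTextJoiner {

    /** Hypothetical adapter: mirrors the shape of PdfTextExtractor.getTextFromPage(int). */
    interface PageTextSource {
        String getTextFromPage(int page) throws IOException;
    }

    private PageTextJoiner() {}

    /** Returns the text of pages 1..pageCount, separated by newlines. */
    static String extractAll(PageTextSource source, int pageCount) throws IOException {
        StringBuilder out = new StringBuilder();
        // Assumption: pages are numbered from 1, not 0.
        for (int page = 1; page <= pageCount; page++) {
            if (page > 1) {
                out.append('\n');
            }
            out.append(source.getTextFromPage(page));
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stub source standing in for a real extractor.
        System.out.println(extractAll(page -> "text of page " + page, 2));
    }
}
```

With iText on the classpath, the stub could be replaced by a real extractor, for example extractAll(new PdfTextExtractor(reader)::getTextFromPage, reader.getNumberOfPages()), where reader is a PdfReader opened on the file of interest. Both calls may throw IOException, so the caller has to handle or declare it.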