TextExtractor Methods
The TextExtractor type exposes the following members.
Methods
Name | Description | |
---|---|---|
AddFilter(String, Boolean, Boolean) | Adds a filter to remove text from extracted data. (Inherited from BaseTextExtractor.) | |
AddFilter(String, Int32, Boolean) | Adds filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.) | |
AddFilter(String, Int32, Color, Boolean) | Adds filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.) | |
AddFilter(String, String, Boolean, Boolean) | Adds a filter to replace text in extracted data. (Inherited from BaseTextExtractor.) | |
AddFilter(String, Int32, Int32, Int32, Int32, Boolean) | Adds filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.) | |
CreateProfile(String, Boolean, Boolean, Boolean) | Creates a JSON profile containing all extractor properties with their current values. (Inherited from BaseExtractor.) | |
CreateProfile(String, String, Boolean, Boolean, Boolean) | Creates a JSON profile containing all extractor properties with their current values. (Inherited from BaseExtractor.) | |
Dispose | Releases the unmanaged resources used by the instance and optionally releases the managed resources. (Inherited from BaseExtractor.) | |
DisposePage | Disposes the page object. Use this method carefully: it destroys the page object, which must not be used afterwards. Useful to free allocated memory when processing huge PDF documents. (Inherited from BaseTextExtractor.) | |
Equals | (Inherited from Object.) | |
Finalize | (Inherited from Object.) | |
Find(Int32, String, Boolean) | Searches the document page for specified text. | |
Find(Int32, String, RegexOptions) | Searches the document page for specified text in Regex mode with specified options. | |
FindAll | Searches for all occurrences of specified text in specified document page or in entire document. | |
FindAllToJSON | Searches for all occurrences of specified text in specified document page or in entire document and returns result as JSON string. | |
FindNext | Continues the text search started by one of Find() methods. | |
FireParsingError | (Inherited from BaseExtractor.) | |
FireProgressChanged | (Inherited from BaseExtractor.) | |
GetHashCode | (Inherited from Object.) | |
GetPageCount | Returns document page count. (Inherited from BaseExtractor.) | |
GetPageRect_Height | Gets the specified page height. (Inherited from BaseExtractor.) | |
GetPageRect_Left | Gets the specified page left coordinate. (Inherited from BaseExtractor.) | |
GetPageRect_Top | Gets the specified page top coordinate. (Inherited from BaseExtractor.) | |
GetPageRect_Width | Gets the specified page width. (Inherited from BaseExtractor.) | |
GetPageRectangle(Int32) | Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.) | |
GetPageRectangle(Int32, Boolean) | Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.) | |
GetPageRotationAngle | Returns the rotation angle of specified page. (Inherited from BaseExtractor.) | |
GetPageTextAsVariant | Returns page text as array of bytes. This is COM/ActiveX-compatible version of the method SavePageTextToStream(Int32, Stream) for in-memory processing of PDF documents or images. | |
GetPreprocessedPagePreview | Returns preview image of document page with preprocessing filters applied. (Inherited from BaseTextExtractor.) | |
GetText | Extracts text from whole document. | |
GetText(IList<Int32>) | Extracts text from specified pages. | |
GetText(String) | Extracts text from specified page ranges. | |
GetText(Int32, Int32) | Extracts text from specified page range. | |
GetTextAsVariant | Returns document text as array of bytes. This is COM/ActiveX-compatible version of the method SaveTextToStream(Stream) for in-memory processing of PDF documents or images. | |
GetTextAsVariant(String) | Returns document text as array of bytes. This is COM/ActiveX-compatible version of the method SaveTextToStream(String, Stream) for in-memory processing of PDF documents or images. | |
GetTextAsVariant(Int32, Int32) | Returns document text as array of bytes. This is COM/ActiveX-compatible version of the method SaveTextToStream(Int32, Int32, Stream) for in-memory processing of PDF documents or images. | |
GetTextFromPage | Extracts text from specified document page. | |
GetType | (Inherited from Object.) | |
IsEncrypted | Gets the document encrypted state. (Inherited from BaseExtractor.) | |
IsOCRRecommendedForPage | Detects whether OCR is recommended for specified page. OCR (Optical Character Recognition) is recommended when a page has no text objects but has an image that might contain text. (Inherited from BaseTextExtractor.) | |
LoadAndApplyProfiles | Loads profiles from JSON string and automatically applies them. Note that profiles containing detection keywords will be deferred until the extraction. (Inherited from BaseExtractor.) | |
LoadDocumentFromFile | Loads PDF document from specified file. (Inherited from BaseExtractor.) | |
LoadDocumentFromStream | Loads PDF document from provided stream. (Inherited from BaseExtractor.) | |
LoadDocumentFromVariant | Loads PDF document from byte array presented as array of Variant or Byte objects ('Variant()' or 'Byte()'). This is COM/ActiveX-compatible version of the method LoadDocumentFromStream(Stream) for in-memory processing of PDF files. (Inherited from BaseExtractor.) | |
LoadProfiles | Loads profiles from JSON file. (Inherited from BaseExtractor.) | |
LoadProfilesFromString | Loads profiles from JSON string. (Inherited from BaseExtractor.) | |
MemberwiseClone | (Inherited from Object.) | |
PerformTextAnalysis | (Inherited from BaseTextExtractor.) | |
Reset | Resets the instance and disposes internal resources. Also automatically invoked by Dispose. (Overrides BaseTextExtractor.Reset().) | |
ResetBaseExtractionData | (Inherited from BaseTextExtractor.) | |
ResetExtractionArea | Resets the extraction area to the full page. (Inherited from BaseExtractor.) | |
ResetFilters | Resets text filters. (Inherited from BaseTextExtractor.) | |
SavePageTextToFile(Int32, String) | Saves page text to file. | |
SavePageTextToFile(Int32, String, Encoding) | Saves page text to file in specified encoding. | |
SavePageTextToStream(Int32, Stream) | Saves page text to stream. | |
SavePageTextToStream(Int32, Stream, Encoding) | Saves page text to stream in specified encoding. | |
SavePreprocessedPagePreview | Saves preview image of document page with preprocessing filters applied. Image is saved in PNG format. (Inherited from BaseTextExtractor.) | |
SaveTextToFile(String) | Saves document text to file. | |
SaveTextToFile(IList<Int32>, String) | Saves text from specified pages to file. | |
SaveTextToFile(String, String) | Saves text from specified page ranges to file. | |
SaveTextToFile(String, Encoding) | Saves document text to file in specified encoding. | |
SaveTextToFile(IList<Int32>, String, Encoding) | Saves text from specified pages to file in specified encoding. | |
SaveTextToFile(Int32, Int32, String) | Saves text from specified page range to file. | |
SaveTextToFile(String, String, Encoding) | Saves text from specified page ranges to file in specified encoding. | |
SaveTextToFile(Int32, Int32, String, Encoding) | Saves text from specified page range to file in specified encoding. | |
SaveTextToStream(Stream) | Saves document text to stream. | |
SaveTextToStream(IList<Int32>, Stream) | Saves text from specified pages to stream. | |
SaveTextToStream(Stream, Encoding) | Saves document text to stream in specified encoding. | |
SaveTextToStream(String, Stream) | Saves text from specified page ranges to stream. | |
SaveTextToStream(IList<Int32>, Stream, Encoding) | Saves text from specified pages to stream in specified encoding. | |
SaveTextToStream(Int32, Int32, Stream) | Saves text from specified page range to stream. | |
SaveTextToStream(String, Stream, Encoding) | Saves text from specified page ranges to stream in specified encoding. | |
SaveTextToStream(Int32, Int32, Stream, Encoding) | Saves text from specified page range to stream in specified encoding. | |
SetCustomExtractionColumns | Helper method to set the CustomExtractionColumns property when using the extractor through COM from VC++, VB, VBA, VBScript, or Delphi. (Inherited from BaseTextExtractor.) | |
SetExtractionArea(RectangleF) | Sets the extraction area by rectangle. (Inherited from BaseExtractor.) | |
SetExtractionArea(Double, Double, Double, Double) | Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.) | |
SetExtractionArea(Single, Single, Single, Single) | Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.) | |
ToString | (Inherited from Object.) | |
See Also