BaseTextExtractor Methods
The BaseTextExtractor type exposes the following members.
Methods
| | Name | Description |
---|---|---|
![]() | AddFilter(String, Boolean, Boolean) | Adds a filter to remove text from the extracted data. |
![]() | AddFilter(String, Int32, Boolean) | Adds a filter to exclude text objects with the specified attributes. |
![]() | AddFilter(String, Int32, Color, Boolean) | Adds a filter to exclude text objects with the specified attributes. |
![]() | AddFilter(String, String, Boolean, Boolean) | Adds a filter to replace text in the extracted data. |
![]() | AddFilter(String, Int32, Int32, Int32, Int32, Boolean) | Adds a filter to exclude text objects with the specified attributes. |
![]() | CreateProfile(String, Boolean, Boolean, Boolean) | Creates a JSON profile with all extractor properties and their current values. (Inherited from BaseExtractor.) |
![]() | CreateProfile(String, String, Boolean, Boolean, Boolean) | Creates a JSON profile with all extractor properties and their current values. (Inherited from BaseExtractor.) |
![]() | Dispose | Releases the unmanaged resources used by the instance and optionally releases the managed resources. (Inherited from BaseExtractor.) |
![]() | DisposePage | Disposes the page object. Use this method carefully, and only to destroy a page object that will not be used again. Useful for freeing allocated memory when processing huge PDF documents. |
![]() | Equals | (Inherited from Object.) |
![]() | Finalize | (Inherited from Object.) |
![]() | FireParsingError | (Inherited from BaseExtractor.) |
![]() | FireProgressChanged | (Inherited from BaseExtractor.) |
![]() | GetHashCode | (Inherited from Object.) |
![]() | GetPageCount | Returns document page count. (Inherited from BaseExtractor.) |
![]() | GetPageRect_Height | Gets the specified page height. (Inherited from BaseExtractor.) |
![]() | GetPageRect_Left | Gets the specified page left coordinate. (Inherited from BaseExtractor.) |
![]() | GetPageRect_Top | Gets the specified page top coordinate. (Inherited from BaseExtractor.) |
![]() | GetPageRect_Width | Gets the specified page width. (Inherited from BaseExtractor.) |
![]() | GetPageRectangle(Int32) | Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.) |
![]() | GetPageRectangle(Int32, Boolean) | Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.) |
![]() | GetPageRotationAngle | Returns the rotation angle of the specified page. (Inherited from BaseExtractor.) |
![]() | GetPreprocessedPagePreview | Returns a preview image of the document page with preprocessing filters applied. |
![]() | GetType | (Inherited from Object.) |
![]() | IsEncrypted | Gets the document encrypted state. (Inherited from BaseExtractor.) |
![]() | IsOCRRecommendedForPage | Detects whether OCR is recommended for the specified page. OCR (Optical Character Recognition) is recommended when a page has no text objects but has an image that might contain text. |
![]() | LoadAndApplyProfiles | Loads profiles from JSON string and automatically applies them. Note that profiles containing detection keywords will be deferred until the extraction. (Inherited from BaseExtractor.) |
![]() | LoadDocumentFromFile | Loads a PDF document from the specified file. (Inherited from BaseExtractor.) |
![]() | LoadDocumentFromStream | Loads a PDF document from the provided stream. (Inherited from BaseExtractor.) |
![]() | LoadDocumentFromVariant | Loads a PDF document from a byte array presented as an array of Variant or Byte objects ('Variant()' or 'Byte()'). This is the COM/ActiveX-compatible version of the method LoadDocumentFromStream(Stream) for in-memory processing of PDF files. (Inherited from BaseExtractor.) |
![]() | LoadProfiles | Loads profiles from JSON file. (Inherited from BaseExtractor.) |
![]() | LoadProfilesFromString | Loads profiles from JSON string. (Inherited from BaseExtractor.) |
![]() | MemberwiseClone | (Inherited from Object.) |
![]() | PerformTextAnalysis | |
![]() | Reset | Resets the instance and disposes internal resources. Also automatically invoked by Dispose. (Overrides BaseExtractor.Reset.) |
![]() | ResetBaseExtractionData | |
![]() | ResetExtractionArea | Resets the extraction area to the full page. (Inherited from BaseExtractor.) |
![]() | ResetFilters | Resets text filters. |
![]() | SavePreprocessedPagePreview | Saves a preview image of the document page with preprocessing filters applied. The image is saved in PNG format. |
![]() | SetCustomExtractionColumns | Helper method to set the CustomExtractionColumns property when using the extractor through COM from VC++, VB, VBA, VBScript, or Delphi. |
![]() | SetExtractionArea(RectangleF) | Sets the extraction area by rectangle. (Inherited from BaseExtractor.) |
![]() | SetExtractionArea(Double, Double, Double, Double) | Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.) |
![]() | SetExtractionArea(Single, Single, Single, Single) | Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.) |
![]() | ToString | (Inherited from Object.) |
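The GetPageRectangle entries above report page dimensions in PDF Points (1 Point = 1/72 in.). The following is a minimal sketch of that unit arithmetic only. It does not call the SDK, and the helper names `points_to_inches` and `points_to_pixels` are hypothetical, introduced here for illustration.

```python
# Hypothetical helpers (not part of the SDK): plain unit arithmetic
# based on the "1 Point = 1/72 in." definition given in the table.
POINTS_PER_INCH = 72.0


def points_to_inches(points):
    """Convert a length in PDF points to inches."""
    return points / POINTS_PER_INCH


def points_to_pixels(points, dpi):
    """Convert a length in PDF points to whole pixels at the given DPI."""
    return round(points * dpi / POINTS_PER_INCH)


# Example values: a US Letter page measures 612 x 792 points.
width_pt, height_pt = 612.0, 792.0

print(points_to_inches(width_pt), points_to_inches(height_pt))            # 8.5 11.0
print(points_to_pixels(width_pt, 300), points_to_pixels(height_pt, 300))  # 2550 3300
```

The same conversion applies to the values returned by GetPageRect_Width and GetPageRect_Height, assuming they are also expressed in points, which the table does not state explicitly.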
See Also