XMLExtractor Methods
The XMLExtractor type exposes the following members.
Methods
Name | Description | |
---|---|---|
AddFilter(String, Boolean, Boolean) | Adds a filter to remove a text from extracted data. (Inherited from BaseTextExtractor.) | |
AddFilter(String, Int32, Boolean) | Adds filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.) | |
AddFilter(String, Int32, Color, Boolean) | Adds filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.) | |
AddFilter(String, String, Boolean, Boolean) | Adds a filter to replace a text in extracted data. (Inherited from BaseTextExtractor.) | |
AddFilter(String, Int32, Int32, Int32, Int32, Boolean) | Adds filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.) | |
CreateProfile(String, Boolean, Boolean, Boolean) | Creates a JSON profile with all extractor properties and their current values. (Inherited from BaseExtractor.) | |
CreateProfile(String, String, Boolean, Boolean, Boolean) | Creates a JSON profile with all extractor properties and their current values. (Inherited from BaseExtractor.) | |
Dispose | Releases the unmanaged resources used by the instance and optionally releases the managed resources. (Inherited from BaseExtractor.) | |
DisposePage | Disposes the page object. Use this method carefully, and only to destroy a page object that will not be used further. Useful for freeing allocated memory when processing huge PDF documents. (Inherited from BaseTextExtractor.) | |
Equals | (Inherited from Object.) | |
Finalize | (Inherited from Object.) | |
FireParsingError | (Inherited from BaseExtractor.) | |
FireProgressChanged | (Inherited from BaseExtractor.) | |
GetHashCode | (Inherited from Object.) | |
GetPageCount | Returns document page count. (Inherited from BaseExtractor.) | |
GetPageRect_Height | Gets the specified page height. (Inherited from BaseExtractor.) | |
GetPageRect_Left | Gets the specified page left coordinate. (Inherited from BaseExtractor.) | |
GetPageRect_Top | Gets the specified page top coordinate. (Inherited from BaseExtractor.) | |
GetPageRect_Width | Gets the specified page width. (Inherited from BaseExtractor.) | |
GetPageRectangle(Int32) | Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.) | |
GetPageRectangle(Int32, Boolean) | Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.) | |
GetPageRotationAngle | Returns the rotation angle of specified page. (Inherited from BaseExtractor.) | |
GetPageXMLAsVariant | Returns extracted XML data as array of bytes. This is COM/ActiveX-compatible version of the method SavePageXMLToStream(Int32, Stream) for in-memory processing of PDF documents or images. | |
GetPreprocessedPagePreview | Returns preview image of document page with preprocessing filters applied. (Inherited from BaseTextExtractor.) | |
GetType | (Inherited from Object.) | |
GetXML | Extracts XML data from the entire document as string. | |
GetXML(IList<Int32>) | Extracts XML data from specified page range. | |
GetXML(String) | Extracts XML data from specified page range. | |
GetXML(Int32, Int32) | Extracts XML data from specified page range. | |
GetXMLAsVariant | Returns extracted XML data as array of bytes. This is COM/ActiveX-compatible version of the method SaveXMLToStream(Stream) for in-memory processing of PDF documents or images. | |
GetXMLAsVariant(String) | Returns extracted XML data as array of bytes. This is COM/ActiveX-compatible version of the method SaveXMLToStream(String, Stream) for in-memory processing of PDF documents or images. | |
GetXMLAsVariant(Int32, Int32) | Returns extracted XML data as array of bytes. This is COM/ActiveX-compatible version of the method SaveXMLToStream(Int32, Int32, Stream) for in-memory processing of PDF documents or images. | |
GetXMLDocument | Extracts XML data from the entire document as XmlDocument. | |
GetXMLDocument(IList<Int32>) | Extracts XML data from specified pages as XmlDocument. | |
GetXMLDocument(String) | Extracts XML data from specified page ranges as XmlDocument. | |
GetXMLDocument(Int32, Int32) | Extracts XML data from specified page range as XmlDocument. | |
GetXMLDocumentFromPage | Extracts XML data from specified document page as XmlDocument. | |
GetXMLFromPage | Extracts XML data from specified document page as string. | |
IsEncrypted | Gets the document encrypted state. (Inherited from BaseExtractor.) | |
IsOCRRecommendedForPage | Detects whether OCR is recommended for the specified page. OCR (Optical Character Recognition) is recommended when a page has no text objects but has an image that might contain text. (Inherited from BaseTextExtractor.) | |
LoadAndApplyProfiles | Loads profiles from JSON string and automatically applies them. Note that profiles containing detection keywords will be deferred until the extraction. (Inherited from BaseExtractor.) | |
LoadDocumentFromFile | Loads PDF document from specified file. (Inherited from BaseExtractor.) | |
LoadDocumentFromStream | Loads PDF document from provided stream. (Inherited from BaseExtractor.) | |
LoadDocumentFromVariant | Loads PDF document from byte array presented as array of Variant or Byte objects ('Variant()' or 'Byte()'). This is COM/ActiveX-compatible version of the method LoadDocumentFromStream(Stream) for in-memory processing of PDF files. (Inherited from BaseExtractor.) | |
LoadProfiles | Loads profiles from JSON file. (Inherited from BaseExtractor.) | |
LoadProfilesFromString | Loads profiles from JSON string. (Inherited from BaseExtractor.) | |
MemberwiseClone | (Inherited from Object.) | |
PerformTextAnalysis | (Inherited from BaseTextExtractor.) | |
Reset | (Overrides BaseTextExtractor.Reset.) | |
ResetBaseExtractionData | (Inherited from BaseTextExtractor.) | |
ResetExtractionArea | Resets the extraction area to the full page. (Inherited from BaseExtractor.) | |
ResetFilters | Resets text filters. (Inherited from BaseTextExtractor.) | |
SavePageXMLToFile | Saves page XML data to file. | |
SavePageXMLToStream | Saves page XML data to stream. | |
SavePreprocessedPagePreview | Saves preview image of document page with preprocessing filters applied. Image is saved in PNG format. (Inherited from BaseTextExtractor.) | |
SaveXMLToFile(String) | Saves XML data from the entire document to file. | |
SaveXMLToFile(IList<Int32>, String) | Saves XML data from specified pages to file. | |
SaveXMLToFile(String, String) | Saves XML data from specified page ranges to file. | |
SaveXMLToFile(Int32, Int32, String) | Saves XML data from specified page range to file. | |
SaveXMLToStream(Stream) | Saves XML data to stream. | |
SaveXMLToStream(IList<Int32>, Stream) | Saves XML data from specified pages to stream. | |
SaveXMLToStream(String, Stream) | Saves XML data from specified page ranges to stream. | |
SaveXMLToStream(Int32, Int32, Stream) | Saves XML data from specified page range to stream. | |
SetCustomExtractionColumns | Helper method to set the CustomExtractionColumns property when using the extractor through COM from VC++, VB, VBA, VBScript, or Delphi. (Inherited from BaseTextExtractor.) | |
SetExtractionArea(RectangleF) | Sets the extraction area by rectangle. (Inherited from BaseExtractor.) | |
SetExtractionArea(Double, Double, Double, Double) | Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.) | |
SetExtractionArea(Single, Single, Single, Single) | Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.) | |
ToString | (Inherited from Object.) | |
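
As a rough illustration of how several of the members listed above might be combined, the following minimal sketch loads a document, reports its page count, and extracts XML both as a string and to a file. It is a sketch under stated assumptions, not a definitive usage pattern: the `Bytescout.PDFExtractor` namespace, the parameterless `Dispose()` call, the return types, and the file names `sample.pdf` and `output.xml` are assumptions not stated in the table. Any license registration the SDK may require is omitted.

```csharp
using System;
using Bytescout.PDFExtractor; // assumed namespace; adjust to match your SDK reference

class XmlExtractionSketch
{
    static void Main()
    {
        XMLExtractor extractor = new XMLExtractor();
        try
        {
            // "sample.pdf" is a hypothetical input path.
            extractor.LoadDocumentFromFile("sample.pdf");

            // GetPageCount is assumed to return an Int32.
            Console.WriteLine("Pages: " + extractor.GetPageCount());

            // GetXML() extracts XML data from the entire document as a string.
            string xml = extractor.GetXML();
            Console.WriteLine("XML length: " + xml.Length);

            // SaveXMLToFile(String) writes the whole document's XML to a file.
            // "output.xml" is a hypothetical output path.
            extractor.SaveXMLToFile("output.xml");
        }
        finally
        {
            // Release resources held by the extractor (assumed parameterless overload).
            extractor.Dispose();
        }
    }
}
```

The explicit `try`/`finally` is used here instead of a `using` statement because the table lists `Dispose` but does not state whether the type implements `IDisposable`.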
See Also