
IXMLExtractor Interface

Defines the PDF to XML extractor interface.

Namespace: Bytescout.PDFExtractor
Assembly: Bytescout.PDFExtractor (in Bytescout.PDFExtractor.dll) Version: 13.4.0.4760-master
Syntax
public interface IXMLExtractor

The IXMLExtractor type exposes the following members.

Properties
Name: Description
AllowStandalonePunctuation: Gets or sets whether to allow standalone punctuation characters. If false, they are merged with the nearest text object.
DetectStrikeoutTextStyle: Gets or sets whether to detect the "strikeout" text style. Default is false.
DetectUnderlineTextStyle: Gets or sets whether to detect the "underline" text style. Default is false.
ImageFolder: Gets or sets the folder that receives extracted images when the SaveImages property is set to ImageHandling.OuterFile. Default is "images": the extractor creates an "images" sub-folder in the same folder as the output XML file.
ImageFormat: Gets or sets the image format for extracted images. Default is PNG.
IndentedXML: Gets or sets whether to generate indented XML. Default is true.
KeepOriginalFontNames: By default, XMLExtractor replaces the names of embedded fonts with standard (or "descendant") fonts that are similar in metrics and typeface, because embedded fonts may differ from the fonts installed on your system or be absent from it entirely. Set this property to true to keep the original font names.
SaveImages: Gets or sets how images are saved: not saved, saved to an outer file, or embedded into the resulting XML as a Base64 string. Default is ImageHandling.None.
SaveVectors: Gets or sets whether to save vector objects. Default is false.
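As a rough illustration of how the properties above might be combined, the following C# sketch configures an extractor through the interface. The class and method names (ExtractorSetupSketch, ConfigureForOuterImages) are hypothetical. The sketch assumes that ImageHandling is declared in the Bytescout.PDFExtractor namespace, that the flag-style properties are of type bool, and that ImageFolder is a string; only the property names, the documented defaults, and the ImageHandling.OuterFile value come from this page.

```csharp
using Bytescout.PDFExtractor;

// Hypothetical helper, not part of the SDK.
static class ExtractorSetupSketch
{
    // Configures any IXMLExtractor implementation so that images are written
    // to separate files and text decorations are detected.
    public static void ConfigureForOuterImages(IXMLExtractor extractor)
    {
        // Save images as separate files instead of the default ImageHandling.None.
        extractor.SaveImages = ImageHandling.OuterFile;

        // Sub-folder created next to the output XML file ("images" is the documented default).
        extractor.ImageFolder = "images";

        // Both detection flags default to false according to the table above.
        extractor.DetectUnderlineTextStyle = true;
        extractor.DetectStrikeoutTextStyle = true;

        // Keep the font names embedded in the PDF rather than substituted ones.
        extractor.KeepOriginalFontNames = true;

        // Indented output is the documented default; set explicitly for clarity.
        extractor.IndentedXML = true;
    }
}
```

Taking the interface as a parameter keeps the sketch independent of how the extractor instance is created, which this page does not describe.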
Methods
Name: Description
GetPageXMLAsVariant: Returns the extracted XML data as an array of bytes. This is the COM/ActiveX-compatible version of SavePageXMLToStream(Int32, Stream), intended for in-memory processing of PDF documents or images.
GetXML: Extracts XML data from the entire document as a string.
GetXML(IList<Int32>): Extracts XML data from the specified pages as a string.
GetXML(String): Extracts XML data from the specified page ranges as a string.
GetXML(Int32, Int32): Extracts XML data from the specified page range as a string.
GetXMLAsVariant: Returns the extracted XML data as an array of bytes. This is the COM/ActiveX-compatible version of SaveXMLToStream(Stream), intended for in-memory processing of PDF documents or images.
GetXMLAsVariant(String): Returns the extracted XML data as an array of bytes. This is the COM/ActiveX-compatible version of SaveXMLToStream(String, Stream), intended for in-memory processing of PDF documents or images.
GetXMLAsVariant(Int32, Int32): Returns the extracted XML data as an array of bytes. This is the COM/ActiveX-compatible version of SaveXMLToStream(Int32, Int32, Stream), intended for in-memory processing of PDF documents or images.
GetXMLDocument: Extracts XML data from the entire document as an XmlDocument.
GetXMLDocument(IList<Int32>): Extracts XML data from the specified pages as an XmlDocument.
GetXMLDocument(String): Extracts XML data from the specified page ranges as an XmlDocument.
GetXMLDocument(Int32, Int32): Extracts XML data from the specified page range as an XmlDocument.
GetXMLDocumentFromPage: Extracts XML data from the specified document page as an XmlDocument.
GetXMLFromPage: Extracts XML data from the specified document page as a string.
SavePageXMLToFile: Saves the XML data of a page to a file.
SavePageXMLToStream: Saves the XML data of a page to a stream.
SaveXMLToFile(String): Saves XML data from the entire document to a file.
SaveXMLToFile(IList<Int32>, String): Saves XML data from the specified pages to a file.
SaveXMLToFile(String, String): Saves XML data from the specified page ranges to a file.
SaveXMLToFile(Int32, Int32, String): Saves XML data from the specified page range to a file.
SaveXMLToStream(Stream): Saves XML data from the entire document to a stream.
SaveXMLToStream(IList<Int32>, Stream): Saves XML data from the specified pages to a stream.
SaveXMLToStream(String, Stream): Saves XML data from the specified page ranges to a stream.
SaveXMLToStream(Int32, Int32, Stream): Saves XML data from the specified page range to a stream.
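The method families above differ mainly in where the XML ends up: a string, an XmlDocument, a file, or a stream. The C# sketch below calls one overload from each family. The class, method, and parameter names (XmlExtractionSketch, ExportWholeDocument, ExportSinglePage, outputPath, pageIndex) are hypothetical. The return types are inferred from the descriptions on this page, and the argument order of SavePageXMLToStream follows the (Int32, Stream) signature quoted above. Whether page indexes start at 0 or 1 is not stated here, so the page index is left to the caller.

```csharp
using System.IO;
using System.Xml;
using Bytescout.PDFExtractor;

// Hypothetical helper, not part of the SDK.
static class XmlExtractionSketch
{
    // Extracts the entire document in three ways and returns the XML as a string.
    public static string ExportWholeDocument(IXMLExtractor extractor, string outputPath)
    {
        // In-memory string with the XML of the entire document.
        string xml = extractor.GetXML();

        // Same data as a DOM tree, convenient for XPath queries.
        XmlDocument document = extractor.GetXMLDocument();
        XmlElement root = document.DocumentElement; // root element of the extracted XML

        // Write the entire document straight to a file.
        extractor.SaveXMLToFile(outputPath);

        return xml;
    }

    // Extracts one page into a byte array through a stream.
    public static byte[] ExportSinglePage(IXMLExtractor extractor, int pageIndex)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            // Assumed argument order (page index, target stream), per the
            // SavePageXMLToStream(Int32, Stream) signature mentioned above.
            extractor.SavePageXMLToStream(pageIndex, stream);
            return stream.ToArray();
        }
    }
}
```

The stream overloads suit server-side code that should not touch the file system, while the GetXMLAsVariant methods cover the same in-memory case for COM/ActiveX callers.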
See Also

Reference