
BaseExtractor Class

Defines a base class for PDF extractors.
Inheritance Hierarchy
System.Object
  Bytescout.PDFExtractor.BaseExtractor

Namespace: Bytescout.PDFExtractor
Assembly: Bytescout.PDFExtractor (in Bytescout.PDFExtractor.dll) Version: 13.4.0.4760-master
Syntax
public abstract class BaseExtractor : IBaseExtractor, 
	IDisposable, IExtractionArea, IProfiles

The BaseExtractor type exposes the following members.

Constructors
Name | Description
Protected method BaseExtractor | Default constructor.
Protected method BaseExtractor(String, String) | Initializes a new instance of the extractor class.
Properties
Name | Description
Public property CheckPermissions | Defines whether to respect the permissions set by the document owner. If True, the extractor throws an exception when extraction is prohibited. IMPORTANT: THIS OPTION MUST BE ENABLED (SET TO "TRUE") TO RESPECT THE OWNERS OF PDF DOCUMENTS. IF YOU SET IT TO FALSE TO IGNORE THE PERMISSIONS SET IN A PDF DOCUMENT, YOU ARE SOLELY LIABLE FOR THIS ACTION AND FOR ANY COPYRIGHT OR OTHER VIOLATIONS, AT YOUR OWN RISK. BYTESCOUT IS NOT LIABLE FOR ANY DAMAGES, LOSSES, COPYRIGHT INFRINGEMENTS OR ANY OTHER CONSEQUENCES CAUSED BY IGNORING THE PERMISSIONS OF A PDF DOCUMENT. BY CHANGING THIS OPTION YOU CONFIRM THAT YOU UNDERSTAND ALL OF THE ABOVE AND ARE DOING SO AT YOUR OWN RISK.
Public property ComHelpers | Set of utility functions and properties for use from COM/ActiveX.
Public property ContentType | Returns the content type of the PDF document: normal document, portfolio, or XFA form. To extract files from a PDF portfolio, use the AttachmentExtractor class. To extract XFA form content, use the XFAFormExtractor class.
Public property EmbeddedFileCount | Obsolete. The property is disabled to speed up document loading. Use AttachmentExtractor to work with attachments.
Public property Encrypted | Gets whether the document is encrypted.
Public property ExtractionArea | Sets the extraction area by coordinates and dimensions (left, top, width, height).
Public property ExtractionAreaRect | Sets the extraction area by rectangle.
Public property ExtractionAreaUsageMode | Gets or sets how the ExtractionArea is used: whether to extract any object intersecting the area or only objects located completely inside the area.
Public property IsDocumentLoaded | Gets whether a document is loaded.
Public property LicenseInfo | Gets license information.
Public property PageDataCaching | Controls page data caching behavior.
Public property Password | PDF document password.
Public property Profiles | Comma-separated list of profiles to apply to the extractor. Profiles must be loaded beforehand.
Public property RegistrationKey | Registration key.
Public property RegistrationName | Registration name.
Public property Version | Gets the component version number.
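
The two behaviors described for ExtractionAreaUsageMode can be illustrated with plain rectangle geometry. The sketch below does not call the library and is not its implementation: the AreaUsage enum, its member names, and the IsSelected helper are hypothetical stand-ins, and the only assumption is that both the extraction area and each object are axis-aligned rectangles given as (left, top, width, height).

```csharp
using System;
using System.Drawing;

// Hypothetical stand-in for the two usage modes; the real enum's member names may differ.
public enum AreaUsage
{
    Intersecting, // select any object that overlaps the area
    FullyInside   // select only objects located completely inside the area
}

public static class ExtractionAreaSketch
{
    // Returns true when an object with the given bounds would be selected under the given mode.
    public static bool IsSelected(RectangleF area, RectangleF objectBounds, AreaUsage mode)
    {
        return mode == AreaUsage.FullyInside
            ? area.Contains(objectBounds)
            : area.IntersectsWith(objectBounds);
    }

    public static void Main()
    {
        var area = new RectangleF(0f, 0f, 100f, 100f);
        var straddling = new RectangleF(90f, 90f, 20f, 20f); // crosses the area's edge
        var inside = new RectangleF(10f, 10f, 20f, 20f);     // entirely within the area

        Console.WriteLine(IsSelected(area, straddling, AreaUsage.Intersecting)); // True
        Console.WriteLine(IsSelected(area, straddling, AreaUsage.FullyInside));  // False
        Console.WriteLine(IsSelected(area, inside, AreaUsage.FullyInside));      // True
    }
}
```

An object that straddles the boundary is selected only in the intersecting mode, so the stricter mode is the safer choice when neighboring content must not leak into the result.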
Methods
Name | Description
Public method CreateProfile(String, Boolean, Boolean, Boolean) | Creates a JSON profile with all extractor properties and their current values.
Public method CreateProfile(String, String, Boolean, Boolean, Boolean) | Creates a JSON profile with all extractor properties and their current values.
Public method Dispose | Releases the unmanaged resources used by the instance and optionally releases the managed resources.
Public method Equals | (Inherited from Object.)
Protected method Finalize | (Inherited from Object.)
Protected method FireParsingError |
Protected method FireProgressChanged |
Public method GetHashCode | (Inherited from Object.)
Public method GetPageCount | Returns the document page count.
Public method GetPageRect_Height | Gets the height of the specified page.
Public method GetPageRect_Left | Gets the left coordinate of the specified page.
Public method GetPageRect_Top | Gets the top coordinate of the specified page.
Public method GetPageRect_Width | Gets the width of the specified page.
Public method GetPageRectangle(Int32) | Gets the page rectangle in PDF points (1 point = 1/72 in.).
Public method GetPageRectangle(Int32, Boolean) | Gets the page rectangle in PDF points (1 point = 1/72 in.).
Public method GetPageRotationAngle | Returns the rotation angle of the specified page.
Public method GetType | (Inherited from Object.)
Public method IsEncrypted | Gets whether the document is encrypted.
Public method LoadAndApplyProfiles | Loads profiles from a JSON string and automatically applies them. Note that profiles containing detection keywords are deferred until extraction.
Public method LoadDocumentFromFile | Loads a PDF document from the specified file.
Public method LoadDocumentFromStream | Loads a PDF document from the provided stream.
Public method LoadDocumentFromVariant | Loads a PDF document from a byte array presented as an array of Variant or Byte objects ('Variant()' or 'Byte()'). This is the COM/ActiveX-compatible version of LoadDocumentFromStream(Stream) for in-memory processing of PDF files.
Public method LoadProfiles | Loads profiles from a JSON file.
Public method LoadProfilesFromString | Loads profiles from a JSON string.
Protected method MemberwiseClone | (Inherited from Object.)
Public method Reset | Resets the instance, disposes of internal resources, and releases the file. Use this method before loading another PDF file.
Public method ResetExtractionArea | Resets the extraction area to the full page.
Public method SetExtractionArea(RectangleF) | Sets the extraction area by rectangle.
Public method SetExtractionArea(Double, Double, Double, Double) | Sets the extraction area by coordinates and dimensions.
Public method SetExtractionArea(Single, Single, Single, Single) | Sets the extraction area by coordinates and dimensions.
Public method ToString | (Inherited from Object.)
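
GetPageRectangle reports page dimensions in PDF points, where 1 point = 1/72 inch. A minimal, library-independent sketch of converting such values to inches and millimeters follows. The PdfUnits class and its method names are hypothetical helpers, and the 612 x 792 sample is simply US Letter size expressed in points, not output obtained from the extractor.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the Bytescout API.
public static class PdfUnits
{
    private const double PointsPerInch = 72.0;       // 1 point = 1/72 inch
    private const double MillimetersPerInch = 25.4;

    public static double PointsToInches(double points)
    {
        return points / PointsPerInch;
    }

    public static double PointsToMillimeters(double points)
    {
        return points / PointsPerInch * MillimetersPerInch;
    }

    public static void Main()
    {
        // US Letter page size in points (sample values, not read from a document).
        double widthPt = 612.0;
        double heightPt = 792.0;
        var inv = CultureInfo.InvariantCulture;

        Console.WriteLine(PointsToInches(widthPt).ToString("F2", inv));      // 8.50
        Console.WriteLine(PointsToInches(heightPt).ToString("F2", inv));     // 11.00
        Console.WriteLine(PointsToMillimeters(widthPt).ToString("F1", inv)); // 215.9
    }
}
```

The same conversion applies in reverse when an extraction area is specified in physical units: multiply inches by 72 to obtain points before passing the values on.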
Events
Name | Description
Public event ParsingError | Raised on PDF document parsing errors. This usually indicates a damaged document.
Public event PasswordRequired | Occurs when a password is required to decrypt the document.
Public event ProgressChanged | Raised for each reported progress value. Allows the processing to be canceled.
Fields
Name | Description
Protected field ExtractionAreaInternal |
Inheritance Hierarchy
System.Object
  Bytescout.PDFExtractor.BaseExtractor
    Bytescout.PDFExtractor.AnnotationExtractor
    Bytescout.PDFExtractor.AttachmentExtractor
    Bytescout.PDFExtractor.BaseTextExtractor
    Bytescout.PDFExtractor.ImageExtractor
    Bytescout.PDFExtractor.LineDetector
    Bytescout.PDFExtractor.MultimediaExtractor
    Bytescout.PDFExtractor.OCRAnalyzer
    Bytescout.PDFExtractor.PDFAValidator
    Bytescout.PDFExtractor.SearchablePDFMaker
    Bytescout.PDFExtractor.TableDetector2
    Bytescout.PDFExtractor.UnsearchablePDFMaker
    Bytescout.PDFExtractor.XFAFormExtractor