StructuredExtractor Methods

The StructuredExtractor type exposes the following members.

Methods

	Name	Description
	AddFilter(String, Boolean, Boolean)	Adds a filter to remove text from extracted data. (Inherited from BaseTextExtractor.)
	AddFilter(String, Int32, Boolean)	Adds a filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.)
	AddFilter(String, Int32, Color, Boolean)	Adds a filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.)
	AddFilter(String, String, Boolean, Boolean)	Adds a filter to replace text in extracted data. (Inherited from BaseTextExtractor.)
	AddFilter(String, Int32, Int32, Int32, Int32, Boolean)	Adds a filter to exclude text objects with specified attributes. (Inherited from BaseTextExtractor.)
	CreateProfile(String, Boolean, Boolean, Boolean)	Creates a JSON profile with all extractor properties set to their current values. (Inherited from BaseExtractor.)
	CreateProfile(String, String, Boolean, Boolean, Boolean)	Creates a JSON profile with all extractor properties set to their current values. (Inherited from BaseExtractor.)
	Dispose	Releases the unmanaged resources used by the instance and optionally releases the managed resources. (Inherited from BaseExtractor.)
	DisposePage	Disposes the page object. Use this method carefully, and only for a page object that will not be used again. Useful for freeing allocated memory when processing huge PDF documents. (Inherited from BaseTextExtractor.)
	Equals	(Inherited from Object.)
	Finalize	(Inherited from Object.)
	FireParsingError	(Inherited from BaseExtractor.)
	FireProgressChanged	(Inherited from BaseExtractor.)
	GetCellValue	Returns value of specified cell of the table structure.
	GetColumnCount	Returns number of columns in the specified row of the table structure of the document.
	GetHashCode	(Inherited from Object.)
	GetPageCount	Returns document page count. (Inherited from BaseExtractor.)
	GetPageRect_Height	Gets the specified page height. (Inherited from BaseExtractor.)
	GetPageRect_Left	Gets the specified page left coordinate. (Inherited from BaseExtractor.)
	GetPageRect_Top	Gets the specified page top coordinate. (Inherited from BaseExtractor.)
	GetPageRect_Width	Gets the specified page width. (Inherited from BaseExtractor.)
	GetPageRectangle(Int32)	Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.)
	GetPageRectangle(Int32, Boolean)	Gets the page rectangle in PDF Points (1 Point = 1/72 in.). (Inherited from BaseExtractor.)
	GetPageRotationAngle	Returns the rotation angle of specified page. (Inherited from BaseExtractor.)
	GetPreprocessedPagePreview	Returns preview image of document page with preprocessing filters applied. (Inherited from BaseTextExtractor.)
	GetRowCount	Returns number of rows in the table structure of the document.
	GetType	(Inherited from Object.)
	IsEncrypted	Gets the document encrypted state. (Inherited from BaseExtractor.)
	IsOCRRecommendedForPage	Detects whether OCR is recommended for the specified page. OCR (Optical Character Recognition) is recommended when a page has no text objects but has an image that might contain text. (Inherited from BaseTextExtractor.)
	LoadAndApplyProfiles	Loads profiles from JSON string and automatically applies them. Note that profiles containing detection keywords will be deferred until the extraction. (Inherited from BaseExtractor.)
	LoadDocumentFromFile	Loads PDF document from specified file. (Inherited from BaseExtractor.)
	LoadDocumentFromStream	Loads PDF document from provided stream. (Inherited from BaseExtractor.)
	LoadDocumentFromVariant	Loads a PDF document from a byte array presented as an array of Variant or Byte objects ('Variant()' or 'Byte()'). This is the COM/ActiveX-compatible version of the method LoadDocumentFromStream(Stream) for in-memory processing of PDF files. (Inherited from BaseExtractor.)
	LoadProfiles	Loads profiles from JSON file. (Inherited from BaseExtractor.)
	LoadProfilesFromString	Loads profiles from JSON string. (Inherited from BaseExtractor.)
	MemberwiseClone	(Inherited from Object.)
	PerformTextAnalysis	(Inherited from BaseTextExtractor.)
	PrepareStructure	Prepares the table structure for iteration. Use it before calling GetRowCount(), GetColumnCount(), GetCellValue() methods.
	Reset	Resets the instance and disposes internal resources. Also automatically invoked by Dispose. (Inherited from BaseTextExtractor.)
	ResetBaseExtractionData	(Inherited from BaseTextExtractor.)
	ResetExtractionArea	Resets the extraction area to the full page. (Inherited from BaseExtractor.)
	ResetFilters	Resets text filters. (Inherited from BaseTextExtractor.)
	SavePreprocessedPagePreview	Saves preview image of document page with preprocessing filters applied. Image is saved in PNG format. (Inherited from BaseTextExtractor.)
	SetCustomExtractionColumns	Helper method to set the CustomExtractionColumns property when using the extractor through COM from VC++, VB, VBA, VBScript, or Delphi. (Inherited from BaseTextExtractor.)
	SetExtractionArea(RectangleF)	Sets the extraction area by rectangle. (Inherited from BaseExtractor.)
	SetExtractionArea(Double, Double, Double, Double)	Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.)
	SetExtractionArea(Single, Single, Single, Single)	Sets the extraction area by coordinates and dimensions. (Inherited from BaseExtractor.)
	ToString	(Inherited from Object.)
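
The PrepareStructure entry above says to call it before GetRowCount(), GetColumnCount(), and GetCellValue(). A minimal sketch of that call order follows. This page lists only the method names, so the argument lists used here (a zero-based page index, then row and column indexes) are assumptions, as are the RegistrationName and RegistrationKey properties and the file name "sample.pdf". Check them against the individual method pages before relying on this.

```csharp
using System;
using Bytescout.PDFExtractor;

class StructuredExtractorSketch
{
    static void Main()
    {
        StructuredExtractor extractor = new StructuredExtractor();
        try
        {
            // Assumed registration properties; not listed on this page.
            extractor.RegistrationName = "demo";
            extractor.RegistrationKey = "demo";

            // "sample.pdf" is a hypothetical input path.
            extractor.LoadDocumentFromFile("sample.pdf");

            // Assumption: page indexes are zero-based.
            for (int page = 0; page < extractor.GetPageCount(); page++)
            {
                // Build the table structure before querying rows, columns, or cells.
                extractor.PrepareStructure(page);

                int rowCount = extractor.GetRowCount(page);
                for (int row = 0; row < rowCount; row++)
                {
                    // The column count is queried per row, since rows may differ.
                    int columnCount = extractor.GetColumnCount(page, row);
                    for (int column = 0; column < columnCount; column++)
                    {
                        Console.Write(extractor.GetCellValue(page, row, column));
                        Console.Write('\t');
                    }
                    Console.WriteLine();
                }
            }
        }
        finally
        {
            // Dispose also invokes Reset, per the table above.
            extractor.Dispose();
        }
    }
}
```

Querying the column count inside the row loop follows the GetColumnCount description, which reports columns "in the specified row" rather than for the table as a whole.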

Reference

StructuredExtractor Class

Bytescout.PDFExtractor Namespace