Bytescout.PDFExtractor Namespace

Classes

	Class	Description
	AnnotationExtractor	Extracts annotations from PDF file.
	AnnotationInfo	Defines various annotation information.
	AttachmentExtractor	Extracts attachments from PDF file.
	AttachmentInfo	Defines various attachment information.
	BaseExtractor	Defines a base class for PDF extractors.
	BaseTextExtractor	Defines a base class for text-related PDF extractors.
	BookmarkRemover	Represents a tool that removes bookmarks from a PDF document.
	ComHelpers	Class containing helping methods to use the SDK as ActiveX object from VBScript, VBA, VB6, Delphi, Visual C++.
	CSVExtractor	Represents PDF to CSV extractor. Also able to extract data from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
	DetectedSensitiveItem	Represents a single item found during sensitive data detection. See SensitiveDataDetector.
	DetectedTable	Represents a table detected by TableDetector2.
	DetectedTableList	Represents list of tables detected by TableDetector2.
	DocumentMerger	Represents PDF document merger.
	DocumentOptimizer	Represents PDF document optimizer.
	DocumentRotator	Represents PDF document rotator.
	DocumentRotatorPageAndAngle	Defines page index and angle pair used to set rotation for individual pages in the DocumentRotator.Rotate method.
	DocumentSplitter	Represents PDF and TIFF document splitter.
	DocumentSplitter2	Represents PDF document splitter that splits a document by pages containing specific text.
	FormFieldInfo	Defines form fields properties.
	FoundLine	Represents a line object found by LineDetector.
	FoundLinesCollection	Represents collection of lines found by LineDetector.
	ImageExtractor	Extracts images from PDF document.
	ImagePreprocessingFiltersCollection	Collection of image preprocessing filters.
	InfoExtractor	Provides information about PDF document.
	JSONExtractor	Represents PDF to JSON extractor.
	LicenseInfo	License information.
	LineDetector	Represents line detector.
	Logger
	MultimediaExtractor	Extracts videos from PDF document.
	OCRAnalysisResults	Represents results of the analysis performed by OCRAnalyzer.
	OCRAnalyzer	Represents OCR Analyzer. It analyzes scanned documents in PDF or raster image formats to find the Optical Character Recognition (OCR) parameters that give the highest recognition quality.
	OCRCell	Represents OCR cell (word) data.
	OCRCorrection	Represents a correction automatically applied to recognized text to fix repeating recognition errors.
	OCRCorrectionList	Represents collection of corrections automatically applied to recognized text to fix repeating recognition errors.
	OptimizationOptions	Represents options for the PDF document optimizer.
	PDFAValidator	Represents PDF/A validator.
	PDFExtractorCancellationException	Cancellation exception.
	PDFExtractorDamagedDocumentException	Damaged document exception.
	PDFExtractorException	Represents errors that occur during PDF extraction process.
	PDFExtractorInvalidPasswordException	Invalid password exception.
	PDFExtractorPermissionsException	Permissions exception.
	PDFExtractorProfileException
	Remover	Utility class to remove objects from PDF document.
	Remover2	Utility class to remove text and image objects from PDF document. Improved version of Remover class.
	SearchablePDFMaker	Represents Searchable PDF Maker tool.
	SearchResult	Defines result of the text search.
	SearchResultElement	Defines the search result element.
	SensitiveDataDetectionResults	Represents results of sensitive data detection. See SensitiveDataDetector.
	SensitiveDataDetector	Class that detects sensitive data in PDF documents.
	SensitiveDataPolicy	Sensitive data policy information. See GetSensitiveDataPolicies.
	Stamper	Stamps PDF document with an image.
	StructuredExtractor	Represents the table structure extractor.
	TableDetector	Represents PDF tables detector.
	TableDetector2	Represents experimental detector of tables in PDF documents.
	TextAnalysisResults	Text analysis results.
	TextComparer	Represents PDF text comparer.
	TextComparerDiffPiece
	TextExtractor	Represents PDF to Text extractor. Also able to extract text from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
	UnsearchablePDFMaker	Represents Unsearchable PDF Maker tool.
	XFAFormExtractor	Extracts XFA Form attachments from PDF file.
	XFDFExtractor	Represents forms data extractor in XFDF (XML Forms Data Format) format.
	XLSExtractor	Represents PDF to XLS extractor. Also able to extract data from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).
	XMLExtractor	Represents PDF to XML extractor. Also able to extract data from PNG, JPEG, BMP and TIFF (single-page) images using Optical Character Recognition (OCR).

Interfaces

	Interface	Description
	IAnnotationExtractor	Defines annotation extractor.
	IAttachmentExtractor	Defines the PDF attachment extractor interface.
	IAttachmentInfo	Defines various attachment information.
	IBaseExtractor	Defines a base interface for PDF extractors.
	IBaseOCRExtractor	Defines a base interface for PDF extractors that support OCR.
	IBaseTextExtractor	Defines a base interface for PDF text extractors.
	IBookmarkRemover	Defines the interface of a tool that removes bookmarks from a PDF document.
	ICSVExtractor	Defines the PDF to CSV extractor interface.
	IDocumentMerger	Represents PDF document merger.
	IDocumentOptimizer	Represents PDF document optimizer.
	IDocumentRotator	Represents PDF document rotator.
	IDocumentSplitter	Represents PDF document splitter.
	IDocumentSplitter2	Represents PDF document splitter that splits a document by pages containing specific text.
	IExtractionArea	Defines extraction area support for extractors.
	IFoundLine	Represents a line object found by LineDetector.
	IFoundLinesCollection	Represents collection of lines found by LineDetector.
	IImageExtractor	Defines the image extractor interface.
	IImagePreprocessingFiltersCollection
	IInfoExtractor	Defines the PDF info extractor interface.
	IJSONExtractor	Defines the PDF to JSON extractor interface.
	ILineDetector	Represents line detector.
	IMultimediaExtractor	Defines the video extractor interface.
	IOCRAnalyzer
	IOptimizationOptions	Represents options for the PDF document optimizer.
	IProfiles	Defines profiles support.
	IRemover	Defines the interface of the Remover utility class.
	IRemover2	Defines the interface of the Remover2 utility class.
	ISearchablePDFMaker
	ISearchResult	Defines search result interface.
	ISearchResultElement	Defines the search result element interface.
	ISensitiveDataDetector	Defines the interface of the detector of sensitive data in PDF documents.
	IStamper	Interface of Stamper utility class. Allows you to add a stamp or sign picture to PDF document pages.
	IStructuredExtractor	Defines the table structure extractor interface.
	ITableDetector	Represents PDF tables detector.
	ITextExtractor	Defines the PDF to Text extractor interface.
	IUnsearchablePDFMaker
	IXFAFormExtractor	Defines the XFA Form attachments extractor interface.
	IXFDFExtractor	Defines the interface of the forms data extractor in XFDF (XML Forms Data Format) format.
	IXLSExtractor	Defines XLS extractor interface.
	IXMLExtractor	Defines the PDF to XML extractor interface.

Delegates

	Delegate	Description
	BaseExtractorParsingErrorEventHandler	Defines ParsingError event parameters.
	BaseExtractorProgressEventHandler	Defines Progress event parameters.
	DocumentOptimizerProgressEventHandler	Defines progress event parameters.
	DocumentSplitterProgressEventHandler	Defines progress event parameters.
	OCRAnalyzerProgressEventHandler	Defines Progress event parameters.
	PasswordEventHandler	Represents parameters for PasswordRequired event.

Enumerations

	Enumeration	Description
	AudioType	Defines embedded audio resource types.
	ColumnDetectionByTextAlignment	Defines text alignments for detection of table column. See ColumnDetectionByTextAlignment property.
	ColumnDetectionMode	Defines how columns are detected on the document page.
	EmbeddedImageFormat	Image format to convert PDF pages to.
	ExtractionAreaUsageMode	Defines how extraction area (if any) is treated when doing text extraction or text search.
	Graphics3DType	Defines embedded 3D graphics resource types.
	ImageHandling	Defines the image handling way during the XML extraction.
	ImageOptimizationFormat	Defines image compression types used for image optimization in PDF.
	InfoExtractorPDFEncryptionAlgorithm	PDF encryption algorithm.
	LineGroupingMode	Defines whether lines are left unmerged, merged by rows, or merged inside columns.
	LineOrientation	Represents line types.
	LineOrientationsToFind	Defines which line orientations LineDetector should look for.
	OCRCacheMode	OCR results caching behavior. Turned off by default (no cache is used). The "WholePage" caching mode may save processing time: the SDK checks whether it needs to re-run OCR on the page or can reuse previously cached OCR results.
	OCRMode	OCR (Optical Character Recognition) usage mode.
	OngoingOperation	The ongoing operation for ProgressChanged event.
	OutputImageFormat	Defines format for output images.
	OutputStructure
	PageDataCaching	Page data caching behaviour.
	PDFContentType	Defines PDF content types.
	RotationAngle	Represents angle for document rotation.
	SensitiveDataReportFormat	Defines formats of sensitive data detection report.
	SpreadseetOutputFormat	Defines spreadsheet output formats.
	TextAnalysisStatus	Defines statuses of the text analysis. See TextAnalysisResults.
	TextComparerChangeType
	VideoType	Defines embedded video resource types.
	WordMatchingMode	Word matching mode (for search).
	XFAFormContentType	Specifies XFA Form content part types.