
Save OCR Objects As XML - VB.NET

Text Recognition SDK sample in VB.NET demonstrating ‘Save OCR Objects As XML’

Module1.vb
Imports System
Imports System.Diagnostics
Imports Bytescout.TextRecognition

Module Module1

    Sub Main()

        Dim inputDocument As String = ".\ocr-sample.pdf"
        Dim outputDocument As String = ".\result.xml"

        ' Create and activate TextRecognizer instance
        Using textRecognizer As TextRecognizer = New TextRecognizer("demo", "demo")

            Try
                ' Load document (image or PDF)
                textRecognizer.LoadDocument(inputDocument)

                ' Set the location of OCR language data files
                textRecognizer.OCRLanguageDataFolder = "c:\Program Files\ByteScout Text Recognition SDK\ocrdata_best\"

                ' Set the OCR language.
                ' "eng" for English, "deu" for German, "fra" for French, "spa" for Spanish, etc., matching the files in the "ocrdata" folder
                ' Find more language files at https://github.com/bytescout/ocrdata
                textRecognizer.OCRLanguage = "eng"

                ' Recognize text on the page and save the recognized word objects to an XML file
                textRecognizer.SaveOCRObjectsAsXML(outputDocument, 0, OCRObjectType.Word)

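                ' Open the result file in the default associated application (for demo purposes).
                ' UseShellExecute is set explicitly because it defaults to False on .NET Core / .NET 5+.
                Process.Start(New ProcessStartInfo(outputDocument) With {.UseShellExecute = True})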

            Catch exception As Exception

                Console.WriteLine(exception)

            End Try

        End Using

    End Sub

End Module
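
The sample above writes the recognized objects to result.xml but does not show how to inspect them. The following is a minimal sketch of a separate console program that loads that file with the standard System.Xml.Linq classes and prints how many elements of each name it contains. The element and attribute names used by the SDK are not documented here, so the sketch assumes nothing about the schema; the module name InspectResult and the file path are illustrative assumptions.

```vbnet
Imports System
Imports System.IO
Imports System.Linq
Imports System.Xml.Linq

Module InspectResult

    Sub Main()

        ' Assumption: this is the file written by the sample above
        Dim resultPath As String = ".\result.xml"

        If Not File.Exists(resultPath) Then
            Console.WriteLine("File not found: " & resultPath)
            Return
        End If

        ' Load the whole XML document into memory
        Dim xml As XDocument = XDocument.Load(resultPath)

        ' Group all elements by local name; no particular schema is assumed
        Dim counts = xml.Descendants().
            GroupBy(Function(element) element.Name.LocalName).
            OrderByDescending(Function(grp) grp.Count())

        ' Print each element name with its number of occurrences
        For Each entry In counts
            Console.WriteLine("{0}: {1}", entry.Key, entry.Count())
        Next

    End Sub

End Module
```

Running this after the main sample gives a quick overview of the output structure, which can help when deciding which elements to query in further processing.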
