OCR in PowerBuilder 2022R3

1 month 3 weeks ago #520 by Ramón San Félix Ramón
Ramón San Félix Ramón created the code: OCR in PowerBuilder 2022R3
Recently, after publishing my article How to do OCR in PowerBuilder 2022R3 (C# OCR Tesseract), I received an interesting message from Oscar Francisco Hernández. He showed me an innovative way to capture document information: a pop-up window that lets you select a specific portion of text and then process it with OCR.

The central concept of this implementation is a file viewer that allows you to view documents in formats such as PDF, images or plain text files. It uses a WebBrowser control and offers the functionality of selecting a specific area to perform OCR and extract the corresponding text.

Let's imagine that we open a scanned PDF file on our computer, where it is not possible to select the text to copy and paste. When we press the OCR button, a semi-transparent yellow pop-up window appears that lets us delimit the area of interest in the text. When we double-click it, a message box shows the text recognized by OCR using the Tesseract C# library.
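
As a rough illustration of the recognition step only, here is a minimal C# sketch. It assumes the "Tesseract" NuGet wrapper and a tessdata folder containing eng.traineddata. The class name OcrSketch, the method RecognizeText and the command-line arguments are hypothetical and are not taken from the ImageOCR project, whose actual code may differ.

```csharp
using System;
using Tesseract; // Assumption: the "Tesseract" NuGet wrapper package is referenced

public static class OcrSketch
{
    // Hypothetical helper: runs OCR on an image file (for example the captured
    // BMP or a converted PNG) and returns the recognized text.
    // tessDataPath must point to a folder that contains the .traineddata file
    // for the requested language (for example eng.traineddata).
    public static string RecognizeText(string imagePath, string tessDataPath, string language = "eng")
    {
        using (var engine = new TesseractEngine(tessDataPath, language, EngineMode.Default))
        using (var image = Pix.LoadFromFile(imagePath))
        using (var page = engine.Process(image))
        {
            return page.GetText().Trim();
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length < 1)
        {
            Console.Error.WriteLine("Usage: OcrSketch <imagePath> [tessDataPath] [language]");
            return;
        }

        // Defaults are illustrative assumptions, not values from the project.
        string tessDataPath = args.Length > 1 ? args[1] : "./tessdata";
        string language = args.Length > 2 ? args[2] : "eng";

        Console.WriteLine(RecognizeText(args[0], tessDataPath, language));
    }
}
```

In a setup like the one described here, the image passed in would be the area captured from the selection window, and the returned string is what would end up in the message box.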

To achieve this effect, I have used two examples from the Topwiz Software page:

1- Bitmap: www.topwizprogramming.com/freecode_bitmap.html
This example demonstrates how to capture a BMP of a control, window or screen, returning the image as a blob variable that can be saved to disk as a .bmp file. This function can optionally save the image to the clipboard.

2- Resize Response: www.topwizprogramming.com/freecode_resize_response.html
This example makes a response-type window resizable.

The combination of these examples allows us to create a floating window to delimit a text area and capture the resulting image for processing with the Tesseract library.

In addition, I have added a checkbox called "Clipboard". When it is checked, the captured image is not first saved to disk as a BMP file but is left on the clipboard instead. To process the image from the clipboard and convert it to PNG, I needed to develop a C# library called ImageFromClipboard.dll. Although this option is not essential for OCR, I found it an interesting idea for future exploration.

As a bonus, I've incorporated a button to perform OCR on a PDF (limited to the first page). For this, I have developed the ImageFromPdf.dll library to convert the PDF into a PNG file, making it easier to process with Tesseract.

Finally, I have included a dedicated button for the PDF to PNG conversion feature, which can be useful for a variety of purposes.

As always, I leave you the current project link on GitHub:

github.com/rasanfe/pbImageOCR

And the Visual Studio 2022 project:

github.com/rasanfe/ImageOCR

Attached here is the project compiled today in PowerBuilder 2022R3 Build 3289 along with the Visual Studio Project.

I always recommend going to the GitHub links to find the latest version.

To keep up with what I publish, you can follow my blog in Spanish:

rsrsystem.blogspot.com
