PDF OCR Turns PDF Documents Into Text
It sometimes happens that text in a PDF document cannot be selected in a PDF reader like Adobe Reader or Foxit Reader. This is usually the case with scanned documents that have been embedded into the PDF file as images.
One way to work with the text in those PDF documents is to use OCR technology to convert the scanned pages into text you can edit.
OCR stands for optical character recognition. It uses an algorithm to identify the characters displayed in a PDF file and exports them to a plain text document or another supported file format.
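To make the idea of character recognition more concrete, here is a toy sketch of one of the simplest approaches, template matching: each scanned glyph is compared against known character shapes and the closest one wins. It is an illustration only and not how PDF OCR works internally; the 3x5 bitmaps and the names `TEMPLATES`, `mismatches`, `recognize` and `recognize_line` are made up for this example.

```python
# Toy template-matching OCR. Real OCR engines are far more sophisticated;
# the bitmaps and function names below are hypothetical.

# 3x5 bitmaps for a few glyphs ('#' = ink, '.' = blank).
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
    "O": ("###", "#.#", "#.#", "#.#", "###"),
}


def mismatches(glyph, template):
    """Count the pixels at which two equally sized bitmaps differ."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character that differs least from the glyph."""
    return min(TEMPLATES, key=lambda ch: mismatches(glyph, TEMPLATES[ch]))


def recognize_line(glyphs):
    """Recognize a sequence of glyph bitmaps and join them into a string."""
    return "".join(recognize(glyph) for glyph in glyphs)


# A "T" with one pixel lost in the scan is still closest to the T template.
noisy_t = ("###", ".#.", ".#.", "...", ".#.")
print(recognize_line([TEMPLATES["L"], TEMPLATES["O"], noisy_t]))  # prints "LOT"
```

The more a scan is damaged, the more likely it is that the closest template is the wrong one, which is why OCR output usually needs proofreading.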
PDF OCR is a free software program for the Windows operating system that can turn PDF documents into editable text.
Update: The most recent free version of PDF OCR is severely limited. The PDF OCR tool can only process three pages, and the image to pdf tool displays a big watermark in the resulting PDF document. This makes the free version of the program unusable for most tasks.
The interface is divided into two areas that are independent of each other. The first window loads the PDF document and displays its contents. All pages are listed on the left, and the PDF can be read right on the screen.
The Start OCR button displays a configuration window for the OCR process. It is possible to OCR all pages, a selection of pages or only the current page.
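The three choices (all pages, a selection, or only the current page) boil down to producing a list of page numbers to process. The following minimal sketch shows how such a choice could be resolved. The selection syntax ("1-3,7") and the function name `parse_page_selection` are assumptions made for illustration; the article does not describe how PDF OCR itself accepts page selections.

```python
def parse_page_selection(spec, page_count, current_page=1):
    """Turn "all", "current" or a list like "1-3,7" into sorted 1-based page numbers.

    The selection syntax is hypothetical, not taken from PDF OCR.
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, page_count + 1))
    if spec == "current":
        return [current_page]

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
        else:
            first = last = int(part)
        if first > last or first < 1 or last > page_count:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(first, last + 1))
    return sorted(pages)


print(parse_page_selection("all", 4))                       # [1, 2, 3, 4]
print(parse_page_selection("current", 10, current_page=3))  # [3]
print(parse_page_selection("1-3, 7", 10))                   # [1, 2, 3, 7]
```

Collecting the pages in a set means that overlapping entries such as "2,2-3" are processed only once.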
The progress and status are displayed right in the window, and all processed pages are shown in the second window afterwards.
The PDF OCR Editor is a basic text editor that can theoretically be used to edit the text right away. The OCR process inevitably misinterprets some characters, which have to be corrected afterwards.
The text editor can export the converted text as a text or doc document, which opens up a second way of editing it: in an external application.
It usually makes sense to save the processed PDF as a doc file and load it into a word processor like Microsoft Word, which offers spelling and grammar checking.
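To illustrate why a spell checker helps with OCR output, the sketch below flags words that are not in a vocabulary and suggests close matches using Python's standard `difflib` module. The tiny `VOCABULARY` list and the function name `suggest_corrections` are invented for this example; a real check would rely on a full dictionary, as Word does.

```python
import difflib
import re

# Hypothetical, tiny vocabulary; a real check would use a full dictionary.
VOCABULARY = ["document", "scanned", "text", "the", "is", "editable", "now"]


def suggest_corrections(ocr_text, vocabulary=VOCABULARY, cutoff=0.7):
    """Map each unknown word in ocr_text to its closest vocabulary entries."""
    known = set(vocabulary)
    suggestions = {}
    for word in re.findall(r"[A-Za-z0-9]+", ocr_text.lower()):
        if word not in known and word not in suggestions:
            suggestions[word] = difflib.get_close_matches(
                word, vocabulary, n=3, cutoff=cutoff
            )
    return suggestions


# Typical OCR slips: "m" read as "rn", and "l" read as the digit "1".
print(suggest_corrections("The scanned docurnent is now editab1e text"))
```

For the sample sentence, the sketch suggests "document" for "docurnent" and "editable" for "editab1e". Words with no close match get an empty list and would need manual review.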
PDF OCR is a convenient program that offers its users a fast and easy way of turning PDF documents into text. The program supports ten different languages and is compatible with all 32-bit and 64-bit editions of the Microsoft Windows operating system.
An alternative is Free OCR Scanning, an online service that can process PDF files, among other formats.
Thank you for your article. I will do my best to add the grammar feature in the next version.
Just tried it. Scanned it with Avira and McAfee, no detections. Tried it on a convoluted government PDF doc. Works beautifully.
Author of the software might want to note that the install path is pdfPCR. I believe it should be corrected to pdfOCR?
Tried it too. Great suggestion. Will use often.
Doesn’t Acrobat have its own OCR implementation when you scan documents directly in Acrobat?
It shuts down automatically before OCR even begins.
Does not work, unfortunately.
Does not work.
UI does not look like that in tutorial.
The result is a PDF page which states that the programme is not registered. The only way to register is to buy it.
So it is free to download, but not free to use!!!
I HATE it when people abuse my time like that!! Why not state upfront that the programme is not free??? Even if I had the money, I would NOT buy from a supplier that treats me like this!!!
Why does GHack state that it is free???
Lorraine, the article is from 2010 and was last updated in 2012. It is likely that the developer made a change that we have not reviewed yet. Thanks to your input, we will do so and update the article accordingly.