PDF OCR Turns PDF Documents Into Text - gHacks Tech News

Text in a PDF document sometimes cannot be selected in a PDF reader like Adobe Reader or Foxit Reader. This is usually the case with scanned documents, which are embedded into the PDF file as images rather than as text.

One option for working with the text in those PDF documents is to use OCR technology to convert the information into text you can work with.

OCR stands for optical character recognition. It uses an algorithm to identify the characters displayed in a PDF file and exports them to a plain text document or another supported file format.
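To make the idea more concrete, here is a toy sketch of character recognition by template matching. It is not how PDF OCR works internally (the developer does not document that), and the 3x5 glyph bitmaps, the `TEMPLATES` table and the helper names are all made up for illustration. Real OCR engines use far more elaborate models.

```python
# Toy OCR: classify tiny 3x5 bitmaps by comparing them with known templates.
# TEMPLATES and all function names are hypothetical, for illustration only.

TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "O": ("###", "#.#", "#.#", "#.#", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def distance(glyph, template):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(
        a != b
        for glyph_row, template_row in zip(glyph, template)
        for a, b in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character closest to the scanned glyph."""
    return min(TEMPLATES, key=lambda char: distance(glyph, TEMPLATES[char]))


def recognize_line(glyphs):
    """Turn a sequence of glyph bitmaps into plain text."""
    return "".join(recognize(glyph) for glyph in glyphs)


# A scanned "O" with one damaged pixel in the bottom-right corner.
noisy_o = ("###", "#.#", "#.#", "#.#", "##.")
scanned = [TEMPLATES["L"], noisy_o, TEMPLATES["T"]]
print(recognize_line(scanned))  # prints "LOT"
```

The damaged "O" is still recognized because it differs from the "O" template by a single pixel. With heavier damage the nearest template can be the wrong one, which is why OCR output usually needs proofreading.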

PDF OCR is a free software program for the Windows operating system that can turn PDF documents into editable text.

Update: The most recent free version of PDF OCR is severely limited. The PDF OCR tool can only process three pages, and the image-to-PDF tool displays a big watermark in the resulting PDF document. This makes the free version of the program unusable for most tasks.

The interface is divided into two areas that are independent of each other. The first window loads the PDF document and displays its contents. All pages are displayed on the left, and it is possible to read the PDF right on the screen.

The Start OCR button displays a configuration window for the OCR process. It is possible to OCR all pages, a selection of pages or only the current page.
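The three choices can be pictured as a small page-selection parser. The sketch below is only an illustration: the `parse_page_selection` function, the "all" and "current" keywords and the "1-3,7" range syntax are assumptions, not the format PDF OCR actually uses in its dialog.

```python
def parse_page_selection(spec, total_pages, current_page=1):
    """Expand a selection such as "all", "current" or "1-3,7" into page numbers.

    Hypothetical helper: the keywords and the range syntax are assumptions.
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, total_pages + 1))
    if spec == "current":
        return [current_page]

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # "4" is a single page, "1-3" is an inclusive range.
        first, _, last = part.partition("-")
        start = int(first)
        end = int(last) if last else start
        if start > end or start < 1 or end > total_pages:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_page_selection("all", 4))                       # prints [1, 2, 3, 4]
print(parse_page_selection("current", 10, current_page=6))  # prints [6]
print(parse_page_selection("1-3, 7", 10))                   # prints [1, 2, 3, 7]
```

Collecting the pages in a set means overlapping ranges such as "1-3,2-4" are processed only once per page.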

The progress and status are displayed right in the window, and all processed pages are displayed in the second window afterwards.

The PDF OCR Editor is a basic text editor that can, in theory, be used to edit the text right away. The OCR process inevitably misinterprets some of the characters, which have to be corrected afterwards.
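Misread characters can often be spotted automatically by comparing each word with a word list. The following sketch uses Python's standard `difflib` module for that. The tiny `VOCABULARY`, the `suggest_corrections` function and the sample sentence are made up for illustration, and nothing here is part of PDF OCR itself.

```python
import difflib
import re

# Hypothetical mini vocabulary; a real check would use a full dictionary.
VOCABULARY = ["character", "document", "optical", "recognition", "text"]


def suggest_corrections(text, vocabulary=VOCABULARY):
    """Map each unknown word to its closest vocabulary entry, if one is close enough."""
    suggestions = {}
    for word in re.findall(r"[A-Za-z]+", text):
        lowered = word.lower()
        if lowered in vocabulary:
            continue
        # cutoff=0.75 is an arbitrary similarity threshold chosen for this example.
        matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=0.75)
        if matches:
            suggestions[word] = matches[0]
    return suggestions


# "e" misread as "c", and "m" misread as "rn".
ocr_output = "optical charactcr recognition of a docurnent"
print(suggest_corrections(ocr_output))
# prints {'charactcr': 'character', 'docurnent': 'document'}
```

Short words such as "of" and "a" are not in the vocabulary either, but they are too dissimilar from every entry to produce a suggestion, so they are left alone.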

The text editor can also export the converted text as a text or doc file, which offers a second way of editing the text: in another program.

It usually makes sense to save the processed PDF as a doc file and load it into a word processing application like Microsoft Word, which offers spelling and grammar checking.

PDF OCR is a convenient program that offers its users a fast and easy way of turning PDF documents into text. The program supports ten different languages and is compatible with all 32-bit and 64-bit editions of the Microsoft Windows operating system.

An alternative is Free OCR Scanning, an online service that can process PDF files, among other formats.

Summary
Author Rating: 1 out of 5 stars
Software Name: PDF OCR Free
Operating System: Windows


Comments

  1. PDF OCR said on March 11, 2010 at 5:32 pm

    Thank you for your article. I will do my best to add the grammar feature on next version

  2. DanTe said on March 11, 2010 at 8:09 pm

    Just tried it. Scanned it with Avira and McAfee, no detections. Tried it on a convoluted government PDF doc. Works beautifully.

    Author of the software might want to note that the install path is pdfPCR. I believe it should be corrected to pdfOCR?

  3. Eric said on March 12, 2010 at 2:23 am

    Tried it too. Great suggestion. Will use often.

  4. David Levin said on March 14, 2010 at 6:31 pm

    Doesn’t Acrobat have its own OCR implementation when you scan documents directly in Acrobat?

  5. Luiz said on April 7, 2010 at 12:26 am

    It shuts down automatically before even OCR begins.
    Does not work, unfortunately.

  6. Lorraine said on August 9, 2015 at 8:43 pm

    Does not work.

    UI does not look like that in tutorial.

    Result is a pdf page, which states that programme is not registered. Only way to register is to buy it.

    So it is free to download, but not free to use!!!

    I HATE it when people abuse my time like that!! Why not state upfront that the programme is not free??? Even if I had the money, I will NOT buy from a supplier that treat me like this!!!

    Why does GHack states that it is free???

    1. Martin Brinkmann said on August 9, 2015 at 10:55 pm

      Lorraine, the article is from 2010 and was last updated in 2012. It is likely that the developer made a change that we have not reviewed yet. Thanks to your input, we will do so and update the article accordingly.
