If you have ever had to copy text from an image or scanned document, you know that you have two basic options.
You can either copy the text manually, which may take quite some time depending on its length and the quality of the document, or you can use OCR software instead.
Optical Character Recognition software can speed up the process. It is not infallible, and you still need to go through the text to correct any mistakes made during recognition, but it may save you a lot of time.
We recently reviewed Project Naptha for Google Chrome, which adds that functionality to the browser. While it works well on the web, it won't help you with local documents.
FreeOCR for Windows provides you with two modes of operation. You can use it to open existing image files or PDF documents, or use the built-in scan functionality to scan and process documents that are not yet available in electronic form.
The program interface is very simple. The main toolbar at the top is where you load a document. You can select Open to load an image, Open PDF to load a PDF document, or Scan to use a connected scanner to scan a paper document.
If you select the scan option, make sure that the scanner is set to at least 300 DPI for best results.
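To get a feel for what 300 DPI means in practice, here is a minimal sketch that converts a page size in inches into pixel dimensions. The function name `scan_pixels` and the US Letter page size are assumptions chosen for illustration; they are not part of FreeOCR.

```python
def scan_pixels(width_in, height_in, dpi=300):
    """Return (width_px, height_px) of a page scanned at the given DPI."""
    # DPI is dots per inch, so pixels = inches * DPI along each axis.
    return round(width_in * dpi), round(height_in * dpi)


LETTER = (8.5, 11.0)  # US Letter in inches (illustrative choice)

print(scan_pixels(*LETTER))           # (2550, 3300)
print(scan_pixels(*LETTER, dpi=150))  # (1275, 1650)
```

Halving the DPI halves the pixel count along each axis, which leaves the recognition engine with only a quarter of the pixels for every character.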
The document is displayed on the left side of the main area. You can flip pages here if it is a multi-page document, and use other tools such as zoom, rotation or fit to screen.
A click on the OCR button at the top enables you to run optical character recognition on the current page or all pages. You can also use the selection tool on the left side to run OCR on a selected area only.
The process is fast, and results are automatically displayed on the right side. This side works like a text editor, which means that you can make corrections directly before you save or copy the text.
The program uses the Tesseract OCR engine and is regularly updated.
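Since Tesseract is also available as a standalone command line tool, the same engine can be scripted outside of FreeOCR. The following is only a sketch under assumptions: it requires a separately installed `tesseract` executable on the PATH (version 4 or later for the `--dpi` option), the file name `scan.png` is hypothetical, and it does not show how FreeOCR itself calls the engine.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", dpi=None):
    """Build the argument list for one Tesseract run that prints text to stdout."""
    # "stdout" as the output base makes Tesseract print the text instead of writing a file.
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if dpi is not None:
        # --dpi states the source resolution when the image lacks that metadata.
        cmd += ["--dpi", str(dpi)]
    return cmd


def run_ocr(image_path, lang="eng", dpi=None):
    """Run Tesseract on an image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, dpi),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical file name, shown only to illustrate the resulting command.
print(build_tesseract_command("scan.png", dpi=300))
```

Calling `run_ocr("scan.png", dpi=300)` would return the recognized text as a string, which still needs the same manual proofreading as the output in FreeOCR's editor pane.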
The program works really well with documents that have black text on a white background. Recognition was near-perfect every time under those conditions.
The output quality drops if the source document or image is of lower quality. While the program may still be able to recognize some or even most characters, you will have to edit the resulting text afterwards as it will contain errors.
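One common way to bring a lower quality image closer to the black-on-white ideal before running OCR is to binarize it. The sketch below is an assumption about general preprocessing, not a description of what FreeOCR does internally: it applies a fixed threshold to grayscale values given as nested lists, and both the function name `binarize` and the threshold of 128 are illustrative choices.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to pure black (0) or pure white (255)."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


# Tiny made-up "image": light background values with a few dark text pixels.
sample = [
    [250, 240, 90],
    [30, 200, 128],
]

print(binarize(sample))  # [[255, 255, 0], [0, 255, 255]]
```

A fixed threshold is the simplest option. Unevenly lit scans usually need an adaptive threshold instead, which image libraries provide.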
Ghacks is a technology news blog that was founded in 2005 by Martin Brinkmann. It has since become one of the most popular tech news sites on the Internet, with five authors and regular contributions from freelance writers.