Copy Text From Images Using Gttext
It may not happen often, but sometimes you want to copy text from an image into a document. You can type the text manually, which is fine for a few words or sentences. But what if the image is full of text? Maybe you have received a fax, or a copy of a document in image format that someone attached to an email.
Gttext, short for Ground Truthing tool for Color Images with Text, is a free open source program for the Windows operating system that identifies text in images and copies it to the Windows clipboard. It needs to be installed before it can be used.
The program supports a variety of image formats, including the popular jpg and png formats as well as bmp, tiff and gif. You start by loading an image into the program. One issue that I had was with the file type filter in the file browser: it offers a separate filter for each image format, so I had to switch to the right filter before the image file would appear.
In the best case, all you then need to do is draw a rectangle around the text in the image that you want to copy. The program displays the text that it identified in a popup with options to cancel, to try again, or to continue (copy to clipboard).
Try again runs the text recognition a second time to correct errors made in the previous run. The program also includes tools to optimize the image for text recognition, among them zooming in or out and adjusting the document's brightness.
Another interesting feature is the ability to extract all text at once without selecting the text first. This is done with a click on Tools > Copy Text From > Full Image.
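If you would rather script the same two modes, rectangle selection and full image, the Python sketch below shows the general idea. It is not Gttext's code. It assumes that the Pillow and pytesseract packages and the Tesseract OCR engine are installed, and the names clamp_box and extract_text as well as the file name scan.png are made up for this illustration.

```python
from typing import Optional, Tuple

# (left, top, right, bottom) in pixels, the box format Pillow's crop() expects
Box = Tuple[int, int, int, int]


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Keep a dragged rectangle inside the image and order its corners.

    Hypothetical helper: a rectangle drawn from right to left or partly
    outside the image is turned into a valid crop box.
    """
    left, top, right, bottom = box
    left, right = sorted((max(0, min(left, width)), max(0, min(right, width))))
    top, bottom = sorted((max(0, min(top, height)), max(0, min(bottom, height))))
    return (left, top, right, bottom)


def extract_text(image_path: str, box: Optional[Box] = None) -> str:
    """Run OCR on the whole image, or only on the selected rectangle."""
    # Imported here so the helper above works without the OCR packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        if box is not None:
            img = img.crop(clamp_box(box, img.width, img.height))
        return pytesseract.image_to_string(img)


# Usage (assumes a file named scan.png exists):
#   extract_text("scan.png")                      # full image
#   extract_text("scan.png", (40, 60, 600, 220))  # selected rectangle only
```

As with Gttext itself, the recognized text still needs proofreading before you paste it anywhere.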
The text recognition of Gttext is solid and worked very well on several document scans that I had in image format on my PC. You do need to go over the results though, as they may contain errors that you need to correct manually.
Windows users can download Gttext from the project's Google Code website. The program is compatible with 32-bit and 64-bit editions of the Microsoft Windows operating system.
Update: The program is no longer hosted on Google Code because Google Code is shutting down. You can now download it from its own domain, SoftOCR.
https://www.virustotal.com/file/b279107f7a70cc7bc7be0361787a36076faf852f73faf4cc811c5a15ee06b4bd/analysis/
It seems they have fixed the problem. The author explains what happened in this issue report:
http://code.google.com/p/gttext/issues/detail?id=1&can=1
Latest virus report
https://www.virustotal.com/file/b279107f7a70cc7bc7be0361787a36076faf852f73faf4cc811c5a15ee06b4bd/analysis/
It seems he contacted the antivirus vendors with the help of some users.
Also, this is their new website, since Google Code is closing:
Softocr.com
I got the same warning from Kasp – decided to try it anyway. The exe wouldn’t extract – Win7-64 says it’s not a valid win32 program.
Say wha…?
Metaphorically kissing the ground you walk on at the moment. Thank you. I have been wanting to be able to do this for the longest time and had no idea such a program was available. Again, thank you.
AVG is now clearing the file.
Only Kaspersky flags the file.
Kaspersky is not infallible.
MSE and MBAM gave it a clean chit. So, most probably a false positive.
I’m also getting the same Trojan alerts
as @jack (above).
Hmmm…not reassuring at all…
Will err on the side of caution
and pass on this one!
My onboard Kaspersky reports this as containing a Trojan. As do the VirusTotal versions of Kaspersky and AVG. But another 41 applications in VirusTotal give it the all clear. Normally I’d count this as a couple of false positives, but I regard Kaspersky and AVG as among the better ones.
Can anyone comment? It’s certainly an application which would fill a definite need.
Good question. If it is a generic trojan it would hint at a false positive. Maybe try contacting Kaspersky to get them to analyze the file.