Google Docs OCR Demonstration
Google keeps adding new features to its popular online services such as Gmail and Google Docs. The latest one is currently available as a demonstration only and is not yet integrated into Google Docs. The Google Docs OCR demonstration can run optical character recognition (OCR) on three image formats: jpg, png and gif. Google lists the following limitations that are currently in place:
- Files must be fairly high-resolution -- rule of thumb is 10 pixel character height.
- Maximum file size: 10MB, maximum resolution: 25 megapixels
- The larger the file, the longer the OCR operation will take (500K: ~15s, 2MB: ~40s, 10MB: forever)
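The limits listed above can be checked before uploading an image. The following is only a minimal sketch: the function name `check_ocr_limits` and its parameters are hypothetical, "10MB" is read as 10 * 1024 * 1024 bytes and "25 megapixels" as 25,000,000 pixels, which are assumptions rather than Google's stated definitions. The caller is expected to supply the file size, the pixel dimensions and, if known, the approximate character height.

```python
# Limits as quoted in the list above; the byte and pixel readings are assumptions.
MAX_FILE_BYTES = 10 * 1024 * 1024   # "10MB", read here as 10 MiB
MAX_PIXELS = 25_000_000             # "25 megapixels"
MIN_CHAR_HEIGHT_PX = 10             # rule-of-thumb character height


def check_ocr_limits(file_bytes, width, height, char_height_px=None):
    """Return a list of problems; an empty list means the image fits the limits.

    Hypothetical helper for illustration only, not part of any Google API.
    """
    problems = []
    if file_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 10MB")
    if width * height > MAX_PIXELS:
        problems.append("resolution exceeds 25 megapixels")
    # The character height is optional because it usually has to be estimated by eye.
    if char_height_px is not None and char_height_px < MIN_CHAR_HEIGHT_PX:
        problems.append("characters are shorter than 10 pixels")
    return problems


if __name__ == "__main__":
    # A 2 MB scan at 3000x2000 pixels with 14-pixel-tall text passes all checks.
    print(check_ocr_limits(2 * 1024 * 1024, 3000, 2000, char_height_px=14))
```

An image that fails one of these checks can be cropped, split or rescanned at a different resolution before it is uploaded.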
Images in a supported format that are uploaded on the demonstration page are turned into text documents and displayed in Google Docs once the process has completed. The accuracy of the text depends largely on the quality of the image. It is usually necessary to look over the text and correct errors made during character recognition. Google Docs helps with the correction by underlining unknown words in red in its interface, but fixing the errors still takes some time.
The OCR demonstration is linked to a Google Docs account but not integrated into Google Docs yet. It is very likely that Google will integrate OCR capabilities into Google Docs in the near future. You can use the demonstration page for now to test the OCR service.
Update: Google has shut down the test server, so the demonstration page is no longer available. The OCR feature has, however, been implemented in Google Docs, at least for pdf and image files that you upload to Google Docs.
The support for image file formats basically lets you use the service with any type of document, as you can take a screenshot and upload the image file to use Google Docs' OCR feature.
That's great! Now spammers will have another tool to solve captchas! XD
Nice, I took a screenshot of Google Books and passed it into the Google OCR and it did get the titles and larger font headings, but the body of the text was not scannable.
None of my images get uploaded… Even the sample one. Why the hell it is happening???