"A few things to know about Tesseract OCR: for now it only supports the English language, and does not include a page layout analysis module (yet), so it will perform poorly on multi-column material. It also doesn't do well on grayscale and color documents, and it's not nearly as accurate as some of the best commercial OCR packages out there. Yet, as far as we know, despite its shortcomings, Tesseract is far more accurate than any other Open Source OCR package out there."
OCR is useful for Google Book Search, and alongside an object recognition engine it could also be useful for Picasa or Image Search. And, if Google improves the software, it could become a viable alternative to commercial applications. Currently, the software has no user interface, and it runs on Linux and Windows.
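Since there is no UI, Tesseract is driven from the command line, in the basic form `tesseract <image> <output base>`. Here is a minimal sketch of wrapping that call in Python. It assumes the `tesseract` executable is installed and on the PATH; the helper names `build_command` and `run_ocr` and the file names are hypothetical, and supported image formats depend on the Tesseract version you have.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(image_path, output_base, executable="tesseract"):
    """Return the argument list for a basic Tesseract run.

    The helper name and the default executable name are assumptions
    made for this sketch.
    """
    return [executable, str(image_path), str(output_base)]


def run_ocr(image_path, output_base):
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_command(image_path, output_base), check=True)
    # Tesseract writes its result to "<output base>.txt".
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

With a scanned page saved as `page.tif` (a hypothetical file), `run_ocr("page.tif", "page")` would write `page.txt` next to it and return its contents.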
Related:
Use camera phones for OCR