ANSWERS: 3
  • That is such a good question and I don't have the answer. I simply want to thank you for asking it because I would like to know also. I have had more than one occasion where I could have used a feature like that. I know that such programs exist but am unsure about freeware. Thanks again!
  • I don't think you're likely to see workable and useful OCR (optical character recognition) software for a long, long time. A lot of people have worked on this for a long time, without much useful software to show for it. There are paid programs out there that can do "some" OCR and conversion, but results are spotty. If you're talking about a hardware scan, such as with a copy machine, then you're introducing a hardware element, which will certainly not be free. But as for .PDF files and such, you already have the capability within Adobe and other PDF readers to capture "plain text" (assuming that the .PDF file was created by converting a text document to PDF, and was not a scanned file to begin with). Part of the problem is that there is so much variability among fonts that a letter in one font may be rendered completely differently in another. And the program can't rely on spelling and grammar rules to help determine what a word might be, because humans often make up their own grammar and syntax rules. Then when you add in vernacular and slang, specialized vocabulary, abbreviations, mixed fonts, mixed sizes, bold, italic, superscript, subscript ... you start to get an idea of the complexity of the problem.
  • Try SimpleOCR (http://www.simpleocr.com/). I recommended it to someone else, who got very good results with it. It is the cut-down free version of a paid-for program, but the free version seems able to do quite a lot.

Copyright 2023, Wired Ivy, LLC
