Scan Images to Text in Microsoft Word
Posted by Gina Trapani at 9:30 AM on January 10, 2008

Tech help site Of Zen and Computing describes how to use Microsoft Office to do Optical Character Recognition (OCR)—that is, recognize text inside digital images (like scanned documents). The Microsoft Document Imaging application comes with Microsoft Office (who knew?) and can grok text from TIFF images. Haven't tried this one myself, but after wrestling with various OCR apps several years ago, my expectations are low. What's your favorite OCR application or method? Tell us about it in the comments.
Tags: microsoft office | microsoft office tip | microsoft word | ocr | optical character recognition | windows | word

Comments
Lav
Posted January 10, 2008 12:49 PM
ABBYY, and Acrobat if I'm lazy, since it adds OCR on top of the image.
craig barry
Posted July 3, 2008 3:40 PM
Hi all
In my occupation I compile a lot of reports that include photos.
I print them to an MDI file from Adobe Photoshop Starter Edition 3.2, which collates them brilliantly. However, I can't seem to insert a header/footer onto the pages.
Does anyone know if this is possible, and how, or is there a better way to go about it?
Any advice will be greatly appreciated
Cheers
Craig
branden
Posted July 22, 2008 12:28 PM
I have scanned some paperwork off my printer and brought it up in Microsoft Word. Now I don't know how to write on the document. I want to insert text into a document that already has text on it.
fsmontenegro
Posted 12:28 PM 9/1/08
On the topic of OCR, is anyone aware of any image recognition solution (sorry, Windows here...) that can translate diagrams? You know, when we draw boxes and circles connecting to each other - be it a network diagram or a flowchart or a process map or... - it would be great to have that automagically transferred to Visio (or other diagramming tool).
Any insights?
marksman7328
Posted 1:40 PM 9/1/08
This was great for my chemistry class. Just scan in the book pages and convert to a Word document for quick and easy notes. (I don't own the chemistry book, so I can't highlight the real pages or write in it.) It is pretty good at recognizing characters (although you will want to proofread). It had trouble reading subscripts in chemical formulas, but it was easy to just paste the formula as an image in the Word document. This was also great for copying graphs and pictures from my book into my notes, since I had the page scanned in already anyway.
joelena
Posted 1:38 PM 9/1/08
I've used JOCR a few times. Review: [www.freewaregenius.com]
Its OCR capabilities are the same as the MS Office method, since it uses the MS Office OCR engine (meaning you must have Microsoft Document Imaging from Office 2003 or later installed to use it). It provides a pretty nice screen clipping interface, loads quickly (it's USB portable, too), and dumps text to Notepad.
philosopher_dog
Posted 1:33 PM 9/1/08
I like ABBYY FineReader and their other products. If you're scanning documents and using OCR, they're the state of the art, I believe. I'd be surprised if Word's feature was worth the trouble.
sumocat
Posted 1:28 PM 9/1/08
I typically use Acrobat, but that's largely because our work scanner outputs PDF. If this ever came up at home, I'd use MDI.
wrussf
Posted 1:10 PM 9/1/08
This did a fair job with printed characters; not so much with handwriting.
oneshot417
Posted 3:10 PM 9/1/08
Fourth vote for ABBYY.
getjustin
Posted 2:55 PM 9/1/08
wHyy w0uld 1 need M5 Off1ce..? My sc@nner"s s0ftwar3 worrks f1ne .
aeronaut
Posted 2:43 PM 9/1/08
@fsmontenegro: There is a lot of raster-to-vector software out there, but most of it does not work very well. Even high-end software from GE Energy, ESRI and Autodesk usually leaves a fair amount of cleanup. Often it's easier just to trace or recreate the drawing.
Getting a vector format from complicated engineering drawings and map rasters is usually done by on-screen tracing.
aeronaut
Posted 2:36 PM 9/1/08
The MS product works fine for capturing text without the paragraph formatting. I prefer ABBYY, especially when working with foreign language text.
GroovyMojo
Posted 2:25 PM 9/1/08
I'm not sure if I'm blown away by the fact that Office has OCR built in and I never knew it, or by the fact that you used the word "grok." That just made me happy. Thank you.
Capone
Posted 2:23 PM 9/1/08
Third vote for ABBYY.
da5id_nz
Posted 2:00 PM 9/1/08
@philosopher_dog: Second for ABBYY. Very accurate. I used it at work. Some of the Nuance stuff (the guys who do Dragon NaturallySpeaking) is supposed to be very good, too.
Sal-Monella
Posted 1:48 PM 9/1/08
Text recognition accuracy on a good, clean scan or fax is in the 99% range. Unfortunately, any whitespace formatting is lost, so the resulting Word document needs a LOT of cleanup.
ecbtln
Posted 3:47 PM 9/1/08
Acrobat 8 Professional
takemetoyourtoaster
Posted 6:32 PM 9/1/08
Is there a free OCR program for Mac?
LordDaMan
Posted 8:16 PM 9/1/08
There's a very handy feature with Office imaging. When you install it, it adds a printer called "Microsoft Office Document Image Writer". If you print to that, the imaging program will open up with the printed page as a file. Then just use that for OCR. Handy for files you can't normally get the text from.
SouldrinK
Posted 9:39 PM 9/1/08
@fsmontenegro: I would say one of your best bets is Vector Magic. Just scan in your diagrams in a rasterized format, and from there you can take the EPS and export it to a number of different formats, including DWG and whatnot. Best of luck!
Turis
Posted 12:20 AM 10/1/08
Vote for ABBYY FineReader.
rampras
Posted 11:37 PM 9/1/08
Of all the OCR programs that I have worked with, ABBYY FineReader seems to be the best at its job.
runiteking1
Posted 2:21 PM 13/1/08
@fsmontenegro: I think GraphClick [www.arizona-software.ch], by the makers of ProVoc, works if you have a Mac.