Scanning tips, Recognize text in scanned documents, Recognize text in a single document – Adobe Acrobat XI User Manual

Creating PDFs
Last updated 1/14/2015
Scanning tips
• Acrobat scanning accepts images between 10 dpi and 3000 dpi. If you select Searchable Image or ClearScan for PDF
Output Style, input resolution of 72 dpi or higher is required. Also, input resolution higher than 600 dpi is
downsampled to 600 dpi or lower.
• To apply lossless compression to a scanned image, select one of these options under the Optimization Options in
the Optimize Scanned PDF dialog box: CCITT Group 4 for monochrome images, or Lossless for color or grayscale
images. If this image is appended to a PDF document, and you save the file using the Save option, the scanned image
remains uncompressed. If you save the PDF using Save As, the scanned image may be compressed.
• For most pages, black-and-white scanning at 300 dpi produces text best suited for conversion. At 150 dpi, OCR
accuracy is slightly lower, and more font-recognition errors occur; at 400 dpi and higher resolution, processing
slows, and compressed pages are bigger. If a page has many unrecognized words or small text (9 points or smaller),
try scanning at higher resolution. Scan in black and white whenever possible.
• When Recognize Text Using OCR is disabled, the full 10-to-3000 dpi resolution range may be used, but the
recommended resolution is 72 dpi or higher. For Adaptive Compression, 300 dpi is recommended for grayscale or
RGB input, or 600 dpi for black-and-white input.
• Pages scanned in 24-bit color at 300 dpi and 8-1/2-by-11 in. (21.59-by-27.94 cm) produce large images (about 25 MB)
before compression. Your system may require 50 MB of virtual memory or more to scan the image. At 600 dpi, both
scanning and processing typically are about four times slower than at 300 dpi.
• Avoid dithering or halftone scanner settings. These settings can improve the appearance of photographs, but they
make it difficult to recognize text.
• For text printed on colored paper, try increasing the brightness and contrast by about 10%. If your scanner has
color-filtering capability, consider using a filter or lamp that drops out the background color. If the text isn’t crisp
or drops out, try adjusting scanner contrast and brightness to clarify the scan.
• If your scanner has a manual brightness control, adjust it so that characters are clean and well formed. If characters
are touching, use a higher (brighter) setting. If characters are separated, use a lower (darker) setting.
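The 25 MB figure in the tips above can be checked with simple arithmetic. The following Python sketch is only an illustration and is not part of Acrobat; the function name uncompressed_scan_bytes is hypothetical, and the calculation assumes tightly packed pixels with no file-format overhead.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size, in bytes, of one scanned page.

    Hypothetical helper for illustration; assumes pixels are packed
    back to back with no header or row padding.
    """
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8        # round up to whole bytes


# Letter-size page (8.5 by 11 in.) in 24-bit color
color_300 = uncompressed_scan_bytes(8.5, 11, 300, 24)
color_600 = uncompressed_scan_bytes(8.5, 11, 600, 24)

# Same page in 1-bit black and white
bw_300 = uncompressed_scan_bytes(8.5, 11, 300, 1)

print(color_300)               # 25245000 bytes, about 25 MB
print(color_600 // color_300)  # 4
print(bw_300)                  # 1051875 bytes, about 1 MB
```

Doubling the resolution from 300 dpi to 600 dpi quadruples the pixel count, which is consistent with the roughly fourfold slowdown described above. The much smaller 1-bit result suggests why black-and-white scanning is preferred whenever possible.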
Recognize text in scanned documents
You can use Acrobat to recognize text in previously scanned documents that have already been converted to PDF.
Optical character recognition (OCR) software enables you to search, correct, and copy the text in a scanned PDF. To
apply OCR to a PDF, the original scanner resolution must have been set at 72 dpi or higher.
Note: Scanning at 300 dpi produces the best text for conversion. At 150 dpi, OCR accuracy is slightly lower.
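The resolution limits stated in this section can be summarized as a small decision routine. The Python sketch below is a reading aid only: the function name check_scan_resolution and its return labels are hypothetical, and it does not call or reproduce any Acrobat feature.

```python
def check_scan_resolution(dpi):
    """Classify an input resolution using the limits given in this manual.

    Hypothetical helper; the labels are illustrative, not Acrobat output.
    """
    if dpi < 10 or dpi > 3000:
        return "rejected"         # outside the 10-to-3000 dpi range accepted
    if dpi < 72:
        return "no_ocr"           # below the 72 dpi minimum needed for OCR
    if dpi > 600:
        return "ocr_downsampled"  # input is downsampled to 600 dpi or lower
    return "ocr_ok"               # 72 to 600 dpi; 300 dpi is recommended


for dpi in (5, 50, 150, 300, 1200):
    print(dpi, check_scan_resolution(dpi))
```

For example, a 150 dpi scan is classified as ocr_ok even though its OCR accuracy is slightly lower than at 300 dpi, while a 50 dpi scan can be stored but not recognized.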
Recognize text in a single document
1. Open the scanned PDF.
2. Choose Tools > Text Recognition > In This File.
3. In the Recognize Text dialog box, select an option under Pages.
4. Optionally, click Edit to open the Recognize Text - General Settings dialog box, and specify the options as needed.
Recognize text in multiple documents
1. In Acrobat, choose Tools > Text Recognition > In Multiple Files.
2. In the Recognize Text dialog box, click Add Files, and choose Add Files, Add Folders, or Add Open Files. Then select
the files or folder.
3. In the Output Options dialog box, specify a target folder for output files, and filename preferences.