Scanning tips, Recognize text in scanned documents, Recognize text in a single document – Adobe Acrobat XI User Manual

Creating PDFs

Last updated 1/14/2015

Scanning tips

• Acrobat scanning accepts images between 10 dpi and 3000 dpi. If you select Searchable Image or ClearScan for PDF
Output Style, an input resolution of 72 dpi or higher is required. Also, input resolution higher than 600 dpi is
downsampled to 600 dpi or lower.

• To apply lossless compression to a scanned image, select one of these options under the Optimization Options in
the Optimize Scanned PDF dialog box: CCITT Group 4 for monochrome images, or Lossless for color or grayscale
images. If this image is appended to a PDF document, and you save the file using the Save option, the scanned image
remains uncompressed. If you save the PDF using Save As, the scanned image may be compressed.

• For most pages, black-and-white scanning at 300 dpi produces text best suited for conversion. At 150 dpi, OCR
accuracy is slightly lower, and more font-recognition errors occur; at 400 dpi and higher resolution, processing
slows, and compressed pages are bigger. If a page has many unrecognized words or small text (9 points or smaller),
try scanning at higher resolution. Scan in black and white whenever possible.

• When Recognize Text Using OCR is disabled, the full 10-to-3000 dpi resolution range may be used, but a
resolution of 72 dpi or higher is recommended. For Adaptive Compression, 300 dpi is recommended for grayscale or
RGB input, or 600 dpi for black-and-white input.

• Pages scanned in 24-bit color, 300 dpi, at 8-1/2-by-11 in. (21.59-by-27.94 cm) result in large images (about 25 MB)
before compression. Your system may require 50 MB of virtual memory or more to scan the image. At 600 dpi, both
scanning and processing typically are about four times slower than at 300 dpi.

• Avoid dithering or halftone scanner settings. These settings can improve the appearance of photographs, but they
make it difficult to recognize text.

• For text printed on colored paper, try increasing the brightness and contrast by about 10%. If your scanner has
color-filtering capability, consider using a filter or lamp that drops out the background color. Or if the text isn’t
crisp or drops out, try adjusting scanner contrast and brightness to clarify the scan.

• If your scanner has a manual brightness control, adjust it so that characters are clean and well formed. If characters
are touching, use a higher (brighter) setting. If characters are separated, use a lower (darker) setting.
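
The 25 MB figure in the tips above follows from simple arithmetic on page size, resolution, and bit depth. The following is a minimal sketch in Python, assuming an uncompressed raster with no row padding or file header. The function name uncompressed_scan_bytes is illustrative only and is not part of Acrobat.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw size in bytes of a scanned page before compression.

    Assumes a plain raster with no row padding or file header (a simplification).
    """
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8


letter_300 = uncompressed_scan_bytes(8.5, 11, 300, 24)  # 24-bit color at 300 dpi
letter_600 = uncompressed_scan_bytes(8.5, 11, 600, 24)  # same page at 600 dpi

print(letter_300)               # 25245000 bytes, roughly 25 MB
print(letter_600 / letter_300)  # 4.0: doubling the dpi quadruples the pixel count
```

The fourfold growth in pixel count at 600 dpi is consistent with the roughly fourfold slowdown in scanning and processing noted in the tips.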

Recognize text in scanned documents

You can use Acrobat to recognize text in previously scanned documents that have already been converted to PDF.
Optical character recognition (OCR) software enables you to search, correct, and copy the text in a scanned PDF. To
apply OCR to a PDF, the original scanner resolution must have been set at 72 dpi or higher.

Note: Scanning at 300 dpi produces the best text for conversion. At 150 dpi, OCR accuracy is slightly lower.
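
The resolution limits described in this section can be summarized as a small check. This is a sketch of the documented rules only, not Acrobat's own logic. The function name and its return convention are assumptions made for illustration, and it assumes that downsampling to 600 dpi applies only when text recognition is enabled, which is how the scanning tips read.

```python
MIN_DPI = 10       # lowest input resolution Acrobat scanning accepts
MAX_DPI = 3000     # highest input resolution Acrobat scanning accepts
MIN_OCR_DPI = 72   # minimum input resolution when text recognition is applied
OCR_CAP_DPI = 600  # higher inputs are downsampled to this value or lower


def max_effective_dpi(input_dpi, ocr_enabled):
    """Return an upper bound on the resolution used after any downsampling.

    Raises ValueError if the input resolution is outside the documented limits.
    """
    if not MIN_DPI <= input_dpi <= MAX_DPI:
        raise ValueError(f"{input_dpi} dpi is outside the {MIN_DPI}-{MAX_DPI} dpi range")
    if not ocr_enabled:
        return input_dpi  # full range allowed when OCR is disabled
    if input_dpi < MIN_OCR_DPI:
        raise ValueError(f"OCR requires {MIN_OCR_DPI} dpi or higher, got {input_dpi}")
    return min(input_dpi, OCR_CAP_DPI)


print(max_effective_dpi(300, ocr_enabled=True))   # 300
print(max_effective_dpi(1200, ocr_enabled=True))  # 600
```

A 50 dpi scan passes the check with OCR disabled but raises ValueError with OCR enabled, mirroring the 72 dpi requirement stated above.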

Recognize text in a single document

1. Open the scanned PDF.

2. Choose Tools > Text Recognition > In This File.

3. In the Recognize Text dialog box, select an option under Pages.

4. Optionally, click Edit to open the Recognize Text - General Settings dialog box, and specify the options as needed.

Recognize text in multiple documents

1. In Acrobat, choose Tools > Text Recognition > In Multiple Files.

2. In the Recognize Text dialog box, click Add Files, and choose Add Files, Add Folders, or Add Open Files. Then select
the files or folder.

3. In the Output Options dialog box, specify a target folder for output files, and filename preferences.
