Recognize text in scanned documents – Adobe Acrobat 8 3D User Manual

ADOBE ACROBAT 3D VERSION 8

User Guide

• When Recognize Text Using OCR is disabled, the full 10-to-3000 ppi resolution range may be used, but the recommended resolution is 72 ppi or higher. For Adaptive compression, 300 ppi is recommended for grayscale or RGB input, or 600 ppi for black-and-white input.

• Pages scanned in 24-bit color at 300 ppi and 8-1/2-by-11 inches (21.59-by-27.94 cm) result in large images (about 25 MB) prior to compression. Your system may require 50 MB of virtual memory or more to scan the image. At 600 ppi, both scanning and processing are typically about four times slower than at 300 ppi.

• Avoid dithering or halftone scanner settings. These can improve the appearance of photographs, but they make it difficult to recognize text.

• For text printed on colored paper, try increasing the brightness and contrast by about 10%. If your scanner has color-filtering capability, consider using a filter or lamp that drops out the background color. If the text isn’t crisp or drops out, try adjusting the scanner contrast and brightness to clarify the scan.

• If your scanner has a manual brightness control, adjust it so that characters are clean and well formed. If characters are touching, use a higher (brighter) setting. If characters are separated, use a lower (darker) setting.
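
As a rough check of the 25 MB figure quoted in the list above, the following Python sketch computes the uncompressed size of a scanned page from its dimensions, resolution, and bit depth. The function name and parameters are illustrative only and are not part of Acrobat. The calculation ignores row padding and file headers, so treat the result as an estimate.

```python
def uncompressed_scan_bytes(width_in, height_in, ppi, bits_per_pixel=24):
    """Estimate the raw (pre-compression) size in bytes of a scanned page.

    Assumes tightly packed pixels with no row padding or file header.
    """
    width_px = round(width_in * ppi)    # pixels across the page
    height_px = round(height_in * ppi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8


# US Letter page, 24-bit color, at 300 ppi and 600 ppi
letter_300 = uncompressed_scan_bytes(8.5, 11, 300)
letter_600 = uncompressed_scan_bytes(8.5, 11, 600)

print(letter_300)               # 25245000 bytes, roughly 25 MB
print(letter_600 / letter_300)  # 4.0
```

Doubling the resolution doubles the pixel count in each direction, so the 600 ppi image holds four times as much data. That is consistent with the roughly fourfold slowdown mentioned above, although actual scan and processing times depend on the scanner and system.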

Recognize text in scanned documents

You can use Acrobat to recognize text in previously scanned documents that have already been converted to PDF.

OCR runs with header/footer/Bates number on image PDF files.

1 Open the scanned PDF.

2 Choose Document > OCR Text Recognition > Recognize Text Using OCR.

3 In the Recognize Text dialog box, select an option under Pages.

4 (Optional) Click Edit to open the Recognize Text - Settings dialog box, and select the options you want to use.

Recognize Text - Settings

Optical Character Recognition (OCR) software enables you to search, correct, and copy the text in a scanned PDF. If you do not apply OCR when you create a PDF by scanning a paper document, you can apply OCR to the PDF later, provided the document was scanned at a resolution of 72 ppi or higher.

Primary OCR Language

Specifies the language for the OCR engine to use to identify the characters.

PDF Output Style

Determines the type of PDF to be produced. All options require an input resolution of 72 ppi or higher (recommended). All formats apply OCR and font and page recognition to the text images and convert them to normal text.

• Searchable Image

Ensures that text is searchable and selectable. This option keeps the original image, deskews it as needed, and places an invisible text layer over it. The selection for Downsample Images in this same dialog box determines whether or not the image will be downsampled and to what extent.

• Searchable Image (Exact)

Ensures that text is searchable and selectable. This option keeps the original image and places an invisible text layer over it. Recommended for cases requiring maximum fidelity to the original image.

• Formatted Text & Graphics

Reconstructs the original page using recognized text, fonts, and graphic elements. The accuracy of the results depends on the scanning resolution and other factors. You may need to review and correct the OCR text in the new PDF page after scanning.

Note: The Formatted Text & Graphics option is available for only some languages.
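
The three output styles described above can be summarized as a small lookup table. The following Python sketch is only an illustrative model of the dialog box options. The class, field, and function names are hypothetical and do not correspond to any Acrobat scripting interface. The deskew value for Searchable Image (Exact) is an assumption, because the text does not mention deskewing for that option.

```python
from dataclasses import dataclass
from enum import Enum

MIN_OCR_PPI = 72  # minimum input resolution stated in the text


class OutputStyle(Enum):
    SEARCHABLE_IMAGE = "Searchable Image"
    SEARCHABLE_IMAGE_EXACT = "Searchable Image (Exact)"
    FORMATTED_TEXT_AND_GRAPHICS = "Formatted Text & Graphics"


@dataclass(frozen=True)
class StyleTraits:
    keeps_original_image: bool   # original scan stays visible on the page
    deskews_image: bool          # image is straightened as needed
    invisible_text_layer: bool   # recognized text is placed over the image


# Hypothetical summary of the descriptions above.
TRAITS = {
    OutputStyle.SEARCHABLE_IMAGE: StyleTraits(True, True, True),
    # Deskewing is not mentioned for the Exact style; False is an assumption.
    OutputStyle.SEARCHABLE_IMAGE_EXACT: StyleTraits(True, False, True),
    # The page is reconstructed from recognized text, fonts, and graphics.
    OutputStyle.FORMATTED_TEXT_AND_GRAPHICS: StyleTraits(False, False, False),
}


def meets_ocr_minimum(scan_ppi):
    """Return True if the scan resolution reaches the stated 72 ppi minimum."""
    return scan_ppi >= MIN_OCR_PPI


def styles_keeping_image():
    """List the output styles that keep the original scanned image."""
    return [style for style, t in TRAITS.items() if t.keeps_original_image]


print([s.value for s in styles_keeping_image()])
print(meets_ocr_minimum(150))
```

A table like this makes the practical difference easy to see: both Searchable Image styles leave the scan visible and add hidden text, while Formatted Text & Graphics replaces the page content and therefore needs closer review.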
