Preparing PDFs for indexing, Adding metadata to document properties – Adobe Acrobat 9 Pro Extended User Manual
USING ACROBAT 9 PRO EXTENDED
Searching and indexing
Last updated 9/30/2011
Preparing PDFs for indexing
Begin by creating a folder to contain the PDFs you want to index. All PDFs should be complete in both content and
electronic features, such as links, bookmarks, and form fields. If the files to be indexed include scanned documents,
make sure that the text is searchable. Break long documents into smaller, chapter-sized files to improve search
performance. You can also add information to a file’s document properties to improve the file’s searchability.
Before you index a document collection, it’s essential that you set up the document structure on the disk drive or
network server volume and verify cross-platform filenames. Filenames may become truncated and hard to retrieve in
a cross-platform search. To prevent this problem, consider these guidelines:
• Rename files, folders, and indexes using the MS-DOS file-naming convention (eight characters or fewer followed
by a three-character file extension), particularly if you plan to deliver the document collection and index on an ISO
9660-formatted CD-ROM disc.
• Remove extended characters, such as accented characters and non-English characters, from file and folder names.
(The font used by the Catalog feature does not support character codes 133 through 159.)
• Don’t use deeply nested folders or path names that exceed 256 characters for indexes that will be searched by
Mac OS users.
• If you use Mac OS with an OS/2 LAN server, configure IBM® LAN Server Macintosh (LSM) to enforce MS-DOS
file-naming conventions, or index only FAT (File Allocation Table) volumes. (HPFS [High Performance File
System] volumes may contain long unretrievable filenames.)
If the document structure includes subfolders that you don’t want indexed, you can exclude them during the
indexing process.
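The filename guidelines above can be checked with a script before you build the index. The following Python sketch is an illustration only and is not part of Acrobat. The function names check_name and check_tree are hypothetical. The set of characters accepted in an 8.3 name is a simplification (letters, digits, underscore, and hyphen), and any character outside 7-bit ASCII is treated as an extended character.

```python
import os
import re

# Simplified 8.3 pattern (assumption): 1-8 characters, then an optional
# dot and 1-3 character extension. Real MS-DOS allows a few more symbols.
DOS_8_3 = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9]{1,3})?")

# Path-length limit the guidelines give for indexes searched by Mac OS users.
MAX_PATH_LEN = 256


def check_name(name):
    """Return a list of problems found in one file or folder name."""
    problems = []
    # Assumption: anything outside 7-bit ASCII counts as an extended character.
    if any(ord(ch) > 127 for ch in name):
        problems.append("extended character")
    if not DOS_8_3.fullmatch(name):
        problems.append("not 8.3")
    return problems


def check_tree(root):
    """Walk root and yield (path, problems) for every entry with a problem."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            problems = check_name(name)
            # Length is measured on the path as built from the given root.
            if len(path) > MAX_PATH_LEN:
                problems.append("path too long")
            if problems:
                yield path, problems
```

For example, looping over check_tree("collection") and printing each path with its problems gives a list of entries to rename before indexing. The folder name collection is a placeholder.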
Adding metadata to document properties
To make a PDF easier to search, you can add file information, called metadata, to the document properties. (You can
see the properties for the currently open PDF by choosing File > Properties, and clicking the Description tab.)
(Windows) You can also enter and read the document properties information from the desktop. Right-click the document
in Windows Explorer, choose Properties, and click the PDF tab. Any information you type or edit in this dialog box
also appears in the Description tab of the Document Properties dialog box when you open the file.
When adding data for document properties, consider the following recommendations:
• Use a good descriptive title in the Title field. The filename of the document should appear in the Search Results
dialog box.
• Always use the same option (field) for similar information. For example, don’t add an important term to the Subject
option for some documents and to the Keywords option for others.
• Use a single, consistent term for the same information. For example, don’t use biology for some documents and life
sciences for others.
• Use the Author option to identify the group responsible for the document. For example, the author of a hiring
policy document might be the Human Resources department.
• If you use document part numbers, add them as keywords. For example, adding doc#=m234 in Keywords could
indicate a specific document in a series of several hundred documents on a particular subject.
• Use the Subject or Keywords option, either alone or together, to categorize documents by type. For example, you
might use status report as a Subject entry and monthly or weekly as a Keywords entry for a single document.
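The consistency recommendations above lend themselves to an automated check across a collection. The following Python sketch is an illustration only. It assumes the document properties have already been exported into plain dictionaries keyed by field name, and that the Subject and Keywords fields hold comma-separated terms. Neither assumption comes from Acrobat, and the function names are hypothetical.

```python
from collections import defaultdict


def find_split_terms(records):
    """Return terms used in Subject for some entries and Keywords for others."""
    fields_by_term = defaultdict(set)
    for rec in records:
        for field in ("Subject", "Keywords"):
            # Assumption: terms within a field are separated by commas.
            for term in rec.get(field, "").split(","):
                term = term.strip().lower()
                if term:
                    fields_by_term[term].add(field)
    return sorted(t for t, fields in fields_by_term.items() if len(fields) > 1)


def missing_fields(rec, required=("Title", "Author")):
    """Return the required fields that are absent or blank in one record."""
    return [f for f in required if not rec.get(f, "").strip()]


# Hypothetical records standing in for exported document properties.
records = [
    {"Title": "Q1 status", "Author": "Human Resources",
     "Subject": "status report", "Keywords": "monthly, doc#=m234"},
    {"Title": "", "Author": "Human Resources",
     "Subject": "monthly", "Keywords": "status report"},
]

split_terms = find_split_terms(records)
untitled = [rec for rec in records if "Title" in missing_fields(rec)]
```

Here split_terms lists the terms that were filed under both Subject and Keywords, and untitled holds the records that still need a descriptive title.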