Representative documents, Document set management steps, Create configuration – Kofax INDICIUS 6.0 User Manual

Chapter 4

Getting Started Guide (Classification and Separation)

Representative Documents

The sample documents should be scanned on the production scanner and should
represent the variations seen in production, for example faxes and photocopies.
If extraction (indexing) is being implemented as well as classification and
separation, it is recommended that the documents be scanned at 300 dpi.

Document Set Management Steps

The following steps create two accurate document sets, one for configuration and
one for testing, which are then used to configure and test a solution.

Step 1: Create Project: Open Transformation Studio and create a new project.

Step 2: Import Documents: Import documents, optionally with document properties.

Step 3: Initial Analysis: Get an overview of your document set.

Step 4: Select Sample Documents for Configuration: Select a subset of documents to
clean up and use for configuration and testing.

Step 5: Read Page Content: Read (OCR) all the pages in the documents selected for
configuration. Transformation Studio will use the OCR results in the next step.

Step 6: Cleanup Documents: In this step you analyze your document set, clean up
the documents, and add more samples until the set is ready to be used for
configuration.

Step 7: Select Documents for Testing: From the clean document set, select a set of
documents to use for testing. These documents must not be used for configuration.
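
The requirement in Step 7 that test documents must not be used for configuration can be illustrated outside the product. The following is a minimal Python sketch, not part of Transformation Studio: the function name `split_document_set`, the document IDs, and the 30% test fraction are all assumptions chosen for illustration.

```python
import random


def split_document_set(doc_ids, test_fraction=0.3, seed=0):
    """Split cleaned document IDs into disjoint configuration and test sets.

    Hypothetical helper: the name, the default fraction, and the seed are
    illustrative assumptions, not values taken from the manual.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    ids = sorted(set(doc_ids))  # deduplicate and fix the starting order
    if len(ids) < 2:
        raise ValueError("need at least two documents to build two sets")
    random.Random(seed).shuffle(ids)  # seeded, so the split is reproducible
    # Keep at least one document on each side of the split.
    n_test = min(len(ids) - 1, max(1, round(len(ids) * test_fraction)))
    return ids[n_test:], ids[:n_test]


# Hypothetical IDs standing in for the clean document set from Step 6.
clean_docs = [f"doc_{i:03d}" for i in range(10)]
config_set, test_set = split_document_set(clean_docs)

# No document may appear in both sets.
assert not set(config_set) & set(test_set)
```

Fixing the seed keeps the split stable between runs, so test results stay comparable while the configuration is being refined.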

Create Configuration

Recognition

The Recognition module uses classifiers and separators to determine how a batch of
pages is split into documents, and to determine the type of each document.
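
To make the idea of separation concrete, the following Python sketch groups a batch of pages into documents from per-page predictions. It is an illustration only and does not reflect how the Recognition module is implemented: the `Document` class, the `separate_batch` function, and the assumption that each page carries a predicted type and a first-page flag are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_type: str
    pages: list = field(default_factory=list)


def separate_batch(page_predictions):
    """Group a batch of pages into documents.

    page_predictions is a list of (page_id, doc_type, is_first_page)
    tuples in batch order. In this sketch a new document starts at every
    page flagged as a first page, or whenever the predicted type changes.
    """
    documents = []
    for page_id, doc_type, is_first_page in page_predictions:
        starts_new = (
            not documents
            or is_first_page
            or documents[-1].doc_type != doc_type
        )
        if starts_new:
            documents.append(Document(doc_type=doc_type))
        documents[-1].pages.append(page_id)
    return documents


# Hypothetical batch: two invoices followed by a two-page letter.
batch = [
    (1, "invoice", True),
    (2, "invoice", False),
    (3, "invoice", True),
    (4, "letter", True),
    (5, "letter", False),
]
documents = separate_batch(batch)
```

With this batch the sketch produces three documents: pages 1 and 2 as an invoice, page 3 as a second invoice, and pages 4 and 5 as a letter.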

Classifiers may be based on image content (at page level) or on text content, and
are built from a set of sample documents. These learn-by-example classifiers can be
supplemented with manually configured templated (including barcode) or rules-based
classification methods.
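
As a rough illustration of how a learn-by-example text classifier can be supplemented by a manually configured rule, consider the Python sketch below. It is a toy word-frequency classifier, not the product's algorithm. The sample texts, the `BARCODE_RULES` table, and the choice to let a barcode rule take precedence over the learned result are all assumptions made for the example.

```python
from collections import Counter


def train(samples):
    """Build one word-frequency profile per class from (text, label) samples."""
    profiles = {}
    for text, label in samples:
        profiles.setdefault(label, Counter()).update(text.lower().split())
    return profiles


def classify_by_text(profiles, text):
    """Return the class whose profile best matches the words in the text."""
    words = Counter(text.lower().split())

    def score(label):
        profile = profiles[label]
        total = sum(profile.values())
        # Sum the relative frequency of each word within the class profile.
        return sum(n * profile[w] / total for w, n in words.items())

    return max(profiles, key=score)


# Hypothetical manually configured rule: barcode prefix -> document type.
BARCODE_RULES = {"INV": "invoice", "LTR": "letter"}


def classify(profiles, text, barcode=None):
    """Apply the barcode rule first (an assumption), else the text classifier."""
    if barcode:
        for prefix, label in BARCODE_RULES.items():
            if barcode.startswith(prefix):
                return label
    return classify_by_text(profiles, text)


# Hypothetical sample documents, reduced to their OCR text.
samples = [
    ("invoice number total amount due", "invoice"),
    ("total amount payable invoice date", "invoice"),
    ("dear sir thank you for your letter", "letter"),
    ("dear madam yours sincerely", "letter"),
]
profiles = train(samples)

by_text = classify(profiles, "amount due on this invoice")
by_rule = classify(profiles, "dear customer", barcode="INV-0042")
```

Here `by_text` is decided by the learned profiles, while `by_rule` is decided by the barcode rule even though the text alone resembles a letter.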