Special invoice processing technology, Knowledge bases – Kofax Getting Started with Ascent Xtrata Pro User Manual

Overview
Ascent Xtrata Pro User's Guide
• Additional fees and tolls
These fields are read by a pre-trained system that can already recognize a certain
percentage of invoices. Since additional information is created during the data
extraction process, this information can be used to improve the recognition of invoice
data through additional training.
In addition to the preconfigured items, fields can be added to an invoice project
specifically for the extraction of additional information. Data for these fields are
extracted using “locators.” Locators are special algorithms that encompass a variety
of methods for extracting invoice data. For instance, data can be read from bar codes,
taken from fields with specific formatting, or retrieved by database lookup.
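To make the idea of a format-based locator more concrete, the following is a minimal sketch in Python. It is not Ascent Xtrata Pro code and does not use any Kofax API. The invoice number format ("INV-" followed by a four-digit year and a five-digit sequence) and the function name locate_invoice_number are assumptions chosen purely for illustration.

```python
import re
from typing import Optional

# Hypothetical format, assumed for illustration only:
# "INV-" + four-digit year + "-" + five-digit sequence number.
INVOICE_NUMBER = re.compile(r"\bINV-\d{4}-\d{5}\b")


def locate_invoice_number(text: str) -> Optional[str]:
    """Return the first substring that matches the assumed format, or None."""
    match = INVOICE_NUMBER.search(text)
    return match.group(0) if match else None


sample = "Please quote INV-2023-00417 on all correspondence."
print(locate_invoice_number(sample))
```

A real locator would typically also report where on the page the value was found and how confident the match is; this sketch returns only the matched text.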
Special Invoice Processing Technology
The following sections give a short overview of the special invoice processing
capabilities of Ascent Xtrata Pro.
Knowledge Bases
Invoice projects make use of a learning system that needs very little user intervention
to create a working invoice project.
Knowledge bases are binary files used to store extraction patterns. A knowledge base
is relatively compact. For example, a knowledge base for 341 trained invoices might
be about 60 Kbytes. The size grows roughly linearly with the number of trained
invoices, so a knowledge base for 5,000 trained invoices will be about 1 Mbyte.
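The size figures above can be checked with a short calculation. The sketch below assumes strictly linear growth from the single data point quoted in the text (341 invoices, about 60 Kbytes). The function name estimate_size_kbytes is hypothetical, and actual knowledge base sizes will vary.

```python
# Figures quoted in the text: 341 trained invoices -> about 60 Kbytes.
SAMPLE_INVOICES = 341
SAMPLE_SIZE_KBYTES = 60


def estimate_size_kbytes(trained_invoices: int) -> float:
    """Extrapolate linearly from the single quoted data point (an assumption)."""
    per_invoice = SAMPLE_SIZE_KBYTES / SAMPLE_INVOICES  # about 0.176 Kbytes each
    return trained_invoices * per_invoice


# Prints 880, which is consistent with the "about 1 Mbyte" quoted for 5,000 invoices.
print(round(estimate_size_kbytes(5000)))
```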
When a knowledge base is imported into a new project, this inherited store of
knowledge makes it possible for that project to immediately extract data from a
certain percentage of invoices. A single project may have multiple knowledge bases.
Documents that were not extracted properly can then be used to improve the
extraction results for your project. This training is typically the responsibility of the
system administrator, who processes sample documents that have been placed in a
training set. The training session creates new extraction patterns that are stored
with the project.
In addition, these new extraction patterns can be made portable by adding them to a
knowledge base. If this is done, all projects using that knowledge base will benefit
from the training. It is important to note that only the relevant extraction pattern