Dictionaries – Kofax Getting Started with Ascent Xtrata Pro User Manual

In the above example, an eight-digit invoice number (Rechnungsnummer) was
specified using a simple regular expression that also matches other numeric values.
Because no keyword is defined, the order number “65005285” was not identified
exclusively. The result of the test is shown above.
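
The ambiguity of a bare numeric format can be reproduced outside the product with an ordinary regular expression. The following is a minimal sketch, not the expression or document used in the example: the pattern `\b\d{8}\b`, the sample text, and the second number "12345678" are assumptions made for illustration.

```python
import re

# Hypothetical page text. Only the order number 65005285 is taken from
# the example above; the invoice value 12345678 is invented.
SAMPLE_TEXT = """Order No. 65005285
Invoice
12345678"""

# Assumed pattern: any standalone run of exactly eight digits.
EIGHT_DIGITS = re.compile(r"\b\d{8}\b")


def find_candidates(text):
    """Return every eight-digit value in the text, in reading order."""
    return EIGHT_DIGITS.findall(text)


candidates = find_candidates(SAMPLE_TEXT)
print(candidates)
```

Both numbers satisfy the format, so the pattern alone cannot tell the invoice number from the order number.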

Figure 4-35. Locate an 8-Digit Invoice Number (With Keyword “Invoice”)

Adding the keyword “Invoice”, which appears above the expected item, sets the
confidence of all other format matches to 0%, while the invoice number is returned as
the best match with a confidence of 100%.
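
The idea of a keyword anchoring a format match can be illustrated with a toy model. This sketch is an assumption throughout: the guide does not describe how the product computes confidence, so the rule used here (100 if the keyword is on the line directly above the match, otherwise 0), the function name `score_candidates`, and the sample lines are all hypothetical.

```python
import re

# Assumed pattern: any standalone run of exactly eight digits.
EIGHT_DIGITS = re.compile(r"\b\d{8}\b")


def score_candidates(lines, keyword):
    """Toy scoring: 100 if the keyword occurs on the line directly
    above the matched value, otherwise 0."""
    results = []
    for index, line in enumerate(lines):
        previous = lines[index - 1] if index > 0 else ""
        for value in EIGHT_DIGITS.findall(line):
            confidence = 100 if keyword.lower() in previous.lower() else 0
            results.append((value, confidence))
    return results


# Hypothetical page lines; 12345678 is an invented invoice number.
page = ["Order No. 65005285", "Invoice", "12345678"]
scored = score_candidates(page, "Invoice")
best = max(scored, key=lambda item: item[1])
print(scored)
print(best)
```

In this model the order number still matches the format but drops to a confidence of 0, while the value below the keyword is returned as the best match.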

Note

Keywords are searched using a fault-tolerant search, so that misspelled
keywords and words containing OCR errors are still found. The better the match of a
word with a keyword, the higher the confidence of the matched format.
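
A fault-tolerant keyword comparison of this kind can be approximated with a string similarity measure. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; the algorithm the product actually uses is not described here, and the function names, the threshold of 0.75, and the sample words are assumptions.

```python
from difflib import SequenceMatcher


def keyword_similarity(word, keyword):
    """Return a similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, word.lower(), keyword.lower()).ratio()


def best_keyword_match(words, keyword, threshold=0.75):
    """Return (word, score) for the closest word, or (None, score)
    if no word reaches the assumed threshold."""
    if not words:
        return (None, 0.0)
    score, word = max((keyword_similarity(w, keyword), w) for w in words)
    return (word, score) if score >= threshold else (None, score)


# "lnvoice" imitates a typical OCR confusion of "I" and "l".
words_on_page = ["Total", "lnvoice", "Date"]
match, score = best_keyword_match(words_on_page, "Invoice")
print(match, round(score, 2))
```

An exact match scores 1.0, a word with a single OCR error scores somewhat lower but still clears the threshold, and unrelated words fall well below it. That mirrors the behavior described in the note: the closer the word is to the keyword, the higher the resulting confidence.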

Dictionaries

Dictionaries can be used to locate complex expressions that also contain words. It is
quite easy to define a format for dates, as long as a numeric format is used. But when