Locators, Field settings, Ocr confidence – Kofax Transformation Modules Invoice Pack 1.0 User Manual

Configuration

Kofax Transformation Modules Invoice Pack - Configuration Guide

Locators

The DocumentType field uses the following locators:

• FL_DocumentType – This format locator contains “regular expressions”
  that are in fact just keywords used to identify a credit note. No settings
  exist at the invoice level; all setup is done at the locale level, with the
  locator overridden.

• SL_DocumentType – This script locator is also blank at the invoice level;
  all of its code sits at the locale level, with the locator overridden. The
  locator checks the results from FL_DocumentType and applies simple logic.
  If the format locator has retrieved no results, the script assumes that the
  document is an invoice and sets the field accordingly. If the format locator
  has found results, the document may be a credit note. In this case the field
  is set but marked as not confident, so that it is flagged for a validation
  operator.
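
The decision made by SL_DocumentType can be sketched as follows. This is an illustration in Python only, not the actual script code shipped with the Invoice Pack: the names FieldResult and classify_document_type, and the field values "Invoice" and "Credit Note", are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldResult:
    """Hypothetical stand-in for the value written to the DocumentType field."""
    text: str
    confident: bool


def classify_document_type(keyword_hits: List[str]) -> FieldResult:
    """Mimic the described logic using the keyword hits of the format locator.

    keyword_hits stands for the results of FL_DocumentType (assumed here to be
    a plain list of the matched credit note keywords).
    """
    if not keyword_hits:
        # No credit note keyword found: assume an invoice and accept the value.
        return FieldResult(text="Invoice", confident=True)
    # At least one keyword found: possibly a credit note, so set the field
    # but leave it not confident to force a check by a validation operator.
    return FieldResult(text="Credit Note", confident=False)
```

Note that a keyword hit never produces a confident result in this sketch, which matches the description above: the keyword only suggests a credit note, and the final decision is left to the operator.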

Field Settings

This field takes its result from the SL_DocumentType script locator. It has no
formatting or validation beyond being a limited-choice dropdown field on the
validation form.

OCR Confidence

The OCR Confidence functionality exists for all extraction fields described above. It
allows some additional processing that is not normally available. The config.xml file
contains a list of the fields in the project, along with their associated OCR
confidence thresholds. Each threshold indicates how confident the extraction must be
that the text has been read correctly from the document in order for the field to be
valid.
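
The role of a per-field threshold can be illustrated with a minimal Python sketch. The layout of config.xml is not shown in this section, so the thresholds are represented here as a plain dictionary; the field names, the values, the 0 to 1 confidence scale, the "greater than or equal" comparison, and the handling of unlisted fields are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical thresholds, standing in for the per-field values that the
# real project reads from config.xml (names and numbers are illustrative).
OCR_THRESHOLDS: Dict[str, float] = {
    "InvoiceNumber": 0.90,
    "InvoiceDate": 0.80,
}


def meets_threshold(field_name: str, ocr_confidence: float,
                    thresholds: Dict[str, float] = OCR_THRESHOLDS) -> bool:
    """Return True if the OCR confidence is high enough for the field."""
    threshold = thresholds.get(field_name)
    if threshold is None:
        # Assumption: a field with no configured threshold is not checked.
        return True
    return ocr_confidence >= threshold
```

For example, with the values above, meets_threshold("InvoiceNumber", 0.85) is False, so that field would not be treated as valid on OCR confidence grounds.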

For the OCR Confidence functionality, the “Reread Options” field functionality is
modified to behave in a slightly different manner. The OCR Confidence functionality
is only used if an OCR profile is selected for rereading. The “min. confidence to
accept reread result” slider is then used to store our confidence threshold. This
threshold cannot be manually set here, as its value is taken from the config.xml file
and set at runtime.

The result of these changes is to provide the following logic:

1. When a field has retrieved its results from a locator, the text found is reread
   using the selected OCR profile.