Spectral Data Sample Selection Methods – Metrohm Vision Manual

5.2.2 Spectral Data Sample Selection Methods

Three methods are available for sample selection based on spectral information: random selection,
selection by PCA, and selection by wavelength distance.

Random Selection

The Random Selection method selects samples at random for calibration and validation sets.
No outlier set is created when this method is used.

The Math Treatment option is not available when Random Selection is chosen.
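
The idea behind random selection can be sketched in a few lines of Python. This is only an illustration, not Vision's implementation: the function name random_split, the calibration_fraction of 0.7, and the fixed seed are all assumptions chosen for the example.

```python
import numpy as np

def random_split(n_samples, calibration_fraction=0.7, seed=0):
    """Randomly assign sample indices to calibration and validation sets.

    The fraction and seed are hypothetical defaults for this sketch.
    No outlier set is produced, matching the behavior described above.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # shuffled sample indices
    n_cal = int(round(calibration_fraction * n_samples))
    calibration = np.sort(order[:n_cal])        # first part -> calibration
    validation = np.sort(order[n_cal:])         # remainder -> validation
    return calibration, validation

calibration, validation = random_split(20)
print("calibration:", calibration)
print("validation:", validation)
```

Every sample lands in exactly one of the two sets, and the spectra themselves are never inspected.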

Selection by PCA (Mahalanobis Distance)

Selection by PCA first calculates a principal component model for a product. All samples whose
Mahalanobis distance from the center of the distribution is greater than a user-defined
threshold (default 0.6 for match value or 0.95 for probability) are flagged as outliers. Samples
located in high-density regions of the population (nearest neighbors) are identified as
redundant, so that the Euclidean distance between any two samples in the training set is
greater than the threshold.
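
The two steps described above (outlier flagging, then removal of redundant neighbors) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not Vision's algorithm: the function names, the number of components, the greedy thinning rule, and both threshold values are hypothetical. In particular, the thresholds here are expressed in raw distance units and do not correspond to Vision's match value or probability scales.

```python
import numpy as np

def pca_scores(spectra, n_components):
    """Project mean-centered spectra onto the leading principal components."""
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal component loadings.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def mahalanobis_distances(scores):
    """Distance of each sample from the center of the score distribution.

    PCA scores are centered and uncorrelated, so the covariance matrix
    is diagonal and the distance reduces to a variance-scaled norm.
    """
    variances = scores.var(axis=0, ddof=1)
    return np.sqrt((scores ** 2 / variances).sum(axis=1))

def thin_redundant(scores, min_distance):
    """Greedy thinning (an assumption): keep a sample only if it lies
    farther than min_distance from every sample already kept."""
    kept = []
    for i in range(len(scores)):
        if all(np.linalg.norm(scores[i] - scores[j]) > min_distance for j in kept):
            kept.append(i)
    return np.array(kept, dtype=int)

rng = np.random.default_rng(0)
spectra = rng.normal(size=(50, 30))   # synthetic data: 50 samples, 30 wavelengths
spectra[-1] += 10.0                   # make the last sample deliberately atypical

scores = pca_scores(spectra, n_components=3)
distances = mahalanobis_distances(scores)

outlier_threshold = 3.0               # hypothetical, in distance units
is_outlier = distances > outlier_threshold

candidates = np.flatnonzero(~is_outlier)
kept = candidates[thin_redundant(scores[candidates], min_distance=1.5)]
print("outliers:", np.flatnonzero(is_outlier))
print("kept for training:", kept)
```

After thinning, every pair of retained samples is separated by more than min_distance in score space, which is the property the text describes for the training set.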

Selection by Wavelength Distance

Selection by wavelength distance uses a maximum distance concept (maximum conformity
index) to identify outliers. Samples whose maximum distance from the mean product
spectrum is greater than the threshold (default value 3.0) are placed in the outlier set.
Redundant samples are selected by a method similar to that used in selection by PCA, except
that the Euclidean distance in the wavelength space is used.
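
A maximum-distance check of this kind can be sketched as below. The sketch assumes that the distance at each wavelength is the absolute deviation from the mean product spectrum divided by the standard deviation at that wavelength; the text above does not state the scaling, so treat this as one plausible reading. The function name max_wavelength_distance and the synthetic data are hypothetical.

```python
import numpy as np

def max_wavelength_distance(spectra):
    """Largest standardized deviation from the mean spectrum, per sample.

    Assumption: deviations are scaled by the per-wavelength standard
    deviation of the product spectra.
    """
    mean = spectra.mean(axis=0)
    std = spectra.std(axis=0, ddof=1)
    std = np.where(std > 0, std, 1.0)     # guard against constant wavelengths
    standardized = np.abs(spectra - mean) / std
    return standardized.max(axis=1)       # maximum over all wavelengths

rng = np.random.default_rng(1)
spectra = 1.0 + 0.1 * rng.normal(size=(40, 20))  # 40 samples, 20 wavelengths
spectra[7, 12] += 5.0                            # spike at a single wavelength

index = max_wavelength_distance(spectra)
threshold = 3.0                                  # default value quoted in the text
outliers = np.flatnonzero(index > threshold)
print("outlier set:", outliers)
```

Because only the single largest deviation counts, a spectrum that is typical everywhere except at one wavelength is still placed in the outlier set.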

Lab Data Sample Selection Method

This sample selection method is based on the results from the reference analysis. The program
displays a histogram of the reference value distribution and allows the user to correct this
distribution by moving samples between the calibration, validation, and outlier sets. Errors
made when entering concentration values can also be corrected.
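
The bookkeeping behind this method can be pictured with a small sketch: a histogram of reference values per set, and a helper that moves a sample from one set to another. The data, the set assignments, and the function names move_sample and histogram_per_set are all hypothetical; Vision performs these steps interactively.

```python
import numpy as np

# Hypothetical reference values (for example, concentrations) for 10 samples.
reference = np.array([1.1, 1.3, 1.2, 2.8, 2.9, 3.0, 3.1, 4.7, 4.9, 9.5])

sets = {
    "calibration": {0, 1, 3, 4, 5, 7, 9},
    "validation": {2, 6, 8},
    "outlier": set(),
}

def move_sample(sets, index, target):
    """Remove a sample from whichever set holds it and add it to target."""
    for members in sets.values():
        members.discard(index)
    sets[target].add(index)

def histogram_per_set(reference, sets, bins=5):
    """Count reference values per bin for each set, using shared bin edges."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    counts = {}
    for name, members in sets.items():
        idx = np.array(sorted(members), dtype=int)
        counts[name] = np.histogram(reference[idx], bins=edges)[0]
    return edges, counts

# Sample 9 sits far from the rest of the distribution; move it to the outlier set.
move_sample(sets, 9, "outlier")
edges, counts = histogram_per_set(reference, sets)
for name, c in counts.items():
    print(name, c)
```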

Combined Sample Selection Method

In this method, the user can create a temporary set of samples without knowing the
corresponding reference values. The temporary set is processed by spectral sample selection,
and the calibration, validation, and outlier sets are saved. The samples belonging to the
calibration set are then analyzed.

The saved sets then undergo Lab Data Direct Sample Selection. This sample selection method
does not start from the temporary set, but uses the previously saved sets. The reference
values are entered into Vision, and the sample distribution is corrected as necessary.

This type of sample selection is useful when dealing with a large sample set, limited
access to the reference analysis, or both.