
Encoding for a load job, Encoding for an extract job – HP Neoview Release 2.5 Software User Manual

Page 56


TIP:

The Transporter NVTHOME/utils directory contains these programs that you can compile and execute to display character set encodings:

CharsetsSupported.java, which will display all the character set encodings supported by your Java installation.

DefaultEncoding.java, which will display the default encoding used by Java on your client system.
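The source of the two utility programs is not reproduced here. As a rough illustration of the kind of information they report, the following sketch uses only the standard java.nio.charset API; the class name EncodingInfo is hypothetical and is not one of the shipped programs.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

// Hypothetical stand-in, not the shipped CharsetsSupported.java or DefaultEncoding.java.
class EncodingInfo {
    /** Returns every charset this Java installation supports, keyed by canonical name. */
    static SortedMap<String, Charset> supported() {
        return Charset.availableCharsets();
    }

    /** Returns the canonical name of the default charset of this JVM. */
    static String defaultEncoding() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        for (String name : supported().keySet()) {
            System.out.println(name);
        }
        System.out.println("Default encoding: " + defaultEncoding());
    }
}
```

The output differs from one client system to another, which is why it is worth checking before you choose an encoding in the control file.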

Encoding For A Load Job

For a load job, the data file or pipe is read using the specified or default encoding, and the data is decoded into Java's native character encoding, UTF-16.

If the data is loaded into a UCS2 column, the data stays in the UTF-16 character set.

If the data is loaded into an ISO88591 column, its final encoding depends on the character set configuration of the Neoview system:

On Unicode configurations, the data is encoded as UTF-8.

On SJIS configurations, the data is encoded as MS932.

On ISO88591 configurations, the data is encoded using the same encoding that was used when the data was read.
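The decode-then-encode sequence described above can be sketched in plain Java. This is only an illustration of the two steps, not Transporter's implementation; the class and method names are hypothetical. Whether a particular target charset such as MS932 is available depends on your Java installation, so the sketch uses ISO-8859-1 and UTF-8, which every Java installation supports.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the two-step conversion; not Transporter code.
class LoadTranscodeSketch {
    /** Decodes the input bytes into a Java String (UTF-16), then encodes for the target. */
    static byte[] transcode(byte[] input, Charset sourceEncoding, Charset targetEncoding) {
        // Step 1: read the data using the source encoding.
        String decoded = new String(input, sourceEncoding);
        // Step 2: encode for the target. Note that getBytes silently substitutes
        // unmappable characters instead of reporting them.
        return decoded.getBytes(targetEncoding);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e, as ISO-8859-1 bytes.
        byte[] latin1 = {(byte) 0x63, (byte) 0x61, (byte) 0x66, (byte) 0xE9};
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(utf8.length); // prints 5: the accented e takes two bytes in UTF-8
    }
}
```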

Encoding For An Extract Job

For an extract job, the data is extracted from the Neoview system and decoded into Java's native character encoding, UTF-16. The data is then encoded using the encoding specified in the control file or, if none is specified, the default encoding. If the data cannot be encoded in the specified encoding or the target encoding of the Neoview system, Transporter reports the record as bad.
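To see what "cannot be encoded" means in practice, the following sketch checks whether a record can be represented in a target encoding, using the standard CharsetEncoder.canEncode method. It is an assumption-laden illustration, not how Transporter itself detects bad records; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; Transporter's own bad-record check is not documented here.
class ExtractCheckSketch {
    /** True if every character of the record can be represented in the target encoding. */
    static boolean isEncodable(String record, Charset target) {
        return target.newEncoder().canEncode(record);
    }

    public static void main(String[] args) {
        // An accented e (U+00E9) exists in ISO-8859-1.
        System.out.println(isEncodable("caf\u00E9", StandardCharsets.ISO_8859_1));    // prints true
        // The euro sign (U+20AC) does not exist in ISO-8859-1.
        System.out.println(isEncodable("10 \u20AC", StandardCharsets.ISO_8859_1));    // prints false
    }
}
```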

Changing Default Encoding Behavior Using Transporter Properties

There are three properties that can be used to change the Transporter encoding behavior.

encoding-error-disposition: Set to one of three options:

    REPORT: reports an encoding error. This is the default.
    REPLACE: causes the unmappable character to be replaced with another character. By default the replacement character is a question mark (?).
    IGNORE

encoding-error-replacementString: Set this to change the replacement character from a question mark to another single character.
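The three disposition names match the constants of Java's standard CodingErrorAction class. Assuming the properties behave like those Java actions (an assumption; the manual does not say how Transporter implements them), the effect of each option and of a custom replacement character can be sketched as follows. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch built on java.nio.charset; not Transporter code.
class DispositionSketch {
    /**
     * Encodes text using the given action for characters the target cannot represent.
     * Returns null when the action is REPORT and encoding fails.
     */
    static byte[] encode(String text, Charset target, CodingErrorAction action, byte replacement) {
        CharsetEncoder encoder = target.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        if (action == CodingErrorAction.REPLACE) {
            // Assumed analogue of changing the replacement character.
            encoder.replaceWith(new byte[] {replacement});
        }
        try {
            ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            return null; // REPORT: the caller would treat the record as bad
        }
    }

    public static void main(String[] args) {
        String text = "a\u20ACb"; // the euro sign is not representable in US-ASCII
        Charset ascii = StandardCharsets.US_ASCII;
        System.out.println(encode(text, ascii, CodingErrorAction.REPORT, (byte) '?') == null);          // prints true
        System.out.println(new String(encode(text, ascii, CodingErrorAction.REPLACE, (byte) '_'), ascii)); // prints a_b
        System.out.println(new String(encode(text, ascii, CodingErrorAction.IGNORE, (byte) '?'), ascii));  // prints ab
    }
}
```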

pass-through-mode: Set to true to prevent the data from being converted to Java's native UTF-16 encoding. This property is valid only when loading to or extracting from an ISO88591 Neoview configuration. It can be useful if your character set has multiple encodings for the same character.

Limitations for this property:

“Field Delimiter Character” (page 76), “nullstring” (page 61), “startseq” (page 64), and “endseq” (page 57) should be limited to single-byte characters that won’t be misread as the second byte of multi-byte data. These are allowed characters:

ASCII characters 1 – 31

" (double quote)

# (number sign)

! (exclamation point)

$ (dollar sign)

% (percent sign)

& (ampersand)
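The reason for the single-byte restriction can be shown with a small sketch. In Shift_JIS, the second byte of a double-byte character falls in the ranges 0x40 to 0x7E and 0x80 to 0xFC, so a delimiter from those ranges can be mistaken for data when bytes are scanned without decoding. The example below hard-codes the Shift_JIS bytes 0x83 0x5C (the katakana character SO, whose second byte equals the backslash); the byte-wise scan is an assumed simplification, not Transporter's parser, and the class name is hypothetical.

```java
// Hypothetical sketch of a byte-wise delimiter scan; not Transporter code.
class DelimiterCollisionSketch {
    /** Counts occurrences of the delimiter byte when the data is scanned byte by byte. */
    static int countDelimiterBytes(byte[] data, byte delimiter) {
        int count = 0;
        for (byte b : data) {
            if (b == delimiter) {
                count++;
            }
        }
        return count;
    }

    /** Two-field record: Shift_JIS katakana SO (0x83 0x5C), one delimiter, then 'A'. */
    static byte[] sampleRecord(byte delimiter) {
        return new byte[] {(byte) 0x83, (byte) 0x5C, delimiter, (byte) 0x41};
    }

    public static void main(String[] args) {
        // Backslash (0x5C) collides with the second byte of the first field.
        System.out.println(countDelimiterBytes(sampleRecord((byte) '\\'), (byte) '\\')); // prints 2
        // Number sign (0x23) is below 0x40 and cannot be a second byte.
        System.out.println(countDelimiterBytes(sampleRecord((byte) '#'), (byte) '#'));   // prints 1
    }
}
```

With the backslash, the scan reports two delimiters for a record that contains only one, which would split the first field in the wrong place.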
