Encoding for a load job, Encoding for an extract job – HP Neoview Release 2.5 Software User Manual
TIP: The Transporter NVTHOME/utils directory contains these programs that you can compile and execute to display character set encodings:
• CharsetsSupported.java, which will display all the character set encodings supported by your Java installation.
• DefaultEncoding.java, which will display the default encoding used by Java on your client system.
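The programs shipped in NVTHOME/utils are not reproduced here. As a rough illustration of the kind of information they report, the following minimal sketch uses the standard java.nio.charset.Charset API. The class name CharsetInfo and its method names are hypothetical and are not part of Transporter.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

// Hypothetical stand-in for the two utility programs; not the shipped source.
public class CharsetInfo {
    // Every charset this Java installation supports, keyed by canonical name.
    static SortedMap<String, Charset> supported() {
        return Charset.availableCharsets();
    }

    // The charset Java uses when no encoding is specified.
    static Charset defaultCharset() {
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println("Default encoding: " + defaultCharset().name());
        for (String name : supported().keySet()) {
            System.out.println(name);
        }
    }
}
```

The output differs between installations, so running it on the client system is a quick way to check whether an encoding you plan to name in a control file is available.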
Encoding For A Load Job
For a load job, the data file or pipe is read using the specified or default encoding, and the data is decoded into UTF-16, Java's native character encoding.
• If the data is loaded into an ISO88591 column, its final encoding depends on the character set configuration of the Neoview system:
— On Unicode configurations, the data is encoded as UTF-8.
— On SJIS configurations, the data is encoded as MS932.
— On an ISO88591 configuration, the data is encoded using the same encoding that was used when the data was read.
• If the data is loaded into a UCS2 column, the data stays in the UTF-16 character set.
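The decode-then-encode path described above can be sketched with plain Java string conversion. This is an illustration of the general technique only, not Transporter's implementation. The class and method names are hypothetical. The sketch assumes ISO-8859-1 input and a UTF-8 target, as on a Unicode configuration. An SJIS configuration would use MS932 instead, and whether that charset is present depends on the Java installation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the load path; not Transporter code.
public class LoadEncodingSketch {
    // Step 1: decode the raw input bytes using the source encoding.
    // A Java String holds the result as UTF-16 code units.
    static String decode(byte[] raw, Charset sourceEncoding) {
        return new String(raw, sourceEncoding);
    }

    // Step 2: encode the UTF-16 text in the target encoding.
    static byte[] encode(String text, Charset target) {
        return text.getBytes(target);
    }

    public static void main(String[] args) {
        // The word "cafe" with an e-acute, as four ISO-8859-1 bytes.
        byte[] raw = {0x63, 0x61, 0x66, (byte) 0xE9};
        String text = decode(raw, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        // The e-acute takes two bytes in UTF-8, so the byte count grows.
        System.out.println(text.length() + " chars, " + utf8.length + " UTF-8 bytes");
    }
}
```

The byte length changes with the target encoding even though the character count does not, which is worth remembering when sizing columns.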
Encoding For An Extract Job
For an extract job, the data is extracted from the Neoview system and decoded into UTF-16, Java's native character encoding. The data is then encoded using the encoding specified in the control file or, if none is specified, the default encoding. If the data cannot be encoded in the specified encoding or the target encoding of the Neoview system, Transporter reports the record as bad.
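To see what "cannot be encoded" means in practice, the following minimal sketch checks whether a record can be represented in a target encoding. It uses the standard CharsetEncoder with CodingErrorAction.REPORT. It is an assumption about the general approach and does not show how Transporter itself detects bad records. The class and method names are hypothetical.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not Transporter code.
public class ExtractEncodingCheck {
    // Returns true if every character in the record can be encoded in target.
    static boolean isEncodable(String record, Charset target) {
        CharsetEncoder encoder = target.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            encoder.encode(CharBuffer.wrap(record));
            return true;
        } catch (CharacterCodingException e) {
            // With REPORT, an unmappable character raises an exception
            // instead of being silently replaced.
            return false;
        }
    }

    public static void main(String[] args) {
        // Plain ASCII fits in ISO-8859-1.
        System.out.println(isEncodable("abc", StandardCharsets.ISO_8859_1));
        // The euro sign (U+20AC) has no ISO-8859-1 mapping.
        System.out.println(isEncodable("\u20ac", StandardCharsets.ISO_8859_1));
    }
}
```

A check like this can be run against sample data before an extract job to estimate how many records the chosen encoding would reject.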
Changing Default Encoding Behavior Using Transporter Properties
There are three properties that can be used to change the Transporter encoding behavior.
• encoding-error-disposition: Set to one of three options:
— REPORT: reports an encoding error. This is the default.
— REPLACE: causes the unmappable character to be replaced with another character. By default the replacement character is a question mark (?).
— IGNORE
• encoding-error-replacementString: Set this to change the replacement character from a question mark to another single character.
• pass-through-mode: Set to true to prevent the data from being converted to Java's native UTF-16 encoding. This property is only valid when loading to or extracting from an ISO88591 Neoview configuration. It can be useful if your character set has multiple encodings for the same character.
Limitations for this property:
— “Field Delimiter Character” (page 76), […], and […] should be limited to single-byte characters that won’t be misread as the second byte of multi-byte data. These are allowed characters:
◦ ASCII characters 1 – 31
◦ " (double quote)
◦ # (number sign)
◦ ! (exclamation point)
◦ $ (dollar sign)
◦ % (percent sign)
◦ & (ampersand)