Multibyte strings – HP Integrity NonStop J-Series User Manual

©Copyright 1996 Rogue Wave Software

Multibyte Strings

Class RWCString provides limited support for multibyte strings, sometimes used in representing various alphabets (see Chapter 16: Localizing Alphabets...). Because a multibyte character can consist of two or more bytes, the length of a string in bytes may be greater than or equal to the number of actual characters in the string.

If the RWCString contains multibyte characters, you should use member function mbLength() to return the number of characters. On the other hand, if you know that the RWCString does not contain any multibyte characters, then the results of length() and mbLength() will be the same, and you may want to use length() because it is much faster. Here's an example using a multibyte string in Sun:

RWCString Sun("\306\374\315\313\306\374");
cout << Sun.length(); // Prints "6"
cout << Sun.mbLength(); // Prints "3"

The string in Sun is the name of the day Sunday in Kanji, using the EUC (Extended UNIX Code)
multibyte code set. With the EUC, a single character may be 1 to 4 bytes long. In this example,
the string Sun consists of 6 bytes, but only 3 characters.

In general, the second or later byte of a multibyte character may be null. This means the length in bytes of a character string may or may not match the length given by strlen(). Internally, RWCString makes no assumptions [3] about embedded nulls, and hence can be used safely with character sets that use null bytes. You should also keep in mind that while RWCString::data() always returns a null-terminated string, there may be earlier nulls in the string. All of these effects are summarized in the following program:

#include <rw/cstring.h>
#include <rw/rstream.h>
#include <string.h>
int main() {
  RWCString a("abc"); // 1
  RWCString b("abc\0def"); // 2
  RWCString c("abc\0def", 7); // 3

  cout << a.length(); // Prints "3"
  cout << strlen(a.data()); // Prints "3"