This is the mail archive of the mailing list for the glibc project.


Re: New GB18030 gconv module for glibc (from ThizLinux Laboratory)

The printed standard (GB 18030) published on March 17, 2000 contains no
definition for the Unicode range U+10000..U+10ffff.

Markus Scherer wrote:

>I agree with what Anthony said about mapping code points: Even if they do
>not have assigned characters, their mappings are defined. This is true for
>all Unicode code points except _single_ surrogate code points.
>
>Mapping _from_ GB 18030 may sometimes result in "unassigned" handling
>because some 4-byte GB 18030 sequences are defined but do not have
>mappings to Unicode.
>
>Dirk and my publications on this are based on a printed version of the GB
>18030 standard from 2000 (plus the published electronic mapping tables),
>and from following discussions about the standard as much as possible. (I
>do not read/speak Chinese, but Dirk does; our companies had Chinese
>representatives that were in frequent discussion with the Chinese
>standards agency.)
>
>Note that the supplementary Unicode code points U+10000..U+10ffff were
>_designated_ in Unicode 2.0 (1996), with the pseudo-assignment of
>128*1024-4 of those code points (U+f0000..U+ffffd and U+100000..U+10fffd)
>as a Private-Use Area.
>
>Unicode 3.1 did not invent this supplementary range but was "only" the
>first Unicode version that assigned "real" characters to such code points
>(and assigned >40000 of them).
>
>Note also that formally GB 18030 defines mappings to ISO 10646, not
>Unicode. One of the differences is the publication schedule. Supplementary
>character assignments were published only in December 2001 with ISO 10646
>part 2, which synchronized with Unicode 3.1 several months after its
>
>Markus Scherer  IBM GCoC-Unicode/ICU  San José, CA
> (also for SameTime)

Yu Shao
Red Hat Asia-Pacific
+61 7 3872 4835
