This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Unicode 3.2 support (7), JISX0213 converter



Unicode 3.2 has added enough Hanzi and backward compatibility characters to make
a converter from the JISX0213:2000 character set to Unicode possible. Here is a
patch that adds iconv converters for the two encodings based on it, EUC-JISX0213
and Shift_JISX0213. It also adds charmaps for both, so that the testsuite
checks these encodings.
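
As a rough illustration of how an application would reach these converters,
here is a minimal sketch using the standard iconv(3) interface. The helper name
convert_to_utf8 is hypothetical, and the sketch assumes that the encoding names
registered by the patch (EUC-JISX0213, SHIFT_JISX0213) are present in the
installed gconv modules. Where they are not, iconv_open() fails and the helper
returns -1.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string IN from encoding
   FROM to UTF-8, storing at most OUTSIZE - 1 bytes plus a terminating NUL
   in OUT.  Returns the number of bytes written, or -1 on failure.  */
long
convert_to_utf8 (const char *from, const char *in, char *out, size_t outsize)
{
  if (outsize == 0)
    return -1;

  iconv_t cd = iconv_open ("UTF-8", from);
  if (cd == (iconv_t) -1)
    return -1;                  /* converter not available */

  char *inptr = (char *) in;
  size_t inleft = strlen (in);
  char *outptr = out;
  size_t outleft = outsize - 1;

  size_t res = iconv (cd, &inptr, &inleft, &outptr, &outleft);
  if (res != (size_t) -1)
    /* Flush any state the converter may still be holding.  */
    res = iconv (cd, NULL, NULL, &outptr, &outleft);
  iconv_close (cd);

  if (res == (size_t) -1)
    return -1;
  *outptr = '\0';
  return (long) (outptr - out);
}
```

A call such as convert_to_utf8 ("SHIFT_JISX0213", text, buf, sizeof buf) would
then be the typical use. Bytes below 0x80 other than 0x5C and 0x7E should come
through unchanged.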

This is not "just another" Japanese encoding. Shift_JISX0213 is an extension
of Shift_JIS, with MB_CUR_MAX = 2. The big promise of this encoding is
therefore to prolong the life of the family of encodings that have the
Yen/backslash problem.
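
For readers unfamiliar with the Yen/backslash problem: the single-byte half of
Shift_JIS is JIS X 0201 Roman (ISO646-JP) rather than ASCII, so byte 0x5C
stands for the yen sign, while file names and C sources treat the same byte as
a backslash. A minimal sketch of that mapping follows. The function name is
hypothetical, and whether a particular converter applies this mapping or keeps
ASCII at 0x5C is exactly the point on which implementations differ.

```c
#include <stdint.h>

/* Hypothetical helper: map one single-byte code below 0x80, read as
   JIS X 0201 Roman (ISO646-JP), to its Unicode code point.  The caller
   is expected to pass only values in the range 0x00..0x7F.  */
uint32_t
iso646jp_to_ucs (unsigned char byte)
{
  if (byte == 0x5c)
    return 0x00a5;              /* YEN SIGN, where ASCII has REVERSE SOLIDUS */
  if (byte == 0x7e)
    return 0x203e;              /* OVERLINE, where ASCII has TILDE */
  return byte;                  /* all other positions agree with ASCII */
}
```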

Note 1: The JISX0213 standard also specifies an encoding named ISO-2022-JP-3.
Since I haven't been able to find reliable information about it in English, I
have not added a converter for it.

Note 2: EUC-JISX0213 and Shift_JISX0213 have a property that makes them unusable
as encodings for glibc locales: both contain combining characters. This is not
in the same sense as for CP1255 and CP1258, where Unicode contains the combined
character and the legacy encoding doesn't, but in the opposite sense: some
EUC-JISX0213 characters correspond to pairs of Unicode characters. This means
that mbrtowc() for such a character would have to return two wchar_t's, not
just one.
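
The mismatch can be made visible through iconv(3), which, unlike mbrtowc(), is
free to produce several output characters for one input character. The sketch
below counts the wchar_t's produced for a given byte sequence. The helper name
count_wchars is hypothetical, "WCHAR_T" is the target name accepted by glibc's
iconv_open(), and the example byte pair 0xA4 0xF7 (plane 1, row 4, column 87)
is assumed to be one of the characters that maps to a base character plus a
combining mark, U+304B U+309A.

```c
#include <iconv.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: convert the LEN bytes at IN from encoding FROM to
   wchar_t and return how many wchar_t's came out, or -1 if the converter
   is unavailable or the conversion fails.  */
long
count_wchars (const char *from, const char *in, size_t len)
{
  iconv_t cd = iconv_open ("WCHAR_T", from);
  if (cd == (iconv_t) -1)
    return -1;

  wchar_t out[8];
  char *inptr = (char *) in;
  size_t inleft = len;
  char *outptr = (char *) out;
  size_t outleft = sizeof out;

  size_t res = iconv (cd, &inptr, &inleft, &outptr, &outleft);
  if (res != (size_t) -1)
    /* Flush, in case the second half of a pair is still buffered.  */
    res = iconv (cd, NULL, NULL, &outptr, &outleft);
  iconv_close (cd);

  if (res == (size_t) -1)
    return -1;
  return (long) ((sizeof out - outleft) / sizeof (wchar_t));
}
```

With the converters from this patch installed, count_wchars ("EUC-JISX0213",
"\xa4\xf7", 2) should report 2 for a single two-byte input character, which is
a result that a single mbrtowc() call has no way to deliver.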


ChangeLog:
2002-04-15  Bruno Haible  <bruno@clisp.org>

	* iconvdata/JISX0213.TXT: New file.
	* iconvdata/jisx0213.h: New file.
	* iconvdata/jisx0213.c: New file.
	* iconvdata/euc-jisx0213.c: New file.
	* iconvdata/shift_jisx0213.c: New file.
	* iconvdata/gconv-modules (EUC-JISX0213, SHIFT_JISX0213): New modules.
	* iconvdata/EUC-JISX0213.precomposed: New file.
	* iconvdata/SHIFT_JISX0213.precomposed: New file.
	* iconvdata/SHIFT_JISX0213.irreversible: New file.
	* iconvdata/tst-table-to.c (main): Make it work for encodings for
	which the "to" direction is stateful.
	* iconvdata/tst-tables.sh: Add EUC-JISX0213, SHIFT_JISX0213.
	* iconvdata/Makefile (modules): Add libJISX0213, EUC-JISX0213,
	SHIFT_JISX0213.
	(libJISX0213-routines): New variable.
	(LDFLAGS-EUC-JISX0213.so, LDFLAGS-SHIFT_JISX0213.so): New variables.
	(EUC-JISX0213.so, SHIFT_JISX0213.so): Depend on libJISX0213.so.
	(LDFLAGS-libJISX0213.so): New variable.
	(distribute): Add JISX0213.TXT, EUC-JISX0213.precomposed,
	SHIFT_JISX0213.precomposed, SHIFT_JISX0213.irreversible,
	jisx0213.c, jisx0213.h, euc-jisx0213.c, shift_jisx0213.c.

localedata/ChangeLog:
2002-04-15  Bruno Haible  <bruno@clisp.org>

	* charmaps/EUC-JISX0213: New file.
	* charmaps/SHIFT_JISX0213: New file.

[The patch is too large for this mailing list. You can download it from
ftp://ftp.ilog.fr/pub/Users/haible/gnu/glibc-unicode32-patch7.bz2 .]

