This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.
Re: [PATCH] Remove 0x005C conversion from __jisx0208_from_ucs4_lat1 for ISO-2022-JP
- From: Bruno Haible <bruno at clisp dot org>
- To: Fumitoshi UKAI <ukai at debian dot or dot jp>
- Cc: Ulrich Drepper <drepper at redhat dot com>, GOTO Masanori <gotom at debian dot or dot jp>, libc-alpha at sources dot redhat dot com
- Date: Mon, 27 Sep 2004 21:44:52 +0200
- Subject: Re: [PATCH] Remove 0x005C conversion from __jisx0208_from_ucs4_lat1 for ISO-2022-JP
Ulrich Drepper wrote:
> The implemented behavior has been added on demand and changing this
> will break code.
Presumably the demand was to map U+005C to one particular ISO-2022-JP
character. What the glibc code currently does, however, is map U+005C either
to one ISO-2022-JP character (the equivalent of U+005C) or to another
(the equivalent of U+FF3C), depending on the preceding characters:
$ printf "\xe3\x81\x82\x5c" | /usr/bin/iconv -f utf-8 -t iso-2022-jp \
| /usr/bin/iconv -f iso-2022-jp -t ucs-4le \
| hexdump -e '"%06.6_ax " 16/4 "%08X " "\n"'
000000 00003042 0000FF3C
$ printf "\xe3\x81\x82 \x5c" | /usr/bin/iconv -f utf-8 -t iso-2022-jp \
| /usr/bin/iconv -f iso-2022-jp -t ucs-4le \
| hexdump -e '"%06.6_ax " 16/4 "%08X " "\n"'
000000 00003042 00000020 0000005C
If the demand was to map U+005C to FULLWIDTH REVERSE SOLIDUS (U+FF3C), the
current behaviour is incorrect, because the second run yields U+005C. If the
demand was to map U+005C to REVERSE SOLIDUS, the current behaviour is
incorrect as well, because the first run yields U+FF3C. Either way, it looks
like an implementation bug, not like desired behaviour.
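For anyone who wants to compare converters, the check can be wrapped in a
small helper. This is only a sketch, assuming a POSIX shell with iconv, od,
tr and tail in PATH; last_codepoint is a hypothetical name, not part of glibc
or libiconv. It uses octal escapes, since not every printf understands \x.
What it prints depends on the iconv implementation and version installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of glibc or libiconv).
# Convert UTF-8 input to an intermediate encoding and back to UCS-4BE,
# then print the last code point as 8 lowercase hex digits.
#   $1: intermediate encoding, e.g. iso-2022-jp
#   $2: printf format string that produces the UTF-8 input bytes
#       (octal escapes; must not contain '%')
last_codepoint() {
    printf "$2" \
        | iconv -f utf-8 -t "$1" \
        | iconv -f "$1" -t ucs-4be \
        | od -An -v -tx1 \
        | tr -d ' \n' \
        | tail -c 8
}

# U+005C alone, after U+3042, and after U+3042 followed by a space.
# A context-independent converter reports the same code point three times.
for input in '\134' '\343\201\202\134' '\343\201\202\040\134'; do
    printf '%s -> %s\n' "$input" "$(last_codepoint iso-2022-jp "$input")"
done
```

If the three lines do not all show the same value, the mapping of U+005C
depends on the preceding characters, which is the behaviour described above.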
GNU libiconv, by the way, always maps U+005C to REVERSE SOLIDUS.
Bruno