This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Re: [PATCH] Remove 0x005C conversion from __jisx0208_from_ucs4_lat1 for ISO-2022-JP


At Thu, 09 Sep 2004 10:12:01 -0700,
Ulrich Drepper wrote:

> GOTO Masanori wrote:
> > Could someone take and look at this patch?  I heard and reported from
> > various Japanese users about this problem for a long time.
> 
> The implemented behavior has been added by default and changing this
> will break code.  Since I did not decide on this myself back when there
> are definitely opposite sides on this issue within the group of people
> affected.  In this case it is better to be conservative and not change
> anything.

The current behavior breaks the ISO-2022-JP encoding, so we can't use
iconv(3) to convert to ISO-2022-JP.

Would you explain (or tell me who can explain) why this is reasonable
behavior, please?

For example, this is a valid ISO-2022-JP sequence:

 $ printf "\x1b\x24\x42\x24\x22\x1b\x28\x4a\x5c\x61\x1b\x24\x42\x24\x22\x1b\x28\x42\x5c\x61\x1b\x24\x42\x24\x22\x21\x40\x1b\x28\x42\x61\n"

It decodes to:
 [HIRAGANA LETTER A] [YEN SIGN] [LATIN SMALL LETTER A]
 [HIRAGANA LETTER A] [REVERSE SOLIDUS] [LATIN SMALL LETTER A]
 [HIRAGANA LETTER A] [FULLWIDTH REVERSE SOLIDUS] [LATIN SMALL LETTER A]

In ISO-2022-JP, these three characters (YEN SIGN, REVERSE SOLIDUS,
FULLWIDTH REVERSE SOLIDUS) can each be represented without any confusion.
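
To make the distinction concrete, here is a minimal Python sketch that
walks the sample bytes above and reports which character set each code
belongs to. It is only an illustration, not glibc's converter: the
tokenize() helper and the NAMES table are hypothetical, and the table
covers only the codes that occur in this sample, using the character
names given in this message.

```python
# Escape sequences that select a character set in ISO-2022-JP.
ESCAPES = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201 Roman",
    b"\x1b$@": "JIS X 0208",  # JIS X 0208-1978
    b"\x1b$B": "JIS X 0208",  # JIS X 0208-1983
}

# Hypothetical lookup table: only the codes used in the sample,
# named as in this message. Not a complete mapping.
NAMES = {
    ("ASCII", 0x5C): "REVERSE SOLIDUS",
    ("ASCII", 0x61): "LATIN SMALL LETTER A",
    ("JIS X 0201 Roman", 0x5C): "YEN SIGN",
    ("JIS X 0201 Roman", 0x61): "LATIN SMALL LETTER A",
    ("JIS X 0208", 0x2422): "HIRAGANA LETTER A",
    ("JIS X 0208", 0x2140): "FULLWIDTH REVERSE SOLIDUS",
}

SAMPLE = (b"\x1b\x24\x42\x24\x22\x1b\x28\x4a\x5c\x61"
          b"\x1b\x24\x42\x24\x22\x1b\x28\x42\x5c\x61"
          b"\x1b\x24\x42\x24\x22\x21\x40\x1b\x28\x42\x61\n")


def tokenize(data):
    """Split an ISO-2022-JP byte string into (charset, code) pairs."""
    charset = "ASCII"  # the stream starts in ASCII
    tokens = []
    i = 0
    while i < len(data):
        if data[i] == 0x1B:
            # Three-byte escape sequence: switch the active charset.
            charset = ESCAPES[data[i:i + 3]]
            i += 3
        elif data[i] < 0x20:
            # Skip other control characters such as the final newline.
            i += 1
        elif charset == "JIS X 0208":
            # Two-byte code.
            tokens.append((charset, (data[i] << 8) | data[i + 1]))
            i += 2
        else:
            # One-byte code (ASCII or JIS X 0201 Roman).
            tokens.append((charset, data[i]))
            i += 1
    return tokens


names = [NAMES[token] for token in tokenize(SAMPLE)]
for name in names:
    print(name)
```

The same byte 0x5c appears twice in the sample, but the preceding escape
sequence (ESC ( J versus ESC ( B) decides whether it means YEN SIGN or
REVERSE SOLIDUS, while the two-byte code 0x2140 is a third, separate
character.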

But when it is passed to "iconv -f ISO-2022-JP -t ISO-2022-JP", we get:

 [HIRAGANA LETTER A] [YEN SIGN] [LATIN SMALL LETTER A]
 [HIRAGANA LETTER A] [FULLWIDTH REVERSE SOLIDUS] [LATIN SMALL LETTER A]
 [HIRAGANA LETTER A] [FULLWIDTH REVERSE SOLIDUS] [LATIN SMALL LETTER A]

So it breaks REVERSE SOLIDUS by converting it to FULLWIDTH REVERSE SOLIDUS.
Why?

> You will have to do much better than just saying "some people
> complained".  You have to show that changing this does break anything
> significantly _and_ that people cannot live without this change (despite
> the existing behavior being in place for 7 years now).

I believe we had not used iconv(3) for this purpose for a long time, but
recently many applications have begun to use iconv(3), so this problem has
appeared. (Actually, at first I thought it was not an iconv(3) bug but an
application bug, and so did others.)

Regards,
Fumitoshi UKAI

