This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: [PATCH] Remove 0x005C conversion from __jisx0208_from_ucs4_lat1 for ISO-2022-JP


At Thu, 09 Sep 2004 10:12:01 -0700,
Ulrich Drepper wrote:
> The implemented behavior has been added by default and changing this
> will break code.  Since I did not decide on this myself back when there
> are definitely opposite sides on this issue within the group of people
> affected.

Thanks for your reply.  I did not know that this behavior was added
on request.  I checked the cvs diff and cvs log for jis0208.c, but
this part appears to be unchanged since the first check-in.

Do you remember who requested it?  I guess they had their reasons
(such as keeping EUC-JP/SJIS conversion reversible), so I would like
to contact them, learn those reasons, and discuss this problem.

> In this case it is better to be conservative and not change
> anything.

In this case, the character \ (reverse solidus) is altered by a
round trip (even ISO-2022-JP <-> ISO-2022-JP).  I wonder why that is
an acceptable conversion.  There are some irreversible round trips in
EUC-JP and SJIS as well, but this patch is specific to ISO-2022-JP.
I hope that you, as the original author, will read the patch again.

> You will have to do much better than just saying "some people
> complained".  You have to show that changing this does break anything
> significantly _and_ that people cannot live without this change (despite
> the existing behavior being in place for 7 years now).

The actual cases that raised this problem were mail user agents
(sylpheed and mutt) and IRC clients (xchat) that use iconv().

We usually use EUC-JP in Unix environments (and SJIS on Windows and
Macintosh).  ISO-2022-JP is used mainly in emacs, HTML, IRC and email
(an RFC specifies that ISO-2022-JP should be used in Japanese mail,
and ISO-2022-JP is also popular in Japanese IRC).

But emacsen do not use iconv().  In Japanese HTML, SJIS and EUC-JP can
also be used, and they are becoming the majority.  Until a few years
ago, the major mailers and IRC clients in Unix environments were
emacsen-based (Mew, Wanderlust, liece, irchat-jp, and so on).  They do
not use the glibc iconv() ISO-2022-JP conversion either.  (Note that I
am an emacsen user myself, so I hardly ever see this kind of problem
in my usual environment.)

Moreover, this problem occurs only for the sequence "A\b" (where A is
a JISX0208 character and b is an ISO-646 character), which appears
rarely in Japanese text.  One recent example came up with mutt.  A
mutt user wants to send a .po file that includes text like:
msgstr "AAA\n".  However, the mail ends up containing "AAAcn", where c
is the fullwidth reverse solidus.  The reverse solidus \ is changed
undesirably, and the .po file is broken for no reason.
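
For anyone who wants to check this on their own system, here is a
minimal sketch of the round trip using iconv().  The helper names
convert() and roundtrip_preserved() are hypothetical (they are not part
of glibc or of the patch), and the sample text is assumed to be UTF-8.
The result depends on the installed iconv implementation and on
whether its ISO-2022-JP converter is available, so no particular
output is claimed here.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert inlen bytes of `in` from charset `from` to charset `to`.
   Returns the number of bytes written to `out`, or (size_t)-1 if the
   conversion is unavailable or fails.  (Hypothetical helper.) */
size_t convert(const char *from, const char *to,
               const char *in, size_t inlen,
               char *out, size_t outcap)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inp = (char *)in;   /* iconv() does not modify the input bytes */
    char *outp = out;
    size_t inleft = inlen, outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        /* Flush: emit any pending escape back to the initial state. */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    iconv_close(cd);

    return rc == (size_t)-1 ? (size_t)-1 : outcap - outleft;
}

/* Round-trip UTF-8 text through ISO-2022-JP and back.
   Returns 1 if the bytes come back unchanged, 0 if they were altered,
   and -1 if either conversion could not be performed. */
int roundtrip_preserved(const char *utf8)
{
    char jis[256], back[256];
    size_t len = strlen(utf8);

    size_t n = convert("UTF-8", "ISO-2022-JP", utf8, len, jis, sizeof jis);
    if (n == (size_t)-1)
        return -1;

    size_t m = convert("ISO-2022-JP", "UTF-8", jis, n, back, sizeof back);
    if (m == (size_t)-1)
        return -1;

    return m == len && memcmp(back, utf8, m) == 0;
}
```

Calling roundtrip_preserved("\xe3\x81\x82\\n") exercises the "A\b"
pattern described above: a JISX0208 character (HIRAGANA LETTER A in
UTF-8) followed directly by a reverse solidus.  A return value of 0
would mean the text was altered somewhere in the round trip.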

So I'm not surprised that it has existed for 7 years.

Regards,
-- gotom

