This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: The C locale

From: Andy Koppe <andy dot koppe at gmail dot com>
To: cygwin at cygwin dot com
Date: Tue, 29 Sep 2009 05:04:18 +0100
Subject: Re: The C locale
References: <20090921103758.GE20981@calimero.vinschen.de> <20090924073441.GA30267@calimero.vinschen.de> <3f0ad08d0909240237s518de248jee409b731711404a@mail.gmail.com> <20090924095701.GC30851@calimero.vinschen.de> <20090924100006.GD30851@calimero.vinschen.de> <20090926091504.GA7275@calimero.vinschen.de> <3f0ad08d0909262021u5fe79873r65850865166ce40f@mail.gmail.com> <3f0ad08d0909280903t5caaf611ie4049a73beb93f06@mail.gmail.com> <20090928161626.GC8378@calimero.vinschen.de> <20090929092340.796@binki>

2009/9/29 wynfield:
>
> Though I'm not up on the details involved here, I will give
> you feedback on the request for information about the locale issue, because it affects the quick accessibility and usage of Japanese-language documents.
>
> Either of the two following values would be acceptable, but I feel that the UTF-8 charset is becoming more and more widely adopted.
>     LANG=ja -> UTF-8
>     LANG=ja_JP -> UTF-8
>
> Also, the following would be suitable if possible:
>     LANG=ja -> iso-2022-jp
>     LANG=ja_JP -> iso-2022-jp

Thanks for the feedback!

Now, Windows knows three different variants of iso-2022-jp. Do you
know which one's the preferred one?

CP50220: ISO 2022 Japanese with no halfwidth Katakana; Japanese (JIS)
CP50221: ISO 2022 Japanese with halfwidth Katakana; Japanese
(JIS-Allow 1 byte Kana)
CP50222: ISO 2022 Japanese JIS X 0201-1989; Japanese (JIS-Allow 1 byte
Kana - SO/SI)

Also, Wikipedia has this to say:

"Since ISO 2022 is a stateful encoding, a program can not jump in the
middle of a block of text to search, insert or delete characters. This
makes manipulation of the text very cumbersome and slow when compared
to non-stateful encodings. Any jump in the middle of the text may
require a back up to the previous escape sequence before the bytes
following the escape sequence can be interpreted."

Doesn't that make it very difficult to use with standard Unix tools?
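
To make the concern concrete, here is a minimal sketch in Python. It assumes Python's built-in iso2022_jp codec, which need not match any of the three Windows codepages above exactly, and the sample string is only an illustration. It shows that bytes taken from the middle of an ISO-2022-JP stream decode as unrelated ASCII, and that a byte-level search can match a letter that is not in the text, while UTF-8 has neither problem.

```python
# Sketch: why a stateful encoding such as ISO-2022-JP is awkward for
# byte-oriented tools. Uses only Python's built-in codecs.

text = "日本語"  # illustrative sample, three Japanese characters

iso = text.encode("iso2022_jp")  # escape sequence, 7-bit bytes, escape back to ASCII
utf8 = text.encode("utf-8")      # no shift state; each character here is 3 bytes

# The ISO-2022-JP stream starts with ESC $ B, which switches the decoder
# into the two-byte JIS X 0208 state.
ESCAPE_LEN = 3

# Jump into the middle of each stream. For ISO-2022-JP we land just after
# the escape sequence, so the decoder never learns about the state switch.
iso_tail = iso[ESCAPE_LEN:].decode("iso2022_jp")
utf8_tail = utf8[3:].decode("utf-8")  # skip exactly one 3-byte character

print(repr(iso_tail))   # ASCII characters, not Japanese
print(repr(utf8_tail))  # the last two characters of the original text

# A byte-level search, as a tool unaware of the encoding would do it,
# finds the letter K in the ISO-2022-JP bytes although the text has no K.
print(b"K" in iso)
print(b"K" in utf8)
```

A tool that wants correct results on such data has to track the escape sequences itself or convert to a stateless encoding first.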

Andy


