The C locale
Andy Koppe
andy.koppe@gmail.com
Tue Sep 29 04:04:00 GMT 2009
2009/9/29 wynfield:
>
> Though I'm not up on the details involved here, I will give
> you feedback on the request for information about the locale issue,
> because it affects the quick accessibility and usage of
> Japanese-language documents.
>
> Either of the following two values would be acceptable, but I feel
> that the UTF-8 charset is becoming more and more widely adopted.
> LANG=ja -> UTF-8
> LANG=ja_JP -> UTF-8
>
> Also, the following would be suitable if possible:
> LANG=ja -> iso-2022-jp
> LANG=ja_JP -> iso-2022-jp
Thanks for the feedback!
Now, Windows knows three different variants of iso-2022-jp. Do you
know which one's the preferred one?
CP50220: ISO 2022 Japanese with no halfwidth Katakana; Japanese (JIS)
CP50221: ISO 2022 Japanese with halfwidth Katakana; Japanese (JIS-Allow 1 byte Kana)
CP50222: ISO 2022 Japanese JIS X 0201-1989; Japanese (JIS-Allow 1 byte Kana - SO/SI)
Also, Wikipedia has this to say:
"Since ISO 2022 is a stateful encoding, a program can not jump in the
middle of a block of text to search, insert or delete characters. This
makes manipulation of the text very cumbersome and slow when compared
to non-stateful encodings. Any jump in the middle of the text may
require a back up to the previous escape sequence before the bytes
following the escape sequence can be interpreted."
Doesn't that make it very difficult to use with standard Unix tools?
Andy