From 573df59cd45796a1037c731c62a1bee86bb87913 Mon Sep 17 00:00:00 2001
From: Corinna Vinschen
Date: Sat, 6 Feb 2010 21:41:05 +0000
Subject: [PATCH] * setup2.sgml (setup-locale-ov): Align description of working modifiers to latest changes.

---
 winsup/doc/ChangeLog   |  5 ++++
 winsup/doc/setup2.sgml | 52 +++++++++++++++++++++++++++---------------
 2 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/winsup/doc/ChangeLog b/winsup/doc/ChangeLog
index ca3d63dea..95d248ae6 100644
--- a/winsup/doc/ChangeLog
+++ b/winsup/doc/ChangeLog
@@ -1,3 +1,8 @@
+2010-02-06 Corinna Vinschen
+
+ * setup2.sgml (setup-locale-ov): Align description of working modifiers
+ to latest changes.
+
 2010-02-06 Corinna Vinschen

 * new-features.sgml (ov-new1.7.2): Add support for new charsets.
diff --git a/winsup/doc/setup2.sgml b/winsup/doc/setup2.sgml
index de0de2fae..0f236f6be 100644
--- a/winsup/doc/setup2.sgml
+++ b/winsup/doc/setup2.sgml
@@ -272,33 +272,47 @@ ignored for now.

-For languages which default to one of the ISO-8859 character
-sets, the modifier "@euro" can be added to enforce usage of the ISO-8859-15
-character set, which includes a character for the "Euro" currency sign .
-
+For languages which default to the ISO-8859-1 character
+set, the modifier "@euro" can be added to enforce usage of the ISO-8859-15
+character set, which includes a character for the "Euro" currency sign.
+Beware, that also works for non-european locales.
+
+
+
+The default script used for all Serbian language locales (sr_BA, sr_ME, sr_RS,
+and the deprecated sr_CS and sr_SP) is cyrillic. With the "@latin" modifier
+it gets switched to the latin script with the respective collation behaviour.
+

-The default charset of the "uz_UZ" locale is ISO-8859-1. With the "@cyrillic"
-modifier it's UTF-8.
+The default charset of the "be_BY" locale (Belarusian/Belarus) is CP1251.
+With the "@latin" modifier it's UTF-8.

-The default charset of the "tt_RU" locale is ISO-8859-5. With the "@iqtelif"
-modifier it's UTF-8.
+The default charset of the "tt_RU" locale (Tatar/Russia) is ISO-8859-5.
+With the "@iqtelif" modifier it's UTF-8.

-There's a class of characters in the Unicode character set,
-called the "CJK Ambiguous Width Character set". For these characters the width
+
+The default charset of the "uz_UZ" locale (Uzbek/Uzbekistan) is ISO-8859-1.
+With the "@cyrillic" modifier it's UTF-8.
+
+
+
+There's a class of characters in the Unicode character set, called the
+"CJK Ambiguous Width Character set". For these characters the width
 returned by the wcwidth/wcswidth function is usually 1. This is often a
-problem in East-Asian languages, which historically use character sets in
-which these characters have a width of 2. By default, the wcwidth/wcswidth
-functions return 1 as the width of these characters, except if the language is
-specifed as "ja" (Japanese), "ko" (Korean), or "zh" (Chinese). In these
-languages wcwidth and wcswidth return 2 for these characters. This is not
-correct in all circumstances, so the user of one of these languages can specify
-the modifier "@cjknarrow", which modifies the behaviour of wcwidth/wcswidth to
-return 1 for the ambiguous width characters.
-
+problem in East-Asian languages, which historically use character sets
+in which these characters have a width of 2. By default, the
+wcwidth/wcswidth functions return 1 as the width of these characters,
+except if the language is specifed as "ja" (Japanese), "ko" (Korean), or
+"zh" (Chinese). In these languages wcwidth and wcswidth return 2 for
+these characters. This is not correct in all circumstances, so the user
+of one of these languages can specify the modifier "@cjknarrow", which
+modifies the behaviour of wcwidth/wcswidth to return 1 for the ambiguous
+width characters.
+

-- 
2.43.5