Created attachment 7912 [details]
patch for hanzi collation

Add a new collation file for hanzi from bug 16905 to the localedata of cmn_TW.
Tested on Fedora 21 x86_64 beta.
Tested OK on Fedora 24.
Created attachment 10276 [details]
patch for hanzi collation

Patch updated.
(In reply to Wei-Lun Chao from comment #3)
> Created attachment 10276 [details]
> patch for hanzi collation
>
> patch updated

Why does your patch remove “country_car”?

@@ -200,7 +208,6 @@
 % TWN
 country_ab3 "TWN"
 country_num 158
-country_car "RC"
 country_isbn 957
 % 漢語官話
 lang_name "漢語官話"

According to
https://en.wikipedia.org/wiki/List_of_international_vehicle_registration_codes
“RC” seems to be correct.
Created attachment 10325 [details]
patch for hanzi collation

Oh! It's my fault.
Patch re-uploaded.
Should this stroke count sorting also be applied to zh_TW, or only to cmn_TW?

(In reply to Wei-Lun Chao from comment #5)
> Created attachment 10325 [details]
> patch for hanzi collation
>
> Oh! Its my fault.
> patch re-uploaded.

Thank you!

Should the new collation also be used for zh_TW, or only for cmn_TW?
By the way, what is the difference between zh_TW and cmn_TW? Aren't both Mandarin?
(In reply to Mike FABIAN from comment #6)
> Should the new collation also be used for zh_TW, or only
> for cmn_TW.
> By the way, what is the difference between zh_TW
> and cmn_TW, isn’t both Mandarin?

As explained in bug 15963, those 14 languages have been hidden behind the
macro-language "zh" for a long time. Technically zh_TW and cmn_TW are the
same, but for fairness, IMHO, the locale zh_TW should be deprecated and
replaced with cmn_TW and the other Chinese locales.

Personally, I would like to differentiate cmn from zh with this radical
patch, which may be followed by similar patches for nan_TW, hak_TW,
lzh_TW and yue_HK.
(In reply to Wei-Lun Chao from comment #7)
> (In reply to Mike FABIAN from comment #6)
> > Should the new collation also be used for zh_TW, or only
> > for cmn_TW.
> > By the way, what is the difference between zh_TW
> > and cmn_TW, isn’t both Mandarin?
>
> As reasons for bug 15963, those 14 languages have been behind the
> macro-language "zh" for a long time. Technically zh_TW and cmn_TW are the
> same, but for fairness, IMHO, the locale zh_TW should be deprecated and
> replaced with cmn_TW and other chinese locales.
>
> Personally I would like to differentiate cmn from zh with this radical
> patch, which may be followed by similar patches against nan_TW, hak_TW,
> lzh_TW and yue_HK.

OK.

How do I test your patch?

I did this:

Without your patch:

$ echo -e "黄\n木\n機\n期" | LC_ALL=cmn_TW.UTF-8 sort
期
木
機
黄
$

With your patch:

$ echo -e "黄\n木\n機\n期" | LC_ALL=cmn_TW.UTF-8 sort
木
黄
期
機
$

That seems to show that I applied your patch correctly, right?
(In reply to Mike FABIAN from comment #8)
> OK.
>
> How to test your patch?
>
> I did this:
>
> Without your patch:
>
> $ echo -e "黄\n木\n機\n期" | LC_ALL=cmn_TW.UTF-8 sort
> 期
> 木
> 機
> 黄
> $
>
> With your patch:
>
> $ echo -e "黄\n木\n機\n期" | LC_ALL=cmn_TW.UTF-8 sort
> 木
> 黄
> 期
> 機
> $
>
> That seems to show that I applied your patch correctly, right?

Yes, I used to test bug 16905 like this:

$ touch 黄 木 機 期
$ ls
$ LC_ALL=cmn_TW.UTF-8 ls
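For readers who want to see why the two orderings above differ, here is a minimal sketch. It assumes that the "without patch" output matches plain code point order for these four characters, and it uses stroke counts that I filled in by hand for illustration (4, 11, 12 and 16); they are not read from the cns11643_stroke file. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the stroke counts below are assumed for illustration and are
# not taken from localedata/locales/cns11643_stroke.

# Byte order under the C locale; for UTF-8 input this equals code point order.
codepoint_order() {
  printf '%s\n' 黄 木 機 期 | LC_ALL=C sort
}

# Sort "count character" pairs numerically, then drop the count column.
stroke_order() {
  printf '%s\n' '11 黄' '4 木' '16 機' '12 期' | LC_ALL=C sort -n | cut -d' ' -f2
}

codepoint_order   # 期 木 機 黄, one per line (the "without patch" order)
stroke_order      # 木 黄 期 機, one per line (the "with patch" order)

# Only meaningful on a system where a cmn_TW locale is installed.
if locale -a 2>/dev/null | grep -q '^cmn_TW'; then
  printf '%s\n' 黄 木 機 期 | LC_ALL=cmn_TW.UTF-8 sort
fi
```

The last command is the same check as in the comments above, guarded so that the script still runs where the locale is missing.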
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  bd80111ed9cb93b2d56720dcd1d1f259616c27ae (commit)
       via  4169825556bcc23ced731e711be91819465d4a83 (commit)
       via  38dbcacb606f70ad0a35fbcacb6f3cbff5f34d94 (commit)
      from  68dc02d1dcbfb37ee22327d6a3c43f528d593035 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bd80111ed9cb93b2d56720dcd1d1f259616c27ae

commit bd80111ed9cb93b2d56720dcd1d1f259616c27ae
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Thu Aug 10 12:16:29 2017 +0200

    Fix stdlib/tst-strfmon_l.c test case to agree with the changes in Indian monetary formatting

    The test cases should expose non-standard grouping and the trailing
    space after the currency sign. After the changes to the Indian
    monetary formatting, the Indian formatting still shows the
    non-standard grouping. To test the trailing space after the currency
    sign I chose the hr_HR locale.

    See:

    commit 82b3124268bec0609b337dd993e771c93e44cbf2
    Author: Akhilesh Kumar <akhilesh.k@samsung.com>

        Remove redundant data for LC_MONETARY for Indian locales

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4169825556bcc23ced731e711be91819465d4a83

commit 4169825556bcc23ced731e711be91819465d4a83
Author: Akhilesh Kumar <akhilesh.k@samsung.com>
Date:   Wed Aug 9 18:27:14 2017 +0530

    Remove redundant data for LC_MONETARY for Indian locales

    Reference is taken from
    https://en.wikipedia.org/wiki/Indian_numbering_system
    https://en.wikipedia.org/wiki/Indian_rupee
    CLDR has the currency format pattern “¤#,##,##0.00”.

        [BZ #21836]
        * locales/ar_IN (LC_MONETARY) : copy "hi_IN"
        * locales/as_IN (LC_MONETARY) : copy "hi_IN"
        * locales/bhb_IN (LC_MONETARY): copy "hi_IN"
        * locales/bn_IN (LC_MONETARY) : copy "hi_IN"
        * locales/en_IN (LC_MONETARY) : copy "hi_IN"
        * locales/gu_IN (LC_MONETARY) : copy "hi_IN"
        * locales/hi_IN (LC_MONETARY) : Fix mon_grouping, p_sep_by_space
        and n_sep_by_space
        * locales/kn_IN (LC_MONETARY) : copy "hi_IN"
        * locales/kok_IN(LC_MONETARY) : copy "hi_IN"
        * locales/ks_IN (LC_MONETARY) : copy "hi_IN"
        * locales/ml_IN (LC_MONETARY) : copy "hi_IN"
        * locales/mr_IN (LC_MONETARY) : copy "hi_IN"
        * locales/or_IN (LC_MONETARY) : copy "hi_IN"
        * locales/pa_IN (LC_MONETARY) : copy "hi_IN"
        * locales/sa_IN (LC_MONETARY) : copy "hi_IN"
        * locales/sd_IN (LC_MONETARY) : copy "hi_IN"
        * locales/ta_IN (LC_MONETARY) : copy "hi_IN"
        * locales/tcy_IN(LC_MONETARY) : copy "hi_IN"
        * locales/te_IN (LC_MONETARY) : copy "hi_IN"
        * locales/ur_IN (LC_MONETARY) : copy "hi_IN"

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=38dbcacb606f70ad0a35fbcacb6f3cbff5f34d94

commit 38dbcacb606f70ad0a35fbcacb6f3cbff5f34d94
Author: Wei-Lun Chao <bluebat@member.fsf.org>
Date:   Wed Aug 9 12:19:44 2017 +0200

    cmn_TW: add hanzi collation

        [BZ #17563]
        [BZ #16905]
        * locales/cmn_TW (LC_COLLATE): Use cns11643_stroke file for sorting.
        * locales/cmn_TW (LC_TIME): Improve time and date formats.
        * locales/cmn_TW (LC_MESSAGES): Add yesstr and nostr.
        * locales/cns11643_stroke: New file, stroke count collation for
        traditional Chinese.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                          |    7 +
 localedata/ChangeLog               |   41 +
 localedata/locales/ar_IN           |   22 +-
 localedata/locales/as_IN           |   22 +-
 localedata/locales/bhb_IN          |    2 +-
 localedata/locales/bn_IN           |   22 +-
 localedata/locales/cmn_TW          |   44 +-
 localedata/locales/cns11643_stroke |70754 ++++++++++++++++++++++++++++++++++++
 localedata/locales/en_IN           |   22 +-
 localedata/locales/gu_IN           |   21 +-
 localedata/locales/hi_IN           |   16 +-
 localedata/locales/kn_IN           |   21 +-
 localedata/locales/kok_IN          |   22 +-
 localedata/locales/ks_IN           |   23 +-
 localedata/locales/ml_IN           |   25 +-
 localedata/locales/mr_IN           |   22 +-
 localedata/locales/or_IN           |   22 +-
 localedata/locales/pa_IN           |   18 +-
 localedata/locales/sa_IN           |   21 +-
 localedata/locales/sd_IN           |   22 +-
 localedata/locales/ta_IN           |   22 +-
 localedata/locales/tcy_IN          |    2 +-
 localedata/locales/te_IN           |   22 +-
 localedata/locales/ur_IN           |    2 +-
 stdlib/Makefile                    |    2 +-
 stdlib/tst-strfmon_l.c             |   20 +-
 26 files changed, 70868 insertions(+), 371 deletions(-)
 create mode 100644 localedata/locales/cns11643_stroke
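As a side note on the "non-standard grouping" mentioned in the commit messages: the CLDR pattern “¤#,##,##0.00” puts the last three integer digits in one group and every group to the left of it in twos. The sketch below only illustrates that rule for an unsigned integer string; it does not call strfmon or read any locale file, and the function name indian_group is hypothetical. I assume this is the grouping that the mon_grouping fix in hi_IN is meant to produce.

```shell
#!/bin/sh
# Group an unsigned integer string in the Indian style: the last three
# digits form one group, and every group to the left has two digits.
indian_group() {
  digits=$1
  if [ "${#digits}" -le 3 ]; then
    printf '%s\n' "$digits"
    return
  fi
  head=${digits%???}        # everything except the last three digits
  out=${digits#"$head"}     # the last three digits
  while [ "${#head}" -gt 2 ]; do
    rest=${head%??}         # strip two more digits from the right
    out="${head#"$rest"},$out"
    head=$rest
  done
  printf '%s\n' "$head,$out"
}

indian_group 1234567   # prints 12,34,567
indian_group 123456    # prints 1,23,456
```

The currency sign, the decimal part and the space handling (p_sep_by_space, n_sep_by_space) are left out on purpose; only the grouping is shown.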
FIXED.
(In reply to Wei-Lun Chao from comment #7)
> (In reply to Mike FABIAN from comment #6)
> > Should the new collation also be used for zh_TW, or only
> > for cmn_TW.
> > By the way, what is the difference between zh_TW
> > and cmn_TW, isn’t both Mandarin?
>
> As reasons for bug 15963, those 14 languages have been behind the
> macro-language "zh" for a long time. Technically zh_TW and cmn_TW are the
> same, but for fairness, IMHO, the locale zh_TW should be deprecated and
> replaced with cmn_TW and other chinese locales.
>
> Personally I would like to differentiate cmn from zh with this radical
> patch, which may be followed by similar patches against nan_TW, hak_TW,
> lzh_TW and yue_HK.

What about the translations? On Fedora 26, most translations are currently in

/usr/share/locale/zh_TW/

and very few are in /usr/share/locale/cmn/

I also wonder why only "cmn" exists and not "cmn_TW" and "cmn_CN". One would
probably need to distinguish between traditional and simplified here as
well. As there is no cmn_CN locale, this does not matter at the moment, but
it might matter in the future ...

Users of zh_TW and cmn_TW would probably want the same translations, so maybe
one of these folders should be a symlink to the other?
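To make the symlink idea concrete, here is a minimal sketch in a scratch directory. Nothing under /usr/share/locale is touched, the catalog name demo.mo and its content are placeholders, and whether a distribution should actually ship such a link is exactly the open question raised above.

```shell
#!/bin/sh
# Sketch in a scratch directory; the real /usr/share/locale is not modified.
# The catalog name demo.mo and its content are placeholders.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/zh_TW/LC_MESSAGES"
printf 'placeholder catalog\n' > "$demo_root/zh_TW/LC_MESSAGES/demo.mo"

# Relative link: cmn_TW resolves to the sibling directory zh_TW.
ln -s zh_TW "$demo_root/cmn_TW"

# The same file is now reachable under both locale names.
cat "$demo_root/cmn_TW/LC_MESSAGES/demo.mo"
```

The scratch tree can be removed afterwards with rm -rf "$demo_root".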
(In reply to Mike FABIAN from comment #12)
> What about the translations? On Fedora 26, most translations at the moment
> are in
>
> /usr/share/locale/zh_TW/
>
> and very few are in /usr/share/locale/cmn/
>
> I also wonder why only the "cmn" exists and not "cmn_TW" and "cmn_CN",
> probably one would need to make a distinction between traditional and
> simplified
> here as well. As there is no cmn_CN locale, this does not matter at the
> moment but it might matter in future ...
>
> Users of zh_TW and cmn_TW would probably want the same translations, so maybe
> one of these folders should be a symlink to the other?

Thanks for your concern :)