Kurdish (Iraq) needs its own locale definition, since Kurdish has been an official language in Iraq since 200x. Due to: https://bugs.launchpad.net/ubuntu/+source/langpack-locales/+bug/266975 The locale is defined at: http://www.zkurd.org/aras/ckb_IQ.txt
Created attachment 3704 [details] localedata for Kurdish Sorani (CKB) the localedata for Kurdish Sorani (CKB)
Comment on attachment 3704 [details] localedata for Kurdish Sorani (CKB) comment_char % escape_char / % Kurdish (Sorani) language locale for Iraq and Iran. % Contributed by Aras Noori <aras.noori@gmal.com> and % Erdal Ronahi<erdal.ronahi@gmail.com>. % Contact: Aras % Language: ku % Date: 2009-01-29 % Distribution and use is free, also % for commercial purposes. % History: % LC_IDENTIFICATION title "Kurdish language locale for Sorani dialects" source "" address "" contact "Aras" email "aras.noori@gmail.com, Diyako@zkurd.org" tel "" fax "" language "Kurdish" territory "Iraq" revision "1.0" date "2009-01-29" % category "ckb_IQ:2000";LC_IDENTIFICATION category "ckb_IQ:2000";LC_CTYPE category "ckb_IQ:2000";LC_COLLATE category "ckb_IQ:2000";LC_TIME category "ckb_IQ:2000";LC_NUMERIC category "ckb_IQ:2000";LC_MONETARY category "ckb_IQ:2000";LC_MESSAGES category "ckb_IQ:2000";LC_PAPER category "ckb_IQ:2000";LC_NAME category "ckb_IQ:2000";LC_ADDRESS category "ckb_IQ:2000";LC_TELEPHONE category "ku_IQ:2000";LC_MEASUREMENT END LC_IDENTIFICATION LC_CTYPE copy "i18n" END LC_CTYPE LC_COLLATE % Copy the template from ISO/IEC 14651 copy "iso14651_t1" END LC_COLLATE LC_MONETARY % This is the POSIX Locale definition the LC_MONETARY category. % These are generated based on XML base Locale difintion file % for IBM Class for Unicode/Java % int_curr_symbol "<U0049><U0051><U0044><U0020>" currency_symbol "<U062F><U002E><U0639><U002E>" mon_decimal_point "<U002E>" mon_thousands_sep "<U002C>" mon_grouping 3 positive_sign "" negative_sign "<U002D>" int_frac_digits 3 frac_digits 3 p_cs_precedes 1 p_sep_by_space 1 n_cs_precedes 1 n_sep_by_space 1 p_sign_posn 1 n_sign_posn 2 % END LC_MONETARY LC_NUMERIC % This is the POSIX Locale definition for the LC_NUMERIC category. % decimal_point "<U002E>" thousands_sep "<U002C>" grouping 3 % END LC_NUMERIC LC_TIME % This is the POSIX Locale definition for the LC_TIME category. 
% These are generated based on XML base Locale difintion file % for IBM Class for Unicode/Java % % Abbreviated weekday names (%a) abday "<U062D>";"<U0646>";/ "<U062B>";"<U0631>";/ "<U062E>";"<U062C>";/ "<U0633>" % % Full weekday names (%A) day "<U06CC><U06D5><U0643><U0634><U06D5><U0645><U0645><U06D5>";/ "<U062F><U0648><U0648><U0634><U06D5><U0645><U0645><U06D5>";/ "<U0633><U06CE><U0634><U06D5><U0645><U0645><U06D5>";/ "<U0686><U0648><U0624><U0631><U0634><U06D5><U0645><U0645><U06D5>";/ "<U067E><U06CE><U0646><U062C><U0634><U06D5><U0645><U0645><U06D5>";/ "<U0647><U06D5><U06CC><U0646><U06CC>";/ "<U0634><U06D5><U0645><U0645><U06D5>";/ % % Abbreviated month names (%b) abmon "<U064A><U0646><U0627>";"<U0641><U0628><U0631>";/ "<U0645><U0627><U0631>";"<U0623><U0628><U0631>";/ "<U0645><U0627><U064A>";"<U064A><U0648><U0646>";/ "<U064A><U0648><U0644>";"<U0623><U063A><U0633>";/ "<U0633><U0628><U062A>";"<U0623><U0643><U062A>";/ "<U0646><U0648><U0641>";"<U062F><U064A><U0633>" % % Full month names (%B) mon "<U064A><U0646><U0627><U064A><U0631>";/ "<U0641><U0628><U0631><U0627><U064A><U0631>";/ "<U0645><U0627><U0631><U0633>";/ "<U0623><U0628><U0631><U064A><U0644>";/ "<U0645><U0627><U064A><U0648>";/ "<U064A><U0648><U0646><U064A><U0648>";/ "<U064A><U0648><U0644><U064A><U0648>";/ "<U0623><U063A><U0633><U0637><U0633>";/ "<U0633><U0628><U062A><U0645><U0628><U0631>";/ "<U0623><U0643><U062A><U0648><U0628><U0631>";/ "<U0646><U0648><U0641><U0645><U0628><U0631>";/ "<U062F><U064A><U0633><U0645><U0628><U0631>" % % Equivalent of AM PM am_pm "<U0635>";"<U0645>" % % Appropriate date and time representation % %d %b, %Y%Z %I:%M:%S d_t_fmt "<U0025><U0064><U0020><U0025><U0062><U002C><U0020><U0025>/ <U0059><U0020><U0025><U005A><U0020><U0025><U0049><U003A><U0025><U004D>/ <U003A><U0025><U0053><U0020><U0025><U0070>" % % Appropriate date representation % %d %b, %Y d_fmt "<U0025><U0064><U0020><U0025><U0062><U002C><U0020><U0025><U0059>" % % Appropriate time representation % %Z %I:%M:%S t_fmt 
"<U0025><U005A><U0020><U0025><U0049><U003A><U0025><U004D>/ <U003A><U0025><U0053><U0020>" % % Appropriate 12 h time representation (%r) t_fmt_ampm "<U0025><U005A><U0020><U0025><U0049><U003A><U0025><U004D>/ <U003A><U0025><U0053><U0020><U0025><U0070>" % % Appropriate date representation (date(1)) "%a %b %e %H:%M:%S %Z %Y" date_fmt "<U0025><U0061><U0020><U0025><U0062><U0020><U0025><U0065>/ <U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/ <U0025><U005A><U0020><U0025><U0059>" % FIXME: found in CLDR first_weekday 7 END LC_TIME LC_MESSAGES copy "ckb_IQ" END LC_MESSAGES LC_PAPER % This is the ISO_IEC TR14652 Locale definition for the copy "ckb_IQ" height 297 width 210 END LC_PAPER LC_NAME % This is the ISO_IEC TR14652 Locale definition for the % LC_NAME category. % name_fmt "<U0025><U0070><U0025><U0074><U0025><U0066><U0025><U0074>/ <U0025><U0067>" name_gen "<U002D><U0073><U0061><U006E>" name_mr "<U004D><U0072><U002E>" name_mrs "<U004D><U0072><U0073><U002E>" name_miss "<U004D><U0069><U0073><U0073><U002E>" name_ms "<U004D><U0073><U002E>" END LC_NAME LC_ADDRESS % This is the ISO_IEC TR14652 Locale definition for the % LC_ADDRESS postal_fmt "<U0025><U007A><U0025><U0063><U0025><U0054><U0025><U0073>/ <U0025><U0062><U0025><U0065><U0025><U0072>" country_ab2 "<U0049><U0051>" country_ab3 "<U0049><U0052><U0051>" country_post "<U0054><U0052>" country_num 368 country_car "<U0054><U0052>" % "kurd<U00EE>" lang_name "<U006B><U0075><U0072><U0064><U00EE>" lang_ab "<U006B><U0075>" lang_term "<U006B><U0075><U0072>" lang_lib "<U006B><U0075><U0072>" END LC_ADDRESS LC_TELEPHONE % This is the ISO_IEC TR14652 Locale definition for the % tel_int_fmt "<U002B><U0025><U0063><U0020><U003B><U0025><U0061><U0020>/ <U003B><U0025><U006C>" int_prefix "<U0039><U0036><U0034>" END LC_TELEPHONE LC_MEASUREMENT % This is the ISO_IEC TR14652 Locale definition for the % measurement 1 END LC_MEASUREMENT
The corresponding bug in the Launchpad bug tracker for Ubuntu is https://bugs.launchpad.net/ubuntu/+source/langpack-locales/+bug/266975
Aras is the reporter, not the assignee.
The file is ill-formed. The file is named ckb_IQ and you use it in "copy" in various categories? You also have copy and definitions in categories like LC_PAPER. You have to fix it up.
Comment on attachment 3704 [details] localedata for Kurdish Sorani (CKB) escape_char / comment_char % % Kurdish (Sorani) language locale for Iraq and Iran. % Contributed by Aras Noori <aras.noori@gmal.com> and % Erdal Ronahi<erdal.ronahi@gmail.com>. % Contact: Aras Noori % Language: ku % Date: 2009-04-14 % Distribution and use is free, also % for commercial purposes. % History: % January 2009: Defining CKB locale % March 2009: Adding rule for CKB % LC_IDENTIFICATION title "Kurdish language locale for Sorani dialects - Central Kurdish" source "" address "" contact "Aras" email "aras.noori@gmail.com" tel "" fax "" language "Kurdish" territory "Iraq" revision "1.1" date "2009-04-15" % category "ckb_IQ:2000";LC_IDENTIFICATION category "ckb_IQ:2000";LC_CTYPE category "ckb_IQ:2000";LC_COLLATE category "ckb_IQ:2000";LC_TIME category "ckb_IQ:2000";LC_NUMERIC category "ckb_IQ:2000";LC_MONETARY category "ckb_IQ:2000";LC_MESSAGES category "ckb_IQ:2000";LC_PAPER category "ckb_IQ:2000";LC_NAME category "ckb_IQ:2000";LC_ADDRESS category "ckb_IQ:2000";LC_TELEPHONE category "ckb_IQ:2000";LC_MEASUREMENT END LC_IDENTIFICATION LC_CTYPE copy "i18n" END LC_CTYPE LC_COLLATE % The Sorani Kurdish dialect is mainly written using a modified Arabic-based alphabet with 33 letters. % Unlike the regular Arabic alphabet, which is an abjad, Sorani is an alphabet in which vowels are mandatory, making the script easy to read. 
% % The CKB (Sorani) alphabet order is: % in Latin: a, b, c, ç, d, e, ê, f, g, h, i, î, j, k, l, ll, m, n, o, p, q, r, rr, s, sh, t, u, uu, v, w, x, y, z % ئ، ب، پ، ت، ج، چ، ح، خ، د، ر، ڕ، ز، ژ، س، ش، ف، ڤ، ق، ع، غ، ك، گ، ل، ڵ، م، ن، و، وو، ۆ، هـ، ی، ێ % vowels: A, E, I, O, U, UU % پیتەبزوێنەكان ئەمانەن: ئ، ا، ە، و، وو، ۆ، ی، ێ، % % Copy the template from ISO/IEC 14651 copy "iso14651_t1" collating-element <ئا> from <U0626><U0627> collating-element <وو> from <U0648><U0648> collating-element <لا> from <U0644><U0627> collating-symbol <U0628> collating-symbol <U062C> collating-symbol <U0631> collating-symbol <U0632> collating-symbol <U0641> collating-symbol <U0643> collating-symbol <U0644> collating-symbol <U0648> collating-symbol <U06CC> reorder-after <U0628> <U067E> reorder-after <U062C><U0686> reorder-after <U0631><U0695> reorder-after <U0632><U0698> reorder-after <U0641><U06A4> reorder-after <U0643><U06AF> reorder-after <U0644><U06B5> reorder-after <U0648><U06C6> reorder-after <U06CC><U06CE> % Kurdish digits same as Arabic ones: they are the basic forms. reorder-after <U0660> <U0660> <0>;<PCL>;<MIN>;IGNORE <U0661> <1>;<PCL>;<MIN>;IGNORE <U0662> <2>;<PCL>;<MIN>;IGNORE <U0663> <3>;<PCL>;<MIN>;IGNORE <U0664> <4>;<PCL>;<MIN>;IGNORE <U0665> <5>;<PCL>;<MIN>;IGNORE <U0666> <6>;<PCL>;<MIN>;IGNORE <U0667> <7>;<PCL>;<MIN>;IGNORE <U0668> <8>;<PCL>;<MIN>;IGNORE <U0669> <9>;<PCL>;<MIN>;IGNORE reorder-end END LC_COLLATE LC_MONETARY % This is the POSIX Locale definition the LC_MONETARY category. 
% These are generated based on XML base Locale difintion file % for IBM Class for Unicode/Java % int_curr_symbol "<U0049><U0051><U0044><U0020>" currency_symbol "<U062F><U002E><U0639><U002E>" mon_decimal_point "<U002E>" mon_thousands_sep "<U002C>" mon_grouping 3 positive_sign "" negative_sign "<U002D>" int_frac_digits 3 frac_digits 3 p_cs_precedes 1 p_sep_by_space 1 n_cs_precedes 1 n_sep_by_space 1 p_sign_posn 1 n_sign_posn 2 % END LC_MONETARY LC_NUMERIC % This is the POSIX Locale definition for the LC_NUMERIC category. % decimal_point "<U002E>" thousands_sep "<U002C>" grouping 3 % END LC_NUMERIC LC_TIME % This is the POSIX Locale definition for the LC_TIME category. % These are generated based on XML base Locale difintion file % % Abbreviated weekday names (%a) abday "<U06CC><U06D5><U0626><U0634>";"<U062F><U0648><U0648><U0634>";/ "<U0633><U0626><U0634>";"<U0686><U0648><U0631><U0634>";/ "<U067E><U0626><U0634>";"<U0647><U0647>";/ "<U0634><U06D5><U0645>" % % Full weekday names (%A) day "<U06CC><U06D5><U0643><U0634><U06D5><U0645><U0645><U06D5>";/ "<U062F><U0648><U0648><U0634><U06D5><U0645><U0645><U06D5>";/ "<U0633><U06CE><U0634><U06D5><U0645><U0645><U06D5>";/ "<U0686><U0648><U0624><U0631><U0634><U06D5><U0645><U0645><U06D5>";/ "<U067E><U06CE><U0646><U062C><U0634><U06D5><U0645><U0645><U06D5>";/ "<U0647><U06D5><U06CC><U0646><U06CC>";/ "<U0634><U06D5><U0645><U0645><U06D5>";/ % % Abbreviated month names (%b) abmon "<U064A><U0646><U0627>";"<U0641><U0628><U0631>";/ "<U0645><U0627><U0631>";"<U0623><U0628><U0631>";/ "<U0645><U0627><U064A>";"<U064A><U0648><U0646>";/ "<U064A><U0648><U0644>";"<U0623><U063A><U0633>";/ "<U0633><U0628><U062A>";"<U0623><U0643><U062A>";/ "<U0646><U0648><U0641>";"<U062F><U064A><U0633>" % % Full month names (%B) mon "<U064A><U0646><U0627><U064A><U0631>";/ "<U0641><U0628><U0631><U0627><U064A><U0631>";/ "<U0645><U0627><U0631><U0633>";/ "<U0623><U0628><U0631><U064A><U0644>";/ "<U0645><U0627><U064A><U0648>";/ "<U064A><U0648><U0646><U064A><U0648>";/ 
"<U064A><U0648><U0644><U064A><U0648>";/ "<U0623><U063A><U0633><U0637><U0633>";/ "<U0633><U0628><U062A><U0645><U0628><U0631>";/ "<U0623><U0643><U062A><U0648><U0628><U0631>";/ "<U0646><U0648><U0641><U0645><U0628><U0631>";/ "<U062F><U064A><U0633><U0645><U0628><U0631>" % % Equivalent of AM PM am_pm "<U0635>";"<U0645>" % % Appropriate date and time representation % %d %b, %Y%Z %I:%M:%S d_t_fmt "<U0025><U0064><U0020><U0025><U0062><U002C><U0020><U0025>/ <U0059><U0020><U0025><U005A><U0020><U0025><U0049><U003A><U0025><U004D>/ <U003A><U0025><U0053><U0020><U0025><U0070>" % % Appropriate date representation % %d %b, %Y d_fmt "<U0025><U0064><U0020><U0025><U0062><U002C><U0020><U0025><U0059>" % % Appropriate time representation % %Z %I:%M:%S t_fmt "<U0025><U005A><U0020><U0025><U0049><U003A><U0025><U004D>/ <U003A><U0025><U0053><U0020>" % % Appropriate 12 h time representation (%r) t_fmt_ampm "<U0025><U005A><U0020><U0025><U0049><U003A><U0025><U004D>/ <U003A><U0025><U0053><U0020><U0025><U0070>" % % Appropriate date representation (date(1)) "%a %b %e %H:%M:%S %Z %Y" date_fmt "<U0025><U0061><U0020><U0025><U0062><U0020><U0025><U0065>/ <U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/ <U0025><U005A><U0020><U0025><U0059>" % FIXME: found in CLDR first_weekday 7 END LC_TIME LC_MESSAGES copy "ckb_IQ" END LC_MESSAGES LC_PAPER % This is the ISO_IEC TR14652 Locale definition for the copy "ckb_IQ" height 297 width 210 END LC_PAPER LC_NAME % This is the CKB Locale definition for the % LC_NAME category. 
% name_fmt "<U0025><U0069><U0025><U0074><U0025><U0066><U0025><U0074>/ <U0025><U0067>" name_gen "<U002D><U0073><U0061><U006E>" name_mr "<U0643><U0627><U0643>" name_mrs "<U062E><U0627><U062A><U0648>" name_miss "<U062E><U0627><U062A><U0648>" name_ms "<U062E><U0627><U062A><U0648>" END LC_NAME LC_ADDRESS % This is the IQ CKB Locale definition for the % LC_ADDRESS postal_fmt "<U0025><U007A><U0025><U0063><U0025><U0049><U0025><U0073>/ <U0025><U0062><U0025><U0065><U0025><U0072>" country_ab2 "<U0049><U0052><U0051>" country_ab3 "<U0049><U0052><U0051>" country_post "<U0049><U0052><U0051>" country_num 364 country_car "<U0049><U0052><U0051>" % "kurd<U00EE>" lang_name "<U0643><U0648><U0631><U062F><U06CC>" lang_ab "<U0643><U0648>" lang_term "<U0066><U0061><U0073>" lang_lib "<U006B><U0075><U0072>" END LC_ADDRESS LC_TELEPHONE % This is the IQ CKB Locale definition for the % tel_int_fmt "<U002B><U0025><U0063><U0020><U003B><U0025><U0061><U0020>/ <U003B><U0025><U006C>" int_prefix "<U0039><U0036><U0034>" END LC_TELEPHONE LC_MEASUREMENT % This is the ISO_IEC TR14652 Locale definition for the % measurement 1 END LC_MEASUREMENT
I updated the file. Please check whether it still has a bad format.
Subject: Re: Please add Kurdish locale for Kurdish Sorani (CKB)

Hi,

I updated the locale a couple of weeks ago. Would you please check it again?
http://www.sourceware.org/ml/libc-locales/2009-q2/msg00021.html

It would be my pleasure to hear feedback from you.

Regards
Aras

On Sat, Feb 7, 2009 at 5:53 AM, drepper at redhat dot com <sourceware-bugzilla@sourceware.org> wrote:
> ------- Additional Comments From drepper at redhat dot com 2009-02-07 03:53 -------
> The file is ill-formed. The file is named ckb_IQ and you use it in "copy" in
> various categories? You also have copy and definitions in categories like
> LC_PAPER. You have to fix it up.
>
> Status: NEW -> WAITING
>
> http://sourceware.org/bugzilla/show_bug.cgi?id=9809
Can you please attach the current version instead of adding it as a comment? The latter destroys all the non-ASCII characters.
Created attachment 4013 [details] new Locale info for CKB

Here is the new locale info for CKB.
Created attachment 4168 [details] new fixed Locale info for CKB fixed bugs in Message category
Created attachment 4228 [details] Locale info for CKB
Info is provided, changing back to NEW
I get a lot of errors when I try to build this locale:

locales/ckb_IQ:62: LC_COLLATE: syntax error
locales/ckb_IQ:63: LC_COLLATE: syntax error
locales/ckb_IQ:64: LC_COLLATE: syntax error
locales/ckb_IQ:66: LC_COLLATE: syntax error
locales/ckb_IQ:67: LC_COLLATE: syntax error
locales/ckb_IQ:68: LC_COLLATE: syntax error
locales/ckb_IQ:69: LC_COLLATE: syntax error
locales/ckb_IQ:70: LC_COLLATE: syntax error
locales/ckb_IQ:71: LC_COLLATE: syntax error
locales/ckb_IQ:72: LC_COLLATE: syntax error
locales/ckb_IQ:73: LC_COLLATE: syntax error
locales/ckb_IQ:74: LC_COLLATE: syntax error
locales/ckb_IQ:76: trailing garbage at end of line
locales/ckb_IQ:77: trailing garbage at end of line
locales/ckb_IQ:78: trailing garbage at end of line
locales/ckb_IQ:79: trailing garbage at end of line
locales/ckb_IQ:80: trailing garbage at end of line
locales/ckb_IQ:81: trailing garbage at end of line
locales/ckb_IQ:82: trailing garbage at end of line
locales/ckb_IQ:83: trailing garbage at end of line
locales/ckb_IQ:84: LC_COLLATE: cannot reorder after U000006CC: symbol not known
locales/ckb_IQ:155: extra trailing semicolon
LC_NAME: invalid escape sequence in field `name_fmt'
LC_ADDRESS: invalid escape `%I' sequence in field `postal_fmt'
LC_ADDRESS: `lang_ab' value does not match `lang_term' value
LC_ADDRESS: `lang_lib' value does not match `lang_term' value
LC_ADDRESS: `country_ab2' value does not match `country_num' value
LC_ADDRESS: `country_ab3' value does not match `country_num' value
I am analyzing the errors now and will try to fix them as soon as I can.

Thanks & Regards
Aras
Created attachment 4357 [details] CKB locale - updated

Hi,
I fixed some of the errors that occurred. How can I test it myself before posting it to Bugzilla?

Regards
Any progress?

(In reply to comment #16)
> Created an attachment (id=4357)
> CKB locale - updated
>
> Hi,
> I fixed some errors due to the occured Errors. How can I test it by myself
> before release to bugzilla?
>
> Regards
Use localedef to compile your locale, and $LOCPATH if you don't want to install it system-wide in order to test it.
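A minimal sketch of that workflow, written in Python so it can be scripted. The source file name `./ckb_IQ`, the output directory `./locales-test`, and all helper names are assumptions made for illustration; only the `localedef` tool and the `LOCPATH` variable come from this thread. The first two functions just build the command line and the environment; `compile_and_try` needs glibc's `localedef` and `date` to be installed.

```python
import os
import subprocess

def localedef_command(source, out_dir, name="ckb_IQ.UTF-8"):
    """Build the argv for compiling a locale source into a private directory."""
    # localedef -i INPUT -f CHARMAP OUTPUTPATH; an output path that contains
    # a slash is treated as a directory, not as a system-wide locale name.
    return ["localedef", "-i", source, "-f", "UTF-8", os.path.join(out_dir, name)]

def locale_test_env(out_dir, name="ckb_IQ.UTF-8"):
    """Environment that points glibc at the private locale directory."""
    env = dict(os.environ)
    env["LOCPATH"] = os.path.abspath(out_dir)  # where compiled locales live
    env["LC_ALL"] = name                       # select the freshly built locale
    return env

def compile_and_try(source="./ckb_IQ", out_dir="./locales-test"):
    """Compile the locale, then print the current date using it."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(localedef_command(source, out_dir), check=True)
    subprocess.run(["date"], env=locale_test_env(out_dir), check=True)
```

Nothing is installed system-wide this way; deleting the output directory removes the test locale again.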
The file isn't usable as-is; there are many problems when compiling it. First, it must be in UTF-8. Second, the collation rules all seem pretty bogus, since rules are already defined for all the characters. If needed, you have to redefine the relocation. Third, there are many syntax errors. Fourth, all the values for the fields must use the <U....> notation, not real strings. Fifth, the values for some fields are plain wrong; localedef will tell you.

I did add the language code to localedef now. Just run localedef like

localedef -i ./YOURFILE -f UTF-8 ./SOMEDIR
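For the fourth point, a small helper can produce the <U....> notation from ordinary text. This is only a sketch; the function name `to_ucs_notation` is made up here, and the eight-digit form for code points beyond U+FFFF is an assumption about what localedef accepts.

```python
def to_ucs_notation(text):
    """Encode each character of text as a <UXXXX> symbolic name."""
    parts = []
    for ch in text:
        code = ord(ch)
        # Four hex digits inside the BMP, eight for anything beyond it.
        parts.append("<U%04X>" % code if code <= 0xFFFF else "<U%08X>" % code)
    return "".join(parts)

# The international currency symbol from the locale file in this report.
print(to_ucs_notation("IQD "))  # <U0049><U0051><U0044><U0020>
```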
Created attachment 5727 [details] CKB-IQ locale info (Kurdish Sorani) CKB-IQ locale info (Kurdish Sorani)
Hi,

I updated the file. It was full of syntax errors; I repaired most of them and hope it works now. I observed that many locale files use a slash (/) while others use a backslash (\), and the compiler distinguishes between them! I learned that it should not be so. The file is saved as UTF-8, with Unix EOL format.

Best regards
Aras
You haven't fixed the collation information. You have to use the generic collation data (include it, do not copy it) and then, if necessary at all, define modifications using reorder_after etc.
(In reply to comment #22)
> You haven't fixed the collation information. You have to use the generic
> collation data (include it, do not copy it) and then, if necessary at all,
> define modifications using reorder_after etc.

Did you mean lines like this one?

% collating-element <LAM WITH SMALL V-ALEF> from <U06B5><U0627>

They are already commented out.

Regards
Aras
(Translated from the German original.)

Hello Aras,

good to see that you are working on the file again. Do you know what "collation" is and what it is good for? It is about the order used for alphabetical sorting.

Best regards
Erdal

On 16 May 2011 17:24, aras.noori at gmail dot com <sourceware-bugzilla@sourceware.org> wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=9809
>
> --- Comment #23 from Aras Noori <aras.noori at gmail dot com> 2011-05-16 15:23:50 UTC ---
> (In reply to comment #22)
> > You haven't fixed the collation information. You have to use the generic
> > collation data (include it, do not copy it) and then, if necessary at all,
> > define modifications using reorder_after etc.
>
> did you mean
> % collating-element <LAM WITH SMALL V-ALEF> from <U06B5><U0627>
>
> they are already commented.
>
> regards
> Aras
Sorry for posting in German here, wasn't aware that replies go directly to the bugmail.
(In reply to comment #23)
> (In reply to comment #22)
> > You haven't fixed the collation information. You have to use the generic
> > collation data (include it, do not copy it) and then, if necessary at all,
> > define modifications using reorder_after etc.
>
> did you mean
> % collating-element <LAM WITH SMALL V-ALEF> from <U06B5><U0627>
>
> they are already commented.

No. Look at the other files. There are some broken ones, but those I caught in time use

copy "iso14651_t1"

and then, if necessary, reorder_after.
Yes, Erdal. I also defined 3 collations; they had syntax errors, as Mr. Ulrich Drepper says. I will upload the new version later. Thanks for your efforts.
WAITING for almost a year now. Please reopen the bug when you have a new patch.
Sorry for the clicko; this was not fixed.
Can you Post problems?
Created attachment 7887 [details] CKB-IQ locale info (Kurdish Sorani)
Hi all,

I attached the new version, bug free. I also renewed the bug on Launchpad at: https://bugs.launchpad.net/ubuntu/+source/langpack-locales/+bug/1388808

Thank you for all your efforts in the past.

Regards
Aras
Please see comment #33 by Aras Noori.
first, please update the header of the file to match all the current locales. the first ~10 lines should be the same (e.g. as en_US). you should sync to latest git as the last release is out of date already.

> title "Kurdish language locale based on Arabic letters"

this should be:
Central Kurdish language locale for Iraq

> tel "+49 17629857380"

leave this field blank

> language "Kurdish"

change to "Central Kurdish"

> territory "Iraq, Iran"

drop Iran. this locale is only for Iraq.

> category "ckb_IQ:2000";LC_IDENTIFICATION

you'll need to fix all these category fields. copy them from en_US for their correct values.

> LC_COLLATE

please rebase this to start with:
copy "iso14651_t1"

> % This is the POSIX Locale definition for the LC_NUMERIC category.

delete these old comments from the LC_NUMERIC, LC_TIME, LC_NAME, and LC_ADDRESS categories

> LC_TIME

are you sure about the day/abday/mon/abmon translations ? CLDR says they're different.

make sure day/abday start on Sunday

> am_pm

does Iraq really use am/pm notation ? if not, leave these fields blank.

> first_workday 7

change this to 2 and add this line:
week 7;19971130;1

> yesexpr "<U0628><U06D5><U06B5><U06CE>"
> noexpr "<U0646><U06D5><U062E><U06CE><U0631>"

these need to be updated. these should be regular expressions to match a yes/no answer. see the current en_US value as an example.

please also provide yesstr/nostr translations

> LC_PAPER
> LC_MEASUREMENT

change both of these categories to simply:
copy "ar_IQ"

> name_gen "<U002D><U0073><U0061><U006E>"

this is "-san". is that correct ?

> country_car "<U0049><U0051>"

shouldn't this be "IRQ" instead of "IQ" ?

> LC_ADDRESS

please define country_name (localized translation for Iraq)

> tel_int_fmt "+%c ;%a ;%l"

pretty sure this should be:
+%c %a%t%l

> tel_dom_fmt "<U202A><U0025><U0041><U2012><U0025><U006C><U202C>"

are you sure this is correct ?
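To see how the `week 7;19971130;1` line and the day numbers relate, here is a small sketch. It assumes the usual glibc reading of the `week` keyword, in which the second value is a reference date that counts as day 1 and `first_weekday` / `first_workday` are numbered from it. The helper names are invented for this illustration.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def reference_weekday(yyyymmdd):
    """Weekday name of the reference date given in a `week` line."""
    d = datetime.strptime(yyyymmdd, "%Y%m%d").date()
    return DAY_NAMES[d.weekday()]  # date.weekday(): Monday == 0

def numbered_day(yyyymmdd, n):
    """Weekday that gets number n when the reference date is day 1."""
    d = datetime.strptime(yyyymmdd, "%Y%m%d").date()
    return DAY_NAMES[(d.weekday() + n - 1) % 7]

print(reference_weekday("19971130"))  # Sunday
print(numbered_day("19971130", 2))    # Monday
print(numbered_day("19971130", 7))    # Saturday
```

Under that assumption, day 1 is Sunday, so the requested `first_workday 2` denotes Monday and the earlier value 7 denoted Saturday.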
Created attachment 12173 [details] Localedata file for ckb_IQ

Here is a new version of the localedata file for ckb. Thanks.
(In reply to Mike Frysinger from comment #35)

Thank you for your tips. @Jwtiayr and I fixed the bugs.
(In reply to Jwtiayr Nariman from comment #36)
> Created attachment 12173 [details]
> Localedata file for ckb_IQ
>
> Here is new version of localedata file for ckb.thanks

Great work.
> LC_COLLATE
> % The Kurdish Sorani, Bahdini, and others dialects is mainly written using a modified (Arabic-based alphabet) with 33 letters.
> % Unlike the regular Arabic alphabet, which is an abjad, kurdish is an alphabet in which vowels are mandatory, making the script easy to read.
> %
> % The kurdish alphabet order is:
> % in Latin: a, b, c, ç, d, e, ê, f, g, h, i, î, j, k, l, ll, m, n, o, p, q, r, rr, s, sh, t, u, uu, v, w, x, y, z
> % vowels: A, E, I, O, U, UU
> %
> % Copy the template from ISO/IEC 14651
>
> order_start forward; forward
> %
> % Kurdish numeric characters.
> %
> <U0660> <U0660>

You still did not base the collation on iso14651_t1.

Your LC_COLLATE section should start like this:

LC_COLLATE
copy "iso14651_t1"

and then you should only reorder the characters which are not already ordered correctly, i.e. you should only make modifications to the default collation order coming from "iso14651_t1", *not* write everything from scratch.

I can try to help you with that and try to rewrite your LC_COLLATE.
(In reply to Mike FABIAN from comment #39)
> You still did not base the collation on iso14651_t1.
> [...]
> I can try to help you with that and try to rewrite your LC_COLLATE.

Thank you, Mike. It is a little complicated and I don't think I understand your point. But if you can do it, that would be really appreciated.
Created attachment 12190 [details] attachment-64689-0.html

What do we do now, dear Mike?

On Wed, Jan 8, 2020, 20:35 maiku.fabian at gmail dot com <sourceware-bugzilla@sourceware.org> wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=9809
>
> --- Comment #39 from Mike FABIAN <maiku.fabian at gmail dot com> ---
> You still did not base the collation on iso14651_t1.
> [...]
> I can try to help you with that and try to rewrite your LC_COLLATE.
You have <U0640> IGNORE in your sort order. U+0640 ARABIC TATWEEL Why IGNORE?
% % % Other control characters etc. upto order_end % Why do you sort control characters? These have nothing to do with the Kurdish Sorani language.
Created attachment 12192 [details] 0001-Add-ckb_IQ-locale.patch That is your original locale file as a patch
Created attachment 12193 [details] 0002-Fix-ckb_IQ-Add-ckb_IQ-to-SUPPORTED-file-Add-ckb_IQ.U.patch My suggested changes.
 LC_MONETARY
-int_curr_symbol "<U0049><U0051><U0044><U0020>"
+int_curr_symbol "IQD "
 currency_symbol "<U062F><U002E><U0639>"
-mon_decimal_point "<U002E>"
-mon_thousands_sep "<U002C>"
+mon_decimal_point "."
+mon_thousands_sep ","
 mon_grouping 3
 positive_sign ""
-negative_sign "<U002D>"
+negative_sign "-"
 int_frac_digits 3
 frac_digits 3
 p_cs_precedes 1

For everything which is ASCII, it is allowed (and preferred) to write the ASCII directly and not the code points. I.e. it is better (because it is more readable) to write "-" instead of "<U002D>".

I hope that in the future this will also be allowed for non-ASCII characters; at the moment it is only allowed for ASCII.
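The conversion described here can be sketched as a small filter that rewrites only the ASCII code points in a value and leaves everything else in <U....> form. The function name and the set of excluded characters are assumptions of this sketch: quotes, angle brackets, and the two possible escape characters are kept encoded to stay on the safe side, which is a conservative choice and not a documented rule.

```python
import re

_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")
# Characters left encoded on purpose (conservative assumption of this sketch).
_KEEP_ENCODED = set('"<>/\\')

def simplify_ascii(value):
    """Replace <UXXXX> names of printable ASCII with the characters themselves."""
    def repl(match):
        code = int(match.group(1), 16)
        ch = chr(code)
        if 0x20 <= code < 0x7F and ch not in _KEEP_ENCODED:
            return ch
        return match.group(0)  # non-ASCII or risky: keep the symbolic name
    return _UCS.sub(repl, value)

print(simplify_ascii("<U0049><U0051><U0044><U0020>"))  # IQD followed by a space
print(simplify_ascii("<U062F><U002E><U0639>"))         # <U062F>.<U0639>
```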
 LC_MESSAGES
-yesexpr "<U0628><U06D5><U06B5><U06CE>"
-noexpr "<U0646><U06D5><U062E><U06CE><U0631>"
+yesexpr "^[+1yY<U0628>]"
+noexpr "^[-0nN<U0646>]"
 yesstr "<U0628><U06D5><U06B5><U06CE>"
 nostr "<U0646><U06D5><U062E><U06CE><U0631>"
 END LC_MESSAGES

"yesstr" and "nostr" are the words for "yes" and "no" in your language.

"yesexpr" should *not* be the same as "yesstr". "yesexpr" should be a regular expression matching single letters which could be typed as the response for "yes" when you get a prompt asking something like:

"Do you want ...? (y/n)"

and when you type "y" in English, this means yes.

In *all* glibc locales we include +1yY in the "yesexpr" as long as this does not conflict with the language of that locale. If "y" would suggest "no" in that language, we cannot add it to "yesexpr", but in all other cases we add it.

Similarly for "noexpr".
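A short sketch of how such expressions classify typed answers. It uses Python's `re` module instead of the POSIX regex functions a C program would use; for simple bracket expressions like these the behaviour is the same. The function name `classify` is invented for the example.

```python
import re

# The two expressions from the patch, with <UXXXX> written as Python escapes.
YESEXPR = "^[+1yY\u0628]"
NOEXPR = "^[-0nN\u0646]"

def classify(answer):
    """Return 'yes', 'no', or 'unknown' for a typed response."""
    if re.search(YESEXPR, answer):
        return "yes"
    if re.search(NOEXPR, answer):
        return "no"
    return "unknown"

print(classify("y"))                         # yes
print(classify("\u0628\u06D5\u06B5\u06CE"))  # yes (the yesstr word)
print(classify("\u0646\u06D5\u062E\u06CE\u0631"))  # no (the nostr word)
print(classify("maybe"))                     # unknown
```

Because only the first character is tested, the full words from yesstr and nostr are accepted as well as single letters.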
 LC_ADDRESS
 postal_fmt "%z%c%T%s%b%e%r"
-country_name "Iraq"
-country_ab2 "<U0049><U0051>"
-country_ab3 "<U0049><U0052><U0051>"
-country_post "<U0049><U0052><U0051>"
+country_name "<U0639><U06CE><U0631><U0627><U0642>"
+country_ab2 "IQ"
+country_ab3 "IRQ"
+country_post "IRQ"
 country_num 368
-country_car "<U0049><U0051>"
+country_car "IQ"
+lang_name "<U06A9><U0648><U0631><U062F><U06CC><U06CC> <U0646><U0627><U0648><U06D5><U0646><U
+lang_term "ckb"
+lang_lib "ckb"
 %
 END LC_ADDRESS

country_name should be the name of the country in your language (Sorani), *not* in English. The English name is already in:

territory "Iraq"

lang_name should be the name of your language in your language. The English name is already in:

language "Central Kurdish"
I rewrote the LC_COLLATE section to contain only the absolutely necessary stuff. Now it looks like this:

LC_COLLATE
% The Kurdish Sorani, Bahdini, and others dialects is mainly written using a modified (Arabic-based alphabet) with 33 letters.
% Unlike the regular Arabic alphabet, which is an abjad, kurdish is an alphabet in which vowels are mandatory, making the script easy to read.
%
% The kurdish alphabet order is:
% in Latin: a, b, c, ç, d, e, ê, f, g, h, i, î, j, k, l, ll, m, n, o, p, q, r, rr, s, sh, t, u, uu, v, w, x, y, z
% vowels: A, E, I, O, U, UU
%
% Copy the template from ISO/IEC 14651
copy "iso14651_t1"

reorder-after <S0631> % ر
<S0695> % ڕ

reorder-after <S0646> % ن
<S0648> % و
<S06C6> % ۆ

END LC_COLLATE

I.e. this sorts U+0695, U+0648, and U+06C6 differently from the default sort order. The default sort order comes from

copy "iso14651_t1"

You use this line to copy the default sort order and then add the changes needed for your language. According to what you wrote in your locale, the three characters U+0695, U+0648, and U+06C6 sort differently from the default sort order for Arabic characters; all the rest sort the same as in the default sort order.
If you do *not* use copy "iso14651_t1" this is bad because then almost all Unicode characters which you do not cover by your own sort order will sort incorrectly. You want a reasonable default and apply the changes for your language to that default. Of course your locale should sort Kurdish Sorani correctly, but it should not sort other characters (Cyrillic, Devanagari, ... whatever) completely silly.
Your locale also sorted many control characters and ASCII punctuation characters. I think there is no reason to deviate from the default for these characters, therefore I removed them. If you have a good reason why some of these need to be sorted differently for Kurdish, please tell me.
Your locale sorted the Kurdish numbers at the top, i.e. before the Western numbers. The default order (as you can see in the ckb_IQ.UTF-8.in sorting test file in my patch) sorts these in between the Western numbers. Like this: 0 ٠ 1 ١ 2 ٢ 3 ٣ 4 ٤ 5 ٥ 6 ٦ 7 ٧ 8 ٨ 9 ٩ That is reasonably good, isn’t it?
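The interleaved digit order can be reproduced with a small Python sketch. This is not how glibc computes it (glibc uses the collation weights from iso14651_t1); sorting by numeric value first and code point second is an assumption chosen only to mimic the result shown above.

```python
import unicodedata

western = "0123456789"
# U+0660..U+0669, the digits shown next to the Western ones above.
arabic_indic = "".join(chr(0x0660 + i) for i in range(10))

def digit_key(ch):
    # Primary: numeric value. Secondary: code point, so the Western digit comes first.
    return (unicodedata.digit(ch), ord(ch))

ordered = sorted(western + arabic_indic, key=digit_key)
print(" ".join(ordered))
```

Each Western digit is immediately followed by the digit with the same value from the other script, matching the order in the test file.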
Your locale also resorted all the ASCII letters to make upper case letters come first. I.e.

A
a

instead of

a
A

Lower case first is what comes from

copy "iso14651_t1"

When using CLDR for sorting, one can use the option [caseFirst upper], see for example:

https://github.com/unicode-org/cldr/blob/master/common/collation/da.xml

glibc has no easy option to do that at the moment.

It is *possible* to sort A-Za-z differently in your locale, *but* if you do that you will get a weird order for all Latin characters you forget. I.e. if you do not include äÄ in your sort order as well, they would still sort lower case first. It is a lot of work to do this correctly for *all* Latin characters without a convenient option like CLDR’s [caseFirst upper], so I would recommend not doing that if it is not absolutely required.
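The pitfall with a partial upper-case-first override can be shown with a toy model in Python. This is a deliberately simplified sketch (base letter, then accent, then case), not glibc's multi-level collation algorithm, and all function names are invented for the example.

```python
import unicodedata

def base_letter(ch):
    # Strip accents and case: 'ä' -> 'a', 'Ä' -> 'a'.
    return unicodedata.normalize("NFD", ch)[0].lower()

def has_accent(ch):
    return len(unicodedata.normalize("NFD", ch)) > 1

def lower_first(ch):
    # Default-like order: lower case before upper case for every letter.
    return (base_letter(ch), has_accent(ch), ch.isupper())

def upper_first_ascii_only(ch):
    # Incomplete override: only plain ASCII letters get upper case first.
    if ch.isascii():
        return (base_letter(ch), has_accent(ch), ch.islower())
    return lower_first(ch)

letters = ["ä", "A", "Ä", "a"]
default_order = sorted(letters, key=lower_first)
partial_order = sorted(letters, key=upper_first_ascii_only)
print(default_order)
print(partial_order)
```

With the partial override, A sorts before a, but ä still sorts before Ä, which is exactly the inconsistent result warned about above.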
(In reply to Mike FABIAN from comment #53)
> Your locale also resorted all the ASCII letters to make upper case letters
> come first.
> [...]
> I would recommend not doing that if it is not absolutely required.

Hello Fabian,

thanks for your suggestions and remarks. You are right about the sorting of (a A) as well as of the numbers; this should be modified.

The Kurdish alphabet order is:

ئ U+0626
ا U+0627
ب U+0628
پ U+067E
ت U+062A
ج U+062C
چ U+0686
ح U+062D
خ U+062E
د U+062F
ر U+0631
ڕ U+0695
ز U+0632
ژ U+0698
س U+0633
ش U+0634
ع U+0639
غ U+063A
ف U+0641
ڤ U+06A4
ق U+0642
ک U+06A9
گ U+06AF
ل U+0644
ڵ U+06B5
م U+0645
ن U+0646
و U+0648
ۆ U+06C6
ھ U+0647
ە U+06D5
ی U+06CC
ێ U+06CE
Thank you, Mike, your help is really appreciated. I have listed my answers to your questions and suggestions about our locale below:

1. For the positive and negative signs I agree with you; let them be + and -.
2. For the regular expressions, I didn't know how to write them in my language; I hope you can help me solve this. We have "ب" for Y in English and "ن" for N in English.
3. You are right; we now write Iraq in Kurdish (Sorani).
4. We have the Kurdish alphabet as Aras Noori wrote it before my reply. I looked at iso14651_t1, and all characters used in Kurdish exist there; the characters that you added are from the Arabic language, not Kurdish.

Can you send the .dat file with your latest changes?

Best regards
> thanks to your suggestions and notice. You are right with sorting (aA) as
> well with Numbers, this should be modified.

So sorting

a
A

and

0
٠
1
١
...

is OK? I hope so ...

> The kurdish alphabet order is:

To achieve that order, this is enough:

copy "iso14651_t1"

reorder-after <S0631> % ر
<S0695> % ڕ

reorder-after <S0646> % ن
<S0648> % و
<S06C6> % ۆ

I added the test file ckb_IQ.UTF-8.in in my patch. This file is sorted using the rules of my patched ckb_IQ locale; the sorted result has to be the same as the original file, otherwise the test fails.

As the test passes, the above collation rules work and achieve the order in the ckb_IQ.UTF-8.in test file.

I’ll paste this test file here again for your easy reference:

0 ٠ 1 ١ 2 ٢ 3 ٣ 4 ٤ 5 ٥ 6 ٦ 7 ٧ 8 ٨ 9 ٩
a A b B c C d D e E f F g G h H i I j J k K l L m M n N o O p P q Q r R s S t T u U v V w W x X y Y z Z
ئ ا ب پ ت ج چ ح خ د ر ڕ ز ژ س ش ع غ ف ڤ ق ک گ ل ڵ م ن و ۆ ه ە ی ێ

Other characters not in this test file are sorted according to the defaults from

copy "iso14651_t1"
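The idea behind such a test file (sort it and compare with the original) can be sketched in Python as follows. The key function kurdish_key is a hypothetical stand-in for the collation of the real locale; it simply ranks characters by the alphabet order given earlier in this thread and is not how glibc implements collation.

```python
# Code points of the Kurdish alphabet in the order listed earlier in this thread.
KURDISH_ORDER = [
    0x0626, 0x0627, 0x0628, 0x067E, 0x062A, 0x062C, 0x0686, 0x062D, 0x062E,
    0x062F, 0x0631, 0x0695, 0x0632, 0x0698, 0x0633, 0x0634, 0x0639, 0x063A,
    0x0641, 0x06A4, 0x0642, 0x06A9, 0x06AF, 0x0644, 0x06B5, 0x0645, 0x0646,
    0x0648, 0x06C6, 0x0647, 0x06D5, 0x06CC, 0x06CE,
]
RANK = {chr(cp): index for index, cp in enumerate(KURDISH_ORDER)}

def kurdish_key(line):
    # Hypothetical stand-in for the collation key of the real locale.
    return [RANK[ch] for ch in line]

def collation_test_passes(lines, key):
    """The test passes when sorting leaves the file unchanged."""
    return sorted(lines, key=key) == lines

test_file = [chr(cp) for cp in KURDISH_ORDER]
passes_with_kurdish_key = collation_test_passes(test_file, kurdish_key)
passes_with_code_points = collation_test_passes(test_file, lambda line: line)
print(passes_with_kurdish_key, passes_with_code_points)
```

On a system where the ckb_IQ.UTF-8 locale is actually installed, one could presumably pass locale.strxfrm as the key after calling locale.setlocale(locale.LC_COLLATE, "ckb_IQ.UTF-8"); that variant is not exercised here.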
(In reply to Jwtiyar Nariman from comment #56)
> Thank you mike you your help is really appreciated
> I have pointed all my answers according to your question and suggestion to
> our locale as follow:
>
> 1. For positive sign and negative i agree with you let it be + and - .

Your original locale had the positive sign empty. That was probably a mistake, so I’ll make it + now.

> 2. For regular expression i didn't know how to type it in my language hope
> to hekp me solve this.
> we have "ب" for Y in English and "ن" for N in English .

That is what I used:

yesexpr "^[+1yY<U0628>]"
noexpr "^[-0nN<U0646>]"

So these regular expressions accept +, 1, y, Y, and ب as a yes answer, and -, 0, n, N, and ن as a no answer.

> 3.You right we type Iraq in Kurdish(Sorani) now changed.
> 4.We have Kurdish alphabet as Aras Noori wrote before my reply and i look at
> iso14651_t1 now all characters which is used in Kurdish are exist, these
> characters that you did add them are from Arabic language not Kurdish.

I don’t understand. Most of these characters are used both in Arabic *and* Kurdish.
Created attachment 12194 [details] ckb_IQ > Can you send the .dat file with your last changes? Here is the latest file with the changes I made. I just added the + as the positive_sign.
Created attachment 12195 [details] 0001-Add-ckb_IQ-locale.patch Updated patch.
Created attachment 12196 [details] 0002-Fix-ckb_IQ-Add-ckb_IQ-to-SUPPORTED-file-Add-ckb_IQ.U.patch Updated patch.
(In reply to Mike FABIAN from comment #57)
> To achieve that order, this is enough:
>
> copy "iso14651_t1"
>
> reorder-after <S0631> % ر
> <S0695> % ڕ
>
> reorder-after <S0646> % ن
> <S0648> % و
> <S06C6> % ۆ
> [...]

Sorting is good now, but I do not understand the added reorder-after rules. For example, with the line "<S0695> % ڕ", how do you order it?
(In reply to Jwtiyar Nariman from comment #62)
> Sorting is good now, but adding these
> reorder-after <S0631> % ر
> > <S0695> % ڕ
> >
> > reorder-after <S0646> % ن
> > <S0648> % و
> > <S06C6> % ۆ
> iam not understanding because for example this " <S0695> % ڕ " how you
> order it?

copy "iso14651_t1"

contains

copy "iso14651_t1_common"

and some modifications which affect only Chinese and Japanese.

So we look into the iso14651_t1_common file to see what the default sort order is. We find for example:

...
<S0631> % ARABIC LETTER REH
<S0632> % ARABIC LETTER ZAIN
<S0691> % ARABIC LETTER RREH
<S0692> % ARABIC LETTER REH WITH SMALL V
<S0693> % ARABIC LETTER REH WITH RING
<S0694> % ARABIC LETTER REH WITH DOT BELOW
<S0695> % ARABIC LETTER REH WITH SMALL V BELOW
<S0696> % ARABIC LETTER REH WITH DOT BELOW AND DOT ABOVE
...

Looking at this, you see that ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW is sorted right after ڔ U+0694 ARABIC LETTER REH WITH DOT BELOW by default. That is not what you want for Kurdish. For Kurdish, you want ڕ U+0695 ARABIC LETTER REH WITH SMALL V BELOW to be sorted right after ر U+0631 ARABIC LETTER REH.

This is achieved by the rule:

reorder-after <S0631> % ر
<S0695> % ڕ

which removes U+0695 from its default position in the sort order and inserts it again after U+0631.

reorder-after <S0646> % ن
<S0648> % و
<S06C6> % ۆ

does a similar thing to change the sorting of U+0648 and U+06C6.

To find out which of these rules I need, I first created the ckb_IQ.UTF-8.in test file and wrote the Kurdish characters into that file in the order you wanted.

Then I ran a test sort using a ckb_IQ locale which had *only*

LC_COLLATE
copy "iso14651_t1"
END LC_COLLATE

and *nothing* else.

The test sort showed that only U+0695, U+0648, and U+06C6 were sorted incorrectly. All other characters from your list of Kurdish characters were already sorted correctly. So I only needed to add rules to fix the sort order for these 3 characters.

You can see the same by just reading iso14651_t1_common and finding out which of the Kurdish characters are already in the correct order in that file and which are not. You have to do nothing for the characters which are already in the correct order. For the characters which are in a wrong position in iso14651_t1_common, you add rules like

reorder-after <... collating-symbol after which to reorder ...>
<... the collating-symbol which should be reordered ...>

I found writing the test file and checking which characters are sorted wrongly by default easier than staring at iso14651_t1_common. And it is a good idea to have the test file anyway, to make sure that the Kurdish sort order always stays correct when something is changed in glibc. If we have the test file, we will notice when some change causes a problem.
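The effect of a reorder-after rule on the order of collating symbols can be modelled with a few lines of Python. This is only a positional toy model under the assumption that the order is a flat list; real locale collation also carries multi-level weights, which are ignored here. The function name reorder_after merely mirrors the keyword and is not a glibc API.

```python
def reorder_after(order, anchor, symbols):
    """Move `symbols` so that they follow `anchor`, keeping their given order."""
    remaining = [s for s in order if s not in symbols]
    position = remaining.index(anchor) + 1
    return remaining[:position] + list(symbols) + remaining[position:]

# Excerpt of the default order quoted above from iso14651_t1_common.
default_order = [
    "S0631", "S0632", "S0691", "S0692",
    "S0693", "S0694", "S0695", "S0696",
]

kurdish_order = reorder_after(default_order, "S0631", ["S0695"])
print(kurdish_order)
```

After the call, S0695 directly follows S0631 and no longer sits between S0694 and S0696, while every other symbol keeps its relative position.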
(In reply to Mike FABIAN from comment #63)
> [...]
> I found writing the test file and checking which characters are sorted
> wrongly by default easier than staring at iso14651_t1_common. And it
> is a good idea to have the test file anyway to make sure that the
> Kurdish sort order always stays correct when something is changed in
> glibc. If we have the test file, we will notice when some change causes a
> problem.

Thank you very much, dear Mike, I got it. You did a great job, thanks again. So now everything is ready to be accepted into glibc.

Best regards
Posted to the mailing list for review: https://sourceware.org/ml/libc-alpha/2020-01/msg00255.html https://sourceware.org/ml/libc-alpha/2020-01/msg00256.html
https://github.com/mike-fabian/langtable/releases/tag/0.0.51

I added ckb_IQ.UTF-8 to langtable to make it usable for installation on Fedora as soon as it is included in glibc.
By the way, how do you input Kurdish Sorani? Do you use a keyboard layout? Or do you need an input method?
(In reply to Mike FABIAN from comment #67)
> By the way, how do you input Kurdish Sorani? Do you use a keyboard layout?
> Or do you need an input method?

Yes, we have one and it is available.
(In reply to Mike FABIAN from comment #66)
> https://github.com/mike-fabian/langtable/releases/tag/0.0.51
>
> I added ckb_IQ.UTF-8 to langtable to make it usuable for installation on
> Fedora as soon as it is included in glibc.

Our focus is now on Ubuntu, because we have many users on Ubuntu.
Created attachment 12212 [details] 0001-Add-new-locale-ckb_IQ-Kurdish-Sorani-spoken-in-Iraq-.patch git log message changed according to Rafał Lużyński’s review.
Created attachment 12213 [details] 0002-Fix-ckb_IQ-BZ-9809.patch Fixed according to Rafał Lużyński’s review.
(In reply to Mike FABIAN from comment #71)
> Created attachment 12213 [details]
> 0002-Fix-ckb_IQ-BZ-9809.patch
>
> Fixed according to Rafał Lużyński’s review.

Changed to this according to Rafał Lużyński’s suggestion:

d_t_fmt "%A %d %b %Y, %I:%M:%S %p"
date_fmt "%A %d %B %Y, %Z %I:%M:%S %p"

All other changes are just whitespace and formatting.
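To see what the d_t_fmt conversion specifiers produce, here is a small Python sketch using datetime.strftime, which supports the same specifiers. The timestamp is an arbitrary example, and the day and month names depend on the LC_TIME locale that is active in the process, so the output will only be in Kurdish if the ckb_IQ locale is installed and selected.

```python
from datetime import datetime

# Format string taken from the d_t_fmt line above.
D_T_FMT = "%A %d %b %Y, %I:%M:%S %p"

# Arbitrary example timestamp (naive, without a time zone).
stamp = datetime(2020, 1, 13, 10, 6, 6)
formatted = stamp.strftime(D_T_FMT)
print(formatted)
```

The order is full weekday name, day of month, abbreviated month name, year, then the time on a 12-hour clock followed by the AM/PM marker.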
Rafał Lużyński’s review: https://sourceware.org/ml/libc-alpha/2020-01/msg00281.html
(In reply to Mike FABIAN from comment #73)
> Rafał Lużyński’s review:
>
> https://sourceware.org/ml/libc-alpha/2020-01/msg00281.html

Thanks to your efforts, the locale is now ready to be included in the library.
We have to wait until the release of glibc 2.31:

https://www.gnu.org/software/libc/

The current development version of glibc is 2.31, releasing on or around February 1st, 2020.
The master branch has been updated by Mike Fabian <mfabian@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4267522f5e0309f7606a8d1da5d436a166a719e2 commit 4267522f5e0309f7606a8d1da5d436a166a719e2 Author: Jwtiyar Nariman <jwtiyar@gmail.com> Date: Mon Jan 13 10:06:06 2020 +0100 Add new locale: ckb_IQ (Kurdish/Sorani spoken in Iraq) [BZ #9809]
The master branch has been updated by Mike Fabian <mfabian@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ae199e7d6423ed3bd0c8669381966ca4c58f4f49 commit ae199e7d6423ed3bd0c8669381966ca4c58f4f49 Author: Mike FABIAN <mfabian@redhat.com> Date: Mon Jan 13 10:12:07 2020 +0100 Fix ckb_IQ [BZ #9809] Add ckb_IQ to SUPPORTED file. Add ckb_IQ.UTF-8.in collation test file. Mention new ckb_IQ locale in NEWS.
Hey dear Mike,

I have downloaded the new glibc 2.31 release but couldn't find the ckb_IQ localedata there. Was it not planned to be included?

Best regards.
Created attachment 12261 [details] fixed typo in wednesday name in kurdish

Just a typo, now fixed: I replaced U+0624 with U+0627 in the name of Wednesday in Kurdish.

Best regards.
Created attachment 12262 [details] Added reorder-end command which missing

I added reorder-end because I couldn't compile the locale while this error was present.
(In reply to Jwtiyar Nariman from comment #78)
> Hey dear Mike
> I have downloaded new glibc 2.31 release but couldn't find ckb_iq localedata
> there? it was not planned to be there?

Yes, of course, that’s what I wrote in

https://sourceware.org/bugzilla/show_bug.cgi?id=9809#c75

2.31 was already in code freeze, so I could push this only *after* 2.31 was released. Therefore, the target milestone of this bug is set to 2.32.
(In reply to Jwtiyar Nariman from comment #80)
> Created attachment 12262 [details]
> Added reorder-end command which missing
>
> Adding reorder-end because couldn't compile it with this error exist.

Not needed anymore, I rewrote the whole LC_COLLATE section, see

https://sourceware.org/bugzilla/show_bug.cgi?id=9809#c49

If you want to make further changes, please look at what is in current git master:

https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=localedata/locales/ckb_IQ;hb=refs/heads/master

And then send a *patch*, not the complete new file.
(In reply to Jwtiyar Nariman from comment #79)
> Created attachment 12261 [details]
> fixed typo in wednesday name in kurdish
>
> Just a typo now fixed, replaced U+0624 with U+0627 in the name of Wednesday
> in kurdish.
>
> Best Regards.

$ git diff
diff --git a/localedata/locales/ckb_IQ b/localedata/locales/ckb_IQ
index a18ff69cb7..238c381edf 100644
--- a/localedata/locales/ckb_IQ
+++ b/localedata/locales/ckb_IQ
@@ -124,7 +124,7 @@ abday "<U0634><U06D5><U0645>";/
 day "<U06CC><U06D5><U0643><U0634><U06D5><U0645><U0645><U06D5>";/
 "<U062F><U0648><U0648><U0634><U06D5><U0645><U0645><U06D5>";/
 "<U0633><U06CE><U0634><U06D5><U0645><U0645><U06D5>";/
- "<U0686><U0648><U0624><U0631><U0634><U06D5><U0645><U0645><U06D5>";/
+ "<U0686><U0648><U0627><U0631><U0634><U06D5><U0645><U0645><U06D5>";/
 "<U067E><U06CE><U0646><U062C><U0634><U06D5><U0645><U0645><U06D5>";/
 "<U0647><U06D5><U06CC><U0646><U06CC>";/
 "<U0634><U06D5><U0645><U0645><U06D5>"
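Because the two lines in the diff differ in a single code point that is hard to spot by eye, a short Python sketch can locate it. The helper code_points is invented for this illustration; the two strings are copied from the removed and added lines of the diff above.

```python
import re

def code_points(ucs):
    """Split a string of <UXXXX> tokens into a list like ['0686', '0648', ...]."""
    return re.findall(r"<U([0-9A-F]{4})>", ucs)

old = "<U0686><U0648><U0624><U0631><U0634><U06D5><U0645><U0645><U06D5>"
new = "<U0686><U0648><U0627><U0631><U0634><U06D5><U0645><U0645><U06D5>"

# Collect (position, old code point, new code point) for every difference.
changes = [
    (index, before, after)
    for index, (before, after) in enumerate(zip(code_points(old), code_points(new)))
    if before != after
]
print(changes)
```

The only difference is the third code point, where U+0624 was replaced by U+0627, as described in the comment.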
The master branch has been updated by Mike Fabian <mfabian@sourceware.org>: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=eb948facd894e66429e2e170043b7d36fe445a8d commit eb948facd894e66429e2e170043b7d36fe445a8d Author: Mike FABIAN <mfabian@redhat.com> Date: Tue Feb 11 10:17:12 2020 +0100 Fix typo in the name for Wednesday in Kurdish [BZ #9809]
Fixed in current master.
(In reply to Mike FABIAN from comment #85)
> Fixed in current master.

Thank you, dear Mike, for everything; your help is really appreciated.

Best regards.
Hey dear Mike,

Will ckb_IQ be available in 2.32? I think there is a problem with reorder, as I mentioned before, due to this commit:

https://sourceware.org/git/?p=glibc.git;a=commit;h=3404def00a1b332080fa51044733f6ead0eae5f3

Best regards
(In reply to Jwtiyar Nariman from comment #87)
> HEY dear Mike
> Does ckb_IQ will be available in 2.32?

Yes.

> I think there is a problem with re-order as i mentioned before, Due to this
> commit:
> https://sourceware.org/git/?p=glibc.git;a=commit;h=3404def00a1b332080fa51044733f6ead0eae5f3
>
> Best Rgeards

This is fixed by the mentioned commit. And even before the commit, it worked; it just printed a warning at build time.
Hey, 2.32 is released and ckb_IQ is included. Thank you to everyone, especially Gunnar and Mike. I would like to know how Ubuntu will update to the latest 2.32.