Bug 2125

Summary: localedata has problems for ro_RO
Product: glibc Reporter: Eddy Petrişor <eddy.petrisor>
Component: localedata Assignee: GNU C Library Locale Maintainers <libc-locales>
Status: RESOLVED FIXED    
Severity: minor CC: drepper.fsp, eddy.petrisor, glibc-bugs
Priority: P2 Flags: fweimer: security-
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: fixes 2125 issues for localedata ro_RO
ro_RO locale fix
ro_RO locale fix - unzipped
Main_ChangeLog_entry
modified_locale_iso-4217.def
localedata_ChangeLog_entry
Modified localedata/locales/ro_RO files
New snippet for localedata/ChangeLog
Full ro_RO locale data
Full ro_RO locale data
New localedata/Changelog snippet

Description Eddy Petrişor 2006-01-09 10:26:45 UTC
Hello,

The current glibc has several problems regarding localedata for ro_RO:
* Letter order is incorrect (a with breve and a circumflex should be swapped)
* Capital letter A with breve is used inside day names
* Romanian Academy post-92 writing rules are not respected (a circumflex used
within words instead of i circumflex)
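
As a rough illustration of the letter-order point, here is a minimal Python sketch of a sort key that follows the Romanian alphabet, in which a with breve comes before a with circumflex. The names `RO_ALPHABET` and `ro_sort_key` are made up for this example, and the sample strings are arbitrary; this is not how the glibc LC_COLLATE section is written.

```python
import unicodedata

# Romanian alphabet (31 letters): a with breve sorts before a with circumflex.
RO_ALPHABET = "aăâbcdefghiîjklmnopqrsștțuvwxyz"
RANK = {ch: i for i, ch in enumerate(RO_ALPHABET)}


def ro_sort_key(word):
    """Hypothetical helper: rank each letter by its alphabet position."""
    # Normalise so that composed and decomposed spellings compare equal.
    text = unicodedata.normalize("NFC", word).lower()
    # Characters outside the alphabet sort last, ordered by code point.
    return [(RANK.get(ch, len(RO_ALPHABET)), ch) for ch in text]


samples = ["mâna", "mana", "măna"]
by_code_point = sorted(samples)               # â (U+00E2) lands before ă (U+0103)
by_alphabet = sorted(samples, key=ro_sort_key)
print(by_code_point, by_alphabet)
```

A plain code-point sort puts the circumflex form first, which is the kind of swapped order reported above; the alphabet-based key puts the breve form first.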


PS: I saw that the CHECKSUMS file seems to be outdated. Should this be this way?
Comment 1 Eddy Petrişor 2006-01-09 10:28:17 UTC
Created attachment 821 [details]
fixes 2125 issues for localedata ro_RO


I have made some modifications to glibc 2.3.5 to correct some problems:
* locales/ro_RO: Correct the sorting order of the letters a 
circumflex and a with breve according to the Romanian alphabet.
* locales/ro_RO: Use lowercase A with breve within day names
* locales/ro_RO: Use Romanian post-92 writing rules within day 
and abday


The patch was tested on Gentoo with a modified ebuild script and
corrects all the bugs mentioned above.

Please integrate it into the glibc source (I have added an entry in
localedata/Changelog; please modify if needed)
Comment 2 Eddy Petrişor 2006-02-21 13:08:17 UTC
(In reply to comment #0)
> Hello,
> 
> The current glibc has several problems regarding localedata for ro_RO:
> * Letter order is incorrect (a with breve and a circumflex should be swapped)
> * Capital letter A with breve is used inside day names
> * Romanian Academy post-92 writing rules are not respected (a circumflex used
> within words instead of i circumflex)

I have looked thoroughly into the ro_RO locale and, besides the issues
already described, there are more.

I am currently working on the finalisation of a patch that would fix the issues
(while explaining all the changes, too).

Current progress can be seen on Debian's BTS:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=347173

I will post here the finalised patch, so this message is just a "stop, the
previous patch is incomplete" shout.
Comment 3 Eddy Petrişor 2006-02-22 08:01:40 UTC
(In reply to comment #2)

As I said yesterday, I have finalised the ro_RO patch and currently Denis Barbier
is working on introducing this patch in Debian.

I will add here the (set of) patch(es) needed for this change, after it has
reached Debian unstable, in order to get some testing. (AFAICT Denis feels this
is a correct and clean patch for the ro_RO locale issue).
Comment 4 Eddy Petrişor 2006-02-24 14:17:36 UTC
(In reply to comment #3)
> > I am currently working on the finalisation of a patch that would fix the issues
> > (while explaining all the changes, too).

Here is a list of all the changes (more detailed explanations are in the ro_RO
file):

+2006-01-07  Eddy Petrisor  <eddy.petrisor@gmail.com>
+
+	* locales/ro_RO: Correct the sorting order of the letters a 
+	circumflex and a with breve according to the Romanian alphabet.
+	* locales/ro_RO: Do not use capital A with breve within day names
+	* locales/ro_RO: Use Romanian post-92 writing rules within day
+	* locales/ro_RO: After denomination starting with the 1st of July 2005,
+	int'l currency symbol is RON (1 RON = 10000 ROL);
+	see http://publications.eu.int/code/en/en-5000700.htm; 
+	* locales/ro_RO: grouping sign for thousands is "."; group of 3
+	* locales/ro_RO: short date format is %d.%m.%Y for RO
+	* locales/ro_RO: placed year before time in date_fmt
+	* locales/ro_RO: replaced %Z with %z in date formats because %Z is not
+	used nor widely known in Romania, and Romania uses daylight saving and
+	the difference is more obvious this way
+	* locales/ro_RO: changed abday for Saturday as i> looks bad and is
+	incorrect according to post-92 rules
+	* locales/ro_RO: do not capitalize months and days as it is not correct
+	in Romanian
+	* locales/ro_RO: A4 is the preferred paper type; metric system is used
+	(removed FIXMEs)
+	* locales/ro_RO: added country_name, country_car, lang_name and lang_ab
+	* locales/ro_RO: added name_mr, name_mrs, name_miss (name_ms omitted as
+	there is no such proper form in Romanian)
+	* locales/ro_RO: added explanation related to the cedilla/comma issue and
+	the reason why the transliteration is a good idea
+	* locales/ro_RO: changed default encoding to UTF-8 - this is the only 
+	encoding that supports all Romanian specific symbols (see encoding table
+	in Debian BTS, #119528 and the corresponding comments in #347173)
+	* locales/ro_RO: Corrected the name format (salutation abbreviation was
+	omitted)
+	* locales/ro_RO: Corrected postal_fmt (See address examples from
+	Romanian Ministries' sites in Debian BTS #347173)
+	* locales/ro_RO: first_weekday and first_workday are both Monday
+	* locales/ro_RO: added terminology and bibliographic codes for RO
+	reference: http://www.loc.gov/standards/iso639-2/langcodes.html#qr
+	* locales/ro_RO: added isbn code
+	* locales/ro_RO: added postal code - RO (not 100% sure)
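
To make a few of the entries above concrete, here is a small Python sketch of the short date format (%d.%m.%Y), the "." thousands separator with groups of three, the %z offset, and the 1 RON = 10000 ROL denomination. The function names are invented for this illustration and the code does not read the glibc locale at all; it only mimics the intended results.

```python
from datetime import date, datetime, timedelta, timezone
from decimal import Decimal


def short_date(d):
    """Format a date as day.month.year, as in the proposed d_fmt."""
    return d.strftime("%d.%m.%Y")


def group_thousands(n):
    """Group an integer in threes, using '.' as the separator."""
    return f"{n:,}".replace(",", ".")


def rol_to_ron(rol):
    """Convert old lei (ROL) to new lei (RON): 1 RON = 10000 ROL."""
    return Decimal(rol) / Decimal(10000)


# Romanian summer time is UTC+3; %z prints the numeric offset.
noon = datetime(2005, 7, 1, 12, 0, tzinfo=timezone(timedelta(hours=3)))

print(short_date(date(2005, 7, 1)))   # 01.07.2005
print(group_thousands(1234567))       # 1.234.567
print(rol_to_ron(25000))              # 2.5
print(noon.strftime("%z"))            # +0300
```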


> > I will post here the finalised patch, so this message is just a "stop, the
> > previous patch is incomplete" shout.
> As I said yesterday, I have finalised the ro_RO patch and currently Denis Barbier
> is working on introducing this patch in Debian.
> 
> I will add here the (set of) patch(es) needed for this change, after it has
> reached Debian unstable, in order to get some testing. 

I am posting the patch here in case anybody wants to look over the whole
patch sooner.

I don't expect many changes to happen to it from now on (in other words, it
looks complete)
Comment 5 Eddy Petrişor 2006-02-24 14:20:28 UTC
Created attachment 892 [details]
ro_RO locale fix

I have tested this patch on a Gentoo system and have published the results in
the Debian Bug Tracking system. If test results should be added here, please
communicate.
Comment 6 Eddy Petrişor 2006-02-24 14:22:27 UTC
(In reply to comment #5)
> Created an attachment (id=892)
> ro_RO locale fix
> 
> I have tested this patch on a Gentoo system and have published the results in
> the Debian Bug Tracking system. If test results should be added here, please
> communicate.

I forgot to mention that the patch is against glibc 2.3.5, so if it should be made
against CVS head or any other version, I will create a proper variant of the patch.
Comment 7 Eddy Petrişor 2006-02-24 14:24:51 UTC
Created attachment 893 [details]
ro_RO locale fix - unzipped

Here is an ungzipped version of the patch
Comment 8 Ionel Mugurel Ciobîcă 2006-03-08 14:13:19 UTC
I agree with and welcome the changes proposed by Eddy Petrișor to the ro_RO file.

Ionel Ciobîcă (tgakic _at_ chem _dot_ tue _dot_ nl)
Comment 9 Eddy Petrişor 2006-03-19 23:48:34 UTC
(In reply to comment #7)
> Created an attachment (id=893)
> ro_RO locale fix - unzipped
> 
> Here is an ungzipped version of the patch

This patch has been in Debian unstable for some time. There were no negative
reports, only positive ones from the users.

Please merge this patch into current locale-data.

(If a patch against current version is needed instead of 2.3.5, please say so
and I will make that patch; the only data dependent on that version would be the
changelog and, if it changed, the currency reference info).
Comment 10 Ulrich Drepper 2006-04-24 06:11:44 UTC
Why the change to the default encoding?  This would only be needed if characters
are used which are not in ISO-8859-2.

Remove references to Debian from the ChangeLog.

Reformat the ChangeLog:
- it must not be in the form of a patch
- use one locales/ro_RO reference, no need to repeat this
- write complete sentences with capitalization and full stops.
Comment 11 Eddy Petrişor 2006-04-25 08:11:49 UTC
(In reply to comment #10)
> Why the change to the default encoding?  This would only be needed if characters
> are used which are not in ISO-8859-2.

The correct diacritics for the Romanian letters Ș/ș and Ț/ț are not U015E/U015F and
U0162/U0163 (cedilla), but U0218/U0219 and U021A/U021B (comma below), which are not
present in 8859-2; the quotation marks „” and «» are also missing from 8859-2, so the
only reasonable choice is UTF-8.
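
The claim can be checked with a short Python sketch that relies only on the standard `iso8859_2` and `utf-8` codecs. The helper name `encodable` and the three constant names are invented for this illustration.

```python
def encodable(text, codec="iso8859_2"):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


CEDILLA = "\u015e\u015f\u0162\u0163"       # S/s and T/t with cedilla
COMMA_BELOW = "\u0218\u0219\u021a\u021b"   # S/s and T/t with comma below
QUOTES = "\u201e\u201d\u00ab\u00bb"        # the four quotation marks named above

print(encodable(CEDILLA))                         # True
print([encodable(ch) for ch in COMMA_BELOW])      # all False
print([encodable(ch) for ch in QUOTES])           # all False
print(encodable(COMMA_BELOW + QUOTES, "utf-8"))   # True
```

The cedilla forms are present in ISO-8859-2, while the comma-below forms and these quotation marks are not; UTF-8 can represent all of them.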

> Remove references to Debian from the ChangeLog.

will do

> Reformat the ChangeLog:
> - it must not be in the form of a patch

Ok, will send some distinct files instead

> - use one locales/ro_RO reference, no need to repeat this

Ok, will fix.

Is the format below acceptable?

 * locales/ro_RO:
    - Item 1
    - Item 2
    - Item 3

> - write complete sentences with capitalization and full stops.

Ok.
Comment 12 Ulrich Drepper 2006-04-25 17:11:55 UTC
Just use complete sentences, not

 * locales/ro_RO:
    - Item 1
    - Item 2
    - Item 3


Also, today we provide ro_RO with ISO-8859-2.  It must remain possible to
generate it.  So, make sure the appropriate transliterations are found.
Comment 13 Eddy Petrişor 2006-04-25 18:40:59 UTC
> Just use complete sentences, not
>
> * locales/ro_RO:
>    - Item 1
>    - Item 2
>    - Item 3

The "Item *" placeholders were meant to stand for full sentences ;-)

> Also, today we provide ro_RO with ISO-8859-2.  It must remain possible to
> generate it.  So, make sure the appropriate transliterations are found.

There is no change in the transliteration section except for the comments added
before it to explain why it is necessary.
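
To illustrate what such a transliteration achieves, here is a minimal Python sketch that falls back from the comma-below letters to their cedilla counterparts before encoding to ISO-8859-2. The table `FALLBACK` and the function `to_latin2` are hypothetical names for this example; the actual rules live in the translit section of the ro_RO file and may differ from this mapping.

```python
# Hypothetical fallback table: comma-below letters -> cedilla letters,
# which do exist in ISO-8859-2.
FALLBACK = str.maketrans({
    "\u0218": "\u015e",  # capital S with comma below -> S with cedilla
    "\u0219": "\u015f",  # small s with comma below   -> s with cedilla
    "\u021a": "\u0162",  # capital T with comma below -> T with cedilla
    "\u021b": "\u0163",  # small t with comma below   -> t with cedilla
})


def to_latin2(text):
    """Apply the fallback, then encode; raises if other characters are missing."""
    return text.translate(FALLBACK).encode("iso8859_2")


encoded = to_latin2("\u0219i \u021bar\u0103")
print(encoded.decode("iso8859_2"))
```

Without the fallback step the same call to `encode` would raise `UnicodeEncodeError`, which is why the locale needs transliterations if an ISO-8859-2 variant is to be generated.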
Comment 14 Eddy Petrişor 2006-04-26 21:37:30 UTC
Created attachment 982 [details]
Main_ChangeLog_entry

This is the main ChangeLog entry. It is needed because the international currency
symbol has changed from ROL to RON, and the reference definitions should be
changed in order for glibc to compile.
Comment 15 Eddy Petrişor 2006-04-26 21:39:06 UTC
Created attachment 984 [details]
modified_locale_iso-4217.def

The only change is the replacement of the ROL symbol with RON.
Comment 16 Eddy Petrişor 2006-04-26 21:40:45 UTC
Created attachment 985 [details]
localedata_ChangeLog_entry

This is the complete changelog entry which should be added to the
localedata/Changelog file in order to document all the changes in
localedata/locales/ro_RO.
Comment 17 Eddy Petrişor 2006-04-26 21:42:27 UTC
Created attachment 986 [details]
Modified localedata/locales/ro_RO files

This file contains all the changes documented in attachment 985.
Comment 18 Ulrich Drepper 2006-05-01 19:00:59 UTC
I added the locale changes.  But it seems you never really tested it.  The
lang_term and lang_lib fields were missing.

The ISO 4217 data has already been updated.
Comment 19 Eddy Petrişor 2006-05-02 06:38:52 UTC
(In reply to comment #18)
> I added the locale changes.  But it seems you never really tested it.

I have been running those changes for almost two months now, so I wouldn't say I
never really tested them.

>  The
> lang_term and lang_lib fields were missing.

I am really sorry; it appears that I converted an older version of the files
used to create a patch when I was trying not to use patches (as you requested).

I will check again to see whether anything else was lost.

Could you clarify: have you added lang_term and lang_lib?

FWIW, the split attachment files were consistent with each other (the fields in
question were not mentioned in the changelog). Thanks for noticing!

> The ISO 4217 data has already been updated.

I don't know who did this, but in Debian this change was made some time before
the actual new locale entered. Maybe Denis Barbier pushed it here.
Comment 20 Eddy Petrişor 2006-05-03 19:09:50 UTC
I have just checked. The locale info is still incomplete.

I will send properly merged files for ro_RO and the localedata Changelog snippet.
Comment 21 Eddy Petrişor 2006-05-03 19:28:25 UTC
Created attachment 1001 [details]
New snippet for localedata/ChangeLog

This contains the changelog entries that should be added to
localedata/ChangeLog
Comment 22 Eddy Petrişor 2006-05-03 19:30:25 UTC
Created attachment 1002 [details]
Full ro_RO locale data

This contains the whole localedata/locales/ro_RO file with all the missing
changes (including the ones found as missing by Ulrich).
Comment 23 Eddy Petrişor 2006-06-09 06:55:33 UTC
Created attachment 1067 [details]
Full ro_RO locale data

(In reply to comment #22)
> Created an attachment (id=1002)
> Full ro_RO locale data
> 
> This contains the whole localedata/locales/ro_RO file with all the missing
> changes (including the ones found as missing by Ulrich).

There is a small bug in the collation rules which has been present in the file
since r1.13: a couple of characters are not marked as capitals,
although they should be.

See attachment.
Comment 24 Eddy Petrişor 2006-06-09 06:56:20 UTC
Comment on attachment 1002 [details]
Full ro_RO locale data

File is obsoleted.
Comment 25 Eddy Petrişor 2006-06-09 07:26:36 UTC
Created attachment 1068 [details]
New localedata/Changelog snippet

This ChangeLog snippet reflects the changes from current CVS to the new ro_RO locale
data from attachment 1067.
Comment 26 Ulrich Drepper 2006-08-03 18:38:25 UTC
I changed the CVS version.  Check whether this is _really_ the correct version or
whether you again attached something obsolete.
Comment 27 Eddy Petrişor 2006-08-04 10:06:08 UTC
(In reply to comment #26)
> I changed the CVS version.  Check whether this is _really_ the correct version or
> whether you again attached something obsolete.

The current HEAD is correct.

Note that the collation rules were broken before I started working on this
issue, so this addition is actually a fix.