Sourceware Bugzilla – Bug 3156
LC_TIME for pl_PL doesn't match standard usage
Last modified: 2008-03-19 09:43:17 UTC
Currently, glibc displays dates in the pl_PL locale as:
pon sie 6 01:23:45 CEST 1984
This format violates several conventions for date abbreviations in the Polish
language. I include a patch against the current CVS localedata with the
following changes:
* non-standard weekday abbreviations are replaced with standard ones
* non-standard month abbreviations are replaced with standard ones (based on
Roman numerals)
* middle-endian format (never used in Poland) is replaced with the little-endian
one (by far the most popular)
* standard padding is introduced, i.e. h:m:s are zero-padded, day of the month
is not padded
* fields are properly separated
With the patch, dates are displayed as:
Pn, 6 VIII 1984, 01:23:45 CEST
which matches the most common usage.
Please notice that the abbreviations are no longer fixed-width. Since this is
also the case in several other locales, I suppose it is not a problem.
Created attachment 1267 [details]
LC_TIME fixes for pl_PL locale
Subject: Re: New: LC_TIME for pl_PL doesn't match standard usage
On Thu, Aug 31, 2006 at 01:47:16AM -0000, inkerman42 at gmail dot com wrote:
> Currently, glibc displays dates in the pl_PL locale as:
> pon sie 6 01:23:45 CEST 1984
> This format violates several conventions for date abbreviations in the Polish
> language. I include a patch against the current CVS localedata with the
> following changes:
> * non-standard weekday abbreviations are replaced with standard ones
> * non-standard month abbreviations are replaced with standard ones (based on
> Roman numerals)
> * middle-endian format (never used in Poland) is replaced with the little-endian
> one (by far the most popular)
> * standard padding is introduced, i.e. h:m:s are zero-padded, day of the month
> is not padded
> * fields are properly separated
> With the patch, dates are displayed as:
> Pn, 6 VIII 1984, 01:23:45 CEST
> which matches the most common usage.
Well, you could use that for the long format, but it does not seem
convenient for the short (abbreviated) format. Both day names and month
names are variable length.
My understanding is also that day and month names in Polish are spelled
with small initial letters.
> Please notice that the abbreviations are no longer fixed-width. Since this is
> also the case in several other locales, I suppose it is not a problem.
The recommendation is that the abbreviated format be fixed
format/length, as this is intended to be used in log messages.
You have to provide evidence. Provide URLs of official documents, railway
[Sorry for delay, I have been on vacation for the last few days.]
First, some background to answer Mr. Drepper's and Mr. Simonsen's questions.
Weekday abbreviations are not part of any official standard. The ones described
above are, however, used nearly universally in calendars.
Examples of use:
* http://kalendarz.pwn.pl/ [calendar of PWN (Polish Scientific Publishers),
publisher of the largest and most authoritative Polish-language encyclopedias]
* http://lot.pl/ [timetable of LOT, the largest Polish airline]
Please note that these abbreviations can only appear standalone or as part of a
standalone date (and yes, while Polish weekday names are lowercase, these
abbreviations are not). To abbreviate weekday names in an intertextual context (which would be
quite uncommon), one would have to use an ad hoc abbreviation following standard
rules, i.e. match the case of the word, end with a consonant, and be followed by
a dot, e.g. 'poniedziałek' could be abbreviated as 'pn.', 'pon.', or 'poniedz.'.
Modern dictionaries of the Polish language allow the following date abbreviations:
* 6 VIII 1984 (older dictionaries also allowed 6.VIII.1984)
* 6.8.1984, or 6.08.1984, or 06.08.1984
The use of other abbreviations (such as 1984.08.06) is explicitly discouraged,
unless necessitated by specialized data processing requirements.
* http://so.pwn.pl/zasady.php?id=629747 [orthographic dictionary of PWN]
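The variants listed above can be sketched in a few lines of Python. This is only an illustration: the helper name `polish_date_variants` and the Roman numeral table are assumptions made for the example and are not taken from the patch.

```python
from datetime import date

# Roman numerals for months 1..12 (illustrative table, not taken from the patch).
ROMAN_MONTHS = ["I", "II", "III", "IV", "V", "VI",
                "VII", "VIII", "IX", "X", "XI", "XII"]


def polish_date_variants(d: date) -> list[str]:
    """Return the date abbreviations listed above for the given date."""
    return [
        f"{d.day} {ROMAN_MONTHS[d.month - 1]} {d.year}",  # 6 VIII 1984
        f"{d.day}.{d.month}.{d.year}",                     # 6.8.1984
        f"{d.day}.{d.month:02d}.{d.year}",                 # 6.08.1984
        f"{d.day:02d}.{d.month:02d}.{d.year}",             # 06.08.1984
    ]


for variant in polish_date_variants(date(1984, 8, 6)):
    print(variant)
```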
Examples of use:
* http://www.senat.gov.pl/senatrp/noty/dzieje.pdf [history of Senat, upper
chamber of the Polish parliament]
* http://edukacja.sejm.gov.pl/historia_sejmu/ [history of Sejm, lower chamber of
the Polish parliament]
* http://rjp.pl/?mod=uchwaly&id=2 [resolutions of the Polish Language Council,
official standardizing organization for the Polish language]
* http://intercity.pl/scripts/train/index.php?action=train_list [timetable of
PKP, the largest Polish railway company]
* http://lot.pl/ [timetable of LOT]
> The recommendation is that the abbreviated format be fixed
> format/length, as this is intended to be used in log messages.
Ah, I wasn't aware of this recommendation. Perhaps it might be a good idea to
document it somewhere? Is there some particular reason why so many locales don't
Which formats exactly should be fixed-width? For d_fmt, there should be no
problem. Weekday abbreviations can be made fixed-width as well, by using a
variant with N replaced with Nd. And while it isn't exactly common to mix
weekday abbreviations and numeric date format, I guess it can be done, too. How
about date_fmt? It's not fixed-width in the POSIX locale, either.
Has there been any activity on this bug recently? There has been no comment from
any of the developers for several months. I have submitted the requested
references; is that enough? My question about which of the formats, if
any, should be changed to fixed-width hasn't been answered, either. Is there
anything else I can do to help?
The weekday abbreviations proposed by the above patch also seem to be identical
to the ones used by Windows. (This is what the gtk2 calendar widget uses on
Windows XP, at least).
I applied the patch.
This change introduced a bug with the strptime function.
The D_FMT format is set to "%-d %b %Y" and the strptime
function does not seem to support the "-" flag in the "%-d"
format specifier. This leads to a problem described
There are, indeed, two bugs in strptime() which prevent it from parsing d_fmt
correctly. Filed as:
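As an aside on the "%-d" parsing problem: one possible application-side workaround is to strip the GNU "-" (no padding) flag from the format before parsing, since "%d" on input accepts a day without a leading zero. The sketch below uses Python's own strptime rather than glibc's, so it only illustrates the idea. The helper `strip_gnu_flags` is hypothetical, and "%b" is swapped for a numeric "%m" so that the example does not depend on the active locale.

```python
import re
from datetime import datetime


def strip_gnu_flags(fmt: str) -> str:
    """Drop the GNU '-' (no padding) flag from conversions such as '%-d'.

    A literal '%%' is matched first so that '%%-d' is left untouched.
    """
    return re.sub(
        r"%%|%-([A-Za-z])",
        lambda m: m.group(0) if m.group(0) == "%%" else "%" + m.group(1),
        fmt,
    )


# Hypothetical numeric stand-in for the "%-d %b %Y" format from the report.
display_fmt = "%-d %m %Y"
parse_fmt = strip_gnu_flags(display_fmt)

# "%d" accepts the unpadded day "6" on input.
parsed = datetime.strptime("6 08 1984", parse_fmt)
print(parse_fmt, parsed.date())
```

This keeps the display format unchanged and only adjusts the format handed to the parser; it does not address the underlying strptime bugs.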
There's significant uproar due to the month abbreviations change to Roman
numerals. Though theoretically correct, it is not acceptable to the community at
large, and there are standards and expert opinions in favour of three-letter
abbreviations. Weekday abbreviations and field separator changes are under
debate, too. See bug 4789 for more details.
Please revert pl_PL-LC_TIME.patch ASAP. Someone has played a joke on you. This archaic
date stamping using Roman month numbers was never officially approved and was never in
widespread use. It is only used when the author of a document wants an artistic
effect. That is why Roman numbering of months is usually found in documents
together with Gothic fonts and the Anno Domini (A.D.) prefix before the date.
I'm sure the glibc maintainers applied this patch in good faith. Next
time, before applying such a patch, please ask the Polish Linux translation teams
whether a localization patch is a hostile joke.
This patch makes the Polish Linux community suffer (broken apps, logs, errors in data
processing). It is a proof of concept of how big a negative impact a
single person can have on a whole community.
This patch breaks national standards:
PN-EN 28601:2002 which is the same as ISO 8601:2004
here is free access via wikipedia:
The PN-EN 28601:2002 (ISO 8601:2004) format is the required format for data sorting
in data processing machines. Example: 2007-09-24
In real life (government documents, administration, commerce) the dd-MM-CCYY format
is used for a comfortable user view, where:
dd - day in two digits format
MM - month in two digits format
CC - century in two digits format
YY - year in two digits format
On paper documents, a dot "." can be found instead of the dash "-" separator. However,
because of the growing computerization of the country's administration and the
increasing number of personal computer users, the dash is becoming more popular than
the dot because it is more visible on screen and in print.
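The two formats described above can be compared with a short Python sketch. The sample dates other than 2007-09-24 are made up for the illustration. It also shows why the ISO 8601 form suits data sorting: its strings sort chronologically as plain text, while dd-MM-CCYY strings do not.

```python
from datetime import date

d = date(2007, 9, 24)

iso = d.isoformat()               # ISO 8601 calendar date: CCYY-MM-dd
display = d.strftime("%d-%m-%Y")  # dd-MM-CCYY, as described above
dotted = d.strftime("%d.%m.%Y")   # same fields with the dot separator

# Made-up sample dates, deliberately not in chronological order.
dates = [date(2007, 9, 24), date(2006, 12, 30), date(2007, 1, 15)]

# Plain text sorting of each representation.
iso_sorted = sorted(x.isoformat() for x in dates)
display_sorted = sorted(x.strftime("%d-%m-%Y") for x in dates)

print(iso, display, dotted)
print(iso_sorted)      # chronological order
print(display_sorted)  # ordered by day of month, not chronological
```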
To all maintainers (no matter if it is glibc or another project): Please always
verify all localization patches by sending them (in a human-readable format)
to the official translation teams of the given language/region. This will keep Linux
safe and not compromised.
How about something like that:
Pon, 6 sie 1984 01:23:45 CEST
Looks nicer, has non-broken endianness and fixed-width names.
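A minimal sketch of how this suggested layout could be produced, assuming the conventional three-letter Polish abbreviations. Only "Pon"/"pon" and "sie" appear in this thread; the other table entries, the helper name `format_proposed`, and passing the time zone name as a plain string are assumptions made for the example.

```python
from datetime import datetime

# Conventional three-letter Polish abbreviations, Monday first to match
# datetime.weekday(). Only "pon" and "sie" appear in the thread itself;
# the remaining entries are assumptions for illustration.
ABDAY = ["pon", "wto", "śro", "czw", "pią", "sob", "nie"]
ABMON = ["sty", "lut", "mar", "kwi", "maj", "cze",
         "lip", "sie", "wrz", "paź", "lis", "gru"]


def format_proposed(dt: datetime, tz_name: str) -> str:
    """Render a date in the 'Pon, 6 sie 1984 01:23:45 CEST' layout."""
    day_name = ABDAY[dt.weekday()].capitalize()
    return (f"{day_name}, {dt.day} {ABMON[dt.month - 1]} {dt.year} "
            f"{dt:%H:%M:%S} {tz_name}")


print(format_proposed(datetime(1984, 8, 6, 1, 23, 45), "CEST"))
```

Note that the names are fixed-width here, but the day of the month is still unpadded, so the line as a whole is not strictly fixed-width.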
ISO is an already established, widely used standard, adopted by PN (Polska Norma/Polish
Norm). Using anything else, like fancy Roman formatting, is useless
(In reply to comment #11)
> PN-EN 28601:2002 which is the same as ISO 8601:2004
Here is something for you to criticise: