Bug 10871 - 'mon' array should contain both nominative and genitive cases
Summary: 'mon' array should contain both nominative and genitive cases
Status: ASSIGNED
Alias: None
Product: glibc
Classification: Unclassified
Component: locale
Version: unspecified
Importance: P2 enhancement
Target Milestone: ---
Assignee: Rafal Luzynski
URL: http://austingroupbugs.net/view.php?i...
Keywords:
Duplicates: 10872 15606
Depends on:
Blocks:
 
Reported: 2009-10-30 06:50 IST by UrmasD
Modified: 2018-01-13 11:00 IST
CC List: 6 users

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Proposed solution: alternative month names + smart algorithm (9.38 KB, patch)
2015-11-19 01:42 IST, Rafal Luzynski
Proposed solution: support alternative month names (4.43 KB, patch)
2016-01-05 02:17 IST, Rafal Luzynski
Add tests for the alternative month names (484 bytes, patch)
2016-01-05 02:23 IST, Rafal Luzynski
Proposed solution: support day month order (4.34 KB, patch)
2016-01-05 02:31 IST, Rafal Luzynski
Proposed solution: smart algorithm choosing the nominative/genitive month name (3.31 KB, patch)
2016-01-05 02:36 IST, Rafal Luzynski
Alternative month names NLS data (Polish) (597 bytes, patch)
2016-01-05 02:43 IST, Rafal Luzynski
Alternative month names NLS data (Russian) (832 bytes, patch)
2016-01-05 02:50 IST, Rafal Luzynski
Alternative month names NLS data (Ukrainian) (1.22 KB, patch)
2016-01-05 02:55 IST, Rafal Luzynski
Proposed solution: support alternative month names (version 2) (5.12 KB, patch)
2016-03-24 11:14 IST, Rafal Luzynski
Add tests for the alternative month names (v2) (484 bytes, patch)
2016-03-24 11:15 IST, Rafal Luzynski
Proposed solution: implement the %OB format specifier (1.98 KB, patch)
2016-03-24 11:20 IST, Rafal Luzynski
Alternative month names NLS data (Polish, v2) (698 bytes, patch)
2016-03-24 11:23 IST, Rafal Luzynski
Alternative month names NLS data (Russian, v2) (831 bytes, patch)
2016-03-24 11:23 IST, Rafal Luzynski
Alternative month names NLS data (Ukrainian, v2) (1.10 KB, patch)
2016-03-24 11:24 IST, Rafal Luzynski
Proposed solution: support alternative month names (version 3) (5.12 KB, patch)
2016-10-17 22:13 IST, Rafal Luzynski
Provide backward compatibility for nl_langinfo family (1.56 KB, patch)
2016-10-17 22:18 IST, Rafal Luzynski
Rebuild abilists to reflect nl_langinfo changes (1.88 KB, patch)
2016-10-17 22:19 IST, Rafal Luzynski
Add tests for alternative month names (v3) (485 bytes, patch)
2016-10-17 22:23 IST, Rafal Luzynski
Proposed solution: implement the %OB format specifier (v3) (1.99 KB, patch)
2016-10-17 22:27 IST, Rafal Luzynski
Provide backward compatibility for strftime family (2.90 KB, patch)
2016-10-17 22:47 IST, Rafal Luzynski
Rebuild abilists to reflect strftime family changes (1.94 KB, patch)
2016-10-17 22:51 IST, Rafal Luzynski
Alternative month names NLS data (Polish, v3) (706 bytes, patch)
2016-10-17 22:56 IST, Rafal Luzynski
Alternative month names NLS data (Russian, v3) (824 bytes, patch)
2016-10-17 22:57 IST, Rafal Luzynski
Alternative month names NLS data (Ukrainian, v3) (1.11 KB, patch)
2016-10-17 22:58 IST, Rafal Luzynski
Alternative month names NLS data (Czech) (647 bytes, patch)
2016-10-17 23:01 IST, Rafal Luzynski
Proposed solution: support alternative month names (version 4) (4.97 KB, patch)
2016-10-27 23:46 IST, Rafal Luzynski
Provide backward compatibility for nl_langinfo family (version 4) (1.69 KB, patch)
2016-10-27 23:50 IST, Rafal Luzynski
Rebuild abilists to reflect nl_langinfo changes (version 4) (1.86 KB, patch)
2016-10-27 23:56 IST, Rafal Luzynski
Provide backward compatibility for strftime family (version 4) (2.99 KB, patch)
2016-10-28 00:01 IST, Rafal Luzynski
Rebuild abilists to reflect strftime family changes (version 4) (2.04 KB, patch)
2016-10-28 00:08 IST, Rafal Luzynski
Proposed solution: support alternative month names (version 5) (4.97 KB, patch)
2016-12-22 23:04 IST, Rafal Luzynski
Provide backward compatibility for nl_langinfo family (version 5) (1.61 KB, patch)
2016-12-22 23:07 IST, Rafal Luzynski
Rebuild abilists to reflect nl_langinfo changes (version 5) (1.90 KB, patch)
2016-12-22 23:09 IST, Rafal Luzynski
Add tests for alternative month names (v5) (486 bytes, patch)
2016-12-22 23:12 IST, Rafal Luzynski
Proposed solution: implement the %OB format specifier (v5) (1.93 KB, patch)
2016-12-22 23:14 IST, Rafal Luzynski
Provide backward compatibility for strftime family (version 5) (2.97 KB, patch)
2016-12-22 23:20 IST, Rafal Luzynski
Rebuild abilists to reflect strftime family changes (version 5) (2.20 KB, patch)
2016-12-22 23:22 IST, Rafal Luzynski
Alternative month names NLS data (Polish, v5) (706 bytes, patch)
2016-12-22 23:23 IST, Rafal Luzynski
Alternative month names NLS data (Russian, v5) (825 bytes, patch)
2016-12-22 23:25 IST, Rafal Luzynski
Alternative month names NLS data (Ukrainian, v5) (1.11 KB, patch)
2016-12-22 23:27 IST, Rafal Luzynski
Alternative month names NLS data (Czech, v5) (647 bytes, patch)
2016-12-22 23:29 IST, Rafal Luzynski
Alternative month names for all locales (42.83 KB, patch)
2016-12-22 23:32 IST, Rafal Luzynski
Import month names from CLDR (13.68 KB, patch)
2016-12-22 23:37 IST, Rafal Luzynski
Proposed solution: support alternative month names (version 6) (5.04 KB, patch)
2017-03-20 08:47 IST, Rafal Luzynski
Provide backward compatibility for nl_langinfo family (version 6) (1.61 KB, patch)
2017-03-20 08:51 IST, Rafal Luzynski
Rebuild abilists to reflect nl_langinfo changes (version 6) (1.92 KB, patch)
2017-03-20 08:58 IST, Rafal Luzynski
Proposed solution: implement the %OB format specifier (v6) (1.89 KB, patch)
2017-03-20 09:01 IST, Rafal Luzynski
Provide backward compatibility for strftime family (version 6) (2.97 KB, patch)
2017-03-20 09:03 IST, Rafal Luzynski
Rebuild abilists to reflect strftime family changes (version 6) (1.97 KB, patch)
2017-03-20 09:09 IST, Rafal Luzynski
Let alternative month names be a copy of regular ones (759 bytes, patch)
2017-03-20 09:18 IST, Rafal Luzynski
Also implement abbreviated alternative month names and %Ob (5.71 KB, patch)
2017-03-20 09:28 IST, Rafal Luzynski
Backward compatibility for abbreviated alternative month names and %Ob (1.02 KB, patch)
2017-03-20 09:35 IST, Rafal Luzynski
Alternative month names NLS data (Russian, v6) (821 bytes, patch)
2017-03-20 09:45 IST, Rafal Luzynski
Import genitive month names from CLDR (v6) (5.18 KB, patch)
2017-03-20 09:51 IST, Rafal Luzynski
Import uppercase/lowercase month names from CLDR (v6) (3.36 KB, patch)
2017-03-20 09:54 IST, Rafal Luzynski
Rebuild abilists to reflect nl_langinfo changes (version 7) (1.87 KB, patch)
2017-05-23 22:41 IST, Rafal Luzynski
Provide backward compatibility for strftime family (version 7) (2.97 KB, patch)
2017-05-23 22:45 IST, Rafal Luzynski
Rebuild abilists to reflect strftime family changes (version 7) (1.92 KB, patch)
2017-05-23 23:00 IST, Rafal Luzynski
Provide backward compatibility for nl_langinfo family (version 8) (1.61 KB, patch)
2017-06-28 00:15 IST, Rafal Luzynski
Rebuild abilists to reflect nl_langinfo changes (version 8) (1.82 KB, patch)
2017-06-28 00:21 IST, Rafal Luzynski
Proposed solution: implement the %OB format specifier (version 8) (1.89 KB, patch)
2017-06-28 00:24 IST, Rafal Luzynski
Provide backward compatibility for strftime family (version 8) (2.97 KB, patch)
2017-06-28 00:29 IST, Rafal Luzynski
Rebuild abilists to reflect strftime family changes (version 8) (2.24 KB, patch)
2017-06-28 00:34 IST, Rafal Luzynski
Also implement abbreviated alternative month names and %Ob (version 8) (5.72 KB, patch)
2017-06-28 00:36 IST, Rafal Luzynski
Backward compatibility for abbreviated alternative month names and %Ob (version 8) (1.02 KB, patch)
2017-06-28 00:39 IST, Rafal Luzynski
Correct the size of _nl_value_type_LC_... arrays (v9) (1.03 KB, patch)
2017-09-19 10:04 IST, Rafal Luzynski
Implement alternative month names (v9) (4.12 KB, patch)
2017-09-19 10:10 IST, Rafal Luzynski
Regenerate locfile-kw.h from locfile-kw.gperf (v9) (2.88 KB, patch)
2017-09-19 10:13 IST, Rafal Luzynski
Also implement abbreviated alternative month names and %Ob (v9) (3.74 KB, patch)
2017-09-19 10:17 IST, Rafal Luzynski
Again regenerate locfile-kw.h from locfile-kw.gperf (v9) (2.89 KB, patch)
2017-09-19 10:19 IST, Rafal Luzynski
Documentation to the above changes(v9) (2.13 KB, patch)
2017-09-19 10:20 IST, Rafal Luzynski
Implement alternative month names (v10) (4.74 KB, patch)
2017-11-16 01:49 IST, Rafal Luzynski
Regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v10) (3.32 KB, patch)
2017-11-16 01:52 IST, Rafal Luzynski
Also implement abbreviated alternative month names and %Ob (v10) (4.38 KB, patch)
2017-11-16 01:53 IST, Rafal Luzynski
Again regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v10) (3.28 KB, patch)
2017-11-16 01:55 IST, Rafal Luzynski
Documentation to the above changes(v10) (2.56 KB, patch)
2017-11-16 02:01 IST, Rafal Luzynski
Alternative month names NLS data (Polish, v10) (632 bytes, patch)
2017-11-16 02:03 IST, Rafal Luzynski
Alternative month names NLS data (Russian, v10) (816 bytes, patch)
2017-11-16 02:04 IST, Rafal Luzynski
Alternative month names NLS data (Ukrainian, v10) (1.08 KB, patch)
2017-11-16 02:04 IST, Rafal Luzynski
Alternative month names NLS data (Czech, v10) (643 bytes, patch)
2017-11-16 02:06 IST, Rafal Luzynski
Import genitive month names from CLDR (v10) (5.10 KB, patch)
2017-11-16 02:08 IST, Rafal Luzynski
Import uppercase/lowercase month names from CLDR (v10) (3.35 KB, patch)
2017-11-16 02:09 IST, Rafal Luzynski
Implement alternative month names (v11) (4.74 KB, patch)
2018-01-13 10:39 IST, Rafal Luzynski
Regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v11) (3.33 KB, patch)
2018-01-13 10:42 IST, Rafal Luzynski
Also implement abbreviated alternative month names and %Ob (v11) (4.38 KB, patch)
2018-01-13 10:43 IST, Rafal Luzynski
Again regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v11) (3.29 KB, patch)
2018-01-13 10:45 IST, Rafal Luzynski
Documentation to the above changes (v11) (2.54 KB, patch)
2018-01-13 10:49 IST, Rafal Luzynski
Alternative month names NLS data (Polish, v11) (608 bytes, patch)
2018-01-13 10:51 IST, Rafal Luzynski
Implement alternative month names (v12) (5.04 KB, patch)
2018-01-13 10:54 IST, Rafal Luzynski
Regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v12) (3.47 KB, patch)
2018-01-13 10:55 IST, Rafal Luzynski
Also implement abbreviated alternative month names and %Ob (v12) (4.58 KB, patch)
2018-01-13 10:56 IST, Rafal Luzynski
Again regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v12) (3.45 KB, patch)
2018-01-13 10:57 IST, Rafal Luzynski
Documentation to the above changes (v12) (2.55 KB, patch)
2018-01-13 10:57 IST, Rafal Luzynski
Alternative month names NLS data (Polish, v12) (963 bytes, patch)
2018-01-13 11:00 IST, Rafal Luzynski

Description UrmasD 2009-10-30 06:50:32 IST
In the current ru_RU locale, month names are capitalized and given in the nominative case. When this form is used to build a date with the %B specifier, the result is grammatically incorrect (for example, 30 Октябрь 2009). The correct form uses lowercase names in the genitive case (30 октября 2009).
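
The report can be reproduced with a few lines of C. The sketch below is only an illustration: the helper name format_report_date is made up for this example, and any locale name passed to it (for instance "ru_RU.UTF-8") is an assumption about what is installed on the test machine. If the requested locale is missing, the helper falls back to the "C" locale.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Format 30 October 2009 with FORMAT in the locale LOCALE_NAME.
   Hypothetical helper, not part of glibc.  Falls back to the "C"
   locale when LOCALE_NAME is not installed.  Returns the number of
   bytes written, or 0 if the result did not fit.  */
size_t format_report_date(const char *locale_name, const char *format,
                          char *buf, size_t size)
{
    assert(locale_name != NULL && format != NULL);
    assert(buf != NULL && size > 0);

    struct tm tm;
    memset(&tm, 0, sizeof tm);
    tm.tm_mday = 30;
    tm.tm_mon = 9;              /* months are 0-based: 9 is October */
    tm.tm_year = 2009 - 1900;   /* years are counted from 1900 */

    if (setlocale(LC_TIME, locale_name) == NULL)
        setlocale(LC_TIME, "C");

    return strftime(buf, size, format, &tm);
}
```

Calling format_report_date("C", "%d %B %Y", buf, sizeof buf) fills buf with "30 October 2009". With an installed Russian locale and the locale data described in this report, the same call would be expected to show the month name in the nominative form.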
Comment 1 Andreas Schwab 2009-10-30 09:26:00 IST
*** Bug 10872 has been marked as a duplicate of this bug. ***
Comment 2 Ulrich Drepper 2010-04-09 03:02:29 IST
You'll have to provide a complete patch.
Comment 3 van.de.bugger 2011-02-27 22:43:44 IST
(In reply to comment #2)
> You'll have to provide a complete patch.

I could provide a patch, but there is a problem in the specification. Neither "man 5 locale" nor "man date" specifies the grammatical case of the name. It is not clear whether it should be nominative, genitive, or some other case.

For example, "date +'%d %B %Y'" produces "28 Февраль 2011", which looks incorrect (the first letter should be lowercase, not capital, and the form should be genitive). However, "date +'%B, %d-e'" produces "Февраль, 28-e", which looks correct. "date +'%B'" produces just the month name, "Февраль". I cannot say whether that is correct without context.

The only clue could be the `d_fmt' and `date_fmt' values in the locale definition file (although the latter is not described in "man 5 locale"). It looks like a date string produced with `date_fmt' should be correct, which means that in the Russian locale the month name (%B) should be in the genitive case. But that may break some existing programs which expect the month name in the nominative case...

The solution could be extending the `mon' array. For example, it could contain 12, 24, or even more elements. Element #1 is the name of January in the nominative case, element #13 is the name of January in the genitive case, #25 is the name of January in some third case, etc. `%B' is the name of the month in the nominative case, `%1B' is the same as `%B', `%2B' is the name of the month in the genitive case, `%3B' is the name of the month in the "3rd" case, etc.

Alternatively, it could be a multidimensional array, so in the construct `%nB' n selects the proper dimension.
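
The flat layout can be sketched as an index computation. This is only an illustration of the idea above, not glibc code: the names mon_ext and month_name are hypothetical, and the Russian strings are there just to make the two blocks of twelve visible.

```c
#include <assert.h>
#include <stddef.h>

enum { MONTHS_PER_YEAR = 12, CASE_COUNT = 2 };

/* Hypothetical extended array: elements #1..#12 hold the nominative
   names, elements #13..#24 the genitive names (1-based numbering as in
   the comment above; the C indices are one less).  */
const char *const mon_ext[MONTHS_PER_YEAR * CASE_COUNT] = {
    "январь", "февраль", "март", "апрель", "май", "июнь",
    "июль", "август", "сентябрь", "октябрь", "ноябрь", "декабрь",
    "января", "февраля", "марта", "апреля", "мая", "июня",
    "июля", "августа", "сентября", "октября", "ноября", "декабря",
};

/* MONTH is 1..12.  GRAMMATICAL_CASE is the n of the proposed `%nB'
   (1 = nominative, 2 = genitive).  Returns NULL for a case the array
   does not provide.  */
const char *month_name(int month, int grammatical_case)
{
    assert(month >= 1 && month <= MONTHS_PER_YEAR);
    if (grammatical_case < 1 || grammatical_case > CASE_COUNT)
        return NULL;
    return mon_ext[(grammatical_case - 1) * MONTHS_PER_YEAR + (month - 1)];
}
```

Under these assumptions month_name(10, 2) selects element #22, the genitive name of October. The multidimensional variant would store the same data as mon_ext[case][month] and drop the multiplication.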

Any thoughts on that?
Comment 4 a.m.suharev 2012-11-18 07:28:57 IST
(In reply to comment #3)
> (In reply to comment #2)
> > You'll have to provide a complete patch.
> 
> I could provide a patch, but there is a problem in specification. Neither "man
> 5 locale" nor "man date" specify the case of name. It is not clear whether it
> should be nominative, genitive or some another case.

If it is not specified, we should follow the language's rules. If it were specified, it would have to be specified in accordance with the language's rules. So let me, as a native speaker, suggest a solution.

Both months and weekdays should be in lower case. The correct form of a date is "28 февраля" (the genitive). If you add a weekday it is "понедельник, 28 февраля" (nominative for the weekday, genitive for the month). A year, if necessary, should follow the day and month, optionally followed by the word "года" or the abbreviation "г.", like this: "понедельник, 28 февраля 2011 года".

> The solution could be extending `mon' array. For example, it could contain 12,
> 24, or even more elements. Element #1 is name of January in nominative case,
> element #13 is name of January in genitive case, #25 is the name of January in
> some 3rd case, etc. `%B' is name of month in nominative case, `%1B' is the same
> as `%B', `%2B' is the name of months in genitive case, `%3B' is the name of
> months in the "3rd" case, etc.
> 
> Alternatively, it could be multidimensional array, so in construct `%nB' n
> selects the proper dimension.
> 
> Any thoughts on that?

I think one should definitely extend the "mon" array. However, a better idea would be to store not the cases (nominative, genitive, etc.) but the contexts (the month's name "as-is", the month's name "when used in a date", etc.). The array could be extensible.
Comment 5 Dmitry V. Levin 2012-11-18 12:20:56 IST
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > You'll have to provide a complete patch.
> > 
> > I could provide a patch, but there is a problem in specification. Neither "man
> > 5 locale" nor "man date" specify the case of name. It is not clear whether it
> > should be nominative, genitive or some another case.
> 
> If it is not specified, we should follow the language's rules.

And the language rules require the nominative, the genitive, or some other case depending on context.

> If it would be
> specified it has to be specified in accordance with the language's rules. So
> let me as a "native speaker" to suggest the solution.
> 
> Both months and weekdays should be in lower case.

It depends.

> The correct form of date
> should be "28 февраля" (the genitive).

Yes, but "февраль, 28-е" is also correct.

> > The solution could be extending `mon' array. For example, it could contain 12,
> > 24, or even more elements. Element #1 is name of January in nominative case,
> > element #13 is name of January in genitive case, #25 is the name of January in
> > some 3rd case, etc. `%B' is name of month in nominative case, `%1B' is the same
> > as `%B', `%2B' is the name of months in genitive case, `%3B' is the name of
> > months in the "3rd" case, etc.
> > 
> > Alternatively, it could be multidimensional array, so in construct `%nB' n
> > selects the proper dimension.
> > 
> > Any thoughts on that?
> 
> I think one should definitely extend the "mon" array. However the better idea
> would be to put there not the cases (nominative, genitive etc) but the contexts
> (month's name "as-is", month's name "when used in a date" etc). The array could
> be extensible.

Yes, keeping the "mon" array in a single case (no matter whether nominative, genitive, or another case) cannot fix the issue; it would just fix some language forms and break others. To fix the issue, an extension is necessary.
Comment 6 Ruslan Ivanyuk 2014-09-11 00:36:18 IST
The problem persists as of September 11, 2014, and concerns all Slavonic languages except Bulgarian and Macedonian (where declension is reduced). Among them are: Belarusian, Czech, Polish, Pomeranian/Kashubian, Russian, Rusyn, Slovak, Slovene, Serbo-Croatian (Croatian, Bosnian, Serbian, Montenegrin), Silesian, Upper and Lower Sorbian, and Ukrainian. The problem is not limited to Slavonic languages: other languages with noun declension are affected too. Finnish, for instance, is one extraordinary case that uses all three of the nominative, genitive, and partitive month names.

The CLDR manages to fix the issue by introducing several data arrays to serve that specific purpose, both full and abbreviated versions (in case one needs those). In addition, it has correctly abbreviated weekday names. Unfortunately, I have no programming skills whatsoever to submit code that solves anything.
Comment 7 Rafal Luzynski 2015-10-28 02:57:06 IST
I'll be happy to provide a complete solution for this problem but some API design questions must be answered first.

Please note that CLDR mentions only a "standalone" version of the month name, which is probably always nominative, and a "format" version, which may be the same as "standalone" (e.g., in English) but may be genitive in some languages or yet another case in others. For simplicity I will refer to these forms as nominative/genitive, keeping in mind that CLDR refers to them as standalone/format. There may also be languages which use forms other than nominative/genitive, but I think there are probably always at most two forms, since CLDR has decided to consider only two.


I. strftime() - http://linux.die.net/man/3/strftime

This function supports only one format which provides the full month name: %B. At the moment there is no way for this function to provide multiple forms of the full month name. Here are the API designs which would provide a full month name:

1. Do not change the API, implement an internal algorithm which would analyze a full format string and determine whether %B should format the month name in a nominative or genitive case. The simplest algorithm would check if %d or %e conversion specifiers are also present in the same format string, retrieve a genitive case if they are, nominative otherwise. More advanced version could check if the day and month conversion specifiers are adjacent, if they are separated with other conversion specifiers, with space/punctuation/other characters, if there are other letters concatenated with %B (which would mean that the caller already tries to provide a workaround for this bug), if the day/month order is correct (this is true only if day/month order is correct and month/day order is incorrect in all these languages).

Pros:
- once implemented correctly it will automagically fix all affected applications,
- even if the implementation will not be perfect for some languages the result will not be worse than the one currently existing: it will not break any currently correct application,
- if it turns out that this solution is completely wrong it will be easy to revert it and provide another one because we don't change the API.

Cons:
- may be difficult to implement,
- it is questionable if a perfect algorithm exists for all affected languages, even if we check it for all languages mentioned in the comment 6 there may be other languages which we don't know about and which also require the nominative/genitive case but use different rules,
- it is questionable how to handle format strings which are incorrect from a grammatical point of view: please note that the strftime() API does not and should not say that there are illegal combinations of the conversion specifiers.
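
The "simplest algorithm" from option 1 can be sketched as a scan of the format string. This is a minimal illustration under stated assumptions, not the attached patch: the function name format_has_day_of_month is hypothetical, and the scan only skips the GNU flags, an optional width, and the E/O modifiers before looking at the conversion letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true when FORMAT contains a %d or %e conversion.  Under the
   simplest heuristic this would mean "full date", so %B would expand to
   the genitive form; otherwise it would stay nominative.  */
bool format_has_day_of_month(const char *format)
{
    assert(format != NULL);

    for (const char *p = format; *p != '\0'; ++p) {
        if (*p != '%')
            continue;
        ++p;
        /* Skip GNU flags, then an optional width, then E or O.  */
        while (*p != '\0' && strchr("_-0^#", *p) != NULL)
            ++p;
        while (*p >= '0' && *p <= '9')
            ++p;
        if (*p == 'E' || *p == 'O')
            ++p;
        if (*p == '\0')
            break;              /* dangling '%' at the end */
        if (*p == 'd' || *p == 'e')
            return true;
        /* Any other letter, including the second '%' of "%%", is
           consumed here so it is not mistaken for a new conversion.  */
    }
    return false;
}
```

With this sketch "%d %B %Y" and "%B, %e" count as full dates, while "%B %Y" and "100%%d %B" do not. The more advanced checks listed above (adjacency, separators, day/month order) are deliberately left out.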

2. Follow the specification already used in the *BSD family (which also includes OS X and iOS): https://www.freebsd.org/cgi/man.cgi?query=strftime&sektion=3. They implement the %OB conversion specifier, which retrieves the nominative case, while the %B specifier retrieves the genitive case (sic!).

Pros:
- full portability between glibc and *BSD,
- simple and deterministic implementation,
- full programmer's control on whether they want a nominative or genitive case,
- will automagically fix all dates using %B conversion specifiers and displaying the nominative case which is incorrect (full dates).

Cons:
- at the same time will break formatting of all dates using %B conversion specifiers where the nominative case is required and is correctly provided now, the application developer may not even be aware that the application became broken in some languages,
- therefore will require urgent intervention from some application developers,
- it will be difficult or even impossible to provide a backward compatible solution which would detect if the current runtime version of glibc requires %OB or %B for the month name in nominative case,
- one may question if the *BSD decision to retrieve a genitive case from %B is correct since it causes so much trouble.

3. Mimic the *BSD specification but implement it conversely: let %B retrieve the nominative case (as it currently does) and let the new %OB specifier retrieve the genitive case. See also: http://austingroupbugs.net/view.php?id=258, where this solution seems to have been accepted.

Pros:
- simple and deterministic implementation,
- full programmer's control on whether they want a nominative or genitive case,
- full backward compatibility,
- will not break any existing application.

Cons:
- portability with *BSD family will never be possible (format specifiers war),
- will require intervention from the application developers, but it will not be urgent because it will apply only to the cases where they use %B explicitly and this is already incorrect.

I would choose the first solution: not to change the API and try to provide a smart algorithm which would determine if the month name retrieved by %B should be nominative or genitive but I will listen to your opinion.


II. nl_langinfo() - http://linux.die.net/man/3/nl_langinfo

Although strftime() does not call nl_langinfo() directly both these functions use the same backend database. We will need the new constants to be defined in langinfo.h, for example ALTMON_{1-12} and their wide-character equivalents _NL_WALTMON_{1-12}. This means it will affect the API of nl_langinfo() by adding new valid argument values. Please note that I am talking in the context of https://bugzilla.gnome.org/show_bug.cgi?id=749206 and the implementation of g_date_time_printf() does call nl_langinfo() to retrieve the month names. I hope it is valid to add these new symbols after _NL_TIME_CODESET and name them ALTMON_{1-12} and _NL_WALTMON_{1-12}.
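
From the caller's side, the lookup described above could look like the sketch below. It is only an illustration: ALTMON_1 through ALTMON_12 are the names proposed in this comment, so the code uses them only when the installed <langinfo.h> defines them and otherwise falls back to MON_1 through MON_12. The helper names regular_month_name and alternative_month_name are hypothetical. Which grammatical form the alternative names carry depends on the design chosen in this discussion; the sketch shows the lookup only.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* extension items may be hidden without this */
#endif
#include <assert.h>
#include <langinfo.h>

/* Tables of items, so the code does not rely on the constants having
   consecutive values.  */
static const nl_item regular_items[12] = {
    MON_1, MON_2, MON_3, MON_4, MON_5, MON_6,
    MON_7, MON_8, MON_9, MON_10, MON_11, MON_12,
};

#ifdef ALTMON_1
static const nl_item alternative_items[12] = {
    ALTMON_1, ALTMON_2, ALTMON_3, ALTMON_4, ALTMON_5, ALTMON_6,
    ALTMON_7, ALTMON_8, ALTMON_9, ALTMON_10, ALTMON_11, ALTMON_12,
};
#endif

/* MONTH is 1..12.  Returns the regular full month name.  */
const char *regular_month_name(int month)
{
    assert(month >= 1 && month <= 12);
    return nl_langinfo(regular_items[month - 1]);
}

/* MONTH is 1..12.  Returns the alternative full month name when the
   library provides one, the regular name otherwise.  */
const char *alternative_month_name(int month)
{
    assert(month >= 1 && month <= 12);
#ifdef ALTMON_1
    return nl_langinfo(alternative_items[month - 1]);
#else
    return regular_month_name(month);
#endif
}
```

A caller such as a date formatter would pick one of the two helpers depending on whether the month name appears alone or inside a full date. In a locale with a single form, such as "C", both helpers return the same string.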
Comment 8 Piotr Drąg 2015-10-28 21:04:26 IST
*** Bug 15606 has been marked as a duplicate of this bug. ***
Comment 9 Rafal Luzynski 2015-11-19 01:42:12 IST
Created attachment 8795 [details]
Proposed solution: alternative month names + smart algorithm

Good news: here is a working patch so please check, test, review, comment. It implements the solution with "%B" format specifier smartly expanding into a nominative or genitive (alternative) case depending on the current context. It hopefully does not break any existing solution and is fully backward compatible.

I have also provided the new locales for Polish, Russian and Ukrainian. I have checked this with CLDR data but feel free to drop the changes to these files if you think that other people should provide them. Please note that without the changed locales you would not see the effect of the code change. For Russian I have also changed the nominative month names to lowercase because some people requested it here. For Ukrainian I have removed the "alternative digits" hack and used the already provided genitive month names as the alternative names.

There is also some bad news:

1. The coreutils package (date, du) provides its own function fprintftime() which tries to bypass strftime() as much as possible and if it calls strftime() it retrieves only the month name so this patch will be unable to detect the "full date" context and provide the correct genitive form. This means that /usr/bin/date is not a good tool to test this patch. Perhaps it needs a patch similar to this one.

2. I had to introduce the new locale field day_month_order which will be available as nl_langinfo(_NL_DAY_MONTH_ORDER) and will determine if:
- only a day number before a month name forces the genitive month name (1),
- only a day number after a month name forces the genitive month name (3),
- both above orders force the genitive month name or this does not matter because the current language does not provide the alternative month names (2, default value when not specified).
Unfortunately, there is no universal rule for all languages. As stated in comment 5, Russian requires the genitive case if the day number comes before the month name but the nominative case when the order is reversed; both orders are correct. Other languages require the day number to come before the month name and do not allow reversing them, but we should not crash on grammatical errors; instead we should produce the result closest to the requirement. On the other hand, Lithuanian requires the month name to come before the day number and to be in the genitive case.
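
The three day_month_order values can be read as a small decision function. The sketch below is an assumption about how such a rule might be applied, not the code from the attachment: the enum and function names are hypothetical, only the first day conversion (%d or %e) and the first %B in the format string are considered, and a format without a day conversion is treated as not requiring the genitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Values as described above; 2 is the default when not specified.  */
enum day_month_order {
    DAY_BEFORE_MONTH = 1,   /* only "day month" forces the genitive */
    ANY_ORDER = 2,          /* both orders force the genitive */
    DAY_AFTER_MONTH = 3     /* only "month day" forces the genitive */
};

/* Offset of the first conversion in FORMAT whose letter is one of
   LETTERS, or -1 if there is none.  Flags, width and E/O are skipped.  */
static ptrdiff_t find_conversion(const char *format, const char *letters)
{
    for (const char *p = format; *p != '\0'; ++p) {
        if (*p != '%')
            continue;
        const char *start = p++;
        while (*p != '\0' && strchr("_-0^#", *p) != NULL)
            ++p;
        while (*p >= '0' && *p <= '9')
            ++p;
        if (*p == 'E' || *p == 'O')
            ++p;
        if (*p == '\0')
            break;
        if (strchr(letters, *p) != NULL)
            return start - format;
    }
    return -1;
}

/* Decide whether %B in FORMAT should expand to the genitive form.  */
bool month_is_genitive(enum day_month_order order, const char *format)
{
    assert(format != NULL);

    ptrdiff_t day = find_conversion(format, "de");
    ptrdiff_t month = find_conversion(format, "B");
    if (day < 0 || month < 0)
        return false;       /* no full date context */

    switch (order) {
    case DAY_BEFORE_MONTH: return day < month;
    case DAY_AFTER_MONTH:  return day > month;
    case ANY_ORDER:        return true;
    }
    return false;
}
```

With order 1, as described for Russian, "%d %B %Y" selects the genitive and "%B, %d" keeps the nominative. With order 3, as described for Lithuanian, "%Y %B %e" selects the genitive.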

3. There are more languages which suffer from this bug. It is possible to retrieve the correct alternative month names from CLDR but I am not sure if this task should be a part of this bugfix or should be left to the native translators. I can provide more of them but providing all sounds like a horrible task.

4. I am not sure if you like the idea of "%B" to be smartly converted to the correct form of the month name. I will appreciate your feedback.
Comment 10 Piotr Drąg 2015-11-20 17:18:13 IST
I think you need to send the patch to libc-alpha: https://sourceware.org/glibc/wiki/Contribution%20checklist
Comment 11 Rafal Luzynski 2015-11-26 23:11:51 IST
OK, sent: https://sourceware.org/ml/libc-alpha/2015-11/msg00594.html
Comment 12 Rafal Luzynski 2016-01-05 02:17:51 IST
Created attachment 8874 [details]
Proposed solution: support alternative month names

The previous patch could be considered too long and too difficult to review so here I split it into 7 parts which can be reviewed and accepted or rejected individually. This first part just adds the ALTMON_... constants.

I think this change is not questionable except the question whether these constants should be considered public or private. I suggest public.
Comment 13 Rafal Luzynski 2016-01-05 02:23:33 IST
Created attachment 8875 [details]
Add tests for the alternative month names

The only reason why this part has been split out into a separate patch is that the file tst-langinfo.c belongs to the localedata directory, which has its own ChangeLog. Feel free to merge this patch with the previous one if you think that splitting it is a bad idea.
Comment 14 Rafal Luzynski 2016-01-05 02:31:07 IST
Created attachment 8876 [details]
Proposed solution: support day month order

If we are going to implement a smart algorithm which determines whether "%B" should generate a nominative or genitive form we need also a new parameter which would determine if month should be genitive if it appears after a day, or before a day, or in both cases, or it does not matter. Feel free to merge this patch with the previous ones if you think this idea is OK. Feel free to reject this patch if you think that we should implement the "%OB" format specifier which would select the genitive case explicitly. See also comment 7 for more info.
Comment 15 Rafal Luzynski 2016-01-05 02:36:25 IST
Created attachment 8877 [details]
Proposed solution: smart algorithm choosing the nominative/genitive month name

This is the main part: the implementation of the smart algorithm in strftime() which would select the correct form of the month name for the "%B" format specifier. Note that this patch requires the previous ones, also it requires some NLS data from the following patches.
Comment 16 Rafal Luzynski 2016-01-05 02:43:25 IST
Created attachment 8878 [details]
Alternative month names NLS data (Polish)

Here are the genitive month names in Polish. Please note that this is my native language so I can guarantee the correctness. Feel free to use this patch only for local tests and reject it from the public repository if you don't trust me. However, please note that you need some NLS data with the alternative month names, otherwise you will not see any effect of the previous patches.
Comment 17 Rafal Luzynski 2016-01-05 02:50:51 IST
Created attachment 8879 [details]
Alternative month names NLS data (Russian)

Here are the genitive month names in Russian. The names are taken from CLDR database: http://st.unicode.org/cldr-apps/v#/ru/Gregorian/677bb4a72253df63 Also I have changed the nominative month names to lowercase as suggested in comment 0 and comment 4. Please note that you don't have to trust my knowledge of Russian, feel free to take this patch for your local tests only and reject from the public repository.
Comment 18 Rafal Luzynski 2016-01-05 02:55:47 IST
Created attachment 8880 [details]
Alternative month names NLS data (Ukrainian)

Here are the genitive month names in Ukrainian. The alternative month names had been already present so I have only changed the alt_digits label to alt_mon and removed all remains of the alt_digits hack. However, please note that you absolutely should not trust my knowledge of Ukrainian language. Feel free to use this patch for local tests only and reject it from the public repository.

This is the end of this series of patches. I'll appreciate your reviews.
Comment 19 van.de.bugger 2016-01-13 19:07:10 IST
(In reply to Rafal Luzynski from comment #7)

Thanks for a good analysis. However, I do not agree with your approach. 

> 1. Do not change the API, implement an internal algorithm which would
> analyze a full format string and determine whether %B should format the
> month name in a nominative or genitive case. 

This is a bad idea. You leave the programmer with no control over the result. For example: someone used a format string like "This is %B" (I used English, but imagine another language, like Polish). Later he decided to include the day number: "This is %B, today is day #%e." Since %B remains intact, the programmer would assume the first part of the message remains intact too, but the "artificial intelligence" may have its own opinion and change the case of the month name depending on the presence or absence of %e or whatever else. To me this is not acceptable.
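To make the objection concrete, here is a minimal sketch of the kind of heuristic being criticised. It is not the code from the attached patch: the names format_has_day and pick_case are hypothetical, and the rule "genitive whenever %d or %e appears anywhere in the format" is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum month_case { CASE_NOMINATIVE, CASE_GENITIVE };

/* Hypothetical helper: true if the format contains a day-of-month
   specifier (%d or %e), optionally preceded by an E or O modifier.
   Flags and field widths are ignored to keep the sketch short. */
static bool format_has_day(const char *fmt)
{
    assert(fmt != NULL);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;                        /* character after '%' */
        if (*p == 'E' || *p == 'O')
            p++;                    /* skip the modifier */
        if (*p == '\0')
            break;                  /* dangling '%' at end of string */
        if (*p == 'd' || *p == 'e')
            return true;
    }
    return false;
}

/* Assumed rule: a day number anywhere in the format switches %B
   to the genitive case, even if %B itself was not touched. */
static enum month_case pick_case(const char *fmt)
{
    return format_has_day(fmt) ? CASE_GENITIVE : CASE_NOMINATIVE;
}
```

With such a rule, pick_case("This is %B") yields the nominative, while pick_case("This is %B, today is day #%e.") yields the genitive, although the "%B" part of the string was not edited.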

Also, month name is not the only name in strftime. Look at %A — it's weekday name, which also has multiple forms/cases.

> 2. Follow the specification already used in *BSD family (which also includes
> OS X and iOS): https://www.freebsd.org/cgi/man.cgi?query=strftime&sektion=3.

This is not *BSD but POSIX:

http://pubs.opengroup.org/onlinepubs/009695399/functions/strftime.html

The description of strftime is not very clear about what "alternative representation" and "alternative numeric symbols" mean, but 

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html

adds some details. Neither %O nor %E is meant for changing the case of a month name.

Thus, I will propose an extension, dedicated for the problem. Current conversion specifier looks like:

"%" flag width [ modifier ] format

where "%" is a literal string, flag is one of the GNU flags "-", "_", "0", "^" (see "info strftime"), width is a positive integer, modifier is either "E" or "O", and format is one of the documented letters (a, b, B, c, C, ...).

Extension may utilize syntax of "precision" used in "printf", e. g.:

"%" flag width [ "." precision ] [ modifier ] format

in "strftime" precision = grammatical case number. In the case of Russian, "precision" would be in the range 1..6 (since Russian nouns have 6 cases):

%.1B — month name in nominative case
%.2B — month name in accusative case
%.3B — month name in genitive case
%.4B — month name in dative case
%.5B — month name in instrumental case
%.6B — month name in prepositional case

Note that this is universal and can be used with week day names too:

%.1A, %.2A, etc.

"Precision" zero (".0") is used as default (when case is not specified) and as fallback (when specified "precision" is out of range for current locale).

"Precision" can be combined with width: %20.1B.

In such a case strftime maintains compatibility — behaviour of existing programs is not affected, but in new (versions of the) programs developers are free to specify case of names precisely.
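The proposed grammar can be sketched as a small parser. This is only an illustration of the syntax described in this comment, not glibc code: struct conv_spec, parse_conv and effective_case are hypothetical names, and "%%" is deliberately not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed form of: "%" flag width [ "." precision ] [ modifier ] format */
struct conv_spec {
    char flag;       /* '-', '_', '0', '^', or 0 if absent */
    int  width;      /* 0 if absent */
    int  gram_case;  /* the "precision"; 0 means default */
    char modifier;   /* 'E', 'O', or 0 if absent */
    char format;     /* conversion letter, e.g. 'B' or 'A' */
};

/* Parses one specifier starting at s, which must point at '%'.
   Returns the number of characters consumed, or 0 if malformed. */
static size_t parse_conv(const char *s, struct conv_spec *out)
{
    const char *p = s;

    assert(s != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    if (*p++ != '%')
        return 0;
    if (*p != '\0' && strchr("-_0^", *p) != NULL)
        out->flag = *p++;
    while (isdigit((unsigned char)*p))
        out->width = out->width * 10 + (*p++ - '0');
    if (*p == '.') {
        p++;
        if (!isdigit((unsigned char)*p))
            return 0;               /* "." must be followed by a number */
        while (isdigit((unsigned char)*p))
            out->gram_case = out->gram_case * 10 + (*p++ - '0');
    }
    if (*p == 'E' || *p == 'O')
        out->modifier = *p++;
    if (!isalpha((unsigned char)*p))
        return 0;
    out->format = *p++;
    return (size_t)(p - s);
}

/* Fallback rule from the proposal: a case number that is out of
   range for the current locale behaves like ".0" (the default). */
static int effective_case(int requested, int cases_in_locale)
{
    return (requested >= 1 && requested <= cases_in_locale) ? requested : 0;
}
```

For example, "%20.1B" parses to width 20, case 1, format 'B', and a request for case 7 in a locale with 6 cases falls back to 0.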
Comment 20 Mike Frysinger 2016-01-13 23:49:01 IST
(In reply to van.de.bugger from comment #19)

i think you might have missed some of the things Rafal said.  he is correct when he said BSD because they already have a format specifier for this: %OB.  POSIX does *not* support this today (see the linked POSIX bug report for more details) which leads us to the multiple choices he outlined in comment #7.

i don't think we should create yet another standard -- either we use what BSD already has, or we wait for POSIX to come up with one (and we go with his suggestion for automatic detection in the mean time since it doesn't break ABI).  if you want to extend the standard, please post to the POSIX lists.
Comment 21 keld@keldix.com 2016-01-14 12:39:13 IST
On Wed, Jan 13, 2016 at 11:49:01PM +0000, vapier at gentoo dot org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=10871
> 
> --- Comment #20 from Mike Frysinger <vapier at gentoo dot org> ---
> (In reply to van.de.bugger from comment #19)
> 
> i think you might have missed some of the things Rafal said.  he is correct
> when he said BSD because they already have a format specifier for this: %OB. 
> POSIX does *not* support this today (see the linked POSIX bug report for more
> details) which leads us to the multiple choices he outlined in comment #7.
> 
> i don't think we should create yet another standard -- either we use what BSD
> already has, or we wait for POSIX to come up with one (and we go with his
> suggestion for automatic detection in the mean time since it doesn't break
> ABI).  if you want to extend the standard, please post to the POSIX lists.

We follow ISO TR 30112, and not POSIX in glibc. POSIX is quite outdated
wrt. i18n, and we cannot conform to POSIX's limited i18n capability.

But we can work on ISO TR 30112 which is under revision.

best regards
keld
Comment 22 Mike Frysinger 2016-01-14 12:51:04 IST
(In reply to keld@keldix.com from comment #21)

running s/POSIX/ISO TR 30112/ on my comment doesn't change my point.  we should not be inventing our own new behavior here when there's long-standing precedent.
Comment 23 keld@keldix.com 2016-01-14 13:32:00 IST
On Thu, Jan 14, 2016 at 12:51:04PM +0000, vapier at gentoo dot org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=10871
> 
> --- Comment #22 from Mike Frysinger <vapier at gentoo dot org> ---
> (In reply to keld@keldix.com from comment #21)
> 
> running s/POSIX/ISO TR 30112/ on my comment doesn't change my point.  we should
> not be inventing our own new behavior here when there's long-standing
> precedent.

If the argument is that BSD already has this feature and we should follow their
specs, then I agree, and then I propose that we add this to 30112.

best regards
keld
Comment 24 Mike Frysinger 2016-01-14 14:41:57 IST
(In reply to keld@keldix.com from comment #23)

please read comment #7 in full
Comment 25 van.de.bugger 2016-01-17 00:22:38 IST
(In reply to Mike Frysinger from comment #20)

> i think you might have missed some of the things Rafal said.  he is correct
> when he said BSD because they already have a format specifier for this: %OB.
> POSIX does *not* support this today (see the linked POSIX bug report for
> more details) which leads us to the multiple choices he outlined in comment
> #7.

1. I do not understand what do you mean by "POSIX does *not* support this today". POSIX is a standard, not an implementation, and O and E modifiers are described in the standard, see http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html 

2. By this description, O modifier is applicable to d, e, H, I, m, M, S, u, U, V, w, W, and y. OB is BSD-specific *extension* to POSIX.

3. %OB is not just BSD extension, it is also ugly hack which likely does not solve the problem. BSD offers only two cases for month name — %B and %OB, while in Russian language a noun has six cases, in Finnish — more than dozen. I am not a linguist and not aware about other languages, but it is obviously 2 cases cannot cover all the needs.

> i don't think we should create yet another standard -- either we use what
> BSD already has, or we wait for POSIX to come up with one (and we go with
> his suggestion for automatic detection in the mean time since it doesn't
> break ABI).  if you want to extend the standard, please post to the POSIX
> lists.

GNU libc has A LOT of extensions. It is not clear why you decided to stop adding extensions now.

Automatic detection does not break ABI, it breaks behavior of existing programs.
Comment 26 van.de.bugger 2016-01-17 00:27:56 IST
(In reply to keld@keldix.com from comment #21)
> We follow ISO TR 30112, and not POSIX in glibc. POSIX is quite outdated
> wrt. i18n, and we cannot conform to POSIX's limited i18n capability.
> 
> But we can work on ISO TR 30112 which is under revision.

Sorry, I cannot afford to pay CHF 198 for ISO TR 30112 at ISO official site:

http://www.iso.org/iso/catalogue_detail.htm?csnumber=53232

If there is a publicly available revision of ISO TR 30112, please give me a link.
Comment 27 van.de.bugger 2016-01-17 01:06:08 IST
(In reply to van.de.bugger from comment #25)
> ...it is obviously 2 cases cannot cover all the needs.

Just an example:

Imagine a program which (for entertainment purposes) reminds you of famous events that occurred in the same month, but (some|few|many) years ago. In English it will probably be "In January: ...", or, as a strftime format: "In %B: ...".

Let us translate it to Russian: "В январе: ...". Note: the month name in this example is in neither the nominative case (январь) nor the genitive case (января), so neither "В %B: ..." nor "В %OB: ..." (the proposed hack for the genitive case) is suitable.

***

I do not insist on format proposed by me. My point is that new implementation should be able to produce any explicitly requested case of name (either month or weekday, it seems there are no other names in this area). If you do not like "precision" notation (borrowed from printf formats) — not a problem, just propose another.
Comment 28 keld@keldix.com 2016-01-17 02:11:55 IST
On Sun, Jan 17, 2016 at 12:27:56AM +0000, van.de.bugger at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=10871
> 
> --- Comment #26 from van.de.bugger at gmail dot com ---
> (In reply to keld@keldix.com from comment #21)
> > We follow ISO TR 30112, and not POSIX in glibc. POSIX is quite outdated
> > wrt. i18n, and we cannot conform to POSIX's limited i18n capability.
> > 
> > But we can work on ISO TR 30112 which is under revision.
> 
> Sorry, I cannot afford to pay CHF 198 for ISO TR 30112 at ISO official site:
> 
> http://www.iso.org/iso/catalogue_detail.htm?csnumber=53232
> 
> If there is a publicly available revision of ISO TR 30112, please give me a
> link.

http://www.open-std.org/JTC1/SC35/WG5/docs/30112d10.pdf
It should also be available from the SC35 documents site.

best regards
Keld
Comment 29 keld@keldix.com 2016-01-17 02:22:50 IST
On Sun, Jan 17, 2016 at 12:22:38AM +0000, van.de.bugger at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=10871
> 
> --- Comment #25 from van.de.bugger at gmail dot com ---
> (In reply to Mike Frysinger from comment #20)
> 
> > i think you might have missed some of the things Rafal said.  he is correct
> > when he said BSD because they already have a format specifier for this: %OB.
> > POSIX does *not* support this today (see the linked POSIX bug report for
> > more details) which leads us to the multiple choices he outlined in comment
> > #7.
> 
> 1. I do not understand what do you mean by "POSIX does *not* support this
> today". POSIX is a standard, not an implementation, and O and E modifiers are
> described in the standard, see
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html 
> 
> 2. By this description, O modifier is applicable to d, e, H, I, m, M, S, u, U,
> V, w, W, and y. OB is BSD-specific *extension* to POSIX.
> 
> 3. %OB is not just BSD extension, it is also ugly hack which likely does not
> solve the problem. BSD offers only two cases for month name — %B and %OB, while
> in Russian language a noun has six cases, in Finnish — more than dozen. I am
> not a linguist and not aware about other languages, but it is obviously 2 cases
> cannot cover all the needs.

Well, can we see what is the extent of the problem?
Russian, Polish, possibly all of the Slavic languages, Finnish,
Hungarian, Estonian?

One could also list all the grammatical cases,  number them and make a notation
in the format specifiers, but if the already implemented and specified
solution solves the problem for all languages with this problem, then
an extended solution could be overkill.

Best regards
Keld
Comment 30 Kalle Olavi Niemitalo 2016-01-17 07:38:52 IST
In Finnish, all the month names end with "kuu" (moon) in the singular nominative, and you can form the genitive, partitive, or inessive by appending "n", "ta", or "ssa": tammikuu, tammikuun, tammikuuta, tammikuussa.  So I think we can largely cover text output just by localizing the strftime format string.

I think we'd write

tammikuu = January (nominative)
tammikuun 17. päivä = 17th day of January (genitive)
tammikuun 17. päivänä = on the 17th day of January (genitive)
tammikuun 17. päivänä 2016 = on the 17th day of January (genitive) 2016
17. tammikuuta = 17 January (partitive)
17. tammikuuta 2016 = 17 January (partitive) 2016
tammikuussa 2016 = in January (inessive) 2016

("1. tammikuu" seems like it would mean the January of the first year, as in, it snowed this much in the first January but not so much a year later.)

In plural forms though, the "kuu" ending often has to be modified: tammikuut (nominative), tammikuiden (genitive), tammikuita (partitive), tammikuissa (inessive). But I don't see these being used in dates.

There are forms like sunnuntaisin = on Sundays, talvisin = in wintertime. One could likewise say tammikuisin but this seems uncommon. Perhaps that form could be used in a web application where the user selects whether they want to be sent an invoice every January or every February.
Comment 31 van.de.bugger 2016-01-17 14:03:00 IST
(In reply to keld@keldix.com from comment #29)

Thanks for the link. I didn't read the entire spec, but the LC_TIME section looks very similar to POSIX. The only additions I found are first_weekday, first_workday, cal_direction, and timezone; the rest looks the same.

After looking through ISO/IEC 30112 and studying existing locale definitions in glibc, it is clear to me that O modifier is intended for writing numbers in traditional style. Some languages (Japanese, Persian, Oriya, Burmese to name a few) have traditional symbols for digits and numbers. So O modifier is applicable to numeric values only, and using %OB is a non-standard BSD hack.

BTW, I found a dirty hack in glibc: the uk_UA locale has *month names* in the genitive case in its alt_digits table. This lets them get the month name in the genitive case via %Om, but applying the O modifier to any other format specifier (like %Od, %Of, %OH) is meaningless: %OI converts the hour to a month name in the genitive case... 
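A rough model of why the hack leaks into other specifiers: alt_digits is a table indexed by a numeric value, and every %O conversion looks its number up in the same table. The sketch below is an assumption-laden simulation, not the glibc implementation; alt_digit, fmt_Om and fmt_OI are hypothetical helpers, and the table contents (a "0" entry followed by the Ukrainian genitive month names) are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Illustrative alt_digits table: index 0 is "0", indices 1..12 hold
   genitive month names instead of alternative digits. */
static const char *const alt_digits[] = {
    "0",
    "січня", "лютого", "березня", "квітня", "травня", "червня",
    "липня", "серпня", "вересня", "жовтня", "листопада", "грудня",
};

/* Looks up the alternative symbol for a numeric value, or NULL
   if the table has no entry for it. */
static const char *alt_digit(int value)
{
    size_t count = sizeof alt_digits / sizeof alt_digits[0];
    if (value < 0 || (size_t)value >= count)
        return NULL;
    return alt_digits[value];
}

/* Models "%Om": month number 1..12 through the table (the intended use). */
static const char *fmt_Om(const struct tm *tm)
{
    assert(tm != NULL);
    return alt_digit(tm->tm_mon + 1);
}

/* Models "%OI": hour on the 12-hour clock through the SAME table. */
static const char *fmt_OI(const struct tm *tm)
{
    assert(tm != NULL);
    int hour12 = tm->tm_hour % 12;
    if (hour12 == 0)
        hour12 = 12;
    return alt_digit(hour12);
}
```

In this model, a time of 15:00 in January gives "січня" for %Om as intended, but %OI gives "березня" (the genitive of March) for the hour 3.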

You see: users want to have properly formatted dates and are ready to implement any dirty hack to get the result.

> Well, can we see what is the extent of the problem?
> Russian, Polish, possibly all of the Slavic languages, Finnish,
> Hungarian, Estonian?

All the Slavic languages for sure. Sorry, I am not aware about other languages.

> One could also list all the grammatical cases,  number them and make a
> notation
> in the format specifiers, but if the already implemented and specified
> solution solves the problem for all languages with this problem, then
> an extended solution could be overkill.

I already wrote that two cases (%B and %OB) do not solve the problem in general. 

However, I understand that two cases (nominative and genitive) cover the great majority of use cases (for the Russian language at least), and I know why: (1) all the demanding applications have had to implement date/time formatting routines in-house due to the lack of support in standard libraries; (2) fixing standard libraries takes ages: you see, this issue was opened 6 years ago, but the problem is still not solved.
Comment 32 Piotr Drąg 2016-01-17 14:38:42 IST
(In reply to van.de.bugger from comment #25)
> 3. %OB is not just BSD extension, it is also ugly hack which likely does not
> solve the problem. BSD offers only two cases for month name — %B and %OB,
> while in Russian language a noun has six cases, in Finnish — more than
> dozen. I am not a linguist and not aware about other languages, but it is
> obviously 2 cases cannot cover all the needs.
> 

I don't think glibc is intended to be used to construct natural sentences, in which case we would need support for every single case in every single language. I don't think that's realistic or needed.

Instead, there are two applications: to show time and date, and to show a calendar. Using just standalone and format versions of the month names (or nominative and genitive) covers both uses.

Extending it would amount to overengineering a solution to a non-existing problem, while Rafal's proposition elegantly resolves a real-world, severe problem.
Comment 33 van.de.bugger 2016-01-17 17:43:30 IST
(In reply to Piotr Drąg from comment #32)
> Instead, there are two applications: to show time and date, and to show a
> calendar. Using just standalone and format versions of the month names (or
> nominative and genitive) covers both uses.

In such a case we would need just two functions: show_time_and_date() and show_calendar(). Indeed we have strftime() with great flexibility: 40 format specifiers, two modifiers O and E, 4 flags _, -, 0, and ^ (btw, all four flags are GNU extensions).

> I don't think glibc is intended to be used to construct natural sentences,
> in which case we would need support for every single case in every single
> language. 

...just for 12 (names) + 7 (weekdays) = 19 words in a language. It's not a big deal, is it?

Also, there is no need to prepare extended tables for all languages at once. If table is not yet extended, strftime should fallback to existing (nominative case) name.

> I don't think that's realistic or needed.

I always love people telling me "this is not needed" when I need it.

I already mentioned an example. Look at uk_UA locale definition:

>   % Initially alt_digits was supposed to hold alternative symbols for _digits_,
>   % corresponding to %O modified conversion specification.
>   % Although in Ukrainian language alternate _names_ are used instead of digits.
>   % We'll use this keyword to present a list of month names in proper form for
>   % date, see mon.  (%Om)
>   %
>   % This hack is dedicated for months it won't work for other %O* modifiers
>   % (weeks, days etc).

People need it, they have implemented a hack for it because there is no way to implement it in current infrastructure. (I am a bit surprised that this obvious hack was accepted by upstream.)
Comment 34 Piotr Drąg 2016-01-17 18:10:48 IST
The uk_UA hack fixes the exact same problem that Rafal's patch is fixing: the lack of genitive month names. Could we please focus on that?
Comment 35 van.de.bugger 2016-01-17 20:00:28 IST
(In reply to Kalle Olavi Niemitalo from comment #30)
> In Finnish, all the month names end with "kuu" (moon) in the singular
> nominative, and you can form the genitive, partitive, or inessive by
> appending "n", "ta", or "ssa": tammikuu, tammikuun, tammikuuta, tammikuussa.
> So I think we can largely cover text output just by localizing the strftime
> format string.

You are lucky. What about weekday names?
Comment 36 Kalle Olavi Niemitalo 2016-01-18 00:23:28 IST
(In reply to van.de.bugger from comment #35)
> You are lucky. What about weekday names?

In Finnish, most of those end with "tai", and their cases have simple suffixes. However, "keskiviikko" (Wednesday) is different.

maanantai, keskiviikko -- singular nominative
maanantain, keskiviikon -- singular genitive (or accusative)
maanantaita, keskiviikkoa -- singular partitive
maanantaina, keskiviikkona -- singular essive
maanantaiksi, keskiviikoksi -- singular translative
maanantaissa, keskiviikossa -- singular inessive
maanantaista, keskiviikosta -- singular elative
maanantaihin, keskiviikkoon -- singular illative
maanantailla, keskiviikolla -- singular adessive
maanantailta, keskiviikolta -- singular ablative
maanantaille, keskiviikolle -- singular allative
maanantaitta, keskiviikotta -- singular abessive

maanantait, keskiviikot -- plural nominative (or accusative)
maanantaiden, keskiviikkojen -- plural genitive
maanantaita, keskiviikkoja -- plural partitive
maanantaina, keskiviikkoina -- plural essive
maanantaiksi, keskiviikoiksi -- plural translative
maanantaissa, keskiviikoissa -- plural inessive
maanantaista, keskiviikoista -- plural elative
maanantaihin, keskiviikkoihin -- plural illative
maanantailla, keskiviikoilla -- plural adessive
maanantailta, keskiviikoilta -- plural ablative
maanantaille, keskiviikoille -- plural allative
maanantaitta, keskiviikoitta -- plural abessive
maanantaine-, keskiviikkoine- -- (plural) comitative (rare)
maanantain(?), keskiviikoin -- plural instructive (rare)

In many of these, the plural is written the same as the singular.

I suppose a calendar application could use:
- the singular nominative for column headers (Monday, Tuesday)
- the singular essive for dates of events (on Monday)
- the singular elative and illative for the starting and ending days of a multi-day event (spanning from Monday to Friday)
- the singular ablative and allative for the original and newly chosen dates of a rescheduled event (postponed from Tuesday to Wednesday)
- "keskiviikkoisin" (mentioned in comment #30; I'm not sure this is the instructive case of the adjective "keskiviikkoinen") for a recurring event

An email application that localizes the Date header might use only the singular essive, which can luckily be formed by appending "na" to the singular nominative case of any weekday and so doesn't need a separate format specifier in strftime. (Some other words aren't so lucky.) This assumes that the strftime format string is made localizable, but that should be done in any case because I don't see how a programmer could choose the correct grammatical cases without knowing the language. 

Are programmers or translators asking glibc to support any variants of weekday names other than what it already has?
Comment 37 van.de.bugger 2016-01-19 00:00:43 IST
(In reply to Kalle Olavi Niemitalo from comment #36)
> In Finnish, most of those end with "tai", and their cases have simple
> suffixes. However, "keskiviikko" (Wednesday) is different.

It means messages containing weekday names cannot be easily localized through the strftime format string: "%An" gives the genitive case for all weekdays but Wednesday.

> Are programmers or translators asking glibc to support any variants of
> weekday names other than what it already has?

But today we have *only* nominative case, which is not enough for Russian language (at least). "%B" (month name in nominative case) is ok for month heading in calendar application, but "%d %B %Y" looks ugly because in this phrase month name should be in genitive case. However, if we change "mon" table in ru_RU locale definition to have month names in genitive cases to fix "%d %B %Y", we break "%B". 
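The dilemma can be shown with a small stand-alone simulation. It does not call strftime; heading and full_date are hypothetical helpers that mimic "%B" and "%d %B %Y" against a single month table, and only the first three Russian month names are included for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative data only: January..March in two grammatical cases. */
static const char *const ru_nominative[] = { "январь", "февраль", "март" };
static const char *const ru_genitive[]   = { "января", "февраля", "марта" };

/* Mimics "%B": a bare month name, e.g. a calendar heading. */
static const char *heading(const char *const *mon, int month_index)
{
    assert(month_index >= 0 && month_index < 3);
    return mon[month_index];
}

/* Mimics "%d %B %Y" using the same single table as heading(). */
static int full_date(char *buf, size_t size, const char *const *mon,
                     int day, int month_index, int year)
{
    assert(month_index >= 0 && month_index < 3);
    return snprintf(buf, size, "%02d %s %d", day, mon[month_index], year);
}
```

With ru_nominative the heading is right ("январь") but the full date reads "17 январь 2016"; with ru_genitive the full date is right ("17 января 2016") but the heading becomes "января". One table cannot serve both uses.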

So yes, at least a few programmers/translators are asking for more. However, we can't get agreement on how many grammatical cases we want to have. Some guys here think two cases are enough, while I believe we should not create artificial limits: let us have as many grammatical cases as the language requires.

> …This
> assumes that the strftime format string is made localizable, but that should
> be done in any case because I don't see how a programmer could choose the
> correct grammatical cases without knowing the language. 

You get the point.
Comment 38 Kalle Olavi Niemitalo 2016-01-20 07:04:06 IST
Microsoft Windows provides:
- GetCalendarInfoEx can retrieve the name of a month or the name of a day of the week. The nominative case is the default. For months only, the caller can add the CAL_RETURN_GENITIVE_NAMES flag.
- GetDateFormatEx formats a date according to a format picture string in which "MMMM" represents the name of the month.  The caller cannot explicitly request the genitive case. The function uses heuristics instead, and they don't always work perfectly.

https://msdn.microsoft.com/en-us/library/windows/desktop/dd317734(v=vs.85).aspx
"Calendar Type Information (Windows)"

http://www.siao2.com/2011/10/26/10230215.aspx
"Improving genitive. Or not…. (part 2): Explaining the point of Part 1"
Comment 39 Kalle Olavi Niemitalo 2016-01-20 07:37:57 IST
(In reply to Rafal Luzynski from comment #7)
> 3. Mimic the *BSD specification but implement it conversely: let %B retrieve
> the nominative case (as it currently does) and let the new %OB specifier
> retrieve the genitive case. See also:
> http://austingroupbugs.net/view.php?id=258 - this seems to have accepted this
> solution.

The Final Accepted Text in Austin Group bug 258 contains:
"alt_mon Define the full month names, corresponding to the %OB conversion specification. [...] For languages having both a genitive (when used with a day number) and a nominative (no day number) case, this operand shall be used to denote the nominative case."

The FreeBSD strftime(3) page says: "Additionally %OB implemented to represent alternative months names (used standalone, without day mentioned)." I believe this means the nominative case.

So %OB retrieves the nominative case in both of them; I don't see any conflict. The same could be implemented in glibc.
Comment 40 Rafal Luzynski 2016-01-21 22:09:27 IST
Thank you for all your comments and I'm sorry for not being able to reply immediately.

(In reply to Kalle Olavi Niemitalo from comment #39)
> [...]
> The Final Accepted Text in Austin Group bug 258 contains:
> "alt_mon Define the full month names, corresponding to the %OB conversion
> specification. [...] For languages having both a genitive (when used with a
> day number) and a nominative (no day number) case, this operand shall be
> used to denote the nominative case."
> 
> The FreeBSD strftime(3) page says: "Additionally %OB implemented to
> represent alternative months names (used standalone, without day
> mentioned)." I believe this means the nominative case.
> 
> So %OB retrieves the nominative case in both of them; I don't see any
> conflict. The same could be implemented in glibc.

Oops, somehow I must have read it backwards. :-) Having read this again carefully, I now lean towards the solution which is common to *BSD and POSIX, which means that:

- nl_langinfo(ALT_MON_...) and strftime("%OB") both retrieve the nominative case,
- nl_langinfo(MON_...) and strftime("%B") both retrieve the genitive case (which is different than now!),
- for the locales which do not support genitive cases the nominative case is returned instead.
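The three rules above can be sketched as a lookup with a fallback. This is a simplified model rather than the patch itself: struct month_tables and month_name are hypothetical, only January is filled in, and treating an empty or missing alt_mon entry as "not provided" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-locale data. */
struct month_tables {
    const char *mon[12];     /* default names: what "%B" would return */
    const char *alt_mon[12]; /* alternative names: what "%OB" would return */
};

/* Polish sketch: genitive as the default, nominative as the alternative. */
static const struct month_tables pl_sketch = {
    .mon     = { "stycznia" },
    .alt_mon = { "styczeń" },
};

/* A locale that provides no alternative names. */
static const struct month_tables c_sketch = {
    .mon     = { "January" },
    .alt_mon = { "" },
};

/* Returns the alternative name when it is requested and available,
   otherwise falls back to the default name. */
static const char *month_name(const struct month_tables *t,
                              int month_index, int want_alt)
{
    assert(t != NULL && month_index >= 0 && month_index < 12);
    const char *alt = t->alt_mon[month_index];
    if (want_alt && alt != NULL && alt[0] != '\0')
        return alt;
    return t->mon[month_index];
}
```

Here month_name(&pl_sketch, 0, 1) gives the nominative "styczeń", month_name(&pl_sketch, 0, 0) gives the genitive "stycznia", and a locale without alternative names returns the same string for both requests.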

Would you agree to this solution?

Pros:
- full compatibility with *BSD and POSIX,
- simple and deterministic implementation,
- full programmer's control on whether they want a nominative or genitive case,
- will automagically fix all dates using %B conversion specifiers and displaying the nominative case which is incorrect (full dates).

Cons:
- at the same time will break formatting of all dates using %B conversion specifiers where the nominative case is required and is correctly provided now (e.g., calendar headers), the application developer may not even be aware that the application became broken in some languages,
- therefore will require urgent intervention from some application developers,
- it will be difficult or even impossible to provide a backward compatible solution which would detect if the current runtime version of glibc requires %OB or %B for the month name in nominative case.

This also means that the comment 7 does not make sense. The solution 3 should be merged into 2 because they actually should be the same, and solution 1 is no longer needed since 2 is common for *BSD and POSIX.

Some answers:


(In reply to keld@keldix.com from comment #29)
> [...]
> Well, can we see what is the extent of the problem?
> Russian, Polish, possibly all of the Slavic languages, Finnish,
> Hungarian, Estonian?

Comment 6 says that all Slavonic languages except Bulgarian and Macedonian are affected, plus Finnish; https://bugzilla.gnome.org/show_bug.cgi?id=749206#c6 mentions Greek; and I have accidentally found in CLDR that the Baltic languages are affected, too. Roughly the whole eastern half of Europe with some exceptions, plus some Western European languages (Icelandic?), plus an unknown number of non-European languages. For a full list you should browse the CLDR database.


(In reply to van.de.bugger from comment #31)
> [...]
> BTW, I found a dirty hack in glibc: uk_UA locale has *month names* in
> genitive case in alt_digits table. Now they can get month name in genitive
> case by %Om, but applying modifier O to another format specifier (like %Od,
> %Of, %OH) is meaningless: %OI converts hour to a months name in genitive
> case... 

Please see my patch for Ukrainian.
Comment 41 Kalle Olavi Niemitalo 2016-01-23 07:26:58 IST
I searched for uses of the strftime-style %B in the Finnish PO files at <http://translationproject.org/extra/matrix.html>. I found only the following:

http://translationproject.org/PO-files/fi/a2ps-4.14.fi.po
msgid "%A %B %d, %Y"
msgstr "%A, %e. %Bta %Y"

http://translationproject.org/PO-files/fi/help2man-1.46.6.fi.po
msgid "%B %Y"
msgstr "%B %Y"

http://translationproject.org/PO-files/fi/coreutils-8.25-pre1.fi.po
A list of all the strftime format specifiers.

If the glibc Finnish locale were changed to expand "%B" to the genitive case and "%OB" to the nominative case, the change would break those for no benefit to translators or users. There aren't many uses of "%B" in the Finnish PO files but users may also have strftime format strings in date commands in shell scripts. I think the Finnish locale should not be changed as part of fixing this Russian bug.
Comment 42 Rafal Luzynski 2016-01-27 22:19:24 IST
Three occurrences are not many and can be easily changed, although you don't know how many projects not listed at translationproject.org will also be affected. But if appending "ta" to the nominative month name is a correct way to achieve the desired form (the partitive, per comment 30) and this works for all months, then you can use the "%Bta" format string and you don't need this new feature. Note that if a locale does not provide the alternative month names then the basic names will be returned for both "%B" and "%OB". You can choose whatever you find more appropriate. Unfortunately, some other languages have rules too complex for such an easy workaround.
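The workaround relies only on the fact that strftime copies ordinary characters in the format through unchanged, so a suffix written directly after %B is appended to whatever name the locale supplies. A minimal sketch, run in the default "C" locale so that it does not depend on a Finnish locale being installed; format_translated is a hypothetical wrapper and the format string is modelled on the a2ps translation quoted in comment 41.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Formats tm with a translator-supplied format string. Returns the
   number of bytes written, or 0 if the buffer was too small. */
static size_t format_translated(char *buf, size_t size,
                                const char *translated_fmt,
                                const struct tm *tm)
{
    assert(buf != NULL && translated_fmt != NULL && tm != NULL);
    return strftime(buf, size, translated_fmt, tm);
}
```

For 17 January 2016 and the format "%e. %Bta %Y", the "C" locale produces "17. Januaryta 2016", which is nonsense in English but shows the mechanism: with a Finnish locale supplying "tammikuu" for %B, the same format would be expected to yield "tammikuuta".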
Comment 43 Piotr Drąg 2016-03-03 14:05:34 IST
Any chance of picking this up? The original issue is still there.
Comment 44 Rafal Luzynski 2016-03-04 22:18:36 IST
Sorry for this delay. I was overwhelmed with other projects. I'd like to use the upcoming Fedora and GNOME freeze period to return to this bug. glibc 2.23 has just been released after its freeze, and I guess that 2.24 will be released in August, so now seems to be a good time to come back. Things to do:

* As Mike pointed out here: https://sourceware.org/ml/libc-alpha/2016-01/msg00152.html I should remove the additional array containing the same month names as another one.
* As pointed out in comment 39 I have completely misunderstood the intentions of POSIX specification and I must reimplement the solution. Fortunately, the new implementation will be easier and to rework the old solution I must remove the heuristic code guessing if this is full date or just a month name and replace nominative case with genitive case in few places.

That means we are on a good path. I don't want to promise any specific date when it will land but I hope it will be soon. Of course, everybody is welcome to rework my patches.
Comment 45 Rafal Luzynski 2016-03-24 11:14:01 IST
Created attachment 9122 [details]
Proposed solution: support alternative month names (version 2)

This patch implements the new approach: we no longer need day-month-order trick (as pointed out in comment 39). Also fixes a bug: defines alternative month names to the empty string in the default (fallback) C locale.
Comment 46 Rafal Luzynski 2016-03-24 11:15:21 IST
Created attachment 9123 [details]
Add tests for the alternative month names (v2)

Actually not different from the old version, the only difference is that the subject line says it's PATCH 2/6 rather than PATCH 2/7.
Comment 47 Rafal Luzynski 2016-03-24 11:20:59 IST
Created attachment 9124 [details]
Proposed solution: implement the %OB format specifier

As pointed out in comment 39 implementing %OB format specifier is more correct solution than implementing a smart algorithm trying to guess if we are in the full date context or standalone. %B retrieves the default month name which is genitive from now (this is the change!) in those languages which need a genitive. %OB retrieves the alternative month name which is nominative in those languages which need it (this is the change because previously the default month name was nominative), also it returns the same as %B for those languages which do not need this feature or do not yet provide the updated locale. This means that %B does not have to be changed in most cases but in some cases it *must* be changed to %OB.
Comment 48 Rafal Luzynski 2016-03-24 11:23:05 IST
Created attachment 9125 [details]
Alternative month names NLS data (Polish, v2)

The change in this patch is that the mon and alt_mon arrays have been swapped. Previously I misunderstood their meaning, as pointed out in comment 39.

As previously, you can treat this patch as test data. You can provide the updated locales from a more reliable source but you will not see the effect of the change without the updated locale.
Comment 49 Rafal Luzynski 2016-03-24 11:23:43 IST
Created attachment 9126 [details]
Alternative month names NLS data (Russian, v2)

Same as above.
Comment 50 Rafal Luzynski 2016-03-24 11:24:29 IST
Created attachment 9127 [details]
Alternative month names NLS data (Ukrainian, v2)

Same as above. Enjoy!
Comment 51 Piotr Drąg 2016-08-24 10:35:25 IST
Any progress on the patches review? I desperately need to get this fixed. :(
Comment 52 Rafal Luzynski 2016-08-28 23:34:59 IST
Sorry for the delay, recently I was overwhelmed with another Linux-related project. The last response is that this change breaks the ABI, so versioning should be implemented to retain backward compatibility. [1] [2]  Unfortunately, I'll have to learn more about it as this concept is new to me.  Fortunately, I also got some tips about it but more tips would be welcome.

Piotr, believe me, I find this bug as blatant as you do.


[1] https://sourceware.org/ml/libc-alpha/2016-06/msg00009.html
[2] https://sourceware.org/ml/libc-alpha/2016-06/msg00019.html
Comment 53 Rafal Luzynski 2016-10-17 22:13:12 IST
Created attachment 9569 [details]
Proposed solution: support alternative month names (version 3)

Here is the new patch set including backward compatibility via the function versioning.

This first patch is actually the same as the one in comment 45, it's only rebased.
Comment 54 Rafal Luzynski 2016-10-17 22:18:05 IST
Created attachment 9570 [details]
Provide backward compatibility for nl_langinfo family

As requested in [1] and [2] this patch defines backward compatible versions of nl_langinfo() and nl_langinfo_l(). Is this sufficient? Can we assume that __nl_langinfo_l() is a private API (even if visible in public) and it is not guaranteed to be backward compatible?

[1] https://sourceware.org/ml/libc-alpha/2016-06/msg00009.html
[2] https://sourceware.org/ml/libc-alpha/2016-06/msg00019.html
Comment 55 Rafal Luzynski 2016-10-17 22:19:38 IST
Created attachment 9571 [details]
Rebuild abilists to reflect nl_langinfo changes

The changes in this patch are automatically generated.
Comment 56 Rafal Luzynski 2016-10-17 22:23:48 IST
Created attachment 9572 [details]
Add tests for alternative month names (v3)

Actually it's the same as the patch provided in comment 46, the only difference is that the subject line says it's PATCH 04/11 rather than PATCH 2/6.
Comment 57 Rafal Luzynski 2016-10-17 22:27:58 IST
Created attachment 9573 [details]
Proposed solution: implement the %OB format specifier (v3)

Actually it's the same patch as the one provided in comment 47, the only difference is that the subject line says it's PATCH 05/11 rather than PATCH 3/6.
Comment 58 joseph@codesourcery.com 2016-10-17 22:35:43 IST
__nl_langinfo_l is at a public symbol version and used by libstdc++ 
(libstdc++ needs to use internal symbols like that for namespace reasons; 
indeed, we probably need to add more such exports for libstdc++ use).  It 
needs to stay backward compatible.
Comment 59 Rafal Luzynski 2016-10-17 22:47:36 IST
Created attachment 9574 [details]
Provide backward compatibility for strftime family

This patch provides backward compatibility for the functions strftime(), strftime_l(), wcsftime(), and wcsftime_l(). The backward compatible versions ignore the %OB format specifier, and for the %B format specifier they return the same results as %OB does in the new version.

Is this sufficient? This patch does not provide backward compatibility for strptime() and strptime_l(). Do we need a backward compatible version? A possible backward compatible version would:

* return an error code if the %OB format specifier is found,
* match the month names only against the nominative form (but since the month is recognized by the longest matching prefix of the input, the results will usually be the same no matter whether we match nominative or genitive cases).
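The longest-match remark in the last point can be sketched as follows. This is only an illustration and not the code in glibc's strptime(): the table, the struct and the function name are hypothetical, only two Polish months are listed, and the comparison is case-sensitive for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample table: two Polish months in both forms. */
struct month_forms {
    int month;               /* 0-based month number */
    const char *nominative;
    const char *genitive;
};

static const struct month_forms sample_months[] = {
    { 2, "marzec", "marca" },
    { 4, "maj", "maja" },
};

/* Return the month whose name (in either form) is the longest prefix of
   input, or -1 if nothing matches.  *consumed receives the match length. */
int match_month(const char *input, size_t *consumed)
{
    int best = -1;
    size_t best_len = 0;
    size_t count = sizeof sample_months / sizeof sample_months[0];

    for (size_t i = 0; i < count; i++) {
        const char *forms[2] = { sample_months[i].nominative,
                                 sample_months[i].genitive };
        for (int f = 0; f < 2; f++) {
            size_t len = strlen(forms[f]);
            if (len > best_len && strncmp(input, forms[f], len) == 0) {
                best = sample_months[i].month;
                best_len = len;
            }
        }
    }
    *consumed = best_len;
    return best;
}
```

For the input "maja 2016" both "maj" and "maja" are prefixes and both denote May, so the recognized month is the same either way. Only the number of consumed characters differs.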
Comment 60 Rafal Luzynski 2016-10-17 22:51:37 IST
Created attachment 9575 [details]
Rebuild abilists to reflect strftime family changes

Same as the patch in comment 55, this patch contains the automatically generated changes.
Comment 61 Rafal Luzynski 2016-10-17 22:56:42 IST
Created attachment 9576 [details]
Alternative month names NLS data (Polish, v3)

This patch has been rebased against the current master.

As previously, you can treat this patch as test data. You can provide the updated locales from a more reliable source but you will not see the effect of the change without the updated locale.
Comment 62 Rafal Luzynski 2016-10-17 22:57:46 IST
Created attachment 9577 [details]
Alternative month names NLS data (Russian, v3)

Same comment as above.
Comment 63 Rafal Luzynski 2016-10-17 22:58:58 IST
Created attachment 9578 [details]
Alternative month names NLS data (Ukrainian, v3)

Same comment as above.
Comment 64 Rafal Luzynski 2016-10-17 23:01:19 IST
Created attachment 9579 [details]
Alternative month names NLS data (Czech)

New bonus: this patch contains an example of NLS data containing nominative and genitive month names in Czech. Enjoy!
Comment 65 Rafal Luzynski 2016-10-27 23:46:11 IST
Created attachment 9595 [details]
Proposed solution: support alternative month names (version 4)

Here goes a new set of patches. Some are skipped in this set because they have not been changed.

The difference between the version from comment 53 and this one: this patch does not add anything to conform/data/langinfo.h-data, as requested in https://sourceware.org/ml/libc-alpha/2016-10/msg00303.html
Comment 66 Rafal Luzynski 2016-10-27 23:50:29 IST
Created attachment 9596 [details]
Provide backward compatibility for nl_langinfo family (version 4)

This patch, compared to the one from comment 54, provides backward compatibility not only to nl_langinfo() and nl_langinfo_l() but also to __nl_langinfo_l(), as requested in comment 58.

Note: the patch description says "(version 4)" only for consistency with the other v4 patches.
Comment 67 Rafal Luzynski 2016-10-27 23:56:07 IST
Created attachment 9597 [details]
Rebuild abilists to reflect nl_langinfo changes (version 4)

Same changes as above, also this patch is rebased against the current master.

--------

SKIPPED PATCHES:
* Patch 0004 could go here, skipped because it is the same as in comment 56.
* Patch 0005 could go here, skipped because it is the same as in comment 57.
Comment 68 Rafal Luzynski 2016-10-28 00:01:12 IST
Created attachment 9598 [details]
Provide backward compatibility for strftime family (version 4)

This patch, compared to the one from comment 59, provides backward compatibility not only to strftime(), strftime_l(), wcsftime(), and wcsftime_l() but also to __strftime_l() and __wcsftime_l(), as requested in https://sourceware.org/ml/libc-alpha/2016-10/msg00304.html
Comment 69 Rafal Luzynski 2016-10-28 00:08:48 IST
Created attachment 9599 [details]
Rebuild abilists to reflect strftime family changes (version 4)

Same changes as above, also this patch is rebased against the current master.

--------

SKIPPED PATCHES:
0008 (Polish) - same as in comment 61
0009 (Russian) - same as in comment 62
0010 (Ukrainian) - same as in comment 63
0011 (Czech) - same as in comment 64

Also note that those patches only provide the test data, they are not intended to be committed (although I don't mind if you decide so.)
Comment 70 Rafal Luzynski 2016-12-22 23:04:29 IST
Created attachment 9709 [details]
Proposed solution: support alternative month names (version 5)

Here goes the new set of patches.

The main difference comes from what I realized after comparing how this problem has been solved in other systems.  My idea that alt_mon should be optional was wrong.  Making it obligatory simplifies the code: we don't have to check if a particular alternative month name is NULL or empty.  Locale data are part of glibc, so we are able to provide data fulfilling our own requirements.  For the languages which do not need two forms of month names this means that the month names must be copied from mon to alt_mon even though both forms are identical.

Difference between this patch and the one from comment 65: alt_mon locale data section made obligatory.
Comment 71 Rafal Luzynski 2016-12-22 23:07:46 IST
Created attachment 9710 [details]
Provide backward compatibility for nl_langinfo family (version 5)

Differences between the patch from comment 66 and this one:

* dirty aliases ending with "2" (e.g., "__nl_langinfo_noaltmon_l2") replaced with more decent ones ending with "_alias" (e.g., "__nl_langinfo_noaltmon_l_alias");
* alt_mon section made obligatory, no need to check for NULLs and empty strings.
Comment 72 Rafal Luzynski 2016-12-22 23:09:48 IST
Created attachment 9711 [details]
Rebuild abilists to reflect nl_langinfo changes (version 5)

Compared to the one from comment 67: rebased against the current master.
Comment 73 Rafal Luzynski 2016-12-22 23:12:03 IST
Created attachment 9712 [details]
Add tests for alternative month names (v5)

Actually the same as the patch in comment 56 and comment 46, only the subject line changed to PATCH 04/13 (instead of 4/11 or 2/6).
Comment 74 Rafal Luzynski 2016-12-22 23:14:42 IST
Created attachment 9713 [details]
Proposed solution: implement the %OB format specifier (v5)

Difference between the patch from comment 57 and comment 47: as alt_mon is obligatory no need to check for NULLs and empty strings, this simplifies the code a little.
Comment 75 Rafal Luzynski 2016-12-22 23:20:44 IST
Created attachment 9714 [details]
Provide backward compatibility for strftime family (version 5)

Compared to the one from comment 68 this patch is a little simpler because there is no need to check if alt_mon is NULL or empty, as it is obligatory now.  It also replaces dirty alias names ending with "2" (like "__strftime_l_compat2") with more decent ones ending with "_alias" (like "__strftime_l_compat_alias").
Comment 76 Rafal Luzynski 2016-12-22 23:22:03 IST
Created attachment 9715 [details]
Rebuild abilists to reflect strftime family changes (version 5)

Compared to the one from comment 69: rebuilt against the current master.
Comment 77 Rafal Luzynski 2016-12-22 23:23:56 IST
Created attachment 9716 [details]
Alternative month names NLS data (Polish, v5)

Compared to the one from comment 61: no changes, only the subject line says "[PATCH 08/13]".
Comment 78 Rafal Luzynski 2016-12-22 23:25:59 IST
Created attachment 9717 [details]
Alternative month names NLS data (Russian, v5)

Compared to the one from comment 62: similar change as above.
Comment 79 Rafal Luzynski 2016-12-22 23:27:48 IST
Created attachment 9718 [details]
Alternative month names NLS data (Ukrainian, v5)

Compared to the one from comment 63: similar change as above.
Comment 80 Rafal Luzynski 2016-12-22 23:29:36 IST
Created attachment 9719 [details]
Alternative month names NLS data (Czech, v5)

Compared to the one from comment 64: similar change as above.
Comment 81 Rafal Luzynski 2016-12-22 23:32:10 IST
Created attachment 9720 [details]
Alternative month names for all locales

All month names (mon sections) in all supported languages (except those in the patches above) copied into alt_mon sections.
Comment 82 Rafal Luzynski 2016-12-22 23:37:39 IST
Created attachment 9721 [details]
Import month names from CLDR

This patch imports month names from CLDR, where alt_mon sections are not just copies of mon sections.  This adds actual full genitive/nominative support to 15 new locales, which together with the 4 supported previously makes a total of 19 locales (17 unique languages because Greek and Russian appear twice).  Other changes are only partially related to the nominative/genitive cases or totally unrelated, just spotted while importing data.

This is the end of this set of patches.
Comment 83 Rafal Luzynski 2017-03-20 08:47:47 IST
Created attachment 9908 [details]
Proposed solution: support alternative month names (version 6)

Here is the new set of patches.

Differences between the patch from comment 70 and this one:

- alt_mon section in locale data source made optional again, like in comment the data will be generated while compiling to binary;
- comments reworded not to suggest that alt_mon is standalone (nominative) and mon is for formatting full date (genitive), this will be decided later;
- ALTMON_x macros defined only if __USE_GNU is defined; for now they are treated as GNU-only extension.
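A minimal sketch of how a caller could query the new items, assuming the ALTMON_x macros are exposed through _GNU_SOURCE as described in the last point. The helper name is hypothetical, and the fallback to MON_x for libraries without ALTMON_x is my own addition, not something the patch provides.

```c
#define _GNU_SOURCE   /* ALTMON_x are treated as a GNU-only extension */
#include <langinfo.h>

/* Return the alternative name of the given month (0 = January).
   Falls back to the regular MON_x item when the C library does not
   define ALTMON_1.  Hypothetical helper for illustration only. */
const char *alt_month_name(int month)
{
    if (month < 0 || month > 11)
        return "";
#ifdef ALTMON_1
    return nl_langinfo(ALTMON_1 + month);
#else
    return nl_langinfo(MON_1 + month);
#endif
}
```

The arithmetic on ALTMON_1 assumes the twelve items are numbered consecutively, in the same way as MON_1 to MON_12.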
Comment 84 Rafal Luzynski 2017-03-20 08:51:18 IST
Created attachment 9909 [details]
Provide backward compatibility for nl_langinfo family (version 6)

Difference between the patch from comment 71 and this one: glibc version rebased from 2.25 to 2.26.
Comment 85 Rafal Luzynski 2017-03-20 08:58:13 IST
Created attachment 9910 [details]
Rebuild abilists to reflect nl_langinfo changes (version 6)

Differences between the patch from comment 72 and this one:

- glibc version rebased from 2.25 to 2.26,
- this also adds the symbol GLIBC_2.26 because it does not exist yet.

--------

SKIPPED PATCH:

0004 Add tests for alternative month names - same as in comment 73 except the subject line changed
Comment 86 Rafal Luzynski 2017-03-20 09:01:23 IST
Created attachment 9911 [details]
Proposed solution: implement the %OB format specifier (v6)

Difference from the patch from comment 74: commit comment reworded not to suggest that %OB is standalone (nominative) and %B is for full date format (genitive); this will be decided later.  Really no changes in the code.
Comment 87 Rafal Luzynski 2017-03-20 09:03:50 IST
Created attachment 9912 [details]
Provide backward compatibility for strftime family (version 6)

Difference between the patch from comment 75 and this one: glibc version rebased from 2.25 to 2.26.
Comment 88 Rafal Luzynski 2017-03-20 09:09:03 IST
Created attachment 9913 [details]
Rebuild abilists to reflect strftime family changes (version 6)

Difference between the patch from comment 76 and this one: glibc version rebased from 2.25 to 2.26.
Comment 89 keld@keldix.com 2017-03-20 09:12:12 IST
Hi 

I would ask that our changes are backwards compatible, so that we need not change 
the existing data, just adding new data, (and changing the date format spec).

Something like the %B is the nominative case, %OB is the genitive case, and then
the full date spec is using %OB for the month names when required.

And the alt_mon is an optional keyword, and general. I believe these changes are
getting their way into both POSIX and i18n 30112 standards.

Best regards
keld
Comment 90 Rafal Luzynski 2017-03-20 09:18:46 IST
Created attachment 9914 [details]
Let alternative month names be a copy of regular ones

This is a new patch: alt_mon section is optional in the locale data source file but this patch modifies the compiler to generate the missing alt_mon section as a copy of mon section in the binary output file.

The difference between the patch from comment 81 and this one is not obvious: the old patch actually added alt_mon section to all source files while this one makes them unnecessary and generates the missing data automatically.
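The idea of generating the missing data can be sketched like this. It is a simplified illustration under my own assumptions, not the locale compiler's real code: the struct and function are hypothetical, and the fallback is applied per entry rather than per section.

```c
#include <stddef.h>

#define MONTHS 12

/* Hypothetical in-memory form of the month names of one locale.
   A NULL alt_mon entry means the source file did not provide it. */
struct time_names {
    const char *mon[MONTHS];
    const char *alt_mon[MONTHS];
};

/* Copy the regular name into every missing alternative name.
   Returns the number of entries that had to be filled in. */
int fill_missing_alt_mon(struct time_names *t)
{
    int filled = 0;

    for (size_t i = 0; i < MONTHS; i++) {
        if (t->alt_mon[i] == NULL) {
            t->alt_mon[i] = t->mon[i];
            filled++;
        }
    }
    return filled;
}
```

After this step a consumer never needs to check for a missing alternative name, which is the simplification mentioned in comment 70.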
Comment 91 Rafal Luzynski 2017-03-20 09:28:12 IST
Created attachment 9915 [details]
Also implement abbreviated alternative month names and %Ob

This is a new patch.  I realized that, at least in the case of May in Russian and Belarusian, abbreviated month names may also have nominative/genitive forms; in this particular case because the word is so short that its abbreviated form equals the full form.  A similar phenomenon may also exist in other languages which I am not yet aware of.  So this patch repeats all the above steps to implement the _NL_ABALTMON_x constants in nl_langinfo() and the "%Ob" format specifier in strftime(), except for the backward compatibility.
Comment 92 Rafal Luzynski 2017-03-20 09:35:37 IST
Created attachment 9916 [details]
Backward compatibility for abbreviated alternative month names and %Ob

This is a new patch.  Repeats the backward compatibility steps also for _NL_ABALTMON_x constants in nl_langinfo() and "%Ob" format specifier in strftime().

--------

SKIPPED PATCH:
0011 (Polish) - same as in comment 77
Comment 93 Rafal Luzynski 2017-03-20 09:45:22 IST
Created attachment 9917 [details]
Alternative month names NLS data (Russian, v6)

Differences between the patch from comment 78 and this one:

- Adds ab_alt_mon section (this is an example of how and why this feature can be used).
- No longer changes the nominative cases to lowercase.  CLDR suggests using the standalone/full-date-format feature to start standalone cases with an uppercase letter even for the languages which don't always need the uppercase letter, because a standalone word is an expression by itself (like a sentence).  I'm not sure if the Russian language community will like it, let them decide.  This is just an example, I don't suggest they will actually push it without changes.

--------

SKIPPED PATCHES:
0013 (Ukrainian) - same as in comment 79
0014 (Czech) - same as in comment 80
Comment 94 Rafal Luzynski 2017-03-20 09:51:15 IST
Created attachment 9918 [details]
Import genitive month names from CLDR (v6)

Differences between the patch from comment 82 and this one:

- only those languages which have and need the nominative/genitive case in dates are imported here;
- this new patch does not assume that alt_mon section already exists (the patch from comment 81 is obsoleted/rejected) so it adds these sections;
- be_BY: also adds ab_alt_mon section;
- wa_BE: adds a space after "d’" which I did not know was needed.
Comment 95 Rafal Luzynski 2017-03-20 09:54:02 IST
Created attachment 9919 [details]
Import uppercase/lowercase month names from CLDR (v6)

Some languages which don't need the genitive/nominative difference already use this feature in CLDR to start the month names with uppercase if they are standalone and with lowercase when they are in full date context.  This was already imported previously in comment 82; this patch splits out these cases.
Comment 96 Rafal Luzynski 2017-03-20 10:15:33 IST
Hi,

(In reply to keld@keldix.com from comment #89)
> Hi 
> 
> I would ask that our changes are backwards compatible, so that we need not
> change 
> the existing data, just adding new data, (and changing the date format spec).
> 
> Something like the %B is the nominative case, %OB is the genitive case, and
> then
> the full date spec is using %OB for the month names when required.

It's been discussed here and on the libc-alpha list.  The current consensus is that whether you decide that %B is nominative and %OB is genitive or the other way round does not change the source code: the difference is only on input (locale data) and output (format specifiers in applications).  Both are controlled by language communities.  So these changes are acceptable as long as they do not suggest which is nominative and which is genitive; this will be decided later.  By later I mean later in this development cycle, not later in a distant future.

Also, I have some reasons why I suggest switching %B to genitive and introducing %OB as nominative, which I have published many times before (links available).

> And the alt_mon is an optional keyword, and general.

Done in this series of patches.

> I believe these changes are
> getting their way into both POSIX and i18n 30112 standards.

POSIX has accepted (but not yet published) the opposite solution: http://austingroupbugs.net/view.php?id=258.  Regarding ISO 30112 I'm not aware of their opinion about it.  Can you please provide some more up to date info?

I have already prepared a copr repository for any modern Fedora so if you are brave enough you can install it and test already how many applications are fixed and how many are broken because of these changes.  I'll be glad to hear your feedback. The repo is at https://copr.fedorainfracloud.org/coprs/rluzynski/genitive/
Comment 97 Rafal Luzynski 2017-05-23 22:41:37 IST
Created attachment 10063 [details]
Rebuild abilists to reflect nl_langinfo changes (version 7)

Here is the new set of patches with really minor changes.  Due to this most of the patches are skipped:

--------

SKIPPED PATCHES:

0001 Implement alternative month names - same as in comment 83
0002 Provide backward compatibility for nl_langinfo family - same as in comment 84

--------

This is the new patch 0003.  Difference between the patch from comment 85 and this one: rebased against the current master (NaCl port removed)

--------

MORE SKIPPED PATCHES:

0004 Add tests for alternative month names - same as in comment 73 except the subject line changed
0005 Implement the %OB specifier - alternative month names - same as in comment 86
Comment 98 Rafal Luzynski 2017-05-23 22:45:18 IST
Created attachment 10064 [details]
Provide backward compatibility for strftime family (version 7)

Difference between the patch from comment 87 and this one: rebased against the current master
Comment 99 Rafal Luzynski 2017-05-23 23:00:45 IST
Created attachment 10065 [details]
Rebuild abilists to reflect strftime family changes (version 7)

Difference between the patch from comment 88 and this one: rebased against the current master (NaCl port removed)

--------

SKIPPED PATCHES:

All other patches are skipped because they have not been changed:

0008 Let alternative month names be a copy of regular ones - same as in comment 90
0009 Abbreviated alternative month names (%Ob) also added - same as in comment 91
0010 Backward compatibility for abbreviated alternative month names and %Ob - same as in comment 92
0011 pl_PL: Add alternative month names - same as in comment 77
0012 ru_RU: Add alternative month names - same as in comment 93
0013 uk_UA: Add alternative month names - same as in comment 79
0014 cs_CZ: Add alternative month names - same as in comment 80
0015 Genitive month names imported from CLDR - same as in comment 94
0016 Month names imported from CLDR (upper/lower case) - same as in comment 95
Comment 100 Rafal Luzynski 2017-06-28 00:15:51 IST
Created attachment 10225 [details]
Provide backward compatibility for nl_langinfo family (version 8)

Here is the new set of patches.  The changes are mostly the rebase to the current master and applying the latest changes in the source code repository (like: Use locale_t, not __locale_t https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=af85385).  Some patches are skipped because there are no relevant changes:

--------

SKIPPED PATCHES:

0001 Implement alternative month names - same as in comment 83

--------

This is the new patch 0002.  Difference between the patch from comment 84 and this one: rebased against the current master, reflects the change to use locale_t rather than __locale_t.
Comment 101 Rafal Luzynski 2017-06-28 00:21:26 IST
Created attachment 10226 [details]
Rebuild abilists to reflect nl_langinfo changes (version 8)

This is the new patch 0003.  Difference between the patch from comment 97 and this one: rebased against the current master (new functions have been added by previously accepted patches).

--------

SKIPPED PATCH:

0004 Add tests for alternative month names - same as in comment 73
Comment 102 Rafal Luzynski 2017-06-28 00:24:37 IST
Created attachment 10227 [details]
Proposed solution: implement the %OB format specifier (version 8)

Difference between the patch from comment 86 and this one: rebased against the current master.
Comment 103 Rafal Luzynski 2017-06-28 00:29:10 IST
Created attachment 10228 [details]
Provide backward compatibility for strftime family (version 8)

Difference between the patch from comment 98 and this one: rebased against the current master, uses locale_t instead of __locale_t.
Comment 104 Rafal Luzynski 2017-06-28 00:34:10 IST
Created attachment 10229 [details]
Rebuild abilists to reflect strftime family changes (version 8)

Differences between the patch from comment 99 and this one: rebased against the current master (new functions added by the previously accepted patches).

--------

SKIPPED PATCH:

0008 Let alternative month names be a copy of regular ones - same as in comment 90
Comment 105 Rafal Luzynski 2017-06-28 00:36:10 IST
Created attachment 10230 [details]
Also implement abbreviated alternative month names and %Ob (version 8)

Differences between the patch from comment 91 and this one: rebased against the current master.
Comment 106 Rafal Luzynski 2017-06-28 00:39:31 IST
Created attachment 10231 [details]
Backward compatibility for abbreviated alternative month names and %Ob (version 8)

Differences between the patch from comment 92 and this one: rebased against the current master, uses locale_t rather than __locale_t.

--------

SKIPPED PATCHES:

All other patches are skipped because they have not been changed:

0011 pl_PL: Add alternative month names - same as in comment 77
0012 ru_RU: Add alternative month names - same as in comment 93
0013 uk_UA: Add alternative month names - same as in comment 79
0014 cs_CZ: Add alternative month names - same as in comment 80
0015 Genitive month names imported from CLDR - same as in comment 94
0016 Month names imported from CLDR (upper/lower case) - same as in comment 95
Comment 107 Rafal Luzynski 2017-09-19 10:04:19 IST
Created attachment 10425 [details]
Correct the size of _nl_value_type_LC_... arrays (v9)

This is the new series of patches, attempting to be prepared for the final commit.

This patch is not directly related to this bug but it needs to be pushed before the other patches.  It is more related to bug 356.  I was asked to split it out.  Previously these changes were included in the patch attached in comment 83.
Comment 108 Rafal Luzynski 2017-09-19 10:10:26 IST
Created attachment 10426 [details]
Implement alternative month names (v9)

This patch merges everything related to adding the alternative month names: nl_langinfo(ALTMON_…) and strftime("%OB").  At the same time it splits out the changes from the previous comment and some automatically generated files.
Comment 109 Rafal Luzynski 2017-09-19 10:13:27 IST
Created attachment 10427 [details]
Regenerate locfile-kw.h from locfile-kw.gperf (v9)

This patch contains changes to the automatically generated file split out from the previous patch because I was asked not to post them to libc-locale mailing list.  But these changes still need to be pushed to the repository so I post them here.
Comment 110 Rafal Luzynski 2017-09-19 10:17:57 IST
Created attachment 10428 [details]
Also implement abbreviated alternative month names and %Ob (v9)

Differences since the patch posted in the comment 105:
- in the default C locale, default abbreviated alternative month names initialized explicitly to "Jan", "Feb", etc.,
- changes to an automatically generated file split out,
- ChangeLog updated.
Comment 111 Rafal Luzynski 2017-09-19 10:19:19 IST
Created attachment 10429 [details]
Again regenerate locfile-kw.h from locfile-kw.gperf (v9)

Again the changes to an automatically generated file split out from the previous patch.
Comment 112 Rafal Luzynski 2017-09-19 10:20:45 IST
Created attachment 10430 [details]
Documentation to the above changes(v9)

This patch is new, updates the local documentation: NEWS and texinfo files.
Comment 113 Rafal Luzynski 2017-09-19 10:23:30 IST
What will happen with other patches:

- backward compatibility patches most probably will be rejected, please ignore for now,
- updates to the locale data: probably will be regenerated and pushed later, for now please treat them as sample data necessary to actually use the newly added features and see the difference.
Comment 114 Rafal Luzynski 2017-11-16 01:49:44 IST
Created attachment 10592 [details]
Implement alternative month names (v10)

Here is the new series of patches.  Changes:

- Fixed bug in strptime(), thanks Zack Weinberg for fixing it.
- Improved documentation, again thanks Zack Weinberg.
- Rebased to the current master.
Comment 115 Rafal Luzynski 2017-11-16 01:52:23 IST
Created attachment 10593 [details]
Regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v10)
Comment 116 Rafal Luzynski 2017-11-16 01:53:49 IST
Created attachment 10594 [details]
Also implement abbreviated alternative month names and %Ob (v10)
Comment 117 Rafal Luzynski 2017-11-16 01:55:12 IST
Created attachment 10595 [details]
Again regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v10)
Comment 118 Rafal Luzynski 2017-11-16 02:01:07 IST
Created attachment 10596 [details]
Documentation to the above changes(v10)

--------

SKIPPED PATCHES:

0006 Provide backward compatibility for nl_langinfo family
0007 Rebuild abilists to reflect nl_langinfo changes
0008 Provide backward compatibility for strftime family
0009 Rebuild abilists to reflect strftime family changes
0010 Backward compatibility for abbreviated alternative month names and %Ob

Most probably they will be dropped so I think there is no need to maintain them.
Those who want their rebased version can find them on github.
Comment 119 Rafal Luzynski 2017-11-16 02:03:02 IST
Created attachment 10597 [details]
Alternative month names NLS data (Polish, v10)
Comment 120 Rafal Luzynski 2017-11-16 02:04:10 IST
Created attachment 10598 [details]
Alternative month names NLS data (Russian, v10)
Comment 121 Rafal Luzynski 2017-11-16 02:04:59 IST
Created attachment 10599 [details]
Alternative month names NLS data (Ukrainian, v10)
Comment 122 Rafal Luzynski 2017-11-16 02:06:37 IST
Created attachment 10600 [details]
Alternative month names NLS data (Czech, v10)

Note: Czech language community may not like these changes.  This patch should not be committed without asking them about their opinion.
Comment 123 Rafal Luzynski 2017-11-16 02:08:18 IST
Created attachment 10601 [details]
Import genitive month names from CLDR (v10)
Comment 124 Rafal Luzynski 2017-11-16 02:09:36 IST
Created attachment 10602 [details]
Import uppercase/lowercase month names from CLDR (v10)
Comment 125 Rafal Luzynski 2018-01-13 10:39:03 IST
Created attachment 10731 [details]
Implement alternative month names (v11)

Here is the new series posted to libc-alpha on January 9.  I upload it here mostly for completeness and to facilitate tracking changes.  There are no major changes in this series, mostly rebased to the current master.
Comment 126 Rafal Luzynski 2018-01-13 10:42:15 IST
Created attachment 10732 [details]
Regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v11)
Comment 127 Rafal Luzynski 2018-01-13 10:43:56 IST
Created attachment 10733 [details]
Also implement abbreviated alternative month names and %Ob (v11)
Comment 128 Rafal Luzynski 2018-01-13 10:45:26 IST
Created attachment 10734 [details]
Again regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v11)
Comment 129 Rafal Luzynski 2018-01-13 10:49:21 IST
Created attachment 10735 [details]
Documentation to the above changes (v11)
Comment 130 Rafal Luzynski 2018-01-13 10:51:31 IST
Created attachment 10736 [details]
Alternative month names NLS data (Polish, v11)

This patch is a little reordered, the mon array is placed immediately after the alt_mon array (rather than at the end of the LC_TIME section).
Comment 131 Rafal Luzynski 2018-01-13 10:54:40 IST
Created attachment 10737 [details]
Implement alternative month names (v12)

Here is the new series posted to libc-alpha on January 12.  I upload it here for completeness and to facilitate tracking changes.  This new series contains documentation updates (commit comments, changelog, etc.), thank you Dmitry and Carlos for your recent reviews and remarks.
Comment 132 Rafal Luzynski 2018-01-13 10:55:34 IST
Created attachment 10738 [details]
Regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v12)
Comment 133 Rafal Luzynski 2018-01-13 10:56:21 IST
Created attachment 10739 [details]
Also implement abbreviated alternative month names and %Ob (v12)
Comment 134 Rafal Luzynski 2018-01-13 10:57:06 IST
Created attachment 10740 [details]
Again regenerate locfile-kw.h from locfile-kw.gperf + ChangeLog (v12)
Comment 135 Rafal Luzynski 2018-01-13 10:57:58 IST
Created attachment 10741 [details]
Documentation to the above changes (v12)
Comment 136 Rafal Luzynski 2018-01-13 11:00:54 IST
Created attachment 10742 [details]
Alternative month names NLS data (Polish, v12)

This version also contains an example of parsing a date (strptime) with the month name in the genitive case (required for correctness) in the Polish language in the test file.

Also the previous patches contain some more date parsing test cases.