This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug localedata/10871] ru_RU: 'mon' array should contain both nominative and genitive cases


https://sourceware.org/bugzilla/show_bug.cgi?id=10871

Rafal Luzynski <digitalfreak at lingonborough dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |digitalfreak@lingonborough.
                   |                            |com

--- Comment #7 from Rafal Luzynski <digitalfreak at lingonborough dot com> ---
I'll be happy to provide a complete solution for this problem but some API
design questions must be answered first.

Please note that CLDR distinguishes only two forms of a month name: the
"standalone" form, which is probably always nominative, and the "format" form,
which may be identical to "standalone" (e.g., in English) but is genitive in
some languages and may be yet another case in others. For simplicity I will
refer to these forms as nominative/genitive, keeping in mind that CLDR calls
them standalone/format. Some languages may use forms other than
nominative/genitive, but I think two forms are probably always enough, since
CLDR has decided to consider only two.


I. strftime() - http://linux.die.net/man/3/strftime

This function supports only one conversion specifier which provides the full
month name: %B. At the moment there is no way for this function to provide
multiple forms of the full month name. Here are the API designs which would
make that possible:

1. Do not change the API; implement an internal algorithm which analyzes the
whole format string and determines whether %B should produce the month name in
the nominative or the genitive case. The simplest algorithm would check whether
a %d or %e conversion specifier is also present in the same format string and
use the genitive case if it is, the nominative case otherwise. A more advanced
version could also check whether the day and month conversion specifiers are
adjacent, whether they are separated by other conversion specifiers or by
space/punctuation/other characters, whether other letters are concatenated
with %B (which would mean that the caller already tries to work around this
bug), and whether the day/month order is correct (this check is valid only if
day/month order is correct and month/day order is incorrect in all these
languages).
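
A minimal sketch of the simplest variant described above, only to make the
idea concrete. The function name format_wants_genitive() is hypothetical and
nothing like it exists in glibc; the set of flag characters skipped is the one
glibc's strftime() accepts, and the more advanced checks (adjacency, ordering,
concatenated letters) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: returns true when the format
   string contains a day-of-month specifier (%d or %e), which the
   simplest heuristic treats as a request for the genitive month name.  */
bool
format_wants_genitive (const char *fmt)
{
  assert (fmt != NULL);

  for (const char *p = fmt; *p != '\0'; ++p)
    {
      if (*p != '%')
        continue;

      ++p;
      /* Skip flag characters, an optional field width and the E/O
         modifiers so that e.g. %-d, %02d and %Od are recognized.  */
      while (*p != '\0' && strchr ("_-0^#", *p) != NULL)
        ++p;
      while (*p >= '0' && *p <= '9')
        ++p;
      if (*p == 'E' || *p == 'O')
        ++p;

      if (*p == '\0')
        break;          /* Trailing '%': nothing more to inspect.  */
      if (*p == 'd' || *p == 'e')
        return true;
      /* Any other character, including the second '%' of "%%", is
         consumed here so it is not mistaken for a new specifier.  */
    }
  return false;
}
```

With this sketch "%e %B %Y" and "%-d %B" would select the genitive case, while
"%B %Y" and a literal "%%d" would keep the nominative case.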

Pros:
- once implemented correctly it will automagically fix all affected
applications,
- even if the implementation is not perfect for some languages, the result
will be no worse than the current one: it will not break any currently correct
application,
- if it turns out that this solution is completely wrong it will be easy to
revert it and provide another one because we don't change the API.

Cons:
- may be difficult to implement,
- it is questionable whether a perfect algorithm exists for all affected
languages; even if we verify it for all languages mentioned in comment 6, there
may be other languages which we don't know about and which also require the
nominative/genitive distinction but use different rules,
- it is questionable how to handle format strings which are incorrect from a
grammatical point of view: please note that the strftime() API does not and
should not say that any combination of conversion specifiers is illegal.

2. Follow the specification already used in the *BSD family (which also
includes OS X and iOS):
https://www.freebsd.org/cgi/man.cgi?query=strftime&sektion=3. They implement
the %OB conversion specifier, which retrieves the nominative case, while the
%B specifier retrieves the genitive case (sic!).
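
To illustrate what callers would write under the *BSD convention, here is a
small sketch. The wrapper names format_full_date() and format_month_header()
are hypothetical; the second one assumes a C library whose strftime() accepts
%OB, which the glibc discussed in this bug does not yet do, so its output is
implementation-dependent elsewhere.

```c
#include <assert.h>
#include <time.h>

/* Illustrative wrappers (hypothetical names) written against the *BSD
   convention: %B is the month name used inside a full date, %OB is the
   standalone (nominative) month name.  */

/* Full date such as "01 January 2015"; in an affected locale %B would
   yield the genitive form here.  */
size_t
format_full_date (char *buf, size_t size, const struct tm *tm)
{
  assert (buf != NULL && tm != NULL);
  return strftime (buf, size, "%d %B %Y", tm);
}

/* Calendar header such as "January 2015"; assumes that the C library
   supports the %OB conversion specifier.  */
size_t
format_month_header (char *buf, size_t size, const struct tm *tm)
{
  assert (buf != NULL && tm != NULL);
  return strftime (buf, size, "%OB %Y", tm);
}
```

An application which today uses "%B %Y" for a calendar header would have to
switch to the second form, which is exactly the migration cost listed in the
cons of this option.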

Pros:
- full portability between glibc and *BSD,
- simple and deterministic implementation,
- full programmer control over whether they want a nominative or genitive case,
- will automagically fix all dates (full dates) which use the %B conversion
specifier and currently display the nominative case incorrectly.

Cons:
- at the same time will break formatting of all dates using %B conversion
specifiers where the nominative case is required and is correctly provided now,
the application developer may not even be aware that the application became
broken in some languages,
- therefore will require urgent intervention from some application developers,
- it will be difficult or even impossible to provide a backward compatible
solution which would detect if the current runtime version of glibc requires
%OB or %B for the month name in nominative case,
- one may question if the *BSD decision to retrieve a genitive case from %B is
correct since it causes so much trouble.

3. Mimic the *BSD specification but implement it conversely: let %B retrieve
the nominative case (as it currently does) and let the new %OB specifier
retrieve the genitive case. See also
http://austingroupbugs.net/view.php?id=258, which seems to have accepted this
solution.

Pros:
- simple and deterministic implementation,
- full programmer control over whether they want a nominative or genitive case,
- full backward compatibility,
- will not break any existing application.

Cons:
- portability with the *BSD family will never be possible (a war of format
specifiers),
- will require intervention from the application developers, but it will not be
urgent because it will apply only to the cases where they use %B explicitly and
this is already incorrect.

I would choose the first solution: do not change the API and try to provide a
smart algorithm which determines whether the month name retrieved by %B should
be nominative or genitive. However, I will listen to your opinion.


II. nl_langinfo() - http://linux.die.net/man/3/nl_langinfo

Although strftime() does not call nl_langinfo() directly, both functions use
the same backend database. We will need new constants to be defined in
langinfo.h, for example ALTMON_{1-12} and their wide-character equivalents
_NL_WALTMON_{1-12}. This means the API of nl_langinfo() will be affected by
the addition of new valid argument values. Please note that I am talking in the
context of https://bugzilla.gnome.org/show_bug.cgi?id=749206, and the
implementation of g_date_time_printf() does call nl_langinfo() to retrieve the
month names. I hope it is valid to add these new symbols after
_NL_TIME_CODESET under the names given above.
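
A sketch of how a caller might consume such constants, under the assumption
that langinfo.h exposes ALTMON_1..ALTMON_12 as preprocessor-visible names. The
helper full_month_name() is hypothetical, and which grammatical case the ALTMON
items would carry is still an open question here, so the code only speaks of
an "alternative" form. When the constants are absent it falls back to the
existing MON_1..MON_12 items.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE    /* In case the extension is hidden behind it.  */
#endif
#include <assert.h>
#include <langinfo.h>

/* Hypothetical helper: returns the full name of month 0..11 from the
   current LC_TIME locale.  When ALT is nonzero and the header defines
   the proposed ALTMON_1..ALTMON_12 items, the alternative form is
   returned; otherwise the existing MON_1..MON_12 items are used.  */
const char *
full_month_name (int month, int alt)
{
  /* Lookup tables avoid assuming that the items are consecutive.  */
  static const nl_item mon[12] = {
    MON_1, MON_2, MON_3, MON_4, MON_5, MON_6,
    MON_7, MON_8, MON_9, MON_10, MON_11, MON_12
  };
#ifdef ALTMON_1
  static const nl_item altmon[12] = {
    ALTMON_1, ALTMON_2, ALTMON_3, ALTMON_4, ALTMON_5, ALTMON_6,
    ALTMON_7, ALTMON_8, ALTMON_9, ALTMON_10, ALTMON_11, ALTMON_12
  };
#endif

  assert (month >= 0 && month < 12);

#ifdef ALTMON_1
  if (alt)
    return nl_langinfo (altmon[month]);
#else
  (void) alt;           /* No alternative items: use MON_* for both.  */
#endif
  return nl_langinfo (mon[month]);
}
```

A library such as the one in the GNOME bug could use the same compile-time
check to stay buildable against a langinfo.h that lacks the new items.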


