This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



PING Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30




On 11.03.19 14:59, Egor Kobylkin wrote:


On 04.03.19 23:11, Egor Kobylkin wrote:
ping

On 14.02.19 17:48, Marko Myllynen wrote:
Hi Carlos, Mike, Rafal,

It seems clear that you all are currently too busy to have a look at
this but would you have any estimate when you might be able to review
this so that we could consider merging?

FWIW, I chatted with Egor off-list and we're on the same page wrt the
following; hopefully this gives you a bit of a jump start on this
subject when you have time to dig deeper:

1) The built-in C locale doesn't read/use any translit_* files, can't
have any fallback mechanisms, and only supports ASCII, so using GOST
7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
be the appropriate way to implement Cyrillic transliteration for the
built-in C locale (it adds some 8KB to the binary).

2) Other locales read/use translit_* files, and with them fallbacks and
non-ASCII output are possible, so it would seem preferable to first try
ISO 9 / GOST 7.79 System A and only if that fails use GOST 7.79 System B
(in which case the end result should match the built-in C locale).
For this the translit_cyrillic file should be added (as per patch v9 +
changes mentioned in patches v10 and v12).

3) Individual locale files can then be updated to use translit_cyrillic
as appropriate (see patch v9), and language/national specific conventions
(e.g., SFS 4900 for fi_FI) can be applied on a per-locale basis.

Thanks,

On 04/02/2019 09.14, Egor Kobylkin wrote:
Carlos,
are you comfortable picking this up again this month?

I would really love to have a reliable action plan to get this committed
for 2.30. Maybe cut out a subset that is undisputed and commit only that
first. It looks kinda like an eternal moving target otherwise.

For your reference:
https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html

Bests,
Egor Kobylkin

On 09.01.19 21:03, Marko Myllynen wrote:
Hi,

On 09/01/2019 02.46, Egor Kobylkin wrote:
On 07.01.19 21:37, Marko Myllynen wrote:
On 05/01/2019 23.12, Egor Kobylkin wrote:

Good catch! Should we maybe split this into two patches, one for C and
the other for "country" locales? They have different codes and
functionality so it looks like it would be easier to keep focus.

That would probably make sense, the standard C/POSIX locale won't
support System A so it also narrows down solution alternatives with it.

"Country" locales in localedata/locales/ can then have the exact same
translit table included or they can have any other flavor - I don't see
a problem here.

Indeed, and since those files are not limited to ASCII, perhaps we could
now reconsider the v9 approach for them, i.e., prefer System A if
possible, otherwise use System B / ASCII (just need to make sure that
the ASCII fall-back for them will match the built-in C ASCII rule)?
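
The "make sure the ASCII fall-back matches" condition could be checked
mechanically. The following Python sketch is only an illustration under
stated assumptions: both tables are hypothetical three-letter excerpts
written by hand, not parsed from C-translit.h.in or translit_cyrillic,
and the function name mismatches is made up for this example.

```python
# Hypothetical excerpts; the real data lives in the glibc sources.
C_LOCALE_ASCII = {"ж": "zh", "ч": "ch", "ш": "sh"}
TRANSLIT_CYRILLIC = {
    "ж": ["ž", "zh"],
    "ч": ["č", "ch"],
    "ш": ["š", "sh"],
}


def mismatches(c_table, locale_table):
    """Return the characters whose first ASCII-only candidate in the
    locale table differs from the built-in C rule (or is missing)."""
    bad = []
    for ch, ascii_rule in c_table.items():
        candidates = locale_table.get(ch, [])
        ascii_only = [c for c in candidates if c.isascii()]
        if ascii_only[:1] != [ascii_rule]:
            bad.append(ch)
    return bad


print(mismatches(C_LOCALE_ASCII, TRANSLIT_CYRILLIC))
```

An empty list means every ASCII fall-back agrees with the C rule; any
character that is returned needs its fall-back entry fixed.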

Happy to hear the split seems to be a clear cut one.
How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
(number) and title for clarity in communication?

I'm not sure a new BZ is really needed for such an addition; perhaps a
NEWS entry might be more appropriate (with the full details explained in
the commit messages of course) but I'll leave this to others to decide.

This way it would probably be easier to have the decision-making process
tied up for both patches (separately). We may then want to get the v12
POSIX patch out of the door in 2.30 and can take all the time we need to
set up the rules for "Countries" locales as you need them to be.

Perhaps Rafal or Carlos have better suggestions but I would think we
could have a patch series where patch 1/3 adds the C/POSIX locale
part (that would be what you posted as v12), then patch 2/3 adds
translit_cyrillic (based on your v9, so it supports ISO 9:1995 / GOST
7.79 System A, with GOST 7.79 System B as a fall-back (which would match
the C/POSIX rules)), and finally patch 3/3 updates locales to use
translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
alternative suggestions so it might be best to wait for their feedback
before doing anything yet (it's unfortunate you've had to do so many
iterations around this already but I think we've all learned something
during the process and the end result will be more correct than any of
the earlier versions).

Thanks,




