This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


[PING^9][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]


ping


‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, June 17, 2019 10:59 AM, Diego (Egor) Kobylkin <egor@kobylkin.com> wrote:

> Carlos,
> 

> we seem to have a consensus among everyone involved that the patch can be committed as is.
> Do you see it the same way, or are there any further questions or suggestions?
> 

> Best,
> Egor
> 

> P.S. Just a clarification of Rafal's points below, and thanks to Rafal for the intensive "peer review" so far!
> It definitely looks to me like, after all the issues discussed, we no longer have any points of disagreement.
> 

> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Tuesday, June 11, 2019 12:40 AM, Rafal Luzynski <digitalfreak@lingonborough.com> wrote:
> ...
> 

> > 7.06.2019 14:59 "Diego (Egor) Kobylkin" <egor@kobylkin.com> wrote:
> > 

> > > But the target system doesn't support Russian locale and so you must
> > > transliterate the filenames.
> > 

> > While talking about the filesystem: I think the problem is not
> > that it does not support Russian locale but that it tries to
> > handle it and fails at this. If the filesystem accepted any
> > byte string as a file name wouldn't it accept a byte string which
> > constructs correct Cyrillic characters in UTF-8, without any
> > transliteration?
> 

> Just to clarify: what matters in this example is the need to transliterate, not the actual cause of that need.
> A lot of "things" don't support UTF-8 or Cyrillic: filesystems, some UNIX power tools, older network appliances, databases, key-value stores, etc. We are talking about a situation where you are forced to transliterate to ASCII, so that requirement is a given.
> 

> ...
> 

> > > In glibc we don't have any framework for an intelligent conversion.
> > > We would have to write specific code to handle this case and add
> > > it into the translit code for special handling in this case.
> > 

> > My suggestion was to add such an intelligent conversion. The rule
> > should be simple: if a letter is followed by a lowercase it should
> > be a titlecase (Sh), otherwise it should be uppercase (SH). But
> > this may break Egor's requirement to keep them always uppercase.
> 

> Again, for the record, my "requirement" is to have a minimal patch committed sooner rather than later. It has turned out to be surprisingly difficult to keep our focus even on the single flat mapping table that the ASCII transliteration really is.
> 

> > > I think we should today leave "Ш"->"SH" and "Сх"->"Sh", since it's
> > > the most conservative position that avoids ambiguity, and then we
> > > can discuss the aesthetics of this and the other impacts and solutions.
> > > I appreciate Rafal's position, but I think being conservative here,
> > > even if it's not as pretty as uconv, is a good guiding idea.
> > 

> > Just to summarize: if you want to apply the relaxed rules, more
> > technical than linguistic, then I am more willing to accept these
> > patches.
> 

> The great thing is that we seem to have a consensus now and can proceed.
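
For readers of the archive, the two approaches discussed above can be contrasted in a short Python sketch: a flat, context-free mapping table (the conservative choice, with "Ш" always becoming "SH") and the suggested rule that an uppercase letter followed by a lowercase letter is rendered in titlecase ("Sh"). This is only an illustration under stated assumptions. The table below is a tiny hypothetical subset and is not the table from the patch; only the "Ш" -> "SH" entry is taken from the thread, and the function names are made up for this example. glibc itself implements transliteration through locale data, not through code like this.

```python
# Hypothetical, illustrative subset of a Cyrillic -> ASCII table.
# Only "Ш" -> "SH" comes from the thread; every other entry is an assumption.
FLAT_TABLE = {
    "Ш": "SH", "ш": "sh",
    "А": "A", "а": "a",
    "П": "P", "п": "p",
    "К": "K", "к": "k",
}


def translit_flat(text):
    """Context-free mapping: each character is replaced on its own."""
    return "".join(FLAT_TABLE.get(ch, ch) for ch in text)


def translit_titlecase(text):
    """Same table, plus the suggested casing rule.

    If an uppercase letter maps to several ASCII letters and the next
    character is lowercase, emit titlecase ("Sh"); otherwise keep the
    table's uppercase form ("SH").
    """
    out = []
    for i, ch in enumerate(text):
        repl = FLAT_TABLE.get(ch, ch)
        nxt = text[i + 1:i + 2]  # empty string at the end of the input
        if len(repl) > 1 and ch.isupper() and nxt.islower():
            repl = repl.capitalize()
        out.append(repl)
    return "".join(out)


if __name__ == "__main__":
    for word in ("Шапка", "ШАПКА"):
        print(word, translit_flat(word), translit_titlecase(word))
```

The flat version needs no lookahead, which is why it fits into a plain mapping table; the titlecase rule requires inspecting the following character, which is the kind of special handling the thread says glibc has no framework for today.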

Attachment: publickey - egor@kobylkin.com - 0x01FEB4E8.asc
Description: application/pgp-keys

Attachment: signature.asc
Description: OpenPGP digital signature

