This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: Is it OK to write ASCII strings directly into locale source files?

From: Carlos O'Donell <carlos at redhat dot com>
To: Mike FABIAN <mfabian at redhat dot com>
Cc: Florian Weimer <fw at deneb dot enyo dot de>, Andreas Schwab <schwab at suse dot de>, libc-alpha at sourceware dot org
Date: Tue, 25 Jul 2017 08:17:44 -0400
Subject: Re: Is it OK to write ASCII strings directly into locale source files?
Authentication-results: sourceware.org; auth=none
References: <s9d8tje9e1k.fsf@redhat.com> <5f71f2f6-be0e-2b5d-91ce-03386eafa7f7@redhat.com> <mvmy3rdx577.fsf@suse.de> <87h8y13gvb.fsf@mid.deneb.enyo.de> <e43a088a-cb33-c322-7587-c20d993e7fa6@redhat.com> <87379lczdi.fsf@mid.deneb.enyo.de> <7fa0552d-c24b-3c5c-cad3-1359eb4dd6bd@redhat.com> <s9dbmo9xcjq.fsf@redhat.com>

On 07/25/2017 02:20 AM, Mike FABIAN wrote:
> Carlos O'Donell <carlos@redhat.com> wrote:
> 
>> My only argument is that when you are forced to use <Uxxx> encoding it
>> is empirically less likely you'll make a mistake. Like reading a sentence
>> backwards to catch errors since it prevents your brain from filling in
>> the missing information.
> 
> But there are also many mistakes because somebody mistyped code points.
> Several weird typos in things like month names look as if somebody
> mistyped code points.

Ultimately I defer to your judgement as localedata maintainer to create
a workflow that is easy for you and benefits your work.

However, I caution against throwing away the compatibility of our locales
with POSIX, whose specification doesn't seem to allow UTF-8 in locale
source files.

I would suggest the following:

(a) Documentation:

    File an Austin Group bug to adjust the text of the standard to allow
    what we want, effectively documenting the de facto glibc standard,
    which uses UTF-8.

(b) New process:

    Post-process the locale source before commit, and enforce that there
    is an auto-generated comment containing either the UTF-8 or the code
    points for the author to review before commit. If we wrote UTF-8
    in a special markup comment and auto-generated the locale entry
    with code points, then we would remain mostly compatible with POSIX
    and with what we have today (less churn for user tools).
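
A rough sketch of what the post-processing in (b) could look like, not a
proposal for the actual tool. The "% utf8:" comment markup and the helper
names are made up for illustration, and "%" is assumed as the comment
character. The <Uxxxx> form uses four hex digits, or eight for characters
outside the BMP.

```python
import re

# Matches one <Uxxxx> or <Uxxxxxxxx> code point in locale source.
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def to_ucs_notation(text: str) -> str:
    """Encode each character as <UXXXX>, or <UXXXXXXXX> beyond the BMP."""
    return "".join(
        f"<U{ord(ch):04X}>" if ord(ch) <= 0xFFFF else f"<U{ord(ch):08X}>"
        for ch in text
    )


def from_ucs_notation(encoded: str) -> str:
    """Decode <Uxxxx> sequences back to readable text for review."""
    return _UCS.sub(lambda m: chr(int(m.group(1), 16)), encoded)


def locale_entry(keyword: str, text: str, comment_char: str = "%") -> str:
    """Emit a review comment holding the UTF-8 text (hypothetical markup),
    followed by the auto-generated entry written with code points."""
    comment = f'{comment_char} utf8: {keyword} "{text}"'
    entry = f'{keyword} "{to_ucs_notation(text)}"'
    return f"{comment}\n{entry}"
```

With this, to_ucs_notation("M\u00e4rz") gives
"<U004D><U00E4><U0072><U007A>", and from_ucs_notation() turns that back
into the readable string, so a commit check could verify that the comment
and the entry agree.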

-- 
Cheers,
Carlos.

Follow-Ups:
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Florian Weimer

References:
- Is it OK to write ASCII strings directly into locale source files?
  - From: Mike FABIAN
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Carlos O'Donell
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Andreas Schwab
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Florian Weimer
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Carlos O'Donell
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Florian Weimer
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Carlos O'Donell
- Re: Is it OK to write ASCII strings directly into locale source files?
  - From: Mike FABIAN
