This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Should glibc provide a builtin C.UTF-8 locale?
- From: Paul Eggert <eggert at cs dot ucla dot edu>
- To: Mike FABIAN <mfabian at redhat dot com>, keld at keldix dot com
- Cc: Carlos O'Donell <carlos at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, Pravin Satpute <psatpute at redhat dot com>, Jens Petersen <petersen at redhat dot com>
- Date: Tue, 27 Oct 2015 14:02:03 -0700
- Subject: Re: Should glibc provide a builtin C.UTF-8 locale?
- Authentication-results: sourceware.org; auth=none
- References: <54DB8243 dot 3050903 at redhat dot com> <20151021174936 dot GA26317 at vapier dot lan> <5627DAAE dot 8060703 at redhat dot com> <20151021205540 dot GA30739 at www5 dot open-std dot org> <s9dr3kgfqlx dot fsf at ari dot site>
On 10/27/2015 05:22 AM, Mike FABIAN wrote:
> Do we care how a C.UTF-8 locale sorts outside of the ASCII range?
I think people will care about this. I expect the preference for the C
locale will be to use Unicode code point order -- at least, that's been
my experience when helping to maintain GNU apps. Others might disagree....
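
To make "code point order" concrete, here is a minimal sketch in C. It
relies on a property of UTF-8: for well-formed strings, comparing bytes
as unsigned values gives the same result as comparing code points, and
strcmp is specified to compare bytes as unsigned char. The function name
cmp_codepoint_order is made up for illustration; this is not a statement
of what glibc does or should do.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: compare two well-formed UTF-8 strings in
   Unicode code point order.  Because UTF-8 preserves code point order
   under unsigned bytewise comparison, plain strcmp is enough here.
   Behavior for ill-formed input is simply byte order.  */
int
cmp_codepoint_order (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  return strcmp (a, b);
}
```

With this ordering, "Z" sorts before "a", "z" sorts before U+00E9, and
U+FFFF sorts before U+10000 (unlike a comparison of UTF-16 code units).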
What does (or should) collation do with encoding errors? This is not a
problem for a unibyte C locale that simply uses byte values, but it will
be a problem for C.UTF-8. Should encoding-error bytes be sorted after
the code points, or before the code points, or what? POSIX doesn't say
what to do with encoding errors, but glibc should do something reasonable.
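
As one possible answer, here is a hedged sketch of the "after the code
points" option. Each well-formed UTF-8 sequence is keyed by its code
point, and each byte that cannot be decoded is keyed by 0x110000 plus
its byte value, so it lands after U+10FFFF. The names next_key,
cmp_c_utf8 and ERROR_KEY_BASE are invented for this example, and the
policy is only an assumption for illustration; POSIX does not mandate
it and it is not taken from glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed policy: an undecodable byte B gets the key 0x110000 + B,
   which places it after every Unicode code point (max 0x10FFFF).  */
#define ERROR_KEY_BASE 0x110000u

/* Return the sort key at *p and advance *p past it.
   *p must not point at the terminating NUL.  */
static uint32_t
next_key (const unsigned char **p)
{
  const unsigned char *s = *p;
  uint32_t c = s[0];
  uint32_t min;   /* smallest code point allowed for this length */
  int extra;      /* number of continuation bytes expected */

  if (c < 0x80)
    { *p = s + 1; return c; }
  else if (c >= 0xC2 && c <= 0xDF)
    { extra = 1; min = 0x80; c &= 0x1F; }
  else if ((c & 0xF0) == 0xE0)
    { extra = 2; min = 0x800; c &= 0x0F; }
  else if (c >= 0xF0 && c <= 0xF4)
    { extra = 3; min = 0x10000; c &= 0x07; }
  else
    { *p = s + 1; return ERROR_KEY_BASE + s[0]; }

  for (int i = 1; i <= extra; i++)
    {
      /* A NUL also fails this test, so we never read past the end.  */
      if ((s[i] & 0xC0) != 0x80)
        { *p = s + 1; return ERROR_KEY_BASE + s[0]; }
      c = (c << 6) | (s[i] & 0x3F);
    }

  /* Reject overlong forms, surrogates, and values above U+10FFFF.  */
  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    { *p = s + 1; return ERROR_KEY_BASE + s[0]; }

  *p = s + extra + 1;
  return c;
}

/* Compare two NUL-terminated byte strings key by key.  */
int
cmp_c_utf8 (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  const unsigned char *pa = (const unsigned char *) a;
  const unsigned char *pb = (const unsigned char *) b;

  while (*pa != 0 && *pb != 0)
    {
      uint32_t ka = next_key (&pa);
      uint32_t kb = next_key (&pb);
      if (ka != kb)
        return ka < kb ? -1 : 1;
    }
  /* The shorter string sorts first when one is a prefix of the other.  */
  return (*pa != 0) - (*pb != 0);
}
```

Under this policy a stray byte 0x80 sorts after U+10FFFF, whereas a
plain bytewise comparison would put it before every multibyte
character. Sorting errors before the code points would only require a
different key mapping in next_key.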