[RFC] Add new C.UTF-8 locale.
Florian Weimer
fweimer@redhat.com
Mon Jun 29 08:36:35 GMT 2020
* Andreas Schwab:
> On Jun 21 2020, Carlos O'Donell via Libc-alpha wrote:
>
>> + /* Three byte range. */
>> + if (cp >= 0x800 && cp <= 0xffff)
>
> Should that exclude the surrogate area?
I don't think so, for consistency with:
+ Note that old glibc UTF-8 charmap left the surrogates commented out.
+ We keep the surrogate entries because we want to be able to sort the
+ invalid values into a consistent location.
This refers to the entries for <UD800>, not the multibyte sequences.
I think we should aim for consistency between strcoll and wcscoll even
for invalid sequences.
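
To illustrate what the quoted check implies, here is a minimal sketch. It is
not the code from the patch, and encode_three_byte is a hypothetical name.
Because the range test does not exclude 0xd800..0xdfff, a surrogate such as
U+D800 is given the three-byte sequence ED A0 80 like any other code point in
that range:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not taken from the patch.
   Encode CP as a three-byte UTF-8 sequence if it lies in 0x800..0xffff.
   The surrogate range 0xd800..0xdfff is deliberately not excluded,
   mirroring the range check quoted above.  Returns the number of bytes
   written, or 0 if CP is outside the three-byte range.  */
static size_t
encode_three_byte (uint32_t cp, unsigned char out[3])
{
  /* Three byte range.  */
  if (cp >= 0x800 && cp <= 0xffff)
    {
      out[0] = 0xe0 | (cp >> 12);           /* 1110xxxx */
      out[1] = 0x80 | ((cp >> 6) & 0x3f);   /* 10xxxxxx */
      out[2] = 0x80 | (cp & 0x3f);          /* 10xxxxxx */
      return 3;
    }
  return 0;
}
```

Excluding the surrogates here would leave the charmap with <UD800> entries
that have no multibyte counterpart, which is the inconsistency between
strcoll and wcscoll I would like to avoid.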
Thanks,
Florian