This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH V4][BZ #18441] fix sorting multibyte charsets with an improper locale


On Tue, Mar 29, 2016 at 6:05 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Tue, 29 Mar 2016, Carlos O'Donell wrote:
>
>> I believe this is technically inaccurate since it allows all 4-byte
>> sequences, when in reality the limit is at U+10FFFF?
>
> That glibc accepts UTF-8 according to the definition in the 2003 edition
> of ISO 10646 rather than the definition in the 2011 and later editions is
> a known issue.  I've filed bug 19883 for it since I couldn't find an
> existing bug report in Bugzilla.
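
For readers who want the arithmetic behind the quoted point: a 4-byte
UTF-8 sequence has 21 payload bits, so its bit layout can express values
up to U+1FFFFF, well past the U+10FFFF limit. The sketch below is only an
illustration and is not glibc code; the names `max_codepoint` and
`decodes_strictly` are made up for this example, and Python's built-in
decoder stands in for a decoder that enforces the newer limit.

```python
# Illustrative sketch only; not taken from glibc.
UNICODE_MAX = 0x10FFFF  # upper limit in the 2011 and later definitions


def max_codepoint(nbytes: int) -> int:
    """Largest value the payload bits of an nbytes-long sequence can hold."""
    if nbytes == 1:
        return 0x7F
    # The lead byte carries (7 - nbytes) payload bits and each
    # continuation byte carries 6.
    payload_bits = (7 - nbytes) + 6 * (nbytes - 1)
    return (1 << payload_bits) - 1


def decodes_strictly(data: bytes) -> bool:
    """True if Python's UTF-8 decoder, which enforces U+10FFFF, accepts data."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


four_byte_max = max_codepoint(4)

# F4 8F BF BF encodes U+10FFFF, the last permitted code point.
last_valid = bytes([0xF4, 0x8F, 0xBF, 0xBF])
# F4 90 80 80 would encode U+110000, the first value past the limit.
first_invalid = bytes([0xF4, 0x90, 0x80, 0x80])

print(hex(four_byte_max), four_byte_max > UNICODE_MAX)
print(decodes_strictly(last_valid), decodes_strictly(first_invalid))
```

A decoder that checks only the 4-byte bit pattern would accept the second
sequence; one that follows the newer definition has to reject it.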

Note that the U+10FFFF limit equates to a Y2541 bug: at the present
(post-2000) rate of codepoint assignment, the space would not run out
until around the year 2541.  See
https://gist.github.com/zackw/f2e74a8d7b31baa88002 for the calculations
and a pretty graph.

zw

