This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [WIPv5] Expected behaviour for a-z, A-Z, and 0-9 (Bug 23393).


On 07/30/2018 07:45 PM, Carlos O'Donell wrote:
On 07/30/2018 01:39 PM, Florian Weimer wrote:
On 07/28/2018 03:12 AM, Carlos O'Donell wrote:
On 07/26/2018 10:50 AM, Florian Weimer wrote:
shs_CA: U+0000E6 matches /[a-z]/ unexpectedly
shs_CA: U+0000C6 matches /[A-Z]/ unexpectedly
shs_CA.utf8: U+0000E6 matches /[a-z]/ unexpectedly
shs_CA.utf8: U+0000C6 matches /[A-Z]/ unexpectedly

This is a WIP, because the number of tests now is too big
to simply add them to tst-fnmatch.input, and so I'm writing
a new tester tst-rational-ranges.c. I'm parsing SUPPORTED,
expecting all of the locales to be built for testing, and
then running through all the rational ranges to test
inclusion of the required datums.

Let me repeat my suggestion that we should initially fix the locales
with the common collation order, where glibc 2.28 regresses.

I do not think it is appropriate to release rational range support on
only a subset of the SUPPORTED set of locales. Either we support it on
all SUPPORTED locales or we work until we are ready.

At present glibc 2.28 does not regress because of commit
7cd7d36f1feb3ccacf476e909b115b45cdd46e77 to deinterlace lower and
uppercase.

In glibc 2.28 we simply have ~2500 characters in the range of a-z,
and in 2.27 we had ~250, it's still a large set of non-ASCII characters
accepted by the range, all because we caught up to Unicode 9.0.0 with
the ISO 14651 collation update (and will soon updated to Unicode 10.0.0
with the next release, and probably always lagging a bit).

Ahh.  So it's more complex and a regression longer in the making.

I don't see an urgent need to get rational range support into 2.28.
I was happy to get it in earlier, but now with deeper testing showing
that not all locales are working correctly, I'm not happy to see this
go out the door. I think it will be ready very shortly, and we can check
it in immediately into 2.29, and then continue our work on code point
ranges as the next step, which will require even more testing, and
internal API cleanup.

Sounds reasonable.

Thanks,
Florian


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]