This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Rational Ranges - Rafal and Mike's opinion? (Bug 23393).


On 07/23/2018 11:10 AM, Florian Weimer wrote:
> On 07/20/2018 11:56 PM, Carlos O'Donell wrote:
>> v2
>> - Fixed tr_TR by duplicating A-Z rational range.
>> - Fixed tst-rxspencer.
>> - Fixed bug-regex17.
>>
>> Tell me how the new version does.
> 
> My tester likes it.  tr_TR.ISO-8859-9 is now fixed.  I added fnmatch
> support, too, and initial results look good as well.

OK, so we have the capability to deploy rational ranges.

Florian,

Should we do so in 2.28? That would avoid this class of problems in the
future and make the ranges portable, rational, and safe from a security
perspective.

Rafal,

As localedata maintainer, what is your opinion of changing the meaning
of [a-z], [A-Z], and [0-9] to be rational ranges for *all* locales,
so that they mean exactly the Latin character sequences you would expect,
e.g. {a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z} for [a-z],
[A-Z] likewise, and {0,1,2,3,4,5,6,7,8,9} for [0-9]?

Mike,

Same question to you.

For historical context in gawk:
https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html

For context from POSIX:
http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html
(see the section on "RE Bracket Expressions").

Support for rational ranges would make [a-z], [A-Z], [0-9] and other subranges
rational for all locales: they would no longer match letters of the other
case or accented characters.
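
To make the intended behaviour concrete, here is a minimal sketch, not the
patch itself. It assumes that a rational range is decided purely by comparing
character values, as the gawk manual describes. The helper names
in_rational_range and regex_matches are hypothetical and exist only for this
illustration; regcomp, regexec and regfree are the standard POSIX interfaces.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if C lies in the rational range [LO-HI].
   Membership is decided by character value only, never by the
   collation order of the current locale.  */
static bool
in_rational_range (unsigned char c, unsigned char lo, unsigned char hi)
{
  assert (lo <= hi);
  return c >= lo && c <= hi;
}

/* Hypothetical helper: true if the POSIX extended regex PATTERN
   matches TEXT.  The result depends on the locale in effect when
   regcomp is called; a program that never calls setlocale runs in
   the "C" locale.  */
static bool
regex_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool matched = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

In the "C" locale regex_matches ("^[a-z]$", "q") succeeds and the same
pattern rejects "Q", which is what in_rational_range predicts for 'a'..'z'.
With rational ranges the two would agree in every locale, not just in "C".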

I'd like to hear affirmatives from the localedata maintainers on this issue.

Cheers,
Carlos.
