This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug regex/23393] Handle [a-z] and [A-Z] in consistent portable fashion regardless of locale.

From: "carlos at redhat dot com" <sourceware-bugzilla at sourceware dot org>
To: glibc-bugs at sourceware dot org
Date: Thu, 19 Jul 2018 13:14:42 +0000
Subject: [Bug regex/23393] Handle [a-z] and [A-Z] in consistent portable fashion regardless of locale.
Auto-submitted: auto-generated
References: <bug-23393-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=23393

--- Comment #18 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Carlos O'Donell from comment #17)
> Proposal (c)
> - Handle the range a-z as an alias for :lower:.
> - Handle the range A-Z as an alias for :upper:.
> - Handle the range 0-9 as an alias for :digit:.
> 
> This would bring compatibility for regexes and still allow all other
> languages to include lower-case alphabetic symbols like ñ in Spanish without
> breaking Spanish developer scripts that depend on that.
> 
> We would be breaking scripts for the 15 locales that have mixed aA-zZ
> collation if they expect a-z to include A-Y, but the potential for breakage
> in all the other languages is worse.

From an implementation perspective, I believe we can add this wherever we
call __collseq_table_lookup in regcomp.c, regexec.c, and fnmatch_loop.c: if
we are looking for one of the above ranges, we need to do the substitution
and use a fixed range instead of the collation-sequence lookup.
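
A rough sketch of what that substitution might look like follows. This is
not glibc code: the names fixed_range, find_fixed_range, range_matches and
the collseq_match callback are hypothetical and only stand in for the real
call sites around __collseq_table_lookup. It also assumes an
ASCII-compatible encoding in which a-z, A-Z and 0-9 are each contiguous.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table of the three ranges named in proposal (c).  */
struct fixed_range
{
  unsigned char start;
  unsigned char end;
};

static const struct fixed_range fixed_ranges[] =
{
  { 'a', 'z' },
  { 'A', 'Z' },
  { '0', '9' },
};

/* Return the entry whose endpoints are exactly START-END, or NULL if the
   range is not one of the special cases and should take the usual
   locale-dependent path.  */
static const struct fixed_range *
find_fixed_range (unsigned char start, unsigned char end)
{
  size_t n = sizeof fixed_ranges / sizeof fixed_ranges[0];
  for (size_t i = 0; i < n; i++)
    if (fixed_ranges[i].start == start && fixed_ranges[i].end == end)
      return &fixed_ranges[i];
  return NULL;
}

/* Decide whether CH falls in the bracket range START-END.  COLLSEQ_MATCH is
   a placeholder for the existing collation-sequence comparison; it is only
   called when the range is not one of the fixed ones.  */
static bool
range_matches (unsigned char start, unsigned char end, unsigned char ch,
               bool (*collseq_match) (unsigned char, unsigned char,
                                      unsigned char))
{
  const struct fixed_range *fr = find_fixed_range (start, end);
  if (fr != NULL)
    /* Fixed range: compare by character value, ignoring locale collation.  */
    return ch >= fr->start && ch <= fr->end;
  return collseq_match (start, end, ch);
}
```

With this shape, a-z would match 'm' but not 'B' in every locale, while a
range such as a-y would still go through the locale's collation order.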
