This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Note on encodings (and locales) with shift state
- From: Joseph Myers <joseph at codesourcery dot com>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: Zack Weinberg <zackw at panix dot com>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Tue, 7 May 2019 15:51:13 +0000
- Subject: Re: Note on encodings (and locales) with shift state
- References: <87k1f39zuo.fsf@oldenburg2.str.redhat.com> <CAKCAbMg-HRP7pc7PB2CkEJE1X99W2bYBC=nqJO_RGBU_t-mxOw@mail.gmail.com> <87sgtq924w.fsf@oldenburg2.str.redhat.com>
On Tue, 7 May 2019, Florian Weimer wrote:
> > * In ALL locales, the <ctype.h> functions only recognize ASCII
> > characters (this is a consequence of narrow C strings always being
> > UTF-8; only ASCII characters fit in a single 'char' anymore) and the
> > <wctype.h> functions' behavior is locale-invariant and defined
> > strictly in terms of Unicode character properties. LC_CTYPE blocks
> > in locale definition files are either ignored or rejected (with the
> > possible exception of transliteration specs).
>
> I think we should do this for isdigit and a few other functions anyway.
> They really do not have to be locale-sensitive because a
> C-conforming/POSIX-conforming locale cannot change the tables anyway.
toupper / tolower in single-byte locales, and towupper / towlower in
general, however, do have to be locale-sensitive to behave correctly in
Turkish / Azerbaijani / ... (tr_TR and locales with 'copy "tr_TR"' in
LC_CTYPE) locales.
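
A minimal sketch of that locale dependence, for illustration only. The
helper names upper_in_locale and report are hypothetical, and the locale
name "tr_TR.UTF-8" is an assumption about what is installed on the system.
With glibc and that locale available, towupper on U+0069 (i) should yield
U+0130 (capital I with dot above) rather than U+0049; other C libraries
may treat towupper as locale-independent.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Uppercase WC under the LC_CTYPE locale NAME, then restore the
   previously active locale.  Returns WEOF if NAME cannot be selected
   (for example because the locale is not installed).  */
static wint_t
upper_in_locale (const char *name, wint_t wc)
{
  char saved[256];
  const char *old = setlocale (LC_CTYPE, NULL);

  /* Copy the current locale name: the returned string may be
     overwritten by the next setlocale call.  */
  assert (old != NULL && strlen (old) < sizeof saved);
  strcpy (saved, old);

  if (setlocale (LC_CTYPE, name) == NULL)
    return WEOF;

  wint_t result = towupper (wc);
  setlocale (LC_CTYPE, saved);
  return result;
}

/* Print the uppercase mapping of U+0069 under locale NAME.  */
static void
report (const char *name)
{
  wint_t up = upper_in_locale (name, L'i');

  if (up == WEOF)
    printf ("%s: locale not available\n", name);
  else
    printf ("%s: towupper(U+0069) = U+%04lX\n", name, (unsigned long) up);
}
```

Calling report ("C") and then report ("tr_TR.UTF-8") from main shows the
two mappings side by side; a table fixed at build time could only give one
of them.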
--
Joseph S. Myers
joseph@codesourcery.com