This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Is it OK to write ASCII strings directly into locale source files?
- From: Mike FABIAN <mfabian at redhat dot com>
- To: libc-alpha at sourceware dot org
- Date: Mon, 24 Jul 2017 15:09:59 +0200
- Subject: Is it OK to write ASCII strings directly into locale source files?
Currently the locale source files spell out many strings in <Uxxxx>
code point notation, even strings which are pure ASCII. For example,
localedata/locales/de_DE contains:
% "%a %d %b %Y %T %Z"
d_t_fmt "<U0025><U0061><U0020><U0025><U0064><U0020><U0025><U0062><U0020><U0025><U0059><U0020><U0025><U0054><U0020><U0025><U005A>"
Would it be OK to write this as
d_t_fmt "%a %d %b %Y %T %Z"
instead?
This would make the files much more readable.
Stuff that is mostly ASCII can probably be written like this:
% https://oc.wikipedia.org/wiki/Fran%C3%A7a França
country_name "Fran<U00E7>a"
which is already more readable than writing it all in <U00??> code points.
It would be even nicer to write it completely in UTF-8, i.e.:
country_name "França"
but I am not sure whether this is allowed in the locale source files.
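
If literal UTF-8 turns out not to be allowed, one could still edit in UTF-8 and convert back before committing. A minimal sketch, assuming that every non-ASCII character should become a <Uxxxx> symbol (4 hex digits in the BMP, 8 above it); the helper name to_ucs_notation is hypothetical and not part of glibc:

```python
def to_ucs_notation(text: str) -> str:
    """Replace every non-ASCII character with <Uxxxx> notation.

    Assumption: 4 hex digits for BMP characters, 8 for the rest.
    ASCII characters are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII stays literal
        elif cp <= 0xFFFF:
            out.append(f"<U{cp:04X}>")     # BMP code point
        else:
            out.append(f"<U{cp:08X}>")     # supplementary planes
    return "".join(out)


if __name__ == "__main__":
    print(to_ucs_notation('country_name "Fran\u00e7a"'))
```

This only handles precomposed input; whether the file should be normalized first is left open here.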
But at least for everything which is ASCII, it might be OK already to
write the characters directly.
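
To illustrate what such a conversion could look like, here is a rough sketch of a filter that turns printable-ASCII <Uxxxx> symbols back into literal characters and leaves everything else alone. It is not an existing glibc tool. The function name ascii_literals and the set of characters kept in symbolic form (the quote, the angle brackets, and "/" as the assumed escape_char) are my assumptions about what is unsafe to write literally:

```python
import re

# One symbolic code point such as <U0025> (4 to 8 hex digits).
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")

# Assumption: these characters are kept as <Uxxxx> because they could
# clash with string quoting, symbol delimiters, or the escape_char.
_KEEP = frozenset('"<>/')


def ascii_literals(line: str, keep=_KEEP) -> str:
    """Write printable-ASCII <Uxxxx> symbols as literal characters."""
    def repl(match):
        cp = int(match.group(1), 16)
        # Only printable ASCII (space through tilde) is converted.
        if 0x20 <= cp <= 0x7E and chr(cp) not in keep:
            return chr(cp)
        return match.group(0)  # leave non-ASCII and risky characters as-is
    return _UCS.sub(repl, line)


if __name__ == "__main__":
    src = ('d_t_fmt "<U0025><U0061><U0020><U0025><U0064><U0020>'
           '<U0025><U0062><U0020><U0025><U0059><U0020>'
           '<U0025><U0054><U0020><U0025><U005A>"')
    print(ascii_literals(src))
```

Run on the de_DE d_t_fmt line above, this yields the readable form; a string like "Fran<U00E7>a" passes through unchanged because U+00E7 is outside ASCII.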
Is writing ASCII there allowed or not?
--
Mike FABIAN <mfabian@redhat.com>