This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug localedata/17588] Update UTF-8 charmap and width to Unicode 7.0.0
- From: "maiku.fabian at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Wed, 03 Dec 2014 07:17:25 +0000
- Subject: [Bug localedata/17588] Update UTF-8 charmap and width to Unicode 7.0.0
- Auto-submitted: auto-generated
- References: <bug-17588-131 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=17588
--- Comment #9 from Mike FABIAN <maiku.fabian at gmail dot com> ---
I built glibc with the patch from comment#8.
It produces some FAILs in "make check":
FAIL: localedata/cs_CZ.UTF-8/LC_CTYPE
... similar FAILs ...
Shortly after starting "make check" one sees:
./charmaps/UTF-8:42734: unknown character `U00009FCD'
... similar messages ...
All the above problems are caused by ranges of reserved code points
which are listed in EastAsianWidth.txt like this:
9FCD..9FFF;W # Cn [51] <reserved-9FCD>..<reserved-9FFF>
and these code points are not in UnicodeData.txt.
Therefore, they are not generated into the CHARMAP section
of glibc's UTF-8 file, and the above problems occur if they
are generated into the WIDTH section of the same file.
This can be fixed by not generating reserved code points into
the WIDTH section, i.e. by ignoring the reserved code points
mentioned in EastAsianWidth.txt. Patch for utf8-gen.py:
diff --git a/utf8-gen.py b/utf8-gen.py
index 57875b6..20b68bb 100755
--- a/utf8-gen.py
+++ b/utf8-gen.py
@@ -218,6 +218,8 @@ if __name__ == "__main__":
     write_comments(outfile, 1)
     elines = []
     for line in easta_file.readlines():
+        if re.match(r'.*<reserved-.+>\.\.<reserved-.+>.*', line):
+            continue
         if re.match(r'^[^;]*;[WF]', line):
             elines.append(line.strip())
     process_width(outfile, flines, elines)
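
For illustration only, here is the same filtering idea as a standalone
sketch. The helper name filter_width_lines and the sample list are
hypothetical and are not part of utf8-gen.py. Only the first sample line is
taken from this comment; the other two are merely written in the style of
EastAsianWidth.txt. The reserved-range pattern is used with search(), which
should behave like the re.match() call with leading and trailing ".*" in the
patch.

```python
import re

# Hypothetical helper, not part of utf8-gen.py.
# Matches lines describing a range of reserved code points.
RESERVED_RANGE = re.compile(r'<reserved-.+>\.\.<reserved-.+>')
# Matches lines whose East Asian Width property starts with W or F.
WIDE_OR_FULL = re.compile(r'^[^;]*;[WF]')


def filter_width_lines(lines):
    """Return stripped W/F lines, skipping ranges of reserved code points."""
    kept = []
    for line in lines:
        if RESERVED_RANGE.search(line):
            # Reserved code points have no CHARMAP entry, so leave them
            # out of the WIDTH section as well.
            continue
        if WIDE_OR_FULL.match(line):
            kept.append(line.strip())
    return kept


# Illustrative input; only the first line is quoted from the bug comment.
sample = [
    "9FCD..9FFF;W # Cn [51] <reserved-9FCD>..<reserved-9FFF>\n",
    "AC00..D7A3;W # Lo [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH\n",
    "0041..005A;Na # Lu [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z\n",
]

if __name__ == "__main__":
    for kept_line in filter_width_lines(sample):
        print(kept_line)
```

With this input, the reserved range 9FCD..9FFF is dropped before the W/F
check, so it never reaches the WIDTH section.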