Summary: | charmaps: Some of UTF-8 characters have invalid width | ||
---|---|---|---|
Product: | glibc | Reporter: | Łukasz Stelmach <stlman> |
Component: | localedata | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | egmont, fweimer, libc-locales |
Priority: | P2 | Flags: | fweimer: security- |
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: |
Description
Łukasz Stelmach
2019-03-08 11:24:40 UTC
(In reply to Łukasz Stelmach from comment #0)
> grep '^[^;]*;[WF]' EastAsianWidth.txt | grep 2693
>
> returns no results which means the line 47261 (as of commit c5f65462a2)

This command _does_ print "2693;W" for me, as of the aforementioned commit, assuming the input file is glibc's localedata/unicode-gen/EastAsianWidth.txt (line 1210).

Note that the width of many codepoints, including this one, changed from narrow to wide with Unicode 9.0. Compare these two files:

ftp://ftp.unicode.org/Public/8.0.0/ucd/EastAsianWidth.txt ("2670..269D;N")
ftp://ftp.unicode.org/Public/9.0.0/ucd/EastAsianWidth.txt ("2693;W")

Any chance you worked from a Unicode 8 (or older) EastAsianWidth.txt, rather than the one in glibc's source?

(Also note that your grep command can easily miss matches, since the file defines ranges. That is not the case with U+2693, though.)

TL;DR Indeed, I was working with an old data file. As an excuse I can only say that several fonts provide this character as normal rather than wide, which matched my observation of the outdated data file. I guess this bug can be closed then. Thank you.

Thanks, closing as requested.
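
For reference, the point about ranges can be illustrated with a small lookup that understands both single-codepoint and `first..last` entries, which a plain grep for "2693" does not. This is only a minimal sketch, not what glibc's unicode-gen scripts do: the function names `parse_east_asian_width` and `width_class` are hypothetical, and the sample input consists solely of the two entries quoted in this thread rather than the full data files.

```python
def parse_east_asian_width(lines):
    """Yield (first, last, width) for each data line of an EastAsianWidth.txt-style file."""
    for line in lines:
        # Drop trailing comments and skip blank or comment-only lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        codepoints, width = (field.strip() for field in data.split(";", 1))
        # An entry is either a single codepoint ("2693") or a range ("2670..269D").
        first, _, last = codepoints.partition("..")
        yield int(first, 16), int(last or first, 16), width


def width_class(codepoint, lines):
    """Return the width class listed for codepoint, or None if no entry covers it."""
    for first, last, width in parse_east_asian_width(lines):
        if first <= codepoint <= last:
            return width
    return None


# Only the entries quoted in the comments above, not the complete files.
unicode8_sample = ["2670..269D;N"]
unicode9_sample = ["2693;W"]

print(width_class(0x2693, unicode8_sample))  # N
print(width_class(0x2693, unicode9_sample))  # W
```

With the Unicode 8 entry, U+2693 is only found because the range 2670..269D is expanded; a text search for "2693" would not match that line at all. Returning None for unlisted codepoints is a simplification made for this sketch; the real data file documents its own defaults for codepoints that are not listed.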