Bug 16969 - Outdated wcwidth character database
Summary: Outdated wcwidth character database
Status: RESOLVED DUPLICATE of bug 14094
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-05-20 19:35 UTC by Johannes Löthberg
Modified: 2014-06-12 19:16 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Description Johannes Löthberg 2014-05-20 19:35:11 UTC
The wcwidth character database in glibc is very outdated, which leads to programs not being able to display some characters properly. To work around this issue, programs such as vim and xterm ignore whether glibc says that a given character is printable or not.

According to a mail[1] in a thread on the linux-utf8@nl.linux.org mailing list, there are even characters from Unicode 4 that aren't supported yet.

One workaround is to use another wcwidth implementation, such as mgk25's[2]. An example can be seen here[3].
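
To illustrate what a replacement wcwidth along those lines might look like, here is a minimal sketch in C. It is not mgk25's code and not glibc's: the names `sketch_wcwidth`, `in_table` and `struct interval` are made up for this example, and the two range tables are deliberately tiny (a handful of well-known Unicode blocks), so they are nowhere near a complete width database. It only shows the general technique of looking a code point up in sorted interval tables rather than asking the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Closed range of code points [first, last]. Hypothetical helper type. */
struct interval {
    uint32_t first;
    uint32_t last;
};

/* Binary search over a sorted, non-overlapping interval table.
 * Returns 1 if ucs falls inside any interval, 0 otherwise. */
static int in_table(uint32_t ucs, const struct interval *t, size_t n)
{
    size_t lo = 0, hi = n;

    assert(t != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ucs > t[mid].last)
            lo = mid + 1;
        else if (ucs < t[mid].first)
            hi = mid;
        else
            return 1;
    }
    return 0;
}

/* Illustrative width lookup: -1 for control characters, 0 for NUL and
 * zero-width characters, 2 for wide characters, 1 for everything else.
 * ASSUMPTION: the tables below are a small sample, not real Unicode data
 * coverage; a usable version would generate them from the Unicode files. */
int sketch_wcwidth(uint32_t ucs)
{
    static const struct interval zero_width[] = {
        { 0x0300, 0x036F },  /* Combining Diacritical Marks */
        { 0x200B, 0x200F },  /* zero-width space/joiners, direction marks */
    };
    static const struct interval wide[] = {
        { 0x1100, 0x115F },  /* Hangul Jamo leading consonants */
        { 0x4E00, 0x9FFF },  /* CJK Unified Ideographs */
        { 0xAC00, 0xD7A3 },  /* Hangul Syllables */
        { 0xFF01, 0xFF60 },  /* Fullwidth ASCII variants */
    };

    if (ucs == 0)
        return 0;
    if (ucs < 0x20 || (ucs >= 0x7F && ucs < 0xA0))
        return -1;  /* C0/C1 control characters */
    if (in_table(ucs, zero_width, sizeof zero_width / sizeof zero_width[0]))
        return 0;
    if (in_table(ucs, wide, sizeof wide / sizeof wide[0]))
        return 2;
    return 1;
}
```

Because the lookup takes a Unicode code point directly, it does not depend on the current locale or on the installed glibc locale data, which is the property that makes this kind of workaround attractive. The trade-off is that the tables have to be kept up to date by hand or by a generator script.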


[1]: http://article.gmane.org/gmane.comp.internationalization.linux/6572
[2]: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
[3]: https://theos.kyriasis.com/~kyrias/wcwidth.png
Comment 1 Johannes Löthberg 2014-05-20 19:43:12 UTC
Bug 4335 seems to be caused by the same issue, too.
Comment 2 jsm-csl@polyomino.org.uk 2014-05-20 22:06:48 UTC
Isn't this just part of bug 14094 - updating the Unicode data 
(localedata/charmaps/UTF-8 in this case) and ensuring there is automation 
to make future updates easier?  Again, we could do with someone who takes 
on the role of Unicode expert for glibc to deal with such bugs.
Comment 3 Johannes Löthberg 2014-05-20 22:13:48 UTC
Ah, missed that one. I guess it's mostly the same, yes, though that one seems to only talk about adding the new things, not about adding the parts of the older Unicode standards that aren't added yet. Without those parts, that bug is incomplete.

*** This bug has been marked as a duplicate of bug 14094 ***