The wcwidth character database in glibc is very outdated, which leads to programs not being able to display some characters properly. Programs such as vim and xterm work around this by ignoring whether glibc says that a given character is printable. According to a mail[1] in a thread on the linux-utf8@nl.linux.org mailing list, there are even characters from Unicode 4 that aren't supported yet. One workaround is to use another wcwidth implementation, such as mgk25's[2]. An example can be seen here[3].

[1]: http://article.gmane.org/gmane.comp.internationalization.linux/6572
[2]: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
[3]: https://theos.kyriasis.com/~kyrias/wcwidth.png
Bug 4335 seems to be caused by the same issue, too.
Isn't this just part of bug 14094 - updating the Unicode data (localedata/charmaps/UTF-8 in this case) and ensuring there is automation to make future updates easier? Again, we could do with someone who takes on the role of Unicode expert for glibc to deal with such bugs.
Ah, missed that one. I guess it's mostly the same, yes, though that one seems to only talk about adding the new things, not about adding the parts of the older Unicode standards that haven't been added yet. Without those parts, that bug is incomplete.

*** This bug has been marked as a duplicate of bug 14094 ***