Bug 10122

Summary: Xft UTF-8 rendering
Product: cygwin Reporter: Yaakov Selkowitz <yselkowi>
Component: Cygwin/X Assignee: Yaakov Selkowitz <yselkowi>
Status: RESOLVED INVALID    
Severity: normal CC: jon.turney
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: Example patch to fix scrambling unicode strings
QIconvCodec (4.4.3)
QIconvCodec (4.5.2)

Description Yaakov Selkowitz 2009-05-01 03:32:51 UTC
Certain programs using Xft and UTF-8 characters render square boxes instead of
actual letters.  This occurs with both Cygwin 1.5 and 1.7.

Packages displaying this include blackbox and fltk2, hence I had to work around
this in both cases.  Qt4.5 (not yet in SVN) also shows this, although 4.4 did
not.  Each of these uses iconv too, so I'll bring Chuck Wilson in on this as
well.

Someone else reported this as well with blackbox:

http://www.stokebloke.com/cygwin/index.php

Any ideas?
Comment 1 Jon Turney 2009-05-13 22:27:57 UTC
The square boxes usually mean that the glyph isn't available in the font.  I
would have thought that basic latin characters should be available even if we
aren't using a Unicode font, though.  Still, you might try forcing it to use an
iso10646-1 encoding font to see if that makes a difference.

I guess I should take a look at the blackbox source and see exactly what it is
doing differently when unicode is allowed.
Comment 2 Jon Turney 2009-05-14 19:15:46 UTC
Created attachment 3934 [details]
Example patch to fix scrambling unicode strings

It seems that lib/Unicode.cc:bt::byte_swap() assumes that sizeof(wchar_t) is 4,
so that a UTF-32 code unit will fit in it, and utterly destroys the strings
we've carefully converted to UTF-32 if that isn't the case.

How we laughed.

Attached patch fixes this, although I'm just noticing it could be better as the
change is not consistent with the function signature which uses an unsigned
int. Both should probably use the 'Uchar' type as that's what the strings we
are processing should consist of.

It's left as an exercise for the reader to determine if assuming an unsigned
int has exactly 32 bits is portable. :-)
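
For illustration only, here is a minimal sketch of the failure mode described
above.  It is not the blackbox code or the attached patch; the names
byte_swap32, truncated_swap and swap_string are made up for this example.  It
assumes the swap is done on a full 32-bit code unit, and shows what is lost
when the swapped value is pushed through a 16-bit type, which is the width of
wchar_t on Cygwin.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: reverse the byte order of one UTF-32 code unit.
// std::uint32_t is exactly 32 bits wherever it is defined, so this does not
// depend on the width of wchar_t or unsigned int.
constexpr std::uint32_t byte_swap32(std::uint32_t c) {
    return ((c & 0x000000FFu) << 24) |
           ((c & 0x0000FF00u) << 8)  |
           ((c & 0x00FF0000u) >> 8)  |
           ((c & 0xFF000000u) >> 24);
}

// Hypothetical illustration of the bug: storing the swapped value in a
// 16-bit type silently drops the upper 16 bits.
constexpr std::uint16_t truncated_swap(std::uint32_t c) {
    return static_cast<std::uint16_t>(byte_swap32(c));
}

// Swap every code unit of a UTF-32 string.  char32_t is at least 32 bits
// wide, so nothing is truncated.
std::u32string swap_string(std::u32string s) {
    for (char32_t &ch : s) {
        ch = static_cast<char32_t>(byte_swap32(static_cast<std::uint32_t>(ch)));
    }
    return s;
}
```

With these definitions, swapping 'A' (U+0041) gives 0x41000000, while the
16-bit version keeps only the low half and returns 0, so the character is
gone.  Swapping a string twice with swap_string gives back the original.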
Comment 3 Yaakov Selkowitz 2009-06-28 19:09:39 UTC
Created attachment 4026 [details]
QIconvCodec (4.4.3)

Could you help me with this one?  This is qiconvcodec.cpp from qt-4.4.3, which
worked correctly...
Comment 4 Yaakov Selkowitz 2009-06-28 19:11:13 UTC
Created attachment 4027 [details]
QIconvCodec (4.5.2)

... and this is the same file from 4.5.2, which does not work.  You will see
that this file has changed quite a bit, but I don't see anything obvious.  Any
ideas?
Comment 5 Yaakov Selkowitz 2009-06-29 01:19:38 UTC
Hmm, look at this:

http://qt.gitorious.org/qt/qt/blobs/master/src/corelib/codecs/qtextcodec.cpp#line521

If I do the following at line 527:

-#ifndef QT_NO_ICONV
+#if !defined(QT_NO_ICONV) && !defined(Q_OS_CYGWIN)

Then it *seems* to work, IOW I get text instead of squares, and if I run
'LANG=ja_JP qtdemo', while everything remains in English, the typeface changes
to that seen in Japanese/English emails.
Comment 6 Jon Turney 2009-07-02 14:10:08 UTC
That seems pretty strong evidence that QIconvCodec isn't working correctly in
4.5.2, but I can't immediately see anything wrong with it; it really needs
stepping through with a debugger...
Comment 7 Yaakov Selkowitz 2009-10-12 17:58:52 UTC
Bottom line, looks like this isn't a Cygwin/X issue, so closing.