Cygwin fails to utilize Unicode replacement character

Andrey Repin
Tue Sep 4 19:50:00 GMT 2018

Greetings, Thomas Wolff!

>>> My vote is against the patch because the nodef glyph will often be
>>> just blank space which is certainly worse than ▒.
>>> If conhost does not provide a reasonable way to enquire 0xFFFD 
>>> availability it's conhost's fault, not cygwin's so why should cygwin 
>>> implement a bad compromise. If conhost ever improves, cygwin can adapt.
>> This is some dangerous commentary. I would like to counter it now with 
>> some actual research.
> No idea what you consider dangerous. Anyway, we obviously agree that 
> hardly any available console font supports the REPLACEMENT CHARACTER.

If by "console" you mean "raster", then the terminal is simply unable to render
U+FFFD in raster font mode.


$ php -r 'print "\u{FFFD}\n";' | cat -
cat: write error: Permission denied

This happens regardless of the selected codepage and locale.

> You had previously suggested code that might work (using CreateFont(0, 
> 0, ....)). Maybe you can sort out with Corinna how to get that work 
> inside cygwin. Otherwise, my opinion:
> - *working* fallback from FFFD to 2592: good

That does not work either.

$ php -r 'print "\u{2592}\n";' | cat -
cat: write error: Permission denied
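
For reference, a minimal Python sketch of the UTF-8 byte sequences that the two
php one-liners above hand to the pipe (before the trailing newline). The
variable names are illustrative only; nothing here touches the console, so it
does not reproduce the write error itself.

```python
# The two characters under discussion.
REPLACEMENT = "\ufffd"   # U+FFFD REPLACEMENT CHARACTER
MEDIUM_SHADE = "\u2592"  # U+2592 MEDIUM SHADE

# Encode each one as UTF-8, which is what print "\u{...}" emits in PHP.
replacement_bytes = REPLACEMENT.encode("utf-8")
shade_bytes = MEDIUM_SHADE.encode("utf-8")

print(replacement_bytes.hex())  # efbfbd
print(shade_bytes.hex())        # e29692
```

Both are ordinary three-byte sequences, so neither is special at the encoding
level.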

> - revert to 2592: OK
> - fix FFFD: not good, because the .notdef glyph is not an appropriate 
> indication of illegal encoding (like broken UTF-8 bytes)

For both Consolas and Lucida Console, U+FFFD has a sensible presentation.
It may be less sensible for Lucida Console, but it is still immediately
recognizable to anybody who has seen unknown-character glyphs before.
And if Microsoft improves things, it will only get better with no additional effort.

Whereas U+2592:
1. is unrecognizable as an error indicator.
2. may actually appear in legitimate output.
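
The second point can be illustrated with a short Python sketch. It assumes a
hypothetical helper, looks_damaged(), that scans text for whichever marker
character is used to flag undecodable input; the sample strings are made up
for the example.

```python
import unicodedata

REPLACEMENT = "\ufffd"   # U+FFFD
MEDIUM_SHADE = "\u2592"  # U+2592

# Official Unicode names of the two candidates.
print(unicodedata.name(REPLACEMENT))   # REPLACEMENT CHARACTER
print(unicodedata.name(MEDIUM_SHADE))  # MEDIUM SHADE

# 0xFF can never occur in valid UTF-8, so a conforming decoder
# substitutes U+FFFD for it when asked to replace errors.
broken = b"ab\xffcd"
decoded = broken.decode("utf-8", errors="replace")


def looks_damaged(text: str, marker: str) -> bool:
    """Hypothetical check: does the text contain the error marker?"""
    return marker in text


# Legitimate output that happens to use block-drawing characters,
# e.g. a text-mode progress bar.
legit = "progress: \u2592\u2592\u2592"

print(looks_damaged(decoded, REPLACEMENT))  # True: real damage is flagged
print(looks_damaged(legit, REPLACEMENT))    # False: clean text passes
print(looks_damaged(legit, MEDIUM_SHADE))   # True: false positive
```

U+FFFD can of course also occur in text on purpose, but it is reserved for
exactly this role, whereas U+2592 is a normal drawing character.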

With best regards,
Andrey Repin
Tuesday, September 4, 2018 22:10:29

Sorry for my terrible english...