Issues with width of emoji

Thomas Wolff towo@towo.net
Fri Sep 21 17:43:00 GMT 2018


On 21.09.2018 at 03:42, Pokechu22 wrote:
> Hello!  For a while I've had issues with emoji and cygwin, but due to
> some recent configuration changes on my end it's gotten to the point
> where it's actively causing problems.
Some of your problem descriptions remain a bit unclear, e.g. which 
recent changes caused which problems...

> My specific case involves running weechat on my raspberry pi, which I
> connect to with `mosh pi -- screen -D -RR weechat
> /usr/local/bin/weechat-curses`.  When someone used an emoji on IRC,
> the entire screen would get messed up in some cases, as things got
> misaligned (an example of this: https://i.imgur.com/V7D6jPc.png).
> Previously I had a script that converted emoji into their escapes,
What are "their escapes"? Emoji are encoded directly in Unicode and do 
not need any escapes.

> but that recently started misbehaving; even with that script there were
> other unicode characters such as the mathematical alphanumeric symbols
> characters (<https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols>)
Unicode does not define any emojis in the range Mathematical 
Alphanumeric Symbols (U+1D400-U+1D7FF).

> that caused the issue too; I'm still going to refer to these as emoji
> because I most commonly have this problem with emoji and I don't have
> a good name otherwise.
>
> I initially assumed that this was a problem with mosh on the pi, what
> with the pi being an ARM device.  However, after later investigation,
> it turns out that it's a cygwin problem.  Some different cases where
> things behave weirdly:
>
> * Typing an emoji and then pressing backspace twice ends up deleting
> the emoji and the character before visually, but the character before
> isn't actually deleted (e.g. echo hi<emoji> then backspace twice still
> prints hi)
See your own conclusion below.
> * Running mosh, even as a loopback (`mosh --local ::1`), shows 2
> characters when the emoji is typed
> * Emoji behave incorrectly when pasted into nano
> * curses apps (which include mosh and nano) write a 2-wide space for
> emoji, as can be seen in this script
> <https://gist.github.com/Pokechu22/45d19aa5e41ee6db00723f808ac4339e>.
> This is only 1 character wide on my pi.
This may be related to different Unicode versions. The width of many 
emoji changed from 1 to 2 columns in Unicode 9 (I think).
> * There are no problems when using SSH, at least to my pi, interestingly.
So please describe exactly how you connect in each case where the same 
test behaves differently.
> * Python refuses to create a ctypes.c_wchar containing an emoji, but
> considers the len of a string with a single emoji to be 1.  On my pi
> it creates a c_wchar properly.
>
> I think that most of the desyncs and other weird things I've been
> getting are a result of different systems disagreeing about how wide
> the character should be;
Yes, and of different applications disagreeing as well. For your test 
cases, do you actually run the cygwin terminal or the cygwin console?
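
To compare what each system believes about width, a small check like the 
following could be run on both the cygwin side and the pi. It is only a 
rough sketch: approx_width is a hypothetical helper that guesses the column 
width from the East Asian Width property, which is an assumption about how 
a typical wcwidth behaves and not the actual cygwin or ncurses code. The 
result depends on the Unicode data version bundled with Python, which is 
printed first.

```python
import unicodedata

def approx_width(ch):
    """Rough column-width guess for one character (hypothetical helper).

    Assumption: combining marks take 0 columns, characters with East Asian
    Width W (wide) or F (fullwidth) take 2, everything else takes 1.
    """
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

samples = {
    "LATIN SMALL LETTER A": "a",
    "CJK IDEOGRAPH U+4E2D": "\u4e2d",
    "MATHEMATICAL BOLD CAPITAL A U+1D400": "\U0001D400",
    "GRINNING FACE U+1F600": "\U0001F600",
}

print("Unicode data version:", unicodedata.unidata_version)
for name, ch in samples.items():
    print(f"{name}: EAW={unicodedata.east_asian_width(ch)} "
          f"width~{approx_width(ch)}")
```

If the two machines print different widths for the same character, that 
would support the theory of a width disagreement between Unicode versions.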

> that makes the most sense at least.
> Alternatively, it might be an issue with the character being
> represented as multiple characters; as far as I can tell there are
> only problems with characters outside of the basic multilingual plane
> (i.e. value >= 0x10000).
Yes, as UTF-16 may be involved, which represents each non-BMP character 
as a pair of "surrogate" code units.
It might be helpful to repeat all observations with other, non-emoji, 
non-BMP characters, in order to isolate the effects.
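
As an illustration of the surrogate representation, here is a minimal 
sketch that lists the UTF-16 code units of a few characters. utf16_units is 
a hypothetical helper name. The last line prints the size of ctypes.c_wchar; 
my assumption is that the c_wchar failure you describe occurs where wchar_t 
is only 16 bits wide, so that a non-BMP character does not fit into a 
single unit.

```python
import ctypes

def utf16_units(ch):
    """Return the UTF-16 code units of a string as a list of integers."""
    data = ch.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]

# One BMP character and two non-BMP characters (a math symbol and an emoji).
for ch in ("\u00e9", "\U0001D400", "\U0001F600"):
    units = utf16_units(ch)
    print(f"U+{ord(ch):04X}: {len(units)} code unit(s):",
          " ".join(f"{u:04X}" for u in units))

# Size of wchar_t as seen by ctypes; this is platform dependent.
print("sizeof(c_wchar) =", ctypes.sizeof(ctypes.c_wchar))
```

Running this with a non-emoji character such as U+1D400 next to an emoji 
helps to separate the surrogate issue from the width issue.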

> One last thing I noticed: in ncurses, there seems to be some special
> stuff to implement wcwidth and wcswidth, including a comment in
> ncurses/widechar/widechars.c that says "MinGW has wide-character
> functions, but they do not work correctly."  As far as I can tell,
> this is not enabled on cygwin; I'm not sure if it should be enabled or not.
>
> I hope I explained this well enough; it's a somewhat complicated issue
> and I don't know all of the relevant unicode vocabulary.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


