Issues with width of emoji
Thomas Wolff
towo@towo.net
Fri Sep 21 17:43:00 GMT 2018
On 21.09.2018 at 03:42, Pokechu22 wrote:
> Hello! For a while I've had issues with emoji and cygwin, but due to
> some recent configuration changes on my end it's gotten to the point
> where it's actively causing problems.
Some of your problem descriptions remain a bit obscure, e.g. which
recent changes have caused which problems...
> My specific case involves running weechat on my raspberry pi, which I
> connect to with `mosh pi -- screen -D -RR weechat
> /usr/local/bin/weechat-curses`. When someone used an emoji on IRC,
> the entire screen would get messed up in some cases, as things got
> misaligned (an example of this: https://i.imgur.com/V7D6jPc.png).
> Previously I had a script that converted emoji into their escapes,
What are "their escapes"? Emojis are encoded in Unicode directly, so
they do not need any escapes.
> but that recently started misbehaving; even with that script there were
> other unicode characters such as the mathematical alphanumeric symbols
> characters (<https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols>)
Unicode does not define any emojis in the Mathematical Alphanumeric
Symbols block (U+1D400-U+1D7FF).
> that caused the issue too; I'm still going to refer to these as emoji
> because I most commonly have this problem with emoji and I don't have
> a good name otherwise.
>
> I initially assumed that this was a problem with mosh on the pi, what
> with the pi being an ARM device. However, after later investigation,
> it turns out that it's a cygwin problem. Some different cases where
> things behave weirdly:
>
> * Typing an emoji and then pressing backspace twice ends up deleting
> the emoji and the character before visually, but the character before
> isn't actually deleted (e.g. echo hi<emoji> then backspace twice still
> prints hi)
See your own conclusion below.
> * Running mosh, even as a loopback (`mosh --local ::1`), shows 2
> characters when the emoji is typed
> * Emoji behave incorrectly when pasted into nano
> * curses apps (which include mosh and nano) write a 2-wide space for
> emoji, as can be seen in this script
> <https://gist.github.com/Pokechu22/45d19aa5e41ee6db00723f808ac4339e>.
> This is only 1 character wide on my pi.
This may be related to different Unicode versions. Width for many emojis
changed from 1 to 2 in Unicode 9 (I think).
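
To check which width class a given Unicode database assigns to a
character, a minimal Python sketch could look like this. The helper name
naive_cell_width is made up for illustration, and the result reflects
the Unicode version bundled with the Python build, not necessarily what
the terminal or the wcwidth() on either system uses:

```python
import unicodedata

def naive_cell_width(ch):
    """Rough cell-width estimate: 2 for East Asian Wide/Fullwidth, else 1.

    Illustrative only: ignores combining marks, control characters and
    ambiguous-width handling, all of which a real wcwidth() must cover.
    """
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

# Unicode version this Python build was compiled against
print("Unicode data version:", unicodedata.unidata_version)

# ASCII letter, an emoji, and a Mathematical Alphanumeric Symbol
for ch in ("A", "\U0001F600", "\U0001D400"):
    print(f"U+{ord(ch):04X}",
          unicodedata.east_asian_width(ch),
          naive_cell_width(ch))
```

Running this on both machines and comparing the reported Unicode data
version is a quick way to see whether the two sides can even agree on
the width class.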
> * There are no problems when using SSH, at least to my pi, interestingly.
So please describe how you connect when the same test cases behave
differently.
> * Python refuses to create a ctypes.c_wchar containing an emoji, but
> considers the len of a string with a single emoji to be 1. On my pi
> it creates a c_wchar properly.
>
> I think that most of the desyncs and other weird things I've been
> getting are a result of different systems disagreeing about how wide
> the character should be;
Yes, and of different applications. Do you actually run the cygwin
terminal or the cygwin console for your test cases?
> that makes the most sense at least.
> Alternatively, it might be an issue with the character being
> represented as multiple characters; as far as I can tell there are
> only problems with characters outside of the basic multilingual plane
> (i.e. value >= 0x10000).
Yes, as UTF-16 may be involved, which represents each non-BMP character
as a pair of "surrogate" code units.
It might be helpful to repeat all observations with other, non-emoji,
non-BMP characters, in order to isolate the effects.
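
To see the UTF-16 effect in isolation, the surrogate pair for a non-BMP
code point can be computed by hand. The following Python sketch does
that; the function names are made up for illustration, and the remark
about the wchar_t size is an assumption about the platform, not
something confirmed in this thread:

```python
import ctypes

def utf16_surrogates(code_point):
    """Return the (high, low) UTF-16 surrogate pair for a non-BMP code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF are encoded as surrogates")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

def utf16_length(text):
    """Number of 16-bit code units needed to store text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2

# An emoji and a Mathematical Alphanumeric Symbol, both outside the BMP
for cp in (0x1F600, 0x1D400):
    high, low = utf16_surrogates(cp)
    print(f"U+{cp:05X} -> {high:04X} {low:04X}, "
          f"UTF-16 units: {utf16_length(chr(cp))}, len(): {len(chr(cp))}")

# Size in bytes of the C wchar_t as seen by ctypes on this platform
print("sizeof(wchar_t) =", ctypes.sizeof(ctypes.c_wchar))
```

If the last line reports 2, a single wchar_t cannot hold such a
character, which would be consistent with the c_wchar observation
quoted above.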
> One last thing I noticed: in ncurses, there seems to be some special
> stuff to implement wcwidth and wcswidth, including a comment in
> ncurses/widechar/widechars.c that says "MinGW has wide-character
> functions, but they do not work correctly." As far as I can tell,
> this is not enabled on cygwin; I'm not sure if it should be enabled or not.
>
> I hope I explained this well enough; it's a somewhat complicated issue
> and I don't know all of the relevant unicode vocabulary.