[ANNOUNCEMENT] Updated: dash-0.5.8-3

Thomas Wolff towo@towo.net
Mon Feb 13 22:03:00 GMT 2017


On 31.01.2017 at 16:32, Corinna Vinschen wrote:
> On Jan 31 16:01, Houder wrote:
>> On Tue, 31 Jan 2017 14:16:16, Corinna Vinschen wrote:
>>
>> [snip]
>>
>>>> I'm not quite sure yet but apparently the problem is in the handling of
>>>> VERASE in the termios implementation.  In cooked mode it fills a char
>>>> buffer with what has been typed.  The code doesn't know if the bytes in
>>>> the buffer are UTF-8 chars or just random bytes.  So VERASE erases
>>>> exactly one byte, which means, in case of UTF-8 chars it only erases the
>>>> last byte of a multibyte character.
>>>>
>>>> ...
>>> Ok, here's what happens on Linux:  The termios code supports a flag
>>> IUTF8.  This flag determines if the termios code checks for UTF-8
>>> characters in the input when performing an ERASE.  It checks if the
>>> IUTF8 flag is set and if so, it checks in a loop if the just erased byte
>>> is a UTF-8 continuation byte.  If so, it erases another byte.
>> Agreed. One byte or more, depending on the "character" ... (which is
>> not a problem in case of UTF-8 encoding -- continuation bit).
>>
>> Of course, the terminal driver must receive the characters encoded in UTF-8.
>>
>> ...
> ... It's the termios implementation
> inside Cygwin.  I created a patch introducing the IUTF8 flag as on Linux
> as well as a code snippet trying to remove entire utf-8 characters from
> the input if the IUTF8 flag is set.  And it's set now by default since
> we default to UTF-8 anyway.
>
> Thomas, you may want to check for the IUTF8 flag in upcoming mintty
> versions and unset it if the character set configured in the mintty
> options dialog is not UTF-8.
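
To make the erase behaviour described above concrete, here is a minimal
sketch in C. It is not the Cygwin or Linux source; erase_last_char() and
its parameters are hypothetical names, and the iutf8 argument simply
stands in for the IUTF8 bit in c_iflag.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from Cygwin or Linux): remove the last
   character from a cooked-mode line buffer and return the new length.
   'iutf8' stands in for the IUTF8 bit of c_iflag. */
static size_t erase_last_char(const unsigned char *buf, size_t len, int iutf8)
{
    if (len == 0)
        return 0;               /* nothing to erase */

    len--;                      /* ERASE always removes at least one byte */

    if (iutf8) {
        /* UTF-8 continuation bytes have the bit pattern 10xxxxxx.
           While the byte just erased is one of them, erase another
           byte, until a lead byte or an ASCII byte has been removed. */
        while (len > 0 && (buf[len] & 0xC0) == 0x80)
            len--;
    }
    return len;
}
```

For the three-byte buffer 'a' 0xC3 0xA4 ("a" followed by a-umlaut), the
sketch returns 1 with the flag set and 2 with the flag clear. The second
case leaves the dangling lead byte 0xC3 in the buffer, which is the
problem described in the report.
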
So the flag is always set initially? Also on Linux? Does it (on Linux)
also have an effect for non-UTF-8 multibyte encodings?
And can't the Cygwin DLL set the flag to match the locale setting in
effect when it is invoked?
I can (and will, if appropriate) handle the flag in mintty as needed, but
what if someone later runs LC_ALL=.other_encoding dash within the
terminal session? I guess the more consistent solution would be to
handle this in the Cygwin DLL.
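
For illustration, a rough sketch of what matching the flag to the locale
could look like from the application side, using only the standard
termios and langinfo calls. The helper names codeset_is_utf8() and
sync_iutf8_with_locale() are made up for this example. It assumes the
caller has already run setlocale(LC_ALL, ""), and that IUTF8 is
available, which is not guaranteed since the flag is not part of POSIX.

```c
#include <langinfo.h>
#include <locale.h>
#include <strings.h>
#include <termios.h>

/* Hypothetical helper: 1 if the codeset name denotes UTF-8, else 0. */
static int codeset_is_utf8(const char *codeset)
{
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "UTF8") == 0;
}

/* Hypothetical helper: set or clear IUTF8 on the terminal 'fd' so that
   it matches the codeset of the current locale.  Assumes the caller has
   already called setlocale(LC_ALL, "").  Returns 0 on success and -1 if
   the terminal attributes cannot be read or written. */
static int sync_iutf8_with_locale(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &t) == -1)
        return -1;              /* e.g. fd is not a terminal */

#ifdef IUTF8                    /* non-POSIX flag; skip if unavailable */
    if (codeset_is_utf8(nl_langinfo(CODESET)))
        t.c_iflag |= IUTF8;
    else
        t.c_iflag &= ~IUTF8;
#endif

    return tcsetattr(fd, TCSANOW, &t);
}
```

This only covers a process that adjusts its own terminal. It does not
answer the case above, where a later command in the same session runs
under a different LC_ALL.
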
------
Thomas
