[PATCH] [gdb/tui] Handle unicode chars in prompt

Tom de Vries tdevries@suse.de
Fri Jun 9 09:34:28 GMT 2023


On 5/26/23 15:56, Eli Zaretskii wrote:
>> Cc: Tom Tromey <tom@tromey.com>
>> Date: Fri, 26 May 2023 15:25:12 +0200
>> From: Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>
>>
>> +/* Return true if STRING starts with a multi-byte char.  Return the length of
>> +   the multi-byte char in LEN, or 0 in case it's a multi-byte null char.
>> +   Implementation based on _rl_read_mbchar.  */
>> +
>> +static bool
>> +is_mb_char (const char *string, int &len)
>> +{
>> +  for (len = 1; len <= MB_CUR_MAX; len++)
>> +    {
>> +      size_t res;
>> +
>> +      {
>> +	wchar_t wc;  <<<<<<<<<<<<<<<<<<<<<<<
>> +	mbstate_t ps;
>> +	memset (&ps, 0, sizeof (mbstate_t));
>> +	res = mbrtowc (&wc, string, len, &ps);
> 
> The above assumes each call to mbrtowc produces only one wchar_t
> value.  But that's non-portable: on MS-Windows wchar_t is a 16-bit
> wide data type, and wchar_t "wide characters" are actually encoded in
> UTF-16.  So characters beyond the BMP will yield 2 wchar_t values, not
> one.
> 

Hi Eli,

I see, thanks for pointing that out.  I've fixed this by using nullptr 
instead of &wc.
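
As a rough sketch of that approach (not the actual V2 code; mb_char_len is
a hypothetical name used only for illustration, and the loop structure is
assumed to stay as in the quoted patch), passing nullptr as the first
argument makes mbrtowc perform the conversion without storing a wchar_t,
so only the byte count is used:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <cwchar>

/* Hypothetical helper, for illustration only.  Return the length in bytes
   of the multi-byte char at the start of STRING in the current locale,
   0 for a null char, or -1 if no valid char is found.  */

static int
mb_char_len (const char *string)
{
  for (size_t len = 1; len <= MB_CUR_MAX; len++)
    {
      /* Start each attempt from the initial conversion state.  */
      mbstate_t ps;
      memset (&ps, 0, sizeof (mbstate_t));

      /* A null first argument discards the converted value, so no
	 assumption is made about how many wchar_t values it needs.  */
      size_t res = mbrtowc (nullptr, string, len, &ps);

      if (res == (size_t) -2)
	continue;	/* Incomplete sequence, try one more byte.  */
      if (res == (size_t) -1)
	return -1;	/* Invalid sequence.  */
      return (int) res;
    }

  return -1;
}
```

In the default "C" locale, mb_char_len ("a") gives 1 and mb_char_len ("")
gives 0.  Results for non-ASCII input depend on the locale selected with
setlocale.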

> One additional caveat: "multibyte" != "UTF-8".  There's more than one
> multibyte encoding, and the current locale could use some non-UTF-8
> encoding instead.  For example, some encoding of the ISO-2022 family.
> I'm not sure what this means for the issue at hand.
> 

AFAIU, interpreting the current locale and encoding correctly is up to 
mbrtowc, so as long as it does that correctly I think there's no problem.

> Yet another consideration is whether tui_puts_internal is used for
> outputting text in the target charset, in which case you may have
> problems with using mbrtowc, because AFAIK that supports only the
> current locale's codeset.  If the target charset is different from the
> locale's (basically, the host) charset, and we don't convert one to
> the other before calling tui_puts_internal, mbrtowc will fail.
> 

[ Addressed by Tom Tromey in this thread. ]

> Yes, this is a mess.
> 

Indeed :)

V2 posted here ( 
https://sourceware.org/pipermail/gdb-patches/2023-June/200181.html ).

Thanks,
- Tom


