[PATCH] [gdb/tui] Handle unicode chars in prompt
Tom de Vries
tdevries@suse.de
Fri Jun 9 09:34:28 GMT 2023
On 5/26/23 15:56, Eli Zaretskii wrote:
>> Cc: Tom Tromey <tom@tromey.com>
>> Date: Fri, 26 May 2023 15:25:12 +0200
>> From: Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>
>>
>> +/* Return true if STRING starts with a multi-byte char. Return the length of
>> + the multi-byte char in LEN, or 0 in case it's a multi-byte null char.
>> + Implementation based on _rl_read_mbchar. */
>> +
>> +static bool
>> +is_mb_char (const char *string, int &len)
>> +{
>> + for (len = 1; len <= MB_CUR_MAX; len++)
>> + {
>> + size_t res;
>> +
>> + {
>> + wchar_t wc; <<<<<<<<<<<<<<<<<<<<<<<
>> + mbstate_t ps;
>> + memset (&ps, 0, sizeof (mbstate_t));
>> + res = mbrtowc (&wc, string, len, &ps);
>
> The above assumes each call to mbrtowc produces only one wchar_t
> value. But that's non-portable: on MS-Windows wchar_t is a 16-bit
> wide data type, and wchar_t "wide characters" are actually encoded in
> UTF-16. So characters beyond the BMP will yield 2 wchar_t values, not
> one.
>
Hi Eli,
I see, thanks for pointing that out. I've fixed this by using nullptr
instead of &wc.
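
For illustration only, here is a minimal sketch of what passing nullptr looks like. It is not the code from the patch or from V2. The helper name leading_char_len and its return convention are assumptions made for this example. Since no wchar_t is stored, only the byte count reported by mbrtowc is used, so the width of wchar_t on the host should not matter.

```cpp
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of the patch.  Return the number of
   bytes in the character at the start of the NUL-terminated STRING,
   0 if that character is the null character, or -1 if the bytes do
   not form a valid character in the current locale.  */

static int
leading_char_len (const char *string)
{
  for (int len = 1; len <= (int) MB_CUR_MAX; len++)
    {
      /* Start each attempt from the initial conversion state.  */
      mbstate_t ps;
      memset (&ps, 0, sizeof (mbstate_t));

      /* A null first argument means no wide character is stored.  */
      size_t res = mbrtowc (nullptr, string, len, &ps);

      if (res == (size_t) -2)
	continue;	/* Incomplete so far, try one more byte.  */
      if (res == (size_t) -1)
	return -1;	/* Invalid sequence.  */
      return (int) res;	/* 0 for the null character, else byte count.  */
    }

  return -1;
}
```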
> One additional caveat: "multibyte" != "UTF-8". There's more than one
> multibyte encoding, and the current locale could use some non-UTF-8
> encoding instead. For example, some encoding of the ISO-2022 family.
> I'm not sure what this means for the issue at hand.
>
AFAIU, interpreting the current locale and its encoding correctly is up to
mbrtowc, so as long as it does that correctly I think there's no problem.
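
As a small illustration of that dependency (a sketch, with the helper name max_bytes_per_char made up for this example): mbrtowc and MB_CUR_MAX both follow the LC_CTYPE category of whatever locale was selected with setlocale, so the caller does not name an encoding itself.

```cpp
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.  Select LOCALE_NAME for
   LC_CTYPE and return the maximum number of bytes per character in
   that locale, or 0 if the locale is not available.  */

static size_t
max_bytes_per_char (const char *locale_name)
{
  if (setlocale (LC_CTYPE, locale_name) == nullptr)
    return 0;

  /* MB_CUR_MAX reflects the locale just selected, and mbrtowc will
     decode according to the same locale.  */
  return MB_CUR_MAX;
}
```

Which multibyte locales exist is system-dependent, so the value returned for anything other than the "C" locale will vary between hosts.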
> Yet another consideration is whether tui_puts_internal is used for
> outputting text in the target charset, in which case you may have
> problems with using mbrtowc, because AFAIK that supports only the
> current locale's codeset. If the target charset is different from the
> locale's (basically, the host) charset, and we don't convert one to
> the other before calling tui_puts_internal, mbrtowc will fail.
>
[ Addressed by Tom Tromey in this thread. ]
> Yes, this is a mess.
>
Indeed :)
V2 posted here (
https://sourceware.org/pipermail/gdb-patches/2023-June/200181.html ).
Thanks,
- Tom