[PATCH/WIP] C/C++ wchar_t/Unicode printing support
Eli Zaretskii
eliz@gnu.org
Fri Jan 16 09:36:00 GMT 2009
> Date: Thu, 15 Jan 2009 20:24:11 +0000
> From: Julian Brown <julian@codesourcery.com>
> Cc: tromey@redhat.com
>
> This patch contains (at least the start of) support for printing
> wchar_t strings from a debugged program within GDB. This is the subject
> for GDB bugs 9103 (and its duplicates 9369, 9268) and maybe 7821.
Thank you!
> OK to apply
Not without documentation, sorry. Such an important feature should
not go in undocumented.
> or any comments?
A few:
> (gdb) show host-charset
> The host character set is "UTF-8" (auto).
Elsewhere in GDB, we show such settings in a slightly different form:
(gdb) show language
The current source language is "auto; currently c".
I like this latter form better: it first says that the setting is
"auto", then what the detected state is.
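To make the suggestion concrete, here is a minimal sketch of how such a
value could be formatted. The helper name format_auto_setting is
hypothetical and is not part of GDB or of the patch; it only illustrates
the "auto; currently X" form used by "show language".

```c
#include <stdio.h>

/* Hypothetical helper, not a GDB function: write the value of an
   auto-detected setting into BUF in the same form that
   "show language" uses, i.e. "auto; currently DETECTED".
   Returns what snprintf returns.  */
static int
format_auto_setting (char *buf, size_t size, const char *detected)
{
  return snprintf (buf, size, "auto; currently %s", detected);
}
```

With "UTF-8" as the detected value, the message would then read:
The host character set is "auto; currently UTF-8".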
> + #ifndef GDB_DEFAULT_TARGET_WIDE_CHARSET
> + #define GDB_DEFAULT_TARGET_WIDE_CHARSET "UTF-32"
> + #endif
> +
> + #ifndef GDB_INTERNAL_CODESET
> + #define GDB_INTERNAL_CODESET "UCS-4LE"
> + #endif
Why are these the defaults? Because of what GNU/Linux (i.e. glibc)
does, or for some other reason? If the former, shouldn't this be
autoconfigured?
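As a rough illustration of what autoconfiguring could mean, the sketch
below picks a default name from the width of wchar_t. Everything in it
is an assumption: the function name guess_wide_charset is hypothetical,
and the mapping (4 bytes means UTF-32 as on glibc, 2 bytes means UTF-16
as on Windows) is a heuristic, not something the patch does. Note also
that sizeof (wchar_t) describes the build host, while the macro is about
the target, so a real configure check would have to handle
cross-debugging separately.

```c
#include <stddef.h>

/* Hypothetical helper, not from the patch: guess a default wide
   charset name from the size of wchar_t in bytes.  Returns NULL
   when the size matches neither assumed case.  */
static const char *
guess_wide_charset (size_t wchar_size)
{
  switch (wchar_size)
    {
    case 2:
      return "UTF-16";	/* Assumption: 2-byte wchar_t holds UTF-16.  */
    case 4:
      return "UTF-32";	/* Assumption: 4-byte wchar_t holds UTF-32.  */
    default:
      return NULL;
    }
}
```

A caller would pass sizeof (wchar_t) and fall back to a fixed default
when the result is NULL.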
> + static const char *target_wide_charset_enum[] =
> + {
> + "UCS-2",
> + "UCS-2LE",
> + "UCS-2BE",
> + "UCS-4",
> + "UCS-4LE",
> + "UCS-4BE",
> + "UTF-16",
> + "UTF-16LE",
> + "UTF-16BE",
> + "UTF-32",
> + "UTF-32LE",
> + "UTF-32BE",
> + 0
> + };
Why do we need the UCS-2 charsets? That's just confusing; are there
important platforms that support UCS-2 instead of UTF-16? I'd also
suggest removing UTF-32 and its endian variants, since they are
identical to UCS-4. (Unless someone wants to support the Emacs 23
internal representation, but that one should be called by its own
name anyway.)
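For readers unsure why UCS-2 and UTF-16 are listed separately: UCS-2
can only encode code points up to U+FFFF, whereas UTF-16 reaches the
rest of Unicode through surrogate pairs. The sketch below shows the
standard pair decoding; the function name utf16_combine_surrogates is
hypothetical and does not come from the patch.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: combine a UTF-16 high
   surrogate HI (0xD800..0xDBFF) and low surrogate LO
   (0xDC00..0xDFFF) into the code point they encode.  Returns
   UINT32_MAX if the two units do not form a valid pair.  */
static uint32_t
utf16_combine_surrogates (uint16_t hi, uint16_t lo)
{
  if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
    return UINT32_MAX;

  return 0x10000 + (((uint32_t) hi - 0xD800) << 10)
	 + ((uint32_t) lo - 0xDC00);
}
```

For example, the pair 0xD834 0xDD1E decodes to U+1D11E, a character
that UCS-2 cannot represent at all.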