This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.
Re: printing wchar_t*
On Friday 14 April 2006 18:29, Eli Zaretskii wrote:
> > Date: Fri, 14 Apr 2006 09:43:01 -0400
> > From: Paul Koning <pkoning@equallogic.com>
> > Cc: ghost@cs.msu.su, gdb@sources.redhat.com
> >
> > If you have 16 bit wide chars, it seems possible that those might
> > contain UTF-16 encoding of full (beyond BMP) Unicode characters.
>
> You could use wchar_t arrays for that, but then not every array
> element will be a full character, and you will not be able to access
> individual characters by their positional index.
So what? Even if wchar_t is 32 bits wide, the element at position 'i' can be
a combining character that modifies another character and is of little use by
itself.
> In other words, in this case each element of the wchar_t array is no
> longer a ``wide character'', but one of the few shorts that encode a
> character.
>
> If we want to support wchar_t arrays that store UTF-16, we will need
> to add a feature to GDB to convert UTF-16 to the full UCS-4
> codepoints, and output those.
That's what I mentioned in a reply to Jim: since the current string printing
code operates "one wchar_t at a time", it's not suitable for outputting UTF-16
encoded wchar_t values to the user.
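
To make the "one wchar_t at a time" problem concrete, here is a minimal sketch
of the kind of conversion step that would be needed, assuming a 16-bit wchar_t
that holds UTF-16. The name decode_utf16 and its interface are hypothetical and
are not taken from the GDB sources; the choice to return U+FFFD for an unpaired
surrogate is also just an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GDB: decode one code point from the
   UTF-16 buffer S of LEN units, starting at S[*I].  *I is advanced past
   the units consumed (one, or two for a valid surrogate pair).  An
   unpaired surrogate yields U+FFFD; that policy is an assumption.  */
static uint32_t
decode_utf16 (const uint16_t *s, size_t len, size_t *i)
{
  uint32_t hi = s[(*i)++];

  /* A high surrogate followed by a low surrogate encodes one code
     point above the BMP.  */
  if (hi >= 0xD800 && hi <= 0xDBFF && *i < len)
    {
      uint32_t lo = s[*i];

      if (lo >= 0xDC00 && lo <= 0xDFFF)
        {
          (*i)++;
          return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
        }
    }

  /* Any surrogate that reaches this point is unpaired.  */
  if (hi >= 0xD800 && hi <= 0xDFFF)
    return 0xFFFD;

  return hi;
}
```

A printing loop would call this repeatedly while i < len and output each
returned code point, instead of treating every array element as a character.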
> Alternatively, the FE will have to
> support display of UTF-16 encoded characters.
As for the FE, handling UTF-16 there is trivial, so printing just the wchar_t
values will be sufficient. Some work may be necessary only if we want to show
UTF-16 strings properly to a user of console gdb.
- Volodya