This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.

Re: printing wchar_t*

From: Vladimir Prus <ghost at cs dot msu dot su>
To: gdb at sources dot redhat dot com
Date: Fri, 14 Apr 2006 10:10:19 +0400
Subject: Re: printing wchar_t*
References: <e1lsqg$aml$1@sea.gmane.org> <8f2776cb0604131031g370d6fa9p9361421bd21d178@mail.gmail.com>

Jim Blandy wrote:

> On 4/13/06, Vladimir Prus <ghost@cs.msu.su> wrote:
>> I have a user-defined command that can produce the output I want, but is
>> defining a custom command the right approach?
> 
> Well, you'd like wide strings to be printed properly when they appear
> in structures, as arguments to functions, and so on, right?  So a
> user-defined command isn't ideal.

I think I'll still need to do some processing for wchar_t* on the frontend
side. The problem is that I don't see any way for gdb to print wchar_t that
does not require post-processing. It can print it as UTF-8, but for char*
gdb should use the local 8-bit encoding, which is likely *not* to be UTF-8.
Gdb could probably use some extra markers to tell the two kinds of value
apart:

   "foo"  for a string in the local 8-bit encoding
   L"foo" for a string in UTF-8 encoding.

It's also possible to use "\u" escapes.
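
A minimal sketch of the "\u" escape idea, assuming the wide string has
already been decoded into an array of Unicode code points. The name
escape_wide and its signature are hypothetical, not anything in gdb; the
choice of \uXXXX for the BMP and \UXXXXXXXX above it follows the C99
universal-character-name spelling and is only one possible convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a gdb function.  Write the LEN code points
   in SRC to OUT as plain ASCII.  Printable ASCII is copied through;
   everything else (and the backslash and double quote) becomes
   \uXXXX or \UXXXXXXXX.  Stops early rather than overflow OUT.
   Returns the number of characters written, excluding the NUL.  */
size_t
escape_wide (const uint32_t *src, size_t len, char *out, size_t outsz)
{
  size_t used = 0;

  assert (out != NULL && outsz > 0);
  out[0] = '\0';

  for (size_t i = 0; i < len; i++)
    {
      uint32_t cp = src[i];
      char buf[11];             /* room for "\UXXXXXXXX" plus NUL */
      int n;

      if (cp >= 0x20 && cp < 0x7f && cp != '\\' && cp != '"')
        n = snprintf (buf, sizeof buf, "%c", (int) cp);
      else if (cp <= 0xffff)
        n = snprintf (buf, sizeof buf, "\\u%04x", (unsigned int) cp);
      else
        n = snprintf (buf, sizeof buf, "\\U%08x", (unsigned int) cp);

      if (used + (size_t) n + 1 > outsz)
        break;                  /* no room left for this piece */
      memcpy (out + used, buf, (size_t) n + 1);
      used += (size_t) n;
    }
  return used;
}
```

With this, a frontend receives pure ASCII and can undo the escapes
itself, whatever the local 8-bit encoding is.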

But then there's a problem:

   - Do we assume that wchar_t is always UTF-16 or UTF-32?
   - If not:
     - how can the user select the encoding?
     - how will a user-specified encoding be handled?
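
To make the first question concrete, here is a sketch of what "assume
UTF-16 or UTF-32" could mean in code: pick the decoding from the size of
the target's wchar_t. That mapping (2 bytes means UTF-16, 4 bytes means
UTF-32) is precisely the assumption in question, and the function name and
signature are hypothetical. The elements are also assumed to be in host
byte order already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a gdb function.  Read one code point from
   a wide string of COUNT elements, each WIDTH bytes (2 or 4), starting
   at element *POS, and advance *POS past what was consumed.
   ASSUMPTION: 2-byte elements are UTF-16, 4-byte elements are UTF-32.  */
uint32_t
next_code_point (const void *buf, size_t width, size_t count, size_t *pos)
{
  assert (width == 2 || width == 4);
  assert (*pos < count);

  if (width == 4)
    return ((const uint32_t *) buf)[(*pos)++];

  const uint16_t *u = (const uint16_t *) buf;
  uint32_t hi = u[(*pos)++];

  /* A high surrogate followed by a low surrogate forms one code point
     above U+FFFF.  */
  if (hi >= 0xd800 && hi <= 0xdbff && *pos < count)
    {
      uint32_t lo = u[*pos];
      if (lo >= 0xdc00 && lo <= 0xdfff)
        {
          (*pos)++;
          return 0x10000 + ((hi - 0xd800) << 10) + (lo - 0xdc00);
        }
    }

  /* BMP character, or an unpaired surrogate passed through as-is.  */
  return hi;
}
```

If the encoding cannot be inferred from the element size, the width test
would have to be replaced by some user-visible setting, which is the
second half of the question.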

> The best approach would be to extend charset.[ch] to handle wide
> character sets as well, and then add code to the language-specific
> printing routines to use the charset functions.  (This is fortunately
> much simpler than adding support for multibyte characters.)

So, for each wchar_t element, language-specific code will call
'target_wchar_t_to_host', which will output a specific representation of
that wchar_t. Hmm, the interface there seems to assume there's a 1<->1
mapping between target and host characters.  This makes both the L"UTF8"
format and the ASCII string with \u escapes format impossible, it seems.
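
To illustrate why a 1<->1 interface does not fit, here is a sketch of a
conversion that returns a length, so that one target character can become
several host bytes. The name code_point_to_utf8 and its signature are
hypothetical and are not taken from charset.[ch]; the input is assumed to
be a Unicode code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface, not from gdb's charset.[ch].  Encode one
   code point as UTF-8 into OUT and return the number of bytes written
   (1 to 4), or 0 if CP is a surrogate or lies beyond U+10FFFF.  */
size_t
code_point_to_utf8 (uint32_t cp, unsigned char out[4])
{
  assert (out != NULL);

  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xc0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3f));
      return 2;
    }
  if (cp >= 0xd800 && cp <= 0xdfff)
    return 0;                   /* surrogates are not characters */
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xe0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[2] = (unsigned char) (0x80 | (cp & 0x3f));
      return 3;
    }
  if (cp < 0x110000)
    {
      out[0] = (unsigned char) (0xf0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3f));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[3] = (unsigned char) (0x80 | (cp & 0x3f));
      return 4;
    }
  return 0;
}
```

A single wchar_t such as U+0444 comes out as two bytes here, which a
function returning exactly one host character cannot express.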

- Volodya

Follow-Ups:
- Re: printing wchar_t*
  - From: Jim Blandy
- Re: printing wchar_t*
  - From: Eli Zaretskii

References:
- printing wchar_t*
  - From: Vladimir Prus
- Re: printing wchar_t*
  - From: Jim Blandy
