[RFC-v2] Allow explicit 16 or 32 char in 'x /s'

Thu Apr 1 09:34:00 GMT 2010

> -----Original Message-----
> From: gdb-patches-owner@sourceware.org [mailto:gdb-patches-owner@sourceware.org]
> On Behalf Of Eli Zaretskii
> > +the unit size defaults to @samp{b}, unless it is explicitly given.
> > +Use @kbd{x /hs} to display 16-bit char strings and @kbd{x /ws} to display
> > +32-bit strings.  The next use of @kbd{x /s} will again display 8-bit strings.
> 
> This is okay, but I still think we should mention that the encoding is
> UTF-16 and UCS-4, respectively, and that it cannot be changed.

   According to the c_emit_char function, the encoding is
UTF-16 (LE or BE, depending on target endianness)
or UTF-32 (also LE or BE).
  Is UCS-4 exactly the same as UTF-32?
  Furthermore, this is c_emit_char, which means that this
is language-specific output.
  Several languages have their own emit_char functions, and
several of those start with a
  c &= 0xFF;
line, which discards the higher bytes of the character value
(found in f-lang.c:86, m2-lang.c:45, objc-lang.c:287 and p-lang.c:161).
Of course, these implementations would benefit from
using the more up-to-date c-lang.c implementation, but that is another
story.
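
  To illustrate what that mask does to a wide character value, here is
a minimal sketch. The function mask_to_byte is a hypothetical helper
written for this mail, not something taken from the GDB sources; it only
reproduces the effect of the "c &= 0xFF;" line.

```c
#include <stdio.h>

/* Hypothetical helper, not GDB code: keeps only the low 8 bits of a
   character value, as the "c &= 0xFF;" line does.  */
unsigned int
mask_to_byte (unsigned int c)
{
  c &= 0xFF;
  return c;
}

/* Hypothetical helper: prints a value before and after masking.  */
void
show_masking (unsigned int c)
{
  printf ("0x%x -> 0x%x\n", c, mask_to_byte (c));
}
```

  With this, a 16-bit value such as 0x20AC is reduced to 0xAC, so the
original character can no longer be recovered from what is emitted.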

  This means that UTF-16 and UTF-32 will only be used
for c, cplus, assembler and minimal.
  The Java language seems to use another scheme to represent
extended characters: it uses
  fprintf_unfiltered (stream, "\\u%.4x", (unsigned int) c);
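
  As a rough sketch of what that format string produces, the same
format can be tried with plain snprintf. The function format_u_escape
is a hypothetical stand-in written for this mail; it is not the GDB
code and does not use fprintf_unfiltered.

```c
#include <stdio.h>

/* Hypothetical helper, not GDB code: formats a character value with
   the same "\\u%.4x" format string, into a caller-supplied buffer.
   Returns what snprintf returns.  */
int
format_u_escape (char *buf, size_t size, unsigned int c)
{
  return snprintf (buf, size, "\\u%.4x", c);
}
```

  The ".4" precision only sets a minimum of four hex digits, so a value
above 0xFFFF is printed with more than four digits rather than being
truncated.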

  To summarize, I don't think it is correct to say that 'x /hs' uses
UTF-16 without specifying that this is language-specific.

  Should I just mention that the output is language-dependent
and uses UTF-16 or UTF-32 for the c, cplus, assembler and minimal languages?

Pierre Muller