[PATCH/WIP] C/C++ wchar_t/Unicode printing support
Tom Tromey
tromey@redhat.com
Sun Feb 1 22:42:00 GMT 2009
>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:
>> I suppose one option would be to have a degraded mode where we require
>> that the host charset and the target charset be the same. Then maybe
>> we could make it work by redefining iswprint and wchar_t.
Daniel> I don't see the connection between the iconv dependency and
Daniel> iswprint / wchar_t. Are there portability issues for those
Daniel> too? They don't come from libiconv.
Yeah, there isn't a direct connection.
Basically there are two problems to solve.
One is having a way to convert from a target charset of some kind to a
host charset of some kind.
The other issue is deciding how to print things on the host. We want
to use a host wide character of some sort, so that we can print a
larger subset of characters on a capable terminal. This also lets us
handle "set print repeat", on some platforms anyway, without needing
details about a possible host-side variable-length encoding.
I chose to solve the first problem by using iconv for all the
conversions, and the second by using wchar_t and iswprint for host
printability decisions.
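As a rough illustration of the second half, here is a minimal sketch of a host printability check built on iswprint. The helper name format_host_wchar and the "\x" escape format are made up for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not from the patch: render one host wide
   character into BUF.  Characters the current locale considers
   printable are emitted through the "%lc" conversion; everything
   else becomes a hexadecimal escape.  Returns what snprintf
   returns.  */
int
format_host_wchar (wchar_t wc, char *buf, size_t bufsize)
{
  assert (buf != NULL && bufsize > 0);

  if (iswprint ((wint_t) wc))
    {
      int n = snprintf (buf, bufsize, "%lc", (wint_t) wc);

      /* A negative result means the wide character could not be
         converted to the host narrow charset; fall back to an
         escape in that case.  */
      if (n >= 0)
        return n;
    }

  return snprintf (buf, bufsize, "\\x%lx", (unsigned long) wc);
}
```

The result depends on the host locale, which is the point: the decision about what is printable is left entirely to the host's wide character classification.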
There may be portability issues for the use of wchar_t and iswprint.
I don't know. It would be helpful if someone with access to the more
exotic hosts out there could take a look.
Daniel> It seems like a dummy version of iconv_open which only succeeds if the
Daniel> two character sets are the same, plus a pass-through version of iconv,
Daniel> would be enough to remove the iconv dependency. That degraded mode
Daniel> covers all local debugging.
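To make that suggestion concrete, a stub along these lines might be enough. This is only a sketch: the phony_iconv names are hypothetical, the charset names are compared with a plain strcmp (a real version would probably want a case-insensitive comparison), and the interface simply mimics the usual iconv calling convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical stand-ins for illustration; they
   are not part of libiconv or of the patch.  */
typedef int phony_iconv_t;
#define PHONY_ICONV_INVALID ((phony_iconv_t) -1)

/* Succeed only when both charset names are identical.  */
phony_iconv_t
phony_iconv_open (const char *to, const char *from)
{
  assert (to != NULL && from != NULL);

  if (strcmp (to, from) != 0)
    {
      errno = EINVAL;
      return PHONY_ICONV_INVALID;
    }
  return 0;
}

/* Pass-through conversion: copy as many bytes as fit and update the
   in/out pointers and counts the way iconv does.  Returns 0 on
   success, or (size_t) -1 with errno set to E2BIG when the output
   buffer fills up before the input is consumed.  */
size_t
phony_iconv (phony_iconv_t cd, const char **inbuf, size_t *inbytesleft,
             char **outbuf, size_t *outbytesleft)
{
  size_t n;

  (void) cd;

  /* A null input asks for a state reset; there is no state here.  */
  if (inbuf == NULL || *inbuf == NULL)
    return 0;

  n = *inbytesleft < *outbytesleft ? *inbytesleft : *outbytesleft;
  memcpy (*outbuf, *inbuf, n);
  *inbuf += n;
  *inbytesleft -= n;
  *outbuf += n;
  *outbytesleft -= n;

  if (*inbytesleft > 0)
    {
      errno = E2BIG;
      return (size_t) -1;
    }
  return 0;
}
```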
The wchar_t issue comes into play because we actually do two
conversions when printing: one from the target charset to the host
wchar_t, and then a second one from the host wchar_t to the host
"narrow" charset.
This just adds a wrinkle to the implementation, though -- the general
plan still applies. We could either pretend that wchar_t == char, or
we could make an iconv that uses the mb* functions.
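For the second option, the two legs could be sketched roughly as below using mbrtowc and wcrtomb. This assumes the target charset is the same as the host locale's charset, which is exactly the degraded mode under discussion. The function names narrow_to_wide and wide_to_narrow are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical first leg: bytes in the host locale's charset to host
   wchar_t.  Returns the number of wide characters written, or
   (size_t) -1 on invalid or truncated input or when OUT is too
   small.  */
size_t
narrow_to_wide (const char *in, size_t inlen, wchar_t *out, size_t outlen)
{
  mbstate_t state;
  size_t count = 0;

  assert (in != NULL && out != NULL);
  memset (&state, 0, sizeof state);

  while (inlen > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, in, inlen, &state);

      if (n == (size_t) -1 || n == (size_t) -2 || count == outlen)
        return (size_t) -1;
      if (n == 0)
        n = 1;			/* Treat an embedded NUL as one byte.  */
      out[count++] = wc;
      in += n;
      inlen -= n;
    }
  return count;
}

/* Hypothetical second leg: host wchar_t back to the host narrow
   charset.  Returns the number of bytes written, or (size_t) -1 if
   a character cannot be represented or OUT is too small.  */
size_t
wide_to_narrow (const wchar_t *in, size_t inlen, char *out, size_t outlen)
{
  mbstate_t state;
  size_t used = 0;
  size_t i;

  assert (in != NULL && out != NULL);
  memset (&state, 0, sizeof state);

  for (i = 0; i < inlen; i++)
    {
      char tmp[MB_LEN_MAX];
      size_t n = wcrtomb (tmp, in[i], &state);

      if (n == (size_t) -1 || n > outlen - used)
        return (size_t) -1;
      memcpy (out + used, tmp, n);
      used += n;
    }
  return used;
}
```

Both legs depend on the locale in effect on the host, so this only stands in for iconv when no real charset translation is needed.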
I can implement this, but I'd rather do it only if it is truly needed.
How are you planning to handle this for CodeSourcery? Really I would
like to hear the answer to this from anybody shipping a gdb
executable.
I suppose my recommendation would be to put GNU libiconv into your
local tree, with some configury tweaks to make it build a static
library. This does not seem very hard, though I suppose it is only
suitable if you are not too concerned about the resulting executable
size.
Another portability question is whether there is a platform that does
not have iconv at all. If every host we care about has some form of
iconv, even a bad one, perhaps we don't have to worry much -- users
could still have a functional-for-basic-native-debugging gdb.
Daniel> There'd need to be a little additional
Daniel> logic too, to allow you to set all the charset variables at
Daniel> once
I think "set charset" already does this. It doesn't handle the target
wide charset, but that seems ok in the degraded functionality mode.
Tom