Bug 9996 - wrong size of wide strings
Status: ASSIGNED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: 6.50
Importance: P2 normal
Target Milestone: 6.8
Assignee: Tom Tromey
 
Reported: 2009-03-24 17:14 UTC by Pedro Alves
Modified: 2014-10-27 12:46 UTC
Host: i686-pc-cygwin
Target: i686-pc-cygwin
Build: i686-pc-cygwin
Last reconfirmed: 2009-03-24 17:17:36


Description Pedro Alves 2009-03-24 17:14:16 UTC
On cygwin, with the new charset support, I get this:

 (top-gdb) whatis L""
 type = wchar_t [1]

 (top-gdb) ptype wchar_t
 type = short unsigned int

This is correct.  wchar_t is 16-bit on Windows.

 (top-gdb) ptype L""
 type = short unsigned int [1]
 (top-gdb) ptype L"a"
 type = short unsigned int [3]
 (top-gdb) ptype L"aa"
 type = short unsigned int [5]
 (top-gdb) ptype L"aaa"
 type = short unsigned int [7]
 (top-gdb) ptype L"aaaa"
 type = short unsigned int [9]

Notice how the size of the array grows as 1, 3, 5, 7, 9, ...: each
additional character adds two elements instead of one.

GDB was linked with the real iconv.

On x86_64-linux, where wchar_t is 32-bit, I get the expected sizes:

 (top-gdb) ptype L""
 type = int [1]
 (top-gdb) ptype L"a"
 type = int [2]
 (top-gdb) ptype L"aa"
 type = int [3]
 (top-gdb) ptype L"aaa"
 type = int [4]
 (top-gdb) ptype L"aaaa"
 type = int [5]
Comment 1 Tom Tromey 2009-03-24 17:17:36 UTC
Mine.
Comment 2 Tom Tromey 2009-03-25 15:22:54 UTC
I think we tracked this down to a bad setting for target-wide-charset.
If you set this to UCS-2, it should work.

I'm leaving this open for the time being because I want to write
some additional documentation about this.
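
The sizes in the description are consistent with a charset/width mismatch.
The following is a minimal sketch of the arithmetic only, not GDB's actual
code. It assumes the literal is encoded in the target wide charset, the byte
count is divided by sizeof(wchar_t), and one element is added for the
terminating null. The function name wide_array_length is hypothetical, and
Python's utf-32-le and utf-16-le codecs stand in for UCS-4 and UCS-2 (they
produce the same bytes for these ASCII strings).

```python
def wide_array_length(text, charset_codec, wchar_size):
    """Model the array length that ptype reports for a wide string literal.

    Assumption: length = encoded bytes / sizeof(wchar_t) + 1 (terminator).
    """
    encoded = text.encode(charset_codec)
    return len(encoded) // wchar_size + 1


literals = ["", "a", "aa", "aaa", "aaaa"]

# 4-byte charset with a 2-byte wchar_t: the mismatch reported on cygwin.
mismatched = [wide_array_length(s, "utf-32-le", 2) for s in literals]

# 2-byte charset with a 2-byte wchar_t: what setting UCS-2 should give.
matched = [wide_array_length(s, "utf-16-le", 2) for s in literals]

# 4-byte charset with a 4-byte wchar_t: the x86_64-linux case.
linux = [wide_array_length(s, "utf-32-le", 4) for s in literals]

print(mismatched)  # [1, 3, 5, 7, 9]
print(matched)     # [1, 2, 3, 4, 5]
print(linux)       # [1, 2, 3, 4, 5]
```

Under these assumptions, a 4-byte charset combined with a 2-byte wchar_t
reproduces the 1, 3, 5, 7, 9 sequence exactly.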
Comment 3 Pedro Alves 2009-03-25 15:29:30 UTC
Does it ever make sense to set target-wide-charset to UCS-4, when wchar_t is
16-bit, or UCS-2 when wchar_t is 32-bit?

Maybe we could have a more reasonable "auto" target-wide-charset
setting that defaults to UCS-2 or UCS-4, depending on the width of
wchar_t?
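
A rough sketch of what such a default could look like, purely to illustrate
the proposal. The function auto_wide_charset is hypothetical and does not
correspond to anything in GDB; it only maps a wchar_t width in bytes to a
charset name.

```python
def auto_wide_charset(wchar_size):
    """Pick a wide charset name from the width of wchar_t in bytes.

    Hypothetical helper: 2 bytes -> UCS-2, 4 bytes -> UCS-4.
    """
    if wchar_size == 2:
        return "UCS-2"
    if wchar_size == 4:
        return "UCS-4"
    raise ValueError("unsupported wchar_t width: %d bytes" % wchar_size)


print(auto_wide_charset(2))  # UCS-2 (e.g. Windows, or -fshort-wchar)
print(auto_wide_charset(4))  # UCS-4 (e.g. x86_64-linux)
```

Any other width raises an error rather than guessing, so an unusual target
would still need an explicit setting.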
Comment 4 Ilya Konstantinov 2014-10-27 12:46:41 UTC
Keep in mind this is not Windows-specific but happens on Linux with -fshort-wchar as well.

Ideally, we should support both UTF-16 and UTF-32 in the iconv-less (aka phony iconv) build (see gdb/charset.c).