This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [ping] [PATCH] Different outputs affected by locale
- From: Pedro Alves <palves at redhat dot com>
- To: Yao Qi <yao at codesourcery dot com>
- Cc: Tom Tromey <tromey at redhat dot com>, Joel Brobecker <brobecker at adacore dot com>, gdb-patches at sourceware dot org
- Date: Thu, 12 Jun 2014 18:23:34 +0100
- Subject: Re: [ping] [PATCH] Different outputs affected by locale
- References: <1401192650-29688-1-git-send-email-yao at codesourcery dot com> <538EAEE5 dot 2080708 at codesourcery dot com> <20140604124708 dot GR4289 at adacore dot com> <538F1CC3 dot 9090605 at codesourcery dot com> <87oay8a0t6 dot fsf at fleche dot redhat dot com> <538F803A dot 9020007 at redhat dot com> <538FE412 dot 1050806 at codesourcery dot com> <53903119 dot 6000204 at redhat dot com> <53903EE5 dot 8090107 at codesourcery dot com> <539042A2 dot 4050409 at redhat dot com> <539571C6 dot 40605 at codesourcery dot com> <53958862 dot 5020106 at redhat dot com> <5397BCEC dot 8080300 at codesourcery dot com> <539990BD dot 9020504 at redhat dot com> <5399BB32 dot 5050409 at codesourcery dot com>
On 06/12/2014 03:37 PM, Yao Qi wrote:
> On 06/12/2014 07:36 PM, Pedro Alves wrote:
>> What does "show host-charset" show on Windows, before and after
>> you make GDB pick LC_CTYPE=C from the environment (with the
>> setlocale gnulib module)?
>
> GDB on Windows gets host charset from GetACP(), in
> charset.c:_initialize_charset ().
>
> #elif defined (USE_WIN32API)
>   {
>     /* "CP" + x<=5 digits + paranoia. */
>     static char w32_host_default_charset[16];
>
>     snprintf (w32_host_default_charset, sizeof w32_host_default_charset,
>               "CP%d", GetACP());
>     auto_host_charset_name = w32_host_default_charset;
>     auto_target_charset_name = auto_host_charset_name;
>   }
> #endif
>
I note gnulib's nl_langinfo replacement actually does
the same thing.
> GetACP doesn't depend on locale,
Yeah, it's a mess, and those are really different
things. The former is the system locale, while the latter is
the user locale. MSDN is confusing, but there are lots of blogs
around explaining this.
> so I don't think LC_CTYPE=C affects the
> host-charset in GDB. However, I do this:
>
> printf ("%d\n", GetACP());
>
> setlocale (LC_CTYPE, "");
> printf ("%d\n", GetACP());
>
> setlocale (LC_CTYPE, "C");
> printf ("%d\n", GetACP());
>
> On my Windows machine, 1252 is printed three times.
So what I'm thinking is indeed to make the test accept
the cent, but conditionally, like:
# Fall back to assuming 7-bit ASCII. Tests are run under LC_CTYPE=C.
set cent "\\\\242"
set test "show host-charset"
gdb_test_multiple $test $test {
    -re "CP1252\r\n$gdb_prompt $" {
        # With Windows code page 1252 (Latin 1), the cent
        # is printable.
        set cent "\u00A2"
        pass $test
    }
    -re "$gdb_prompt $" {
        pass $test
    }
}
>
>>
>> (Ideally, the wchar tests would actually iterate testing GDB
>> behaves as expected with different values of LC_CTYPE, etc. set
>> in the environment. With all other tests assuming ASCII as set
>> by default by the testsuite framework.)
>
> On the condition that we know or enumerate the expected output for
> wchars under each LC_CTYPE on different host (or OS). Test like this
> is out of the scope of GDB (or debugger) testing, IMO.
Not an exhaustive test, and not per host, but just picking a couple of
charsets/locales, so that we at least ensure that the framework is
all in sync. That is, check:
$ unset LC_CTYPE; gdb -ex "show host-charset" -ex ' p "\u00A2"' --batch
$ LC_CTYPE=XXX gdb -ex "show host-charset" -ex ' p "\u00A2"' --batch
$ LC_CTYPE=en_US gdb -ex "show host-charset" -ex ' p "\u00A2"' --batch
$ LC_CTYPE=en_US.UTF-8 gdb -ex "show host-charset" -ex ' p "\u00A2"' --batch
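
Something like the following could loop over those runs by hand. It is only a rough sketch, not part of the testsuite: the helper names with_ctype and check_ctypes and the GDB override variable are made up for illustration, and the expected output for each locale would still have to be written down separately.

```shell
# Hypothetical helper: run the given command with LC_CTYPE set to $1,
# or with LC_CTYPE removed from the environment if $1 is "unset".
# The subshell keeps the caller's environment untouched.
with_ctype () {
    _ct=$1
    shift
    if [ "$_ct" = unset ]; then
        (unset LC_CTYPE; "$@")
    else
        (LC_CTYPE=$_ct; export LC_CTYPE; "$@")
    fi
}

# Hypothetical driver: run the two GDB commands from the list above
# under each LC_CTYPE value passed as an argument.
# GDB is an assumed override variable; it defaults to plain "gdb".
check_ctypes () {
    for ctype in "$@"; do
        echo "== LC_CTYPE=$ctype =="
        with_ctype "$ctype" "${GDB:-gdb}" \
            -ex "show host-charset" -ex ' p "\u00A2"' --batch
    done
}
```

The four cases above would then be: check_ctypes unset XXX en_US en_US.UTF-8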
--
Pedro Alves