This is the mail archive of the
gdb@sourceware.org
mailing list for the GDB project.
Re: GDB/MI reporting non-ASCII file names
- From: Eli Zaretskii <eliz at gnu dot org>
- To: Pedro Alves <palves at redhat dot com>
- Cc: gdb at sourceware dot org
- Date: Fri, 09 Oct 2015 16:31:44 +0300
- Subject: Re: GDB/MI reporting non-ASCII file names
- Authentication-results: sourceware.org; auth=none
- References: <83a8s5d1nw dot fsf at gnu dot org> <560BCCF9 dot 2040202 at redhat dot com> <83bnckasck dot fsf at gnu dot org> <560BF686 dot 1030400 at redhat dot com> <83a8s3c3c3 dot fsf at gnu dot org> <5617A0FD dot 3020208 at redhat dot com>
- Reply-to: Eli Zaretskii <eliz at gnu dot org>
> Date: Fri, 09 Oct 2015 12:11:57 +0100
> From: Pedro Alves <palves@redhat.com>
> CC: gdb@sourceware.org
>
> On 09/30/2015 04:51 PM, Eli Zaretskii wrote:
>
> > If you compile a program from a source file whose name includes
> > non-ASCII characters, then debug that program with -i=mi, do you see
> > the file names correctly, after turning 7 bits off?
> >
>
> Looks like I see the same as you. With a file named "çêá.c":
>
> (gdb)
> set print sevenbit-strings on
> &"set print sevenbit-strings on\n"
> =cmd-param-changed,param="print sevenbit-strings",value="on"
> ^done
> (gdb) start
> ...
> *stopped,reason="breakpoint-hit",disp="del",bkptno="2",frame={addr="0x00000000004004fb",func="main",args=[{name="argc",value="1"},{name="argv",value="0x7fffffffd838"}],file="\303\247\303\252\303\241.c",fullname="/home/pedro/gdb/tests/\303\247\303\252\303\241.c",line="5"},thread-id="1",stopped-threads="all",core="2"
> (gdb)
>
> (gdb)
> set print sevenbit-strings off
> &"set print sevenbit-strings off\n"
> =cmd-param-changed,param="print sevenbit-strings",value="off"
> ^done
> (gdb) start
> ...
> *stopped,reason="breakpoint-hit",disp="del",bkptno="3",frame={addr="0x00000000004004fb",func="main",args=[{name="argc",value="1"},{name="argv",value="0x7fffffffd838"}],file="çêá.c",fullname="/home/pedro/gdb/tests/çêá.c",line="5"},thread-id="1",stopped-threads="all",core="2"
>
>
>
>
> But with a file named "γλώσσα.c" + "set print sevenbit-strings off":
>
> *stopped,reason="breakpoint-hit",disp="del",bkptno="1",frame={addr="0x00000000004004fb",func="main",args=[{name="argc",value="1"},{name="argv",value="0x7fffffffd808"}],file="ÎÎï\216ï\203ï\203Î.c",fullname="/home/pedro/gdb/tests/ÎÎï\216ï\203ï\203Î.c",line="5"},thread-id="1",stopped-threads="all",core="3"
> =breakpoint-deleted,id="1"
> (gdb)
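I think escaping the 0x7F..0x9F range is a leftover from the Latin-N era,
and is a bad idea with the current UTF-8 default: bytes 0x80..0x9F are
valid UTF-8 continuation bytes, so escaping them splits multibyte
characters, as in your second example.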
Would something like the following be acceptable (if accompanied by
the suitable changes to NEWS and the manual)?
diff --git a/gdb/utils.c b/gdb/utils.c
index afeff12..56eb9d5 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -1509,12 +1509,11 @@ printchar (int c, void (*do_fputs) (const char *, struct ui_file *),
            void (*do_fprintf) (struct ui_file *, const char *, ...)
            ATTRIBUTE_FPTR_PRINTF_2, struct ui_file *stream, int quoter)
 {
-  c &= 0xFF; /* Avoid sign bit follies */
+  c &= 0xFF; /* Avoid sign bit follies */
 
-  if (c < 0x20 || /* Low control chars */
-      (c >= 0x7F && c < 0xA0) || /* DEL, High controls */
-      (sevenbit_strings && c >= 0x80))
-    { /* high order bit set */
+  if (c < 0x20 || /* Low control chars */
+      (sevenbit_strings && c >= 0x80)) /* High order bit set */
+    {
       switch (c)
         {
         case '\n':