This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [RFA 1/2] PR gdb/20604 - fix "quit" when an invalid expression is used
- From: Pedro Alves <palves at redhat dot com>
- To: Ulrich Weigand <uweigand at de dot ibm dot com>
- Cc: Tom Tromey <tom at tromey dot com>, gdb-patches at sourceware dot org
- Date: Tue, 17 Oct 2017 18:28:46 +0100
- References: <20161107155718.B198310B7A7@oc8523832656.ibm.com>
Hi Ulrich, Tom,
On 11/07/2016 03:57 PM, Ulrich Weigand wrote:
> Pedro Alves wrote:
>> On 10/26/2016 08:44 PM, Ulrich Weigand wrote:
>>
>>> I just noticed that this test case completely breaks my daily testing.
>>> When running the quit.exp test, GDB will hang in a way that it isn't
>>> even killed by the timeout logic, and will keep blocking further
>>> execution forever.
>>>
>>> Attaching to GDB shows it blocked in an ioctl with this backtrace
>>> (which for some reason doesn't even include the quit path ...)
>>>
>>> #0 0x0feea474 in tcsetattr () from /lib/libc.so.6
>>> #1 0x102d7004 in _set_tty_settings (tty=0, tiop=0x10641adc) at /home/uweigand/dailybuild/spu-tc-2016-10-25/binutils-gdb-head/binutils-gdb/readline/rltty.c:476
>>> #2 0x102d70e0 in set_tty_settings (tty=<value optimized out>, tiop=<value optimized out>)
>>> at /home/uweigand/dailybuild/spu-tc-2016-10-25/binutils-gdb-head/binutils-gdb/readline/rltty.c:490
>>> #3 0x102d7530 in rl_deprep_terminal () at /home/uweigand/dailybuild/spu-tc-2016-10-25/binutils-gdb-head/binutils-gdb/readline/rltty.c:688
>>> #4 0x102ea2d4 in rl_callback_read_char () at /home/uweigand/dailybuild/spu-tc-2016-10-25/binutils-gdb-head/binutils-gdb/readline/callback.c:215
>>> #5 0x1017defc in gdb_rl_callback_read_char_wrapper (client_data=<value optimized out>)
>>
>>> Note that when running GDB directly, quit works fine. The problem
>>> occurs only when running GDB under the DejaGNU framework.
>>
>> In such cases, I suggest inserting a gdb_interact call in
>> the testcase and debugging that way.
>
> Thanks for the tip, this allowed me to debug this further.
>
>>> Does this ring any bells? Any thoughts on what could cause this?
>>
>> The [wait -i $gdb_spawn_id] in the test does look dangerous
>> in the sense that it won't be subject to timeout logic.
>> So if the previous test fails, that'll likely hang forever.
>>
>> Other than that, no ideas. Can you tell from the gdb.log how
>> far the test went?
>
> Apparently the problem has nothing to do with the "quit" command
> itself; GDB never even gets around to attempting to execute the
> command. Any test case along the lines of:
>
> send_gdb "<...>\n"
> set result [wait -i $gdb_spawn_id]
>
> will result in the same hang on my RHEL 5 system.
>
> What happens is that after the send_gdb command has sent the newline
> to the GDB process, readline triggers its end-of-line machinery.
> This will call "rl_deprep_terminal", which attempts to reset the
> TTY to its default settings. This uses the tcsetattr routine
> with the TCSADRAIN option, which gets translated by glibc into
> a TCSETSW ioctl.
>
> Now this ioctl causes the kernel (at least the 2.6.18 kernel in RHEL 5)
> to attempt to flush the *write* side of the TTY and wait until that
> flush has succeeded. However, apparently nobody is *reading* on that
> side of the TTY (since expect has only performed the send_gdb which
> writes on the other side of the TTY, and is now in a wait which does
> not read on any side of the TTY), and thus the wait never returns.
>
> Since rl_deprep_terminal also blocks SIGINT while issuing the tcsetattr,
> this cannot even be interrupted easily.
>
> Now I don't fully understand why more recent kernels don't appear to
> block indefinitely here; maybe something in the TTY buffering changed?
> In any case, I'm also not sure if the test is really doing the right
> thing here. Can this not be done using a gdb_expect or any of the
> other usual constructs that will actually read GDB's TTY output?
>
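
To make the point about reading the child's output and bounding the
wait concrete, here is a minimal sketch. It is written in Python rather
than expect/DejaGNU so that it runs standalone, it uses pipes rather
than a TTY (so it does not reproduce the tcsetattr behaviour), and it is
not the fix referenced in this thread. The helper name
run_with_deadline is hypothetical.

```python
import subprocess
import sys


def run_with_deadline(argv, stdin_text, timeout_s):
    """Send stdin_text to a child, read its output, wait at most timeout_s.

    Returns (exit_code, output), or (None, partial_output) if the
    deadline expired and the child had to be killed.
    """
    child = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        # communicate() keeps draining the child's output while waiting,
        # so the child cannot stall because nobody reads what it writes.
        output, _ = child.communicate(stdin_text, timeout=timeout_s)
        return child.returncode, output
    except subprocess.TimeoutExpired:
        # The wait is bounded: kill the child instead of blocking forever.
        child.kill()
        output, _ = child.communicate()
        return None, output
```

Assuming a gdb binary on PATH, a call such as
run_with_deadline(["gdb", "-q"], "quit\n", 10) would return either the
exit code and the captured output, or (None, ...) if gdb did not exit
within ten seconds.
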
[Meanwhile, over a year passed somehow...]
Thanks for the analysis. I'm not sure what to do about the
specifics of the issue you describe above, but since I was
touching "quit"-related tests this week, I remembered this
issue. I've now sent a fix that should at least prevent the
testsuite from hanging forever:
https://sourceware.org/ml/gdb-patches/2017-10/msg00534.html
Let me know what you think.
Thanks,
Pedro Alves