[RFC][gdb] Handle ^D when terminal is not prepped

Mon May 9 14:04:06 GMT 2022

Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:

> On 4/20/22 13:59, Andrew Burgess wrote:
>> Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:
>> 
>>> Hi,
>>>
>>> When running test-case gdb.base/eof-exit.exp, I get:
>>> ...
>>> (gdb) ^M
>>> (gdb) quit^M
>>> PASS: gdb.base/eof-exit.exp: with non-dump terminal: close GDB with eof
>>> ...
>>> but when run in combination with taskset -c 0, I get instead:
>>> ...
>>> (gdb) ^M
>>> (gdb) FAIL: gdb.base/eof-exit.exp: default: close GDB with eof (timeout)
>>> ...
>>>
>>> The test-case is a bit unusual, in the sense that most test-cases wait for
>>> gdb to present the prompt '(gdb) ' before sending input.  Instead, this
>>> test-case first sends a newline to the prompt:
>>> ...
>>>      send_gdb "\n"
>>> ...
>>> to generate a new prompt, but rather than waiting for the new prompt, it just
>>> waits for the resulting newline:
>>> ...
>>>      gdb_test_multiple "" "discard newline" {
>>>          -re "^\r\n" {
>>>          }
>>>      }
>>> ...
>>> and then sends the End-of-Transmission character (ascii code 4, caret notation
>>> '^D'):
>>> ...
>>>      send_gdb "\004"
>>> ...
>>>
>>> The purpose of this is to verify that the result is:
>>> ...
>>> (gdb) quit
>>> ...
>>> instead of:
>>> ...
>>> quit)
>>> ...
>>>
>>> Putting this together with the failure log above, it seems like the ^D has no
>>> observable effect.
>>>
>>> After adding some debugging code in readline, to verify what chars are read
>>> from stdin (call to read in rl_getc), we find that in the passing case we get
>>> '^D', but in the failing case '^@' (ascii code 0, '\0') instead.
>>>
>>> Readline treats the '^D' as EOF, and calls gdb's installed handler with NULL
>>> (meaning EOF), which then issues the quit.
>>>
>>> But readline does not invoke gdb's installed handler for the '^@', AFAIU
>>> because as per default keymap it treats it as the 'set mark' function.
>>>
>>> So, why does ^D end up as ^@?
>>>
>>> My theory is that this is due to "canonical mode" (
>>> "https://man7.org/linux/man-pages/man3/tcflow.3.html" ):
>>> ...
>>> Input is made available line by line.  An input line is
>>> available when one of the line delimiters is typed (NL, EOL,
>>> EOL2; or EOF at the start of line).  Except in the case of EOF,
>>> the line delimiter is included in the buffer returned by read(2).
>>> ...
>>>
>>> So, if the line is empty to start out with, and then ^D is issued and
>>> interpreted by the terminal as EOF, it'll make available an empty line, in
>>> other words a pointer to char ^@.
>>>
>>> The canonical mode seems to be on by default, but is switched off when
>>> readline puts the terminal in "prepped" mode.
>>>
>>> Gdb switches forth and back between having the readline handler installed and
>>> not (which makes readline switch back and forth between "prepped" and
>>> "deprepped" mode), and even readline itself seems to switch back and forth
>>> internally, in rl_callback_read_char.  So, apparantly the '^D' arrives at a
>>> moment when the terminal happens to be unprepped.
>>>
>>> Indeed, adding a 'usleep (100 * 1000)' at the end of rl_deprep_terminal, makes
>>> it more likely to catch the terminal in unprepped mode, and triggers the FAIL
>>> without taskset.
>>>
>>> At this point, it's good to point out that I have no idea whether this
>>> behaviour is as expected, or should be considered a bug, and if so whether the
>>> bug is in gdb, readline, or the terminal emulator.
>>>
>>> Either way, I suppose it would be nice if we treat the ^D the same regardless.
>>>
>>> This patch has a way of achieving this, by setting the terminal to
>>> non-canonical mode before initializing readline, such that both the prepped
>>> and deprepped mode keep the terminal in non-canonical mode.
>>>
>>> But terminal settings are sticky so they also need to be undone.
>>>
>>> The patch adds functions termios_enable_readline and termios_disable_readline,
>>> which are called at points found necessary using trial-and-error.
>>>
>>> Tested on x86_64-linux, both with and without taskset -c 0.
>>>
>>> I have an open question whether it would be necessary or a good idea to change
>>> more that just the canonical mode.
>>>
>>> Andrew also noted that a potential readline fix / workaround could be:
>>> ...
>>> @@ -630,7 +630,7 @@ readline_internal_charloop (void)
>>>   	 previous character is interpreted as EOF.  This doesn't work when
>>>   	 READLINE_CALLBACKS is defined, so hitting a series of ^Ds will
>>>   	 erase all the chars on the line and then return EOF. */
>>> -      if (((c == _rl_eof_char && lastc != c) || c == EOF) && rl_end == 0)
>>> +      if (((c == _rl_eof_char && lastc != c) || c == EOF || c == 0) && rl_end == 0)
>>>   	{
>>>   #if defined (READLINE_CALLBACKS)
>>>   	  RL_SETSTATE(RL_STATE_DONE);
>>> ...
>>> which indeed fixes the test-case, but broader testing runs into trouble in
>>> gdb.tui, so that might need more work, but could of course be trivial
>>> to fix.
>> 
>> I looked at this a little more.  If you check out tui_getc_1 (tui-io.c)
>> you'll see that we sometimes return 0 to indicate "ignore this
>> character", so clearly my idea above for handling 0 is not a good one.
>> 
>
> I see that indeed, I do wonder though if that in itself is correct.
>
> I managed to manually feed ^@ (char with ascii code 0) to a tui session 
> in a meaningful way:
> - gdb -tui
> - c-x o to put focus in command window
> - type "word1" + space
> - type c-2 to get ^@, to set mark at start of "word2"
> - type "word2" + space + "word3"
> - move cursor to start of "word3"
> - type c-x c-x to swap point and mark
> - watch point move from start of "word3" to start of "word2"
>
> So, doesn't returning 0 in tui_getc_1 set the mark, in accordance with 
> the active key bindings?
>
>> I wonder if we should, instead be handling EOF earlier, e.g. in
>> readline/input.c, maybe in here we should spot that we read '\0' and
>> convert this to EOF?  But then what if the user (for some reason)
>> legitimately wanted to send \0 to GDB...
>> 
>
> Agreed, and I think the scenario mentioned above is an example.
>
>> So then I start wondering if this is an artefact of switching between
>> canonical and non-canonical mode?  The EOF arrives in canonical mode,
>> which doesn't add \004 to the terminal buffer, then we switch to
>> non-canonical mode to read, and by this point its too late.  I do wonder
>> if the kernel should actually be adding the \004 character to the
>> pending characters buffer when we switch between modes...
>> 
>
> Apropos, the mode change is done using tcsetattr and it takes an 
> argument that decides what to do with pending i/o, but that seems more 
> output oriented, so AFAIU the only input thingy we have available there 
> is TCSAFLUSH which would discard pending input.
>
> But regardless, I suspect the problem will be same if the ^D arrives 
> after switching to non-canonical mode.  I think the add-usleep scenario 
> exposes that.
>
>> Anyway, having thought about that for a while I did wonder if we really
>> need to fix this at all.  I mean, right now, the behaviour of GDB is
>> that EOF sent to GDB while we're _not_ at a prompt will basically be
>> ignored
>
> Well, my theory is that it's not ignored, but sets the mark, which 
> mostly looks like it's ignored, unless you do c-x c-x after which you 
> may find point somewhere in the prompt, I'm guessing.
>
>>, after your patch the EOF will be acted on once we get back to a
>> prompt.  Is that a desirable change?
>> 
>
> I'd say this is roughly how I expect interactive programs to behave, 
> yes.  I type keystrokes that are processed, if it's busy it'll buffer 
> those keystrokes and get back to processing them when it stops being 
> busy.  I don't expect the program to suddenly start to process the 
> keystrokes differently, just because it's busy for a short while.
>
> So I guess my argument is that the change makes the behaviour more 
> consistent for the user.
>
>> So, I wondered if there was a way we could just "fix" the test, that is,
>> ensure the Ctrl-D is only sent once the prompt has been displayed and
>> readline has put the terminal back into non-canonical mode.  Turns out
>> that's pretty easy (see the patch below).
>> 
>> With your suggested usleep, I see the eof-exit.exp test fail reliably
>> without the patch below, and pass reliably with the patch below.  What
>> are are your thoughts?
>> 
>
> I think that fixing the test-case is a good idea, since we now know 
> about this issue (and a PR is filed to keep track of it), and there's no 
> reason to keep this occasional fail around.  If we want to change gdb 
> behaviour, we can just add a test-case.

I agree.

>
> As mentioned before, I'm not confident that this patch is a good 
> solution, or even that I understand the problem entirely (or that this 
> is a problem rather than a feature).
>
> Anyway, test-case patch LGTM.

Thanks.

I pushed the patch below to fix the test failures you're seeing.


Thanks,
Andrew


---

commit 205d0542821c91e04e0ed174d8862f486a076950
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Wed Apr 20 14:08:49 2022 +0100

    gdb: fix for gdb.base/eof-exit.exp test failures
    
    This fix relates to PR gdb/29032, this makes the test more stable by
    ensuring that the Ctrl-D is only sent once the prompt has been
    displayed.  This issue was also discussed on the mailing list here:
    
      https://sourceware.org/pipermail/gdb-patches/2022-April/187670.html
    
    The problem identified in the bug report is that sometimes the Ctrl-D
    (that the test sends to GDB) arrives while GDB is processing a
    command.  When this happens the Ctrl-D is handled differently than if
    the Ctrl-D is sent while GDB is waiting for input at a prompt.
    
    The original intent of the test was that the Ctrl-D be sent while GDB
    was waiting at a prompt, and that is the case the occurs most often,
    but, when the Ctrl-D arrives during command processing, then GDB will
    ignore the Ctrl-D, and the test will fail.
    
    This commit ensures the Ctrl-D is always sent while GDB is waiting at
    a prompt, which makes this test stable.
    
    But, that still leaves an open question, what should happen when the
    Ctrl-D arrives while GDB is processing a command?  This commit doesn't
    attempt to answer that question, which is while bug PR gdb/29032 will
    not be closed once this commit is merged.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29032

diff --git a/gdb/testsuite/gdb.base/eof-exit.exp b/gdb/testsuite/gdb.base/eof-exit.exp
index ad5f33d2f10..e04c38bf8d7 100644
--- a/gdb/testsuite/gdb.base/eof-exit.exp
+++ b/gdb/testsuite/gdb.base/eof-exit.exp
@@ -25,9 +25,27 @@ proc run_test {} {
     #
     # Send a newline character, which will cause GDB to redisplay the
     # prompt.
+    #
+    # We then consume the newline characters, and then make use of
+    # expect's -notransfer option to ensure that the prompt has been
+    # displayed, but to leave the prompt in expect's internal buffer.
+    # This is important as the following test wants to check how GDB
+    # displays the 'quit' message relative to the prompt, this is much
+    # easier to do if the prompt is still in expect's buffers.
+    #
+    # The other special thing we do here is avoid printing a 'PASS'
+    # result.  The reason for this is so that the GDB output in the
+    # log file will match what a user should see, this makes it much
+    # easier to debug issues.  Obviously we could print a 'PASS' here
+    # as the text printed by expect is not considered part of GDB's
+    # output, so the pattern matching will work just fine... but, the
+    # log file becomes much harder to read.
     send_gdb "\n"
     gdb_test_multiple "" "discard newline" {
 	-re "^\r\n" {
+	    exp_continue
+	}
+	-notransfer -re "^\[^\n\]*$::gdb_prompt $" {
 	}
     }