[RFC][gdb] Handle ^D when terminal is not prepped

Sun May 8 07:51:38 GMT 2022

On 4/20/22 13:59, Andrew Burgess wrote:
> Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> Hi,
>>
>> When running test-case gdb.base/eof-exit.exp, I get:
>> ...
>> (gdb) ^M
>> (gdb) quit^M
>> PASS: gdb.base/eof-exit.exp: with non-dump terminal: close GDB with eof
>> ...
>> but when run in combination with taskset -c 0, I get instead:
>> ...
>> (gdb) ^M
>> (gdb) FAIL: gdb.base/eof-exit.exp: default: close GDB with eof (timeout)
>> ...
>>
>> The test-case is a bit unusual, in the sense that most test-cases wait for
>> gdb to present the prompt '(gdb) ' before sending input.  Instead, this
>> test-case first sends a newline to the prompt:
>> ...
>>      send_gdb "\n"
>> ...
>> to generate a new prompt, but rather than waiting for the new prompt, it just
>> waits for the resulting newline:
>> ...
>>      gdb_test_multiple "" "discard newline" {
>>          -re "^\r\n" {
>>          }
>>      }
>> ...
>> and then sends the End-of-Transmission character (ascii code 4, caret notation
>> '^D'):
>> ...
>>      send_gdb "\004"
>> ...
>>
>> The purpose of this is to verify that the result is:
>> ...
>> (gdb) quit
>> ...
>> instead of:
>> ...
>> quit)
>> ...
>>
>> Putting this together with the failure log above, it seems like the ^D has no
>> observable effect.
>>
>> After adding some debugging code in readline, to verify what chars are read
>> from stdin (call to read in rl_getc), we find that in the passing case we get
>> '^D', but in the failing case '^@' (ascii code 0, '\0') instead.
>>
>> Readline treats the '^D' as EOF, and calls gdb's installed handler with NULL
>> (meaning EOF), which then issues the quit.
>>
>> But readline does not invoke gdb's installed handler for the '^@', AFAIU
>> because as per default keymap it treats it as the 'set mark' function.
>>
>> So, why does ^D end up as ^@?
>>
>> My theory is that this is due to "canonical mode" (
>> "https://man7.org/linux/man-pages/man3/tcflow.3.html" ):
>> ...
>> Input is made available line by line.  An input line is
>> available when one of the line delimiters is typed (NL, EOL,
>> EOL2; or EOF at the start of line).  Except in the case of EOF,
>> the line delimiter is included in the buffer returned by read(2).
>> ...
>>
>> So, if the line is empty to start out with, and then ^D is issued and
>> interpreted by the terminal as EOF, it'll make available an empty line, in
>> other words a pointer to char ^@.
>>
>> The canonical mode seems to be on by default, but is switched off when
>> readline puts the terminal in "prepped" mode.
>>
>> Gdb switches forth and back between having the readline handler installed and
>> not (which makes readline switch back and forth between "prepped" and
>> "deprepped" mode), and even readline itself seems to switch back and forth
>> internally, in rl_callback_read_char.  So, apparantly the '^D' arrives at a
>> moment when the terminal happens to be unprepped.
>>
>> Indeed, adding a 'usleep (100 * 1000)' at the end of rl_deprep_terminal, makes
>> it more likely to catch the terminal in unprepped mode, and triggers the FAIL
>> without taskset.
>>
>> At this point, it's good to point out that I have no idea whether this
>> behaviour is as expected, or should be considered a bug, and if so whether the
>> bug is in gdb, readline, or the terminal emulator.
>>
>> Either way, I suppose it would be nice if we treat the ^D the same regardless.
>>
>> This patch has a way of achieving this, by setting the terminal to
>> non-canonical mode before initializing readline, such that both the prepped
>> and deprepped mode keep the terminal in non-canonical mode.
>>
>> But terminal settings are sticky so they also need to be undone.
>>
>> The patch adds functions termios_enable_readline and termios_disable_readline,
>> which are called at points found necessary using trial-and-error.
>>
>> Tested on x86_64-linux, both with and without taskset -c 0.
>>
>> I have an open question whether it would be necessary or a good idea to change
>> more that just the canonical mode.
>>
>> Andrew also noted that a potential readline fix / workaround could be:
>> ...
>> @@ -630,7 +630,7 @@ readline_internal_charloop (void)
>>   	 previous character is interpreted as EOF.  This doesn't work when
>>   	 READLINE_CALLBACKS is defined, so hitting a series of ^Ds will
>>   	 erase all the chars on the line and then return EOF. */
>> -      if (((c == _rl_eof_char && lastc != c) || c == EOF) && rl_end == 0)
>> +      if (((c == _rl_eof_char && lastc != c) || c == EOF || c == 0) && rl_end == 0)
>>   	{
>>   #if defined (READLINE_CALLBACKS)
>>   	  RL_SETSTATE(RL_STATE_DONE);
>> ...
>> which indeed fixes the test-case, but broader testing runs into trouble in
>> gdb.tui, so that might need more work, but could of course be trivial
>> to fix.
> 
> I looked at this a little more.  If you check out tui_getc_1 (tui-io.c)
> you'll see that we sometimes return 0 to indicate "ignore this
> character", so clearly my idea above for handling 0 is not a good one.
> 

I see that indeed, I do wonder though if that in itself is correct.

I managed to manually feed ^@ (char with ascii code 0) to a tui session 
in a meaningful way:
- gdb -tui
- c-x o to put focus in command window
- type "word1" + space
- type c-2 to get ^@, to set mark at start of "word2"
- type "word2" + space + "word3"
- move cursor to start of "word3"
- type c-x c-x to swap point and mark
- watch point move from start of "word3" to start of "word2"

So, doesn't returning 0 in tui_getc_1 set the mark, in accordance with 
the active key bindings?

> I wonder if we should, instead be handling EOF earlier, e.g. in
> readline/input.c, maybe in here we should spot that we read '\0' and
> convert this to EOF?  But then what if the user (for some reason)
> legitimately wanted to send \0 to GDB...
> 

Agreed, and I think the scenario mentioned above is an example.

> So then I start wondering if this is an artefact of switching between
> canonical and non-canonical mode?  The EOF arrives in canonical mode,
> which doesn't add \004 to the terminal buffer, then we switch to
> non-canonical mode to read, and by this point its too late.  I do wonder
> if the kernel should actually be adding the \004 character to the
> pending characters buffer when we switch between modes...
> 

Apropos, the mode change is done using tcsetattr and it takes an 
argument that decides what to do with pending i/o, but that seems more 
output oriented, so AFAIU the only input thingy we have available there 
is TCSAFLUSH which would discard pending input.

But regardless, I suspect the problem will be same if the ^D arrives 
after switching to non-canonical mode.  I think the add-usleep scenario 
exposes that.

> Anyway, having thought about that for a while I did wonder if we really
> need to fix this at all.  I mean, right now, the behaviour of GDB is
> that EOF sent to GDB while we're _not_ at a prompt will basically be
> ignored

Well, my theory is that it's not ignored, but sets the mark, which 
mostly looks like it's ignored, unless you do c-x c-x after which you 
may find point somewhere in the prompt, I'm guessing.

>, after your patch the EOF will be acted on once we get back to a
> prompt.  Is that a desirable change?
> 

I'd say this is roughly how I expect interactive programs to behave, 
yes.  I type keystrokes that are processed, if it's busy it'll buffer 
those keystrokes and get back to processing them when it stops being 
busy.  I don't expect the program to suddenly start to process the 
keystrokes differently, just because it's busy for a short while.

So I guess my argument is that the change makes the behaviour more 
consistent for the user.

> So, I wondered if there was a way we could just "fix" the test, that is,
> ensure the Ctrl-D is only sent once the prompt has been displayed and
> readline has put the terminal back into non-canonical mode.  Turns out
> that's pretty easy (see the patch below).
> 
> With your suggested usleep, I see the eof-exit.exp test fail reliably
> without the patch below, and pass reliably with the patch below.  What
> are are your thoughts?
> 

I think that fixing the test-case is a good idea, since we now know 
about this issue (and a PR is filed to keep track of it), and there's no 
reason to keep this occasional fail around.  If we want to change gdb 
behaviour, we can just add a test-case.

As mentioned before, I'm not confident that this patch is a good 
solution, or even that I understand the problem entirely (or that this 
is a problem rather than a feature).

Anyway, test-case patch LGTM.

Thanks,
- Tom

> Thanks,
> Andrew
> 
> ---
> 
> diff --git a/gdb/testsuite/gdb.base/eof-exit.exp b/gdb/testsuite/gdb.base/eof-exit.exp
> index 2d9530ccebe..c88aced9f35 100644
> --- a/gdb/testsuite/gdb.base/eof-exit.exp
> +++ b/gdb/testsuite/gdb.base/eof-exit.exp
> @@ -25,9 +25,27 @@ proc run_test {} {
>       #
>       # Send a newline character, which will cause GDB to redisplay the
>       # prompt.
> +    #
> +    # We then consume the newline characters, and then make use of
> +    # expect's -notransfer option to ensure that the prompt has been
> +    # displayed, but to leave the prompt in expect's internal buffer.
> +    # This is important as the following test wants to check how GDB
> +    # displays the 'quit' message relative to the prompt, this is much
> +    # easier to do if the prompt is still in expect's buffers.
> +    #
> +    # The other special thing we do here is avoid printing a 'PASS'
> +    # result.  The reason for this is so that the GDB output in the
> +    # log file will match what a user should see, this makes it much
> +    # easier to debug issues.  Obviously we could print a 'PASS' here
> +    # as the text printed by expect is not considered part of GDB's
> +    # output, so the pattern matching will work just fine... but, the
> +    # log file becomes much harder to read.
>       send_gdb "\n"
>       gdb_test_multiple "" "discard newline" {
>   	-re "^\r\n" {
> +	    exp_continue
> +	}
> +	-notransfer -re "^\[^\n\]*$::gdb_prompt $" {
>   	}
>       }
>   
>