[PATCH 00/17] Interrupting programs that block/ignore SIGINT

Mon Jun 14 17:55:17 GMT 2021

On 2021-06-03 8:51 p.m., Eli Zaretskii wrote:
>> From: Pedro Alves <pedro@palves.net>
>> Date: Thu,  3 Jun 2021 20:02:26 +0100
>>
>> Currently, on GNU/Linux, it is not possible to interrupt with Ctrl-C
>> programs that block or ignore SIGINT, with e.g., sigprocmask or
>> signal(SIGINT, SIG_IGN).  You type Ctrl-C, but nothing happens.
> 
> I'm not sure I understand why this is a problem.  If a debuggee blocks
> or ignores SIGINT, then SIGINT cannot be used to stop it, neither when
> it runs outside the debugger nor when it runs under a debugger.  There
> are enough other methods to stop such a debuggee (e.g., send it a
> different signal), but basically such a program clearly tells that it
> doesn't want to be interrupted by SIGINT, period.

It sounds like you're taking the view that Ctrl-C should be seen by the
inferior first, and behave exactly as if you were not debugging
the program?

My view is that, as a user, I see Ctrl-C as the standard way across configurations
to interrupt programs so I can inspect them.  When I type Ctrl-C, I'm telling
GDB that I want to stop the program.  I'm not really thinking that I'm typing Ctrl-C
in the program's terminal.  I.e., I'm not specifically interested in the
SIGINT signal itself.  If the program doesn't stop, then I see that as a GDB bug.
It's not unlike how when you're remote debugging, or debugging a program you've
attached to, you can type Ctrl-C in GDB's terminal, and that stops the program.
It happens to stop it with SIGINT, but in those cases we're not really typing
the Ctrl-C in the program's terminal either.

Making the inferior have its own terminal makes native debugging work similarly to
remote debugging or attaching, in the sense that GDB gets the signal first, and then
is free to decide what to do with it, instead of being at the mercy of how the inferior
configured its signal masks.

> 
> Btw, what about programs that install a SIGINT handler that does
> something when SIGINT is delivered, but don't block or ignore SIGINT?

With current master, you type Ctrl-C, and a SIGINT reaches the inferior
directly without stopping the program.

After my patches, you type Ctrl-C, and GDB stops the program.  If you
want the program to get a SIGINT, you can then still do "signal SIGINT"
to pass a SIGINT to the program.  It's the same as if with current master you
had "handle SIGINT stop print nopass" and then typed Ctrl-C but still wanted to
pass the signal to the inferior after GDB intercepts it.

Like so:

~~~~~~~~~~~~
$ cat sigint.c
#include <stdio.h>
#include <sys/signal.h>

static void
handler (int sig)
{
  printf ("Got SIGINT\n");
}

int
main ()
{
  signal (SIGINT, handler);

  while (1)
    ;

  return 0;
}
~~~~~~~~~~~~

With my patches:

 ...
 (gdb) r
 Starting program: sigint
 # press ctrl-c
 Program stopped.
 main () at sigint.c:15
 15        while (1)
 (gdb) c
 Continuing.
 # press ctrl-c again
 Program stopped.
 main () at sigint.c:15
 15        while (1)
 (gdb) signal SIGINT
 Continuing with signal SIGINT.
 Got SIGINT

With current master GDB, or with my patches + "(gdb) tty /dev/tty":

 (gdb) handle SIGINT print stop nopass
 SIGINT is used by the debugger.
 Are you sure you want to change it? (y or n) y
 Signal        Stop      Print   Pass to program Description
 SIGINT        Yes       Yes     No              Interrupt
 (gdb) r
 Starting program: sigint 
 ^C
 Program received signal SIGINT, Interrupt.
 main () at sigint.c:15
 15        while (1)
 (gdb) c
 Continuing.
 ^C
 Program received signal SIGINT, Interrupt.
 main () at sigint.c:15
 15        while (1)
 (gdb) signal SIGINT
 Continuing with signal SIGINT.
 Got SIGINT


I guess I could mention this in the new interrupting programs section in the
manual.

> 
>> Similarly, if a program uses sigwait to wait for SIGINT, and the
>> program receives a SIGINT, the SIGINT is _not_ intercepted by ptrace,
>> it goes straight to the inferior.
> 
> This sounds like a more serious issue, but how many programs use this
> technique?

Obviously I can't know that number.  I only know the issue has been
reported several times over the years.  PRs gdb/9425, gdb/14559
are the oldest bugs I've seen about it, but I've seen people ask
about this a number of times over the years.


Note that the interrupting programs that block/ignore SIGINT part was
really the use case among several that I picked as being good standalone
justification for the series.  

The being able to take control of inferior input/output when we switch to the
TUI would be another one.  That the TUI gets messed up when the inferior prints
something is also a recurring wart people complain about.

Also, SIGINT handling in multi-target scenarios is quite awkward.  Say you're debugging
a multi-target program, and under control of one GDB, you have:

 - one native process
 - one remote process

Currently, whether GDB gets the SIGINT first, or whether the kernel sends a SIGINT
to the native process, is basically random -- it depends on which process
GDB resumed first -- if the native process, then GDB puts that process in the
foreground and SIGINT will reach it first.  If the remote target, then GDB
gets the SIGINT first, and then GDB tries to forward it to one of the processes,
via target_pass_ctrlc.

Now let's imagine one step forward, and consider a different kind of multi-target
debugging -- debugging a native GNU/Linux process which uses some some accelerator
co-processor threads, GPU threads, etc.  Imagine GDB is able to debug all that,
where "info threads" shows both CPU and device threads.  You have all threads of the CPU
and device stopped.  You now resume one of the accelerator device threads, leaving the
CPU threads stopped.  E.g., say the device thread is 100, and you do

  (gdb) set scheduler-locking on
  (gdb) thread 100
  (gdb) c

As a response to the resumption, GDB puts the process's terminal settings in effect
and puts the inferior in the foreground of its terminal.  

And now you type Ctrl-C to stop the program.  But, instead nothing happens,
Ctrl-C just does nothing.  

That's because while the process is in the foreground, no CPU thread has been resumed,
so there's no thread that can actually receive the SIGINT.  Unix signals don't exist
on the accelerator device.  In this situation, there would be no way to get back
the prompt, unless the device thread hits a breakpoint or some other kind of event
on its own.  This wasn't the original use case I wrote this series for, but I can actually
trigger this scenario with AMD ROCm GDB (debug CPU + GPU threads all in one).

If GDB always gets the signal first, then it can stop all processes in a more
controlled fashion.  

I'm not sure I mentioned this earlier, but we could even add a knob to make Ctrl-C
_not_ stop any process, but just drop to the prompt with the inferior still running.
That can be interesting in scenarios that do not want to disturb the inferior
if possible, as for example "set observer on".  Curiously, I remember that when the non-stop
mode was being designed back in 2007/2008, ctrl-c just giving you back the prompt
was on the table, but it wasn't possible to implement it back then so it was dropped.


So I see a good number of problems with the current way we handle SIGINT, and I
think that putting the inferior in its own terminal so that GDB gets the SIGINT
first finally lets us solve them.

> 
> I guess what I'm asking is whether these issues really justify the
> complications of setting up a separate terminal.  

I think they do.  The biggest wart seems to be the SIGHUP after detach,
but today I figured out a way to avoid it.  See below.

> The fact that the
> inferior could be stopped by a different signal, the fact that it
> could be hit by SIGHUP, the fact that users will need to decide
> whether or not they use the separate-terminal method (which will be
> the default, a backward-incompatible change) -- are all these
> complications really worth the (IMO) minor issues they solve?

Note that I think that the decision of whether to use a separate terminal only
really matters for the "run + detach" use case.  Our testcases mostly do that 
(run+detach) because they want to exercise something about detach, and using "run" is
the easiest way to launch the program.  I.e., it's just a convenience.  I don't think
real users want to do this very often.  If they do need to do it, the patch made GDB
inform you know how to fix it.

But, given your poke, I went back a bit to the drawing board, and found out how
to avoid the SIGHUP in the first place.  So thank you!

I was thinking -- if users really need to detach after run, then their programs will
already need to handle SIGHUP, because they will get the SIGHUP today already it they
close the terminal where GDB was running after detaching their program -- but that's
actually incorrect.  Today, after you run+detach, and then close gdb's terminal,
the detached process loses its terminal, but remains alive.  How could it be?

The reason is that in current master, by the time the process is detached,
we have target_terminal::ours() in effect, which means that the inferior's pgrp
is not the foreground process group.  And it's only processes in the
foreground process group that get the SIGHUP.  Background processes do not get it.

So that got me wondering, how to have the same with my changes.  And it turns out
not to be very hard.  We still need to make the inferior be the foreground process
group during normal debugging just as today.  But, we can make the session leader
process make itself the foreground pgrp after the detach, but before GDB closes
the terminal.

This actually works nicely.  The inferior output has nowhere to go after the detach,
but then that's what you would get if you closed GDB's terminal after the detach.
If you weren't going to do that, then you wouldn't need to detach in the first place.
So I think that's a minor detail, much less concerning than the SIGHUP.

I think that with this, I can drop the new query from GDB, the one
suggesting "tty /dev/tty" in the problem scenario.  And in turn, that lets
us not issue "tty /dev/tty" in all testcases touched by patch #8 except one, which
still needs it for another reason.

See below.

I'll folding this into the right spot in the series, and adjust the manual
changes accordingly too.  I'll also look at documenting the "signal SIGINT"
scenario discussed above.

WDTY?

>From ac842dc45bbe373e0c9a6c0cb839d8883da64fd0 Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Mon, 14 Jun 2021 15:49:06 +0100
Subject: [PATCH] no SIGHUP

Change-Id: I0a0170fab01f6566bed4d2a8a0d07e083d4fb768
---
 gdb/infcmd.c            | 23 -----------------------
 gdb/inflow.c            | 16 +++++++++++++---
 gdb/nat/fork-inferior.c | 31 +++++++++++++++++++++++++++++--
 3 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index 56fd48f97cb..5f777b5138b 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -2716,27 +2716,6 @@ notice_new_inferior (thread_info *thr, bool leave_running, int from_tty)
   attach_post_wait (from_tty, mode);
 }
 
-/* Check if the current inferior is associated with a session/terminal
-   created by GDB.  If so, query the user if he/she really wants to
-   detach.  */
-
-static void
-query_detach_if_gdb_owns_session ()
-{
-  if (child_gdb_owns_session (current_inferior ()))
-    {
-      if (!query (_("\
-warning: The process was started by GDB and associated with its\n\
-own session/tty.  If you detach, GDB closes the process's terminal\n\
-and the likely consequence is that the process is killed by SIGHUP,\n\
-unless it explicitly handles or ignores that signal.\n\
-One way to avoid this is by starting the program in the same session\n\
-as GDB using the \"tty /dev/tty\" command.\n\
-Detach anyway? ")))
-	error (_("Not confirmed."));
-    }
-}
-
 /*
  * detach_command --
  * takes a program previously attached to and detaches it.
@@ -2758,8 +2737,6 @@ detach_command (const char *args, int from_tty)
 
   scoped_disable_commit_resumed disable_commit_resumed ("detaching");
 
-  query_detach_if_gdb_owns_session ();
-
   query_if_trace_running (from_tty);
 
   disconnect_tracing ();
diff --git a/gdb/inflow.c b/gdb/inflow.c
index 51700f971ad..38e29b26517 100644
--- a/gdb/inflow.c
+++ b/gdb/inflow.c
@@ -995,7 +995,14 @@ inflow_inferior_exit (struct inferior *inf)
 		  /* Flush any pending output and close the pty.  */
 		  delete_file_handler (run_terminal->pty_fd);
 		  child_terminal_flush_stdout (run_terminal);
-		  close (run_terminal->pty_fd);
+
+		  /* Explicitly send a SIGHUP instead of just closing
+		     the terminal and letting the kernel send it,
+		     because we want the session leader to have a
+		     chance to put itself in the foreground, so that
+		     its children, if any (e.g., we're detaching),
+		     don't get a SIGHUP too.  */
+		  kill (run_terminal->session_leader, SIGHUP);
 
 		  /* Since we closed the controlling terminal, the
 		     session leader should exit now, killed by SIGHUP.
@@ -1015,13 +1022,16 @@ inflow_inferior_exit (struct inferior *inf)
 			     inf->num, (int) run_terminal->session_leader,
 			     errno, safe_strerror (errno));
 		  else if (res != run_terminal->session_leader
-			   || !WIFSIGNALED (status)
-			   || WTERMSIG (status) != SIGHUP)
+			   || !WIFEXITED (status)
+			   || WEXITSTATUS (status) != 0)
 		    warning (_("unexpected waitstatus "
 			       "reaping session leader for inf %d (sid=%d): "
 			       "res=%d, status=0x%x"),
 			     inf->num, (int) run_terminal->session_leader,
 			     res, status);
+
+		  /* We can now close the terminal.  */
+		  close (run_terminal->pty_fd);
 		}
 #endif /* GDB_MANAGED_TERMINALS */
 	      delete run_terminal;
diff --git a/gdb/nat/fork-inferior.c b/gdb/nat/fork-inferior.c
index 31725673fd3..f140242ace5 100644
--- a/gdb/nat/fork-inferior.c
+++ b/gdb/nat/fork-inferior.c
@@ -85,6 +85,33 @@ class execv_argv
   std::string m_storage;
 };
 
+#if GDB_MANAGED_TERMINALS
+
+/* SIGHUP handler for the session leader processes.  GDB sends this
+   explicitly, though it'll also be called if GDB crashes and the
+   terminal is abruptly closed.  */
+
+static void
+session_leader_hup (int sig)
+{
+  scoped_ignore_sigttou ignore_sigttou;
+
+  /* We put the inferior (a child of the session leader) in the
+     foreground, so by default, on detach, if we did nothing else, the
+     inferior would get a SIGHUP when the terminal is closed by GDB.
+     That SIGHUP would likely kill the inferior.  To avoid it, we put
+     the session leader in the foreground before the terminal is
+     closed.  Only processes in the foreground process group get the
+     automatic SIGHUP, so the detached process doesn't get it.  */
+  int res = tcsetpgrp (0, getpid ());
+  if (res == -1)
+    trace_start_error (_("tcsetpgrp failed in session leader\n"));
+
+  _exit (0);
+}
+
+#endif
+
 /* Create argument vector for straight call to execvp.  Breaks up
    ALLARGS into an argument vector suitable for passing to execvp and
    stores it in M_ARGV.  E.g., on "run a b c d" this routine would get
@@ -433,8 +460,8 @@ fork_inferior (const char *exec_file_arg, const std::string &allargs,
 	      /* This is the parent / session leader process.  It just
 		 stays around until GDB closes the terminal.  */
 
-	      /* We want to exit when GDB closes our terminal.  */
-	      signal (SIGHUP, SIG_DFL);
+	      /* Gracefully handle SIGHUP.  */
+	      signal (SIGHUP, session_leader_hup);
 
 	      managed_tty_debug_printf
 		(_("session-leader (sid=%d): waiting for child pid=%d exit\n"),

-- 
2.26.2