This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[patch] Fix linux-ia64 on SIGILL for deleted breakpoint


Hi,

in some cases ia64 can generate SIGILL instead of SIGTRAP.  Guessing it is
a CPU bug instead of Linux kernel bug (but I may be wrong).
	https://bugzilla.redhat.com/show_bug.cgi?id=615538
	http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/ia64-sigill.c?cvsroot=systemtap

Anyway FSF GDB already contains code for handling such cases (and not only
SIGILL).  Just the time it gets executed the breakpoint which generated the
signal may be already deleted and GDB stops with:
	Program received signal SIGILL, Illegal instruction.

I do not think it can be solved fully target-independent.  linux-nat.c already
does some postponing of signals using
	kill_lwp (GET_LWP (lp->ptid), WSTOPSIG (lp->status));
and the moment linux-nat.c will read back such postponed signal its breakpoint
may be already deleted (and thus it will never get to target-independent parts
of GDB such as infrun.c).

Moreover I think the check could be more target specific as it could more
precisely check for siginfo_t.si_addr etc.  But there is currently no list of
targets and their SIGany->SIGTRAP conversions list so I would certainly break
compatibility with those arches.

Therefore I kept the arch-independent check in place but extended it also into
linux-nat.c.  BTW this patch works on GNU/Linux fine even with the
handle_inferior_event call completely removed.

Tested only with these patches applied (IMO they do not make a difference):
	[patch] Fix linux-nat.c {,lp->}status typo
	http://sourceware.org/ml/gdb-patches/2010-07/msg00270.html
	[patch] Fix linux-nat.c new_lp dropped status
	http://sourceware.org/ml/gdb-patches/2010-07/msg00272.html
No regressions on {x86_64,x86_64-m32,i686}-fedora12-linux-gnu.
No regressions on ia64-rhel55-linux-gnu.


Thanks,
Jan


gdb/
2010-07-19  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* inferior.h (stop_signal_is_trap): New prototype.
	* infrun.c (handle_inferior_event): Move the `Treating signal as
	SIGTRAP' code ...
	(stop_signal_is_trap): .... into a new function.
	* linux-nat.c
	(linux_nat_post_attach_wait) <WSTOPSIG (status) != SIGSTOP>: New
	variable target_signal.  Call stop_signal_is_trap.
	(linux_handle_extended_wait) <WSTOPSIG (status) != SIGSTOP>: Likewise.
	(wait_lwp): Call stop_signal_is_trap.
	(linux_nat_filter_event): Likewise.

gdb/testsuite/
2010-07-19  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* gdb.threads/ia64-sigill.exp: New file.
	* gdb.threads/ia64-sigill.c: New file.

--- a/gdb/inferior.h
+++ b/gdb/inferior.h
@@ -263,6 +263,9 @@ extern void error_is_running (void);
 /* Calls error_is_running if the current thread is running.  */
 extern void ensure_not_running (void);
 
+extern int stop_signal_is_trap (ptid_t ptid,
+				enum target_signal target_signal);
+
 void set_step_info (struct frame_info *frame, struct symtab_and_line sal);
 
 /* From infcmd.c */
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2926,6 +2926,31 @@ handle_syscall_event (struct execution_control_state *ecs)
   return 1;
 }
 
+/* Return 1 for signals caused by the debugger.  Return 0 for signals
+   that have to do with the program's own actions.  Note that breakpoint insns
+   may cause SIGTRAP or SIGILL or SIGEMT, depending on the operating system
+   version.  Here we detect when a SIGILL or SIGEMT is really a breakpoint and
+   change it to SIGTRAP.  We do something similar for SIGSEGV, since a SIGSEGV
+   will be generated when we're trying to execute a breakpoint instruction on
+   a non-executable stack.  This happens for call dummy breakpoints for
+   architectures like SPARC that place call dummies on the stack.  */
+
+int
+stop_signal_is_trap (ptid_t ptid, enum target_signal target_signal)
+{
+  if (target_signal == TARGET_SIGNAL_ILL
+      || target_signal == TARGET_SIGNAL_SEGV
+      || target_signal == TARGET_SIGNAL_EMT)
+    {
+      struct regcache *regcache = get_thread_regcache (ptid);
+
+      if (breakpoint_inserted_here_p (get_regcache_aspace (regcache),
+				      regcache_read_pc (regcache)))
+	return 1;
+    }
+  return 0;
+}
+
 /* Given an execution control state that has been freshly filled in
    by an event from the inferior, figure out what it means and take
    appropriate action.  */
@@ -2996,31 +3021,13 @@ handle_inferior_event (struct execution_control_state *ecs)
 
   breakpoint_retire_moribund ();
 
-  /* First, distinguish signals caused by the debugger from signals
-     that have to do with the program's own actions.  Note that
-     breakpoint insns may cause SIGTRAP or SIGILL or SIGEMT, depending
-     on the operating system version.  Here we detect when a SIGILL or
-     SIGEMT is really a breakpoint and change it to SIGTRAP.  We do
-     something similar for SIGSEGV, since a SIGSEGV will be generated
-     when we're trying to execute a breakpoint instruction on a
-     non-executable stack.  This happens for call dummy breakpoints
-     for architectures like SPARC that place call dummies on the
-     stack.  */
   if (ecs->ws.kind == TARGET_WAITKIND_STOPPED
-      && (ecs->ws.value.sig == TARGET_SIGNAL_ILL
-	  || ecs->ws.value.sig == TARGET_SIGNAL_SEGV
-	  || ecs->ws.value.sig == TARGET_SIGNAL_EMT))
+      && stop_signal_is_trap (ecs->ptid, ecs->ws.value.sig))
     {
-      struct regcache *regcache = get_thread_regcache (ecs->ptid);
-
-      if (breakpoint_inserted_here_p (get_regcache_aspace (regcache),
-				      regcache_read_pc (regcache)))
-	{
-	  if (debug_infrun)
-	    fprintf_unfiltered (gdb_stdlog,
-				"infrun: Treating signal as SIGTRAP\n");
-	  ecs->ws.value.sig = TARGET_SIGNAL_TRAP;
-	}
+      if (debug_infrun)
+	fprintf_unfiltered (gdb_stdlog,
+			    "infrun: Treating signal as SIGTRAP\n");
+      ecs->ws.value.sig = TARGET_SIGNAL_TRAP;
     }
 
   /* Mark the non-executing threads accordingly.  In all-stop, all
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1410,11 +1410,22 @@ linux_nat_post_attach_wait (ptid_t ptid, int first, int *cloned,
 
   if (WSTOPSIG (status) != SIGSTOP)
     {
+      enum target_signal target_signal;
+	  
       *signalled = 1;
       if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
 			    "LNPAW: Received %s after attaching\n",
 			    status_to_str (status));
+
+      target_signal = target_signal_from_host (WSTOPSIG (status));
+      if (stop_signal_is_trap (ptid, target_signal))
+	{
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"LNPAW: Treating signal as SIGTRAP\n");
+	  status = W_STOPCODE (SIGTRAP);
+	}
     }
 
   return status;
@@ -2222,6 +2233,8 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
 
 	  if (WSTOPSIG (status) != SIGSTOP)
 	    {
+	      enum target_signal target_signal;
+
 	      /* This can happen if someone starts sending signals to
 		 the new thread before it gets a chance to run, which
 		 have a lower number than SIGSTOP (e.g. SIGUSR1).
@@ -2231,6 +2244,15 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
 		 the other signal on to the thread below.  */
 
 	      new_lp->signalled = 1;
+
+	      target_signal = target_signal_from_host (WSTOPSIG (status));
+	      if (stop_signal_is_trap (new_lp->ptid, target_signal))
+		{
+		  if (debug_linux_nat)
+		    fprintf_unfiltered (gdb_stdlog,
+					"LHEW: Treating signal as SIGTRAP\n");
+		  status = W_STOPCODE (SIGTRAP);
+		}
 	    }
 	  else
 	    status = 0;
@@ -2433,6 +2455,15 @@ wait_lwp (struct lwp_info *lp)
 	return wait_lwp (lp);
     }
 
+  if (WIFSTOPPED (status)
+      && stop_signal_is_trap (lp->ptid,
+			      target_signal_from_host (WSTOPSIG (status))))
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog, "WL: Treating signal as SIGTRAP\n");
+      status = W_STOPCODE (SIGTRAP);
+    }
+
   return status;
 }
 
@@ -3065,6 +3096,16 @@ linux_nat_filter_event (int lwpid, int status, int options)
 	return NULL;
     }
 
+  if (WIFSTOPPED (status)
+      && stop_signal_is_trap (lp->ptid,
+			      target_signal_from_host (WSTOPSIG (status))))
+    {
+      if (debug_linux_nat)
+	fprintf_unfiltered (gdb_stdlog,
+			    "LLW: Treating signal as SIGTRAP\n");
+      status = W_STOPCODE (SIGTRAP);
+    }
+
   if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP)
     {
       /* Save the trap's siginfo in case we need it later.  */
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/ia64-sigill.c
@@ -0,0 +1,360 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2010 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#define _GNU_SOURCE
+#include <pthread.h>
+#include <stdio.h>
+#include <limits.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include <sys/types.h>
+#include <signal.h>
+#include <unistd.h>
+#include <asm/unistd.h>
+
+#define gettid() syscall (__NR_gettid)
+
+/* Terminate always in the main task, it can lock up with SIGSTOPped GDB
+   otherwise.  */
+#define TIMEOUT (gettid () == getpid() ? 10 : 15)
+
+static pid_t thread1_tid;
+static pthread_cond_t thread1_tid_cond = PTHREAD_COND_INITIALIZER;
+static pthread_mutex_t thread1_tid_mutex = PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP;
+
+static pid_t thread2_tid;
+static pthread_cond_t thread2_tid_cond = PTHREAD_COND_INITIALIZER;
+static pthread_mutex_t thread2_tid_mutex = PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP;
+
+static pthread_mutex_t terminate_mutex = PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP;
+
+/* Do not use alarm as it would create a ptrace event which would hang up us if
+   we are being traced by GDB which we stopped ourselves.  */
+
+static void timed_mutex_lock (pthread_mutex_t *mutex)
+{
+  int i;
+  struct timespec start, now;
+
+  i = clock_gettime (CLOCK_MONOTONIC, &start);
+  assert (i == 0);
+
+  do
+    {
+      i = pthread_mutex_trylock (mutex);
+      if (i == 0)
+	return;
+      assert (i == EBUSY);
+
+      i = clock_gettime (CLOCK_MONOTONIC, &now);
+      assert (i == 0);
+      assert (now.tv_sec >= start.tv_sec);
+    }
+  while (now.tv_sec - start.tv_sec < TIMEOUT);
+
+  fprintf (stderr, "Timed out waiting for internal lock!\n");
+  exit (EXIT_FAILURE);
+}
+
+static void *
+thread_func (void *threadno_voidp)
+{
+  int threadno = (intptr_t) threadno_voidp;
+  int i;
+
+  switch (threadno)
+  {
+    case 1:
+      timed_mutex_lock (&thread1_tid_mutex);
+
+      /* THREAD1_TID_MUTEX must be already locked to avoid race.  */
+      thread1_tid = gettid ();
+
+      i = pthread_cond_signal (&thread1_tid_cond);
+      assert (i == 0);
+      i = pthread_mutex_unlock (&thread1_tid_mutex);
+      assert (i == 0);
+
+      break;
+
+    case 2:
+      timed_mutex_lock (&thread2_tid_mutex);
+
+      /* THREAD2_TID_MUTEX must be already locked to avoid race.  */
+      thread2_tid = gettid ();
+
+      i = pthread_cond_signal (&thread2_tid_cond);
+      assert (i == 0);
+      i = pthread_mutex_unlock (&thread2_tid_mutex);
+      assert (i == 0);
+
+      break;
+
+    default:
+      assert (0);
+  }
+
+#ifdef __ia64__
+  asm volatile ("label:\n"
+		"nop.m 0\n"
+		"nop.i 0\n"
+		"nop.b 0\n");
+#endif
+  /* break-here */
+
+  /* Be sure the "t (tracing stop)" test can proceed for both threads.  */
+  timed_mutex_lock (&terminate_mutex);
+  i = pthread_mutex_unlock (&terminate_mutex);
+  assert (i == 0);
+
+  return NULL;
+}
+
+static const char *
+proc_string (const char *filename, const char *line)
+{
+  FILE *f;
+  static char buf[LINE_MAX];
+  size_t line_len = strlen (line);
+
+  f = fopen (filename, "r");
+  if (f == NULL)
+    {
+      fprintf (stderr, "fopen (\"%s\") for \"%s\": %s\n", filename, line,
+	       strerror (errno));
+      exit (EXIT_FAILURE);
+    }
+  while (errno = 0, fgets (buf, sizeof (buf), f))
+    {
+      char *s;
+
+      s = strchr (buf, '\n');
+      assert (s != NULL);
+      *s = 0;
+
+      if (strncmp (buf, line, line_len) != 0)
+	continue;
+
+      if (fclose (f))
+	{
+	  fprintf (stderr, "fclose (\"%s\") for \"%s\": %s\n", filename, line,
+		   strerror (errno));
+	  exit (EXIT_FAILURE);
+	}
+
+      return &buf[line_len];
+    }
+  if (errno != 0)
+    {
+      fprintf (stderr, "fgets (\"%s\": %s\n", filename, strerror (errno));
+      exit (EXIT_FAILURE);
+    }
+  fprintf (stderr, "\"%s\": No line \"%s\" found.\n", filename, line);
+  exit (EXIT_FAILURE);
+}
+
+static unsigned long
+proc_ulong (const char *filename, const char *line)
+{
+  const char *s = proc_string (filename, line);
+  long retval;
+  char *end;
+
+  errno = 0;
+  retval = strtol (s, &end, 10);
+  if (retval < 0 || retval >= LONG_MAX || (end && *end))
+    {
+      fprintf (stderr, "\"%s\":\"%s\": %ld, %s\n", filename, line, retval,
+	       strerror (errno));
+      exit (EXIT_FAILURE);
+    }
+  return retval;
+}
+
+static void
+state_wait (pid_t process, const char *wanted)
+{
+  char *filename;
+  int i;
+  struct timespec start, now;
+  const char *state;
+
+  i = asprintf (&filename, "/proc/%lu/status", (unsigned long) process);
+  assert (i > 0);
+
+  i = clock_gettime (CLOCK_MONOTONIC, &start);
+  assert (i == 0);
+
+  do
+    {
+      state = proc_string (filename, "State:\t");
+
+      /* torvalds/linux-2.6.git 464763cf1c6df632dccc8f2f4c7e50163154a2c0
+	 has changed "T (tracing stop)" to "t (tracing stop)".  Make the GDB
+	 testcase backward compatible with older Linux kernels.  */
+      if (strcmp (state, "T (tracing stop)") == 0)
+	state = "t (tracing stop)";
+
+      if (strcmp (state, wanted) == 0)
+	{
+	  free (filename);
+	  return;
+	}
+
+      if (sched_yield ())
+	{
+	  perror ("sched_yield()");
+	  exit (EXIT_FAILURE);
+	}
+
+      i = clock_gettime (CLOCK_MONOTONIC, &now);
+      assert (i == 0);
+      assert (now.tv_sec >= start.tv_sec);
+    }
+  while (now.tv_sec - start.tv_sec < TIMEOUT);
+
+  fprintf (stderr, "Timed out waiting for PID %lu \"%s\" (now it is \"%s\")!\n",
+	   (unsigned long) process, wanted, state);
+  exit (EXIT_FAILURE);
+}
+
+static volatile pid_t tracer = 0;
+static pthread_t thread1, thread2;
+
+static void
+cleanup (void)
+{
+  printf ("Resuming GDB PID %lu.\n", (unsigned long) tracer);
+
+  if (tracer)
+    {
+      int i;
+      int tracer_save = tracer;
+
+      tracer = 0;
+
+      i = kill (tracer_save, SIGCONT);
+      assert (i == 0);
+    }
+}
+
+int
+main (int argc, char **argv)
+{
+  int i;
+  int standalone = 0;
+
+  if (argc == 2 && strcmp (argv[1], "-s") == 0)
+    standalone = 1;
+  else
+    assert (argc == 1);
+
+  setbuf (stdout, NULL);
+
+  timed_mutex_lock (&thread1_tid_mutex);
+  timed_mutex_lock (&thread2_tid_mutex);
+
+  timed_mutex_lock (&terminate_mutex);
+
+  i = pthread_create (&thread1, NULL, thread_func, (void *) (intptr_t) 1);
+  assert (i == 0);
+
+  i = pthread_create (&thread2, NULL, thread_func, (void *) (intptr_t) 2);
+  assert (i == 0);
+
+  if (!standalone)
+    {
+      tracer = proc_ulong ("/proc/self/status", "TracerPid:\t");
+      if (tracer == 0)
+	{
+	  fprintf (stderr, "The testcase must be run by GDB!\n");
+	  exit (EXIT_FAILURE);
+	}
+      if (tracer != getppid ())
+	{
+	  fprintf (stderr, "The testcase parent must be our GDB tracer!\n");
+	  exit (EXIT_FAILURE);
+	}
+    }
+
+  /* SIGCONT our debugger in the case of our crash as we would deadlock
+     otherwise.  */
+
+  atexit (cleanup);
+
+  printf ("Stopping GDB PID %lu.\n", (unsigned long) tracer);
+
+  if (tracer)
+    {
+      i = kill (tracer, SIGSTOP);
+      assert (i == 0);
+      state_wait (tracer, "T (stopped)");
+    }
+
+  /* Threads are now waiting at timed_mutex_lock (thread1_tid_mutex) and so
+     they could not trigger the breakpoint before GDB gets unstopped later.
+     Threads get resumed at pthread_cond_wait below.  Use `while' loops for
+     protection against spurious pthread_cond_wait wakeups.  */
+
+  printf ("Waiting till the threads initialize their TIDs.\n");
+
+  while (thread1_tid == 0)
+    {
+      i = pthread_cond_wait (&thread1_tid_cond, &thread1_tid_mutex);
+      assert (i == 0);
+    }
+
+  while (thread2_tid == 0)
+    {
+      i = pthread_cond_wait (&thread2_tid_cond, &thread2_tid_mutex);
+      assert (i == 0);
+    }
+
+  printf ("Thread 1 TID = %lu, thread 2 TID = %lu, PID = %lu.\n",
+	  (unsigned long) thread1_tid, (unsigned long) thread2_tid,
+	  (unsigned long) getpid ());
+
+  printf ("Waiting till the threads get trapped by the breakpoint.\n");
+
+  if (tracer)
+    {
+      /* s390x-unknown-linux-gnu will fail with "R (running)".  */
+
+      state_wait (thread1_tid, "t (tracing stop)");
+
+      state_wait (thread2_tid, "t (tracing stop)");
+    }
+
+  cleanup ();
+
+  printf ("Joining the threads.\n");
+
+  i = pthread_mutex_unlock (&terminate_mutex);
+  assert (i == 0);
+
+  i = pthread_join (thread1, NULL);
+  assert (i == 0);
+
+  i = pthread_join (thread2, NULL);
+  assert (i == 0);
+
+  printf ("Exiting.\n");	/* break-at-exit */
+
+  return EXIT_SUCCESS;
+}
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/ia64-sigill.exp
@@ -0,0 +1,76 @@
+# This testcase is part of GDB, the GNU debugger.
+
+# Copyright 2010 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test SIGILL generated by some special cases of breakpoints on ia64.  Problem
+# was SIGILL being stored in non-current thread for later retrieval when its
+# breakpoint has been already deleted.  moribund locations are not active in
+# the default all-stop mode.
+
+set testfile "ia64-sigill"
+set srcfile ${testfile}.c
+set binfile ${objdir}/${subdir}/${testfile}
+if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" ${binfile} executable [list debug additional_flags=-lrt]] != "" } {
+    return -1
+}
+
+clean_restart $testfile
+
+if ![runto_main] {
+    return -1
+}
+
+set test "info addr label"
+gdb_test_multiple $test $test {
+    -re "Symbol \"label\" is at 0x\[0-9a-f\]+0 in .*\r\n$gdb_prompt $" {
+	# Verify the label really starts at the start of ia64 bundle.
+	pass $test
+
+	# ia64 generates SIGILL for breakpoint at B slot of an MIB bundle.
+	gdb_test "break *label+2" {Breakpoint [0-9]+ at 0x[0-9a-f]+2:.*}
+    }
+    -re "No symbol \"label\" in current context\\.\r\n$gdb_prompt $" {
+	pass $test
+
+	# Either this target never generates non-SIGTRAP signals or they do
+	# not depend on the breakpoint address.  Try any address.
+	gdb_breakpoint [gdb_get_line_number "break-here"]
+    }
+}
+
+gdb_test_no_output {set $sigill_bpnum=$bpnum}
+
+gdb_breakpoint [gdb_get_line_number "break-at-exit"]
+
+gdb_test_no_output "set debug infrun 1"
+
+# The ia64 SIGILL signal is visible only in the lin-lwp debug.
+gdb_test_no_output "set debug lin-lwp 1"
+
+gdb_test "continue" "Breakpoint \[0-9\]+,( .* in)? thread_func .*"
+
+gdb_test_no_output {delete $sigill_bpnum}
+
+set test "continue for the pending signal"
+gdb_test_multiple "continue" $test {
+    -re "Breakpoint \[0-9\]+, .*break-at-exit.*\r\n$gdb_prompt $" {
+	# Breakpoint has been skipped in the other thread.
+	pass $test
+    }
+    -re "Program received signal .*\r\n$gdb_prompt $" {
+	fail $test
+    }
+}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]