Bug 29868 - [gdb/tdep, s390x] breakpoint.c:5784: internal-error: bpstat_stop_status_nowatch: Assertion `!target_stopped_by_watchpoint ()'
Summary: [gdb/tdep, s390x] breakpoint.c:5784: internal-error: bpstat_stop_status_nowat...
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: tdep (show other bugs)
Version: HEAD
: P2 normal
Target Milestone: 13.1
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-12-10 08:18 UTC by Tom de Vries
Modified: 2023-01-11 09:26 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tom de Vries 2022-12-10 08:18:02 UTC
With current master on s390-linux and target board -unix/-m31, I run into:
...
(gdb) PASS: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: watchpoints work
continue^M
Continuing.^M
/suse/tdevries/gdb/src/gdb/breakpoint.c:5784: internal-error: bpstat_stop_status_nowatch: Assertion `!target_stopped_by_watchpoint ()' failed.^M
A problem internal to GDB has been detected,^M
further debugging may prove unreliable.^M
----- Backtrace -----^M
FAIL: gdb.threads/watchpoint-fork.exp: parent: singlethreaded: breakpoint after the first fork (GDB internal error)
...

Backtrace:
...
$ ./show-bt.sh | c++filt 
0x122f20b gdb_internal_backtrace_1^M
        gdb/bt-utils.c:122^M
0x122f30d gdb_internal_backtrace()^M
        gdb/bt-utils.c:168^M
0x1c27655 internal_vproblem^M
        gdb/utils.c:396^M
0x1c27b21 internal_verror(char const*, int, char const*, __va_list_tag*)^M
        gdb/utils.c:476^M
0x25f0d49 internal_error_loc(char const*, int, char const*, ...)^M
        gdbsupport/errors.cc:58^M
0x11e4c2b bpstat_stop_status_nowatch(address_space const*, unsigned long, thread_info*, target_waitstatus const&)
^M
        gdb/breakpoint.c:5784^M
0x1634877 handle_inferior_event^M
        gdb/infrun.c:5677^M
0x162fddd fetch_inferior_event()^M
        gdb/infrun.c:4202^M
0x16049eb inferior_event_handler(inferior_event_type)^M
        gdb/inf-loop.c:41^M
0x169474f handle_target_event^M
        gdb/linux-nat.c:4205^M
0x25f22a3 handle_file_event^M
        gdbsupport/event-loop.cc:573^M
0x25f2749 gdb_wait_for_event^M
        gdbsupport/event-loop.cc:694^M
0x25f0ff1 gdb_do_one_event(int)^M
        gdbsupport/event-loop.cc:217^M
0x16faa07 start_event_loop^M
        gdb/main.c:411^M
0x16fab99 captured_command_loop^M
        gdb/main.c:471^M
0x16fc8f3 captured_main^M
        gdb/main.c:1330^M
0x16fc989 gdb_main(captured_main_args*)^M
        gdb/main.c:1345^M
0x101affd main^M
        gdb/gdb.c:32^M
...

Well, it's current master and the tentative patch for PR29867, but I expect it's unrelated.
Comment 1 Tom de Vries 2022-12-10 11:25:57 UTC
(In reply to Tom de Vries from comment #0)
> Well, it's current master and the tentative patch for PR29867, but I expect
> it's unrelated.

Reproduces without that patch, both with and without -m31.
Comment 2 Tom de Vries 2022-12-12 03:14:59 UTC
I've added some code for debugging:
...
diff --git a/gdb/s390-linux-nat.c b/gdb/s390-linux-nat.c
index 96833e804e9..654a62545f3 100644
--- a/gdb/s390-linux-nat.c
+++ b/gdb/s390-linux-nat.c
@@ -602,6 +602,9 @@ s390_linux_nat_target::low_forget_process (pid_t pid)
     }
 }
 
+static void
+s390_show_debug_regs (int tid, const char *where);
+
 /* linux_nat_new_fork hook.   */
 
 void
@@ -625,6 +628,12 @@ s390_linux_nat_target::low_new_fork (struct lwp_info *parent, pid_t child_pid)
 
   child_state->watch_areas = parent_state->watch_areas;
   child_state->break_areas = parent_state->break_areas;
+
+  if (show_debug_regs)
+    {
+      s390_show_debug_regs (parent_pid, "fork -- parent");
+      s390_show_debug_regs (child_pid, "fork -- child");
+    }
 }
 
 /* Dump PER state.  */
@@ -656,6 +665,10 @@ s390_show_debug_regs (int tid, const char *where)
                per_info.lowcore.words.perc_atmid,
                per_info.lowcore.words.address,
                per_info.lowcore.words.access_id);
+
+  int result = (per_info.lowcore.bits.perc_storage_alteration == 1
+                && per_info.lowcore.bits.perc_store_real_address == 0);
+  debug_printf ("  stopped_by_watchpoint: %d\n", result);
 }
 
 bool
@@ -688,6 +701,8 @@ s390_linux_nat_target::stopped_by_watchpoint ()
       memset (&per_lowcore, 0, sizeof (per_lowcore));
       if (ptrace (PTRACE_POKEUSR_AREA, s390_inferior_tid (), &parea, 0) < 0)
        perror_with_name (_("Couldn't clear watchpoint status"));
+      if (show_debug_regs)
+       s390_show_debug_regs (s390_inferior_tid (), "clear");
     }
 
   return result;
...
and then get the following log:
...
$ gdbh -q -batch -ex "set trace-commands on" -x gdb.in
+file /suse/tdevries/gdb/build/s390x/gdb/testsuite/outputs/gdb.threads/watchpoint-fork/watchpoint-fork-parent-st
+set follow-fork-mode parent
+start
Temporary breakpoint 1 at 0x1000810: file /suse/tdevries/gdb/src/gdb/testsuite/gdb.threads/watchpoint-fork-st.c, line 49.

Temporary breakpoint 1, main () at /suse/tdevries/gdb/src/gdb/testsuite/gdb.threads/watchpoint-fork-st.c:49
49        marker ();
+watch var
Hardware watchpoint 2: var
+break marker
Breakpoint 3 at 0x10007da: file /suse/tdevries/gdb/src/gdb/testsuite/gdb.threads/watchpoint-fork-st.c, line 33.
+info break
Num     Type           Disp Enb Address            What
2       hw watchpoint  keep y                      var
3       breakpoint     keep y   0x00000000010007da in marker at /suse/tdevries/gdb/src/gdb/testsuite/gdb.threads/watchpoint-fork-st.c:33
+maint set show-debug-regs 1
+continue
PER (debug) state for 55488 -- resume
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 40f0  address: 10006a4  PAID: 0
  stopped_by_watchpoint: 0
PER (debug) state for 55488 -- stop
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 40f0  address: 10006a4  PAID: 0
  stopped_by_watchpoint: 0

Breakpoint 3, marker () at /suse/tdevries/gdb/src/gdb/testsuite/gdb.threads/watchpoint-fork-st.c:33
33      }
+continue
PER (debug) state for 55488 -- resume
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 40f0  address: 10006a4  PAID: 0
  stopped_by_watchpoint: 0
PER (debug) state for 55488 -- stop
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 40f0  address: 10006a4  PAID: 0
  stopped_by_watchpoint: 0
PER (debug) state for 55488 -- stop
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 40f0  address: 10006a4  PAID: 0
  stopped_by_watchpoint: 0
PER (debug) state for 55488 -- stop
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 20f0  address: 1000820  PAID: 0
  stopped_by_watchpoint: 1
PER (debug) state for 55488 -- clear
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 20f0  address: 1000820  PAID: 0
  stopped_by_watchpoint: 1

Hardware watchpoint 2: var

Old value = 0
New value = 1
main () at /suse/tdevries/gdb/src/gdb/testsuite/gdb.threads/watchpoint-fork-st.c:52
52        forkoff (1);
+continue
PER (debug) state for 55488 -- resume
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 20f0  address: 1000820  PAID: 0
  stopped_by_watchpoint: 1
PER (debug) state for 55488 -- fork -- parent
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 20f0  address: 1000820  PAID: 0
  stopped_by_watchpoint: 1
PER (debug) state for 55490 -- fork -- child
  cr9-11: 0 0 0
  start, end: 0 0
  code/ATMID: 0  address: 0  PAID: 0
  stopped_by_watchpoint: 0
PER (debug) state for 55488 -- stop
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 20f0  address: 1000820  PAID: 0
  stopped_by_watchpoint: 1
PER (debug) state for 55488 -- clear
  cr9-11: 20200000 100203c 100203f
  start, end: 100203c 100203f
  code/ATMID: 20f0  address: 1000820  PAID: 0
  stopped_by_watchpoint: 1
/suse/tdevries/gdb/src/gdb/breakpoint.c:5784: internal-error: bpstat_stop_status_nowatch: Assertion `!target_stopped_by_watchpoint ()' failed.
...

So, the watchpoint triggers, and we try to clear the watch point status in s390_linux_nat_target::stopped_by_watchpoint to make sure that we "Do not report this watchpoint again".  That doesn't seem to have any effect though, and when we stop for the fork s390_linux_nat_target::stopped_by_watchpoint still returns true, causing the assert.
Comment 3 Tom de Vries 2022-12-12 09:46:54 UTC
(In reply to Tom de Vries from comment #2)
> we try to clear the watch point status in
> s390_linux_nat_target::stopped_by_watchpoint to make sure that we "Do not
> report this watchpoint again".  That doesn't seem to have any effect though,

And that seems to be documented behaviour, in ./arch/s390/kernel/ptrace.c, in __poke_user_per we have:
...
        /*                                                                                    
         * There are only three fields in the per_info struct that the                        
         * debugger user can write to.                                                        
         * 1) cr9: the debugger wants to set a new PER event mask                             
         * 2) starting_addr: the debugger wants to set a new starting                         
         *    address to use with the PER event mask.                                         
         * 3) ending_addr: the debugger wants to set a new ending                             
         *    address to use with the PER event mask.                                         
         * The user specified PER event mask and the start and end                            
         * addresses are used only if single stepping is not in effect.                       
         * Writes to any other field in per_info are ignored.                                 
         */
...

So, the bits that we're trying to clear here in per_struct.lowcore, after the fields listed above.
Comment 4 Tom de Vries 2022-12-12 09:48:02 UTC
I found this bit of code in aarch64_linux_nat_target::stopped_data_address:
...
diff --git a/gdb/s390-linux-nat.c b/gdb/s390-linux-nat.c
index 96833e804e9..02b797e422f 100644
--- a/gdb/s390-linux-nat.c
+++ b/gdb/s390-linux-nat.c
@@ -673,6 +673,13 @@ s390_linux_nat_target::stopped_by_watchpoint ()
   if (state->watch_areas.empty ())
     return false;
 
+  siginfo_t siginfo;
+  if (!linux_nat_get_siginfo (inferior_ptid, &siginfo))
+    return false;
+  if (siginfo.si_signo != SIGTRAP
+      || (siginfo.si_code & 0xffff) != TRAP_HWBKPT)
+    return false;
+
   parea.len = sizeof (per_lowcore);
   parea.process_addr = (addr_t) & per_lowcore;
   parea.kernel_addr = offsetof (struct user_regs_struct, per_info.lowcore);
...
and it fixes the test-case.
Comment 5 Tom de Vries 2022-12-12 12:08:31 UTC
Also fixes:
...
(gdb) PASS: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1
continue^M
Continuing.^M
Reading symbols from /suse/tdevries/gdb/build/s390x/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/vfork-follow-parent...^M
^M
^M
Fatal signal: Segmentation fault^M
  ...
ERROR: GDB process no longer exists
GDB process exited with wait status 248659 exp19 0 0 CHILDKILLED SIGSEGV {segmentation violation}
UNRESOLVED: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent
Comment 6 Ulrich Weigand 2022-12-12 13:51:01 UTC
Thanks for the investigation!

The GDB code to "Do not report this watchpoint again" was added back in 2007 as part of this patch series:
https://sourceware.org/pipermail/gdb-patches/2007-September/052502.html
https://sourceware.org/pipermail/gdb-patches/2007-September/052503.html
and back then, it probably worked as expected.

The kernel ptrace implementation was changed to prohibit modifying those bits only in 2011 as part of some generic cleanup and tightening of access rules.  At that point, we probably missed that it broke GDB.

I agree that the TRAP_HWBKPT check looks correct - several other platforms have an equivalent check, so I think the patch is fine.  (Longer term, I'm wondering if this check shouldn't move to Linux generic code then?)

However, I think we should then *also* remove the "Do not report this watchpoint again" ptrace call - no sense in having this in if it doesn't work anyway; it just wastes time and confuses anyone reading that code.
Comment 7 Tom de Vries 2022-12-12 15:08:53 UTC
(In reply to Ulrich Weigand from comment #6)
> Thanks for the investigation!
> 

Np :)

> The GDB code to "Do not report this watchpoint again" was added back in 2007
> as part of this patch series:
> https://sourceware.org/pipermail/gdb-patches/2007-September/052502.html
> https://sourceware.org/pipermail/gdb-patches/2007-September/052503.html
> and back then, it probably worked as expected.
> 
> The kernel ptrace implementation was changed to prohibit modifying those
> bits only in 2011 as part of some generic cleanup and tightening of access
> rules.  At that point, we probably missed that it broke GDB.
> 

Ack.

> I agree that the TRAP_HWBKPT check looks correct - several other platforms
> have an equivalent check, so I think the patch is fine.  (Longer term, I'm
> wondering if this check shouldn't move to Linux generic code then?)
> 
> However, I think we should then *also* remove the "Do not report this
> watchpoint again" ptrace call - no sense in having this in if it doesn't
> work anyway; it just wastes time and confuses anyone reading that code.

I've submitted a patch implementing that approach at https://sourceware.org/pipermail/gdb-patches/2022-December/194635.html