Bug 31194 - Stepping on "int3" instruction fails assertion.
Status: NEW
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: HEAD
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-27 22:54 UTC by Peter Damianov
Modified: 2024-02-08 19:10 UTC
CC List: 4 users

See Also:
Host:
Target:
Build:
Last reconfirmed: 2023-12-29 00:00:00


Description Peter Damianov 2023-12-27 22:54:44 UTC
I tested this with:
GNU gdb (GDB) 15.0.50.20231227-git

But I could also reproduce it with gdb 13.1, and gdb 13.2

For the following code:

#include <stdio.h>

int main(void)
{
        __asm__("int3");
        for (int i = 0; i < 5; ++i)
        {
                printf("%d\n", i);
        }
}

Compiled with:
gcc test.c -o test -O2 -g3

GCC does lack an intrinsic to generate the "int3" instruction, but clang generates it from __builtin_debugbreak().

When I run, the debugger breaks as expected, but as soon as I step, the following happens:
../../gdb/infrun.c:2973: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.
Comment 1 Peter Damianov 2023-12-27 22:58:27 UTC
For clang, I meant to write __builtin_debugtrap()
And I can reproduce the same issue using clang to build the executable, as well.
Comment 2 Tom de Vries 2023-12-28 11:25:43 UTC
I couldn't reproduce this on ubuntu 22.04.3 with gcc 11.4.0.
Comment 3 Tom Tromey 2023-12-28 16:42:21 UTC
I can reproduce on Fedora 38 with the system gcc.

I'm not really sure how int3 is supposed to be handled
but basically gdb decides to keep stepping, then the assert
notices that we're stepping out of the range of that line.

(gdb) step
[infrun] clear_proceed_status_thread: 601872.601872.0
[infrun] set_step_info: symtab = q.c, line = 5, step_frame_id = {stack=0x7fffffffdec0,code=0x0000000000401126,!special}, step_stack_frame_id = {stack=0x7fffffffdec0,code=0x0000000000401126,!special}
[infrun] proceed: enter
  [infrun] follow_fork: enter
  [infrun] follow_fork: exit
  [infrun] proceed: cur_thr = 601872.601872.0
  [infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT, resume_ptid=601872.0.0
  [infrun] scoped_disable_commit_resumed: reason=proceeding
  [infrun] start_step_over: enter
    [infrun] start_step_over: stealing global queue of threads to step, length = 0
    [infrun] operator(): step-over queue now empty
  [infrun] start_step_over: exit
  [infrun] proceed: start: resuming threads, all-stop-on-top-of-non-stop
    [infrun] proceed_resume_thread_checked: resuming 601872.601872.0
    [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=0, current thread [601872.601872.0] at 0x40112e
    [infrun] set_thread_options: [options for 601872.601872.0 are now 0x2 [GDB_THREAD_OPTION_EXIT]]
    [infrun] do_target_resume: resume_ptid=601872.601872.0, step=1, sig=GDB_SIGNAL_0
    [infrun] infrun_async: enable=1
    [infrun] prepare_to_wait: prepare_to_wait
  [infrun] proceed: end: resuming threads, all-stop-on-top-of-non-stop
  [infrun] reset: reason=proceeding
  [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target native
  [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target native
[infrun] proceed: exit
[infrun] fetch_inferior_event: enter
  [infrun] scoped_disable_commit_resumed: reason=handling event
  [infrun] random_pending_event_thread: None found.
  [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
  [infrun] print_target_wait_results:   601872.601872.0 [Thread 0x7ffff7fa1640 (LWP 601872)],
  [infrun] print_target_wait_results:   status->kind = STOPPED, sig = GDB_SIGNAL_TRAP
  [infrun] handle_inferior_event: status->kind = STOPPED, sig = GDB_SIGNAL_TRAP
  [infrun] start_step_over: enter
    [infrun] start_step_over: stealing global queue of threads to step, length = 0
    [infrun] operator(): step-over queue now empty
  [infrun] start_step_over: exit
  [infrun] context_switch: Switching context from 0.0.0 to 601872.601872.0
  [infrun] handle_signal_stop: stop_pc=0x40112e
  [infrun] process_event_stop_test: stepping inside range [0x40112e-0x40112f]
  [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=0, current thread [601872.601872.0] at 0x40112f
../../binutils-gdb/gdb/infrun.c:2973: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x5a6806 gdb_internal_backtrace_1
	../../binutils-gdb/gdb/bt-utils.c:122
0x5a68a9 _Z22gdb_internal_backtracev
	../../binutils-gdb/gdb/bt-utils.c:168
0xd19ad9 internal_vproblem
	../../binutils-gdb/gdb/utils.c:396
0xd19e94 _Z15internal_verrorPKciS0_P13__va_list_tag
	../../binutils-gdb/gdb/utils.c:476
0x14e0e5b _Z18internal_error_locPKciS0_z
	../../binutils-gdb/gdbsupport/errors.cc:58
0x882a43 resume_1
	../../binutils-gdb/gdb/infrun.c:2973
0x882b78 resume
	../../binutils-gdb/gdb/infrun.c:2989
0x8930e3 keep_going_pass_signal
	../../binutils-gdb/gdb/infrun.c:8992
0x89324b keep_going
	../../binutils-gdb/gdb/infrun.c:9011
0x88eae0 process_event_stop_test
	../../binutils-gdb/gdb/infrun.c:7620
0x88dd7f handle_signal_stop
	../../binutils-gdb/gdb/infrun.c:7295
0x88beb0 handle_inferior_event
	../../binutils-gdb/gdb/infrun.c:6514
0x887137 _Z20fetch_inferior_eventv
	../../binutils-gdb/gdb/infrun.c:4663
0x8635c8 _Z22inferior_event_handler19inferior_event_type
	../../binutils-gdb/gdb/inf-loop.c:42
0x896055 infrun_async_inferior_event_handler
	../../binutils-gdb/gdb/infrun.c:10282
0x531cf0 _Z26check_async_event_handlersv
	../../binutils-gdb/gdb/async-event.c:338
0x14e11b3 _Z16gdb_do_one_eventi
	../../binutils-gdb/gdbsupport/event-loop.cc:221
0x92457c start_event_loop
	../../binutils-gdb/gdb/main.c:408
0x9246dc captured_command_loop
	../../binutils-gdb/gdb/main.c:472
0x925faa captured_main
	../../binutils-gdb/gdb/main.c:1343
0x926044 _Z8gdb_mainP18captured_main_args
	../../binutils-gdb/gdb/main.c:1362
0x417bc1 main
	../../binutils-gdb/gdb/gdb.c:39
---------------------
  [infrun] infrun_async: enable=0
../../binutils-gdb/gdb/infrun.c:2973: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
Comment 4 Tom de Vries 2023-12-29 05:57:26 UTC
(In reply to Tom de Vries from comment #2)
> I couldn't reproduce this on ubuntu 22.04.3 with gcc 11.4.0.

Still can't reproduce this, on openSUSE Leap 15.4 with gcc 7, 8, 9, 10 and 11, and on openSUSE Tumbleweed with gcc 13.2.1.
Comment 5 Tom de Vries 2023-12-29 06:25:49 UTC
(In reply to Tom Tromey from comment #3)
> I can reproduce on Fedora 38 with the system gcc.

Initially I also was not able to reproduce in a Fedora 38 container.

I did:
...
$ gdb -q -batch a.out -ex run -ex step
...
because of "When I run" in $description, but it turns out I should have used start instead.

Using this:
...
$ gdb -q -batch a.out -ex start -ex step
...
I managed to reproduce, also on openSUSE Leap 15.4, not with system gcc 7, but with gcc 8 and later.
Comment 6 Tom Tromey 2023-12-30 17:30:14 UTC
(In reply to Tom de Vries from comment #5)

> I managed to reproduce, also on openSUSE Leap 15.4, not with system gcc 7,
> but with gcc 8 and later.

I wonder what's different for gcc 7.
Perhaps the line table would be informative.
Comment 7 Peter Damianov 2023-12-31 01:46:14 UTC
> I'm not really sure how int3 is supposed to be handled
Perhaps it would be beneficial to state what my preferences would be.
On Windows it is common to use it as a "continuable assertion". The Microsoft compiler has the intrinsic __debugbreak() to generate it, and clang has __builtin_debugtrap(). I expect gdb to break on the line containing the assertion when it is defined in a macro like so:
#define assert(e) do { if (!(e)) __asm__("int3"); } while (0)

After that, "continue" or "step" would behave "as expected".
Maybe there are some cases I haven't considered where this could be problematic, or the issue actually lies with the gcc/clang generated debug info. To be clear, the assertion fails in the same way even if you build the executable using the clang __builtin_debugtrap() intrinsic.

There is a workaround for my case - you can use __builtin_trap, which compiles to "ud2", and then use `j +1` to skip the ud2 when debugging. However, gcc treats code following __builtin_trap() as dead, so it's not always possible to step past the assertion.

Related info that made me think about assert macros more generally:
https://nullprogram.com/blog/2022/06/26/

Some of this is off topic but I figure more info can't hurt.
Comment 8 Tom de Vries 2023-12-31 05:15:59 UTC
(In reply to Tom Tromey from comment #6)
> (In reply to Tom de Vries from comment #5)
> 
> > I managed to reproduce, also on openSUSE Leap 15.4, not with system gcc 7,
> > but with gcc 8 and later.
> 
> I wonder what's different for gcc 7.
> Perhaps the line table would be informative.

The difference is that for gcc 7, start stops at line 4 instead of 5, which I failed to notice.  With an extra step, I also managed to trigger the assertion failure with gcc 7.

FWIW, line table gcc 7:
...
INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END 
0      4      0x0000000000400430 0x0000000000400430 Y                    
1      5      0x0000000000400431 0x0000000000400431 Y                    
2      6      0x0000000000400432 0x0000000000400432 Y                    
3      8      0x0000000000400434 0x0000000000400434 Y                    
4      6      0x000000000040043d 0x000000000040043d Y                    
5      8      0x0000000000400440 0x0000000000400440 Y                    
6      6      0x0000000000400445 0x0000000000400445 Y                    
7      10     0x000000000040044a 0x000000000040044a Y                    
8      END    0x000000000040044e 0x000000000040044e Y                    
...
and line table gcc 8:
...
INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END 
0      4      0x0000000000400430 0x0000000000400430 Y                    
1      5      0x0000000000400430 0x0000000000400430 Y                    
2      4      0x0000000000400430 0x0000000000400430                      
3      5      0x0000000000400431 0x0000000000400431                      
4      6      0x0000000000400432 0x0000000000400432 Y                    
5      6      0x0000000000400432 0x0000000000400432 Y                    
6      6      0x0000000000400432 0x0000000000400432                      
7      8      0x0000000000400434 0x0000000000400434 Y                    
8      6      0x000000000040043d 0x000000000040043d                      
9      8      0x0000000000400440 0x0000000000400440                      
10     6      0x0000000000400445 0x0000000000400445                      
11     10     0x000000000040044a 0x000000000040044a                      
12     10     0x000000000040044d 0x000000000040044d                      
13     END    0x000000000040044e 0x000000000040044e Y                    
...
Comment 9 Tom Tromey 2023-12-31 20:02:27 UTC
(In reply to Peter0x44 from comment #7)
> > I'm not really sure how int3 is supposed to be handled
> Perhaps it would be beneficial to state what my preferences would be.

Yes, thanks.

I dug into the code a bit and, first of all, gdb already does
support this situation.  It's called a "permanent breakpoint"
in the implementation.

There's even a test for this, gdb.arch/i386-bp_permanent.exp.
However, that tests "continue", not stepping.

I think perhaps the issue is that we reach the end of handle_signal_stop
with:

(top-gdb) p random_signal
$30 = 0

and in particular maybe:

  /* If not, perhaps stepping/nexting can.  */
  if (random_signal)
    random_signal = !(ecs->event_thread->stop_signal () == GDB_SIGNAL_TRAP
		      && currently_stepping (ecs->event_thread));

here the call to currently_stepping returns true.

Maybe this should be checking the PC and setting random_signal if it
is out of range?
Or maybe the earlier code that calls gdbarch_program_breakpoint_here_p
should set some sort of local flag, so that this can check it?

I don't really know infrun all that well and so I'm not sure how this
is intended to work.  I wonder if there are other scenarios to be
concerned with.
Comment 10 Tom Tromey 2023-12-31 20:45:03 UTC
This works but I have no idea if it is correct.

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 1d863896c40..c9c16001f82 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -7115,6 +7115,7 @@ handle_signal_stop (struct execution_control_state *ecs)
 
   /* Maybe this was a trap for a software breakpoint that has since
      been removed.  */
+  bool program_bp = false;
   if (random_signal && target_stopped_by_sw_breakpoint ())
     {
       if (gdbarch_program_breakpoint_here_p (gdbarch,
@@ -7123,6 +7124,7 @@ handle_signal_stop (struct execution_control_state *ecs)
 	  struct regcache *regcache;
 	  int decr_pc;
 
+	  program_bp = true;
 	  /* Re-adjust PC to what the program would see if GDB was not
 	     debugging it.  */
 	  regcache = get_thread_regcache (ecs->event_thread);
@@ -7161,7 +7163,8 @@ handle_signal_stop (struct execution_control_state *ecs)
   /* If not, perhaps stepping/nexting can.  */
   if (random_signal)
     random_signal = !(ecs->event_thread->stop_signal () == GDB_SIGNAL_TRAP
-		      && currently_stepping (ecs->event_thread));
+		      && currently_stepping (ecs->event_thread)
+		      && !program_bp);
 
   /* Perhaps the thread hit a single-step breakpoint of _another_
      thread.  Single-step breakpoints are transparent to the
Comment 11 Peter Damianov 2024-01-07 22:03:10 UTC
A current workaround for this issue is to put a nop after the int3.
I guess that in this case rip still corresponds to the same source line after the int3 executes, so the gdb assertion isn't triggered.
Changing the assert macro definition to:

#define assert(e) do { if (!(e)) __asm__("int3; nop"); } while(0)

has exactly the result I am looking for. I suppose this confirms the issue is that rip is "off by one"?
Comment 12 Tom Tromey 2024-01-10 18:43:31 UTC
(In reply to Peter0x44 from comment #11)

> has exactly the result I am looking for. I suppose this confirms the issue
> is that rip is "off by one"?

I think it's more that gdb just doesn't remember that it is stepping
off a permanent breakpoint, and then is confused when the line number
changes.
Comment 13 Tom Tromey 2024-02-08 19:10:32 UTC
FWIW my naive patch regresses an existing test.

FAIL: gdb.arch/amd64-disp-step.exp: add into rax: send_signal=off: continue to test_rip_rax
FAIL: gdb.arch/amd64-disp-step.exp: add into rax: send_signal=off: continue to test_rip_rax_end
FAIL: gdb.arch/amd64-disp-step.exp: add into rax: send_signal=off: verify_regs: rax expected value
FAIL: gdb.arch/amd64-disp-step.exp: continue to test_int3_end