Bug 29272 - regression: previously OK rust tests crash with SIGILL/SEGV/ABRT on Debian armhf
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: gdb
Version: 11.2
Importance: P2 normal
Target Milestone: ---
Assignee: Luis Machado
 
Reported: 2022-06-21 09:42 UTC by infinity0
Modified: 2022-11-11 12:48 UTC

Last reconfirmed: 2022-10-21 00:00:00


Attachments
rustc debuginfo test sample #1 (930 bytes, text/rust)
2022-06-21 09:42 UTC, infinity0

Description infinity0 2022-06-21 09:42:13 UTC
Created attachment 14158
rustc debuginfo test sample #1

1. Compile the attached test file with `rustc -g associated-types.rs`.
2. Run `gdb -x dbg.script ./associated-types`, where dbg.script is as follows:

~~~~
set charset UTF-8
show version
add-auto-load-safe-path /home/infinity0/rustc/./src/etc
set print pretty off
directory /home/infinity0/rustc/./src/etc
file /home/infinity0/rustc/build/armv7-unknown-linux-gnueabihf/test/debuginfo/associated-types.gdb/a
set language rust
break 'associated-types.rs':111
break 'associated-types.rs':118
break 'associated-types.rs':122
break 'associated-types.rs':130
break 'associated-types.rs':137
break 'associated-types.rs':140
run
print arg
continue
print inferred
print explicitly
continue
print arg
continue
print arg
continue
print a
print b
continue
print a
print b
continue
quit
~~~~

This works with gdb 10 for every rustc version I was able to test (1.13 through 1.59), but fails with SIGILL under gdb 11.2 on Debian armhf.

Other rustc debuginfo tests fail with other signals (SIGSEGV, SIGABRT, etc.). More specific details are here: https://github.com/rust-lang/rust/issues/96983
Comment 1 infinity0 2022-06-21 09:56:02 UTC
Whoops, I copied a dbg.script with local paths. Just delete those lines and make sure the source file associated-types.rs is in the current directory; the reproduction still works. You also need to set `RUSTC_BOOTSTRAP=` when compiling with `rustc -g`, as the file uses some unstable features only meant for testing the rustc compiler.

~~~~
set charset UTF-8
show version
set print pretty off
set language rust
break 'associated-types.rs':111
break 'associated-types.rs':118
break 'associated-types.rs':122
break 'associated-types.rs':130
break 'associated-types.rs':137
break 'associated-types.rs':140
run
print arg
continue
print inferred
print explicitly
continue
print arg
continue
print arg
continue
print a
print b
continue
print a
print b
continue
quit
~~~~
Comment 2 infinity0 2022-06-23 11:02:08 UTC
> `RUSTC_BOOTSTRAP=`

Whoops, this should be `RUSTC_BOOTSTRAP=1`.

Same issue still exists with gdb 12.1 on Debian armhf.
Comment 3 infinity0 2022-06-23 11:10:39 UTC
Switching off ASLR with `setarch -R` and forcing single-threaded mode with `taskset -c 0` has no effect on the bug.
Comment 4 Luis Machado 2022-10-21 10:54:28 UTC
Sorry for the delayed reply.

I managed to reproduce this on Ubuntu 22.04 with rustc 1.58.1, but I get a SIGSEGV. On Ubuntu 20.04, with rustc 1.57, I get a SIGILL. Both cases use the same GDB, built from top-of-trunk.

This is an issue with displaced stepping in the Arm port of GDB. If you disable it (`set displaced-stepping off`), the test runs fine.

I'll investigate this.
Comment 5 Luis Machado 2022-10-25 13:03:05 UTC
I have a WIP fix. I should hopefully be able to post it to the mailing list soon.
Comment 6 Luis Machado 2022-10-31 10:16:26 UTC
Tentative patch: https://sourceware.org/pipermail/gdb-patches/2022-October/193162.html
Comment 7 Sourceware Commits 2022-11-11 12:47:47 UTC
The master branch has been updated by Luis Machado <luisgpm@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=1e5ccb9c5ff4fd8ade4a8694676f99f4abf2d679

commit 1e5ccb9c5ff4fd8ade4a8694676f99f4abf2d679
Author: Luis Machado <luis.machado@arm.com>
Date:   Tue Oct 25 11:01:32 2022 +0100

    Make sure a copy_insn_closure is available when we have a match in copy_insn_closure_by_addr
    
    PR gdb/29272
    
    Investigating PR29272, it was mentioned a particular test used to work on
    GDB 10, but it started failing with GDB 11 onwards. I tracked it down to
    some displaced stepping improvements on commit
    187b041e2514827b9d86190ed2471c4c7a352874.
    
    In particular, one of the corner cases using copy_insn_closure_by_addr got
    silently broken. It is hard to spot because it doesn't have any good tests
    for it, and the situation is quite specific to the Arm target.
    
    Essentially, the change from the displaced stepping improvements made it so
    we could still invoke copy_insn_closure_by_addr correctly to return the
    pointer to a copy_insn_closure, but it always returned nullptr due to
    the order of the statements in displaced_step_buffer::prepare.
    
    The way it is now, we first write the address of the displaced step buffer
    to PC and then save the copy_insn_closure pointer.
    
    The problem is that writing to PC for the Arm target requires figuring
    out if the new PC is thumb mode or not.
    
    With no copy_insn_closure data, the logic to determine the thumb mode
    during displaced stepping doesn't work, and gives random results that
    are difficult to track (SIGILL, SIGSEGV etc).
    
    Fix this by reordering the PC write in displaced_step_buffer::prepare
    and, for safety, add an assertion to
    displaced_step_buffer::copy_insn_closure_by_addr so GDB stops right
    when it sees this invalid situation. If this gets broken again in the
    future, it will be easier to spot.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29272
    
    Approved-By: Simon Marchi <simon.marchi@efficios.com>
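
The ordering problem described in the commit message above can be illustrated with a toy model. This is only a minimal sketch in Python, not GDB's C++ code: the class `ToyDisplacedStepBuffer`, its method names, and the dictionary standing in for a copy_insn_closure are all hypothetical, invented for illustration. The only thing taken from the commit message is the ordering: writing PC before the closure is saved means the lookup by address finds nothing.

```python
# Toy model only. Names are hypothetical and do not match GDB's real code.
class ToyDisplacedStepBuffer:
    def __init__(self, buffer_addr):
        self.buffer_addr = buffer_addr
        self.closure = None  # stands in for the saved copy_insn_closure

    def closure_by_addr(self, addr):
        # Return the saved closure when addr is the scratch buffer address.
        if addr == self.buffer_addr:
            return self.closure
        return None

    def write_pc(self, new_pc):
        # In this toy, deciding Arm vs. Thumb mode for the new PC
        # requires the closure for that address.
        closure = self.closure_by_addr(new_pc)
        if closure is None:
            return "unknown"  # the real failure showed up as SIGILL, SIGSEGV, etc.
        return closure["mode"]

    def prepare_buggy(self, closure):
        # Ordering described as broken: PC is written first, closure saved second.
        mode = self.write_pc(self.buffer_addr)
        self.closure = closure
        return mode

    def prepare_fixed(self, closure):
        # Reordered: the closure is saved before PC is written.
        self.closure = closure
        return self.write_pc(self.buffer_addr)


closure = {"mode": "thumb"}
print(ToyDisplacedStepBuffer(0x1000).prepare_buggy(closure))  # unknown
print(ToyDisplacedStepBuffer(0x1000).prepare_fixed(closure))  # thumb
```

In the toy, the buggy ordering silently yields "unknown", which mirrors why the real bug was hard to spot. The commit additionally adds an assertion so that a matching address with no closure stops GDB immediately; that part is not modelled here.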
Comment 8 Luis Machado 2022-11-11 12:48:40 UTC
Fixed. Please reopen if you see any issues.