Bug 30090 - Using TUI on musl libc with Rust program, finish failed a bounds check
Status: RESOLVED FIXED
Alias: None
Product: gdb
Classification: Unclassified
Component: rust
Version: 12.1
Importance: P2 normal
Target Milestone: 14.1
Assignee: Tom Tromey
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-06 22:29 UTC by Alex Martin
Modified: 2023-02-27 18:20 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2023-02-09 00:00:00


Attachments
Rust Cargo binary crate triggering the bug (tar archive) (5.01 KB, application/x-tar)
2023-02-09 01:32 UTC, Alex Martin

Description Alex Martin 2023-02-06 22:29:16 UTC
gdbtypes.h:1064: internal-error: field: Assertion `idx >= 0 && idx < num_fields ()' failed.

This happened while the finish command was trying to print a Rust function's return value, with the TUI in its default "asm" mode, in GDB 12.1 on musl libc 1.2.3 (both Alpine builds). I have attached a core dump.
Comment 1 Alex Martin 2023-02-06 22:31:42 UTC
Okay, well, I *have* a core dump, but it seems to be too large to attach. I can email it to anyone who wants it.
Comment 2 Tom Tromey 2023-02-07 18:12:29 UTC
Probably the thing to examine is the return type of the function in question.

This is the same symptom as bug #29985 but I wouldn't assume it has the
same cause.

Also I wonder whether the bug can be reproduced without the TUI.

The core dump probably is not useful unless you have a gdb built
with debugging and are willing to examine some stuff for us.
Comment 3 Alex Martin 2023-02-07 22:32:01 UTC
Oh this is an incredibly predictable question and yet I'm not sure anymore which function it was... I *think* it was `gtk::glib::object::ObjectExt::connect()` whose return type is defined as:

    pub struct SignalHandlerId(NonZeroU64);

But it also might have been `gtk::Prelude::BuilderExtManual::object::<GObject>()` whose return type would be `Option<GObject>`.

(Yes, GObject is also involved here, but I don't *think* gdb can see it from where I was.)
Comment 4 Alex Martin 2023-02-07 22:48:48 UTC
Unfortunately it's been a few days since this happened, as I was waiting for an account registration. I don't think the code has changed too much, so I might still be able to reproduce...

Yes, I was able to reproduce, and without the TUI. It's `object::<GObject>()`, so the return type is supposed to be `Option<GObject>`, where `GObject` is a wrapper type produced by a 650-line macro[1].

On the instruction it returns to, I'm able to `print $rax as GObject` and it works perfectly.

[1] https://gtk-rs.org/gtk-rs-core/stable/latest/docs/src/glib/object.rs.html#802-1467
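
One plausible reason `print $rax as GObject` works on a value declared as `Option<GObject>`: if the wrapper boils down to a single non-null pointer, Rust stores `Option<Wrapper>` in one pointer-sized word and uses the null pattern for `None`. The sketch below illustrates only that layout rule. `RawObject` and `Wrapper` are hypothetical stand-ins, and whether the macro-generated `GObject` really has this shape is an assumption, not something confirmed in this report.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical stand-in for the C-side object instance (not the real glib type).
#[allow(dead_code)]
#[repr(C)]
struct RawObject {
    ref_count: u32,
}

// Hypothetical wrapper, loosely modelled on a macro-generated binding type:
// a single non-null pointer, so the all-zero bit pattern is never a valid value.
#[allow(dead_code)]
#[repr(transparent)]
struct Wrapper(NonNull<RawObject>);

/// Returns true when `Option<Wrapper>` occupies exactly one pointer-sized word.
fn option_is_pointer_sized() -> bool {
    size_of::<Option<Wrapper>>() == size_of::<*const RawObject>()
}

fn main() {
    let mut raw = RawObject { ref_count: 1 };
    let some = Some(Wrapper(NonNull::from(&mut raw)));
    let none: Option<Wrapper> = None;

    // With this layout there is no separate discriminant field: `Some` is
    // just the pointer and `None` is the null pattern.
    println!("pointer-sized: {}", option_is_pointer_sized());
    println!("some: {}, none: {}", some.is_some(), none.is_none());
}
```

Under that assumption the whole return value fits in `$rax`, so casting the register to the wrapper type would show the payload directly.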
Comment 5 Tom Tromey 2023-02-08 04:32:26 UTC
Based on a cursory look at the macro, it doesn't seem like
it could be the same as the other bug.
Do you know any way I could reproduce the bug?
Comment 6 Alex Martin 2023-02-08 04:41:14 UTC
I'll try to reduce an example tomorrow.
Comment 7 Tom Tromey 2023-02-08 23:50:43 UTC
(In reply to Alex Martin from comment #6)
> I'll try to reduce an example tomorrow.

Thank you.
It's fine if it is some github thing I can build with cargo.
Comment 8 Alex Martin 2023-02-09 01:32:52 UTC
Created attachment 14661
Rust Cargo binary crate triggering the bug (tar archive)
Comment 9 Alex Martin 2023-02-09 01:39:51 UTC
I've attached a tar archive with a minimal example.

    $ cargo build
    $ gdb target/debug/gdb30090
    (gdb) break main.rs:12
    (gdb) run
    (gdb) ni # Until reaching call ... gtk..builder..BuilderExtManual$GT6object which should be the second call instruction after the break.
    (gdb) si
    (gdb) finish # Should crash

Note that this issue occurs for me, but I use Alpine, which links all its binaries with musl libc. Configuration is included in the .cargo/ directory to work around a musl-specific issue with the Rust GTK bindings (without this workaround the binary segfaults before we can get to the GDB issue).
Comment 10 Alex Martin 2023-02-09 01:40:48 UTC
Oh, the program makes a relative reference to Window.xml, so be in the gdb30090/ directory when you run it.
Comment 11 Alex Martin 2023-02-09 02:03:17 UTC
I got a GDB from Guix and had Guix build the example program while I was at it to get glibc versions of everything, and it reproduced under those conditions, so hopefully it will work on your machine.
Comment 12 Tom Tromey 2023-02-09 17:22:39 UTC
Thank you very much.  I can reproduce the problem here.
It is definitely not the same as #29985.

What's happening here is some confusion about field numbers
in rust_language::print_enum.  It might possibly be
specific to the 'finish' path, which is pretty surprising if true.
Comment 13 Tom Tromey 2023-02-09 17:49:40 UTC
I can solve the crash pretty easily but my test case is printing
the wrong result for the 'finish'.  So I think there's a deeper
problem than the Rust value-printing code.
Comment 14 Tom Tromey 2023-02-09 18:39:31 UTC
Ok, I think I'll just have to xfail the actual result here.
Basically, gdb has to rely on programs using the platform ABI
to extract return values.  However, my test function doesn't
seem to be doing this.

amd64-tdep.c thinks my type should be returned entirely in $rax, but
really it's being returned in a combo of $eax and $edx.

(gdb) disassemble return_some
Dump of assembler code for function _ZN6finish11return_some17hc6e727e1fe11047cE:
   0x000055555555c400 <+0>:	push   %rax
   0x000055555555c401 <+1>:	movl   $0x61,0x4(%rsp)
   0x000055555555c409 <+9>:	movl   $0x1,(%rsp)
   0x000055555555c410 <+16>:	mov    (%rsp),%eax
   0x000055555555c413 <+19>:	mov    0x4(%rsp),%edx
^^^ here you can see eax/edx being set.
   0x000055555555c417 <+23>:	pop    %rcx
   0x000055555555c418 <+24>:	ret

(gdb) info regist
rax            0x1                 1
rbx            0x1                 1
rcx            0x6100000001        416611827713
rdx            0x61                97

The caller does:

   0x000055555555c420 <+0>:	sub    $0x18,%rsp
   0x000055555555c424 <+4>:	call   0x55555555c400 <_ZN6finish11return_some17hc6e727e1fe11047cE>
=> 0x000055555555c429 <+9>:	mov    %eax,0x8(%rsp)
   0x000055555555c42d <+13>:	mov    %edx,0xc(%rsp)

.. i.e., you can see it moving out of edx.

Now, arguably this is one possible reading of the ABI.
amd64-tdep knows that rdx is the second integer register.
But, it seems to me that since the entire structure is 8
bytes, it should probably all be stuffed into rax.
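
To make the register arithmetic above concrete, here is a minimal Rust sketch. The body of `return_some` is an assumption (the real test source is not shown in this report); it is chosen only so that the discriminant is 1 and the payload is 0x61, matching the disassembly. `pack_rax` is a hypothetical helper that computes what a debugger would expect to find in `$rax` if the whole 8-byte value were returned in a single register.

```rust
use std::mem::size_of;

// Assumed shape of the test function: an 8-byte Option whose discriminant
// is 1 (Some) and whose payload is 0x61, as in the disassembly.
fn return_some() -> Option<u32> {
    Some(0x61)
}

/// Hypothetical helper: the 64-bit value expected in $rax if the 8-byte
/// aggregate were returned in one register (little-endian: low half first).
fn pack_rax(eax: u32, edx: u32) -> u64 {
    (u64::from(edx) << 32) | u64::from(eax)
}

fn main() {
    // 4-byte discriminant plus 4-byte payload.
    assert_eq!(size_of::<Option<u32>>(), 8);
    assert_eq!(return_some(), Some(0x61));

    // eax = 0x1 and edx = 0x61, taken from the register dump above.
    let packed = pack_rax(0x1, 0x61);
    println!("{:#x}", packed); // 0x6100000001
}
```

The packed value is the same 0x6100000001 that the register dump shows in `$rcx` after the `pop %rcx`, that is, the in-memory image of the return value. `$rax` itself holds only 0x1, so reading the full 8 bytes from `$rax` loses the payload.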
Comment 15 Tom Tromey 2023-02-09 18:58:36 UTC
I remembered there's a bug where this is being discussed:
https://github.com/rust-lang/rust/issues/85641
Comment 17 Sourceware Commits 2023-02-27 18:19:44 UTC
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=debd0556e519c3d258299cf5f14a44cc01c795da

commit debd0556e519c3d258299cf5f14a44cc01c795da
Author: Tom Tromey <tom@tromey.com>
Date:   Thu Feb 9 12:12:42 2023 -0700

    Fix crash with "finish" in Rust
    
    PR rust/30090 points out that a certain "finish" in a Rust program
    will cause gdb to crash.  This happens due to some confusion about
    field indices in rust_language::print_enum.  The fix is to use
    value_primitive_field so that the correct type can be passed; other
    spots in rust-lang.c already do this.
    
    Note that the enclosed test case comes with an xfail.  This is needed
    because for this function, rustc doesn't follow the platform ABI.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30090
Comment 18 Tom Tromey 2023-02-27 18:20:31 UTC
Fixed.