Bug 25215

Summary: systemtap probes appear to break armhf gdb with arm64 kernel
Product: gdb
Component: gdb
Status: WAITING ---
Severity: normal
Priority: P2
Version: 8.2
Target Milestone: ---
Host:
Target:
Build:
Reporter: Michael Hudson-Doyle <michael.hudson>
Assignee: Sergio Durigan Junior <sergiodj>
CC: fche, luis.machado, mark, michael.hudson, sergiodj
Last reconfirmed: 2019-11-22 00:00:00

Description Michael Hudson-Doyle 2019-11-22 01:50:40 UTC
This is in an Ubuntu 19.04 armhf container running on an Ubuntu 19.04 arm64 host (it seems to happen in other host/user version combinations too but I haven't been exhaustive):

ubuntu@juju-b11c42-ubuntu-17:~$ lxc exec disco -- bash
root@disco:~# gdb /bin/true
GNU gdb (Ubuntu 8.2.91.20190405-0ubuntu3) 8.2.91.20190405-git
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /bin/true...
(No debugging symbols found in /bin/true)
(gdb) r
Starting program: /usr/bin/true 

Program received signal SIGSEGV, Segmentation fault.
0xf7fc8ee0 in ?? () from /lib/ld-linux-armhf.so.3

Stripping out the .note.stapsdt notes makes things behave:

root@disco:~# cp /lib/ld-linux-armhf.so.3  .
root@disco:~# objcopy -R .note.stapsdt ld-linux-armhf.so.3 ld-linux-armhf.so.3-nostap
root@disco:~# gdb --args ./ld-linux-armhf.so.3-nostap /bin/true
GNU gdb (Ubuntu 8.2.91.20190405-0ubuntu3) 8.2.91.20190405-git
Reading symbols from ./ld-linux-armhf.so.3-nostap...
(No debugging symbols found in ./ld-linux-armhf.so.3-nostap)
(gdb) r
Starting program: /root/ld-linux-armhf.so.3-nostap /bin/true
[Inferior 1 (process 1408) exited normally]
(gdb)
Comment 1 Sergio Durigan Junior 2019-11-22 18:13:26 UTC
Thanks for the bug report.

This does not ring any bells offhand.  Can you provide more details?  A coredump of GDB would be great.  If this is not possible, then I'd appreciate some more detailed instructions on how to reproduce this.

Thanks!
Comment 2 Frank Ch. Eigler 2019-11-22 18:23:01 UTC
Could be worth dumping the .note.stapsdt section here, along with a disassembly of the affected functions (init_start etc.), to confirm that the addresses are computed correctly by gas.

Could be worth getting a native armhf gdb binary to run against the same one, in case there's an arm64-cross-armhf incompatibility.

Could be worth running gdb with more tracing: (gdb) set debug target 999
Comment 3 Michael Hudson-Doyle 2019-11-22 23:01:23 UTC
(In reply to Sergio Durigan Junior from comment #1)
> Thanks for the bug report.
> 
> This does not ring any bells offhand.

Oh, er, good.

> Can you provide more details?

> A coredump of GDB would be great.

gdb isn't crashing though. Do you mean just gcore -p gdb after the problem has occurred?

> If this is not possible, then I'd
> appreciate some more detailed instructions on how to reproduce this.

What I did was boot an arm64 vm running ubuntu 19.10 (on our internal cloud, I guess it would be interesting to try this on an amazon a1 instance). Then as root:

lxd init
lxc launch ubuntu-daily:disco disco
lxc exec disco -- sh -c 'apt update && apt install -y gdb'
lxc exec disco -- gdb /bin/true
<type r>
Comment 4 Michael Hudson-Doyle 2019-11-22 23:17:40 UTC
(In reply to Frank Ch. Eigler from comment #2)
> Could be worth dumping the .note.stapsdt section here, along with a
> disassembly of the affected functions (init_start etc.), to confirm that the
> addresses are computed correctly by gas.

root@disco:~# readelf -x .note.stapsdt /lib/ld-linux-armhf.so.3

Hex dump of section '.note.stapsdt':
  0x00000000 08000000 27000000 03000000 73746170 ....'.......stap
  0x00000010 73647400 d2280000 cc7e0100 00000000 sdt..(...~......
  0x00000020 72746c64 00696e69 745f7374 61727400 rtld.init_start.
  0x00000030 2d344023 30203440 72340000 08000000 -4@#0 4@r4......
  0x00000040 2a000000 03000000 73746170 73647400 *.......stapsdt.
  0x00000050 842e0000 cc7e0100 00000000 72746c64 .....~......rtld
  0x00000060 00696e69 745f636f 6d706c65 7465002d .init_complete.-
  0x00000070 34402330 20344072 34000000 08000000 4@#0 4@r4.......
  0x00000080 27000000 03000000 73746170 73647400 '.......stapsdt.
  0x00000090 06430000 cc7e0100 00000000 72746c64 .C...~......rtld
  0x000000a0 006d6170 5f666169 6c656400 2d344072 .map_failed.-4@r
  0x000000b0 33203440 72350000 08000000 26000000 3 4@r5......&...
  0x000000c0 03000000 73746170 73647400 cc460000 ....stapsdt..F..
  0x000000d0 cc7e0100 00000000 72746c64 006d6170 .~......rtld.map
  0x000000e0 5f737461 7274002d 34407233 20344072 _start.-4@r3 4@r
  0x000000f0 35000000 08000000 2e000000 03000000 5...............
  0x00000100 73746170 73647400 c0de0000 cc7e0100 stapsdt......~..
  0x00000110 00000000 72746c64 006d6170 5f636f6d ....rtld.map_com
  0x00000120 706c6574 65002d34 40723320 34407236 plete.-4@r3 4@r6
  0x00000130 20344072 34000000 08000000 28000000  4@r4.......(...
  0x00000140 03000000 73746170 73647400 44df0000 ....stapsdt.D...
  0x00000150 cc7e0100 00000000 72746c64 0072656c .~......rtld.rel
  0x00000160 6f635f73 74617274 002d3440 72332034 oc_start.-4@r3 4
  0x00000170 40723200 08000000 30000000 03000000 @r2.....0.......
  0x00000180 73746170 73647400 9ce10000 cc7e0100 stapsdt......~..
  0x00000190 00000000 72746c64 0072656c 6f635f63 ....rtld.reloc_c
  0x000001a0 6f6d706c 65746500 2d344072 33203440 omplete.-4@r3 4@
  0x000001b0 72322034 40723400 08000000 28000000 r2 4@r4.....(...
  0x000001c0 03000000 73746170 73647400 38ea0000 ....stapsdt.8...
  0x000001d0 cc7e0100 00000000 72746c64 00756e6d .~......rtld.unm
  0x000001e0 61705f73 74617274 002d3440 72352034 ap_start.-4@r5 4
  0x000001f0 40723400 08000000 2b000000 03000000 @r4.....+.......
  0x00000200 73746170 73647400 00ec0000 cc7e0100 stapsdt......~..
  0x00000210 00000000 72746c64 00756e6d 61705f63 ....rtld.unmap_c
  0x00000220 6f6d706c 65746500 2d344072 33203440 omplete.-4@r3 4@
  0x00000230 72340000 08000000 29000000 03000000 r4......).......
  0x00000240 73746170 73647400 3c200100 cc7e0100 stapsdt.< ...~..
  0x00000250 00000000 72746c64 00736574 6a6d7000 ....rtld.setjmp.
  0x00000260 34407230 202d3440 72312034 40723134 4@r0 -4@r1 4@r14
  0x00000270 00000000 08000000 29000000 03000000 ........).......
  0x00000280 73746170 73647400 a8200100 cc7e0100 stapsdt.. ...~..
  0x00000290 00000000 72746c64 006c6f6e 676a6d70 ....rtld.longjmp
  0x000002a0 00344072 30202d34 40723120 34407234 .4@r0 -4@r1 4@r4
  0x000002b0 00000000 08000000 31000000 03000000 ........1.......
  0x000002c0 73746170 73647400 da200100 cc7e0100 stapsdt.. ...~..
  0x000002d0 00000000 72746c64 006c6f6e 676a6d70 ....rtld.longjmp
  0x000002e0 5f746172 67657400 34407230 202d3440 _target.4@r0 -4@
  0x000002f0 72312034 40723134 00000000          r1 4@r14....
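
For reference, the raw dump above follows the generic ELF note layout: a 12-byte header (name size, descriptor size, type), the owner string "stapsdt", then three target-sized words (PC, base, semaphore) followed by three NUL-terminated strings (provider, probe name, argument spec). A minimal Python sketch that decodes the first record, assuming a 32-bit little-endian target; the names parse_stapsdt_notes and FIRST_NOTE are made up for illustration and are not part of gdb or elfutils:

```python
import struct

# First record of the dump above (offsets 0x00-0x3b), copied byte for byte.
FIRST_NOTE = bytes.fromhex(
    "08000000 27000000 03000000 73746170 "
    "73647400 d2280000 cc7e0100 00000000 "
    "72746c64 00696e69 745f7374 61727400 "
    "2d344023 30203440 72340000"
)


def align4(n):
    """Round n up to the next multiple of 4 (ELF note padding)."""
    return (n + 3) & ~3


def parse_stapsdt_notes(data, addr_size=4):
    """Yield one dict per stapsdt note in a raw little-endian note section."""
    word = "I" if addr_size == 4 else "Q"
    off = 0
    while off + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from("<III", data, off)
        off += 12
        owner = data[off:off + namesz].rstrip(b"\0").decode()
        off += align4(namesz)
        desc = data[off:off + descsz]
        off += align4(descsz)
        if owner != "stapsdt" or ntype != 3:
            continue  # not an SDT probe note
        pc, base, semaphore = struct.unpack_from("<" + word * 3, desc, 0)
        strings = desc[3 * addr_size:].split(b"\0")
        provider, name, args = (s.decode() for s in strings[:3])
        yield {"pc": pc, "base": base, "semaphore": semaphore,
               "provider": provider, "name": name, "args": args}


for note in parse_stapsdt_notes(FIRST_NOTE):
    print(f"{note['provider']}:{note['name']} pc={note['pc']:#x} "
          f"base={note['base']:#x} args={note['args']!r}")
```

Feeding the whole section (for example, extracted with objcopy) through the same function should list every probe, as long as the 32-bit little-endian assumption holds.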

I'm not sure how to disassemble the functions you want, so I uploaded ld-2.29.so (and its detached symbols) to:

https://people.canonical.com/~mwh/ld-2.29.so
https://people.canonical.com/~mwh/ld-2.29.so-detached

> Could be worth getting a native armhf gdb binary to run against the same
> one, in case there's an arm64-cross-armhf incompatibility.

This is an armhf gdb binary. It's running in a container on a machine running an arm64 kernel though.

> Could be worth running gdb with more tracing (gdb) set debug target 999

https://paste.ubuntu.com/p/PtBKQwDZTg/
Comment 5 Michael Hudson-Doyle 2019-11-22 23:18:52 UTC
(In reply to Michael Hudson-Doyle from comment #3)
> (In reply to Sergio Durigan Junior from comment #1)
> > Thanks for the bug report.
> > 
> > This does not ring any bells offhand.
> 
> Oh, er, good.
> 
> > Can you provide more details?

Forgot to say here: of course, just let me know how. 

> > A coredump of GDB would be great.
> 
> gdb isn't crashing though. Do you mean just gcore -p gdb after the problem
> has occurred?
> 
> > If this is not possible, then I'd
> > appreciate some more detailed instructions on how to reproduce this.
> 
> What I did was boot an arm64 vm running ubuntu 19.10 (on our internal cloud,
> I guess it would be interesting to try this on an amazon a1 instance). Then
> as root:
> 
> lxd init
> lxc launch ubuntu-daily:disco disco

Correction: lxc launch ubuntu-daily:disco/armhf disco

> lxc exec disco -- sh -c 'apt update && apt install -y gdb'
> lxc exec disco -- gdb /bin/true
> <type r>
Comment 6 Mark Wielaard 2019-11-22 23:24:58 UTC
(In reply to Michael Hudson-Doyle from comment #4)
> (In reply to Frank Ch. Eigler from comment #2)
> > Could be worth dumping the .note.stapsdt section here, along with a
> > disassembly of the affected functions (init_start etc.), to confirm that the
> > addresses are computed correctly by gas.
> 
> root@disco:~# readelf -x .note.stapsdt /lib/ld-linux-armhf.so.3

Try eu-readelf -n which will decode them
Comment 7 Michael Hudson-Doyle 2019-11-22 23:29:44 UTC
(In reply to Mark Wielaard from comment #6)
> (In reply to Michael Hudson-Doyle from comment #4)
> > (In reply to Frank Ch. Eigler from comment #2)
> > > Could be worth dumping the .note.stapsdt section here, along with a
> > > disassembly of the affected functions (init_start etc.), to confirm that the
> > > addresses are computed correctly by gas.
> > 
> > root@disco:~# readelf -x .note.stapsdt /lib/ld-linux-armhf.so.3
> 
> Try eu-readelf -n which will decode them

root@disco:~# eu-readelf -n /lib/ld-linux-armhf.so.3

Note section [ 1] '.note.gnu.build-id' of 36 bytes at offset 0x114:
  Owner          Data size  Type
  GNU                   20  GNU_BUILD_ID
    Build ID: 2aadba4a58d4f3052c349d7fa6ac8bd070c75eb4

Note section [21] '.note.stapsdt' of 764 bytes at offset 0x1989c:
  Owner          Data size  Type
  stapsdt               39  Version: 3
    PC: 0x28d2, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: init_start, Args: '-4@#0 4@r4'
  stapsdt               42  Version: 3
    PC: 0x2e84, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: init_complete, Args: '-4@#0 4@r4'
  stapsdt               39  Version: 3
    PC: 0x4306, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: map_failed, Args: '-4@r3 4@r5'
  stapsdt               38  Version: 3
    PC: 0x46cc, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: map_start, Args: '-4@r3 4@r5'
  stapsdt               46  Version: 3
    PC: 0xdec0, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: map_complete, Args: '-4@r3 4@r6 4@r4'
  stapsdt               40  Version: 3
    PC: 0xdf44, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: reloc_start, Args: '-4@r3 4@r2'
  stapsdt               48  Version: 3
    PC: 0xe19c, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: reloc_complete, Args: '-4@r3 4@r2 4@r4'
  stapsdt               40  Version: 3
    PC: 0xea38, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: unmap_start, Args: '-4@r5 4@r4'
  stapsdt               43  Version: 3
    PC: 0xec00, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: unmap_complete, Args: '-4@r3 4@r4'
  stapsdt               41  Version: 3
    PC: 0x1203c, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: setjmp, Args: '4@r0 -4@r1 4@r14'
  stapsdt               41  Version: 3
    PC: 0x120a8, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: longjmp, Args: '4@r0 -4@r1 4@r4'
  stapsdt               49  Version: 3
    PC: 0x120da, Base: 0x17ecc, Semaphore: 0
    Provider: rtld, Name: longjmp_target, Args: '4@r0 -4@r1 4@r14'
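
Each Args string above is a space-separated list of N@OP items: N is the argument size in bytes (a negative value means signed) and OP is the operand in the target's assembler syntax, which in this output is either an ARM register or a #-prefixed immediate. A rough Python sketch of splitting them apart; this is not gdb's actual parser, StapArg and parse_stap_args are hypothetical names, and it only classifies the two operand forms that appear in this report:

```python
import re
from typing import List, NamedTuple


class StapArg(NamedTuple):
    size: int      # argument size in bytes
    signed: bool   # True when the size was written with a leading '-'
    operand: str   # operand text, e.g. "r4" or "#0"
    kind: str      # "immediate", "register", or "other"


def parse_stap_args(args: str) -> List[StapArg]:
    """Split an SDT argument string such as '-4@#0 4@r4' into its items."""
    parsed = []
    for item in args.split():
        width, _, operand = item.partition("@")
        n = int(width)
        if operand.startswith("#"):
            kind = "immediate"
        elif re.fullmatch(r"r\d+", operand):
            kind = "register"
        else:
            kind = "other"  # memory operands etc. are not handled here
        parsed.append(StapArg(abs(n), n < 0, operand, kind))
    return parsed


for arg in parse_stap_args("-4@#0 4@r4"):
    print(arg)
```

For the init_start probe this gives a signed 4-byte immediate 0 followed by an unsigned 4-byte value in r4.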
Comment 8 Michael Hudson-Doyle 2019-11-22 23:32:52 UTC
Also, the process is segfaulting for what looks like a valid reason:

root@disco:~# gdb --args /bin/true
GNU gdb (Ubuntu 8.2.91.20190405-0ubuntu3) 8.2.91.20190405-git
Reading symbols from /bin/true...
(No debugging symbols found in /bin/true)
(gdb) r
Starting program: /usr/bin/true 

Program received signal SIGSEGV, Segmentation fault.
0xf7fc8ee0 in ?? () from /lib/ld-linux-armhf.so.3
(gdb) x/i $pc
=> 0xf7fc8ee0:	ldr.w	r3, [r8]
(gdb) p $r8
$1 = 0

But trying to work out why $r8 is 0 didn't go so well:

(gdb) disassemble 
No function contains program counter for selected frame.
(gdb) disassemble $pc-12,$pc+12
Dump of assembler code from 0xf7fc8ed4 to 0xf7fc8eec:
   0xf7fc8ed4:	ldr.w	r8, [r2, #428]	; 0x1ac
   0xf7fc8ed8:	add.w	r5, r5, #608	; 0x260
   0xf7fc8edc:	mov	r4, r3
   0xf7fc8ede:	mov	r6, r2
=> 0xf7fc8ee0:	ldr.w	r3, [r8]
   0xf7fc8ee4:	cbz	r3, 0xf7fc8eec
   0xf7fc8ee6:	movs	r1, #0
   0xf7fc8ee8:	mov	r0, r5
   0xf7fc8eea:	blx	r3
End of assembler dump.
(gdb) br *0xf7fc8ed4
Breakpoint 1 at 0xf7fc8ed4
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/true 
�&��K{Dh���1������uFF��: Assertion `&��K{Dh���1������uFF��' failed!
[Inferior 1 (process 2440) exited with code 0177]
(gdb)
Comment 9 Luis Machado 2022-10-27 23:42:11 UTC
Is this https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1927192?

If so, it's been fixed in Ubuntu.

Please let us know if you still see it.