Bug 31115 - [ARM] The minimalistic DWARF DIE for function has wrong address in Thumb mode
Status: ASSIGNED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas
Version: 2.39
Importance: P2 normal
Target Milestone: ---
Assignee: Nick Clifton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-05 23:33 UTC by Thiago Jung Bauermann
Modified: 2023-12-11 17:15 UTC

See Also:
Host:
Target: armv8l-linux-gnueabihf
Build:
Last reconfirmed: 2023-12-06 00:00:00


Attachments
Proposed patch (1.16 KB, patch)
2023-12-06 14:23 UTC, Nick Clifton
ELF files demonstrating the issue (4.43 KB, application/gzip)
2023-12-09 16:58 UTC, Thiago Jung Bauermann

Description Thiago Jung Bauermann 2023-12-05 23:33:51 UTC
Commit 591cc9fbbfd6 ("gas/Dwarf: record functions") introduced
minimalistic DWARF symbols for functions which have their size specified.

Unfortunately, in Arm Thumb mode the minimalistic symbol has the least
significant bit (LSB) set. This is indeed the convention ELF symbols use
to denote a function made of Thumb instructions, but it isn't used for
DWARF symbols, where DW_AT_low_pc should contain the actual function
address.
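
As a minimal sketch of the convention described above (Python, illustrative only; the helper name split_thumb_symbol is hypothetical and not part of binutils or GDB), an Arm ELF function symbol value can be split into the real code address and a Thumb flag. The sample value 0x10519 is the DW_AT_low_pc that readelf reports for main in this bug.

```python
def split_thumb_symbol(st_value: int) -> tuple[int, bool]:
    """Split an Arm ELF function symbol value into (address, is_thumb).

    Bit 0 of the symbol value marks a Thumb function; the code itself
    starts at the value with that bit cleared.
    """
    is_thumb = bool(st_value & 1)
    address = st_value & ~1
    return address, is_thumb


# 0x10519 is the odd value seen in DW_AT_low_pc for main in this report.
address, is_thumb = split_thumb_symbol(0x10519)
print(hex(address), is_thumb)  # 0x10518 True
```

A DW_AT_low_pc following the DWARF convention would carry the first element of that pair, not the raw symbol value.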

This causes a failure in GDB testcase gdb.arch/pr25124.exp:

(gdb) x /i main+8
   0x10521 <main+7>:	vrhadd.u16	d14, d14, d31
(gdb) FAIL: gdb.arch/pr25124.exp: disassemble thumb instruction (1st try)

Whereas when using a gas version not affected by the bug:

(gdb) x /i main+8
   0x10520 <main+8>:	bx	lr
(gdb) PASS: gdb.arch/pr25124.exp: disassemble thumb instruction (1st try)
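
The two disassemblies are consistent with the same four bytes being decoded in two different instruction sets. The sketch below (Python, illustrative) takes the A32 encoding of "bx lr", 0xe12fff1e, splits it into the two little-endian halfwords a Thumb decoder would see, and extracts the operand fields. The field layout is an assumption based on the T1 encoding of VRHADD in the Arm Architecture Reference Manual. That GDB picks Thumb decoding because the requested address is odd is likewise an assumption, not something this report confirms.

```python
import struct

ARM_BX_LR = 0xE12FFF1E  # A32 encoding of "bx lr"

# In little-endian memory a Thumb decoder reads the low halfword first.
hw1, hw2 = struct.unpack("<HH", struct.pack("<I", ARM_BX_LR))

# Assumed T1 layout of VRHADD (Q = 0 selects the d-register form):
#   hw1: 111 U 1111 0 D size(2) Vn(4)
#   hw2: Vd(4) 0001 N Q M 0 Vm(4)
assert (hw1 & 0xEF80) == 0xEF00 and (hw2 & 0x0F10) == 0x0100

unsigned = (hw1 >> 12) & 1
d_bit = (hw1 >> 6) & 1
size = (hw1 >> 4) & 0x3
vn = hw1 & 0xF
vd = (hw2 >> 12) & 0xF
n_bit = (hw2 >> 7) & 1
m_bit = (hw2 >> 5) & 1
vm = hw2 & 0xF

dt = ("u" if unsigned else "s") + str(8 << size)
text = (f"vrhadd.{dt} d{(d_bit << 4) | vd}, "
        f"d{(n_bit << 4) | vn}, d{(m_bit << 4) | vm}")
print(hex(hw1), hex(hw2), text)
# 0xff1e 0xe12f vrhadd.u16 d14, d14, d31
```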
Comment 1 Nick Clifton 2023-12-06 14:23:22 UTC
Created attachment 15240
Proposed patch

Hi Thiago,

  Please could you try out this patch and let me know if it works for you?

Cheers
  Nick
Comment 2 Thiago Jung Bauermann 2023-12-06 17:45:01 UTC
(In reply to Nick Clifton from comment #1)
> Created attachment 15240 [details]
> Proposed patch
> 
> Hi Thiago,
> 
>   Please could you try out this patch and let me know if it works for you ?
> 
> Cheers
>   Nick

Hello Nick,

Thank you for the quick response! I tested the patch, but unfortunately the DIE still has the LSB set in DW_AT_low_pc, and GDB still fails:

 <1><1fe>: Abbrev Number: 2 (DW_TAG_subprogram)
    <1ff>   DW_AT_name        : (strp) (offset: 0x47a): main
    <203>   DW_AT_external    : (flag_present) 1
    <203>   DW_AT_type        : (ref_udata) <0x209>
    <204>   DW_AT_low_pc      : (addr) 0x10517
    <208>   DW_AT_high_pc     : (udata) 12


(gdb) x /i main+8
   0x1051f <main+7>:	b.n	0x10c62
(gdb) FAIL: gdb.arch/pr25124.exp: disassemble thumb instruction (1st try)
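
The "b.n 0x10c62" shown above is what falls out of treating the upper half of the A32 "mov r0, #0" word as a 16-bit Thumb branch. A sketch in Python, with these assumptions: main starts at 0x10518 (inferred from the passing run, where main+8 is 0x10520), GDB clears bit 0 of the requested address before fetching, and the branch uses the T2 encoding (11100 followed by a signed 11-bit halfword offset). The helper name is hypothetical.

```python
import struct

MOV_R0_0 = 0xE3A00000   # A32 encoding of "mov r0, #0"
MAIN = 0x10518          # even start of main, inferred from the passing run
MOV_ADDR = MAIN + 4     # "mov r0, #0" follows two 16-bit Thumb instructions


def decode_thumb_b_n(halfword: int, address: int) -> int:
    """Return the target of a 16-bit Thumb unconditional branch (T2)."""
    assert halfword >> 11 == 0b11100, "not an unconditional b.n"
    imm11 = halfword & 0x7FF
    if imm11 & 0x400:            # sign-extend the 11-bit field
        imm11 -= 0x800
    return address + 4 + (imm11 << 1)


# main+8 evaluated from DW_AT_low_pc = 0x10517 is 0x1051f.
requested = 0x10517 + 8
fetch_addr = requested & ~1          # assumed: bit 0 is cleared before the fetch
assert fetch_addr == MOV_ADDR + 2    # upper halfword of the mov word

low_hw, high_hw = struct.unpack("<HH", struct.pack("<I", MOV_R0_0))
target = decode_thumb_b_n(high_hw, fetch_addr)
print(hex(fetch_addr), hex(high_hw), hex(target))
# 0x1051e 0xe3a0 0x10c62
```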

Without the patch, DW_AT_low_pc had value 0x10519:

$ diff -U 4 main-86b775c51597/readelf-w.out patch-6eab43ba8bd7/readelf-w.out 
--- main-86b775c51597/readelf-w.out	2023-12-04 14:48:06.217429953 -0300
+++ patch-6eab43ba8bd7/readelf-w.out	2023-12-06 14:37:58.742262164 -0300
@@ -316,9 +316,9 @@
  <1><1fe>: Abbrev Number: 2 (DW_TAG_subprogram)
     <1ff>   DW_AT_name        : (strp) (offset: 0x47a): main
     <203>   DW_AT_external    : (flag_present) 1
     <203>   DW_AT_type        : (ref_udata) <0x209>
-    <204>   DW_AT_low_pc      : (addr) 0x10519
+    <204>   DW_AT_low_pc      : (addr) 0x10517
     <208>   DW_AT_high_pc     : (udata) 12
  <1><209>: Abbrev Number: 3 (DW_TAG_unspecified_type)
  <1><20a>: Abbrev Number: 0
   Compilation Unit @ offset 0x20b:
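
Comparing the two DW_AT_low_pc values numerically (a small Python sketch; taking 0x10518 as the intended address is an inference from the passing GDB run, where main+8 is 0x10520):

```python
unpatched = 0x10519
patched = 0x10517
expected = 0x10520 - 8   # main+8 was 0x10520 in the passing run

print(hex(expected))               # 0x10518
print(unpatched - expected)        # 1
print(patched - expected)          # -1
print(unpatched & 1, patched & 1)  # 1 1
```

Both values are odd and sit one byte either side of the even address, so the patch moved the value by 2 instead of clearing bit 0.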
Comment 3 Nick Clifton 2023-12-07 14:43:31 UTC
Hi Thiago,

(In reply to Thiago Jung Bauermann from comment #2)
 
> Thank you for the quick response! I tested the patch, but unfortunately the
> DIE still has the LSB bit set in DW_AT_low_pc, and GDB still fails:

Hmm, I must have missed something.  Could you please upload a small test case that reproduces the failure?

Cheers
  Nick
Comment 4 Thiago Jung Bauermann 2023-12-09 16:58:18 UTC
Created attachment 15249
ELF files demonstrating the issue

The testcase I'm using is GDB's pr25124.S, built with an
arm-linux-gnueabihf toolchain:

$ cat ~/src/binutils-gdb/gdb/testsuite/gdb.arch/pr25124.S
/* Test proper disassembling of ARM thumb instructions when reloading a symbol
   file.

   Copyright 2012-2023 Free Software Foundation, Inc.

   This file is part of GDB.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

	.syntax unified
	.thumb
	.text
	.p2align 2
	.global	main
	.thumb
	.thumb_func
	.type main, %function
main:
	bx	pc
	nop
.code 32
	mov	r0, #0
	bx	lr
	.size	main, .-main
	.section	.note.GNU-stack,"",%progbits
$ gcc -g -o pr25124 ~/src/binutils-gdb/gdb/testsuite/gdb.arch/pr25124.S
$ gdb pr25124
Reading symbols from pr25124...
(gdb) x/i main
   0x103e5 <main>:      bx      pc
(gdb) x/i main+8
   0x103ed <main+7>:    vrhadd.u16      d14, d14, d31
(gdb) quit

Interestingly, as can be seen above, "x/i main" actually works fine.
It's "x/i main+8" that breaks. This is a detail that I just noticed.
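
For reference, the layout of main implied by pr25124.S can be tabulated. This Python sketch assumes the two Thumb instructions assemble to 2 bytes each (the narrow encodings) and the two A32 instructions after ".code 32" to 4 bytes each, which matches the 12-byte DW_AT_high_pc reported in this bug.

```python
# (mnemonic, instruction set, size in bytes) in source order
INSNS = [
    ("bx pc", "thumb", 2),
    ("nop", "thumb", 2),
    ("mov r0, #0", "arm", 4),
    ("bx lr", "arm", 4),
]


def layout(insns):
    """Return a list of (offset, mnemonic, isa) rows plus the total size."""
    offset = 0
    rows = []
    for mnemonic, isa, size in insns:
        rows.append((offset, mnemonic, isa))
        offset += size
    return rows, offset


rows, total = layout(INSNS)
for offset, mnemonic, isa in rows:
    print(f"main+{offset}: {mnemonic} ({isa})")
print("size:", total)
```

The instruction at main+8 is the A32 "bx lr", so decoding that location as Thumb, or starting from an odd base address, yields something else.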

There's another thing I just discovered: I can reproduce GDB's bad
behavior on an ELF executable (produced by GCC from the .S file), but
not on a .o file produced directly by gas:

$ as -g -o pr25124.o ~/src/binutils-gdb/gdb/testsuite/gdb.arch/pr25124.S
$ gdb pr25124.o
Reading symbols from pr25124.o...
(gdb) x/i main
   0x0 <main>:  bx      pc
(gdb) x/i main+8
   0x8 <main+8>:        bx      lr
(gdb) quit

Both the executable and the .o file have the LSB set in main's DW_AT_low_pc:

Object file:

 <1><26>: Abbrev Number: 2 (DW_TAG_subprogram)
    <27>   DW_AT_name        : (strp) (offset: 0x97): main
    <2b>   DW_AT_external    : (flag) 1
    <2c>   DW_AT_low_pc      : (addr) 0x1
    <30>   DW_AT_high_pc     : (addr) 0xd

Executable:

 <1><24>: Abbrev Number: 2 (DW_TAG_subprogram)
    <25>   DW_AT_name        : (strp) (offset: 0x97): main
    <29>   DW_AT_external    : (flag_present) 1
    <29>   DW_AT_type        : (ref_udata) <0x2f>
    <2a>   DW_AT_low_pc      : (addr) 0x103e5
    <2e>   DW_AT_high_pc     : (udata) 12
 <1><2f>: Abbrev Number: 3 (DW_TAG_unspecified_type)
 <1><30>: Abbrev Number: 0

So for some reason GDB is fine with an object file containing the wrong
DW_AT_low_pc in main's DIE, but not with a fully linked executable
containing the same error.
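
Note that the two dumps encode DW_AT_high_pc differently: the object file uses an address form, the executable a constant (udata) form. In DWARF, an address-class DW_AT_high_pc is the end address itself, while a constant-class one is an offset from DW_AT_low_pc. A Python sketch (the function name is hypothetical) showing that both describe a 12-byte function with an odd start address:

```python
def function_range(low_pc: int, high_pc: int, high_pc_is_offset: bool):
    """Return (start, end) of a DW_TAG_subprogram, with end exclusive."""
    end = low_pc + high_pc if high_pc_is_offset else high_pc
    return low_pc, end


obj = function_range(0x1, 0xD, high_pc_is_offset=False)    # (addr) form
exe = function_range(0x103E5, 12, high_pc_is_offset=True)  # (udata) form

for name, (start, end) in (("object", obj), ("executable", exe)):
    parity = "odd start" if start & 1 else "even start"
    print(name, hex(start), hex(end), end - start, parity)
```

So the difference in GDB's behaviour between the two files does not come from one of them having a correct range: both start one byte too high.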

I'm attaching the files I generated with a toolchain that uses binutils
from the commit immediately before the one that introduced this
behavior, and with a toolchain that uses a recent commit from trunk.

Any toolchain from a distro that uses binutils >= 2.39 should be enough
to demonstrate the problem. I was able to check with Ubuntu 23.04's
gcc-arm-linux-gnueabihf for example, which uses binutils 2.40.
Comment 5 Nick Clifton 2023-12-11 17:15:28 UTC
(In reply to Thiago Jung Bauermann from comment #4)
Hi Thiago,

> $ gcc -g -o pr25124 ~/src/binutils-gdb/gdb/testsuite/gdb.arch/pr25124.S
> $ gdb pr25124
> Reading symbols from pr25124...
> (gdb) x/i main
>    0x103e5 <main>:      bx      pc
> (gdb) x/i main+8
>    0x103ed <main+7>:    vrhadd.u16      d14, d14, d31
> (gdb) quit
> 
> Interestingly, as can be seen above "x/i main" actually works fine.

Except that the address is displayed with the bottom bit set, which
might be confusing the readers.  Maybe.

> it's "x/i main+8" that breaks. This is a detail that I just noticed.
> 
> There's another thing I just discovered : I can reproduce GDB's bad
> behavior on an ELF executable (produced by GCC from the .S file), but
> not on a .o file produced directly by gas:

Or on one that is produced by using gcc to invoke just gas, i.e. compiling with "-c".
 
> $ as -g -o pr25124.o ~/src/binutils-gdb/gdb/testsuite/gdb.arch/pr25124.S
> $ gdb pr25124.o
> Reading symbols from pr25124.o...
> (gdb) x/i main
>    0x0 <main>:  bx      pc
> (gdb) x/i main+8
>    0x8 <main+8>:        bx      lr
> (gdb) quit

Interestingly my patch makes things even worse for the fully linked executable:

  $ gdb pr25124.with-nicks-patch 
  GNU gdb (GDB) Fedora Linux 13.2-6.fc38
  [...]
  (gdb) x/i main
     0x8243 <_start+158>:	movs	r0, r0
  (gdb) x/i main+8
     0x824b <main+7>:	b.n	0x898e

So please consider it withdrawn.


> Both the executable and the .o file have the LSB bit set in main's
> DW_AT_low_pc:

> So for some reason GDB is fine with an object file containing the wrong
> DW_AT_low_pc in main's DIE, but not when it's with a "full blown"
> executable.

My guess - totally unproven - is that GDB has special code to mask thumb
addresses for object files, but for some reason this code is not applied
to linked executables.

I do wonder if this is something that GDB has already partially fixed, and
maybe this fix needs to be extended...

Anyway I am investigating gas some more...