Bug 27584

Summary: nm riscv: Suppress empty name symbols unless --special-syms?
Product: binutils
Reporter: Fangrui Song
Component: binutils
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED
Severity: normal
CC: nelson.chu, nelsonc1225
Priority: P2
Version: unspecified
Target Milestone: ---
Host:
Target: riscv*-*-*
Build:
Last reconfirmed:

Description Fangrui Song 2021-03-15 23:06:04 UTC
On ARM, mapping symbols are suppressed by default.

% arm-linux-gnueabi-gcc-nm test.o                                      
00000000 T _start
% arm-linux-gnueabi-gcc-nm --special-syms test.o
00000000 t $a.0
00000008 t $d.1
00000000 T _start

RISC-V needs many empty name symbols for -mrelax. Should such symbols be suppressed as well?

% riscv64-linux-gnu-nm test.o
0000000000000002 t 
0000000000000002 t 
0000000000000002 t 
0000000000000004 t 
0000000000000008 t 
0000000000000008 t 
000000000000000a t 
0000000000000012 t 
000000000000001a t 
000000000000001a t 
0000000000000000 N 
0000000000000015 N 
000000000000001c N 
0000000000000023 N 
000000000000002a N 
0000000000000000 N 
000000000000001a t 
0000000000000000 N .Lline_table_start0
0000000000000002 T _start
Comment 1 Nelson Chu 2021-03-16 01:15:35 UTC
Hmm, that's weird.  Could you provide the C or assembly source?  Thanks.
Comment 2 Fangrui Song 2021-03-16 04:07:40 UTC
You can try any C file. Due to label differences, there are always lots of STB_LOCAL STT_NOTYPE symbols. It seems that GCC uses .L0 while clang uses an empty name.

The question is whether such symbols should be treated similar to arm $a/$d/$t and suppressed in normal nm output (can be displayed with --special-syms).
Comment 3 Nelson Chu 2021-03-16 04:45:07 UTC
(In reply to Fangrui Song from comment #2)
> You can try any C file. Due to label differences, there are always lots of
> STB_LOCAL STT_NOTYPE symbols. It seems that GCC uses .L0 while clang uses an
> empty name.

OK, I hadn't considered clang's behavior, so it makes sense to me now.  The .L0 symbols (or empty-name symbols for clang) mark the high-part AUIPC instruction that the low-part instructions refer to, so they will probably still be there even if relaxation is disabled with -mno-relax.
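
To illustrate why a label on the AUIPC is needed: a PC-relative address is split into a high part carried by AUIPC and a sign-extended 12-bit low part carried by the following instruction. The low-part relocation names the label placed at the AUIPC rather than the final target, because the offset is measured from the AUIPC's address. A minimal sketch of that split in C follows; the helper names are hypothetical and no range checking is done.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only (not binutils code).
   offset = target address - address of the AUIPC instruction. */

/* Low 12 bits of the offset, sign-extended to the range [-2048, 2047]. */
static int64_t pcrel_lo12(int64_t offset)
{
    int64_t low = (int64_t)((uint64_t)offset & 0xfff);
    return (low ^ 0x800) - 0x800;
}

/* Value for the AUIPC immediate. Subtracting the sign-extended low part
   first makes the remainder an exact multiple of 4096. */
static int64_t pcrel_hi20(int64_t offset)
{
    return (offset - pcrel_lo12(offset)) / 4096;
}
```

For any offset, `pcrel_hi20(offset) * 4096 + pcrel_lo12(offset)` gives back the offset, which is why both halves must be computed against the same AUIPC address.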

> The question is whether such symbols should be treated similar to arm
> $a/$d/$t and suppressed in normal nm output (can be displayed with
> --special-syms).

Sounds good to me.  ARM defines its mapping symbols in bfd_is_arm_special_symbol_name in bfd/cpu-arm.c, so we can define a new name for those .L0 symbols (or keep the old one) and then handle them the same way ARM does.

Do you have any suggestions for the new symbol name (or should we keep the old one)?  I don't have a strong opinion about this.  I agree that they can be suppressed; it's a good idea.
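
The idea discussed above could be sketched roughly as follows. This is not the BFD hook itself: both function names are hypothetical, and treating every name that starts with ".L" as special is an assumption that covers the .L0 labels mentioned above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical predicate, for illustration only.
   Assumption: a symbol is "special" if its name is empty (clang)
   or starts with ".L" (covers the .L0 labels from GNU as). */
static bool is_special_symbol_name(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return true;
    return strncmp(name, ".L", 2) == 0;
}

/* Hypothetical filter: list a symbol unless it is special and the
   user did not ask for special symbols (as with nm --special-syms). */
static bool should_list(const char *name, bool special_syms)
{
    return special_syms || !is_special_symbol_name(name);
}
```

With `special_syms` set to false, a symbol such as `_start` is still listed while the empty-name and `.L0` entries are skipped; with it set to true, everything is listed.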
Comment 4 Andreas Schwab 2021-03-16 08:41:04 UTC
I think it would generally be useful to add an option to omit local .L symbols from both nm and objdump output, including disassembler output.
Comment 5 Nelson Chu 2021-04-13 08:12:56 UTC
(In reply to Andreas Schwab from comment #4)
> I think it would generally be useful to add an option to omit local .L
> symbols from both nm and objdump output, including disassembler output.

Thanks Andreas, this looks good.
Comment 6 Nelson Chu 2021-04-15 02:44:12 UTC
This should now be fixed in FSF mainline, so I am marking it as Resolved/Fixed.

Copying the example from LLVM, https://reviews.llvm.org/D98669:
$ cat tmp.s
.globl foo
foo:
  nop
  .file 1 "/tmp" "a.s"
  .loc 1 1 0
  nop

.section .debug_line,"",@progbits
$ llvm-mc -filetype=obj -triple=riscv64 tmp.s -o tmp.o

With the old nm:
$ riscv64-unknown-elf-nm tmp.o
0000000000000004 t
0000000000000000 T foo

With the mainline nm:
$ riscv64-unknown-elf-nm tmp.o
0000000000000000 T foo
$ riscv64-unknown-elf-nm --special-sym tmp.o
0000000000000004 t
0000000000000000 T foo
Comment 7 Nelson Chu 2021-04-15 02:48:58 UTC
$ cat tmp.s
foo:
        lla     a0, foo
$ riscv64-unknown-elf-as tmp.s -o tmp-gnu.o
$ riscv64-unknown-elf-nm tmp-gnu.o
0000000000000000 t foo
$ riscv64-unknown-elf-nm --special-syms tmp-gnu.o
0000000000000000 t .L0
0000000000000000 t foo