Bug 29342

Summary: RISC-V 32: disassembly mishandles negative symbols
Product: binutils Reporter: H. Peter Anvin <hpa>
Component: binutils Assignee: Not yet assigned to anyone <unassigned>
Severity: normal CC: nelsonc1225, research_trasio
Priority: P2    
Version: 2.38   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: highsym.s test case
highsym.elf test case (compiled)

Description H. Peter Anvin 2022-07-08 23:00:56 UTC
It is common on embedded systems to put I/O registers in the high half of the address space. On RISC-V it is particularly desirable to place I/O devices that need only a limited number of addresses in the high 2K of the address space, because they can then be addressed using negative offsets from the zero register (zero), avoiding the need for a base pointer.

However, objdump disassembly shows addresses in the upper half of the address space as offsets from the highest-addressed symbol (possibly because they are incorrectly treated as 64-bit numbers before the symbol search?).

As a result, the disassembly is needlessly hard to read:

        iobase = 0xffffff00

        .globl IOREG_FOO
IOREG_FOO       = iobase
        .globl IOREG_BAR
IOREG_BAR       = iobase + 4

        .section ".text","ax"
        .globl _start
_start:
        .if iobase >= 0xfffff800
        sw a0, IOREG_FOO(zero)
        .else
        la t0, IOREG_FOO
        sw a0, (t0)
        .endif
        ret

riscv32-unknown-elf-as -march=rv32i -o highsym.o highsym.s
riscv32-unknown-elf-ld -o highsym.elf highsym.o
riscv32-unknown-elf-objdump -d highsym.elf
highsym.elf:     file format elf32-littleriscv

Disassembly of section .text:

00010074 <_start>:
   10074:       f0a02023                sw      a0,-256(zero) # ffffff00 <IOREG_BAR+0xfffffffc>
   10078:       00008067                ret
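
The arithmetic behind the annotation above can be reproduced outside binutils. The Python sketch below is illustrative only and is not binutils code: the helper names (sign_extend, stype_imm, nearest_symbol, annotate) are hypothetical, the symbol lookup is a simplified stand-in for objdump's, and the assumption that the address and offset are truncated to 32 bits when printed is mine. Under those assumptions, sign-extending the address to 64 bits before the symbol search yields exactly the comment shown above, while wrapping it to 32 bits first resolves to IOREG_FOO.

```python
# Illustrative sketch only; names and lookup logic are assumptions, not binutils code.
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

# Symbol values taken from highsym.s (iobase = 0xffffff00).
SYMBOLS = {"IOREG_FOO": 0xFFFFFF00, "IOREG_BAR": 0xFFFFFF04}


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def stype_imm(insn):
    """Extract the signed 12-bit immediate of an S-type instruction such as sw."""
    raw = (((insn >> 25) & 0x7F) << 5) | ((insn >> 7) & 0x1F)
    return sign_extend(raw, 12)


def nearest_symbol(addr):
    """Return (value, name) of the highest symbol at or below addr.

    Assumes at least one symbol qualifies, which holds for the inputs below.
    """
    return max((v, n) for n, v in SYMBOLS.items() if v <= addr)


def annotate(addr):
    """Format an address the way the disassembly comment does (assumed 32-bit printing)."""
    value, name = nearest_symbol(addr)
    offset = (addr - value) & MASK32
    shown = addr & MASK32
    if offset == 0:
        return f"{shown:08x} <{name}>"
    return f"{shown:08x} <{name}+{offset:#x}>"


imm = stype_imm(0xF0A02023)        # sw a0,-256(zero); base register zero holds 0
as_64bit = annotate(imm & MASK64)  # address sign-extended to 64 bits before the search
as_32bit = annotate(imm & MASK32)  # address wrapped to XLEN=32 before the search

print(imm)
print(as_64bit)
print(as_32bit)
```

With the 64-bit address, every 32-bit symbol compares as lower, so the highest one (IOREG_BAR) is picked and the offset wraps to 0xfffffffc.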
Comment 1 H. Peter Anvin 2022-07-08 23:01:30 UTC
Created attachment 14198
highsym.s test case
Comment 2 H. Peter Anvin 2022-07-08 23:02:06 UTC
Created attachment 14199
highsym.elf test case (compiled)
Comment 3 Tsukasa OI 2022-07-28 17:52:13 UTC
I think I found a cause. Testing...
Comment 4 Tsukasa OI 2022-07-29 13:16:27 UTC
Posted a patchset:

In my environment, with the patchset applied, the disassembler dumps of both your ELF file (highsym.elf) and a self-compiled version appear to be fixed.

$ # binutils configuration (partial): --target=riscv64-unknown-elf --enable-multilib
$ riscv64-unknown-elf-as -march=rv32i -o highsym.o highsym.s
$ riscv64-unknown-elf-ld -m elf32lriscv -o highsym.elf highsym.o
$ riscv64-unknown-elf-objdump -d highsym.elf
Comment 5 Nelson Chu 2022-09-02 07:55:28 UTC
Marked as resolved and fixed by the following commit:

commit 48525554d5222d98953202b9252ff65fdead58a4
Refs: gdb-12-branchpoint-1830-g48525554d52
Author:     Tsukasa OI <research_trasio@irq.a4lg.com>
AuthorDate: Sat Aug 27 00:11:00 2022 +0000
Commit:     Nelson Chu <nelson@rivosinc.com>
CommitDate: Fri Sep 2 12:06:27 2022 +0800

    RISC-V: PR29342, Fix RV32 disassembler address computation

    If either the base register is `zero', `tp' or `gp' and XLEN is 32, an
    incorrectly sign-extended address is produced when printing.  This commit
    fixes this by fitting an address into a 32-bit value on RV32.

    Besides, H. Peter Anvin discovered that we have wrong address computation
    for JALR instruction (the initial bug is back in 2018).  This commit also
    fixes that based on the idea of Palmer Dabbelt.

            * testsuite/gas/riscv/lla32.d: Reflect RV32 address computation fix.
            * testsuite/gas/riscv/dis-addr-overflow.s: New testcase.
            * testsuite/gas/riscv/dis-addr-overflow-32.d: Likewise.
            * testsuite/gas/riscv/dis-addr-overflow-64.d: Likewise.
            * riscv-dis.c (maybe_print_address): Fit address into 32-bit on RV32.
            (print_insn_args): Fix JALR address by adding EXTRACT_ITYPE_IMM.
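
The two address computations named in the commit message can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in riscv-dis.c: fit_address, itype_imm and jalr_target are hypothetical names, itype_imm only mirrors what an I-type immediate extraction does according to the RISC-V instruction format, and clearing the low bit of the JALR target reflects the ISA definition of JALR rather than anything the disassembler is known to do.

```python
# Illustrative sketch only; function names are hypothetical, not binutils APIs.

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def itype_imm(insn):
    """Signed 12-bit immediate of an I-type instruction (bits 31:20), as used by jalr."""
    return sign_extend(insn >> 20, 12)


def fit_address(addr, xlen):
    """Wrap an address to XLEN bits, so that RV32 addresses stay within 32 bits."""
    return addr & ((1 << xlen) - 1)


def jalr_target(base, insn, xlen):
    """ISA-level jalr target: (rs1 + imm) wrapped to XLEN, with the lowest bit cleared."""
    return fit_address(base + itype_imm(insn), xlen) & ~1


# Base register zero (value 0) plus offset -256, as in sw a0,-256(zero).
rv32_addr = fit_address(0 + (-256), 32)
rv64_addr = fit_address(0 + (-256), 64)

# jalr ra,16(t1) encodes as 0x010300e7 and jalr ra,-4(t1) as 0xffc300e7.
# The value 0x10000 for t1 is an assumed example.
forward = jalr_target(0x10000, 0x010300E7, 32)
backward = jalr_target(0x10000, 0xFFC300E7, 32)

print(hex(rv32_addr), hex(rv64_addr), hex(forward), hex(backward))
```

Omitting the immediate from the JALR computation would give 0x10000 for both jumps in this example, which is the kind of error the second half of the commit message describes.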