Bug 23244 - RISC-V 64 relocation truncated to fit in case of undefined weak references
Summary: RISC-V 64 relocation truncated to fit in case of undefined weak references
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Jim Wilson
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-29 05:59 UTC by Sebastian Huber
Modified: 2018-06-03 22:45 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed: 2018-05-30 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Sebastian Huber 2018-05-29 05:59:52 UTC
The following test program

void f(void) __attribute__((__weak__));

void _start(void)
{
        if (f != 0) {
                f();
                f();
        }
}

leads to

riscv64-elf-gcc weakref.c -T weakref.ld -mcmodel=medany
ld: weakref.o: in function `.L0 ':
weakref.c:(.text+0x12): relocation truncated to fit: R_RISCV_CALL against undefined symbol `f'
collect2: error: ld returned 1 exit status

using this linker script

ENTRY(_start)
SECTIONS {
        .text 0x90000000 : {
                *(.text*)
        }
}
Comment 1 Jim Wilson 2018-05-30 00:25:47 UTC
I see the problem.  This is working for 32-bit targets, and failing for 64-bit targets.  For a 32-bit target, the offset ends up as 0x70000000 which when added to 0x90000000 via auipc gives zero.  But for a 64-bit target, the addition gives 0x100000000 which is not the right address.  We can only address +/- 0x80000000 from the current PC, which means address zero is not in range.

The other part of this problem is that relaxation is deliberately disabled for weak functions, as the address might change later when another object file or library is added in.

I think the only easy workaround at the moment would be to use riscv32 instead of riscv64.

I think we would need a new code model to make this work for riscv64.

Or alternatively compile with -fpie, which then indirects via the GOT which will be in range, but requires dynamic linker support.

There might be a more complex way to fix this by changing how the R_RISCV_CALL relocation works, so that it writes both instruction opcodes and fields, in which case we can handle a zero address directly, or alternatively we can relax it multiple times.  I'm not sure if this can work, and may not have time to try it for a while.
Comment 2 Sebastian Huber 2018-05-30 05:17:22 UTC
Thanks for your analysis.

Calling a weakly undefined function is undefined behaviour. Would it be possible to replace the call to zero with a call to the current PC (infinite loop) or a nop?

On the ARM architecture it looks like a nop is created:

        .type   _start, %function
_start:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r3, lr}
        movw    r3, #:lower16:f
        movt    r3, #:upper16:f
        cbz     r3, .L1
        bl      f
        bl      f
.L1:
        pop     {r3, pc}
        .size   _start, .-_start
        .weak   f

Final executable code (Thumb-2):

Disassembly of section .text:

90000000 <_start>:
90000000:       b508            push    {r3, lr}
90000002:       f240 0300       movw    r3, #0
90000006:       f2c0 0300       movt    r3, #0
9000000a:       b11b            cbz     r3, 90000014 <_start+0x14>
9000000c:       f3af 8000       nop.w
90000010:       f3af 8000       nop.w
90000014:       bd08            pop     {r3, pc}
90000016:       bf00            nop
Comment 3 Jim Wilson 2018-05-30 17:16:45 UTC
On Wed, 2018-05-30 at 05:17 +0000, sebastian.huber@embedded-brains.de
wrote:
> Calling a weakly undefined function is undefined behaviour. Would it
> be
> possible to replace the call to zero with a call to the current PC
> (infinite
> loop) or a nop?

Yes, this is what I alluded to at the end, changing the behavior of the
R_RISCV_CALL relocation.  We have a register hard wired to zero, so we
don't need to replace the instruction with a nop, we can just change it
to use an x0 + 0 address.  That will make it easier to understand what
is going on when looking at disassembly.

Thanks for the pointer to arm behavior.  I will look at the arm port
and see if I can do something similar in the RISC-V port.  There is a
gcc unwind bug I need to try to fix first, and then I will get back to
this.
Comment 4 Palmer Dabbelt 2018-05-31 09:22:35 UTC
I fixed up a similar problem for PCREL_* relocations in this commit

    https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=b1308d2c3749cc454f00b70768ee33724d919527

A similar technique might be possible for R_RISCV_CALL, we could convert it to an internal R_RISCV_CALL_ABS relocation and then generate a lui-based sequence.
Comment 5 Jim Wilson 2018-05-31 15:36:04 UTC
On Thu, 2018-05-31 at 09:22 +0000, palmer at gcc dot gnu.org wrote:
> A similar technique might be possible for R_RISCV_CALL, we could
> convert it to
> an internal R_RISCV_CALL_ABS relocation and then generate a lui-based 
> sequence.

For jalr we can just use x0 for zero-based addressing.  I have an
untested prototype patch that modifies the jalr instruction to use x0
and sets the relocation value to the auipc address, so the offset comes
out as zero.  I don't touch the auipc.  This is just 13 lines of code. 

An R_RISCV_CALL_ABS relocation might be cleaner, but may end up as a
larger patch, and we don't need the generality as we only need support
for a single value, 0.  Zero-page addressing is already handled by
relaxation, it is only weak undef 0 that we need to handle here.
Comment 6 Sourceware Commits 2018-06-03 22:43:58 UTC
The master branch has been updated by Jim Wilson <wilson@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=cf7a5066b92552b62ca4d247c241a19d1a6e599d

commit cf7a5066b92552b62ca4d247c241a19d1a6e599d
Author: Jim Wilson <jimw@sifive.com>
Date:   Sun Jun 3 15:42:29 2018 -0700

    RISC-V: Handle out-of-range calls to undefined weak.
    
    	bfd/
    	PR ld/23244
    	* elfnn-riscv.c (riscv_elf_relocate_section) <R_RISCV_CALL>: Check
    	for and handle an undefined weak with no PLT.
    
    	ld/
    	* testsuite/ld-riscv-elf/ld-riscv-elf.exp: Run new weak ref tests.
    	* testsuite/ld-riscv-elf/weakref.ld: New.
    	* testsuite/ld-riscv-elf/weakref32.d: New.
    	* testsuite/ld-riscv-elf/weakref32.s: New.
    	* testsuite/ld-riscv-elf/weakref64.d: New.
    	* testsuite/ld-riscv-elf/weakref64.s: New.
Comment 7 Jim Wilson 2018-06-03 22:45:33 UTC
Fixed on mainline.