Bug 29259

Summary: ld -r may create reloc sections with unordered relocs again
Product: binutils Reporter: Tomoaki Kawada <kawada>
Component: ldAssignee: Alan Modra <amodra>
Status: RESOLVED FIXED    
Severity: normal CC: akaher, nickc
Priority: P2    
Version: 2.39   
Target Milestone: 2.39   
Host: Target:
Build: Last reconfirmed: 2022-06-18 00:00:00
Attachments: Proposed patch

Description Tomoaki Kawada 2022-06-17 02:42:52 UTC
Created attachment 14149 [details]
Proposed patch

The optimization to the relocation entry sorting algorithm introduced in
commit bca6d0e31 can lead to incorrect (unsorted) outputs in rare
circumstances.

The symptoms include a link warning about `.eh_frame_hdr` not being created
(similar to bug #16345) and a C++ exception causing abort instead of being
caught by an exception handler.

Reproducer:

    #!/bin/sh
    cat > test.ld <<EOF
    SECTIONS {
        .text : {
            KEEP(*(.text.1));
            KEEP(*(.text.4));
            KEEP(*(.text.3));
            KEEP(*(.text.2));
        }
    }
    EOF
    ${AS:-as} -o test.o <<EOF
        .text
    .L1:
        .word 0
        .section ".text.1", "ax", %progbits
    .L2:
        .dc.a .L1
        .section ".text.2", "ax", %progbits
        .dc.a .L1
        .section ".text.3", "ax", %progbits
        .dc.a .L1
        .section ".text.4", "ax", %progbits
        .dc.a .L1
    EOF
    ${LD:-ld} -o test2.o test.o -r -q -T test.ld
    ${READELF:-readelf} -r test2.o
        
The above script outputs reloc entries out-of-order:

    Relocation section '.rela.text' at offset 0x118 contains 4 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000000000  000100000001 R_X86_64_64       0000000000000000 .text + 20
    000000000010  000100000001 R_X86_64_64       0000000000000000 .text + 20
    000000000008  000100000001 R_X86_64_64       0000000000000000 .text + 20
    000000000018  000100000001 R_X86_64_64       0000000000000000 .text + 20

The problem is that the current algorithm incorrectly assembles "runs" from
unsorted entries and inserted them to an already-sorted prefix, breaking the
loop invariants of insertion sort. The attached patch fixes the problem by
adding a check to ensure that only sorted entries are included in the runs.
Comment 1 Sourceware Commits 2022-06-18 10:43:50 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=fba1ac87dcb76e61f270d236f1e7c8aaec80f9c4

commit fba1ac87dcb76e61f270d236f1e7c8aaec80f9c4
Author: Tomoaki Kawada <kawada@kmckk.co.jp>
Date:   Thu Jun 16 09:54:30 2022 +0000

    Fix the sorting algorithm for reloc entries
    
    The optimized insertion sort algorithm in `elf_link_adjust_relocs`
    incorrectly assembled "runs" from unsorted entries and inserted them to an
    already-sorted prefix, breaking the loop invariants of insertion sort.
    This commit updates the run assembly loop to break upon encountering a
    non-monotonic change in the sort key.
    
            PR 29259
    bfd/
            * elflink.c (elf_link_adjust_relocs): Ensure run being inserted
            is sorted.
    ld/
            * testsuite/ld-elf/pr29259.d,
            * testsuite/ld-elf/pr29259.s,
            * testsuite/ld-elf/pr29259.t: New test.
Comment 2 Alan Modra 2022-06-18 10:46:13 UTC
Fixed for 2.39
Comment 3 Ajay Kaher 2023-09-19 09:56:25 UTC
Thanks for this fix.

Initially we observed ~10% degradation to compiling Linux Kernel with gcc-12.x as compare to gcc-10.x. During investigation we found ~5% is because of binutils and with git bisect end up with following commit:

https://github.com/bminor/binutils-gdb/commit/fba1ac87dcb76e61f270d236f1e7c8aaec80f9c4 

Could this problem be fixed (in some other way) without any degradation?

-Ajay
Comment 4 Nick Clifton 2023-09-27 10:15:32 UTC
(In reply to Ajay Kaher from comment #3)
> Initially we observed ~10% degradation to compiling Linux Kernel 

I assume that you are talking about the time-taken-to-compile-the-kernel as opposed to the performance-speed-of-the-compiled-and-linked-kernel ?

> with
> gcc-12.x as compare to gcc-10.x. During investigation we found ~5% is
> because of binutils and with git bisect end up with following commit:
> 
> https://github.com/bminor/binutils-gdb/commit/
> fba1ac87dcb76e61f270d236f1e7c8aaec80f9c4 
> 
> Could this problem be fixed (in some other way) without any degradation?

I think that the short answer is "no" since this is an area of the code where a lot of care has to be taken to get things in the right order.  The answer could be "yes", but I suspect that would involve a complete rewrite of a lot of the linker, something that is bound to cause a lot more problems.

Sorry but I think that in this case you are going to have choose between a slower but better able to handle special cases linker (2.39 or later) and a faster but breaks under certain circumstances linker (2.38 or earlier).
Comment 5 Ajay Kaher 2023-09-28 06:40:02 UTC
Thanks Nick for your response.

Yes your assumption is correct, it's for time-taken-to-compile-the-kernel. Due to this patch complier-servers are taking ~5% more time.

Would like to know what are the rare/certain circumstances/condition where we could have side effect of not taking this patch.

-Ajay
Comment 6 Nick Clifton 2023-09-28 09:56:15 UTC
(In reply to Ajay Kaher from comment #5)
> Would like to know what are the rare/certain circumstances/condition where
> we could have side effect of not taking this patch.

Well the good news is that the if you remove the patch the linker will not silently start producing bad code.  Instead, if the circumstances are triggered you will start getting warning messages from the linker about .eh_frame_hdr not being created.  So you could look out for these and decide if you need to reintroduce the patch.  (Or use the --fatal-warnings command line option to force the linker to stop).

As for the actual circumstances which might trigger the problem - the cause is when you have a linker script that specifies the order of individual input sections that does not match the order of those sections in the input file.  See the description for an example of this.  The input file has the sections .text.1, .text.2, .text.3 and .text.4 in that order whereas the linker script places them into the output section in the order .text.1, .text.4, .text.3, .text.2.

So - if you are not using a linker script to order specific input sections, you should be OK.