| Summary: | m68k: Removal of ELF_DURING_STARTUP optimization broke ld.so | ||
|---|---|---|---|
| Product: | glibc | Reporter: | James Le Cuirot <chewi> |
| Component: | dynamic-link | Assignee: | Adhemerval Zanella <adhemerval.zanella> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | dilfridge, fweimer, maskray, toolchain |
| Priority: | P2 | Flags: | fweimer:
security-
|
| Version: | 2.35 | ||
| Target Milestone: | 2.36 | ||
| Host: | m68k-unknown-linux-gnu | Target: | m68k-unknown-linux-gnu |
| Build: | m68k-unknown-linux-gnu | Last reconfirmed: | |
| Project(s) to access: | ssh public key: | ||
| Attachments: |
avoid switch statement when performing relocations
handle less relocation types during ld.so bootstrap |
||
|
Description
James Le Cuirot
2022-04-17 16:55:02 UTC
@ Adhemerval: is that commit safe to (temporarily) revert for all arches? Can you provide a reproduce? I cannot reproduce the failure at master 23102686ec67b856a2d4fd25ddaa1c0b8d175c4f (2022-04-15) or release/2.35/master ba9c42ac0e265bf1e4ec1075fa20e7166fda8bfc (2022-04-15). The output looks like: ``` % m68k-linux-gnu-gcc --version m68k-linux-gnu-gcc (Debian 11.2.0-10) 11.2.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. % mkdir -p out/m68k && cd out/m68k % ../../configure --prefix=/tmp/glibc/m68k && make -j 50 && make -j 50 install && 'cp' -f /usr/m68k-linux-gnu/lib/libgcc_s.so.2 /tmp/glibc/m68k/lib ... % /tmp/glibc/m68k/lib/ld-linux-x86-64.so.2 --version ld.so (GNU libc) development release version 2.35.9000. Copyright (C) 2022 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. % /tmp/glibc/m68k/bin/localedef --version localedef (GNU libc) 2.35.9000 Copyright (C) 2022 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Written by Ulrich Drepper. ``` Sorry, the command should specify --host. I confirm the breakage. ../../configure --prefix=/tmp/glibc/m68k --host=m68k-glibc-linux-gnu CC=m68k-linux-gnu-gcc CXX=m68k-linux-gnu-g++ && make -j 50 && make -j 50 install && 'cp' -f /usr/m68k-linux-gnu/lib/libgcc_s.so.2 /tmp/glibc/m68k/lib /tmp/glibc/m68k/lib/ld.so.1 does crash and reverting f64f4ce069300f33e26b025ebb0233d5ca3957a5 fixes it. Unfortunately I don't have a m68k machine to debug this. % /tmp/glibc/m68k/lib/ld.so.1 /tmp/glibc/m68k/lib/ld.so.1: missing program name Try '/tmp/glibc/m68k/lib/ld.so.1 --help' for more information. I think the issue is pre-existing. `elf: Assume disjointed .rela.dyn and .rela.plt for loader` just tickled the compiler to generate the code which breaks. The m68k ld.so runs the following instructions in order. switch (r_type) # load jump table from GOT to fp. note the GOT entry hasn't been relocated by a R_68K_RELATIVE below 15738: 2c6d 08d4 moveal %a5@(2260),%fp *reloc_addr = l_addr + reloc->r_addend; # process relative relocations 1576a: 2784 c800 movel %d4,%a3@(0000000000000000,%a4:l) switch (r_type) # load a value from the jump table 157c8: 3c36 6800 movew %fp@(0000000000000000,%d6:l),%d6 We are(In reply to Fangrui Song from comment #4) > I think the issue is pre-existing. `elf: Assume disjointed .rela.dyn and > .rela.plt for loader` just tickled the compiler to generate the code which > breaks. > The m68k ld.so runs the following instructions in order. > > switch (r_type) # load jump table from GOT to fp. note the GOT entry > hasn't been relocated by a R_68K_RELATIVE below > 15738: 2c6d 08d4 moveal %a5@(2260),%fp > > *reloc_addr = l_addr + reloc->r_addend; # process relative relocations > 1576a: 2784 c800 movel %d4,%a3@(0000000000000000,%a4:l) > > switch (r_type) # load a value from the jump table > 157c8: 3c36 6800 movew %fp@(0000000000000000,%d6:l),%d6 We are most likely hitting a compiler bug here, unrolling the loops fixes the issue: diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h index 25dd7ca4f2..28cb25bdcd 100644 --- a/elf/dynamic-link.h +++ b/elf/dynamic-link.h @@ -114,13 +114,18 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], } \ } \ \ - for (int ranges_index = 0; ranges_index < 2; ++ranges_index) \ - elf_dynamic_do_##reloc ((map), scope, \ - ranges[ranges_index].start, \ - ranges[ranges_index].size, \ - ranges[ranges_index].nrelative, \ - ranges[ranges_index].lazy, \ - skip_ifunc); \ + elf_dynamic_do_##reloc ((map), scope, \ + ranges[0].start, \ + ranges[0].size, \ + ranges[0].nrelative, \ + ranges[0].lazy, \ + skip_ifunc); \ + elf_dynamic_do_##reloc ((map), scope, \ + ranges[1].start, \ + ranges[1].size, \ + ranges[1].nrelative, \ + ranges[1].lazy, \ + skip_ifunc); \ } while (0) # if ELF_MACHINE_NO_REL || ELF_MACHINE_NO_RELA It does increase the loader size, though. The compiler is perfectly entitled to reorder the insns as above. It cannot know that it is in a special environment where the GOT isn't relocated yet. Created attachment 14069 [details]
avoid switch statement when performing relocations
Please try this patch.
Created attachment 14070 [details]
handle less relocation types during ld.so bootstrap
Please try this instead.
Unrolling this may mitigate the issue likely because the optimization is related to loop-invariant code motion. gcc somehow decides hoisting the GOT address of the jump table is beneficial (for such a non-PI_STATIC_AND_HIDDEN arch). musl ld.so uses a function call as a barrier (__dls2) to avoid such code generation problems. (In reply to Andreas Schwab from comment #8) > Created attachment 14070 [details] > handle less relocation types during ld.so bootstrap > > Please try this instead. Thanks. This looks good to me. `case R_68K_NONE` can also be placed in `#ifdef RTLD_BOOTSTRAP` if it helps reduce the possibility that gcc uses a jump table (which needs a relocated address in a GOT entry). I can confirm that the patch works here. Many thanks for your speedy fix. (In reply to James Le Cuirot from comment #10) > I can confirm that the patch works here. Many thanks for your speedy fix. Thanks for testing. Pushed https://sourceware.org/git/?p=glibc.git;a=commit;h=a8e9b5b8079d18116ca69c9797e77804ecf2ee7e to master and cherry picked it into release/2.35/master This is fixed, but I don't have permission to change the status. Thanks! Marking as fixed then. |