Bug 27491 - ld: relocation R_X86_64_PC32 against undefined protected symbol `__start_xx' can not be used when making a shared object
Summary: ld: relocation R_X86_64_PC32 against undefined protected symbol `__start_xx' ...
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.38
: P2 normal
Target Milestone: 2.38
Assignee: H.J. Lu
URL:
Keywords:
Depends on: 27495
Blocks:
  Show dependency treegraph
 
Reported: 2021-03-01 08:05 UTC by Fangrui Song
Modified: 2021-12-02 11:57 UTC (History)
1 user (show)

See Also:
Host:
Target: x86_64-*
Build:
Last reconfirmed: 2021-03-02 00:00:00


Attachments
A patch (369 bytes, patch)
2021-12-01 05:37 UTC, H.J. Lu
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Fangrui Song 2021-03-01 08:05:06 UTC
% cat x.s
.weak __start_xx
.weak __stop_xx

.global _start
_start:
  movq __start_xx@gotpcrel(%rip), %rdi
  movq __stop_xx@gotpcrel(%rip), %rsi

.section xx,"a",unique,0
.byte 0

.section xx,"a",unique,1
.byte 1

% as x.s -o x.o

# With an ld newer than 2021-03-01
% ./ld-new x.o --gc-sections --print-gc-sections -z start-stop-gc -shared                    
./ld-new: removing unused section 'xx' in file 'x.o'
./ld-new: removing unused section 'xx' in file 'x.o'
./ld-new: x.o: relocation R_X86_64_PC32 against undefined protected symbol `__start_xx' can not be used when making a shared object
./ld-new: final link failed: bad value


I suspect GOTPCRELX optimization is performed while it shouldn't.

aarch64 ld works without a diagnostic.
Comment 1 Alan Modra 2021-03-02 08:05:30 UTC
powerpc64le on a similar testcase with
_start:
  pld 3,__start_xx@got@pcrel
  pld 4,__stop_xx@got@pcrel
does a similar optimisation without complaint, generating a binary that performs two pla instructions setting both registers to zero pc-relative (which isn't zero when the shared library is loaded somewhere other than address zero).  Technically that is wrong since the value of an undefined weak symbol is zero, and so the got entries should be left in order to load an absolute zero.  But it might be reasonable in this case since you expect __start_xx and __stop_xx to have an address somewhere in the executable.  However, I think I'll fix this.

In contrast
 .weak xxx
 .global _start
_start:
  pld 3,xxx@got@pcrel
leaves the pld alone when xxx is undefined, loading the reg with zero from the got entry.
Comment 2 H.J. Lu 2021-03-02 13:02:48 UTC
-z start-stop-gc is a bad idea.  LLVM should use __gc_start/__gc_stop.
Comment 3 Fangrui Song 2021-03-02 18:36:31 UTC
(In reply to H.J. Lu from comment #2)
> -z start-stop-gc is a bad idea.  LLVM should use __gc_start/__gc_stop.

I think the history is:

* In 2006-10, [BZ3400](https://sourceware.org/bugzilla/show_bug.cgi?id=3400) reported that stdio flushing did not work with static linking && `--gc-sections`.
* In 2010-01, the problem was re-raised and gold got the rule (<https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f1ec9ded5c740c22735843025e5d3a8ff4c4079e>).
* [PR11133#c13](https://sourceware.org/bugzilla/show_bug.cgi?id=11133#c13) installed a rule for GNU ld, but it did not appear to work or did not work with other translation units.
* In 2015-10, the rule was properly installed ([PR19161](https://sourceware.org/bugzilla/show_bug.cgi?id=19161) [PR19167](https://sourceware.org/bugzilla/show_bug.cgi?id=19167)).

I don't know what happened to the 2010-01 patch. It likely did not let __start_ reference retain magic sections in other translation unit (in metadata section usage __start_/__stop_ references are always in a separate runtime file), so "__start_ retaining all magic sections" was not effective before 2015-10.  Why do we need new start/stop symbols? The orphan __start_/__stop_ usage worked before 2015-10 and will work with -z start-stop-gc.

(I think of the possibility enabling -z start-stop-gc by default because the rule was essentially invalid before 2015-10.)




As is, all clang -fprofile-generate and -fprofile-instr-generate instrumented text sections cannot be GCed. (Fuchsia folks measured that GC can reduce binary size by 28%.). Below is the story:

A function references its `__llvm_prf_cnts` and may reference its `__llvm_prf_data` if value profiling applies.
The `__llvm_prf_data` references the text section, the associated `__llvm_prf_cnts` and the associated `__llvm_prf_vals`.

If `__start_` references retain the `__llvm_prf_data` and the `__llvm_prf_cnts`, the text section will be retained as well.
With GNU ld and gold all instrumented text sections cannot be discarded.

Another example is https://clang.llvm.org/docs/SanitizerCoverage.html#pc-table: "Note: this instrumentation might be incompatible with dead code stripping (-Wl,-gc-sections) for linkers other than LLD, thus resulting in a significant binary size overhead. For more information, see Bug 34636."

I envision new metadata section usage will appear and those sections will need a way for GC.

If people are interested a longer analysis, please check out https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order
Comment 4 H.J. Lu 2021-03-02 19:27:49 UTC
(In reply to Fangrui Song from comment #3)
>
> I don't know what happened to the 2010-01 patch. It likely did not let
> __start_ reference retain magic sections in other translation unit (in
> metadata section usage __start_/__stop_ references are always in a separate
> runtime file), so "__start_ retaining all magic sections" was not effective
> before 2015-10.  Why do we need new start/stop symbols? The orphan
> __start_/__stop_ usage worked before 2015-10 and will work with -z
> start-stop-gc.

You got it wrong.  --gc-section never worked correctly before all these
fixed applied.

> (I think of the possibility enabling -z start-stop-gc by default because the
> rule was essentially invalid before 2015-10.)
>

LLVM requires a different __start_/__stop_ semantic, which isn't wrong
by itself.  But it shouldn't hijack the current one.
Comment 5 Fangrui Song 2021-12-01 01:15:35 UTC
(In reply to H.J. Lu from comment #4)
> (In reply to Fangrui Song from comment #3)
> >
> > I don't know what happened to the 2010-01 patch. It likely did not let
> > __start_ reference retain magic sections in other translation unit (in
> > metadata section usage __start_/__stop_ references are always in a separate
> > runtime file), so "__start_ retaining all magic sections" was not effective
> > before 2015-10.  Why do we need new start/stop symbols? The orphan
> > __start_/__stop_ usage worked before 2015-10 and will work with -z
> > start-stop-gc.
> 
> You got it wrong.  --gc-section never worked correctly before all these
> fixed applied.

> > (I think of the possibility enabling -z start-stop-gc by default because the
> > rule was essentially invalid before 2015-10.)
> >
> 
> LLVM requires a different __start_/__stop_ semantic, which isn't wrong
> by itself.  But it shouldn't hijack the current one.

On Mach-O, ld64 define section$start$__DATA$__data and section$end$__DATA$__data which are similar to the ELF encapsulation symbols __start_/__stop_. ld64's behavior has always been similar to ld.lld -z start-stop-gc.

Wow, really unfortunate that gold broke the nice GC semantics in 2010-01 and GNU ld in 2015-10.
It'd be really nice to fix the issue as this made x86-64 -z start-stop-gc not usable for -pie / -shared.

All of Clang PGO/SanitizerCoverage/source based Coverage documentation have noted that --gc-sections may not work well <--- hope they can stop hinting the users this way one day.
Comment 6 Fangrui Song 2021-12-01 01:19:50 UTC
reopen
Comment 7 H.J. Lu 2021-12-01 05:37:51 UTC
Created attachment 13811 [details]
A patch

Try this.
Comment 8 Fangrui Song 2021-12-01 06:05:53 UTC
(In reply to H.J. Lu from comment #7)
> Created attachment 13811 [details]
> A patch
> 
> Try this.

Thanks! I applied it locally and the original example works.
Adding `.section xx,"aR",unique,2; .byte 2` works, too.
Comment 9 Sourceware Commits 2021-12-02 11:56:35 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=794f2bba0f338d467bbf9c55b5aba415ecd5e138

commit 794f2bba0f338d467bbf9c55b5aba415ecd5e138
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Dec 1 04:55:24 2021 -0800

    x86: Skip __[start|stop]_SECNAME for --gc-sections -z start-stop-gc
    
    Don't convert memory load to immediate load on __start_SECNAME and
    __stop_SECNAME for --gc-sections -z start-stop-gc if all SECNAME
    sections been garbage collected.
    
    bfd/
    
            PR ld/27491
            * elf32-i386.c (elf_i386_convert_load_reloc): Skip __start_SECNAME
            and __stop_SECNAME for --gc-sections -z start-stop-gc if the input
            section been garbage collected.
            * elf64-x86-64.c (elf_x86_64_convert_load_reloc): Likewise.
            * elfxx-x86.h (elf_x86_start_stop_gc_p): New function.
    
    ld/
            PR ld/27491
            * testsuite/ld-i386/i386.exp: Run PR ld/27491 tests.
            * testsuite/ld-x86-64/x86-64.exp: Likewise.
            * testsuite/ld-i386/pr27491-1.s: New file.
            * testsuite/ld-i386/pr27491-1a.d: Likewise.
            * testsuite/ld-i386/pr27491-1b.d: Likewise.
            * testsuite/ld-i386/pr27491-1c.d: Likewise.
            * testsuite/ld-i386/pr27491-2.d: Likewise.
            * testsuite/ld-i386/pr27491-2.s: Likewise.
            * testsuite/ld-i386/pr27491-3.d: Likewise.
            * testsuite/ld-i386/pr27491-3.s: Likewise.
            * testsuite/ld-i386/pr27491-4.d: Likewise.
            * testsuite/ld-i386/pr27491-4a.s: Likewise.
            * testsuite/ld-i386/pr27491-4b.s: Likewise.
            * testsuite/ld-x86-64/pr27491-1.s: Likewise.
            * testsuite/ld-x86-64/pr27491-1a.d: Likewise.
            * testsuite/ld-x86-64/pr27491-1b.d: Likewise.
            * testsuite/ld-x86-64/pr27491-1c.d: Likewise.
            * testsuite/ld-x86-64/pr27491-2.d: Likewise.
            * testsuite/ld-x86-64/pr27491-2.s: Likewise.
            * testsuite/ld-x86-64/pr27491-3.d: Likewise.
            * testsuite/ld-x86-64/pr27491-3.s: Likewise.
            * testsuite/ld-x86-64/pr27491-4.d: Likewise.
            * testsuite/ld-x86-64/pr27491-4a.s: Likewise.
            * testsuite/ld-x86-64/pr27491-4b.s: Likewise.
Comment 10 H.J. Lu 2021-12-02 11:57:22 UTC
Fixed for 2.38.