Bug 27491 - ld: relocation R_X86_64_PC32 against undefined protected symbol `__start_xx' can not be used when making a shared object
Summary: ld: relocation R_X86_64_PC32 against undefined protected symbol `__start_xx' ...
Status: RESOLVED NOTABUG
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 27495
Blocks:
  Show dependency treegraph
 
Reported: 2021-03-01 08:05 UTC by Fangrui Song
Modified: 2021-03-02 19:27 UTC (History)
1 user (show)

See Also:
Host:
Target: x86_64-*
Build:
Last reconfirmed: 2021-03-02 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Fangrui Song 2021-03-01 08:05:06 UTC
% cat x.s
.weak __start_xx
.weak __stop_xx

.global _start
_start:
  movq __start_xx@gotpcrel(%rip), %rdi
  movq __stop_xx@gotpcrel(%rip), %rsi

.section xx,"a",unique,0
.byte 0

.section xx,"a",unique,1
.byte 1

% as x.s -o x.o

# With an ld newer than 2021-03-01
% ./ld-new x.o --gc-sections --print-gc-sections -z start-stop-gc -shared                    
./ld-new: removing unused section 'xx' in file 'x.o'
./ld-new: removing unused section 'xx' in file 'x.o'
./ld-new: x.o: relocation R_X86_64_PC32 against undefined protected symbol `__start_xx' can not be used when making a shared object
./ld-new: final link failed: bad value


I suspect GOTPCRELX optimization is performed while it shouldn't.

aarch64 ld works without a diagnostic.
Comment 1 Alan Modra 2021-03-02 08:05:30 UTC
powerpc64le on a similar testcase with
_start:
  pld 3,__start_xx@got@pcrel
  pld 4,__stop_xx@got@pcrel
does a similar optimisation without complaint, generating a binary that performs two pla instructions setting both registers to zero pc-relative (which isn't zero when the shared library is loaded somewhere other than address zero).  Technically that is wrong since the value of an undefined weak symbol is zero, and so the got entries should be left in order to load an absolute zero.  But it might be reasonable in this case since you expect __start_xx and __stop_xx to have an address somewhere in the executable.  However, I think I'll fix this.

In contrast
 .weak xxx
 .global _start
_start:
  pld 3,xxx@got@pcrel
leaves the pld alone when xxx is undefined, loading the reg with zero from the got entry.
Comment 2 H.J. Lu 2021-03-02 13:02:48 UTC
-z start-stop-gc is a bad idea.  LLVM should use __gc_start/__gc_stop.
Comment 3 Fangrui Song 2021-03-02 18:36:31 UTC
(In reply to H.J. Lu from comment #2)
> -z start-stop-gc is a bad idea.  LLVM should use __gc_start/__gc_stop.

I think the history is:

* In 2006-10, [BZ3400](https://sourceware.org/bugzilla/show_bug.cgi?id=3400) reported that stdio flushing did not work with static linking && `--gc-sections`.
* In 2010-01, the problem was re-raised and gold got the rule (<https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f1ec9ded5c740c22735843025e5d3a8ff4c4079e>).
* [PR11133#c13](https://sourceware.org/bugzilla/show_bug.cgi?id=11133#c13) installed a rule for GNU ld, but it did not appear to work or did not work with other translation units.
* In 2015-10, the rule was properly installed ([PR19161](https://sourceware.org/bugzilla/show_bug.cgi?id=19161) [PR19167](https://sourceware.org/bugzilla/show_bug.cgi?id=19167)).

I don't know what happened to the 2010-01 patch. It likely did not let __start_ reference retain magic sections in other translation unit (in metadata section usage __start_/__stop_ references are always in a separate runtime file), so "__start_ retaining all magic sections" was not effective before 2015-10.  Why do we need new start/stop symbols? The orphan __start_/__stop_ usage worked before 2015-10 and will work with -z start-stop-gc.

(I think of the possibility enabling -z start-stop-gc by default because the rule was essentially invalid before 2015-10.)




As is, all clang -fprofile-generate and -fprofile-instr-generate instrumented text sections cannot be GCed. (Fuchsia folks measured that GC can reduce binary size by 28%.). Below is the story:

A function references its `__llvm_prf_cnts` and may reference its `__llvm_prf_data` if value profiling applies.
The `__llvm_prf_data` references the text section, the associated `__llvm_prf_cnts` and the associated `__llvm_prf_vals`.

If `__start_` references retain the `__llvm_prf_data` and the `__llvm_prf_cnts`, the text section will be retained as well.
With GNU ld and gold all instrumented text sections cannot be discarded.

Another example is https://clang.llvm.org/docs/SanitizerCoverage.html#pc-table: "Note: this instrumentation might be incompatible with dead code stripping (-Wl,-gc-sections) for linkers other than LLD, thus resulting in a significant binary size overhead. For more information, see Bug 34636."

I envision new metadata section usage will appear and those sections will need a way for GC.

If people are interested a longer analysis, please check out https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order
Comment 4 H.J. Lu 2021-03-02 19:27:49 UTC
(In reply to Fangrui Song from comment #3)
>
> I don't know what happened to the 2010-01 patch. It likely did not let
> __start_ reference retain magic sections in other translation unit (in
> metadata section usage __start_/__stop_ references are always in a separate
> runtime file), so "__start_ retaining all magic sections" was not effective
> before 2015-10.  Why do we need new start/stop symbols? The orphan
> __start_/__stop_ usage worked before 2015-10 and will work with -z
> start-stop-gc.

You got it wrong.  --gc-section never worked correctly before all these
fixed applied.

> (I think of the possibility enabling -z start-stop-gc by default because the
> rule was essentially invalid before 2015-10.)
>

LLVM requires a different __start_/__stop_ semantic, which isn't wrong
by itself.  But it shouldn't hijack the current one.