Bug 27495 - -z start_stop_gc isn't compatible with static glibc
Summary: -z start_stop_gc isn't compatible with static glibc
Status: NEW
Alias: None
Product: binutils
Classification: Unclassified
Component: ld
Version: 2.37
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 11133 19161 19167 20022 21562 27491
Reported: 2021-03-01 19:19 UTC by H.J. Lu
Modified: 2021-03-10 14:35 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description H.J. Lu 2021-03-01 19:19:08 UTC
On Linux/x86-64, with

diff --git a/ld/ldmain.c b/ld/ldmain.c
index 7a3c02aeaa6..3b2ebe168c6 100644
--- a/ld/ldmain.c
+++ b/ld/ldmain.c
@@ -357,7 +357,7 @@ main (int argc, char **argv)
 #ifdef DEFAULT_NEW_DTAGS
   link_info.new_dtags = DEFAULT_NEW_DTAGS;
 #endif
-  link_info.start_stop_gc = FALSE;
+  link_info.start_stop_gc = TRUE;
   link_info.start_stop_visibility = STV_PROTECTED;
 
   ldfile_add_arch ("");

I got

FAIL: ld-elf/pr21562a
FAIL: ld-elf/pr21562b
FAIL: ld-elf/pr21562c
FAIL: ld-elf/pr21562d
FAIL: ld-elf/pr21562i
FAIL: ld-elf/pr21562j
FAIL: ld-elf/pr21562k
FAIL: ld-elf/pr21562l
FAIL: ld-elf/pr21562m
FAIL: ld-elf/pr21562n
FAIL: --gc-sections with __start_
FAIL: ld-gc/pr19167
FAIL: pr20022
FAIL: --gc-sections with __start_SECTIONNAME
Comment 1 Alan Modra 2021-03-01 21:59:17 UTC
Every one of these tests has --gc-sections and is testing the behaviour of __start_* or __stop_*.  -z start-stop-gc is designed to drop sections that are only kept due to magically defined __start/__stop symbols, so the fact that these tests now fail with undefined references is exactly what you should expect.  -z start-stop-gc is *not* an option that you can turn on by default.
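For readers unfamiliar with the symbols under discussion, here is a minimal C sketch of the usual pattern. It assumes an ELF target and GCC or Clang attribute syntax; the section name `myset` and all identifiers are hypothetical and are not taken from the failing tests. The linker synthesizes `__start_myset` and `__stop_myset` because `myset` is a valid C identifier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical section name; it must be a valid C identifier for the
   linker to provide __start_/__stop_ symbols for it.  */
#define SET_ENTRY __attribute__((used, section("myset")))

/* Three entries placed in the same named section.  */
static const int entry_a SET_ENTRY = 10;
static const int entry_b SET_ENTRY = 20;
static const int entry_c SET_ENTRY = 30;

/* Defined by the linker at the bounds of the myset output section,
   not by any object file.  */
extern const int __start_myset[];
extern const int __stop_myset[];

/* Number of entries the linker collected into myset.  */
size_t myset_count(void)
{
    assert(__start_myset <= __stop_myset);
    return (size_t)(__stop_myset - __start_myset);
}

/* Walk the table from __start_myset up to __stop_myset.  */
int myset_sum(void)
{
    int sum = 0;
    for (const int *p = __start_myset; p != __stop_myset; ++p)
        sum += *p;
    return sum;
}
```

By default, the reference to `__start_myset` keeps the `myset` input sections alive under --gc-sections. With -z start-stop-gc that reference alone no longer retains them, which is the behaviour the tests above exercise.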
Comment 2 H.J. Lu 2021-03-01 22:47:38 UTC
So this option generates broken outputs by design.  I don't think this is a
good idea.  Instead, LLVM should generate __gc_start/__gc_stop instead of
__start/__stop.
Comment 3 Fangrui Song 2021-03-02 02:08:46 UTC
(In reply to H.J. Lu from comment #2)
> FAIL: ld-elf/pr21562a
> FAIL: ld-elf/pr21562b

Expected. __start_scnfoo references do not retain scnfoo input sections.

> FAIL: ld-elf/pr21562c
> FAIL: ld-elf/pr21562d

Expected. Use KEEP(*(scnfoo)) to mark scnfoo as GC roots.

> FAIL: ld-elf/pr21562i
> FAIL: ld-elf/pr21562j
> FAIL: ld-elf/pr21562k
> FAIL: ld-elf/pr21562l
> FAIL: ld-elf/pr21562m
> FAIL: ld-elf/pr21562n

These reuse the other .s files.

> So this option generates broken outputs by design.  I don't think this is a good idea.

I don't think so. I have performed a large-scale test internally; everything except Swift and systemd works.

> Instead, LLVM should generate __gc_start/__gc_stop instead of __start/__stop.

If __start_/__stop_ were broken, creating new magic symbols might be an option.

__start_/__stop_ work in 99.9% of cases, so I'm not sure new symbols need to be invented.


bug 27492 tracks the glibc static linking issue related to stdio flushing.
Comment 4 H.J. Lu 2021-03-02 02:43:23 UTC
(In reply to Fangrui Song from comment #3)
> 
> __start_/__stop_ work in 99.9% of cases, so I'm not sure new symbols need to be
> invented.

To you, it is a 0.1% case.  To other people, it is 100%.  We shouldn't
do it to other people if there is a choice.
Comment 5 Alan Modra 2021-03-02 22:09:10 UTC
(In reply to H.J. Lu from comment #2)
> So this option generates broken outputs by design.
Not at all.  The option turns off some linker garbage collection magic in cases where you do not want that magic.  Saying this option generates broken output is a little silly.  It's like saying -z norelro is broken because I want relro output!
Comment 6 H.J. Lu 2021-03-02 23:05:08 UTC
(In reply to Alan Modra from comment #5)
> (In reply to H.J. Lu from comment #2)
> > So this option generates broken outputs by design.
> Not at all.  The option turns off some linker garbage collection magic in
> cases where you do not want that magic.  Saying this option generates broken
> output is a little silly.  It's like saying -z norelro is broken because I
> want relro output!

Packages, like libc.a from glibc, depend on such "magic".  How does one
know when -z start_stop_gc is safe to use?
Comment 7 Alan Modra 2021-03-03 00:49:05 UTC
Here is a testcase showing when -z start-stop-gc might be useful.  Imagine that you want to add some per-function data to each function, and collect that data in a table.  You use section groups naturally so that --gc-sections will remove the per-function data along with the function code.  However, you find that --gc-sections is not doing anything, because the reference to __start_xx that you are using to access the table is marking all the xx sections and along with them their code sections.

If you had control of the startup objects, you could add a __start_xx definition in crt1.o and a __stop_xx in crtn.o to prevent the linker magic with start/stop symbols.  Also, if you had control of the linker scripts, then you could place an output xx section with __start_xx = .; and __stop_xx = .; around the input xx sections which would also disable the linker magic.  But it is likely you have control over neither.  Other work-arounds might be possible too.

It is true that --start-stop-gc may break linking with current static glibc, a fact that limits --start-stop-gc usefulness.

BTW, this testcase also shows an interaction between section groups and start/stop symbol gc.  If you move the foo reference before the __start_xx reference then no magic linker marking of xx sections happens.


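 /* Weak references to the linker-defined bounds of the xx section.  */
 .weak __start_xx
 .weak __stop_xx

 .text
 .global _start
_start:
 /* The start/stop references come before the foo reference; see the
    note above about swapping this order.  */
 .dc.a __start_xx, __stop_xx
 .dc.a foo


 /* foo_group: code for foo plus its per-function data in xx.  */
 .section .text,"axG",%progbits,foo_group
foo:
 .dc.a 0

 .section xx,"aG",%progbits,foo_group
 .dc.a 1


 /* bar_group: bar itself is never referenced.  */
 .section .text,"axG",%progbits,bar_group
bar:
 .dc.a 2

 .section xx,"aG",%progbits,bar_group
 .dc.a 3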
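
The weak references in the testcase have a rough C analogue, sketched below under the assumption of an ELF target and GCC or Clang attribute syntax. Section groups are not expressible this way, so only the weak start/stop part is shown. The section name `yy` and the helper names are hypothetical. A weak reference lets the link succeed when no section of that name is present, in which case the symbols resolve to 0 and the code has to check for that.

```c
#include <assert.h>
#include <stddef.h>

/* Two entries in section xx, standing in for the per-function data.  */
static const int xx_a __attribute__((used, section("xx"))) = 1;
static const int xx_b __attribute__((used, section("xx"))) = 3;

/* Weak references: the link does not fail if xx is absent.  */
extern const int __start_xx[] __attribute__((weak));
extern const int __stop_xx[] __attribute__((weak));

/* No input section is named yy (hypothetical), so these stay
   undefined and their addresses are null at run time.  */
extern const int __start_yy[] __attribute__((weak));
extern const int __stop_yy[] __attribute__((weak));

/* Length of a start/stop delimited table, treating null bounds as empty.  */
static size_t table_len(const int *start, const int *stop)
{
    if (start == NULL || stop == NULL)
        return 0;
    assert(start <= stop);
    return (size_t)(stop - start);
}

size_t xx_len(void) { return table_len(__start_xx, __stop_xx); }
size_t yy_len(void) { return table_len(__start_yy, __stop_yy); }
```

Code written this way tolerates an empty or discarded table, which is the case -z start-stop-gc can produce when nothing else references the entries.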
Comment 8 H.J. Lu 2021-03-03 02:27:30 UTC
(In reply to Alan Modra from comment #7)
> 
> It is true that --start-stop-gc may break linking with current static glibc,
> a fact that limits --start-stop-gc usefulness.
> 
> BTW, this testcase also shows an interaction between section groups and
> start/stop symbol gc.  If you move the foo reference before the __start_xx
> reference then no magic linker marking of xx sections happens.
> 

It is not just static glibc.  PR 19161 has another usage.
Comment 9 Fangrui Song 2021-03-03 04:59:14 UTC
(In reply to H.J. Lu from comment #8)
> (In reply to Alan Modra from comment #7)
> > 
> > It is true that --start-stop-gc may break linking with current static glibc,
> > a fact that limits --start-stop-gc usefulness.
> > 
> > BTW, this testcase also shows an interaction between section groups and
> > start/stop symbol gc.  If you move the foo reference before the __start_xx
> > reference then no magic linker marking of xx sections happens.
> > 
> 
> It is not just static glibc.  PR 19161 has another usage.

David Li is a main reviewer of LLVM PGO. He should be happy if GNU ld and gold can now garbage collect PGO sections https://reviews.llvm.org/D97649#inline-915909 (he asked me to add some tests for GNU ld and gold).

PR19161 was filed in an era (when gold had the rule and GNU ld didn't) when they probably needed this for some other experiments, but there are now better approaches to retaining such sections (SHF_GNU_RETAIN). `__start_xx retaining xx input sections` is outdated.
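
As an illustration of the SHF_GNU_RETAIN approach, here is a minimal C sketch. It assumes a compiler that provides the `retain` attribute (GCC 11 or later, Clang 13 or later) on an ELF target; on older compilers the macro falls back to `used` alone, which does not set SHF_GNU_RETAIN. The section name `probes` and all identifiers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Use `retain` when the compiler knows it; it asks for SHF_GNU_RETAIN
   on the section so the linker keeps it under --gc-sections.  */
#if defined(__has_attribute)
#  if __has_attribute(retain)
#    define RETAINED __attribute__((used, retain))
#  endif
#endif
#ifndef RETAINED
#  define RETAINED __attribute__((used)) /* fallback: no SHF_GNU_RETAIN */
#endif

/* Entries that should survive garbage collection on their own,
   without relying on a __start_probes reference to keep them.  */
static const int probe_a RETAINED __attribute__((section("probes"))) = 7;
static const int probe_b RETAINED __attribute__((section("probes"))) = 9;

/* Linker-defined bounds of the probes output section.  */
extern const int __start_probes[];
extern const int __stop_probes[];

size_t probes_count(void)
{
    assert(__start_probes <= __stop_probes);
    return (size_t)(__stop_probes - __start_probes);
}

int probes_sum(void)
{
    int sum = 0;
    for (const int *p = __start_probes; p != __stop_probes; ++p)
        sum += *p;
    return sum;
}
```

With the entries marked this way, retention is stated on the sections themselves instead of being a side effect of the start/stop reference.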
Comment 10 H.J. Lu 2021-03-09 21:06:53 UTC
They fail at random on both i686 and x86-64.
Comment 11 H.J. Lu 2021-03-10 14:35:43 UTC
(In reply to H.J. Lu from comment #10)
> They fail at random on both i686 and x86-64.

Ignore this.