Bug 23648

Summary: Symbols based on MEMORY regions confuse --gc-sections.
Product: binutils Reporter: Nick Bowler <nbowler>
Component: ldAssignee: Alan Modra <amodra>
Severity: normal    
Priority: P2    
Version: 2.32   
Target Milestone: 2.32   
Host: Target:
Build: Last reconfirmed: 2018-09-19 00:00:00
Attachments: Possible patch

Description Nick Bowler 2018-09-13 14:10:49 UTC
It seems that ld's --gc-sections feature can get confused by the
MEMORY command.

It is possible to define symbols that depend on the memory regions in a
linker script.  The --gc-sections function appears to discard sections
as if the MEMORY command was omitted from the linker script.

If removing the MEMORY command affects which sections get referenced,
this means --gc-sections will discard sections that are actually needed
by the link.  The results are obviously not good.

For example, consider the following linker script:
  % cat >test.ld <<'EOF'
  MEMORY { code : ORIGIN = 0, LENGTH = 8M }
     .text 1M : { *(.text*) } >code
  foo = LENGTH(code) > 32M ? test0 : test1;

And a C source file that uses the linker-defined "foo" like this:

  % cat >test.c <<'EOF'
  void test0(void) { }
  void test1(void) { }
  void entry(void)
     extern void foo(void);

Compiled like this:

  % gcc -ffunction-sections -c test.c

The intention of the example is to call one or the other function based
on the link-time MEMORY configuration.  This works fine normally:

  % ld -Ttest.ld test.o
  % readelf -Ws a.out
  6: 0000000000100007     7 FUNC    GLOBAL DEFAULT    1 test1
  7: 0000000000100007     0 FUNC    GLOBAL DEFAULT    1 foo
  8: 0000000000100000     7 FUNC    GLOBAL DEFAULT    1 test0

So foo = test1, as expected since LENGTH(code) is less than 32M.

However, if we add --gc-sections, things go awry:

  % ld -Ttest.ld --gc-sections --print-gc-sections test.o
  ld: Removing unused section '.text.test1' in file 'test.o'


  % readelf -Ws a.out
  5: 0000000000000000     0 FUNC    GLOBAL DEFAULT  ABS foo
  6: 0000000000100000     7 FUNC    GLOBAL DEFAULT    1 test0

So test1 is discarded, the unused test0 is kept, and foo is set to
garbage.  The resulting code is correspondingly broken.

Testing suggests that --gc-sections evaluates "foo" as if LENGTH(code)
returned -1, regardless of what is specified in the MEMORY command.
This is equivalent to what you'd get if the MEMORY command was omitted.

This appears to be independent of target, as I tried several targets
(and versions) with identical effect.
Comment 1 Nick Bowler 2018-09-13 17:59:52 UTC
Created attachment 11243 [details]
Possible patch

Simply moving the call to lang_do_memory_regions earlier in lang_process is sufficient to make the example case work as expected.

Unsure if this is a good idea though.
Comment 2 cvs-commit@gcc.gnu.org 2018-09-19 05:52:24 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:


commit 7f0cfc6eb8bbead5c9a1ff3a8dec11d93304f5ad
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Sep 19 13:22:43 2018 +0930

    PR23648, Symbols based on MEMORY regions confuse --gc-sections
    Running lang_do_memory_regions earlier seems a reasonable solution to
    gaining better values for symbols prior to lang_gc_sections.
    	PR ld/23648
    	* ldlang.c (lang_process): Move lang_do_memory_regions earlier.
    	Comment on lang_do_assignments call.
    	* ldgram.y (origin_exp): Don't assign region->current.
Comment 3 Alan Modra 2018-09-19 06:46:41 UTC
I'll commit a ld testsuite test too once my multi-target test run completes (again).
Comment 4 cvs-commit@gcc.gnu.org 2018-09-19 08:11:41 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:


commit 7e00df65a4e3e281604acab2450812a5fb275743
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Sep 19 16:07:44 2018 +0930

    PR23648 testcase
    	PR 23648
    	* testsuite/ld-elf/pr23648.d,
    	* testsuite/ld-elf/pr23648.s,
    	* testsuite/ld-elf/pr23648.t: New test.