Bug 19162

Summary: Huge binary after linking sections with "a" and "wa" flags
Product: binutils Reporter: Ilya Verbin <iverbin>
Component: ld Assignee: Alan Modra <amodra>
Status: RESOLVED FIXED    
Severity: normal CC: hjl.tools
Priority: P2    
Version: 2.26   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Ilya Verbin 2015-10-22 15:07:20 UTC
The testcase:

$ cat t1.s
.section ".AAA", "a"
.long 0x12345678
$ cat t2.s
.section ".AAA", "wa"
.long 0x12345678
$ as t1.s -o t1.o
$ as t2.s -o t2.o
$ ld -shared t1.o t2.o
$ ls -lh a.out
2.1M a.out

A strange 2MB offset is inserted before the .dynamic section:

[Nr] Name              Type             Address           Offset
     Size              EntSize          Flags  Link  Info  Align
[ 0]                   NULL             0000000000000000  00000000
     0000000000000000  0000000000000000           0     0     0
[ 1] .hash             HASH             00000000000000b0  000000b0
     0000000000000028  0000000000000004   A       2     0     8
[ 2] .dynsym           DYNSYM           00000000000000d8  000000d8
     0000000000000078  0000000000000018   A       3     2     8
[ 3] .dynstr           STRTAB           0000000000000150  00000150
     0000000000000019  0000000000000000   A       0     0     1
[ 4] .AAA              PROGBITS         0000000000000169  00000169
     0000000000000008  0000000000000000  WA       0     0     1
[ 5] .eh_frame         PROGBITS         0000000000000178  00000178
     0000000000000000  0000000000000000   A       0     0     8
[ 6] .dynamic          DYNAMIC          0000000000200178  00200178  <-- ???
     00000000000000b0  0000000000000010  WA       3     0     8

It seems that something goes wrong during section-to-segment mapping: when both .AAA sections have the "wa" flags, we get a small binary with two LOAD segments:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000001a8 0x00000000000001a8  R      200000
  LOAD           0x00000000000001a8 0x00000000002001a8 0x00000000002001a8
                 0x00000000000000b8 0x00000000000000b8  RW     200000

But when one .AAA section has the "a" flag and the other has the "wa" flags, we get a huge binary with a single large LOAD segment:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000200228 0x0000000000200228  RW     200000
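
The size difference follows from how a LOAD segment ties file offsets to addresses: within one segment, the difference between a byte's file offset and its virtual address is constant. The following is a minimal sketch of that arithmetic, using only numbers from the readelf output above; the helper name `file_offset` is made up for illustration and is not part of any tool.

```python
def file_offset(vaddr, seg_vaddr, seg_offset):
    """File offset of an address that lies inside a LOAD segment.

    Within one segment, (file offset - virtual address) is constant.
    """
    return seg_offset + (vaddr - seg_vaddr)


# Bad link: a single LOAD segment maps file offset 0 to address 0,
# so .dynamic at address 0x200178 has to sit at file offset 0x200178.
bad_dynamic_offset = file_offset(0x200178, seg_vaddr=0x0, seg_offset=0x0)

# Good link: the RW segment starts at address 0x2001a8 but at file
# offset 0x1a8, so the 2MB jump in addresses costs nothing in the file.
good_rw_offset = file_offset(0x2001a8, seg_vaddr=0x2001a8, seg_offset=0x1a8)

print(hex(bad_dynamic_offset))  # 0x200178
print(hex(good_rw_offset))      # 0x1a8
```

In other words, the 2MB jump in addresses is harmless when it falls between two segments, but becomes 2MB of file padding when it falls inside one.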
Comment 1 H.J. Lu 2015-10-22 15:23:08 UTC
(In reply to Ilya Verbin from comment #0)

Since ld defaults to a 2MB maximum page size, this is normal:

[hjl@gnu-6 pr19162]$ make clean
rm -f *.o *.so *.a
[hjl@gnu-6 pr19162]$ make
gcc  -fPIC -O2 -c -o t1.o t1.s
gcc  -fPIC -O2 -c -o t2.o t2.s
./ld -shared  -o x.so t1.o t2.o
ls -lh x.so
-rwxrwxr-x 1 hjl hjl 2.1M Oct 22 08:22 x.so
[hjl@gnu-6 pr19162]$ make clean
rm -f *.o *.so *.a
[hjl@gnu-6 pr19162]$ make LDFLAGS="-z max-page-size=0x1000"
gcc  -fPIC -O2 -c -o t1.o t1.s
gcc  -fPIC -O2 -c -o t2.o t2.s
./ld -shared -z max-page-size=0x1000 -o x.so t1.o t2.o
ls -lh x.so
-rwxrwxr-x 1 hjl hjl 5.6K Oct 22 08:22 x.so
[hjl@gnu-6 pr19162]$
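
How the address jump scales with the page size can be checked by hand. The sketch below assumes that the writable part of the image starts at the next max-page-size boundary plus the current offset within the page. That rule is an assumption: it reproduces the addresses in comment #0 but is not taken from the ld sources, and `align_up` and `next_rw_vaddr` are hypothetical helpers.

```python
def align_up(value, alignment):
    """Round value up to a multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def next_rw_vaddr(end_of_ro, max_page_size):
    """Assumed start address of the writable data after the read-only part."""
    return align_up(end_of_ro, max_page_size) + (end_of_ro & (max_page_size - 1))


# Default 2MB maximum page size, addresses from comment #0.
print(hex(next_rw_vaddr(0x178, 0x200000)))  # 0x200178 (.dynamic, bad link)
print(hex(next_rw_vaddr(0x1a8, 0x200000)))  # 0x2001a8 (RW segment, good link)

# With -z max-page-size=0x1000 the same layout jumps by only 4KB.
print(hex(next_rw_vaddr(0x178, 0x1000)))    # 0x1178
```

Under this assumption, a 4KB page size keeps even a single-segment layout in the kilobyte range, which is in line with the 5.6K file above.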
Comment 2 Alan Modra 2015-10-28 05:11:51 UTC
There is a real problem here.  It is that orphan sections are placed according to the flags of the first input section encountered, not according to the resultant output section flags.

Try "ld -shared t2.o t1.o" to see.
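
The order dependence can be illustrated with a toy model. This is not the ld implementation: `area_for`, `place_by_first` and `place_by_merged` are hypothetical names, and the "read-only"/"read-write" strings stand in for the real placement among output sections.

```python
def area_for(flags):
    """Pick a placement area from section flags ('a' = alloc, 'w' = writable)."""
    return "read-write" if "w" in flags else "read-only"


def place_by_first(inputs):
    # Behaviour described above: only the first input section's flags count.
    return area_for(inputs[0])


def place_by_merged(inputs):
    # Desired behaviour: use the union of all input flags, which is what
    # the output section ends up with.
    merged = set().union(*map(set, inputs))
    return area_for(merged)


t1, t2 = "a", "wa"  # flags of .AAA in t1.o and t2.o

print(place_by_first([t1, t2]))   # read-only (yet the output section is WA)
print(place_by_first([t2, t1]))   # read-write
print(place_by_merged([t1, t2]))  # read-write
print(place_by_merged([t2, t1]))  # read-write
```

In this model, placing by the first input's flags gives a different answer for each link order, while placing by the merged flags does not.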
Comment 3 Sourceware Commits 2015-10-28 07:25:34 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=199af1503922ce2134d774a78be0d9e2ae055ab1

commit 199af1503922ce2134d774a78be0d9e2ae055ab1
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Oct 28 17:18:13 2015 +1030

    Orphan output section with multiple input sections
    
    If given input sections with differing flags, we'd like to place the
    section according to the final output section flags.
    
    bfd/
    	PR ld/19162
    	* elflink.c (_bfd_elf_gc_mark_reloc): Move code iterating over
    	linker input bfds..
    	* section.c (bfd_get_next_section_by_name): ..to here.  Add ibfd param.
    	(bfd_get_linker_section): Adjust bfd_get_next_section_by_name call.
    	* tekhex.c (first_phase): Likewise.
    	* elflink.c (bfd_elf_gc_sections): Likewise.
    	* bfd-in2.h: Regenerate.
    ld/
    	PR ld/19162
    	* emultempl/elf32.em (gld${EMULATION_NAME}_place_orphan): Check flags
    	before calling _bfd_elf_match_sections_by_type.  Merge flags for
    	any other input sections that might match a new output section to
    	decide placement.
Comment 4 Alan Modra 2015-10-28 07:26:39 UTC
Fixed
Comment 5 Sourceware Commits 2015-10-28 10:25:31 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=7963511fbf0459fff586c3129705bfbc706770e3

commit 7963511fbf0459fff586c3129705bfbc706770e3
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Oct 28 03:20:55 2015 -0700

    Add a test for PR ld/19162
    
    	PR ld/19162
    	* ld-x86-64/x86-64.exp: Run pr19162.
    	* ld-x86-64/pr19162.d: New file.
    	* ld-x86-64/pr19162a.s: Likewise.
    	* ld-x86-64/pr19162b.s: Likewise.
Comment 6 Sourceware Commits 2015-10-29 09:20:14 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=936384714fa8b0f7ca8cc3b5637394461bc998c8

commit 936384714fa8b0f7ca8cc3b5637394461bc998c8
Author: Alan Modra <amodra@gmail.com>
Date:   Thu Oct 29 16:16:22 2015 +1030

    Re: Orphan output section with multiple input sections
    
    The last patch missed handling the case where the ideal place to put
    an orphan was after a non-existent output section statement, as can
    happen when not using the builtin linker scripts.  This patch uses the
    updated flags for that case too, and extends the support to mmo and pe.
    
    	PR ld/19162
    	* emultempl/elf32.em (gld${EMULATION_NAME}_place_orphan): Pass
    	updated flags to lang_output_section_find_by_flags.
    	* emultempl/mmo.em (mmo_place_orphan): Merge flags for any
    	other input sections that might match a new output section to
    	decide placement.
    	* emultempl/pe.em (gld_${EMULATION_NAME}_place_orphan): Likewise.
    	* emultempl/pep.em (gld_${EMULATION_NAME}_place_orphan): Likewise.
    	* ldlang.c (lang_output_section_find_by_flags): Add sec_flags param.
    	* ldlang.h (lang_output_section_find_by_flags): Update prototype.
Comment 7 Sourceware Commits 2016-02-29 18:39:03 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=7f50ebc1b1215520b85cb9a8e709e502898fd2c8

commit 7f50ebc1b1215520b85cb9a8e709e502898fd2c8
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Feb 29 10:37:59 2016 -0800

    Add a testcase for PR ld/19162
    
    	PR ld/19162
    	* testsuite/ld-elf/pr19162.d: New file.
    	* testsuite/ld-elf/pr19162a.s: Likewise.
    	* testsuite/ld-elf/pr19162b.s: Likewise.