Bug 19446

Summary: BFD linker discards section without alloc section attribute under certain conditions
Product: binutils Reporter: David Li <xinliangli>
Component: binutils Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED WORKSFORME    
Severity: normal CC: ccoutant, hjl.tools, nickc
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description David Li 2016-01-11 22:33:42 UTC
When --gc-sections is on and a section is not marked with the SHF_ALLOC attribute, the BFD linker keeps the section only if it has no references to other symbols. If the section does reference other symbols, the linker garbage collects it.

By comparison, the Gold linker does not garbage collect such sections, regardless of their contents.

Example:  t.s

gcc -fuse-ld=bfd -Wl,--gc-sections t.s

objdump -h a.out |grep UNREF

If the assembly file is changed slightly so that unref is initialized to 0, the linker keeps the UNREF section.

The version of the linker tested is 2.24.

        .file   "unref2.c"
        .comm   g0,4,4

        .globl  unref
        .section        UNREF,"",@progbits
        .align 8
        .type   unref, @object
        .size   unref, 8
unref:
        .quad   g0

        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $1, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4"
        .section        .note.GNU-stack,"",@progbits
Comment 1 H.J. Lu 2016-01-11 23:04:08 UTC
Since UNREF section is not referenced, it should be GCed.  Am I missing
something?
Comment 2 David Li 2016-01-11 23:07:22 UTC
(In reply to H.J. Lu from comment #1)
> Since UNREF section is not referenced, it should be GCed.  Am I missing
> something?

Note that the section does not have 'a' bit -- just like debug sections. Linker won't GC debug sections, right?

Also there is inconsistent behavior here -- when g0 is not referenced by UNREF, UNREF will be kept by the linker.
Comment 3 H.J. Lu 2016-01-11 23:25:13 UTC
(In reply to David Li from comment #2)
> (In reply to H.J. Lu from comment #1)
> > Since UNREF section is not referenced, it should be GCed.  Am I missing
> > something?
> 
> Note that the section does not have 'a' bit -- just like debug sections.
> Linker won't GC debug sections, right?

ld only treats known debug sections as debug sections.  Are you looking
for a way to prevent unreferenced section from GC?

> Also there is inconsistent behavior here -- when g0 is not referenced by
> UNREF, UNREF will be kept by the linker.

g0 is removed by ld in binutils 2.26.
Comment 4 David Li 2016-01-11 23:44:07 UTC
(In reply to H.J. Lu from comment #3)
> (In reply to David Li from comment #2)
> > (In reply to H.J. Lu from comment #1)
> > > Since UNREF section is not referenced, it should be GCed.  Am I missing
> > > something?
> > 
> > Note that the section does not have 'a' bit -- just like debug sections.
> > Linker won't GC debug sections, right?
> 
> ld only treats known debug sections as debug sections.  Are you looking
> for a way to prevent unreferenced section from GC?

yes.


> 
> > Also there is inconsistent behavior here -- when g0 is not referenced by
> > UNREF, UNREF will be kept by the linker.
> 
> g0 is removed by ld in binutils 2.26.

No -- that is not the point. The point is that if the UNREF section is defined as follows, ld *will* keep UNREF (I have not tried 2.26). What makes ld think UNREF should be kept here, given that it is not a known debug section?



        .globl  unref
        .section        UNREF,"",@progbits
        .align 8
        .type   unref, @object
        .size   unref, 8
unref:
        .long 0

        .text
        .globl  main
        .type   main, @function
Comment 5 H.J. Lu 2016-01-12 00:36:42 UTC
(In reply to David Li from comment #4)
> (In reply to H.J. Lu from comment #3)
> > (In reply to David Li from comment #2)
> > > (In reply to H.J. Lu from comment #1)
> > > > Since UNREF section is not referenced, it should be GCed.  Am I missing
> > > > something?
> > > 
> > > Note that the section does not have 'a' bit -- just like debug sections.
> > > Linker won't GC debug sections, right?
> > 
> > ld only treats known debug sections as debug sections.  Are you looking
> > for a way to prevent unreferenced section from GC?
> 
> yes.
> 

Will

        .globl  unref
        .section        UNREF,"",@note
        .align 8
        .type   unref, @object
        .size   unref, 8
unref:
        .quad   g0

work for you?

> > 
> > > Also there is inconsistent behavior here -- when g0 is not referenced by
> > > UNREF, UNREF will be kept by the linker.
> > 
> > g0 is removed by ld in binutils 2.26.
> 
> No -- that is not the point. The point is that if UNREF section is defined
> as follows,  ld *will* keep UNREF (I have not tried 2.26). What makes ld
> think UNREF should be kept here as they are not known debug sections?
> 
> 
> 
>         .globl  unref
>         .section        UNREF,"",@progbits
>         .align 8
>         .type   unref, @object
>         .size   unref, 8
> unref:
>         .long 0
> 
>         .text
>         .globl  main
>         .type   main, @function

GC in ld in binutils 2.26 will remove it.
Comment 6 David Li 2016-01-12 07:34:57 UTC
(In reply to H.J. Lu from comment #5)
> (In reply to David Li from comment #4)
> > (In reply to H.J. Lu from comment #3)
> > > (In reply to David Li from comment #2)
> > > > (In reply to H.J. Lu from comment #1)
> > > > > Since UNREF section is not referenced, it should be GCed.  Am I missing
> > > > > something?
> > > > 
> > > > Note that the section does not have 'a' bit -- just like debug sections.
> > > > Linker won't GC debug sections, right?
> > > 
> > > ld only treats known debug sections as debug sections.  Are you looking
> > > for a way to prevent unreferenced section from GC?
> > 
> > yes.
> > 
> 
> Will
> 
>         .globl  unref
>         .section        UNREF,"",@note
>         .align 8
>         .type   unref, @object
>         .size   unref, 8
> unref:
>         .quad   g0
> 
> work for you?
> 
> > > 
> > > > Also there is inconsistent behavior here -- when g0 is not referenced by
> > > > UNREF, UNREF will be kept by the linker.
> > > 
> > > g0 is removed by ld in binutils 2.26.
> > 
> > No -- that is not the point. The point is that if UNREF section is defined
> > as follows,  ld *will* keep UNREF (I have not tried 2.26). What makes ld
> > think UNREF should be kept here as they are not known debug sections?
> > 
> > 
> > 
> >         .globl  unref
> >         .section        UNREF,"",@progbits
> >         .align 8
> >         .type   unref, @object
> >         .size   unref, 8
> > unref:
> >         .long 0
> > 
> >         .text
> >         .globl  main
> >         .type   main, @function
> 
> GC in ld in binutils 2.26 will remove it.

making it a note section works fine for both ld and gold
Comment 7 H.J. Lu 2016-01-12 15:46:55 UTC
Works for me since ld in binutils 2.26 is consistent.
Comment 8 Nick Clifton 2016-01-21 16:35:14 UTC
Hi David,

  Right - I think that I have got a handle on what is going on here.

  So what happens is this - g0 is a data symbol.  In your case it is a common symbol, but it could equally be an ordinary data symbol too.  Either way it is present in unref2.o.

  When the linker is invoked it sees that g0 is only ever referenced from UNREF, an unallocated, unloaded section.  So it has two choices.  It can either a) decide that, since UNREF will never be loaded, there is no need to allocate space for g0 in the runtime image, or b) decide that, since g0 is referenced by something, it must have space allocated to it and so it is put into the .bss section.  The LD linker makes choice a), GOLD makes choice b).

  The results of garbage collection are then affected by this choice.  With LD, since g0 is being discarded, it has to also discard UNREF, as otherwise you would be left with a section in the executable that references an object that no longer exists.  With GOLD, since g0 exists, the UNREF section can be kept.  (By default unallocated sections are kept since they do not contribute anything to the runtime memory usage of the executable, and they can be presumed to contain something useful).
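
  The two outcomes described above can be illustrated with a toy model. This is only a sketch of the behaviour reported in this bug, not the code of either linker: the `Section` class, the `collect` function, and the policy names "ld-like" and "gold-like" are all hypothetical, and g0 is modelled as a node of its own even though it is really a common symbol.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str
    alloc: bool  # True if the section would carry SHF_ALLOC
    refs: list = field(default_factory=list)  # names it has relocations against


def mark(by_name, start, kept):
    """Add every node reachable from `start` to the set `kept`."""
    stack = list(start)
    while stack:
        name = stack.pop()
        if name in kept:
            continue
        kept.add(name)
        stack.extend(by_name[name].refs)
    return kept


def collect(sections, roots, policy):
    """Return the names kept under a toy GC policy (hypothetical model)."""
    by_name = {s.name: s for s in sections}
    kept = mark(by_name, roots, set())
    for s in sections:
        if s.alloc or s.name in kept:
            continue
        if policy == "ld-like":
            # Keep an unallocated section only if it references nothing.
            if not s.refs:
                kept.add(s.name)
        elif policy == "gold-like":
            # Keep every unallocated section, plus whatever it references.
            mark(by_name, [s.name], kept)
        else:
            raise ValueError(policy)
    return kept


def example(unref_refs):
    """Build the test case from this report: main, g0, and the UNREF section."""
    return [
        Section(".text", alloc=True),
        Section("g0", alloc=True),
        Section("UNREF", alloc=False, refs=list(unref_refs)),
    ]
```

  Calling `collect(example(["g0"]), [".text"], "ld-like")` drops both UNREF and g0, while the "gold-like" policy keeps both. With `example([])`, which corresponds to the `.long 0` variant, the "ld-like" policy keeps UNREF.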

  So that is the situation.  Nothing has changed with 2.26 or the mainline sources by the way - they still behave in the way that you saw.

  To be honest however I do not think that there is a bug here to be fixed.  The fact that an unallocated, unloaded, non-debug section references data in the executable is very strange.  Or at least pretty unusual.  There are ways around the problem - by making the unref'ed section a note section or a fake debug section - but as far as I can see the only real issue is that LD and GOLD differ in their handling of the situation.  I don't believe that there is a standard that specifies what should happen in this situation, so either choice is valid.  You pick the linker that gives you the behaviour you want.

  Does that satisfy you ?

Cheers
  Nick
Comment 9 David Li 2016-01-21 17:43:24 UTC
(In reply to Nick Clifton from comment #8)
> Hi David,
> 
>   Right - I think that I have got a handle on what is going on here.
> 
>   So what happens is this - g0 is a data symbol.  In your case it is a
> common symbol, but it could equally be an ordinary data symbol too.  Either
> way it is present in unref2.o.
> 
>   When the linker is invoked it sees that g0 is only ever referenced from
> UNREF, an unallocated, unloaded section.  So it has two choices.  It can
> either a) decide that since UNREF will never be loaded that there is no need
> to allocate space for g0 in the runtime image or b) decide that since g0 is
> referenced by something it must have space allocated to it and so it is put
> into the .bss section.  The LD linker makes choice a), GOLD makes choice b).
> 
>   The results of garbage collection are then affected by this choice.  With
> LD, since g0 is being discarded, it has to also discard UNREF, as otherwise
> you would be left with a section in the executable that references an object
> that no longer exists.  With GOLD, since g0 exists, the UNREF section can be
> kept.  (By default unallocated sections are kept since they do not
> contribute anything to the runtime memory usage of the executable, and they
> can be presumed to contain something useful).
> 
>   So that is the situation.  Nothing has changed with 2.26 or the mainline
> sources by the way - they still behave in the way that you saw.
> 
>   To be honest however I do not think that there is a bug here to be fixed. 
> The fact that an unallocated, unloaded, non-debug section references data in
> the executable is very strange.  Or at least pretty unusual.  There are ways
> around the problem - by making the unref'ed section a note section or a fake
> debug section - but as far as I can see the only real issue is that LD and
> GOLD differ in their handling of the situation.  I don't believe that there
> is a standard that specifies what should happen in this situation, so either
> choice is valid.  You pick the linker that gives you the behaviour you want.
> 
>   Does that satisfy you ?
> 
> Cheers
>   Nick

Nick, thanks for the detailed explanation. However, I do think ld's behavior is not correct here. Shouldn't the linker first decide whether some sections are 'root' sections that should not be thrown away, and then decide whether other sections should be GCed? Here the linker first prunes the references and is then forced to discard the section, not because it should do so, but to make the link succeed.

Other than gold, the linker for Mach-O also does not throw away sections it does not know about.

As a result of the ld behavior, I did what H.J. suggested and made the section a note section, and ld behaves as expected.
Comment 10 Nick Clifton 2016-01-22 09:44:24 UTC
Hi Xinliangli,

> However I do think ld's behavior is
> not correct here. Should linker first decide if some sections are 'root'
> sections that should not be thrown away and then decide if other sections should
> be GCed? Here linker first prunes the references and then is forced to discard
> section not because it should do so, but to make the link succeed.

Yes, but... the linker has not been told that the UNREF section is a root section.  If the test had used a linker script that specified that the UNREF section should be kept[1] then the linker would have acted differently.  It would keep the UNREF section and g0 variable and everything would have worked as expected.

Since however the UNREF section is an orphan section, the linker has more latitude in what it can do.  The LD linker decides that since the section has relocations against it, and since these relocations refer to symbols which are otherwise unused, it makes sense to discard the section.  You disagree with this decision.  I don't.  But since there are several available workarounds, and, as far as I know, it is not breaking real programs, I do not plan to make any changes to the linker.

Cheers
   Nick

[1] ie:

   UNREF : { KEEP (*(UNREF)) }
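
The effect of declaring a root can be sketched as a plain reachability walk. This is a hypothetical illustration, not linker code: the `reachable` helper and the `GRAPH` dictionary are made up for the example, and adding "UNREF" to the roots stands in for what a KEEP entry would tell the linker.

```python
def reachable(graph, roots):
    """Return every node reachable from `roots` in a dict-of-lists graph."""
    kept = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node not in kept:
            kept.add(node)
            stack.extend(graph[node])
    return kept


# Toy reference graph for the test case: only UNREF refers to g0.
GRAPH = {"main": [], "UNREF": ["g0"], "g0": []}

# Without a KEEP-style root, nothing reaches UNREF or g0.
without_keep = reachable(GRAPH, ["main"])

# Treating UNREF as a root keeps it and pulls g0 in with it.
with_keep = reachable(GRAPH, ["main", "UNREF"])
```

In this model g0 survives only because a root refers to it, which matches the statement above that the linker would keep both the UNREF section and the g0 variable.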
Comment 11 David Li 2016-01-22 18:54:59 UTC
No problem with this as long as ld does not throw away note sections.

thanks,