Bug 21557

Summary: __start_SCN not provided if SCN used in linker script
Product: binutils Reporter: Alexander Monakov <amonakov>
Component: ldAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: hjl.tools
Priority: P2    
Version: 2.29   
Target Milestone: 2.29   
Host: Target:
Build: Last reconfirmed: 2017-06-09 00:00:00

Description Alexander Monakov 2017-06-08 17:26:23 UTC
It appears that automagic provision of __start_/__stop_SCN symbol names doesn't work in BFD linker (works in Gold) if the section in question participates in linker script SECTIONS command, even though computing reachability for the purposes of --gc-sections works as expected.

cat <<EOF >test.s
        .section        scnfoo,"aw",@progbits
        .type   foo, @object
        .size   foo, 1
foo:
        .zero   1
        .globl  bar
        .data
        .align 8
        .type   bar, @object
        .size   bar, 8
bar:
        .quad   __start_scnfoo
EOF

cat <<EOF >lds
SECTIONS {
        scnfoo : { *(scnfoo) }
}
EOF

gcc -B. -fuse-ld=bfd -shared -o test.so test.o -Wl,-z,defs -T lds
test.o:(.data+0x0): undefined reference to `__start_scnfoo'


(this issue is minimized from analyzing why Glibc libc.so gets mislinked with --gc-sections, but the original Glibc issue is different: they use PROVIDE statements overriding magic symbols:

         PROVIDE(__start___libc_subfreeres = .);
         __libc_subfreeres : { *(__libc_subfreeres) }
         PROVIDE(__stop___libc_subfreeres = .);

and imho there the main question would be if linkers want to support such usage, or Glibc would have to drop the now-useless PROVIDE statements to allow building with --gc-sections, but that leads to the above ld.bfd bug)
Comment 1 H.J. Lu 2017-06-09 12:07:24 UTC
From ld manual:

   If an orphaned section's name is representable as a C identifier then
the linker will automatically *note PROVIDE:: two symbols:
__start_SECNAME and __stop_SECNAME, where SECNAME is the name of the
section.  These indicate the start address and end address of the
orphaned section respectively.  Note: most section names are not
representable as C identifiers because they contain a '.' character.

When scnfoo is used in the linker script, it is no longer an orphan section.
I think this should be closed as WONTFIX.
Comment 2 Alexander Monakov 2017-06-09 13:50:38 UTC
Thanks for the reference, I was missing that this handling is explicitly limited to orphan sections. This makes my testcase invalid, but please consider that current linker behavior appears inconsistent, for two separate reasons.

First, gc-sections considers these non-orphan sections reachable; taking the testcase and linking it without '-z defs', with --gc-sections, you get:

$ ld.bfd -shared -o test.so test.o --gc-sections --print-gc-sections -T lds
<no output, section scnfoo was not garbage-collected>

But eliminating the reference from 'bar' allows it to be collected:

$ sed -i -e s/__start_scnfoo/undef/ test.s
$ gcc -c test.s
$ ld.bfd -shared -o test.so test.o --gc-sections --print-gc-sections -T lds
ld.bfd: Removing unused section 'scnfoo' in file 'test.o'

To the user this means that __start_scnfoo was virtually present when sections were marked for garbage-collection, but not present for actual symbol resolution.


Second, it implies there's no good way to use linker-scripts, --gc-sections and __start_SCN together, even if one's use of linker scripts is limited to rearranging sections. If one tries to manually provide __start_SCN:

SECTIONS {
        .dynamic : { *(.dynamic) }
        .data : { *(.data) }
        PROVIDE(__start_scnfoo = .);
        scnfoo : { *(scnfoo) }
}

... then it leads to scnfoo eliminated with --gc-sections because __start_scnfoo no longer ties it to public symbol 'bar':

ld.bfd -shared -o test.so test.o --gc-sections --print-gc-sections -T lds
ld.bfd: Removing unused section 'scnfoo' in file 'test.o'

(with linker scripts it's possible to have output sections have different names than input sections, this may cause ambiguity in what SCN __start_SCN should be referring to; is that the reason why handling is limited to orphans?)

(it's not clear to me why bug status was set to WAITING, please clarify if you'd like to see further clarifications, new bugreports for above issues, or something else)
Comment 3 H.J. Lu 2017-06-09 14:54:36 UTC
(In reply to Alexander Monakov from comment #2)
> Thanks for the reference, I was missing that this handling is explicitly
> limited to orphan sections. This makes my testcase invalid, but please
> consider that current linker behavior appears inconsistent, for two separate
> reasons.
> 
> First, gc-sections considers these non-orphan sections reachable; taking the
> testcase and linking it without '-z defs', with --gc-sections, you get:
> 

Please open a new bug report.  BTW, foo in your testcase isn't reachable
outside of test.so.

> (it's not clear to me why bug status was set to WAITING, please clarify if
> you'd like to see further clarifications, new bugreports for above issues,
> or something else)

I wanted to see more info.
Comment 4 Alexander Monakov 2017-06-09 17:40:52 UTC
> Please open a new bug report.

Done, bug 21562.

Now I see that to use __start_SCN together with linker scripts and --gc-sections, one should put the PROVIDE statement inside of the output section declaration:

SECTIONS {
        scnfoo : {
        PROVIDE_HIDDEN(__start_scnfoo = .);
        *(scnfoo)
        }
}

This looks logical and seems to work with -gc-sections, with both ld.bfd and Gold. I think the Glibc approach quoted in comment #0 with PROVIDE statements outside of output groups sets a bad example (and breaks build with --gc-sections).
Comment 5 Sourceware Commits 2017-06-13 15:55:12 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=cbd0eecf261c2447781f8c89b0d955ee66fae7e9

commit cbd0eecf261c2447781f8c89b0d955ee66fae7e9
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Jun 13 08:53:22 2017 -0700

    Always define referenced __start_SECNAME/__stop_SECNAME
    
    Currently, linker will define __start_SECNAME and __stop_SECNAME symbols
    only for orphaned sections.
    
    However, during garbage collection, ELF linker marks all sections with
    references to __start_SECNAME and __stop_SECNAME symbols as used even
    when section SECNAME isn't an orphaned section and linker won't define
    __start_SECNAME nor __stop_SECNAME.  And ELF linker stores the first
    input section whose name matches __start_SECNAME or __stop_SECNAME in
    u.undef.section for garbage collection.  If these symbols are provided
    in linker script, u.undef.section is set to the section where they will
    defined by linker script, which leads to the incorrect output.
    
    This patch changes linker to always define referenced __start_SECNAME and
    __stop_SECNAME if the input section name is the same as the output section
    name, which is always true for orphaned sections, and SECNAME is a C
    identifier.  Also __start_SECNAME and __stop_SECNAME symbols are marked
    as hidden by ELF linker so that __start_SECNAME and __stop_SECNAME symbols
    for section SECNAME in different modules are unique.  For garbage
    collection, ELF linker stores the first matched input section in the
    unused vtable field.
    
    bfd/
    
    	PR ld/20022
    	PR ld/21557
    	PR ld/21562
    	PR ld/21571
    	* elf-bfd.h (elf_link_hash_entry): Add start_stop.  Change the
    	vtable field to a union.
    	(_bfd_elf_is_start_stop): Removed.
    	* elf32-i386.c (elf_i386_convert_load_reloc): Also check for
    	__start_SECNAME and __stop_SECNAME symbols.
    	* elf64-x86-64.c (elf_x86_64_convert_load_reloc): Likewise.
    	* elflink.c (_bfd_elf_is_start_stop): Removed.
    	(_bfd_elf_gc_mark_rsec): Check start_stop instead of calling
    	_bfd_elf_is_start_stop.
    	(elf_gc_propagate_vtable_entries_used): Skip __start_SECNAME and
    	__stop_SECNAME symbols.  Updated.
    	(elf_gc_smash_unused_vtentry_relocs): Likewise.
    	(bfd_elf_gc_record_vtinherit): Likewise.
    	(bfd_elf_gc_record_vtentry): Likewise.
    
    ld/
    
    	PR ld/20022
    	PR ld/21557
    	PR ld/21562
    	PR ld/21571
    	* ld.texinfo: Update __start_SECNAME/__stop_SECNAME symbols.
    	* ldlang.c (lang_insert_orphan): Move handling of __start_SECNAME
    	and __stop_SECNAME symbols to ...
    	(lang_set_startof): Here.  Also define __start_SECNAME and
    	__stop_SECNAME for -Ur.
    	* emultempl/elf32.em (gld${EMULATION_NAME}_after_open): Mark
    	referenced __start_SECNAME and __stop_SECNAME symbols as hidden
    	and set start_stop for garbage collection.
    	* testsuite/ld-elf/pr21562a.d: New file.
    	* testsuite/ld-elf/pr21562a.s: Likewise.
    	* testsuite/ld-elf/pr21562a.t: Likewise.
    	* testsuite/ld-elf/pr21562b.d: Likewise.
    	* testsuite/ld-elf/pr21562b.s: Likewise.
    	* testsuite/ld-elf/pr21562b.t: Likewise.
    	* testsuite/ld-elf/pr21562c.d: Likewise.
    	* testsuite/ld-elf/pr21562c.t: Likewise.
    	* testsuite/ld-elf/pr21562d.d: Likewise.
    	* testsuite/ld-elf/pr21562d.t: Likewise.
    	* testsuite/ld-elf/pr21562e.d: Likewise.
    	* testsuite/ld-elf/pr21562f.d: Likewise.
    	* testsuite/ld-elf/pr21562g.d: Likewise.
    	* testsuite/ld-elf/pr21562h.d: Likewise.
    	* testsuite/ld-elf/pr21562i.d: Likewise.
    	* testsuite/ld-elf/pr21562j.d: Likewise.
    	* testsuite/ld-elf/pr21562k.d: Likewise.
    	* testsuite/ld-elf/pr21562l.d: Likewise.
    	* testsuite/ld-elf/pr21562m.d: Likewise.
    	* testsuite/ld-elf/pr21562n.d: Likewise.
    	* testsuite/ld-gc/pr20022.d: Likewise.
    	* testsuite/ld-gc/pr20022a.s: Likewise.
    	* testsuite/ld-gc/pr20022b.s: Likewise.
    	* testsuite/ld-gc/gc.exp: Run PR ld/20022 tests.
    	* testsuite/ld-gc/pr19161.d: Also accept local __start_SECNAME
    	symbol.
    	* testsuite/ld-gc/start.d: Likewise.
    	* testsuite/ld-x86-64/lea1a.d: Updated.
    	* testsuite/ld-x86-64/lea1b.d: Updated.
    	* testsuite/ld-x86-64/lea1d.d: Updated.
    	* testsuite/ld-x86-64/lea1e.d: Likewise.
Comment 6 H.J. Lu 2017-06-13 15:56:06 UTC
Fixed by

ommit cbd0eecf261c2447781f8c89b0d955ee66fae7e9
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Jun 13 08:53:22 2017 -0700

    Always define referenced __start_SECNAME/__stop_SECNAME
Comment 7 Sourceware Commits 2017-06-16 14:09:52 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=7dba9362c172f1073487536eb137feb2da30b0ff

commit 7dba9362c172f1073487536eb137feb2da30b0ff
Author: Alan Modra <amodra@gmail.com>
Date:   Fri Jun 16 19:41:41 2017 +0930

    Rewrite __start and __stop symbol handling
    
    This arranges for __start and __stop symbols to be defined before
    garbage collection, for all target formats.  That should allow the
    COFF and PE --gc-sections to keep a singleton orphan input section,
    a feature lost by 2017-06-13 commit cbd0eecf26.  The fancier ELF
    treatment of keeping all input sections associated with a __start or
    __stop symbol, from 2015-10-23 commit 1cce69b9dc, is retained.
    
    .startof. and .sizeof. symbols are deliberately not defined before
    garbage collection, so these won't affect garbage collection of
    sections.
    
    The patch also ensures __start, __stop, .startof. and .sizeof. symbols
    are defined before target size_dynamic_sections is called, albeit
    with a preliminary value, so that target code doesn't need to cope
    with a symbol changing from undefined at size_dynamic_sections to
    defined at relocate_section.
    
    Also, a number of problems with the testcases have been fixed.
    
    	PR ld/20022
    	PR ld/21557
    	PR ld/21562
    	PR ld/21571
    include/
    	* bfdlink.h (struct bfd_link_hash_entry): Delete undef.section.
    bfd/
    	* targets.c (struct bfd_target): Add _bfd_define_start_stop.
    	(BFD_JUMP_TABLE_LINK): Likewise.
    	* elf-bfd.h (bfd_elf_define_start_stop): Declare.
    	* elflink.c (_bfd_elf_gc_mark_rsec): Update comment.
    	(bfd_elf_define_start_stop): New function.
    	* linker.c (bfd_generic_define_start_stop): New function.
    	* coff64-rs6000.c (rs6000_xcoff64_vec, rs6000_xcoff64_aix_vec): Init
    	new field.
    	* aout-adobe.c (aout_32_bfd_define_start_stop): Define.
    	* aout-target.h (MY_bfd_define_start_stop): Define.
    	* aout-tic30.c (MY_bfd_define_start_stop): Define.
    	* binary.c (binary_bfd_define_start_stop): Define.
    	* bout.c (b_out_bfd_define_start_stop): Define.
    	* coff-alpha.c (_bfd_ecoff_bfd_define_start_stop): Define.
    	* coff-mips.c (_bfd_ecoff_bfd_define_start_stop): Define.
    	* coff-rs6000.c (_bfd_xcoff_bfd_define_start_stop): Define.
    	* coffcode.h (coff_bfd_define_start_stop): Define.
    	* elfxx-target.h (bfd_elfNN_bfd_define_start_stop): Define.
    	* i386msdos.c (msdos_bfd_define_start_stop): Define.
    	* i386os9k.c (os9k_bfd_define_start_stop): Define.
    	* ieee.c (ieee_bfd_define_start_stop): Define.
    	* ihex.c (ihex_bfd_define_start_stop): Define.
    	* libbfd-in.h (_bfd_nolink_bfd_define_start_stop): Define.
    	* mach-o-target.c (bfd_mach_o_bfd_define_start_stop): Define.
    	* mmo.c (mmo_bfd_define_start_stop): Define.
    	* nlm-target.h (nlm_bfd_define_start_stop): Define.
    	* oasys.c (oasys_bfd_define_start_stop): Define.
    	* pef.c (bfd_pef_bfd_define_start_stop): Define.
    	* plugin.c (bfd_plugin_bfd_define_start_stop): Define.
    	* ppcboot.c (ppcboot_bfd_define_start_stop): Define.
    	* som.c (som_bfd_define_start_stop): Define.
    	* srec.c (srec_bfd_define_start_stop): Define.
    	* tekhex.c (tekhex_bfd_define_start_stop): Define.
    	* versados.c (versados_bfd_define_start_stop): Define.
    	* vms-alpha.c (vms_bfd_define_start_stop): Define.
    	(alpha_vms_bfd_define_start_stop): Define.
    	* xsym.c (bfd_sym_bfd_define_start_stop): Define.
    	* bfd-in2.h: Regenerate.
    	* libbfd.h: Regenerate.
    ld/
    	* emultempl/elf32.em (gld${EMULATION_NAME}_after_open): Don't set
    	__start/__stop syms here.
    	* ldlang.c (lang_set_startof): Delete.
    	(start_stop_syms, start_stop_count, start_stop_alloc): New vars.
    	(lang_define_start_stop, lang_init_start_stop, foreach_start_stop,
    	undef_start_stop, lang_undef_start_stop, lang_init_startof_sizeof,
    	set_start_stop, lang_finalize_start_stop): New functions.
    	(lang_process): Call _start_stop functions.
    	* testsuite/ld-elf/pr21562a.d: Use xfail rather than notarget.
    	Correct typos and list of xfail targets.
    	* testsuite/ld-elf/pr21562b.d: Likewise.
    	* testsuite/ld-elf/pr21562c.d: Likewise.
    	* testsuite/ld-elf/pr21562d.d: Likewise.
    	* testsuite/ld-elf/pr21562e.d: Likewise.
    	* testsuite/ld-elf/pr21562f.d: Likewise.
    	* testsuite/ld-elf/pr21562g.d: Likewise.
    	* testsuite/ld-elf/pr21562h.d: Likewise.
    	* testsuite/ld-elf/pr21562i.d: Likewise.
    	* testsuite/ld-elf/pr21562j.d: Likewise.
    	* testsuite/ld-elf/pr21562k.d: Likewise.
    	* testsuite/ld-elf/pr21562l.d: Likewise.
    	* testsuite/ld-elf/pr21562m.d: Likewise.
    	* testsuite/ld-elf/pr21562n.d: Likewise.
    	* testsuite/ld-elf/sizeofa.d: Likewise.  Adjust to pass for generic ELF.
    	* testsuite/ld-elf/sizeofb.d: Likewise.
    	* testsuite/ld-elf/startofa.d: Likewise.
    	* testsuite/ld-elf/startofb.d: Likewise.