This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] PR ld/15365: Restrict __ehdr_start's export class (was: RE: [COMMITTED PATCH] Use __ehdr_start, if available, as fallback for AT_PHDR.)


On Mon, 8 Apr 2013, Roland McGrath wrote:

> Such an error is by definition a bug in binutils.  If it occurs in
> released versions of binutils, then we would consider trying to work
> around it somehow.  But the thing to do first is to find a reduced
> test case that demonstrates the linker bug and take it up with the
> binutils maintainers.  (I would guess that any simple static link
> making use of __ehdr_start with a weak reference would hit the same
> problem, but you'll have to see what you get.)  Once it's resolved
> for trunk binutils, we'll know much more about the details and be in
> a better position to consider the situation for any older binutils
> versions that are affected.

 The issue here is the symbol is defined too late to change its binding.  
When all the input files have been read and the weak reference has not 
been resolved, then the symbol is considered global and undefined 
(remember that weak means global with a lower precedence).  The MIPS 
backend's GOT layout machinery is then called as the arrangement of all 
the dynamic sections is finalised (in bfd_elf_size_dynamic_sections; in 
its "traditional" SVR4 ABI MIPS uses GOT rather than copy relocs in all 
cases, even static executables).  Being global the symbol is assigned an 
index in the global part of the GOT; the index will then be used to 
satisfy any GOT relocations referring to this symbol (at the run time the 
local part of the GOT is relocated by adding the base address only while 
the global part undergoes full symbol resolution).

 Now if this was an ordinary unresolved weak reference and the executable 
ended up as dynamic then the reference would be entered into the dynamic 
symbol table and could still be satisfied by a dynamic object, e.g.:

Symbol table '.dynsym' contains 14 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
[...]
    13: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND foo

In a static executable the dynamic symbol is of course not produced after 
all, however GOT processing is the same and the symbol's entry remains 
zero forever.

 When later on the symbol is actually defined, then -- by definition -- it 
becomes local, because symbols defined in an executable cannot be 
preempted by a dynamic library.  This is checked by
_bfd_elf_symbol_refs_local_p:

  /* At this point, we know the symbol is defined and dynamic.  In an
     executable it must resolve locally, likewise when building symbolic
     shared libraries.  */
  if (info->executable || SYMBOLIC_BIND (info, h))
    return TRUE;

and that triggers when an R_MIPS_GOT_DISP relocation used by this 
reference (http://sourceware.org/bugzilla/show_bug.cgi?id=15365) is 
resolved.  The symbol is therefore treated as local; however it has 
already been assigned to the global part of the GOT and hence the 
assertion failure.  At this stage of processing GOT entries cannot be 
shuffled around anymore as some GOT relocations may have already been 
resolved and GOT indices fixed there.

 I'm not sure if there's an easy way to solve this problem or indeed any 
at all (without turning the linker internals upside down).  However 
question is: do you really need this symbol to be global?  It is defined 
unconditionally, for any executables or shared objects (which makes a lot 
of sense to me), and it looks to me its semantics would be more natural if 
it pointed at the containing binary file's ELF header rather than that of 
the main executable.  Therefore I propose to use the export class to limit 
the scope of the symbol; it's of course going to make no difference to 
static executables.

 Here's a change to implement it.  Just as with the binding, in 
assign_file_positions_for_non_load_sections it's too late to change a 
symbol's export class, so I propose to do it early, knowing that the 
symbol will be defined later on.  It's a bit hackish, but it seems about 
the best that we can do it (I'll gladly accept any better proposal of 
course).  The right place is after all the symbol assignments have been 
processed from the linker script used, but before 
bfd_elf_size_dynamic_sections has been called and this is in 
gld${EMULATION_NAME}_before_allocation.

 GOLD does not currently support the MIPS target, so I have only adjusted 
it to make the export class of the symbol produced hidden.  This matches 
what the LD change does and I hope it works as expected.  If MIPS SVR4 ABI 
is ever supported by GOLD, then a more complicated update may be required.

 I have regression-tested this patch against my usual list of 138 targets.  
I had to modify ld-elf/ehdr_start.d because on hppa64-linux making 
__ehdr_start's export class hidden causes its binding to become local as 
well.  As a result `nm' reports a lowercase "r".  As "d" or "t" are 
likewise legitimate, I have updated the test case to include them too.  
With this change in place there were no regressions.

 The new MIPS test case covers the problem across the usual three ABIs, 
"traditional" MIPS SVR4 ABI code is used -- these days the "non-PIC 
abicalls" modification is typically used with the o32 ABI meaning the 
problem does not trigger as GOT is not used for data symbols there, copy 
relocs are instead.  The original ld-elf/ehdr_start.d test case does not 
ever trigger this problem because a data reference is used there, so no 
GOT entry is created either.  The symbol type of __ehdr_start is NOTYPE or 
OBJECT depending on whether the traditional or the SGI/IRIX ELF flavour is 
requested.  The test covers mips*-sde-elf* too as that target includes 
ELF headers in the text segment (doesn't set EMBEDDED).

 I don't know how to validate GOLD; it doesn't seem to be enabled by 
default for any of the 138 targets I use in testing.

 And last but not least I have included a workaround for glibc to work 
with problematic binutils versions.  It effectively amounts to what the 
binutils change makes the linker do internally; additionally it has a 
potential to let GCC produce better code, although this is marginal (on 
MIPS targets only the MIPS16 and microMIPS instruction sets allow for 
PC-relative data accesses; however MIPS16 and microMIPS is typically built 
using the "non-PIC abicalls" model).  Conceptually I think it is the right 
change regardless of being a workaround -- it signifies the variable is 
not supposed to be exported to or imported from outside the executable 
itself.

 Questions or comments?  If this looks good, then OK to apply to the 
respective repositories?

2013-04-24  Maciej W. Rozycki  <macro@codesourcery.com>

	gold/
	PR ld/15365
	* layout.cc (Layout::finalize): Make __ehdr_start STV_HIDDEN.

	ld/
	PR ld/15365
	* emultempl/elf32.em (gld${EMULATION_NAME}_before_allocation):
	Restrict __ehdr_start's export class to no less than STV_HIDDEN.

	ld/testsuite/
	PR ld/15365
	* ld-elf/ehdr_start.d: Also accept STB_LOCAL __ehdr_start.
	* ld-mips-elf/ehdr_start.nd: New test.
	* ld-mips-elf/ehdr_start-new.s: New test source.
	* ld-mips-elf/ehdr_start-o32.s: New test source.
	* ld-mips-elf/mips-elf.exp: Run the new test.

2013-04-24  Maciej W. Rozycki  <macro@codesourcery.com>

	* csu/libc-start.c (__libc_start_main) [!SHARED]: Declare
	__ehdr_start with hidden visibility.

  Maciej

binutils-ehdr_start-hidden.diff
Index: binutils-fsf-trunk-quilt/gold/layout.cc
===================================================================
--- binutils-fsf-trunk-quilt.orig/gold/layout.cc	2013-04-24 01:37:47.000000000 +0100
+++ binutils-fsf-trunk-quilt/gold/layout.cc	2013-04-24 02:19:46.653830728 +0100
@@ -2662,7 +2662,7 @@ Layout::finalize(const Input_objects* in
     symtab->define_in_output_segment("__ehdr_start", NULL,
 				     Symbol_table::PREDEFINED, load_seg, 0, 0,
 				     elfcpp::STT_NOTYPE, elfcpp::STB_GLOBAL,
-				     elfcpp::STV_DEFAULT, 0,
+				     elfcpp::STV_HIDDEN, 0,
 				     Symbol::SEGMENT_START, true);
 
   // Set the file offsets of all the non-data sections we've seen so
Index: binutils-fsf-trunk-quilt/ld/emultempl/elf32.em
===================================================================
--- binutils-fsf-trunk-quilt.orig/ld/emultempl/elf32.em	2013-04-24 01:37:47.000000000 +0100
+++ binutils-fsf-trunk-quilt/ld/emultempl/elf32.em	2013-04-24 02:19:46.653830728 +0100
@@ -1488,6 +1488,29 @@ gld${EMULATION_NAME}_before_allocation (
      referred to by dynamic objects.  */
   lang_for_each_statement (gld${EMULATION_NAME}_find_statement_assignment);
 
+  /* The magic __ehdr_start symbol needs to have its export class restricted
+     at least to hidden.  Force it now rather than when the symbol is defined
+     in assign_file_positions_for_non_load_sections, because it is too late
+     there and by then backends may have already arranged for the symbol to
+     have a global scope as with the default export class.  */
+  if (!link_info.relocatable)
+    {
+      struct elf_link_hash_entry *h;
+
+      h = elf_link_hash_lookup (elf_hash_table (&link_info), "__ehdr_start",
+				FALSE, FALSE, TRUE);
+
+      /* Only adjust the export class if the symbol was referenced and not
+         defined, otherwise leave it alone.  */
+      if (h != NULL
+	  && (h->root.type == bfd_link_hash_new
+	      || h->root.type == bfd_link_hash_undefined
+	      || h->root.type == bfd_link_hash_undefweak
+	      || h->root.type == bfd_link_hash_common)
+	  && ELF_ST_VISIBILITY (h->other) != STV_INTERNAL)
+	h->other = (h->other & ~ELF_ST_VISIBILITY (-1)) | STV_HIDDEN;
+    }
+
   /* Let the ELF backend work out the sizes of any sections required
      by dynamic linking.  */
   rpath = command_line.rpath;
Index: binutils-fsf-trunk-quilt/ld/testsuite/ld-elf/ehdr_start.d
===================================================================
--- binutils-fsf-trunk-quilt.orig/ld/testsuite/ld-elf/ehdr_start.d	2013-04-24 01:37:47.000000000 +0100
+++ binutils-fsf-trunk-quilt/ld/testsuite/ld-elf/ehdr_start.d	2013-04-24 02:19:46.653830728 +0100
@@ -4,5 +4,5 @@
 #target: *-*-linux* *-*-gnu* *-*-nacl*
 
 #...
-[0-9a-f]*000 [ADRT] __ehdr_start
+[0-9a-f]*000 [ADdRrTt] __ehdr_start
 #pass
Index: binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/ehdr_start-new.s
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/ehdr_start-new.s	2013-04-24 02:19:56.652584991 +0100
@@ -0,0 +1,13 @@
+	.abicalls
+	.text
+	.weak	__ehdr_start
+	.globl	__start
+	.ent	__start
+	.frame	$29, 0, $31
+	.mask	0x00000000, 0
+__start:
+	.cplocal $2
+	.cpsetup $t9, $zero, __start
+	lw	$2, __ehdr_start
+	jr	$31
+	.end	__start
Index: binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/ehdr_start-o32.s
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/ehdr_start-o32.s	2013-04-24 02:19:56.652584991 +0100
@@ -0,0 +1,14 @@
+	.abicalls
+	.text
+	.weak	__ehdr_start
+	.globl	__start
+	.ent	__start
+	.frame	$29, 0, $31
+	.mask	0x00000000, 0
+__start:
+	.set	noreorder
+	.cpload	$25
+	.set	reorder
+	lw	$2, __ehdr_start
+	jr	$31
+	.end	__start
Index: binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/ehdr_start.nd
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/ehdr_start.nd	2013-04-24 03:16:48.593668879 +0100
@@ -0,0 +1,4 @@
+Symbol table '\.symtab' contains [0-9]+ entries:
+#...
+ *[0-9]+: +[0-9a-f]+ +0 +(?:NOTYPE|OBJECT) +GLOBAL +HIDDEN +[0-9]+ +__ehdr_start
+#pass
Index: binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/mips-elf.exp
===================================================================
--- binutils-fsf-trunk-quilt.orig/ld/testsuite/ld-mips-elf/mips-elf.exp	2013-04-24 02:04:41.000000000 +0100
+++ binutils-fsf-trunk-quilt/ld/testsuite/ld-mips-elf/mips-elf.exp	2013-04-24 03:27:23.673834863 +0100
@@ -709,3 +709,19 @@ foreach { abi } $abis {
 		"readelf -A export-class-call16-${abi}.gd"] \
 	    "export-class-call16-${abi}.so"]]
 }
+
+# Magic __ehdr_start symbol tests.
+if { !$embedded_elf || [istarget mips*-sde-elf*] } {
+    set abis [concat o32 [expr {$has_newabi ? "n32 n64" : ""}]]
+    foreach { abi } $abis {
+	set suff [string map {o32 o32 n32 new n64 new} $abi]
+	run_ld_link_tests [list \
+	    [list \
+		"MIPS magic __ehdr_start symbol test ($abi)" \
+		"$abi_ldflags($abi)" "" \
+		"$abi_asflags($abi)" \
+		[list ehdr_start-${suff}.s] \
+		[list "readelf -s ehdr_start.nd"] \
+		"ehdr_start-${abi}"]]
+    }
+}

glibc-ehdr_start-hidden.diff
Index: glibc-fsf-trunk-quilt/csu/libc-start.c
===================================================================
--- glibc-fsf-trunk-quilt.orig/csu/libc-start.c	2013-04-19 23:07:37.000000000 +0100
+++ glibc-fsf-trunk-quilt/csu/libc-start.c	2013-04-23 00:33:26.607271070 +0100
@@ -161,7 +161,8 @@ LIBC_START_MAIN (int (*main) (int, char 
          So we can set up _dl_phdr and _dl_phnum even without any
          information from auxv.  */
 
-      extern const ElfW(Ehdr) __ehdr_start __attribute__ ((weak));
+      extern const ElfW(Ehdr) __ehdr_start
+	__attribute__ ((weak, visibility ("hidden")));
       if (&__ehdr_start != NULL)
         {
           assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]