[RFA:] Fix DEFINED in ld scripts.

Hans-Peter Nilsson hp@bitrange.com
Fri Oct 10 08:12:00 GMT 2003


This *is* a change in behavior: you can no longer use DEFINED on symbols
defined after the DEFINED expression, including the current symbol
definition, if in such a statement.  I believe the new behavior is
intuitive.

See the test-case below.  Without this patch, DEFINED as in the example
in ld.texinfo gets surprising results in relaxing links if that symbol
changes value due to relaxing as mentioned in previous messages.

A slight change to the manual was necessary, to describe that DEFINED only
works for symbols defined before the current statement.

No regressions were found in testing:
A full toolchain was build and checked (cross; check-gas check-ld
check-binutils check-gcc) for mips-elf, arm-elf, sh-elf, v850-elf, sh-coff.
Binutils was checked (cross; check-gas check-ld check-binutils)
for z8k-coff and mmix-knuth-mmixware.
A full native toolchain was build and checked (check) on i686-pc-linux-gnu
(with and without --enable-targets=all --enable-64-bit-bfd}
and ia64-unknown-linux-gnu.

FWIW, a quick grep in linux-2.4.14 yields no hits in linker scripts.

Ok to commit?

ld:

	* ld.texinfo (Builtin Functions) <DEFINED>: Say that only symbols
	defined before the statement using DEFINED yield 1.
	* ldexp.c (fold_name) <case DEFINED>: In lang_first_phase_enum,
	call lang_track_definedness on symbol.  In subsequent phases, use
	lang_symbol_definition_iteration and lang_statement_iteration to
	check whether the symbol was defined before the current statement.
	(exp_fold_tree) <case etree_assign et al>: Call
	lang_update_definedness before updating symbol type when setting
	symbol.
	* ldlang.c (lang_definedness_table): New variable.
	(lang_definedness_newfunc, lang_track_definedness)
	(lang_symbol_definition_iteration, lang_update_definedness): New
	functions.
	(lang_init): Initialize lang_definedness_table and
	lang_statement_iteration.
	(lang_finish): Destroy bfd_hash_table_free.
	(lang_size_sections): Increment lang_statement_iteration.
	(lang_process): Increment lang_statement_iteration before calls to
	lang_do_assignments.
	* ldlang.h (struct lang_definedness_hash_entry)
	(lang_statement_iteration, lang_track_definedness)
	(lang_symbol_definition_iteration, lang_update_definedness):
	Declare.

ld/testsuite:
	* ld-scripts/defined3.d, ld-scripts/defined3.t: New test.
	* ld-scripts/defined.exp: Run defined3.

Index: ld.texinfo
===================================================================
RCS file: /cvs/src/src/ld/ld.texinfo,v
retrieving revision 1.98
diff -p -c -r1.98 ld.texinfo
*** ld.texinfo	20 Aug 2003 08:37:15 -0000	1.98
--- ld.texinfo	10 Oct 2003 07:37:26 -0000
*************** evaluation purposes.
*** 4452,4458 ****
  @kindex DEFINED(@var{symbol})
  @cindex symbol defaults
  Return 1 if @var{symbol} is in the linker global symbol table and is
! defined, otherwise return 0.  You can use this function to provide
  default values for symbols.  For example, the following script fragment
  shows how to set a global symbol @samp{begin} to the first location in
  the @samp{.text} section---but if a symbol called @samp{begin} already
--- 4452,4459 ----
  @kindex DEFINED(@var{symbol})
  @cindex symbol defaults
  Return 1 if @var{symbol} is in the linker global symbol table and is
! defined before the statement using DEFINED in the script, otherwise
! return 0.  You can use this function to provide
  default values for symbols.  For example, the following script fragment
  shows how to set a global symbol @samp{begin} to the first location in
  the @samp{.text} section---but if a symbol called @samp{begin} already
Index: ldexp.c
===================================================================
RCS file: /cvs/src/src/ld/ldexp.c,v
retrieving revision 1.25
diff -p -c -r1.25 ldexp.c
*** ldexp.c	8 Oct 2003 12:40:26 -0000	1.25
--- ldexp.c	10 Oct 2003 07:37:26 -0000
*************** fold_name (etree_type *tree,
*** 485,494 ****
        break;
      case DEFINED:
        if (allocation_done == lang_first_phase_enum)
! 	result.valid_p = FALSE;
        else
  	{
  	  struct bfd_link_hash_entry *h;

  	  h = bfd_wrapped_link_hash_lookup (output_bfd, &link_info,
  					    tree->name.name,
--- 485,499 ----
        break;
      case DEFINED:
        if (allocation_done == lang_first_phase_enum)
! 	{
! 	  lang_track_definedness (tree->name.name);
! 	  result.valid_p = FALSE;
! 	}
        else
  	{
  	  struct bfd_link_hash_entry *h;
+ 	  int def_iteration
+ 	    = lang_symbol_definition_iteration (tree->name.name);

  	  h = bfd_wrapped_link_hash_lookup (output_bfd, &link_info,
  					    tree->name.name,
*************** fold_name (etree_type *tree,
*** 496,502 ****
  	  result.value = (h != NULL
  			  && (h->type == bfd_link_hash_defined
  			      || h->type == bfd_link_hash_defweak
! 			      || h->type == bfd_link_hash_common));
  	  result.section = abs_output_section;
  	  result.valid_p = TRUE;
  	}
--- 501,509 ----
  	  result.value = (h != NULL
  			  && (h->type == bfd_link_hash_defined
  			      || h->type == bfd_link_hash_defweak
! 			      || h->type == bfd_link_hash_common)
! 			  && (def_iteration == lang_statement_iteration
! 			      || def_iteration == -1));
  	  result.section = abs_output_section;
  	  result.valid_p = TRUE;
  	}
*************** exp_fold_tree (etree_type *tree,
*** 738,743 ****
--- 745,751 ----
  		{
  		  /* FIXME: Should we worry if the symbol is already
  		     defined?  */
+ 		  lang_update_definedness (tree->assign.dst, h);
  		  h->type = bfd_link_hash_defined;
  		  h->u.def.value = result.value;
  		  h->u.def.section = result.section->bfd_section;
Index: ldlang.c
===================================================================
RCS file: /cvs/src/src/ld/ldlang.c,v
retrieving revision 1.118
diff -p -c -r1.118 ldlang.c
*** ldlang.c	8 Oct 2003 22:17:35 -0000	1.118
--- ldlang.c	10 Oct 2003 07:37:28 -0000
*************** static const char *current_target;
*** 60,65 ****
--- 60,66 ----
  static const char *output_target;
  static lang_statement_list_type statement_list;
  static struct lang_phdr *lang_phdr_list;
+ static struct bfd_hash_table lang_definedness_table;

  /* Forward declarations.  */
  static void exp_init_os (etree_type *);
*************** static bfd_boolean wildcardp (const char
*** 67,72 ****
--- 68,75 ----
  static lang_input_statement_type *lookup_name (const char *);
  static bfd_boolean load_symbols (lang_input_statement_type *,
  				 lang_statement_list_type *);
+ static struct bfd_hash_entry *lang_definedness_newfunc
+  (struct bfd_hash_entry *, struct bfd_hash_table *, const char *);
  static void insert_undefined (const char *);
  static void print_statement (lang_statement_union_type *,
  			     lang_output_section_statement_type *);
*************** bfd_boolean delete_output_file_on_failur
*** 95,100 ****
--- 98,104 ----
  struct lang_nocrossrefs *nocrossref_list;
  struct unique_sections *unique_section_list;
  static bfd_boolean ldlang_sysrooted_script = FALSE;
+ int lang_statement_iteration = 0;

  etree_type *base; /* Relocation base - or null */

*************** lang_init (void)
*** 477,482 ****
--- 481,496 ----
      lang_output_section_statement_lookup (BFD_ABS_SECTION_NAME);

    abs_output_section->bfd_section = bfd_abs_section_ptr;
+
+   /* The value "3" is ad-hoc, somewhat related to the expected number of
+      DEFINED expressions in a linker script.  For most default linker
+      scripts, there are none.  */
+   if (bfd_hash_table_init_n (&lang_definedness_table,
+ 			     lang_definedness_newfunc, 3) != TRUE)
+     einfo (_("%P%F: out of memory during initialization"));
+
+   /* Callers of exp_fold_tree need to increment this.  */
+   lang_statement_iteration = 0;
  }

  /*----------------------------------------------------------------------
*************** lang_reasonable_defaults (void)
*** 1867,1872 ****
--- 1881,1965 ----
  #endif
  }

+ /* Add a symbol to a hash of symbols used in DEFINED (NAME) expressions.  */
+
+ void
+ lang_track_definedness (const char *name)
+ {
+   if (bfd_hash_lookup (&lang_definedness_table, name, TRUE, FALSE) == NULL)
+     einfo (_("%P%F: bfd_hash_lookup failed creating symbol %s\n"), name);
+ }
+
+ /* New-function for the definedness hash table.  */
+
+ static struct bfd_hash_entry *
+ lang_definedness_newfunc (struct bfd_hash_entry *entry,
+ 			  struct bfd_hash_table *table ATTRIBUTE_UNUSED,
+ 			  const char *name ATTRIBUTE_UNUSED)
+ {
+   struct lang_definedness_hash_entry *ret
+     = (struct lang_definedness_hash_entry *) entry;
+
+   if (ret == NULL)
+     ret = (struct lang_definedness_hash_entry *)
+       bfd_hash_allocate (table, sizeof (struct lang_definedness_hash_entry));
+
+   if (ret == NULL)
+     einfo (_("%P%F: bfd_hash_allocate failed creating symbol %s\n"), name);
+
+   ret->iteration = -1;
+   return &ret->root;
+ }
+
+ /* Return the iteration when the definition of NAME was last updated.  A
+    value of -1 means that the symbol is not defined in the linker script
+    or the command line, but may be defined in the linker symbol table.  */
+
+ int
+ lang_symbol_definition_iteration (const char *name)
+ {
+   struct lang_definedness_hash_entry *defentry
+     = (struct lang_definedness_hash_entry *)
+     bfd_hash_lookup (&lang_definedness_table, name, FALSE, FALSE);
+
+   /* We've already created this one on the presence of DEFINED in the
+      script, so it can't be NULL unless something is borked elsewhere in
+      the code.  */
+   if (defentry == NULL)
+     FAIL ();
+
+   return defentry->iteration;
+ }
+
+ /* Update the definedness state of NAME.  */
+
+ void
+ lang_update_definedness (const char *name, struct bfd_link_hash_entry *h)
+ {
+   struct lang_definedness_hash_entry *defentry
+     = (struct lang_definedness_hash_entry *)
+     bfd_hash_lookup (&lang_definedness_table, name, FALSE, FALSE);
+
+   /* We don't keep track of symbols not tested with DEFINED.  */
+   if (defentry == NULL)
+     return;
+
+   /* If the symbol was already defined, and not from an earlier statement
+      iteration, don't update the definedness iteration, because that'd
+      make the symbol seem defined in the linker script at this point, and
+      it wasn't; it was defined in some object.  If we do anyway, DEFINED
+      would start to yield false before this point and the construct "sym =
+      DEFINED (sym) ? sym : X;" would change sym to X despite being defined
+      in an object.  */
+   if (h->type != bfd_link_hash_undefined
+       && h->type != bfd_link_hash_common
+       && h->type != bfd_link_hash_new
+       && defentry->iteration == -1)
+     return;
+
+   defentry->iteration = lang_statement_iteration;
+ }
+
  /* Add the supplied name to the symbol table as an undefined reference.
     This is a two step process as the symbol table doesn't even exist at
     the time the ld command line is processed.  First we put the name
*************** lang_size_sections
*** 3131,3136 ****
--- 3224,3233 ----
     bfd_boolean check_regions)
  {
    bfd_vma result;
+   asection *o;
+
+   /* Callers of exp_fold_tree need to increment this.  */
+   lang_statement_iteration++;

    exp_data_seg.phase = exp_dataseg_none;
    result = lang_size_sections_1 (s, output_section_statement, prev, fill,
*************** lang_finish (void)
*** 3445,3450 ****
--- 3545,3552 ----
  	    }
  	}
      }
+
+   bfd_hash_table_free (&lang_definedness_table);
  }

  /* This is a small function used when we want to ignore errors from
*************** lang_process (void)
*** 4108,4114 ****
  	     pe-dll.c also.  DJ  */

  	  /* Do all the assignments with our current guesses as to
! 	     section sizes.  */
  	  lang_do_assignments (statement_list.head, abs_output_section,
  			       NULL, 0);

--- 4210,4224 ----
  	     pe-dll.c also.  DJ  */

  	  /* Do all the assignments with our current guesses as to
! 	     section sizes.
!
! 	     Callers of exp_fold_tree need to increment
! 	     lang_statement_iteration before starting to iterate over the
! 	     statement list.  We need to do it before each call instead of
! 	     in lang_do_assignments because we can't tell the root call
! 	     from its recursions.  */
!
! 	  lang_statement_iteration++;
  	  lang_do_assignments (statement_list.head, abs_output_section,
  			       NULL, 0);

*************** lang_process (void)
*** 4129,4134 ****
--- 4239,4245 ----

        /* Final extra sizing to report errors.  */
        lang_reset_memory_regions ();
+       lang_statement_iteration++;
        lang_do_assignments (statement_list.head, abs_output_section, NULL, 0);
        lang_size_sections (statement_list.head, abs_output_section,
  			  &statement_list.head, 0, 0, NULL, TRUE);
*************** lang_process (void)
*** 4143,4149 ****

    /* Do all the assignments, now that we know the final resting places
       of all the symbols.  */
!
    lang_do_assignments (statement_list.head, abs_output_section, NULL, 0);

    /* Make sure that the section addresses make sense.  */
--- 4254,4260 ----

    /* Do all the assignments, now that we know the final resting places
       of all the symbols.  */
!   lang_statement_iteration++;
    lang_do_assignments (statement_list.head, abs_output_section, NULL, 0);

    /* Make sure that the section addresses make sense.  */
Index: ldlang.h
===================================================================
RCS file: /cvs/src/src/ld/ldlang.h,v
retrieving revision 1.29
diff -p -c -r1.29 ldlang.h
*** ldlang.h	27 Jul 2003 11:58:28 -0000	1.29
--- ldlang.h	10 Oct 2003 07:37:29 -0000
*************** struct unique_sections {
*** 361,366 ****
--- 361,374 ----
    const char *name;
  };

+ /* This structure records symbols for which we need to keep track of
+    definedness for use in the DEFINED () test.  */
+
+ struct lang_definedness_hash_entry {
+   struct bfd_hash_entry root;
+   int iteration;
+ };
+
  extern struct unique_sections *unique_section_list;

  extern lang_output_section_statement_type *abs_output_section;
*************** extern const char *entry_section;
*** 375,380 ****
--- 383,390 ----
  extern bfd_boolean entry_from_cmdline;
  extern lang_statement_list_type file_chain;

+ extern int lang_statement_iteration;
+
  extern void lang_init
    (void);
  extern struct memory_region_struct *lang_memory_region_lookup
*************** extern void lang_add_unique
*** 517,521 ****
--- 527,535 ----
    (const char *);
  extern const char *lang_get_output_target
    (void);
+ extern void lang_track_definedness (const char *);
+ extern int lang_symbol_definition_iteration (const char *);
+ extern void lang_update_definedness
+   (const char *, struct bfd_link_hash_entry *);

  #endif
Index: testsuite/ld-scripts/defined.exp
===================================================================
RCS file: /cvs/src/src/ld/testsuite/ld-scripts/defined.exp,v
retrieving revision 1.4
diff -p -c -r1.4 defined.exp
*** testsuite/ld-scripts/defined.exp	8 Oct 2003 12:35:18 -0000	1.4
--- testsuite/ld-scripts/defined.exp	10 Oct 2003 07:37:30 -0000
*************** if ![ld_simple_link $ld tmpdir/def "-T $
*** 56,58 ****
--- 56,59 ----

  set prms_id 0
  run_dump_test "defined2"
+ run_dump_test "defined3"
--- /dev/null	Tue Jan  1 05:00:00 1980
+++ testsuite/ld-scripts/defined3.d	Wed Oct  8 04:31:00 2003
@@ -0,0 +1,25 @@
+#ld: -Tdefined3.t
+#nm: -B
+#source: phdrs.s
+#source: defined.s
+
+# Check that DEFINED matches only symbols defined before its location.
+# The ellipsis account for target-specific symbols.  Matching both A and T
+# accounts for formats that can't tell a .text symbol from an absolute
+# symbol (mmo), but matches whatever section that contains an address
+# matching the value.
+
+#...
+0+1 [AT] defined
+0+200 A defined1
+0+201 A defined2
+0+100 A defined3
+0+ [AT] defined4
+0+2a A defined5
+0+ [AT] defined6
+0+1 [AT] defined7
+0+1 [AT] defined8
+#...
+0+2a A sym1
+[0-9a-f]+ T sym2
+#pass
--- /dev/null	Tue Jan  1 05:00:00 1980
+++ testsuite/ld-scripts/defined3.t	Wed Oct  8 02:28:49 2003
@@ -0,0 +1,15 @@
+defined6 = DEFINED (sym2) ? 1 : 0;
+SECTIONS {
+	.text : { *(.text) sym2 = .; }
+	.data : { *(.data) }
+	.bss : { *(.bss) *(COMMON) }
+}
+defined4 = DEFINED (sym1) ? 1 : 0;
+sym1 = 42;
+defined3 = DEFINED (defined1) ? defined1 + 1 : 256;
+defined1 = DEFINED (defined1) ? defined1 + 1 : 512;
+defined2 = DEFINED (defined1) ? defined1 + 1 : 1024;
+defined5 = DEFINED (sym1) ? sym1 : 0;
+defined7 = DEFINED (sym2);
+defined8 = !DEFINED (defined8);
+defined = DEFINED (defined) ? defined : 42;

brgds, H-P



More information about the Binutils mailing list