[PATCH] DT_GNU_HASH: ~ 50% dynamic linking improvement

Jakub Jelinek jakub@redhat.com
Wed Jun 28 18:45:00 GMT 2006


Hi!

The following patches introduce an optional ELF hash section
replacement, optimized for speed and data cache accesses, which in our
tests improves dynamic linking by about 50%.  Prelinking of course
eliminates the costs altogether, but if prelinked apps use dlopen they
still benefit for that portion.  Incidentally, this is where some apps
incur high costs today.

The initial design was done by Ulrich Drepper and was discussed with
Michael Meeks a few months ago as well.  But nothing came of these
discussions and the proposed approach is quite different.

The binutils patch adds a new option to ld, --hash-style, which allows
selection of which hash sections to emit.  ld can either emit the old
style SHT_HASH section, or the new style SHT_GNU_HASH, or both (and
perhaps in the future there could be a mode in which it would emit
both, but SHT_HASH for slow compatibility only (minimal number of
buckets)).

The .gnu.hash section always uses sh_entsize 4 (so it doesn't repeat the
historic mistakes on Alpha/s390x with .hash).  The first word there is
nbuckets like in .hash section, followed by nbuckets offsets into the
chains area of the new section.  If the offset is 0xffffffff, it means
there are no defined symbol names with hash % nbuckets equal to the
offset's position.  Otherwise, offset N means the corresponding chain
starts at offset (1 + nbuckets + N) * 4 into .gnu.hash section.  The
chain area no longer mirrors the symbol table in size.

Each chain starts with a symbol index word, followed by a chain length
word and then length hash values, one for each corresponding symbol
name.  If DT_GNU_HASH is present, .dynsym must be sorted, so that all
symbols with the same name hash value % nbuckets are grouped together.
So, if some chain contains

symindx0 3 hash0 hash1 hash2

then dynamic symbol symindx0 has name hash hash0, symindx0+1 has hash1 and
symindx0+2 has hash2 and (hash0%nbuckets)==(hash1%nbuckets)==(hash2%nbuckets).

The hash function used is

static uint_fast32_t
dl_new_hash (const char *s)
{
  uint_fast32_t h = 5381;
  for (unsigned char c = *s; c != '\0'; c = *++s)
    h = h * 33 + c;
  return h & 0xffffffff;
}

(Dan Bernstein's string hash function posted eons ago on comp.lang.c.)

For an unsuccessful lookup (the usual case), the dynamic linker has to
read buckets[hash % nbuckets] (one cache line) and then, if it is not
0xffffffff, go through the chain, comparing each hashN with hash.  As
the hashN values are 32-bit and the hashing function spreads the
strings well, it is very likely that if the hash is equal, we have
found the symbol (and just need to verify .dynsym/.dynstr).  If it is
different from all hashN values in the chain, ld.so can go on with
another shared library.  Usually the whole chain will be in one cache
line or at most 2 cache lines.

We have tested a bunch of different hash functions on the set of ~
530000 unique dynamic symbols in one Linux installation.  Dan
Bernstein's hash had the fewest collisions (only 29), with very short
maximum common prefixes for the colliding symbols (just 3 chars), and
it is also very cheap to compute (h * 33 is (h << 5) + h).  Both ld
-O1 and non-optimizing ld hash sizing gave good results with that hash
function.  The standard SysV ELF hash function gave bad results: on
the same set of symbols it had 1726 collisions, and some of the colliding
symbols had very long common name prefixes.  One of the reasons could
be that SysV ELF hash function is 28 bit, while Bernstein's is 32 bit.

Attached stats file shows the biggest benchmark we used (a proglet
that attempts to do all dynamic linking OpenOffice.org 2.0.3
swriter.bin does).  Most of the libraries (except about 8 libs from the
total of 146 libraries used by the testcase) were relinked with
-Wl,--hash-style=both.  glibc, on top of the DT_GNU_HASH support
patch, also had a hack where, if LD_X=1 was in the environment, it would
pretend DT_GNU_HASH is not present to allow easier benchmarking.  The
LD_X support will not be in the final glibc patch.

Ok for trunk?

	Jakub
-------------- next part --------------
2006-06-27  Jakub Jelinek  <jakub@redhat.com>

include/
	* bfdlink.h (struct bfd_link_info): Add emit_hash and
	emit_gnu_hash bitfields.
include/elf/
	* common.h (SHT_GNU_HASH, DT_GNU_HASH): Define.
ld/
	* scripttempl/elf.sc: Add .gnu.hash section.
	* emultempl/elf32.em (OPTION_HASH_STYLE): Define.
	(gld${EMULATION_NAME}_add_options): Register --hash-style option.
	(gld${EMULATION_NAME}_handle_option): Handle it.
	(gld${EMULATION_NAME}_list_options): Document it.
	* ldmain.c (main): Initialize emit_hash and emit_gnu_hash.
	* ld.texinfo: Document --hash-style option.
bfd/
	* elf.c (_bfd_elf_print_private_bfd_data): Handle DT_GNU_HASH.
	(bfd_section_from_shdr, elf_fake_sections, assign_section_numbers):
	Handle SHT_GNU_HASH.
	(special_sections_g): Include .gnu.hash section.
	(bfd_elf_gnu_hash): New function.
	* elf-bfd.h (bfd_elf_gnu_hash): New prototype.
	* elflink.c (_bfd_elf_link_create_dynamic_sections): Create .hash
	only if info->emit_hash, create .gnu.hash section if
	info->emit_gnu_hash.
	(struct collect_gnu_hash_codes): New type.
	(elf_collect_gnu_hash_codes, elf_renumber_gnu_hash_syms): New
	functions.
	(compute_bucket_count): Don't compute HASHCODES array, instead add
	that and NSYMS as arguments.  Use bed->s->sizeof_hash_entry
	instead of bed->s->arch_size / 8.  Fix .hash size estimation.
	When not optimizing, use the number of hashed symbols rather than
	dynsymcount.
	(bfd_elf_size_dynamic_sections): Only add DT_HASH if info->emit_hash,
	and add DT_GNU_HASH if info->emit_gnu_hash.
	(bfd_elf_size_dynsym_hash_dynstr): Size .hash only if info->emit_hash,
	adjust compute_bucket_count caller.  Create and populate .gnu.hash
	section if info->emit_gnu_hash.
	(elf_link_output_extsym): Only populate .hash section if
	finfo->hash_sec != NULL.
	(bfd_elf_final_link): Adjust assertion.  Handle DT_GNU_HASH.
binutils/
	* readelf.c (get_dynamic_type): Handle DT_GNU_HASH.
	(get_section_type_name): Handle SHT_GNU_HASH.
	(dynamic_info_DT_GNU_HASH): New variable.
	(process_dynamic_section): Handle DT_GNU_HASH.
	(process_symbol_table): Print also DT_GNU_HASH histogram.

--- ld/scripttempl/elf.sc.jj	2006-01-01 01:02:16.000000000 +0100
+++ ld/scripttempl/elf.sc	2006-06-22 11:11:53.000000000 +0200
@@ -260,6 +260,7 @@ SECTIONS
   ${INITIAL_READONLY_SECTIONS}
   ${TEXT_DYNAMIC+${DYNAMIC}}
   .hash         ${RELOCATING-0} : { *(.hash) }
+  .gnu.hash     ${RELOCATING-0} : { *(.gnu.hash) }
   .dynsym       ${RELOCATING-0} : { *(.dynsym) }
   .dynstr       ${RELOCATING-0} : { *(.dynstr) }
   .gnu.version  ${RELOCATING-0} : { *(.gnu.version) }
--- ld/ldmain.c.jj	2006-06-01 15:50:33.000000000 +0200
+++ ld/ldmain.c	2006-06-22 11:21:11.000000000 +0200
@@ -304,6 +304,8 @@ main (int argc, char **argv)
   link_info.create_object_symbols_section = NULL;
   link_info.gc_sym_list = NULL;
   link_info.base_file = NULL;
+  link_info.emit_hash = TRUE;
+  link_info.emit_gnu_hash = FALSE;
   /* SVR4 linkers seem to set DT_INIT and DT_FINI based on magic _init
      and _fini symbols.  We are compatible.  */
   link_info.init_function = "_init";
--- ld/ld.texinfo.jj	2006-06-15 14:31:06.000000000 +0200
+++ ld/ld.texinfo	2006-06-22 14:03:21.000000000 +0200
@@ -1883,6 +1883,14 @@ time it takes the linker to perform its 
 increasing the linker's memory requirements.  Similarly reducing this
 value can reduce the memory requirements at the expense of speed.
 
+@kindex --hash-style=@var{style}
+@item --hash-style=@var{style}
+Set the type of linker's hash table(s).  @var{style} can be either
+@code{sysv} for classic ELF @code{.hash} section, @code{gnu} for
+new style GNU @code{.gnu.hash} section or @code{both} for both
+the classic ELF @code{.hash} and new style GNU @code{.gnu.hash}
+hash tables.  The default is @code{sysv}.
+
 @kindex --reduce-memory-overheads
 @item --reduce-memory-overheads
 This option reduces memory requirements at ld runtime, at the expense of
--- ld/emultempl/elf32.em.jj	2006-06-20 18:34:24.000000000 +0200
+++ ld/emultempl/elf32.em	2006-06-22 14:39:25.000000000 +0200
@@ -1725,7 +1725,8 @@ cat >>e${EMULATION_NAME}.c <<EOF
 #define OPTION_GROUP			(OPTION_ENABLE_NEW_DTAGS + 1)
 #define OPTION_EH_FRAME_HDR		(OPTION_GROUP + 1)
 #define OPTION_EXCLUDE_LIBS		(OPTION_EH_FRAME_HDR + 1)
-  
+#define OPTION_HASH_STYLE		(OPTION_EXCLUDE_LIBS + 1)
+
 static void
 gld${EMULATION_NAME}_add_options
   (int ns, char **shortopts, int nl, struct option **longopts,
@@ -1741,6 +1742,7 @@ cat >>e${EMULATION_NAME}.c <<EOF
     {"enable-new-dtags", no_argument, NULL, OPTION_ENABLE_NEW_DTAGS},
     {"eh-frame-hdr", no_argument, NULL, OPTION_EH_FRAME_HDR},
     {"exclude-libs", required_argument, NULL, OPTION_EXCLUDE_LIBS},
+    {"hash-style", required_argument, NULL, OPTION_HASH_STYLE},
     {"Bgroup", no_argument, NULL, OPTION_GROUP},
 EOF
 fi
@@ -1797,6 +1799,22 @@ cat >>e${EMULATION_NAME}.c <<EOF
       add_excluded_libs (optarg);
       break;
 
+    case OPTION_HASH_STYLE:
+      link_info.emit_hash = FALSE;
+      link_info.emit_gnu_hash = FALSE;
+      if (strcmp (optarg, "sysv") == 0)
+	link_info.emit_hash = TRUE;
+      else if (strcmp (optarg, "gnu") == 0)
+	link_info.emit_gnu_hash = TRUE;
+      else if (strcmp (optarg, "both") == 0)
+	{
+	  link_info.emit_hash = TRUE;
+	  link_info.emit_gnu_hash = TRUE;
+	}
+      else
+	einfo (_("%P%F: invalid hash style \`%s'\n"), optarg);
+      break;
+
     case 'z':
       if (strcmp (optarg, "initfirst") == 0)
 	link_info.flags_1 |= (bfd_vma) DF_1_INITFIRST;
@@ -1895,6 +1913,7 @@ cat >>e${EMULATION_NAME}.c <<EOF
   fprintf (file, _("  --disable-new-dtags\tDisable new dynamic tags\n"));
   fprintf (file, _("  --enable-new-dtags\tEnable new dynamic tags\n"));
   fprintf (file, _("  --eh-frame-hdr\tCreate .eh_frame_hdr section\n"));
+  fprintf (file, _("  --hash-style=STYLE\tSet hash style to sysv, gnu or both\n"));
   fprintf (file, _("  -z combreloc\t\tMerge dynamic relocs into one section and sort\n"));
   fprintf (file, _("  -z defs\t\tReport unresolved symbols in object files.\n"));
   fprintf (file, _("  -z execstack\t\tMark executable as requiring executable stack\n"));
--- bfd/elf-bfd.h.jj	2006-06-20 18:34:24.000000000 +0200
+++ bfd/elf-bfd.h	2006-06-26 16:17:53.000000000 +0200
@@ -1481,6 +1481,8 @@ extern bfd_vma _bfd_elf_section_offset
 
 extern unsigned long bfd_elf_hash
   (const char *);
+extern unsigned long bfd_elf_gnu_hash
+  (const char *);
 
 extern bfd_reloc_status_type bfd_elf_generic_reloc
   (bfd *, arelent *, asymbol *, void *, asection *, bfd *, char **);
--- bfd/elf.c.jj	2006-06-20 18:34:24.000000000 +0200
+++ bfd/elf.c	2006-06-26 16:17:28.000000000 +0200
@@ -206,6 +206,21 @@ bfd_elf_hash (const char *namearg)
   return h & 0xffffffff;
 }
 
+/* DT_GNU_HASH hash function.  Do not change this function; you will
+   cause invalid hash tables to be generated.  */
+
+unsigned long
+bfd_elf_gnu_hash (const char *namearg)
+{
+  const unsigned char *name = (const unsigned char *) namearg;
+  unsigned long h = 5381;
+  unsigned char ch;
+
+  while ((ch = *name++) != '\0')
+    h = (h << 5) + h + ch;
+  return h & 0xffffffff;
+}
+
 bfd_boolean
 bfd_elf_mkobject (bfd *abfd)
 {
@@ -1239,6 +1254,7 @@ _bfd_elf_print_private_bfd_data (bfd *ab
 	    case DT_AUXILIARY: name = "AUXILIARY"; stringp = TRUE; break;
 	    case DT_USED: name = "USED"; break;
 	    case DT_FILTER: name = "FILTER"; stringp = TRUE; break;
+	    case DT_GNU_HASH: name = "GNU_HASH"; break;
 	    }
 
 	  fprintf (f, "  %-11s ", name);
@@ -1823,6 +1839,7 @@ bfd_section_from_shdr (bfd *abfd, unsign
     case SHT_FINI_ARRAY:	/* .fini_array section.  */
     case SHT_PREINIT_ARRAY:	/* .preinit_array section.  */
     case SHT_GNU_LIBLIST:	/* .gnu.liblist section.  */
+    case SHT_GNU_HASH:		/* .gnu.hash section.  */
       return _bfd_elf_make_section_from_shdr (abfd, hdr, name, shindex);
 
     case SHT_DYNAMIC:	/* Dynamic linking information.  */
@@ -2295,6 +2312,7 @@ static const struct bfd_elf_special_sect
   { ".gnu.version_r", 14,  0, SHT_GNU_verneed, 0 },
   { ".gnu.liblist",   12,  0, SHT_GNU_LIBLIST, SHF_ALLOC },
   { ".gnu.conflict",  13,  0, SHT_RELA,     SHF_ALLOC },
+  { ".gnu.hash",       9,  0, SHT_GNU_HASH, SHF_ALLOC },
   { NULL,              0,  0, 0,            0 }
 };
 
@@ -2811,6 +2829,10 @@ elf_fake_sections (bfd *abfd, asection *
     case SHT_GROUP:
       this_hdr->sh_entsize = 4;
       break;
+
+    case SHT_GNU_HASH:
+      this_hdr->sh_entsize = 4;
+      break;
     }
 
   if ((asect->flags & SEC_ALLOC) != 0)
@@ -3256,6 +3278,7 @@ assign_section_numbers (bfd *abfd, struc
 	  break;
 
 	case SHT_HASH:
+	case SHT_GNU_HASH:
 	case SHT_GNU_versym:
 	  /* sh_link is the section header index of the symbol table
 	     this hash table or version table is for.  */
--- bfd/elflink.c.jj	2006-06-20 18:34:53.000000000 +0200
+++ bfd/elflink.c	2006-06-26 20:07:29.000000000 +0200
@@ -240,12 +240,24 @@ _bfd_elf_link_create_dynamic_sections (b
   if (!_bfd_elf_define_linkage_sym (abfd, info, s, "_DYNAMIC"))
     return FALSE;
 
-  s = bfd_make_section_with_flags (abfd, ".hash",
-				   flags | SEC_READONLY);
-  if (s == NULL
-      || ! bfd_set_section_alignment (abfd, s, bed->s->log_file_align))
-    return FALSE;
-  elf_section_data (s)->this_hdr.sh_entsize = bed->s->sizeof_hash_entry;
+  if (info->emit_hash)
+    {
+      s = bfd_make_section_with_flags (abfd, ".hash", flags | SEC_READONLY);
+      if (s == NULL
+	  || ! bfd_set_section_alignment (abfd, s, bed->s->log_file_align))
+	return FALSE;
+      elf_section_data (s)->this_hdr.sh_entsize = bed->s->sizeof_hash_entry;
+    }
+
+  if (info->emit_gnu_hash)
+    {
+      s = bfd_make_section_with_flags (abfd, ".gnu.hash",
+				       flags | SEC_READONLY);
+      if (s == NULL
+	  || ! bfd_set_section_alignment (abfd, s, bed->s->log_file_align))
+	return FALSE;
+      elf_section_data (s)->this_hdr.sh_entsize = 4;
+    }
 
   /* Let the backend create the rest of the sections.  This lets the
      backend set the right flags.  The backend will normally create
@@ -4811,6 +4823,122 @@ elf_collect_hash_codes (struct elf_link_
   return TRUE;
 }
 
+struct collect_gnu_hash_codes
+{
+  bfd *output_bfd;
+  unsigned long int nsyms;
+  unsigned long int *hashcodes;
+  unsigned long int *hashval;
+  unsigned long int *indx;
+  unsigned long int *loc;
+  unsigned long int *counts;
+  bfd_byte *contents;
+  long int min_dynindx;
+  unsigned long int bucketcount;
+  long int local_indx;
+};
+
+/* This function will be called through elf_link_hash_traverse to store
+   all hash values of the exported symbols in an array.  */
+
+static bfd_boolean
+elf_collect_gnu_hash_codes (struct elf_link_hash_entry *h, void *data)
+{
+  struct collect_gnu_hash_codes *s = data;
+  const char *name;
+  char *p;
+  unsigned long ha;
+  char *alc = NULL;
+
+  if (h->root.type == bfd_link_hash_warning)
+    h = (struct elf_link_hash_entry *) h->root.u.i.link;
+
+  /* Ignore indirect symbols.  These are added by the versioning code.  */
+  if (h->dynindx == -1)
+    return TRUE;
+
+  /* Ignore also local symbols and undefined symbols.  */
+  if (h->forced_local
+      || h->root.type == bfd_link_hash_undefined
+      || h->root.type == bfd_link_hash_undefweak
+      || ((h->root.type == bfd_link_hash_defined
+	   || h->root.type == bfd_link_hash_defweak)
+	  && h->root.u.def.section->output_section == NULL))
+    return TRUE;
+
+  name = h->root.root.string;
+  p = strchr (name, ELF_VER_CHR);
+  if (p != NULL)
+    {
+      alc = bfd_malloc (p - name + 1);
+      memcpy (alc, name, p - name);
+      alc[p - name] = '\0';
+      name = alc;
+    }
+
+  /* Compute the hash value.  */
+  ha = bfd_elf_gnu_hash (name);
+
+  /* Store the found hash value in the array for compute_bucket_count,
+     and also for .dynsym reordering purposes.  */
+  s->hashcodes[s->nsyms] = ha;
+  s->hashval[h->dynindx] = ha;
+  ++s->nsyms;
+  if (s->min_dynindx < 0 || s->min_dynindx > h->dynindx)
+    s->min_dynindx = h->dynindx;
+
+  if (alc != NULL)
+    free (alc);
+
+  return TRUE;
+}
+
+/* This function will be called through elf_link_hash_traverse to do
+   final dynamic symbol renumbering.  */
+
+static bfd_boolean
+elf_renumber_gnu_hash_syms (struct elf_link_hash_entry *h, void *data)
+{
+  struct collect_gnu_hash_codes *s = data;
+  unsigned long bucket;
+
+  if (h->root.type == bfd_link_hash_warning)
+    h = (struct elf_link_hash_entry *) h->root.u.i.link;
+
+  /* Ignore indirect symbols.  */
+  if (h->dynindx == -1)
+    return TRUE;
+
+  /* Ignore also local symbols and undefined symbols.  */
+  if (h->forced_local
+      || h->root.type == bfd_link_hash_undefined
+      || h->root.type == bfd_link_hash_undefweak
+      || ((h->root.type == bfd_link_hash_defined
+	   || h->root.type == bfd_link_hash_defweak)
+	  && h->root.u.def.section->output_section == NULL))
+    {
+      if (h->dynindx >= s->min_dynindx)
+	h->dynindx = s->local_indx++;
+      return TRUE;
+    }
+
+  bucket = s->hashval[h->dynindx] % s->bucketcount;
+  if (s->counts[bucket])
+    {
+      bfd_put_32 (s->output_bfd, s->indx[bucket],
+		  s->contents + s->loc[bucket] * 4);
+      bfd_put_32 (s->output_bfd, s->counts[bucket],
+		  s->contents + s->loc[bucket] * 4 + 4);
+      s->counts[bucket] = 0;
+      s->loc[bucket] += 2;
+    }
+  bfd_put_32 (s->output_bfd, s->hashval[h->dynindx],
+	      s->contents + s->loc[bucket] * 4);
+  ++s->loc[bucket];
+  h->dynindx = s->indx[bucket]++;
+  return TRUE;
+}
+
 /* Array used to determine the number of hash table buckets to use
    based on the number of symbols there are.  If there are fewer than
    3 symbols we use 1 bucket, fewer than 17 symbols we use 3 buckets,
@@ -4832,42 +4960,26 @@ static const size_t elf_buckets[] =
    Therefore the result is always a good payoff between few collisions
    (= short chain lengths) and table size.  */
 static size_t
-compute_bucket_count (struct bfd_link_info *info)
+compute_bucket_count (struct bfd_link_info *info, unsigned long int *hashcodes,
+		      unsigned long int nsyms)
 {
   size_t dynsymcount = elf_hash_table (info)->dynsymcount;
   size_t best_size = 0;
-  unsigned long int *hashcodes;
-  unsigned long int *hashcodesp;
   unsigned long int i;
   bfd_size_type amt;
 
-  /* Compute the hash values for all exported symbols.  At the same
-     time store the values in an array so that we could use them for
-     optimizations.  */
-  amt = dynsymcount;
-  amt *= sizeof (unsigned long int);
-  hashcodes = bfd_malloc (amt);
-  if (hashcodes == NULL)
-    return 0;
-  hashcodesp = hashcodes;
-
-  /* Put all hash values in HASHCODES.  */
-  elf_link_hash_traverse (elf_hash_table (info),
-			  elf_collect_hash_codes, &hashcodesp);
-
   /* We have a problem here.  The following code to optimize the table
      size requires an integer type with more the 32 bits.  If
      BFD_HOST_U_64_BIT is set we know about such a type.  */
 #ifdef BFD_HOST_U_64_BIT
   if (info->optimize)
     {
-      unsigned long int nsyms = hashcodesp - hashcodes;
       size_t minsize;
       size_t maxsize;
       BFD_HOST_U_64_BIT best_chlen = ~((BFD_HOST_U_64_BIT) 0);
-      unsigned long int *counts ;
       bfd *dynobj = elf_hash_table (info)->dynobj;
       const struct elf_backend_data *bed = get_elf_backend_data (dynobj);
+      unsigned long int *counts;
 
       /* Possible optimization parameters: if we have NSYMS symbols we say
 	 that the hashing table must at least have NSYMS/4 and at most
@@ -4883,10 +4995,7 @@ compute_bucket_count (struct bfd_link_in
       amt *= sizeof (unsigned long int);
       counts = bfd_malloc (amt);
       if (counts == NULL)
-	{
-	  free (hashcodes);
-	  return 0;
-	}
+	return 0;
 
       /* Compute the "optimal" size for the hash table.  The criteria is a
 	 minimal chain length.  The minor criteria is (of course) the size
@@ -4913,9 +5022,9 @@ compute_bucket_count (struct bfd_link_in
 #  define BFD_TARGET_PAGESIZE	(4096)
 # endif
 
-	  /* We in any case need 2 + NSYMS entries for the size values and
-	     the chains.  */
-	  max = (2 + nsyms) * (bed->s->arch_size / 8);
+	  /* We in any case need 2 + DYNSYMCOUNT entries for the size values
+	     and the chains.  */
+	  max = (2 + dynsymcount) * bed->s->sizeof_hash_entry;
 
 # if 1
 	  /* Variant 1: optimize for short chains.  We add the squares
@@ -4925,7 +5034,7 @@ compute_bucket_count (struct bfd_link_in
 	    max += counts[j] * counts[j];
 
 	  /* This adds penalties for the overall size of the table.  */
-	  fact = i / (BFD_TARGET_PAGESIZE / (bed->s->arch_size / 8)) + 1;
+	  fact = i / (BFD_TARGET_PAGESIZE / bed->s->sizeof_hash_entry) + 1;
 	  max *= fact * fact;
 # else
 	  /* Variant 2: Optimize a lot more for small table.  Here we
@@ -4936,7 +5045,7 @@ compute_bucket_count (struct bfd_link_in
 
 	  /* The overall size of the table is considered, but not as
 	     strong as in variant 1, where it is squared.  */
-	  fact = i / (BFD_TARGET_PAGESIZE / (bed->s->arch_size / 8)) + 1;
+	  fact = i / (BFD_TARGET_PAGESIZE / bed->s->sizeof_hash_entry) + 1;
 	  max *= fact;
 # endif
 
@@ -4959,14 +5068,11 @@ compute_bucket_count (struct bfd_link_in
       for (i = 0; elf_buckets[i] != 0; i++)
 	{
 	  best_size = elf_buckets[i];
-	  if (dynsymcount < elf_buckets[i + 1])
+	  if (nsyms < elf_buckets[i + 1])
 	    break;
 	}
     }
 
-  /* Free the arrays we needed.  */
-  free (hashcodes);
-
   return best_size;
 }
 
@@ -5324,7 +5430,10 @@ bfd_elf_size_dynamic_sections (bfd *outp
 	  bfd_size_type strsize;
 
 	  strsize = _bfd_elf_strtab_size (elf_hash_table (info)->dynstr);
-	  if (!_bfd_elf_add_dynamic_entry (info, DT_HASH, 0)
+	  if ((info->emit_hash
+	       && !_bfd_elf_add_dynamic_entry (info, DT_HASH, 0))
+	      || (info->emit_gnu_hash
+		  && !_bfd_elf_add_dynamic_entry (info, DT_GNU_HASH, 0))
 	      || !_bfd_elf_add_dynamic_entry (info, DT_STRTAB, 0)
 	      || !_bfd_elf_add_dynamic_entry (info, DT_SYMTAB, 0)
 	      || !_bfd_elf_add_dynamic_entry (info, DT_STRSZ, strsize)
@@ -5726,8 +5835,6 @@ bfd_elf_size_dynsym_hash_dynstr (bfd *ou
       asection *s;
       bfd_size_type dynsymcount;
       unsigned long section_sym_count;
-      size_t bucketcount = 0;
-      size_t hash_entry_size;
       unsigned int dtagcount;
 
       dynobj = elf_hash_table (info)->dynobj;
@@ -5778,23 +5885,151 @@ bfd_elf_size_dynsym_hash_dynstr (bfd *ou
 	  memset (s->contents, 0, section_sym_count * bed->s->sizeof_sym);
 	}
 
+      elf_hash_table (info)->bucketcount = 0;
+
       /* Compute the size of the hashing table.  As a side effect this
 	 computes the hash values for all the names we export.  */
-      bucketcount = compute_bucket_count (info);
+      if (info->emit_hash)
+	{
+	  unsigned long int *hashcodes;
+	  unsigned long int *hashcodesp;
+	  bfd_size_type amt;
+	  unsigned long int nsyms;
+	  size_t bucketcount;
+	  size_t hash_entry_size;
+
+	  /* Compute the hash values for all exported symbols.  At the same
+	     time store the values in an array so that we could use them for
+	     optimizations.  */
+	  amt = dynsymcount * sizeof (unsigned long int);
+	  hashcodes = bfd_malloc (amt);
+	  if (hashcodes == NULL)
+	    return FALSE;
+	  hashcodesp = hashcodes;
 
-      s = bfd_get_section_by_name (dynobj, ".hash");
-      BFD_ASSERT (s != NULL);
-      hash_entry_size = elf_section_data (s)->this_hdr.sh_entsize;
-      s->size = ((2 + bucketcount + dynsymcount) * hash_entry_size);
-      s->contents = bfd_zalloc (output_bfd, s->size);
-      if (s->contents == NULL)
-	return FALSE;
+	  /* Put all hash values in HASHCODES.  */
+	  elf_link_hash_traverse (elf_hash_table (info),
+				  elf_collect_hash_codes, &hashcodesp);
 
-      bfd_put (8 * hash_entry_size, output_bfd, bucketcount, s->contents);
-      bfd_put (8 * hash_entry_size, output_bfd, dynsymcount,
-	       s->contents + hash_entry_size);
+	  nsyms = hashcodesp - hashcodes;
+	  bucketcount
+	    = compute_bucket_count (info, hashcodes, nsyms);
+	  free (hashcodes);
 
-      elf_hash_table (info)->bucketcount = bucketcount;
+	  if (bucketcount == 0)
+	    return FALSE;
+
+	  elf_hash_table (info)->bucketcount = bucketcount;
+
+	  s = bfd_get_section_by_name (dynobj, ".hash");
+	  BFD_ASSERT (s != NULL);
+	  hash_entry_size = elf_section_data (s)->this_hdr.sh_entsize;
+	  s->size = ((2 + bucketcount + dynsymcount) * hash_entry_size);
+	  s->contents = bfd_zalloc (output_bfd, s->size);
+	  if (s->contents == NULL)
+	    return FALSE;
+
+	  bfd_put (8 * hash_entry_size, output_bfd, bucketcount, s->contents);
+	  bfd_put (8 * hash_entry_size, output_bfd, dynsymcount,
+		   s->contents + hash_entry_size);
+	}
+
+      if (info->emit_gnu_hash)
+	{
+	  size_t i, loc, cnt;
+	  unsigned char *contents;
+	  struct collect_gnu_hash_codes cinfo;
+	  bfd_size_type amt;
+	  size_t bucketcount;
+
+	  memset (&cinfo, 0, sizeof (cinfo));
+
+	  /* Compute the hash values for all exported symbols.  At the same
+	     time store the values in an array so that we could use them for
+	     optimizations.  */
+	  amt = dynsymcount * 2 * sizeof (unsigned long int);
+	  cinfo.hashcodes = bfd_malloc (amt);
+	  if (cinfo.hashcodes == NULL)
+	    return FALSE;
+
+	  cinfo.hashval = cinfo.hashcodes + dynsymcount;
+	  cinfo.min_dynindx = -1;
+	  cinfo.output_bfd = output_bfd;
+
+	  /* Put all hash values in HASHCODES.  */
+	  elf_link_hash_traverse (elf_hash_table (info),
+				  elf_collect_gnu_hash_codes, &cinfo);
+
+	  bucketcount
+	    = compute_bucket_count (info, cinfo.hashcodes, cinfo.nsyms);
+
+	  if (bucketcount == 0)
+	    {
+	      free (cinfo.hashcodes);
+	      return FALSE;
+	    }
+
+	  amt = bucketcount * sizeof (unsigned long int) * 3;
+	  cinfo.counts = bfd_malloc (amt);
+	  if (cinfo.counts == NULL)
+	    {
+	      free (cinfo.hashcodes);
+	      return FALSE;
+	    }
+
+	  /* Determine how often each hash bucket is used.  */
+	  memset (cinfo.counts, 0, bucketcount * sizeof (cinfo.counts[0]));
+	  for (i = 0; i < cinfo.nsyms; ++i)
+	    ++cinfo.counts[cinfo.hashcodes[i] % bucketcount];
+
+	  s = bfd_get_section_by_name (dynobj, ".gnu.hash");
+	  BFD_ASSERT (s != NULL);
+	  cinfo.indx = cinfo.counts + bucketcount;
+	  cinfo.loc = cinfo.counts + 2 * bucketcount;
+	  for (i = 0, loc = 0, cnt = dynsymcount - cinfo.nsyms;
+	       i < bucketcount; ++i)
+	    if (cinfo.counts[i] != 0)
+	      {
+		cinfo.indx[i] = cnt;
+		cinfo.loc[i] = loc;
+		loc += 2 + cinfo.counts[i];
+		cnt += cinfo.counts[i];
+	      }
+	  BFD_ASSERT (cnt == dynsymcount);
+	  cinfo.bucketcount = bucketcount;
+	  cinfo.local_indx = cinfo.min_dynindx;
+
+	  s->size = (1 + bucketcount + loc) * 4;
+	  contents = bfd_zalloc (output_bfd, s->size);
+	  if (contents == NULL)
+	    {
+	      free (cinfo.counts);
+	      free (cinfo.hashcodes);
+	      return FALSE;
+	    }
+
+	  s->contents = contents;
+	  bfd_put_32 (output_bfd, bucketcount, contents);
+	  contents += 4;
+
+	  for (i = 0; i < bucketcount; ++i)
+	    {
+	      if (cinfo.counts[i] == 0)
+		bfd_put_32 (output_bfd, ~0, contents);
+	      else
+		bfd_put_32 (output_bfd, cinfo.loc[i], contents);
+	      contents += 4;
+	    }
+
+	  cinfo.contents = contents;
+
+	  /* Renumber dynamic symbols, populate .gnu.hash section.  */
+	  elf_link_hash_traverse (elf_hash_table (info),
+				  elf_renumber_gnu_hash_syms, &cinfo);
+
+	  free (cinfo.counts);
+	  free (cinfo.hashcodes);
+	}
 
       s = bfd_get_section_by_name (dynobj, ".dynstr");
       BFD_ASSERT (s != NULL);
@@ -6663,9 +6898,6 @@ elf_link_output_extsym (struct elf_link_
     {
       size_t bucketcount;
       size_t bucket;
-      size_t hash_entry_size;
-      bfd_byte *bucketpos;
-      bfd_vma chain;
       bfd_byte *esym;
 
       sym.st_name = h->dynstr_index;
@@ -6679,15 +6911,23 @@ elf_link_output_extsym (struct elf_link_
 
       bucketcount = elf_hash_table (finfo->info)->bucketcount;
       bucket = h->u.elf_hash_value % bucketcount;
-      hash_entry_size
-	= elf_section_data (finfo->hash_sec)->this_hdr.sh_entsize;
-      bucketpos = ((bfd_byte *) finfo->hash_sec->contents
-		   + (bucket + 2) * hash_entry_size);
-      chain = bfd_get (8 * hash_entry_size, finfo->output_bfd, bucketpos);
-      bfd_put (8 * hash_entry_size, finfo->output_bfd, h->dynindx, bucketpos);
-      bfd_put (8 * hash_entry_size, finfo->output_bfd, chain,
-	       ((bfd_byte *) finfo->hash_sec->contents
-		+ (bucketcount + 2 + h->dynindx) * hash_entry_size));
+
+      if (finfo->hash_sec != NULL)
+	{
+	  size_t hash_entry_size;
+	  bfd_byte *bucketpos;
+	  bfd_vma chain;
+
+	  hash_entry_size
+	    = elf_section_data (finfo->hash_sec)->this_hdr.sh_entsize;
+	  bucketpos = ((bfd_byte *) finfo->hash_sec->contents
+		       + (bucket + 2) * hash_entry_size);
+	  chain = bfd_get (8 * hash_entry_size, finfo->output_bfd, bucketpos);
+	  bfd_put (8 * hash_entry_size, finfo->output_bfd, h->dynindx, bucketpos);
+	  bfd_put (8 * hash_entry_size, finfo->output_bfd, chain,
+		   ((bfd_byte *) finfo->hash_sec->contents
+		    + (bucketcount + 2 + h->dynindx) * hash_entry_size));
+	}
 
       if (finfo->symver_sec != NULL && finfo->symver_sec->contents != NULL)
 	{
@@ -7861,7 +8101,7 @@ bfd_elf_final_link (bfd *abfd, struct bf
     {
       finfo.dynsym_sec = bfd_get_section_by_name (dynobj, ".dynsym");
       finfo.hash_sec = bfd_get_section_by_name (dynobj, ".hash");
-      BFD_ASSERT (finfo.dynsym_sec != NULL && finfo.hash_sec != NULL);
+      BFD_ASSERT (finfo.dynsym_sec != NULL);
       finfo.symver_sec = bfd_get_section_by_name (dynobj, ".gnu.version");
       /* Note that it is OK if symver_sec is NULL.  */
     }
@@ -8621,6 +8861,9 @@ bfd_elf_final_link (bfd *abfd, struct bf
 	    case DT_HASH:
 	      name = ".hash";
 	      goto get_vma;
+	    case DT_GNU_HASH:
+	      name = ".gnu.hash";
+	      goto get_vma;
 	    case DT_STRTAB:
 	      name = ".dynstr";
 	      goto get_vma;
--- include/elf/common.h.jj	2006-02-17 15:36:26.000000000 +0100
+++ include/elf/common.h	2006-06-22 10:43:21.000000000 +0200
@@ -338,6 +338,7 @@
 #define SHT_LOOS	0x60000000	/* First of OS specific semantics */
 #define SHT_HIOS	0x6fffffff	/* Last of OS specific semantics */
 
+#define SHT_GNU_HASH	0x6ffffff6	/* GNU style symbol hash table */
 #define SHT_GNU_LIBLIST	0x6ffffff7	/* List of prelink dependencies */
 
 /* The next three section types are defined by Solaris, and are named
@@ -577,6 +578,7 @@
 #define DT_VALRNGHI	0x6ffffdff
 
 #define DT_ADDRRNGLO	0x6ffffe00
+#define DT_GNU_HASH	0x6ffffef5
 #define DT_TLSDESC_PLT	0x6ffffef6
 #define DT_TLSDESC_GOT	0x6ffffef7
 #define DT_GNU_CONFLICT	0x6ffffef8
--- include/bfdlink.h.jj	2006-04-07 17:17:29.000000000 +0200
+++ include/bfdlink.h	2006-06-22 11:11:20.000000000 +0200
@@ -324,6 +324,12 @@ struct bfd_link_info
   /* TRUE if unreferenced sections should be removed.  */
   unsigned int gc_sections: 1;
 
+  /* TRUE if .hash section should be created.  */
+  unsigned int emit_hash: 1;
+
+  /* TRUE if .gnu.hash section should be created.  */
+  unsigned int emit_gnu_hash: 1;
+
   /* What to do with unresolved symbols in an object file.
      When producing executables the default is GENERATE_ERROR.
      When producing shared libraries the default is IGNORE.  The
--- binutils/readelf.c.jj	2006-05-30 16:13:54.000000000 +0200
+++ binutils/readelf.c	2006-06-27 11:58:34.000000000 +0200
@@ -135,6 +135,7 @@ static unsigned long dynamic_syminfo_off
 static unsigned int dynamic_syminfo_nent;
 static char program_interpreter[64];
 static bfd_vma dynamic_info[DT_JMPREL + 1];
+static bfd_vma dynamic_info_DT_GNU_HASH;
 static bfd_vma version_info[16];
 static Elf_Internal_Ehdr elf_header;
 static Elf_Internal_Shdr *section_headers;
@@ -1501,6 +1502,7 @@ get_dynamic_type (unsigned long type)
     case DT_GNU_CONFLICTSZ: return "GNU_CONFLICTSZ";
     case DT_GNU_LIBLIST: return "GNU_LIBLIST";
     case DT_GNU_LIBLISTSZ: return "GNU_LIBLISTSZ";
+    case DT_GNU_HASH:	return "GNU_HASH";
 
     default:
       if ((type >= DT_LOPROC) && (type <= DT_HIPROC))
@@ -2571,6 +2573,7 @@ get_section_type_name (unsigned int sh_t
     case SHT_INIT_ARRAY:	return "INIT_ARRAY";
     case SHT_FINI_ARRAY:	return "FINI_ARRAY";
     case SHT_PREINIT_ARRAY:	return "PREINIT_ARRAY";
+    case SHT_GNU_HASH:		return "GNU_HASH";
     case SHT_GROUP:		return "GROUP";
     case SHT_SYMTAB_SHNDX:	return "SYMTAB SECTION INDICIES";
     case SHT_GNU_verdef:	return "VERDEF";
@@ -6228,6 +6231,15 @@ process_dynamic_section (FILE *file)
 	    }
 	  break;
 
+	case DT_GNU_HASH:
+	  dynamic_info_DT_GNU_HASH = entry->d_un.d_val;
+	  if (do_dynamic)
+	    {
+	      print_vma (entry->d_un.d_val, PREFIX_HEX);
+	      putchar ('\n');
+	    }
+	  break;
+
 	default:
 	  if ((entry->d_tag >= DT_VERSYM) && (entry->d_tag <= DT_VERNEEDNUM))
 	    version_info[DT_VERSIONTAGIDX (entry->d_tag)] =
@@ -6903,6 +6915,9 @@ process_symbol_table (FILE *file)
   bfd_vma nchains = 0;
   bfd_vma *buckets = NULL;
   bfd_vma *chains = NULL;
+  bfd_vma ngnubuckets = 0;
+  bfd_vma *gnubuckets = NULL;
+  bfd_vma *gnuchains = NULL;
 
   if (! do_syms && !do_histogram)
     return 1;
@@ -7282,6 +7297,132 @@ process_symbol_table (FILE *file)
       free (chains);
     }
 
+  if (do_histogram && dynamic_info_DT_GNU_HASH)
+    {
+      unsigned char nb[4];
+      bfd_vma i, maxchain = 0xffffffff;
+      unsigned long *counts;
+      unsigned long hn;
+      unsigned long maxlength = 0;
+      unsigned long nzero_counts = 0;
+      unsigned long nsyms = 0;
+
+      if (fseek (file,
+		 (archive_file_offset
+		  + offset_from_vma (file, dynamic_info_DT_GNU_HASH,
+				     sizeof nb)),
+		 SEEK_SET))
+	{
+	  error (_("Unable to seek to start of dynamic information"));
+	  return 0;
+	}
+
+      if (fread (nb, 4, 1, file) != 1)
+	{
+	  error (_("Failed to read in number of buckets\n"));
+	  return 0;
+	}
+
+      ngnubuckets = byte_get (nb, 4);
+
+      gnubuckets = get_dynamic_data (file, ngnubuckets, 4);
+
+      if (gnubuckets == NULL)
+	return 0;
+
+      for (i = 0; i < ngnubuckets; i++)
+	if (gnubuckets[i] != 0xffffffff
+	    && (maxchain == 0xffffffff || gnubuckets[i] > maxchain))
+	  maxchain = gnubuckets[i];
+
+      if (maxchain == 0xffffffff)
+	return 0;
+
+      if (fseek (file,
+		 (archive_file_offset
+		  + offset_from_vma (file,
+				     dynamic_info_DT_GNU_HASH
+				     + 4 * (1 + ngnubuckets + maxchain + 1),
+				     sizeof nb)),
+		 SEEK_SET))
+	{
+	  error (_("Unable to seek to start of dynamic information"));
+	  return 0;
+	}
+
+      if (fread (nb, 4, 1, file) != 1)
+	{
+	  error (_("Failed to read last chain length\n"));
+	  return 0;
+	}
+
+      i = 2 + byte_get (nb, 4);
+      if (maxchain + i < maxchain)
+	return 0;
+      maxchain += i;
+
+      if (fseek (file,
+		 (archive_file_offset
+		  + offset_from_vma (file,
+				     dynamic_info_DT_GNU_HASH
+				     + 4 * (1 + ngnubuckets), sizeof nb)),
+		 SEEK_SET))
+	{
+	  error (_("Unable to seek to start of dynamic information"));
+	  return 0;
+	}
+
+      gnuchains = get_dynamic_data (file, maxchain, 4);
+
+      if (gnuchains == NULL)
+	return 0;
+
+      printf (_("\nHistogram for `.gnu.hash' bucket list length (total of %lu buckets):\n"),
+	      (unsigned long) ngnubuckets);
+      printf (_(" Length  Number     %% of total  Coverage\n"));
+
+      for (hn = 0; hn < ngnubuckets; ++hn)
+	if (gnubuckets[hn] != 0xffffffff)
+	  {
+	    bfd_vma length = gnuchains[gnubuckets[hn] + 1];
+	    if (length > maxlength)
+	      maxlength = length;
+	    nsyms += length;
+	  }
+
+      counts = calloc (maxlength + 1, sizeof (*counts));
+      if (counts == NULL)
+	{
+	  error (_("Out of memory"));
+	  return 0;
+	}
+
+      for (hn = 0; hn < ngnubuckets; ++hn)
+	if (gnubuckets[hn] != 0xffffffff)
+	  ++counts[gnuchains[gnubuckets[hn] + 1]];
+	else
+	  ++counts[0];
+
+      if (ngnubuckets > 0)
+	{
+	  unsigned long j;
+	  printf ("      0  %-10lu (%5.1f%%)\n",
+		  counts[0], (counts[0] * 100.0) / ngnubuckets);
+	  for (j = 1; j <= maxlength; ++j)
+	    {
+	      nzero_counts += counts[j] * j;
+	      printf ("%7lu  %-10lu (%5.1f%%)    %5.1f%%\n",
+		      j, counts[j], (counts[j] * 100.0) / ngnubuckets,
+		      (nzero_counts * 100.0) / nsyms);
+	    }
+	}
+
+      free (counts);
+
+      free (gnubuckets);
+      free (gnuchains);
+    }
+
   return 1;
 }
 
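
For anyone reading the readelf histogram code above without the section layout in front of them, here is a minimal sketch of the same bucket-length counting on an in-memory copy of the section. It assumes the layout described in this posting (nbuckets word, nbuckets chain offsets, then chains of symindx, length, hash values). The names sketch_chain_length, sketch_histogram and sketch_section are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EMPTY_BUCKET 0xffffffffu  /* Bucket with no defined symbols.  */

/* Return the chain length recorded for bucket B, or 0 if the bucket
   is empty.  SECT points at the first word of the section.  */
static uint32_t
sketch_chain_length (const uint32_t *sect, uint32_t b)
{
  uint32_t nbuckets = sect[0];
  assert (b < nbuckets);
  uint32_t off = sect[1 + b];
  if (off == EMPTY_BUCKET)
    return 0;
  /* The chain starts at word 1 + nbuckets + off; its second word is
     the length.  */
  return sect[1 + nbuckets + off + 1];
}

/* Set COUNTS[len] to the number of buckets whose chain holds LEN
   symbols and return the total number of hashed symbols.  */
static uint32_t
sketch_histogram (const uint32_t *sect, uint32_t *counts, size_t ncounts)
{
  uint32_t nsyms = 0;
  for (size_t i = 0; i < ncounts; ++i)
    counts[i] = 0;
  for (uint32_t b = 0; b < sect[0]; ++b)
    {
      uint32_t len = sketch_chain_length (sect, b);
      assert (len < ncounts);
      ++counts[len];
      nsyms += len;
    }
  return nsyms;
}

/* Hand-made example: 3 buckets, bucket 1 empty.
   Bucket 0: symindx 1, 2 symbols with hashes 3 and 6 (both % 3 == 0).
   Bucket 2: symindx 3, 1 symbol with hash 5 (5 % 3 == 2).  */
static const uint32_t sketch_section[] = {
  3,                    /* nbuckets */
  0, EMPTY_BUCKET, 4,   /* chain offsets for buckets 0..2 */
  1, 2, 3, 6,           /* chain at offset 0 */
  3, 1, 5               /* chain at offset 4 */
};
```

Calling sketch_histogram (sketch_section, counts, 4) on the example data yields one empty bucket, one bucket of length 1 and one of length 2, for 3 symbols in total.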
-------------- next part --------------
2006-06-28  Ulrich Drepper  <drepper@redhat.com>

	* elf/dl-lookup.c (dl_new_hash): New function.
	(_dl_lookup_symbol_x): Rename hash to old_hash and don't compute
	value here.  Compute new-style hash value.  Pass new hash value
	and reference to variable with the old value to do_lookup_x.
	(_dl_setup_hash): If DT_GNU_HASH is defined, use it and not
	old-style hash table.
	(_dl_debug_bindings): Pass new hash value and reference to variable
	with the old value to do_lookup_x.
	* elf/do-lookup.h (do_lookup_x): Accept additional parameter with
	new-style hash value and change old-style hash value parameter to
	be a reference.  Reorganize functions to determine whether
	new-style hash table is available.  Only fall back on old-style
	table.  If old-style hash value is needed, compute it here.
	* elf/dynamic-link.h (elf_get_dynamic_info): Relocate DT_GNU_HASH
	entry.
	* elf/elf.h: Define SHT_GNU_HASH, DT_GNU_HASH, DT_TLSDESC_PLT,
	DT_TLSDESC_GOT.  Adjust DT_ADDRNUM.
	* include/link.h (struct link_map): Add l_gnu_buckets and
	l_gnu_hashbase.
	* Makeconfig: If linker supports --hash-style option add it to all
	linker command lines to build DSOs.
	* config.make.in: Define have-hash-style.
	* configure.in: Test whether linker supports --hash-style option.
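
The new-style hash mentioned in the dl_new_hash entry above is small enough to try standalone. The following is only a sketch of the same recurrence (seed 5381, h = h * 33 + c, result truncated to 32 bits); the name sketch_gnu_hash is hypothetical and the function is written against uint32_t rather than uint_fast32_t so that no final mask is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same recurrence as dl_new_hash in the patch.  Arithmetic on
   uint32_t wraps modulo 2^32, which matches the "& 0xffffffff"
   applied to the uint_fast32_t result there.  */
static uint32_t
sketch_gnu_hash (const char *s)
{
  assert (s != NULL);
  uint32_t h = 5381;
  for (const unsigned char *p = (const unsigned char *) s; *p != '\0'; ++p)
    h = h * 33 + *p;
  return h;
}
```

With this definition the empty string hashes to the seed, 5381, and "exit" hashes to 0x7c967e3f.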

--- libc/Makeconfig.~1.318.~	2006-05-15 11:21:55.000000000 -0700
+++ libc/Makeconfig	2006-06-26 13:51:15.000000000 -0700
@@ -413,6 +413,15 @@
 LDFLAGS-rtld += $(relro-LDFLAGS)
 endif
 
+ifeq (yes,$(have-hash-style))
+# For the time being we unconditionally use 'both'.  At some time we
+# should declare statically linked code as 'out of luck' and compile
+# with --hash-style=gnu only.
+hashstyle-LDFLAGS = -Wl,--hash-style=both
+LDFLAGS.so += $(hashstyle-LDFLAGS)
+LDFLAGS-rtld += $(hashstyle-LDFLAGS)
+endif
+
 # Command for linking programs with the C library.
 ifndef +link
 +link = $(CC) -nostdlib -nostartfiles -o $@ \
--- libc/config.make.in.~1.118.~	2006-04-26 08:17:41.000000000 -0700
+++ libc/config.make.in	2006-06-26 13:49:14.000000000 -0700
@@ -65,6 +65,7 @@
 have-cc-with-libunwind = @libc_cv_cc_with_libunwind@
 fno-unit-at-a-time = @fno_unit_at_a_time@
 bind-now = @bindnow@
+have-hash-style = @libc_cv_hashstyle@
 
 static-libgcc = @libc_cv_gcc_static_libgcc@
 
--- libc/configure.in.~1.460.~	2006-04-26 08:19:22.000000000 -0700
+++ libc/configure.in	2006-06-26 13:47:01.000000000 -0700
@@ -1589,6 +1589,22 @@
   rm -f conftest*])
 
   AC_SUBST(libc_cv_fpie)
+
+  AC_CACHE_CHECK(for --hash-style option,
+		 libc_cv_hashstyle, [dnl
+  cat > conftest.c <<EOF
+int _start (void) { return 42; }
+EOF
+  if AC_TRY_COMMAND([${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS
+			      -fPIC -shared -o conftest.so conftest.c
+			      -Wl,--hash-style=both -nostdlib 1>&AS_MESSAGE_LOG_FD])
+  then
+    libc_cv_hashstyle=yes
+  else
+    libc_cv_hashstyle=no
+  fi
+  rm -f conftest*])
+  AC_SUBST(libc_cv_hashstyle)
 fi
 
 AC_CACHE_CHECK(for -fno-toplevel-reorder, libc_cv_fno_toplevel_reorder, [dnl
--- libc/elf/dl-lookup.c.jj	2005-12-27 11:56:45.000000000 +0100
+++ libc/elf/dl-lookup.c	2006-06-27 10:12:22.000000000 +0200
@@ -1,5 +1,5 @@
 /* Look up a symbol in the loaded objects.
-   Copyright (C) 1995-2002, 2003, 2004, 2005 Free Software Foundation, Inc.
+   Copyright (C) 1995-2005, 2006 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -72,6 +72,16 @@ struct sym_val
 #include "do-lookup.h"
 
 
+static uint_fast32_t
+dl_new_hash (const char *s)
+{
+  uint_fast32_t h = 5381;
+  for (unsigned char c = *s; c != '\0'; c = *++s)
+    h = h * 33 + c;
+  return h & 0xffffffff;
+}
+
+
 /* Add extra dependency on MAP to UNDEF_MAP.  */
 static int
 internal_function
@@ -206,7 +216,8 @@ _dl_lookup_symbol_x (const char *undef_n
 		     const struct r_found_version *version,
 		     int type_class, int flags, struct link_map *skip_map)
 {
-  const unsigned long int hash = _dl_elf_hash (undef_name);
+  const uint_fast32_t new_hash = dl_new_hash (undef_name);
+  unsigned long int old_hash = 0xffffffff;
   struct sym_val current_value = { NULL, NULL };
   struct r_scope_elem **scope = symbol_scope;
 
@@ -229,8 +240,9 @@ _dl_lookup_symbol_x (const char *undef_n
   /* Search the relevant loaded objects for a definition.  */
   for (size_t start = i; *scope != NULL; start = 0, ++scope)
     {
-      int res = do_lookup_x (undef_name, hash, *ref, &current_value, *scope,
-			     start, version, flags, skip_map, type_class);
+      int res = do_lookup_x (undef_name, new_hash, &old_hash, *ref,
+			     &current_value, *scope, start, version, flags,
+			     skip_map, type_class);
       if (res > 0)
 	break;
 
@@ -301,9 +313,9 @@ _dl_lookup_symbol_x (const char *undef_n
 	  struct sym_val protected_value = { NULL, NULL };
 
 	  for (scope = symbol_scope; *scope != NULL; i = 0, ++scope)
-	    if (do_lookup_x (undef_name, hash, *ref, &protected_value,
-			     *scope, i, version, flags, skip_map,
-			     ELF_RTYPE_CLASS_PLT) != 0)
+	    if (do_lookup_x (undef_name, new_hash, &old_hash, *ref,
+			     &protected_value, *scope, i, version, flags,
+			     skip_map, ELF_RTYPE_CLASS_PLT) != 0)
 	      break;
 
 	  if (protected_value.s != NULL && protected_value.m != undef_map)
@@ -352,6 +364,21 @@ _dl_setup_hash (struct link_map *map)
   Elf_Symndx *hash;
   Elf_Symndx nchain;
 
+  if (__builtin_expect (map->l_info[DT_ADDRTAGIDX (DT_GNU_HASH) + DT_NUM
+  				    + DT_THISPROCNUM + DT_VERSIONTAGNUM
+				    + DT_EXTRANUM + DT_VALNUM] != NULL, 1))
+    {
+      Elf32_Word *hash32
+	= (void *) D_PTR (map, l_info[DT_ADDRTAGIDX (DT_GNU_HASH) + DT_NUM
+				      + DT_THISPROCNUM + DT_VERSIONTAGNUM
+				      + DT_EXTRANUM + DT_VALNUM]);
+      map->l_nbuckets = *hash32++;
+      map->l_gnu_buckets = hash32;
+      hash32 += map->l_nbuckets;
+      map->l_gnu_hashbase = hash32;
+      return;
+    }
+
   if (!map->l_info[DT_HASH])
     return;
   hash = (void *) D_PTR (map, l_info[DT_HASH]);
@@ -399,9 +426,10 @@ _dl_debug_bindings (const char *undef_na
 	   || GLRO(dl_trace_prelink_map) == GL(dl_ns)[LM_ID_BASE]._ns_loaded)
 	  && undef_map != GL(dl_ns)[LM_ID_BASE]._ns_loaded)
 	{
-	  const unsigned long int hash = _dl_elf_hash (undef_name);
+	  const uint_fast32_t new_hash = dl_new_hash (undef_name);
+	  unsigned long int old_hash = 0xffffffff;
 
-	  do_lookup_x (undef_name, hash, *ref, &val,
+	  do_lookup_x (undef_name, new_hash, &old_hash, *ref, &val,
 		       undef_map->l_local_scope[0], 0, version, 0, NULL,
 		       type_class);
 
--- libc/elf/do-lookup.h.jj	2006-02-28 15:13:56.000000000 +0100
+++ libc/elf/do-lookup.h	2006-06-27 10:08:03.000000000 +0200
@@ -17,14 +17,15 @@
    Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
    02111-1307 USA.  */
 
+
 /* Inner part of the lookup functions.  We return a value > 0 if we
    found the symbol, the value 0 if nothing is found and < 0 if
    something bad happened.  */
 static int
 __attribute_noinline__
-do_lookup_x (const char *undef_name, unsigned long int hash,
-	     const ElfW(Sym) *ref, struct sym_val *result,
-	     struct r_scope_elem *scope, size_t i,
+do_lookup_x (const char *undef_name, uint_fast32_t new_hash,
+	     unsigned long int *old_hash, const ElfW(Sym) *ref,
+	     struct sym_val *result, struct r_scope_elem *scope, size_t i,
 	     const struct r_found_version *const version, int flags,
 	     struct link_map *skip, int type_class)
 {
@@ -67,105 +68,142 @@ do_lookup_x (const char *undef_name, uns
       strtab = (const void *) D_PTR (map, l_info[DT_STRTAB]);
       verstab = map->l_versyms;
 
-      /* Search the appropriate hash bucket in this object's symbol table
-	 for a definition for the same symbol name.  */
-      for (symidx = map->l_buckets[hash % map->l_nbuckets];
-	   symidx != STN_UNDEF;
-	   symidx = map->l_chain[symidx])
-	{
-	  sym = &symtab[symidx];
-
-	  assert (ELF_RTYPE_CLASS_PLT == 1);
-	  if ((sym->st_value == 0 /* No value.  */
+      const ElfW(Sym) *
+      __attribute_noinline__
+      check_match (const ElfW(Sym) *sym)
+      {
+	assert (ELF_RTYPE_CLASS_PLT == 1);
+	if (__builtin_expect ((sym->st_value == 0 /* No value.  */
 #ifdef USE_TLS
-	       && ELFW(ST_TYPE) (sym->st_info) != STT_TLS
+			       && ELFW(ST_TYPE) (sym->st_info) != STT_TLS
 #endif
-	       )
-	      || (type_class & (sym->st_shndx == SHN_UNDEF)))
-	    continue;
+			       )
+			      || (type_class & (sym->st_shndx == SHN_UNDEF)),
+			      0))
+	  return NULL;
 
-	  if (ELFW(ST_TYPE) (sym->st_info) > STT_FUNC
+	if (__builtin_expect (ELFW(ST_TYPE) (sym->st_info) > STT_FUNC
 #ifdef USE_TLS
-	      && ELFW(ST_TYPE) (sym->st_info) != STT_TLS
+			      && ELFW(ST_TYPE) (sym->st_info) != STT_TLS
 #endif
-	      )
-	    /* Ignore all but STT_NOTYPE, STT_OBJECT and STT_FUNC
-	       entries (and STT_TLS if TLS is supported) since these
-	       are no code/data definitions.  */
-	    continue;
-
-	  if (sym != ref && strcmp (strtab + sym->st_name, undef_name))
-	    /* Not the symbol we are looking for.  */
-	    continue;
+			      , 0))
+	  /* Ignore all but STT_NOTYPE, STT_OBJECT and STT_FUNC
+	     entries (and STT_TLS if TLS is supported) since these
+	     are no code/data definitions.  */
+	  return NULL;
+
+	if (sym != ref && strcmp (strtab + sym->st_name, undef_name))
+	  /* Not the symbol we are looking for.  */
+	  return NULL;
+
+	if (version != NULL)
+	  {
+	    if (__builtin_expect (verstab == NULL, 0))
+	      {
+		/* We need a versioned symbol but haven't found any.  If
+		   this is the object which is referenced in the verneed
+		   entry it is a bug in the library since a symbol must
+		   not simply disappear.
+
+		   It would also be a bug in the object since it means that
+		   the list of required versions is incomplete and so the
+		   tests in dl-version.c haven't found a problem.*/
+		assert (version->filename == NULL
+			|| ! _dl_name_match_p (version->filename, map));
+
+		/* Otherwise we accept the symbol.  */
+	      }
+	    else
+	      {
+		/* We can match the version information or use the
+		   default one if it is not hidden.  */
+		ElfW(Half) ndx = verstab[symidx] & 0x7fff;
+		if ((map->l_versions[ndx].hash != version->hash
+		     || strcmp (map->l_versions[ndx].name, version->name))
+		    && (version->hidden || map->l_versions[ndx].hash
+			|| (verstab[symidx] & 0x8000)))
+		  /* It's not the version we want.  */
+		  return NULL;
+	      }
+	  }
+	else
+	  {
+	    /* No specific version is selected.  There are two ways we
+	       can get here:
+
+	       - a binary which does not include versioning information
+	       is loaded
+
+	       - dlsym() instead of dlvsym() is used to get a symbol which
+	       might exist in more than one form
+
+	       If the library does not provide symbol version information
+	       there is no problem at all: we simply use the symbol if it
+	       is defined.
+
+	       These two lookups need to be handled differently if the
+	       library defines versions.  In the case of the old
+	       unversioned application the oldest (default) version
+	       should be used.  In case of a dlsym() call the latest and
+	       public interface should be returned.  */
+	    if (verstab != NULL)
+	      {
+		if ((verstab[symidx] & 0x7fff)
+		    >= ((flags & DL_LOOKUP_RETURN_NEWEST) ? 2 : 3))
+		  {
+		    /* Don't accept hidden symbols.  */
+		    if ((verstab[symidx] & 0x8000) == 0
+			&& num_versions++ == 0)
+		      /* No version so far.  */
+		      versioned_sym = sym;
+
+		    return NULL;
+		  }
+	      }
+	  }
+
+	/* There cannot be another entry for this symbol so stop here.  */
+	return sym;
+      }
 
-	  if (version != NULL)
-	    {
-	      if (__builtin_expect (verstab == NULL, 0))
-		{
-		  /* We need a versioned symbol but haven't found any.  If
-		     this is the object which is referenced in the verneed
-		     entry it is a bug in the library since a symbol must
-		     not simply disappear.
-
-		     It would also be a bug in the object since it means that
-		     the list of required versions is incomplete and so the
-		     tests in dl-version.c haven't found a problem.*/
-		  assert (version->filename == NULL
-			  || ! _dl_name_match_p (version->filename, map));
+      if (__builtin_expect (map->l_gnu_buckets != NULL, 1))
+	{
+	  /* Use the new GNU-style hash table.  */
+	  size_t chainoff = map->l_gnu_buckets[new_hash % map->l_nbuckets];
 
-		  /* Otherwise we accept the symbol.  */
-		}
-	      else
-		{
-		  /* We can match the version information or use the
-		     default one if it is not hidden.  */
-		  ElfW(Half) ndx = verstab[symidx] & 0x7fff;
-		  if ((map->l_versions[ndx].hash != version->hash
-		       || strcmp (map->l_versions[ndx].name, version->name))
-		      && (version->hidden || map->l_versions[ndx].hash
-			  || (verstab[symidx] & 0x8000)))
-		    /* It's not the version we want.  */
-		    continue;
-		}
-	    }
-	  else
+	  if (chainoff != 0xffffffff)
 	    {
-	      /* No specific version is selected.  There are two ways we
-		 can got here:
-
-		 - a binary which does not include versioning information
-		   is loaded
-
-		 - dlsym() instead of dlvsym() is used to get a symbol which
-		   might exist in more than one form
-
-		 If the library does not provide symbol version
-		 information there is no problem at at: we simply use the
-		 symbol if it is defined.
-
-		 These two lookups need to be handled differently if the
-		 library defines versions.  In the case of the old
-		 unversioned application the oldest (default) version
-		 should be used.  In case of a dlsym() call the latest and
-		 public interface should be returned.  */
-	      if (verstab != NULL)
+	      size_t n = map->l_gnu_hashbase[chainoff + 1];
+	      do
 		{
-		  if ((verstab[symidx] & 0x7fff)
-		      >= ((flags & DL_LOOKUP_RETURN_NEWEST) ? 2 : 3))
+		  --n;
+		  if (map->l_gnu_hashbase[chainoff + 2 + n] == new_hash)
 		    {
-		      /* Don't accept hidden symbols.  */
-		      if ((verstab[symidx] & 0x8000) == 0
-			  && num_versions++ == 0)
-			/* No version so far.  */
-			versioned_sym = sym;
-
-		      continue;
+		      symidx = map->l_gnu_hashbase[chainoff] + n;
+		      sym = check_match (&symtab[symidx]);
+		      if (sym != NULL)
+			goto found_it;
 		    }
 		}
+	      while (n > 0);
 	    }
+	}
+      else
+	{
+	  if (*old_hash == 0xffffffff)
+	    *old_hash = _dl_elf_hash (undef_name);
 
-	  /* There cannot be another entry for this symbol so stop here.  */
-	  goto found_it;
+	  /* Use the old SysV-style hash table.  Search the appropriate
+	     hash bucket in this object's symbol table for a definition
+	     for the same symbol name.  */
+	  for (symidx = map->l_buckets[*old_hash % map->l_nbuckets];
+	       symidx != STN_UNDEF;
+	       symidx = map->l_chain[symidx])
+	    {
+	      sym = check_match (&symtab[symidx]);
+	      if (sym != NULL)
+		goto found_it;
+	    }
 	}
 
       /* If we have seen exactly one versioned symbol while we are
--- libc/elf/dynamic-link.h	15 Mar 2005 22:57:25 -0000	1.54
+++ libc/elf/dynamic-link.h	26 Jun 2006 20:54:51 -0000
@@ -1,5 +1,5 @@
 /* Inline functions for dynamic linking.
-   Copyright (C) 1995-2002, 2003, 2004, 2005 Free Software Foundation, Inc.
+   Copyright (C) 1995-2005, 2006 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -143,6 +143,8 @@
 # endif
       ADJUST_DYN_INFO (DT_JMPREL);
       ADJUST_DYN_INFO (VERSYMIDX (DT_VERSYM));
+      ADJUST_DYN_INFO (DT_ADDRTAGIDX (DT_GNU_HASH) + DT_NUM + DT_THISPROCNUM
+		       + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM);
 # undef ADJUST_DYN_INFO
       assert (cnt <= DL_RO_DYN_TEMP_CNT);
     }
--- libc/elf/elf.h	25 Feb 2006 01:57:44 -0000	1.154
+++ libc/elf/elf.h	26 Jun 2006 20:55:48 -0000
@@ -329,7 +329,8 @@
 #define SHT_GROUP	  17		/* Section group */
 #define SHT_SYMTAB_SHNDX  18		/* Extended section indeces */
 #define	SHT_NUM		  19		/* Number of defined types.  */
-#define SHT_LOOS	  0x60000000	/* Start OS-specific */
+#define SHT_LOOS	  0x60000000	/* Start OS-specific.  */
+#define SHT_GNU_HASH	  0x6ffffff6	/* GNU-style hash table.  */
 #define SHT_GNU_LIBLIST	  0x6ffffff7	/* Prelink library list */
 #define SHT_CHECKSUM	  0x6ffffff8	/* Checksum for DSO content.  */
 #define SHT_LOSUNW	  0x6ffffffa	/* Sun-specific low bound.  */
@@ -699,6 +700,9 @@
    If any adjustment is made to the ELF object after it has been
    built these entries will need to be adjusted.  */
 #define DT_ADDRRNGLO	0x6ffffe00
+#define DT_GNU_HASH	0x6ffffef5	/* GNU-style hash table.  */
+#define DT_TLSDESC_PLT	0x6ffffef6
+#define DT_TLSDESC_GOT	0x6ffffef7
 #define DT_GNU_CONFLICT	0x6ffffef8	/* Start of conflict section */
 #define DT_GNU_LIBLIST	0x6ffffef9	/* Library list */
 #define DT_CONFIG	0x6ffffefa	/* Configuration information.  */
@@ -709,7 +713,7 @@
 #define DT_SYMINFO	0x6ffffeff	/* Syminfo table.  */
 #define DT_ADDRRNGHI	0x6ffffeff
 #define DT_ADDRTAGIDX(tag)	(DT_ADDRRNGHI - (tag))	/* Reverse order! */
-#define DT_ADDRNUM 10
+#define DT_ADDRNUM 11
 
 /* The versioning entry types.  The next are defined as part of the
    GNU extension.  */
--- libc/include/link.h	1 Mar 2006 06:18:30 -0000	1.38
+++ libc/include/link.h	26 Jun 2006 20:56:35 -0000
@@ -124,7 +124,7 @@
     const ElfW(Phdr) *l_phdr;	/* Pointer to program header table in core.  */
     ElfW(Addr) l_entry;		/* Entry point location.  */
     ElfW(Half) l_phnum;		/* Number of program header entries.  */
-    ElfW(Half) l_ldnum;	/* Number of dynamic segment entries.  */
+    ElfW(Half) l_ldnum;		/* Number of dynamic segment entries.  */
 
     /* Array of DT_NEEDED dependencies and their dependencies, in
        dependency order for symbol lookup (with and without
@@ -141,7 +141,13 @@
 
     /* Symbol hash table.  */
     Elf_Symndx l_nbuckets;
-    const Elf_Symndx *l_buckets, *l_chain;
+    const Elf32_Word *l_gnu_buckets;
+    union
+    {
+      const Elf32_Word *l_gnu_hashbase;
+      const Elf_Symndx *l_chain;
+    };
+    const Elf_Symndx *l_buckets;
 
     unsigned int l_direct_opencount; /* Reference count for dlopen/dlclose.  */
     enum			/* Where this object came from.  */
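
To make the new lookup path in do_lookup_x easier to follow outside of ld.so, here is a rough standalone sketch of the chain walk, assuming the section layout from this posting. The helper names (sketch_hash, sketch_lookup) and the NAMES array standing in for .dynsym/.dynstr are invented for the example; version checks and the other tests done by check_match are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EMPTY_BUCKET 0xffffffffu  /* Bucket with no defined symbols.  */

/* Same recurrence as dl_new_hash in the patch.  */
static uint32_t
sketch_hash (const char *s)
{
  uint32_t h = 5381;
  for (const unsigned char *p = (const unsigned char *) s; *p != '\0'; ++p)
    h = h * 33 + *p;
  return h;
}

/* Return the dynamic symbol index of NAME, or 0 (STN_UNDEF) if the
   table has no entry for it.  SECT is the section contents, NAMES[i]
   is the name of dynamic symbol I.  */
static uint32_t
sketch_lookup (const uint32_t *sect, const char *const *names,
	       const char *name)
{
  uint32_t nbuckets = sect[0];
  const uint32_t *buckets = sect + 1;
  const uint32_t *hashbase = buckets + nbuckets;
  uint32_t hash = sketch_hash (name);
  uint32_t chainoff = buckets[hash % nbuckets];

  assert (nbuckets != 0);
  if (chainoff == EMPTY_BUCKET)
    return 0;

  uint32_t symindx0 = hashbase[chainoff];
  uint32_t n = hashbase[chainoff + 1];
  /* Walk the stored hash values from the last one down to the first,
     as the patch does.  Only on a hash match is the symbol table
     touched for the string compare.  */
  while (n-- > 0)
    if (hashbase[chainoff + 2 + n] == hash
	&& strcmp (names[symindx0 + n], name) == 0)
      return symindx0 + n;
  return 0;
}
```

The point of the layout shows in the loop: the hash words of one chain are contiguous, so rejecting non-matching symbols reads only consecutive words of the section and never the symbol or string table.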
-------------- next part --------------
--- prelink/src/prelink.h.jj	2006-06-20 15:12:16.000000000 +0200
+++ prelink/src/prelink.h	2006-06-27 21:50:11.000000000 +0200
@@ -44,6 +44,11 @@
 #define SHT_GNU_LIBLIST		0x6ffffff7
 #endif
 
+#ifndef DT_GNU_HASH
+#define DT_GNU_HASH		0x6ffffef5
+#define SHT_GNU_HASH		0x6ffffff6
+#endif
+
 struct prelink_entry;
 struct prelink_info;
 struct PLArch;
@@ -75,6 +80,7 @@ typedef struct
   GElf_Addr info_DT_GNU_PRELINKED;
   GElf_Addr info_DT_CHECKSUM;
   GElf_Addr info_DT_VERNEED, info_DT_VERDEF, info_DT_VERSYM;
+  GElf_Addr info_DT_GNU_HASH;
 #define DT_GNU_PRELINKED_BIT 50
 #define DT_CHECKSUM_BIT 51
 #define DT_VERNEED_BIT 52
@@ -83,6 +89,7 @@ typedef struct
 #define DT_FILTER_BIT 55
 #define DT_AUXILIARY_BIT 56
 #define DT_LOPROC_BIT 57
+#define DT_GNU_HASH_BIT 58
   uint64_t info_set_mask;
   int fd, fdro;
   int lastscn, dynamic;
--- prelink/src/prelink.c.jj	2005-06-10 17:09:06.000000000 +0200
+++ prelink/src/prelink.c	2006-06-27 21:53:20.000000000 +0200
@@ -1,4 +1,4 @@
-/* Copyright (C) 2001, 2002, 2003, 2004, 2005 Red Hat, Inc.
+/* Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
    Written by Jakub Jelinek <jakub@redhat.com>, 2001.
 
    This program is free software; you can redistribute it and/or modify
@@ -424,6 +424,7 @@ prelink_prepare (DSO *dso)
 	switch (dso->shdr[i].sh_type)
 	  {
 	  case SHT_HASH:
+	  case SHT_GNU_HASH:
 	  case SHT_DYNSYM:
 	  case SHT_REL:
 	  case SHT_RELA:
--- prelink/src/dso.c.jj	2006-06-21 11:46:34.000000000 +0200
+++ prelink/src/dso.c	2006-06-27 21:51:01.000000000 +0200
@@ -102,6 +102,11 @@ read_dynamic (DSO *dso)
 		  dso->info_set_mask |= (1ULL << DT_AUXILIARY_BIT);
 		else if (dyn.d_tag == DT_LOPROC)
 		  dso->info_set_mask |= (1ULL << DT_LOPROC_BIT);
+		else if (dyn.d_tag == DT_GNU_HASH)
+		  {
+		    dso->info_DT_GNU_HASH = dyn.d_un.d_val;
+		    dso->info_set_mask |= (1ULL << DT_GNU_HASH_BIT);
+		  }
 	      }
 	    if (ndx < maxndx)
 	      break;
@@ -1361,6 +1366,7 @@ adjust_dso (DSO *dso, GElf_Addr start, G
 	    return 1;
 	  break;
 	case SHT_HASH:
+	case SHT_GNU_HASH:
 	case SHT_NOBITS:
 	case SHT_STRTAB:
 	  break;
--- prelink/src/exec.c.jj	2006-05-22 16:33:43.000000000 +0200
+++ prelink/src/exec.c	2006-06-27 21:52:39.000000000 +0200
@@ -61,7 +61,11 @@ update_dynamic_tags (DSO *dso, GElf_Shdr
 	  || (dynamic_info_is_set (dso, DT_VERSYM_BIT)
 	      && dso->info_DT_VERSYM == old_shdr[j].sh_addr
 	      && old_shdr[j].sh_type == SHT_GNU_versym
-	      && set_dynamic (dso, DT_VERSYM, shdr[i].sh_addr, 1)))
+	      && set_dynamic (dso, DT_VERSYM, shdr[i].sh_addr, 1))
+	  || (dynamic_info_is_set (dso, DT_GNU_HASH_BIT)
+	      && dso->info_DT_GNU_HASH == old_shdr[j].sh_addr
+	      && old_shdr[j].sh_type == SHT_GNU_HASH
+	      && set_dynamic (dso, DT_GNU_HASH, shdr[i].sh_addr, 1)))
 	return 1;
     }
 
--- prelink/src/space.c.jj	2006-05-22 16:27:00.000000000 +0200
+++ prelink/src/space.c	2006-06-27 21:48:06.000000000 +0200
@@ -60,6 +60,7 @@ print_sections (DSO *dso, GElf_Ehdr *ehd
       { SHT_GNU_verneed, "VERNEED" },
       { SHT_GNU_versym, "VERSYM" },
       { SHT_GNU_LIBLIST, "LIBLIST" },
+      { SHT_GNU_HASH, "GNU_HASH" },
       { 0, NULL }
     };
 
@@ -181,6 +182,7 @@ readonly_is_movable (DSO *dso, GElf_Ehdr
   switch (shdr[k].sh_type)
     {
     case SHT_HASH:
+    case SHT_GNU_HASH:
     case SHT_DYNSYM:
     case SHT_REL:
     case SHT_RELA:
@@ -527,6 +529,7 @@ find_readonly_space (DSO *dso, GElf_Shdr
 	      switch (shdr[j].sh_type)
 		{
 		case SHT_HASH:
+		case SHT_GNU_HASH:
 		case SHT_DYNSYM:
 		case SHT_STRTAB:
 		case SHT_GNU_verdef:
-------------- next part --------------
cat > a.c <<\EOF
#include <dlfcn.h>
const char *libs[] = {
"libvclplug_gtk680lx.so", "libvclplug_gen680lx.so", "libnss_files.so.2", "libGL.so.1", "servicemgr.uno.so",
"shlibloader.uno.so", "simplereg.uno.so", "nestedreg.uno.so", "typemgr.uno.so", "implreg.uno.so",
"security.uno.so", "libreg.so.3", "libstore.so.3", "regtypeprov.uno.so", "configmgr2.uno.so",
"typeconverter.uno.so", "gconfbe1.uno.so", "behelper.uno.so", "sax.uno.so", "localebe1.uno.so",
"uriproc.uno.so", "libspl680lx.so", "libucb1.so", "ucpgvfs1.uno.so", "libgcc3_uno.so", "libpackage2.so",
"libfileacc.so", "libuui680lx.so", "libfilterconfig1.so", "libdtransX11680lx.so", "i18npool.uno.so",
"liblocaledata_en.so", "fsstorage.uno.so", "libxstor.so", "libdbtools680lx.so", "libcups.so.2",
"libgnutls.so.13", "libgcrypt.so.11", "libgpg-error.so.0", "libmcnttype.so", "libucpchelp1.so",
"svtmisc.uno.so" };
int
main (int argc, char **argv)
{
  int i;
  void *h;
  int flags = RTLD_LAZY;
  if (argv[1][0] == 'g')
    flags |= RTLD_GLOBAL;
  for (i = 0; i < sizeof (libs) / sizeof (libs[0]); ++i)
    h = dlopen (libs[i], flags);
  return 0;
}
EOF
gcc -g -O2 -o a a.c -Wl,-rpath,/usr/lib64/openoffice.org2.0/program/ \
  -L/usr/lib64/openoffice.org2.0/program/ -lsoffice -lsw680lx -lsvx680lx -lstdc++ -lm -shared-libgcc
for V in local global; do for M in '' 'export LD_X=1' 'export LD_BIND_NOW=1' 'export LD_X=1 LD_BIND_NOW=1'; \
  do ( for i in 1 2 3 4; do eval $M; time ./a $V; done 2>&1 > /dev/null | \
    awk 'BEGIN { printf "'"$V $M"'\t" } /^real/ { printf "%s ", $2 } END { printf "\n" }' ); done; done

local					0m0.264s 0m0.253s 0m0.256s 0m0.256s
local export LD_X=1			0m0.544s 0m0.538s 0m0.538s 0m0.537s
local export LD_BIND_NOW=1		0m0.480s 0m0.474s 0m0.477s 0m0.480s
local export LD_X=1 LD_BIND_NOW=1	0m1.102s 0m1.094s 0m1.096s 0m1.095s
global					0m0.301s 0m0.299s 0m0.294s 0m0.294s
global export LD_X=1			0m0.625s 0m0.619s 0m0.619s 0m0.618s
global export LD_BIND_NOW=1		0m0.553s 0m0.546s 0m0.544s 0m0.544s
global export LD_X=1 LD_BIND_NOW=1	0m1.251s 0m1.245s 0m1.244s 0m1.243s

for V in local global; do for M in '' 'export LD_X=1' 'export LD_BIND_NOW=1' 'export LD_X=1 LD_BIND_NOW=1'; \
  do ( echo "$V $M"; eval $M; valgrind --tool=cachegrind ./a $V 2>&1 > /dev/null | sed -n '/== I   refs/,$p' ); \
    done; done

local 
==11628== I   refs:      213,572,489
==11628== I1  misses:         11,630
==11628== L2i misses:         10,103
==11628== I1  miss rate:        0.00%
==11628== L2i miss rate:        0.00%
==11628== 
==11628== D   refs:       78,630,135  (62,272,247 rd + 16,357,888 wr)
==11628== D1  misses:      4,699,115  ( 4,544,371 rd +    154,744 wr)
==11628== L2d misses:        643,429  (   549,365 rd +     94,064 wr)
==11628== D1  miss rate:         5.9% (       7.2%   +        0.9%  )
==11628== L2d miss rate:         0.8% (       0.8%   +        0.5%  )
==11628== 
==11628== L2 refs:         4,710,745  ( 4,556,001 rd +    154,744 wr)
==11628== L2 misses:         653,532  (   559,468 rd +     94,064 wr)
==11628== L2 miss rate:          0.2% (       0.2%   +        0.5%  )
local export LD_X=1
==11632== I   refs:      306,655,479
==11632== I1  misses:         11,612
==11632== L2i misses:         10,459
==11632== I1  miss rate:        0.00%
==11632== L2i miss rate:        0.00%
==11632== 
==11632== D   refs:      129,271,101  (99,462,385 rd + 29,808,716 wr)
==11632== D1  misses:      9,739,970  ( 9,576,214 rd +    163,756 wr)
==11632== L2d misses:      3,035,531  ( 2,930,229 rd +    105,302 wr)
==11632== D1  miss rate:         7.5% (       9.6%   +        0.5%  )
==11632== L2d miss rate:         2.3% (       2.9%   +        0.3%  )
==11632== 
==11632== L2 refs:         9,751,582  ( 9,587,826 rd +    163,756 wr)
==11632== L2 misses:       3,045,990  ( 2,940,688 rd +    105,302 wr)
==11632== L2 miss rate:          0.6% (       0.7%   +        0.3%  )
local export LD_BIND_NOW=1
==11638== I   refs:      416,076,941
==11638== I1  misses:         11,145
==11638== L2i misses:          9,847
==11638== I1  miss rate:        0.00%
==11638== L2i miss rate:        0.00%
==11638== 
==11638== D   refs:      156,764,733  (123,796,220 rd + 32,968,513 wr)
==11638== D1  misses:      9,682,235  (  9,503,136 rd +    179,099 wr)
==11638== L2d misses:        967,489  (    865,728 rd +    101,761 wr)
==11638== D1  miss rate:         6.1% (        7.6%   +        0.5%  )
==11638== L2d miss rate:         0.6% (        0.6%   +        0.3%  )
==11638== 
==11638== L2 refs:         9,693,380  (  9,514,281 rd +    179,099 wr)
==11638== L2 misses:         977,336  (    875,575 rd +    101,761 wr)
==11638== L2 miss rate:          0.1% (        0.1%   +        0.3%  )
local export LD_X=1 LD_BIND_NOW=1
==11643== I   refs:      612,287,612
==11643== I1  misses:         11,141
==11643== L2i misses:         10,057
==11643== I1  miss rate:        0.00%
==11643== L2i miss rate:        0.00%
==11643== 
==11643== D   refs:      264,754,881  (202,680,154 rd + 62,074,727 wr)
==11643== D1  misses:     20,634,045  ( 20,436,902 rd +    197,143 wr)
==11643== L2d misses:      6,327,654  (  6,214,729 rd +    112,925 wr)
==11643== D1  miss rate:         7.7% (       10.0%   +        0.3%  )
==11643== L2d miss rate:         2.3% (        3.0%   +        0.1%  )
==11643== 
==11643== L2 refs:        20,645,186  ( 20,448,043 rd +    197,143 wr)
==11643== L2 misses:       6,337,711  (  6,224,786 rd +    112,925 wr)
==11643== L2 miss rate:          0.7% (        0.7%   +        0.1%  )
global 
==11647== I   refs:      229,660,039
==11647== I1  misses:         11,781
==11647== L2i misses:         10,255
==11647== I1  miss rate:        0.00%
==11647== L2i miss rate:        0.00%
==11647== 
==11647== D   refs:       86,649,339  (68,557,134 rd + 18,092,205 wr)
==11647== D1  misses:      6,704,681  ( 6,545,220 rd +    159,461 wr)
==11647== L2d misses:        685,354  (   590,853 rd +     94,501 wr)
==11647== D1  miss rate:         7.7% (       9.5%   +        0.8%  )
==11647== L2d miss rate:         0.7% (       0.8%   +        0.5%  )
==11647== 
==11647== L2 refs:         6,716,462  ( 6,557,001 rd +    159,461 wr)
==11647== L2 misses:         695,609  (   601,108 rd +     94,501 wr)
==11647== L2 miss rate:          0.2% (       0.2%   +        0.5%  )
global export LD_X=1
==11651== I   refs:      331,688,345
==11651== I1  misses:         11,730
==11651== L2i misses:         10,602
==11651== I1  miss rate:        0.00%
==11651== L2i miss rate:        0.00%
==11651== 
==11651== D   refs:      142,641,436  (109,595,921 rd + 33,045,515 wr)
==11651== D1  misses:     12,232,659  ( 12,067,731 rd +    164,928 wr)
==11651== L2d misses:      3,522,116  (  3,416,331 rd +    105,785 wr)
==11651== D1  miss rate:         8.5% (       11.0%   +        0.4%  )
==11651== L2d miss rate:         2.4% (        3.1%   +        0.3%  )
==11651== 
==11651== L2 refs:        12,244,389  ( 12,079,461 rd +    164,928 wr)
==11651== L2 misses:       3,532,718  (  3,426,933 rd +    105,785 wr)
==11651== L2 miss rate:          0.7% (        0.7%   +        0.3%  )
global export LD_BIND_NOW=1
==11656== I   refs:      445,261,358
==11656== I1  misses:         11,280
==11656== L2i misses:          9,978
==11656== I1  miss rate:        0.00%
==11656== L2i miss rate:        0.00%
==11656== 
==11656== D   refs:      171,275,049  (135,170,564 rd + 36,104,485 wr)
==11656== D1  misses:     13,300,976  ( 13,111,867 rd +    189,109 wr)
==11656== L2d misses:      1,045,200  (    943,012 rd +    102,188 wr)
==11656== D1  miss rate:         7.7% (        9.7%   +        0.5%  )
==11656== L2d miss rate:         0.6% (        0.6%   +        0.2%  )
==11656== 
==11656== L2 refs:        13,312,256  ( 13,123,147 rd +    189,109 wr)
==11656== L2 misses:       1,055,178  (    952,990 rd +    102,188 wr)
==11656== L2 miss rate:          0.1% (        0.1%   +        0.2%  )
global export LD_X=1 LD_BIND_NOW=1
==11660== I   refs:      657,215,295
==11660== I1  misses:         11,238
==11660== L2i misses:         10,165
==11660== I1  miss rate:        0.00%
==11660== L2i miss rate:        0.00%
==11660== 
==11660== D   refs:      288,810,775  (220,892,186 rd + 67,918,589 wr)
==11660== D1  misses:     25,132,151  ( 24,931,250 rd +    200,901 wr)
==11660== L2d misses:      7,240,360  (  7,126,874 rd +    113,486 wr)
==11660== D1  miss rate:         8.7% (       11.2%   +        0.2%  )
==11660== L2d miss rate:         2.5% (        3.2%   +        0.1%  )
==11660== 
==11660== L2 refs:        25,143,389  ( 24,942,488 rd +    200,901 wr)
==11660== L2 misses:       7,250,525  (  7,137,039 rd +    113,486 wr)
==11660== L2 miss rate:          0.7% (        0.8%   +        0.1%  )
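
To compare two of the cachegrind runs above numerically, the totals first need their thousands separators removed. The following is only a sketch: `counter` is a hypothetical helper name, not part of valgrind or of the test setup, and the two input lines are copied from the global LD_BIND_NOW=1 runs above (pid 11656 with the default lookup, pid 11660 with LD_X=1).

```shell
# counter NAME KIND: read cachegrind summary lines on stdin and print the
# total for e.g. "L2d misses" with the thousands separators removed.
# (Hypothetical helper, for illustration only.)
counter() {
  awk -v name="$1" -v kind="$2:" \
    '$2 == name && $3 == kind { gsub(/,/, "", $4); print $4 }'
}

# Two lines copied from the runs above.
new=$(echo '==11656== L2d misses:      1,045,200  (    943,012 rd +    102,188 wr)' | counter L2d misses)
old=$(echo '==11660== L2d misses:      7,240,360  (  7,126,874 rd +    113,486 wr)' | counter L2d misses)

# How many times more L2 data misses the LD_X=1 run incurred; prints 6.9.
awk -v old="$old" -v new="$new" 'BEGIN { printf "%.1f\n", old / new }'
```

The same helper should work for the other counters (for example `counter D1 misses`), since it only relies on the label sitting in the second and third whitespace-separated fields.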

for V in local global; do for M in '' '-E LD_X=1' '-E LD_BIND_NOW=1' '-E LD_X=1 -E LD_BIND_NOW=1'; \
  do ( echo "$V $M"; ./timing $M ./a $V ); done; done

local
Strip out best and worst realtime result
minimum: 0.252914000 sec real / 0.000051294 sec CPU
maximum: 0.269686000 sec real / 0.000083306 sec CPU
average: 0.254617928 sec real / 0.000071702 sec CPU
stdev  : 0.000890554 sec real / 0.000003730 sec CPU
local -E LD_X=1
optarg="LD_X=1"
Strip out best and worst realtime result
minimum: 0.536379000 sec real / 0.000050866 sec CPU
maximum: 0.539256000 sec real / 0.000079972 sec CPU
average: 0.537778428 sec real / 0.000074764 sec CPU
stdev  : 0.000612980 sec real / 0.000002034 sec CPU
local -E LD_BIND_NOW=1
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 0.470151000 sec real / 0.000053946 sec CPU
maximum: 0.481664000 sec real / 0.000084505 sec CPU
average: 0.473882142 sec real / 0.000073921 sec CPU
stdev  : 0.001887639 sec real / 0.000002616 sec CPU
local -E LD_X=1 -E LD_BIND_NOW=1
optarg="LD_X=1"
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 1.092469000 sec real / 0.000051647 sec CPU
maximum: 1.106560000 sec real / 0.000078219 sec CPU
average: 1.096268250 sec real / 0.000064646 sec CPU
stdev  : 0.002515850 sec real / 0.000003027 sec CPU
global
Strip out best and worst realtime result
minimum: 0.294585000 sec real / 0.000050279 sec CPU
maximum: 0.304168000 sec real / 0.000078209 sec CPU
average: 0.297781285 sec real / 0.000072901 sec CPU
stdev  : 0.002508159 sec real / 0.000004136 sec CPU
global -E LD_X=1
optarg="LD_X=1"
Strip out best and worst realtime result
minimum: 0.617157000 sec real / 0.000064151 sec CPU
maximum: 0.645039000 sec real / 0.000084488 sec CPU
average: 0.621962785 sec real / 0.000075530 sec CPU
stdev  : 0.002484547 sec real / 0.000003147 sec CPU
global -E LD_BIND_NOW=1
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 0.544103000 sec real / 0.000052304 sec CPU
maximum: 0.557447000 sec real / 0.000078790 sec CPU
average: 0.548014107 sec real / 0.000073886 sec CPU
stdev  : 0.002805780 sec real / 0.000002697 sec CPU
global -E LD_X=1 -E LD_BIND_NOW=1
optarg="LD_X=1"
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 1.241722000 sec real / 0.000058554 sec CPU
maximum: 1.255916000 sec real / 0.000076953 sec CPU
average: 1.247884071 sec real / 0.000063511 sec CPU
stdev  : 0.003259242 sec real / 0.000002160 sec CPU
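
As a reading aid for the averages above, the cost of forcing the old lookup with LD_X=1 can be expressed as a ratio per configuration. This is a minimal sketch, assuming a hypothetical helper named `ratio`; the numbers are the average real times printed by ./timing above. Note that these are whole-program times, so the ratio is only a rough indicator.

```shell
# ratio OLD NEW: print OLD/NEW to two decimals.
# (Hypothetical helper, for illustration only.)
ratio() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", old / new }'
}

# Average real time with LD_X=1 divided by the average without it.
ratio 0.537778428 0.254617928   # local                    -> 2.11
ratio 1.096268250 0.473882142   # local,  LD_BIND_NOW=1    -> 2.31
ratio 0.621962785 0.297781285   # global                   -> 2.09
ratio 1.247884071 0.548014107   # global, LD_BIND_NOW=1    -> 2.28
```

A ratio a little above 2 is consistent with the roughly 50% improvement mentioned in the subject line.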

/usr/sbin/prelink -vmR ./a
for V in local global; do for M in '' 'export LD_X=1' 'export LD_BIND_NOW=1' 'export LD_X=1 LD_BIND_NOW=1'; \
  do ( for i in 1 2 3 4; do eval $M; time ./a $V; done 2>&1 > /dev/null | \
    awk 'BEGIN { printf "'"$V $M"'\t" } /^real/ { printf "%s ", $2 } END { printf "\n" }' ); done; done

local					0m0.145s 0m0.138s 0m0.139s 0m0.139s
local export LD_X=1			0m0.274s 0m0.268s 0m0.266s 0m0.269s
local export LD_BIND_NOW=1		0m0.245s 0m0.238s 0m0.238s 0m0.239s
local export LD_X=1 LD_BIND_NOW=1	0m0.504s 0m0.497s 0m0.498s 0m0.496s
global					0m0.182s 0m0.175s 0m0.174s 0m0.175s
global export LD_X=1			0m0.352s 0m0.357s 0m0.344s 0m0.346s
global export LD_BIND_NOW=1		0m0.310s 0m0.305s 0m0.316s 0m0.306s
global export LD_X=1 LD_BIND_NOW=1	0m0.647s 0m0.641s 0m0.640s 0m0.640s
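
The four `time` samples in each row above can be reduced to one mean value for easier comparison. A small sketch, assuming a hypothetical helper named `avg_real` that accepts values in the `0m0.145s` form shown above:

```shell
# avg_real SAMPLE...: average `time`-style values such as 0m0.145s and
# print the mean in seconds. (Hypothetical helper, for illustration only.)
avg_real() {
  printf '%s\n' "$@" | awk -F'm' '
    { sub(/s$/, "", $2); total += $1 * 60 + $2; n++ }
    END { if (n) printf "%.3f\n", total / n }'
}

avg_real 0m0.145s 0m0.138s 0m0.139s 0m0.139s   # local           -> 0.140
avg_real 0m0.274s 0m0.268s 0m0.266s 0m0.269s   # local, LD_X=1   -> 0.269
```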

# valgrind --tool=cachegrind stats are not provided for the prelinked
# testcase, as valgrind apparently uses LD_PRELOAD internally, which
# prevents the prelinking from taking effect.

for V in local global; do for M in '' '-E LD_X=1' '-E LD_BIND_NOW=1' '-E LD_X=1 -E LD_BIND_NOW=1'; \
  do ( echo "$V $M"; ./timing $M ./a $V ); done; done

local
Strip out best and worst realtime result
minimum: 0.137495000 sec real / 0.000066247 sec CPU
maximum: 0.142180000 sec real / 0.000086736 sec CPU
average: 0.138369035 sec real / 0.000072997 sec CPU
stdev  : 0.000575184 sec real / 0.000002132 sec CPU
local -E LD_X=1
optarg="LD_X=1"
Strip out best and worst realtime result
minimum: 0.264590000 sec real / 0.000060576 sec CPU
maximum: 0.272804000 sec real / 0.000082688 sec CPU
average: 0.266598571 sec real / 0.000072811 sec CPU
stdev  : 0.001817765 sec real / 0.000003394 sec CPU
local -E LD_BIND_NOW=1
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 0.236854000 sec real / 0.000065925 sec CPU
maximum: 0.245201000 sec real / 0.000080373 sec CPU
average: 0.238382678 sec real / 0.000075591 sec CPU
stdev  : 0.000959453 sec real / 0.000002887 sec CPU
local -E LD_X=1 -E LD_BIND_NOW=1
optarg="LD_X=1"
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 0.496607000 sec real / 0.000065955 sec CPU
maximum: 0.512757000 sec real / 0.000084887 sec CPU
average: 0.498181678 sec real / 0.000074275 sec CPU
stdev  : 0.001529594 sec real / 0.000002630 sec CPU
global
Strip out best and worst realtime result
minimum: 0.173740000 sec real / 0.000048699 sec CPU
maximum: 0.181163000 sec real / 0.000083410 sec CPU
average: 0.175901500 sec real / 0.000070443 sec CPU
stdev  : 0.001745144 sec real / 0.000003656 sec CPU
global -E LD_X=1
optarg="LD_X=1"
Strip out best and worst realtime result
minimum: 0.344016000 sec real / 0.000058830 sec CPU
maximum: 0.377289000 sec real / 0.000076792 sec CPU
average: 0.346814392 sec real / 0.000072660 sec CPU
stdev  : 0.002058835 sec real / 0.000002517 sec CPU
global -E LD_BIND_NOW=1
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 0.304208000 sec real / 0.000049604 sec CPU
maximum: 0.314217000 sec real / 0.000077094 sec CPU
average: 0.307348821 sec real / 0.000071335 sec CPU
stdev  : 0.002641413 sec real / 0.000003427 sec CPU
global -E LD_X=1 -E LD_BIND_NOW=1
optarg="LD_X=1"
optarg="LD_BIND_NOW=1"
Strip out best and worst realtime result
minimum: 0.640543000 sec real / 0.000044401 sec CPU
maximum: 0.664382000 sec real / 0.000089763 sec CPU
average: 0.646539678 sec real / 0.000071135 sec CPU
stdev  : 0.005879177 sec real / 0.000003697 sec CPU
-------------- next part --------------
--- libc/elf/dl-lookup.c	2006-06-27 10:12:22.000000000 +0200
+++ libc/elf/dl-lookup.c	27 Jun 2006 14:59:07 -0000
@@ -364,7 +364,13 @@
   Elf_Symndx *hash;
   Elf_Symndx nchain;
 
-  if (__builtin_expect (map->l_info[DT_ADDRTAGIDX (DT_GNU_HASH) + DT_NUM
+#ifdef SHARED
+  extern int nognubuckets;
+#else
+#define nognubuckets 0
+#endif
+  if (!nognubuckets &&
+      __builtin_expect (map->l_info[DT_ADDRTAGIDX (DT_GNU_HASH) + DT_NUM
   				    + DT_THISPROCNUM + DT_VERSIONTAGNUM
 				    + DT_EXTRANUM + DT_VALNUM] != NULL, 1))
     {
--- libc/elf/rtld.c.~1.362.~	2006-04-08 12:50:07.000000000 -0700
+++ libc/elf/rtld.c	2006-06-27 07:54:51.000000000 -0700
@@ -2493,6 +2493,7 @@ process_dl_audit (char *str)
 extern char **_environ attribute_hidden;
 
 
+int nognubuckets;
 static void
 process_envvars (enum mode *modep)
 {
@@ -2520,6 +2521,11 @@ process_envvars (enum mode *modep)
 
       switch (len)
 	{
+	case 1:
+	  if (envline[0] == 'X')
+	    nognubuckets = 1;
+	  break;
+
 	case 4:
 	  /* Warning level, verbose or not.  */
 	  if (memcmp (envline, "WARN", 4) == 0)

