This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[RFA] Kill some linear searches in minsyms.c


So uh, yeah, and stuff.  I found an interesting one today when I wasn't even
looking for it.  Earlier today I posted an updated patch to make it easy,
very easy, to profile GDB.  Here's a pretty good sample of why I think this
is important.  I was actually looking for something completely different,
but I noticed this on the way.

Consider this patch:

2000-03-06  Jim Blandy  <jimb@redhat.com>

        From Tom Tromey <tromey@cygnus.com> and Keith Seitz <?>:

        * minsyms.c: #include <ctype.h>, for msymbol_hash_iw.
        (compact_minimal_symbols): Added `objfile' argument.
        Put symbols in the objfile's hash table.
        (install_minimal_symbols): Put symbols in the objfile's demangled
        hash table.
        (lookup_minimal_symbol): Use hash table to find symbol in
        objfile.
        (msymbol_hash_iw, msymbol_hash, add_minsym_to_hash_table): New
        functions.
        (prim_record_minimal_symbol_and_info): Initialize the
        hash link fields of the new minimal symbol.
        * symtab.h (struct minimal_symbol): New fields `hash_next',
        `demangled_hash_next'.
        (msymbol_hash_iw, msymbol_hash, add_minsym_to_hash_table): Declare.
        * objfiles.h (MINIMAL_SYMBOL_HASH_SIZE): New define.
        (struct objfile): New fields `msymbol_hash',
        `msymbol_demangled_hash'.

These replaced a linear search with a hash table.  We'd better use it, too,
since a good-sized application has an awful lot of minimal symbols. 
However, two other related functions (that's lookup_minimal_symbol_text and
lookup_minimal_symbol_solib_trampoline) were not converted.  These get
called reasonably often, vis:

                0.75    0.43      62/310         create_overlay_event_breakpoint [16]
                3.01    1.72     248/310         create_longjmp_breakpoint [5]
[4]     48.1    3.76    2.16     310         lookup_minimal_symbol_text [4]

That "3.76" is in _seconds_, by the way.  Then they proceed to do:
                1.26    0.00 9567591/9567734     strcmp_iw [15]
                0.90    0.00 27551335/30742029     symbol_demangled_name [17]

27.5 million calls to a wrapper function.  My first instinct was that a
wrapper function was a bad idea, then.  My second instinct was to smack my
first instinct upside the head and find out why we were calling it so many
times, since I knew we hashed minsyms.  Converting them to the hash table
once I found the problem was quite easy.  The patch is attached; is it OK to
commit?  No regressions on i386-pc-linux-gnu, for what that's worth, and
it's quite straightforward.



Future things to examine:

Before, GDB's CPU time to load symbols for all of mozilla, get through a
large number of threaded shared library events (ow, but I'm glad the pread64
patch is in, or this measurement would have been just impossible to sort out
of the noise), and reach the first dialog box was 26.630000 seconds.  After
the attached patch, 17.820000 seconds.  Much better.  The new hot spots are:

 21.32      1.42     1.42  1594586     0.00     0.00  hash
 10.66      2.13     0.71   185486     0.00     0.00  find_corresponding_bincl_psymtab
  8.11      2.67     0.54  1128766     0.00     0.00  bcache
  6.76      3.12     0.45       41    10.98    91.28  read_dbx_symtab
  6.31      3.54     0.42     1519     0.28     0.30  end_psymtab

I suspect we can cut find_corresponding_bincl_psymtab out of the way too;
and maybe the bcache needs some rethinking into a more specialized data
structure since we're calling it so often; but maybe there's nothing that
can be done about that (this test has a terrifyingly large number of symbols
in it).  There's 3.3 million calls to mmalloc, too - a million of them are
from a call to save_inferior_ptid in thread_db_xfer_memory.  And
symbol_demangled_name is still on the top ten, via compare_psymbols.  That's
another good candidate for a clever data structure if it becomes worth the
time, which it isn't in this test.

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

2003-01-05  Daniel Jacobowitz  <drow@mvista.com>

	* minsyms.c (enum ms_search_which): New.
	(lookup_minimal_symbol_internal): New function, broken out from
	lookup_minimal_symbol.
	(lookup_minimal_symbol, lookup_minimal_symbol_text)
	(lookup_minimal_symbol_solib_trampoline): Use them.

Index: minsyms.c
===================================================================
RCS file: /cvs/src/src/gdb/minsyms.c,v
retrieving revision 1.22
diff -u -p -r1.22 minsyms.c
--- minsyms.c	11 Jul 2002 20:46:19 -0000	1.22
+++ minsyms.c	6 Jan 2003 04:19:16 -0000
@@ -1,5 +1,6 @@
 /* GDB routines for manipulating the minimal symbol tables.
-   Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001
+   Copyright 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
+   2002, 2003
    Free Software Foundation, Inc.
    Contributed by Cygnus Support, using pieces from other GDB modules.
 
@@ -132,11 +133,18 @@ add_minsym_to_demangled_hash_table (stru
     }
 }
 
+enum ms_search_which {
+  ms_search_all,
+  ms_search_text,
+  ms_search_trampoline
+};
 
 /* Look through all the current minimal symbol tables and find the
    first minimal symbol that matches NAME.  If OBJF is non-NULL, limit
-   the search to that objfile.  If SFILE is non-NULL, limit the search
-   to that source file.  Returns a pointer to the minimal symbol that
+   the search to that objfile.  If SFILE is non-NULL, only find file-scope
+   symbols from that source file (global symbols are still preferred).
+   WHICH indicates which minimal symbols to accept: all, text symbols only,
+   or trampolines only.  Returns a pointer to the minimal symbol that
    matches, or NULL if no match is found.
 
    Note:  One instance where there may be duplicate minimal symbols with
@@ -144,9 +152,10 @@ add_minsym_to_demangled_hash_table (stru
    symbol tables for an executable contain global symbols with the same
    names (the dynamic linker deals with the duplication). */
 
-struct minimal_symbol *
-lookup_minimal_symbol (register const char *name, const char *sfile,
-		       struct objfile *objf)
+static struct minimal_symbol *
+lookup_minimal_symbol_internal (register const char *name, const char *sfile,
+				struct objfile *objf,
+				enum ms_search_which which)
 {
   struct objfile *objfile;
   struct minimal_symbol *msymbol;
@@ -170,68 +179,77 @@ lookup_minimal_symbol (register const ch
        objfile != NULL && found_symbol == NULL;
        objfile = objfile->next)
     {
-      if (objf == NULL || objf == objfile)
+      int pass;
+      if (objf && objf != objfile)
+	continue;
+
+      /* Do two passes: the first over the ordinary hash table,
+	 and the second over the demangled hash table.  */
+      for (pass = 1; pass <= 2 && found_symbol == NULL; pass++)
 	{
-	  /* Do two passes: the first over the ordinary hash table,
-	     and the second over the demangled hash table.  */
-        int pass;
-
-        for (pass = 1; pass <= 2 && found_symbol == NULL; pass++)
+	  /* Select hash list according to pass.  */
+	  if (pass == 1)
+	    msymbol = objfile->msymbol_hash[hash];
+	  else
+	    msymbol = objfile->msymbol_demangled_hash[dem_hash];
+
+	  for (; msymbol != NULL && found_symbol == NULL;
+	       msymbol = (pass == 1)
+		 ? msymbol->hash_next
+		 : msymbol->demangled_hash_next)
 	    {
-            /* Select hash list according to pass.  */
-            if (pass == 1)
-              msymbol = objfile->msymbol_hash[hash];
-            else
-              msymbol = objfile->msymbol_demangled_hash[dem_hash];
+	      if (!SYMBOL_MATCHES_NAME (msymbol, name))
+		continue;
 
-            while (msymbol != NULL && found_symbol == NULL)
+	      if (which == ms_search_trampoline)
 		{
-                if (SYMBOL_MATCHES_NAME (msymbol, name))
-		    {
-                    switch (MSYMBOL_TYPE (msymbol))
-                      {
-                      case mst_file_text:
-                      case mst_file_data:
-                      case mst_file_bss:
+		  if (MSYMBOL_TYPE (msymbol) == mst_solib_trampoline)
+		    return msymbol;
+		  continue;
+		}
+
+	      if (which == ms_search_text
+		  && MSYMBOL_TYPE (msymbol) != mst_file_text
+		  && MSYMBOL_TYPE (msymbol) != mst_text)
+		continue;
+
+	      switch (MSYMBOL_TYPE (msymbol))
+		{
+		case mst_file_data:
+		case mst_file_bss:
+		case mst_file_text:
 #ifdef SOFUN_ADDRESS_MAYBE_MISSING
-                        if (sfile == NULL || STREQ (msymbol->filename, sfile))
-                          found_file_symbol = msymbol;
+		  if (sfile == NULL || STREQ (msymbol->filename, sfile))
+		    found_file_symbol = msymbol;
 #else
-                        /* We have neither the ability nor the need to
-                           deal with the SFILE parameter.  If we find
-                           more than one symbol, just return the latest
-                           one (the user can't expect useful behavior in
-                           that case).  */
-                        found_file_symbol = msymbol;
+		  /* We have neither the ability nor the need to deal
+		     with the SFILE parameter.  If we find more than
+		     one symbol, just return the latest one (the user
+		     can't expect useful behavior in that case).  */
+		  found_file_symbol = msymbol;
 #endif
-                        break;
+		  break;
 
-                      case mst_solib_trampoline:
+		case mst_solib_trampoline:
 
-                        /* If a trampoline symbol is found, we prefer to
-                           keep looking for the *real* symbol. If the
-                           actual symbol is not found, then we'll use the
-                           trampoline entry. */
-                        if (trampoline_symbol == NULL)
-                          trampoline_symbol = msymbol;
-                        break;
-
-                      case mst_unknown:
-                      default:
-                        found_symbol = msymbol;
-                        break;
-                      }
-		    }
-
-                /* Find the next symbol on the hash chain.  */
-                if (pass == 1)
-                  msymbol = msymbol->hash_next;
-                else
-                  msymbol = msymbol->demangled_hash_next;
+		  /* If a trampoline symbol is found, we prefer to
+		     keep looking for the *real* symbol. If the actual
+		     symbol is not found, then we'll use the
+		     trampoline entry. */
+		  if (trampoline_symbol == NULL)
+		    trampoline_symbol = msymbol;
+		  break;
+
+		case mst_unknown:
+		default:
+		  if (which == ms_search_all)
+		    found_symbol = msymbol;
+		  break;
 		}
 	    }
 	}
     }
+
   /* External symbols are best.  */
   if (found_symbol)
     return found_symbol;
@@ -247,127 +265,28 @@ lookup_minimal_symbol (register const ch
   return NULL;
 }
 
-/* Look through all the current minimal symbol tables and find the
-   first minimal symbol that matches NAME and of text type.  
-   If OBJF is non-NULL, limit
-   the search to that objfile.  If SFILE is non-NULL, limit the search
-   to that source file.  Returns a pointer to the minimal symbol that
-   matches, or NULL if no match is found.
- */
+struct minimal_symbol *
+lookup_minimal_symbol (register const char *name, const char *sfile,
+		       struct objfile *objf)
+{
+  return lookup_minimal_symbol_internal (name, sfile, objf, ms_search_all);
+}
 
 struct minimal_symbol *
 lookup_minimal_symbol_text (register const char *name, const char *sfile,
 			    struct objfile *objf)
 {
-  struct objfile *objfile;
-  struct minimal_symbol *msymbol;
-  struct minimal_symbol *found_symbol = NULL;
-  struct minimal_symbol *found_file_symbol = NULL;
-
-#ifdef SOFUN_ADDRESS_MAYBE_MISSING
-  if (sfile != NULL)
-    {
-      char *p = strrchr (sfile, '/');
-      if (p != NULL)
-	sfile = p + 1;
-    }
-#endif
-
-  for (objfile = object_files;
-       objfile != NULL && found_symbol == NULL;
-       objfile = objfile->next)
-    {
-      if (objf == NULL || objf == objfile)
-	{
-	  for (msymbol = objfile->msymbols;
-	       msymbol != NULL && SYMBOL_NAME (msymbol) != NULL &&
-	       found_symbol == NULL;
-	       msymbol++)
-	    {
-	      if (SYMBOL_MATCHES_NAME (msymbol, name) &&
-		  (MSYMBOL_TYPE (msymbol) == mst_text ||
-		   MSYMBOL_TYPE (msymbol) == mst_file_text))
-		{
-		  switch (MSYMBOL_TYPE (msymbol))
-		    {
-		    case mst_file_text:
-#ifdef SOFUN_ADDRESS_MAYBE_MISSING
-		      if (sfile == NULL || STREQ (msymbol->filename, sfile))
-			found_file_symbol = msymbol;
-#else
-		      /* We have neither the ability nor the need to
-		         deal with the SFILE parameter.  If we find
-		         more than one symbol, just return the latest
-		         one (the user can't expect useful behavior in
-		         that case).  */
-		      found_file_symbol = msymbol;
-#endif
-		      break;
-		    default:
-		      found_symbol = msymbol;
-		      break;
-		    }
-		}
-	    }
-	}
-    }
-  /* External symbols are best.  */
-  if (found_symbol)
-    return found_symbol;
-
-  /* File-local symbols are next best.  */
-  if (found_file_symbol)
-    return found_file_symbol;
-
-  return NULL;
+  return lookup_minimal_symbol_internal (name, sfile, objf, ms_search_text);
 }
 
-/* Look through all the current minimal symbol tables and find the
-   first minimal symbol that matches NAME and of solib trampoline type.  
-   If OBJF is non-NULL, limit
-   the search to that objfile.  If SFILE is non-NULL, limit the search
-   to that source file.  Returns a pointer to the minimal symbol that
-   matches, or NULL if no match is found.
- */
-
 struct minimal_symbol *
 lookup_minimal_symbol_solib_trampoline (register const char *name,
-					const char *sfile, struct objfile *objf)
+					const char *sfile,
+					struct objfile *objf)
 {
-  struct objfile *objfile;
-  struct minimal_symbol *msymbol;
-  struct minimal_symbol *found_symbol = NULL;
-
-#ifdef SOFUN_ADDRESS_MAYBE_MISSING
-  if (sfile != NULL)
-    {
-      char *p = strrchr (sfile, '/');
-      if (p != NULL)
-	sfile = p + 1;
-    }
-#endif
-
-  for (objfile = object_files;
-       objfile != NULL && found_symbol == NULL;
-       objfile = objfile->next)
-    {
-      if (objf == NULL || objf == objfile)
-	{
-	  for (msymbol = objfile->msymbols;
-	       msymbol != NULL && SYMBOL_NAME (msymbol) != NULL &&
-	       found_symbol == NULL;
-	       msymbol++)
-	    {
-	      if (SYMBOL_MATCHES_NAME (msymbol, name) &&
-		  MSYMBOL_TYPE (msymbol) == mst_solib_trampoline)
-		return msymbol;
-	    }
-	}
-    }
-
-  return NULL;
+  return lookup_minimal_symbol_internal (name, sfile, objf,
+					 ms_search_trampoline);
 }
-
 
 /* Search through the minimal symbol table for each objfile and find
    the symbol whose address is the largest address that is still less


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]