[RFA, doc RFA] Avoid calling gdb_realpath if basenames are different
Doug Evans
dje@google.com
Fri Nov 11 00:57:00 GMT 2011
On Sat, Nov 5, 2011 at 11:30 PM, Doug Evans <dje@google.com> wrote:
> Hi.
> This patch has been brought up before (by others).
> E.g., http://sourceware.org/ml/gdb-patches/2010-04/msg00466.html
> I'm hoping we can get this in now.
> We're paying a real and significant cost for what is mostly a
> theoretical concern.
> [E.g., How often is one source file referred to by the user using a basename
> that is different than what's recorded in the debug info?]
>
> If people are concerned about breaking someone's usage,
> we could default basenames-may-differ to true in 7.4,
> with a warning that it will be set to false in 7.5 (or some such).
> [We could leave the default set to true, especially if someone knew
> of at least some minimally common usage this would break.
> I'd hate to otherwise penalize the vast majority of users if not.]
Hi.
Ok to check in?
Note: I set the default to be the common case (speed up gdb by
assuming basenames never differ).
Let me know if you want the default changed.
Tom: I'll look at the bugs you mentioned separately.
2011-11-10 Doug Evans <dje@google.com>
* NEWS: Mention new parameter basenames-may-differ.
* dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
! basenames_may_differ.
* psymtab.c (lookup_partial_symtab): Ditto.
* symtab.c (lookup_symtab): Ditto.
(basenames_may_differ): New global.
(_initialize_symtab): New parameter basenames-may-differ.
* symtab.h (basenames_may_differ): Declare.
doc/
* gdb.texinfo (Files): Document basenames-may-differ.
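
For illustration, the basename pre-check that the patch adds can be sketched outside gdb roughly as follows. This is a minimal sketch and not the patch itself. All names prefixed with sketch_ are hypothetical. It assumes POSIX realpath() and plain strcmp() in place of gdb_realpath and FILENAME_CMP, and it only handles '/' as a directory separator, which lbasename does not assume.

```c
#define _XOPEN_SOURCE 700   /* for realpath() with a NULL buffer */

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the basenames_may_differ global.  */
static int sketch_basenames_may_differ = 0;

/* Counts how often the expensive canonicalization path is reached.  */
static int sketch_realpath_calls = 0;

/* Return a pointer to the last component of PATH ('/' separators only).  */
static const char *
sketch_basename (const char *path)
{
  const char *slash;

  assert (path != NULL);
  slash = strrchr (path, '/');
  return slash != NULL ? slash + 1 : path;
}

/* Return 1 if A and B name the same file, 0 otherwise.  */
static int
sketch_same_file (const char *a, const char *b)
{
  char *real_a, *real_b;
  int same;

  /* Cheap exact match first.  */
  if (strcmp (a, b) == 0)
    return 1;

  /* Quick reject: if base names are assumed never to differ and they
     do differ here, skip canonicalization entirely.  */
  if (!sketch_basenames_may_differ
      && strcmp (sketch_basename (a), sketch_basename (b)) != 0)
    return 0;

  /* Expensive path: resolve symlinks, then compare.  */
  sketch_realpath_calls++;
  real_a = realpath (a, NULL);
  real_b = realpath (b, NULL);
  same = real_a != NULL && real_b != NULL && strcmp (real_a, real_b) == 0;
  free (real_a);
  free (real_b);
  return same;
}
```

With the flag left at 0, comparing two paths whose base names differ returns without touching the file system. Setting the flag to 1 makes every non-identical pair go through realpath(), which is the behaviour before the patch.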
Index: NEWS
===================================================================
RCS file: /cvs/src/src/gdb/NEWS,v
retrieving revision 1.464
diff -u -p -r1.464 NEWS
--- NEWS 2 Nov 2011 23:44:19 -0000 1.464
+++ NEWS 10 Nov 2011 23:49:26 -0000
@@ -150,6 +150,20 @@ show debug entry-values
Control display of debugging info for determining frame argument values at
function entry and virtual tail call frames.
+set basenames-may-differ
+show basenames-may-differ
+ Set whether a source file may have multiple base names.
+ A "base name" is the name of a file with the directory part removed.
+ Example: The base name of "/home/user/hello.c" is "hello.c".
+ When doing file name based lookups, gdb will canonicalize file names
+ (e.g., expand symlinks) before comparing them, which is an expensive
+ operation.
+ If set, gdb will not assume a file is known by one base name, and thus
+ it cannot optimize file name comparisons by skipping the canonicalization
+ step if the base names are different.
+ If not set, all source files must be known by one base name,
+ and gdb will do file name comparisons more efficiently.
+
* New remote packets
QTEnable
Index: dwarf2read.c
===================================================================
RCS file: /cvs/src/src/gdb/dwarf2read.c,v
retrieving revision 1.579
diff -u -p -r1.579 dwarf2read.c
--- dwarf2read.c 10 Nov 2011 20:21:27 -0000 1.579
+++ dwarf2read.c 10 Nov 2011 23:49:26 -0000
@@ -2445,7 +2445,8 @@ dw2_lookup_symtab (struct objfile *objfi
struct symtab **result)
{
int i;
- int check_basename = lbasename (name) == name;
+ const char *name_basename = lbasename (name);
+ int check_basename = name_basename == name;
struct dwarf2_per_cu_data *base_cu = NULL;
dw2_setup (objfile);
@@ -2478,6 +2479,12 @@ dw2_lookup_symtab (struct objfile *objfi
&& FILENAME_CMP (lbasename (this_name), name) == 0)
base_cu = per_cu;
+ /* Before we invoke realpath, which can get expensive when many
+ files are involved, do a quick comparison of the basenames. */
+ if (! basenames_may_differ
+ && FILENAME_CMP (lbasename (this_name), name_basename) != 0)
+ continue;
+
if (full_path != NULL)
{
const char *this_real_name = dw2_get_real_path (objfile,
Index: psymtab.c
===================================================================
RCS file: /cvs/src/src/gdb/psymtab.c,v
retrieving revision 1.31
diff -u -p -r1.31 psymtab.c
--- psymtab.c 28 Oct 2011 17:29:37 -0000 1.31
+++ psymtab.c 10 Nov 2011 23:49:26 -0000
@@ -134,6 +134,7 @@ lookup_partial_symtab (struct objfile *o
const char *full_path, const char *real_path)
{
struct partial_symtab *pst;
+ const char *name_basename = lbasename (name);
ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
{
@@ -142,6 +143,12 @@ lookup_partial_symtab (struct objfile *o
return (pst);
}
+ /* Before we invoke realpath, which can get expensive when many
+ files are involved, do a quick comparison of the basenames. */
+ if (! basenames_may_differ
+ && FILENAME_CMP (name_basename, lbasename (pst->filename)) != 0)
+ continue;
+
/* If the user gave us an absolute path, try to find the file in
this symtab and use its absolute path. */
if (full_path != NULL)
@@ -172,7 +179,7 @@ lookup_partial_symtab (struct objfile *o
/* Now, search for a matching tail (only if name doesn't have any dirs). */
- if (lbasename (name) == name)
+ if (name_basename == name)
ALL_OBJFILE_PSYMTABS_REQUIRED (objfile, pst)
{
if (FILENAME_CMP (lbasename (pst->filename), name) == 0)
Index: symtab.c
===================================================================
RCS file: /cvs/src/src/gdb/symtab.c,v
retrieving revision 1.285
diff -u -p -r1.285 symtab.c
--- symtab.c 29 Oct 2011 07:26:07 -0000 1.285
+++ symtab.c 10 Nov 2011 23:49:26 -0000
@@ -112,6 +112,11 @@ void _initialize_symtab (void);
/* */
+/* Non-zero if a file may be known by two different basenames.
+ This is the uncommon case, and significantly slows down gdb.
+ Default set to "off" to not slow down the common case. */
+int basenames_may_differ = 0;
+
/* Allow the user to configure the debugger behavior with respect
to multiple-choice menus when more than one symbol matches during
a symbol lookup. */
@@ -155,6 +160,7 @@ lookup_symtab (const char *name)
char *real_path = NULL;
char *full_path = NULL;
struct cleanup *cleanup;
+ const char *base_name = lbasename (name);
cleanup = make_cleanup (null_cleanup, NULL);
@@ -180,6 +186,12 @@ got_symtab:
return s;
}
+ /* Before we invoke realpath, which can get expensive when many
+ files are involved, do a quick comparison of the basenames. */
+ if (! basenames_may_differ
+ && FILENAME_CMP (base_name, lbasename (s->filename)) != 0)
+ continue;
+
/* If the user gave us an absolute path, try to find the file in
this symtab and use its absolute path. */
@@ -4883,5 +4897,22 @@ Show how the debugger handles ambiguitie
Valid values are \"ask\", \"all\", \"cancel\", and the default is \"all\"."),
NULL, NULL, &setlist, &showlist);
+ add_setshow_boolean_cmd ("basenames-may-differ", class_obscure,
+ &basenames_may_differ, _("\
+Set whether a source file may have multiple base names."), _("\
+Show whether a source file may have multiple base names."), _("\
+A \"base name\" is the name of a file with the directory part removed.\n\
+Example: The base name of \"/home/user/hello.c\" is \"hello.c\".\n\
+When doing file name based lookups, gdb will canonicalize file names\n\
+(e.g., expand symlinks) before comparing them, which is an expensive\n\
+operation.\n\
+If set, gdb will not assume a file is known by one base name, and thus\n\
+it cannot optimize file name comparisons by skipping the canonicalization\n\
+step if the base names are different.\n\
+If not set, all source files must be known by one base name,\n\
+and gdb will do file name comparisons much more efficiently."),
+ NULL, NULL,
+ &setlist, &showlist);
+
observer_attach_executable_changed (symtab_observer_executable_changed);
}
Index: symtab.h
===================================================================
RCS file: /cvs/src/src/gdb/symtab.h,v
retrieving revision 1.191
diff -u -p -r1.191 symtab.h
--- symtab.h 10 Nov 2011 20:21:28 -0000 1.191
+++ symtab.h 10 Nov 2011 23:49:26 -0000
@@ -1306,4 +1306,6 @@ void fixup_section (struct general_symbo
struct objfile *lookup_objfile_from_block (const struct block *block);
+extern int basenames_may_differ;
+
#endif /* !defined(SYMTAB_H) */
Index: doc/gdb.texinfo
===================================================================
RCS file: /cvs/src/src/gdb/doc/gdb.texinfo,v
retrieving revision 1.890
diff -u -p -r1.890 gdb.texinfo
--- doc/gdb.texinfo 8 Nov 2011 21:34:18 -0000 1.890
+++ doc/gdb.texinfo 10 Nov 2011 23:49:28 -0000
@@ -15680,6 +15680,47 @@ This is the default.
@end table
@end table
+@cindex file name canonicalization
+@cindex base name differences
+When processing file names provided by the user,
+@value{GDBN} will canonicalize them and remove symbolic links.
+This ensures that @value{GDBN} will find the right file,
+even if the debug information specifies an alternate path.
+However, with large programs this canonicalization can noticeably slow
+down @value{GDBN}. To compensate, @value{GDBN} will try to avoid
+this canonicalization wherever possible. One way it can do so
+is by first comparing the @samp{base name} of a file.
+The @samp{base name} of a file is simply the file's name without
+any directory information. For example, the base name of
+@file{/home/user/hello.c} is @file{hello.c}.
+By doing this @value{GDBN} can skip, for example,
+@file{/usr/include/stdio.h} without having to first canonicalize
+and then compare the directory names.
+This works great, except when a file can be known by
+multiple base names due to symbolic links.
+For example, if @file{/home/user/bar.c} is a symbolic link to
+@file{/home/user/foo.c} then @value{GDBN} cannot just look at
+the base names of the two files; it must canonicalize them, expand
+all symbolic links, and @emph{then} compare the file names
+to see if they match.
+Fortunately, having one file known by two different base names
+does not generally occur in practice.
+Should it occur, however, @value{GDBN} provides an escape hatch
+to allow this to work.
+When @code{basenames-may-differ} is set to @code{true},
+@value{GDBN} will always canonicalize file names before
+comparing them, thus ensuring that a file known by multiple
+base names is treated as the same file.
+
+@table @code
+@item set basenames-may-differ
+@kindex set basenames-may-differ
+Set whether a source file may have multiple base names.
+
+@item show basenames-may-differ
+@kindex show basenames-may-differ
+Show whether a source file may have multiple base names.
+@end table
@node Separate Debug Files
@section Debugging Information in Separate Files