This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [RFA] Don't reset watchpoint block on solib load.
- From: Jim Blandy <jimb at codesourcery dot com>
- To: Vladimir Prus <vladimir at codesourcery dot com>
- Cc: gdb-patches at sources dot redhat dot com
- Date: Tue, 15 Jan 2008 17:28:32 -0800
- Subject: Re: [RFA] Don't reset watchpoint block on solib load.
- References: <200711202013.47537.vladimir@codesourcery.com> <m3abozfg5c.fsf@codesourcery.com> <200711281858.51145.vladimir@codesourcery.com>
(Since it took me a bit to re-discover what this patch was about, I
figure I should recap...)
breakpoint_re_set_one gets called when the symbolic information
available to GDB has changed: we've re-run or attached and the
executable file has changed since we last read it, or the program has
loaded or unloaded a shared library. The breakpoint_re_set_one
function is responsible for updating all references to symbol table
contents in breakpoints, watchpoints, etc.
A watchpoint can refer to local variables that were in scope when it
was set, so it may be associated with a particular frame. However,
breakpoint_re_set_one re-parses watchpoint expressions using
parse_expression, which parses in the scope of whatever frame happens
to selected when it is called. This is very wrong: GDB can call
breakpoint_re_set_one in many different circumstances, and the frame
selected at that time has no necessary relationship to the frame in
which the user set the watchpoint.
In particular, GDB calls breakpoint_re_set_one when it stops for a
shared library load or unload. At this point, the selected frame is
the youngest frame, in the dynamic linker's debug breakpoint function.
Since watchpoints' local variables are almost certainly not in scope
in that function, GDB fails to re-parse watchpoint expressions that
refer to local variables whenever we do a dlopen or dlclose. (And
then fails to handle the error and dies.) For example:
$ cat w1.c
#include <dlfcn.h>
int
main (int argc, char **argv)
{
volatile int g = 1;
g = 2;
void *lib = dlopen ("libm.so", RTLD_NOW);
g = 3;
return g;
}
$ gcc -g w1.c -ldl -o w1
$ ~/gdb/pub/nat/gdb/gdb w1
GNU gdb 6.7.50.20080111-cvs
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb) b 10
Breakpoint 1 at 0x8048403: file w1.c, line 10.
(gdb) run
Starting program: /home/jimb/play/w1
Breakpoint 1, main () at w1.c:10
10 void *lib = dlopen ("libm.so", RTLD_NOW);
(gdb) watch g
Hardware watchpoint 2: g
(gdb) next
Error in re-setting breakpoint 2: No symbol "g" in current context.
Segmentation fault (core dumped)
$
I believe Vlad's most recent patch is this one:
http://sourceware.org/ml/gdb-patches/2007-11/msg00522.html
The patch has two effects:
First, in breakpoint_re_set_one (w), if W->exp_valid_block is NULL
(meaning that the watchpoint refers to no local variables), or if it
is set and GDB can find W->watchpoint_frame on the stack, then it
tries to re-parse the expression in exp_valid_block.
Second, it re-parses watchpoint condition expressions (the Y in 'watch
X if Y') when it inserts breakpoints.
I have two questions about this patch.
- If we dlclose a shared library while there are still frames in that
library's code on the stack, couldn't exp_valid_block be pointing to
garbage (i.e. a 'struct block' in the shared library whose debug
info has now been freed), even though frame_find_by_id can find the
frame?
Granted, the user's program is unhappy, because it's going to return
to code that isn't there any more, but GDB shouldn't crash --- after
all, debuggers are meant for use on unhappy programs.
If we can find the frame, and we can find the frame's block, could
we use that instead? In the perverse case I outlined, I assume we'd
be able to find the frame, but not the block.
Certainly, for watchpoints that cite no local variables, we should
just always try to re-parse. The prior conversation identified some
ugly issues there, too, but that's an existing bug; let's take it
one step at a time.
- Why shouldn't we re-parse the condition at the same time we re-parse
the expression, in the same block? Why should we re-parse the
condition when we insert breakpoints?
Although I agreed to reparsing the condition when we insert
breakpoints earlier in the thread, I can't see now why it's a good
idea. It seems wasteful, since we remove and insert breakpoints
frequently, and since we do remove and re-insert breakpoints on
dlopen/dlclose, it doesn't seem to be a significantly more
auspicious time to do the parse.