This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/6] Introduce `pre_expanded sals'


On Tuesday 12 April 2011 21:26:32, Tom Tromey wrote:
> >>>>> "Pedro" == Pedro Alves <pedro@codesourcery.com> writes:
> 
> Tom> This is just a way for decode_line_1 to tell the breakpoint code that
> Tom> the returned sals has multiple locations but should still create just a
> Tom> single breakpoint.  We needed this because a given SystemTap probe name
> Tom> may have multiple locations.
> 
> Pedro> Hmm, doesn't sound right.  Conceptually, breakpoint locations are
> Pedro> multiple expansions of the same source location.  Different
> Pedro> source locations are different breakpoints.  E.g, bp_location
> Pedro> doesn't have line number or source file fields.  From the user's
> Pedro> perpective, there's only a single "point" in the source code for
> Pedro> all the multiple locations for a single breakpoint.
> 
> Could you explain why this is important?  I agree this is how it is, but
> I think it is actually somewhat confusing at times.

The main problem (ignoring that frontends aren't prepared to have a 
breakpoint map to more than one source file:line) is that if you allowed
multiple locations (addresses) at different source lines under the same
breakpoint, with the same hierarchy we have today, 

Consider the in-charge, and not-in-charge versions of a constructor
(, or multiple expansions of an inline function, or instantiations
of a template function (about the same thing)).

A::A()
{
}

A::A(int overload)
{
}


Now, do "break A::A".  The user only really cares about source
locations, so he expects two locations, one for each overload.
But, if you allow locations with different line numbers under
the same breakpoint, you'd get something like:

 (gdb) info breakpoints 
 Num     Type           Disp Enb Address            What
 1       breakpoint     keep y   <MULTIPLE>         
 1.1                         y     0x00000000004005d2 in A::A() at foo.cc:33
 1.2                         y     0x00000000004005ec in A::A() at foo.cc:33
 1.3                         y     0x00000000004005d2 in A::A(int) at foo.cc:38
 1.4                         y     0x00000000004005ec in A::A(int) at foo.cc:38


Now, say I didn't want to A::A(int) overload to break afterall.  I now
need to disable two separate locations.  If you think this isn't bad,
consider inline functions, and overloads.

void
inline_func()
{
}

void
inline_func(int overload)
{
}

void
inline_func(bar overload)
{
}

You do "break inline_func".  There are two overloads, so the
user (well, I would) naturaly want to see one entry per
overload under the breakpoint hierarchy, but instead,
since the functions were inlined 200 times each, he gets:

 (gdb) info breakpoints 
 Num     Type           Disp Enb Address            What
 1       breakpoint     keep y   <MULTIPLE>         
 1.1                         y     0x00000000004005d2 in inline_func() at blah
 1.2                         y     0x00000000004005ec in inline_func(int) at blah
 1.1                         y     0x00000000004005d2 in inline_func(bar) at blah
 1.2                         y     0x00000000004005ec in inline_func(int) at blah
 1.3                         y     0x00000000004005ec in inline_func(int) at blah
 1.4                         y     0x00000000004005ec in inline_func(int) at blah
 1.5                         y     0x00000000004005ec in inline_func(int) at blah
 ... 
 1.100                         y     0x00000000004005ec in inline_func(int) at blah
 1.101                         y     0x00000000004005ec in inline_func(int) at blah
 1.102                         y     0x00000000004005ec in inline_func(int) at blah
 1.103                         y     0x00000000004005ec in inline_func(int) at blah
 ...

you get the point.  If a shared library is loaded meanwhile that used
the inline function, you'll end up with even more locations.  But,
good luck if you now decide that you want to the disable only the
inline_func(bar) locations.

So you see, "locations" represent the expansions of a given source
line in GDB.  If you go follow Eli's links, you will see that there
were a lot of different names proposed for the "locations" concept,
and they all resolved around things like "physical addresses".
GDB as is could even hide the multiple locations from users, and
most wouldn't care.  The multiple locations show through more
of the machine/binary/compiled code than most users care about.

This is why I proposed in the response to Jan, that if we want
to group different "_source_ locations" under the same breakpoint,
we implement it as a 3rd hierarchy, so you'd get:

 (gdb) info breakpoints 
 Num     Type           Disp Enb Address            What
 1       breakpoint     keep y   <MULTIPLE>         
 1.1                         y     <MULTIPLE>
 1.1.1                         y     0x00000000004005d2 in inline_func() at blah
 1.1.2                         y     0x0000000000XXXXXX in inline_func() at blah
 1.1.N                         y     0x0000000000YYYYYY in inline_func() at blah

 1.2                         y     <MULTIPLE>
 1.2.1                         y     0x00000000004005ec in inline_func(int) at blah
 1.2.2                         y     0x0000000000XXXXXX in inline_func(int) at blah
 1.2.N                         y     0x0000000000YYYYYY in inline_func(int) at blah

 1.3                         y     <MULTIPLE>
 1.3.1                         y     0x0000000000400781 in inline_func(bar) at blah
 1.3.2                         y     0x0000000000XXXXXX in inline_func(bar) at blah
 1.3.N                         y     0x0000000000YYYYY in inline_func(bar) at blah

etc.  (could probably be made prettier.)

This way, if the user wants to disable inline_func(bar), he can do disable "1.3".

> The problem I see with respect to the SystemTap change is that a given
> probe location does not have a canonical name.
> 
> E.g., suppose you have a probe `program:name' that is invoked at 2
> different spots in the source.  Suppose further that `break
> probe:program:name' sets 2 breakpoints.  Now consider 2 scenarios:
> First, the developer deletes a probe point.  Second, the developer adds
> a probe point.

(3rd, developer notices a typo, and renames the probe at a given
line in the program.)

> With the single-breakpoint approach, both of these do (what I consider
> to be) the right thing -- the user ends up with what he asked for.  I
> don't see how they can work with the multiple-breakpoint approach.

You still have manifestations of the same problem.  Location numbers are
supposed to be stable.  For example, users can enable/disable locations
individually.  Say, you have:

 (gdb) info breakpoints 
 Num     Type           Disp Enb Address            What
 1       breakpoint     keep y   <MULTIPLE>         
 1.1                         y     "program:name" in foo_func() at blah.c:12
 1.2                         n     "program:name" in bar_func() at blah.c:20
 1.3                         y     "program:name" in qux_func() at blah.c:45

If the program changes, say the 1.2 location disappears, you've now
confused GDB --- how will it know that 1.3 is supposed to be enabled,
given that it is now the second location?  (multiple variations of
this issue).

LTTng/UST also has the same issue (marker names are not required to
be unique, though it's advised they are), and we try to heuristically
cope with it as best as possible.  This is
decode_static_tracepoint_spec returning a list of sals
matching the marker spec, the static_trace_marker_id_idx
field in the breakpoint structure.

	      /* Given that its possible to have multiple markers with
		 the same string id, if the user is creating a static
		 tracepoint by marker id ("strace -m MARKER_ID"), then
		 store the sals index, so that breakpoint_re_set can
		 try to match up which of the newly found markers
		 corresponds to this one  */
	      tp->static_trace_marker_id_idx = i;

and see also all the resynching update_static_tracepoint tries
to do.

> I think the deeper confusion in gdb is that an ambiguous name sets a
> single breakpoint, with a single location, but the meaning of this name
> is decided arbitrarily (say, by psymtab expansion order).
> 
> That is, `break file.c:73' sets a breakpoint in the first `file.c' we
> happen to trip across.  Or, `break function' sets a breakpoint in the
> first `function'.
> 
> I'd rather change gdb to set a breakpoint at all matching locations, and
> let the user disambiguate if that is really what he wanted.

See, "bp_location" concerns with address locations while the user
the user is concerned about "source locations".  I think that
if "bp_location" had been named "bp_compiled_address", we wouldn't
be having this discussion.  :-)

> 
> I hit this while debugging gdb itself from time to time -- try `break
> parse_number' and guess where it gets set.

It's not necessarily a bad thing (and not necessarily a good thing
either).   If you have a big project, with a bunch of static
"foo" functions, it's not unreasonable to have "b foo"
break at the "closest" function with that name in your context.

What would "list parse_number" do if "break parse_number" breaks
in all the instances of that static function?

-- 
Pedro Alves


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]