This is the mail archive of the dwarf2@corp.sgi.com mailing list for the dwarf2 project.
Re: Modifies vs. Replaces
- To: brender at gemevn dot zko dot dec dot com
- Subject: Re: Modifies vs. Replaces
- From: todd dot allen at ccur dot com (Todd Allen)
- Date: Wed, 28 Mar 2001 10:58:58 -0700 (MST)
- Cc: dwarf2 at corp dot sgi dot com (dwarf2)
- Reply-To: todd dot allen at ccur dot com (Todd Allen)
>
> OK, suppose a debugger does that. Now when the debugger calls
> doprint the value of k is correct in its global home location, but
> the values of j and l are still only correct in the registers,
> not in memory, so the execution of doprint may still be "wrong"
> or misleading.
>
> Merely updating all the copies of k does not solve the whole
> problem. Somehow the debugger needs to know how to update all
> the global variables that doprint might possibly (or at least actually)
> reference to their memory locations. Moreover, this is a transitive
> requirement that must handle anything that doprint might (or
> actually) calls.
>
> Further, the relevant values are not necessarily all in the current
> routine--every routine in the stack needs to be considered. And
> if there are multiple threads in the same process, then every
> stack in every thread...
>
> Moreover, updating k is the easy case in that the new value of
> k is readily in hand in the debugger. Updating all the other
> variables leads to the requirement for a rule along the lines
> suggested by Jim in his follow-on mail so that the debugger
> unambiguously knows which of the several locations associated
> with a variable should be used to update the others.
>
If you ignore multiple threads and local copies of globals in outer frames
for a moment (I'll get back to outer frames, at least), then a possible
implementation to deal with this would be:
   At the time a debugger needs to call a subroutine, it should examine all
   global variables with live local copies.  In the absence of
   interprocedural analysis, it should just be those with local copies in the
   current frame whose lifetimes cover the point at which the debugger is
   stopped.  For each such global variable:
      Let L be the variable's set of *accurate* locations at the point where
      the debugger is stopped.  By accurate, I mean a location that can be
      read to get an up-to-date value of the variable.  What this means
      depends on the definition of the local copy's location list, which I'll
      discuss below.
      Let E be the variable's set of locations at the entry point to the
      called subroutine.
      If L is not a superset of E, then
         Arbitrarily select any location in L, and call the selection l.
         The choice doesn't matter because the accuracy requirement
         indicates that all the locations in L must have the same value.
         Let W be (E - L).
         For each location w in W:
            copy l to w.
            (If you're worried about a compiler which reuses global
            locations, then to be truly safe, you'd want to save the value at
            w before copying l to it.  And then you'd want to restore each w
            after the subroutine call.)
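
The steps above can be sketched in Python. This is only an illustration under stated assumptions, not any debugger's actual implementation: a location is modeled as a plain string, target registers and memory are modeled as one dictionary named store, and the function names sync_before_call and restore_after_call are hypothetical. The sketch always saves the old contents of each w, which corresponds to the optional "truly safe" variant.

```python
from typing import Dict, Set

# Hypothetical model: a location is a string such as "r3" or "mem:k",
# and `store` stands in for the target's registers and memory.
Location = str


def sync_before_call(accurate: Set[Location],
                     at_entry: Set[Location],
                     store: Dict[Location, int]) -> Dict[Location, int]:
    """Make every location the callee may read hold the current value.

    accurate -- L: locations that are up to date where the debugger is stopped
    at_entry -- E: the variable's locations at the callee's entry point
    Returns the overwritten values so they can be restored after the call.
    """
    saved: Dict[Location, int] = {}
    if accurate >= at_entry:            # L is a superset of E: nothing to do
        return saved
    if not accurate:
        raise ValueError("no accurate location to copy from")
    source = next(iter(accurate))       # arbitrary l; all of L holds one value
    for w in at_entry - accurate:       # W = E - L
        saved[w] = store[w]             # remember the old contents of w
        store[w] = store[source]        # copy l to w
    return saved


def restore_after_call(saved: Dict[Location, int],
                       store: Dict[Location, int]) -> None:
    """Put back whatever sync_before_call overwrote."""
    store.update(saved)
```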
With a solution like this in place, writing changes to global variables both
to their original in-memory locations and to their local copies becomes
unnecessary, as Ron pointed out.
With interprocedural analysis added, it should be enough to examine all
outer frames for additional local copies of globals, and use a similar
algorithm on them.  The only difference would be the "point where the
debugger is stopped", which I think would need to be replaced with the "point
in the subroutine containing the local copy where the debugger branched to an
inner frame".
   ASIDE: With interprocedural analysis, the issue of having to search outer
          frames for local copies isn't isolated just to function calls.  In
          general, it has to be done for global variable references and
          modifications, too.
Consider how each of the proposed possible meanings for local copy locations
supports this:
   #1: Local copy locations augment the location of their globals.
       There's nothing definitive in the standard to say what locations
       should be in L.  A good heuristic would be to use all the locations in
       the local copy, but it's just a heuristic.
   #2: Local copy locations augment the location of their globals, and a
       "primary for reference" location is defined for each pc.  I am
       assuming that it's one of the locations in the local copy for any
       pc covered by pc ranges in the local copy.
       In this case, the debugger can use as L the set containing the
       singleton "primary for reference".  This is pessimistic because the
       real L could be larger than that.  Stated differently, the DWARF
       doesn't distinguish between the case where only the local copy is
       accurate (i.e. where L really does contain just the local copy), and
       where both the local copy and the global in-memory copy are accurate
       (i.e. where L really contains both locations).  This pessimism means
       that the debugger might perform more l to w copies than truly were
       necessary.
   #3: Local copy locations replace the location of their globals.
       In this case, the debugger knows L as precisely as the compiler can
       describe it.  It's the local copy's set of locations that are live at
       the point where it's stopped.  That might include the global in-memory
       location or not, as indicated by the compiler.
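
To make the difference between the three meanings concrete, here is a minimal Python sketch of how a debugger might derive L in each case. Everything in it is an assumption made for illustration: the Entry type, the string location names, the half-open pc ranges, and the idea that overlapping entries stand for simultaneously accurate locations. It does not reproduce any DWARF encoding. Under #1 the result is only the heuristic described above, under #2 it is the singleton "primary for reference", and under #3 it is whatever the compiler listed for the pc.

```python
from typing import List, NamedTuple, Optional, Set


class Entry(NamedTuple):
    """Hypothetical location-list entry for a local copy."""
    low_pc: int     # first pc covered (inclusive)
    high_pc: int    # first pc past the range (exclusive)
    location: str   # e.g. "r3" or "mem:k"


def live_locations(entries: List[Entry], pc: int) -> Set[str]:
    """All locations whose pc range covers the given pc."""
    return {e.location for e in entries if e.low_pc <= pc < e.high_pc}


def accurate_set(meaning: int, entries: List[Entry], pc: int,
                 primary: Optional[str] = None) -> Set[str]:
    """Derive L under proposed meaning #1, #2 or #3."""
    if meaning == 2:
        if primary is None:
            raise ValueError("meaning #2 needs a primary-for-reference location")
        return {primary}                  # pessimistic: real L may be larger
    if meaning in (1, 3):
        # Same computation; under #1 it is a heuristic, under #3 it is
        # exactly what the compiler described.
        return live_locations(entries, pc)
    raise ValueError("meaning must be 1, 2 or 3")


def copies_needed(accurate: Set[str], at_entry: Set[str]) -> Set[str]:
    """W = E - L: the locations that must be written before the call."""
    return at_entry - accurate
```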
I don't know if a debugger making redundant copies bothers anyone else. In
our debugger, for calls that will be made frequently (e.g. those called as
part of the condition on a conditional breakpoint), we implement them by
patching them into the running program, so that they can be executed more
quickly than they would if debugger interaction were required. Any redundant
copies will take execution time. Our customers' programs often have
real-time scheduling constraints, so we go to a lot of effort to reduce the
impact on the run-time performance of the executable being debugged, even if
that means extra hassle for the debugger. The additional execution time for
the redundant copies could be a problem.
So my personal bias leads me to think solution #3 is the best. It provides
enough information for the debugger to determine the most accurate set of
local copies which need to be copied back. The cost is that location lists
are required for the local copies.
Solution #2 also is pretty good. It's more compact than #3, and it provides
enough information that a debugger can determine a superset of local copies
which need to be copied back. The debugger might end up making some
redundant copies, but at least it covers all the necessary ones. Also, it
isn't much of a stretch to add a vendor extension to provide the additional
information about redundancy that #3 allows.
I didn't deal with the issue of multiple threads here. I don't fully grasp
how an optimizer could deal with their asynchronous nature. A sophisticated
compiler could figure out that a thread didn't interact with a global at all.
But once that's known, there isn't much to do to deal with it. How would it
have enough information to know that another thread wouldn't interact with a
global for only a limited time while the current thread made a local copy? I
suppose a very sophisticated optimizer might be able to deduce that another
thread was blocked for a while by code in a currently executing thread. If
that's the sort of thing you're getting at, then I agree that describing it
goes way beyond the capabilities of DWARF. If you had something else in
mind, I would be interested in hearing any elaboration.
--
Todd Allen
Concurrent Computer Corporation