This is the mail archive of the
dwarf2@corp.sgi.com
mailing list for the dwarf2 project.
Re: Modifies vs. Replaces
- To: DWARF2 at corp dot sgi dot com
- Subject: Re: Modifies vs. Replaces
- From: Jim Dehnert <dehnert at transmeta dot com>
- Date: Tue, 27 Mar 2001 00:40:08 -0800
- Organization: Transmeta Corp.
- References: <01032615570825@gemevn.zko.dec.com>
- Reply-To: Jim Dehnert <dehnert at transmeta dot com>
In response to Todd's request to hear opinions from others on this subject
(as extended), I'll offer mine. It's based on what I know the SGI
compilers do, and on wanting to see a definition that they can use to
efficiently describe it in DWARF. By all means correct me if I've
misinterpreted things. I will try to describe my assumptions carefully,
since I think there's been a lot of confusion in this thread due to
differing assumptions about what is being described.
The case in point, as I understand it, is (in C/C++ terms):
file a.c:
    int g;            // Defining instance of global g
file b.c:
    extern int g;     // Non-defining instance of g
    void f (void)
    {
        // code referencing g, including modifications
    }
Now, as I understand the discussion, the current spec says that the defining
instance, in a.c, will have location information and the non-defining one,
in b.c, will not. The problem that has been raised is that, inside f, it
may be necessary to specify a location (in a register) that a.c could not have
known about. This requires location information associated with the non-
defining instance of g. The question is: how should the location information
in a.c and b.c interact?
First an observation: Even with the most trivial compilers, on most
architectures, there will be one or more instruction addresses in function f,
after a new value of g has been computed in a register, at which the memory
location does not contain the real value of g because it hasn't been stored
yet. If the user manages to stop a debugging session at such a point and
reads the memory location, he'll get the wrong value. If he attempts to set
a new value of g in the memory location, it will be lost as soon as he
proceeds past the store.
Now in simple compiler implementations, this isn't too bad. It's a short
address range, and it probably lies between the statement boundaries where
users usually put their breakpoints. I think almost no compiler/debugger
implementations try to deal with this, and they get away with it -- the user
who does assembly-level breakpointing just needs to be careful.
Some compilers, however, at least when optimizing, make this problem much
more serious. The SGI compiler, for instance, will allocate g to a register
for extended ranges in the execution of f if it's referenced a lot, loading
it into the register at the beginning of the range, reading and writing the
register instead of memory in the range, and writing the register back to
memory at the end. So even at statement boundaries, the location info
associated with the defining instance in a.c is completely invalid --
references to memory will get garbage, and assignments to memory will be
lost.
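To make the register-allocation case concrete, here is a rough hand-written
C sketch of what such an optimization might amount to. It is not actual SGI
compiler output: the local r stands in for the register, the arithmetic is
arbitrary, and seen_in_memory_mid_range is a hypothetical probe recording
what a debugger would read from g's memory location in the middle of the
range.

```c
int g = 5;                      /* stands in for the defining instance in a.c */
int seen_in_memory_mid_range;   /* hypothetical probe, for illustration only */

void f(void)
{
    int r = g;          /* start of range: load g into the "register" */

    r = r * 2 + 1;      /* all reads and writes of g go to r; the current
                           value of g now exists only in r */

    /* A debugger trusting the memory location from a.c would read the
       stale value here, and anything it stored to memory would be
       overwritten by the write-back below. */
    seen_in_memory_mid_range = g;

    g = r;              /* end of range: write the register back */
}
```

Between the load and the write-back, only a location description attached to
the non-defining instance in b.c could point the debugger at the right place.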
This is a common situation (for SGI), and my primary concern for this issue
going forward is that it should be efficiently describable. Describing it
at all requires invalidating the location info associated with the defining
instance -- the debugger cannot use it correctly in the range in
question. I believe that this implies, unavoidably, that for the address
ranges provided by the non-defining instance's location info, that location
info must _replace_ the defining instance's location -- it cannot be correct
to supplement it. (I'm assuming here, as Ron did, that the defining instance
for a global will specify a universal address range, because it can hardly
do anything else.) This position corresponds to #1 in the list of alternatives
we've seen:
1) Ignore location information in the defining entry, and
only use the location information provided in the non-defining
entry [within the scope of the non-defining entity].
I don't believe #2 can describe the situation I've outlined above, and #3 is
useless.
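The "replace" reading can be sketched from the consumer's side roughly as
follows. This is a simplified model under stated assumptions, not real DWARF
data structures or any debugger's API: loc_range, loc_list and
resolve_location are hypothetical names, and falling back to the defining
instance outside the non-defining entry's ranges is one reading of the
position above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified location model (not actual DWARF encodings). */
typedef enum { LOC_MEMORY, LOC_REGISTER } loc_kind;

typedef struct {
    uintptr_t low_pc, high_pc;  /* half-open range [low_pc, high_pc) */
    loc_kind  kind;
    uint64_t  where;            /* memory address or register number */
} loc_range;

typedef struct {
    const loc_range *ranges;
    size_t           count;
} loc_list;

/* Pick the location of a global at a given pc.  Inside any range supplied
   by the non-defining instance, that range replaces the defining
   instance's location; it does not supplement it. */
const loc_range *resolve_location(const loc_list *non_defining,
                                  const loc_range *defining,
                                  uintptr_t pc)
{
    for (size_t i = 0; i < non_defining->count; i++) {
        const loc_range *r = &non_defining->ranges[i];
        if (pc >= r->low_pc && pc < r->high_pc)
            return r;           /* non-defining info wins here */
    }
    return defining;            /* assumed fallback outside those ranges */
}
```

With a defining range covering all addresses and one register range for f,
a pc inside the register range resolves to the register and any other pc
resolves to memory.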
Before closing, I want to address the philosophical point that's been raised.
I too agree that it is entirely appropriate for Dwarf to simply provide well-
defined mechanisms for describing a program, and leave it to the producer to
define a binding that uses particular mechanisms to describe various
language features and compilation mappings. In theory at least, that will
allow any compliant consumer to draw correct conclusions from the Dwarf.
However, the phrase "well-defined mechanisms" is critical here. There are
times when it makes sense to leave some flexibility where architectures
differ in their requirements. But leaving deliberate ambiguity in the
definition of what a construct means, when not driven by the need to deal
with different underlying architectures, is highly undesirable -- it
undermines the primary objective of a standard. Note that I'm NOT talking
about specifying how the client (debugger) behaves -- I'm talking about what
the producer implies when it uses a construct. I think that's the situation
we have here.
Jim
--
- Jim Dehnert dehnert@transmeta.com
(408)919-6984 dehnertj@acm.org