This is the mail archive of the dwarf2@corp.sgi.com mailing list for the dwarf2 project.



duplicate dwarf2 reduction via comdat


>>>>> "Ron" == Ron 603-884-2088 <brender@gemevn.zko.dec.com> writes:

> Duplicate Dwarf data deletion (represented here by 991026.3, more recently
> addressed in Dave Anderson's email of 14 November 2000 and discussed
> at the 30 November meeting, although that is not mentioned in the minutes)
> was one of the key reasons that stimulated the formation of this group
> and our revision efforts. I find it most unfortunate that we might
> "finish" without achieving anything on this topic. I'd like to suggest
> that we have some limited amount of brainstorming concerning whether
> we really want to give up or whether we should try to have another
> run at it.  I have some thoughts (obviously not written up) about how
> we might better "enable" a vendor solution, without dictating the whole
> scheme. I'd like the chance to get feedback on whether it is worth
> the effort to try or not.

Early in the process, I talked about the scheme I implemented in GCC
without extending the 2.0 spec.  The code for this scheme is present in the
current CVS sources for the compiler, but is disabled by default, pending
final cleanup work and GDB support for FORM_ref_addr.

I had also proposed at least one extension (AT_extension on
TAG_compilation_unit) to allow my scheme to be more efficient.  As I
recall, that change was rejected, and there was a request for me to write
up my scheme in more detail, but I never did.  So here we go:

(Note: I haven't read Dave's email yet.  I will soon.)

----

The basic idea of my scheme is to separate chunks of potentially duplicated
information out into their own COMDAT sections, to be recombined into the
monolithic .debug_info section by the linker.  No attempt is made to
optimize the other DWARF sections, and we currently do this only for
header files.
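
To illustrate the recombination step, here is a minimal sketch in Python of
what the linker is assumed to do with COMDAT chunks: keep the first chunk
seen for each key and drop later duplicates.  The function name
link_debug_info and the (key, chunk) representation are hypothetical and
exist only for this illustration; a real linker works on sections and
section groups, not Python lists.

```python
def link_debug_info(object_files):
    """Concatenate debug chunks, keeping one copy per COMDAT key.

    Each object file is modelled as a list of (key, chunk) pairs.
    A key of None marks an ordinary (non-COMDAT) chunk, which is
    always kept.  This is a toy model, not real linker behaviour.
    """
    seen = set()
    merged = []
    for obj in object_files:
        for key, chunk in obj:
            if key is not None:
                if key in seen:
                    continue  # duplicate COMDAT chunk: discard
                seen.add(key)
            merged.append(chunk)
    return merged


if __name__ == "__main__":
    a_o = [(None, "CU for a.c"), ("foo.h.1234", "CU for foo.h")]
    b_o = [(None, "CU for b.c"), ("foo.h.1234", "CU for foo.h")]
    print(link_debug_info([a_o, b_o]))
```

With two objects that both include the same header, only one copy of the
header's chunk survives in the merged output.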

Each header file, then, is given its own compilation unit, in addition to
the primary compilation unit corresponding to the primary source file.
References between these compilation units use FORM_ref_addr, of course.

The CU for the header file contains only the "interface" parts of the
header, namely types.  The "implementation" parts, i.e. anything with a
location attribute, remain in the primary CU, since they correspond to
actual code in the object file.
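
The split can be sketched roughly as follows.  This is an illustration in
Python, not the GCC code: DIEs are modelled as plain dictionaries, the
function name split_header_dies is made up, and treating DW_AT_low_pc and
DW_AT_high_pc as "location attributes" alongside DW_AT_location is an
assumption of the sketch.

```python
# Attributes that tie a DIE to actual code or data in the object file.
# Which attributes count here is an assumption made for illustration.
LOCATION_ATTRS = {"DW_AT_location", "DW_AT_low_pc", "DW_AT_high_pc"}


def split_header_dies(dies):
    """Partition DIEs from a header into (header_cu, primary_cu).

    DIEs with no location attribute (e.g. types) go to the header's
    own CU; anything with a location stays in the primary CU.
    """
    header_cu, primary_cu = [], []
    for die in dies:
        if LOCATION_ATTRS & set(die["attrs"]):
            primary_cu.append(die)
        else:
            header_cu.append(die)
    return header_cu, primary_cu


if __name__ == "__main__":
    dies = [
        {"tag": "DW_TAG_structure_type", "attrs": ["DW_AT_name"]},
        {"tag": "DW_TAG_variable", "attrs": ["DW_AT_name", "DW_AT_location"]},
    ]
    hdr, prim = split_header_dies(dies)
    print([d["tag"] for d in hdr], [d["tag"] for d in prim])
```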

For consistency and collision protection, the COMDAT key for a particular
compilation unit is generated from the basename of the header file and a
checksum of the contents.  The (global) symbols used for references to DIEs
in these CUs are composed of this key and a sequence number.  References
from the header CUs to the primary CU use internal symbols; there is no
need for them to be consistent between two CUs for the same header, so long
as they refer to (semantically) the same thing.
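
A rough sketch of the naming scheme, in Python for brevity.  The exact
checksum algorithm and symbol spelling used by the compiler are not given
here, so CRC32 and the dot-separated format below are stand-ins chosen for
the illustration, and the function names comdat_key and die_symbol are
hypothetical.

```python
import os
import zlib


def comdat_key(header_path, cu_contents):
    """Build a key from the header's basename and a content checksum.

    CRC32 is only a stand-in; any checksum that changes when the
    CU contents change would serve the same purpose.
    """
    base = os.path.basename(header_path)
    checksum = zlib.crc32(cu_contents) & 0xFFFFFFFF
    return f"{base}.{checksum:08x}"


def die_symbol(key, seq):
    """Global symbol for the seq-th referenced DIE in the keyed CU."""
    return f"{key}.{seq}"


if __name__ == "__main__":
    key = comdat_key("/usr/include/foo.h", b"struct foo { int x; };")
    print(key, die_symbol(key, 0), die_symbol(key, 1))
```

Because the key depends only on the basename and the contents, two
translation units that see the same header contents through different
include paths produce the same key, and therefore the same DIE symbols,
while a header whose contents differ (say, because of a different set of
#defines) gets a different key and is kept separately.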

There are some issues with the current draft that make this less effective
than it could be.  For one, 3.3.8.3 says that a concrete out-of-line
instance of an inline function needs to be owned by the same parent as the
abstract instance, which prevents us from putting them in different CUs.
Does anyone know what the rationale for this rule is?  It seems entirely
arbitrary to me.

It would also be nice to be able to provide debugging info for an abstract
version of a template, to reduce the redundancy between instantiations.

Also, it should be possible to do the AT_declaration/AT_specification thing
with nested types; if a nested type is only defined in the implementation
.cc file for a class, the compiler should be able to put its definition at
file scope.  This also would remove the necessity for going back and
modifying previously generated information; nested types are the downfall
of the gcc dwarf1 generator, which tries hard to write everything out
immediately and forget about it.

This scheme could be further extended by putting the information for
COMDATted code into a separate CU in the same COMDAT group with the
function itself.

Thoughts?  Questions?

Jason
