Range lists, zero-length functions, linker gc

Fri Jun 19 12:00:27 GMT 2020

Hi,

On Tue, 2020-06-02 at 11:06 -0700, David Blaikie via Elfutils-devel wrote:
> > I do think combining Split DWARF and LTO might not be the best
> > solution. When doing LTO you probably want something like GCC Early
> > Debug, which is like Split DWARF, but different, because the Early
> > Debug simply doesn't contain any address (ranges) yet (not even through
> > indirection like .debug_addr).
> 
> I don't think Early Debug fits here - it seems like it was
> specifically for DWARF that doesn't refer to any code (eg: function
> declarations and type definitions). I don't see how it could be used
> for the actual address-referencing DWARF needed to describe function
> definitions.

I think that is kind of the point of Early Debug. Only use DWARF (at
first) for address/range-less data like types and program scope
entries, but don't emit anything (in DWARF format) for things that
might need adjustments during the link/LTO phase. The problem with using
DWARF with address (ranges) during early object creation is that the
linker isn't capable of rewriting the DWARF. You'll need a linker plugin
that calls back into the compiler to do the actual LTO and emit the
actual DWARF containing address/ranges (which can then link back to the
DWARF types/program scope/etc. already emitted during the Early Debug
phase). I think the issue you are describing is actually that you
use DWARF to describe function definitions (not just the declarations)
too early. If you aren't sure yet which addresses will be used, DWARF
isn't really the appropriate (temporary) debug format.

> > > > > & again the overhead of all those separate contributions, headers,
> > > > > etc, turns out to be not very desirable in any case.
> > > > 
> > > > Yes, I agree with that. But as said earlier, maybe the compiler
> > > > shouldn't have generated the code/data in the first place?
> > > 
> > > In the (especially) C++ compilation model, I don't believe that's
> > > possible - inline functions, templates, etc, require duplication -
> > > unless you have a more complicated build process that can gather the
> > > potential duplication, then fan back out again to compile, etc.
> > > ThinLTO does some of this - at a cost of a more complicated build
> > > system, etc.
> > 
> > It might be useful for the original discussion to have a few more
> > concrete examples to show when you might have unused code that the
> > linker might want to discard, but where the compiler could only produce
> > DWARF in one big blob. Apart from the -ffunction-sections case,
> 
> Function sections, inline functions, function templates are core examples.

I understand the function sections case, but can you give actual
examples of an inline function or function template source code and how
a DWARF producer generates DWARF for that? Maybe some simple source
code we can put through gcc or clang to see how they (mis)handle it.
Not being a compiler architect, I am not sure I understand why those
cannot be expressed correctly.
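
As a possible starting point, here is a minimal sketch of such a test
case. All names in it are made up for illustration and do not come from
this thread, and it claims nothing about what gcc or clang actually
emit for it. It only combines an inline function, a function template,
and a function that nothing calls.

```cpp
// Hypothetical test case; every name here is illustrative only.

// An inline function: each translation unit that uses it may emit its
// own copy, and the linker is expected to keep only one of them.
inline int twice(int x) { return 2 * x; }

// A function template: each instantiation may likewise be emitted in
// every translation unit that needs it and deduplicated at link time.
template <typename T>
T add_one(T v) { return v + 1; }

// A function that nothing in this file calls. With -ffunction-sections
// and linker garbage collection its code may be discarded.
int unused_helper(int x) { return x - 1; }

// Uses the inline function and instantiates add_one<int>.
int use_them() { return twice(20) + add_one(1); }
```

Assuming this is saved as a hypothetical test.cc, compiling it with
something like "g++ -g -ffunction-sections -c test.cc" and inspecting
the object with "eu-readelf --debug-dump=info test.o" should show how
the producer lays out the DWARF for each function.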

> > where I
> > would argue the compiler simply needs to make sure that if it generates
> > code in separate sections it also should create the DWARF separate
> > section (groups).
> 
> I don't think that's practical - the overhead, I believe, is too high.
> Headers for each section contribution (ELF headers but DWARF headers
> moreso - having a separate .debug_addr, .debug_line, etc section for
> each function would be very expensive) would make for very large
> object files.

I see your point, but then maybe this shouldn't be handled by the
linker itself. Instead, a linker plugin could let the compiler fix up
the DWARF (or generate it later).

Cheers,

Mark