Problems with dwarf-getmacros test

Mark Wielaard
Tue May 9 15:11:00 GMT 2017

On Mon, 2017-05-08 at 18:22 +0200, Ulf Hermann wrote:
> I frequently get failures from the test, on 
> testfile-macros:0xb. The test repeatedly outputs "(null)" instead of the 
> actual macros and then runs into the assert at dwarf-getmacros.c:50. The 
> failure is very nondeterministic, though. I haven't found a reliable way 
> to trigger it.

Is it only with testfile-macros?
All other testfiles always run correctly?

> Further examination reveals that the __libdw_in_section check in 
> READ_AND_RELOCATE (libdwP.h:656), when called from __libdw_read_offset 
> seems to be bogus. The "return -1" in there is what produces the null 
> results and ultimately the assert.

Do you have the whole call stack of that failed __libdw_read_offset
call? Which source line in tests/dwarf-getmacros.c prints the "(null)"?

> Experiments show that the address is 
> frequently not in the section we're checking there, but still valid. 
> Just dropping the check makes the test succeed.

I think this might be related to our "fake" CU and attributes we invent
in libdw/dwarf_getmacros.c (read_macros). See around this comment:

          /* We pretend this is a DW_AT_GNU_macros attribute so that
             DW_FORM_sec_offset forms get correctly interpreted as
             offset into .debug_macro.  */

If that is the issue then we might need to somehow make
READ_AND_RELOCATE and/or __libdw_in_section aware that the CU is fake
and the check isn't needed. In which case we probably need to add some
flag "fake" to the CU and pass the CU to the various __libdw_read_*
functions to disable that sanity check in READ_AND_RELOCATE.
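
To make that idea a bit more concrete, here is a minimal sketch. All
names in it (sketch_cu, sketch_section, sketch_in_section,
sketch_check_read) are hypothetical stand-ins, not the real libdw types
or functions, and the "fake" field is an assumption about what such a
flag could look like. It only shows the shape of the change: the bounds
check is skipped when the CU is marked fake.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a section's data, not the real libdw type.  */
struct sketch_section
{
  const unsigned char *start;
  size_t size;
};

/* Hypothetical stand-in for a CU.  The `fake' flag is the assumed new
   field, set when the library invented the CU itself.  */
struct sketch_cu
{
  bool fake;
};

/* Return true if [addr, addr + width) lies inside SEC.  */
static bool
sketch_in_section (const struct sketch_section *sec,
                   const unsigned char *addr, size_t width)
{
  if (sec->start == NULL || addr < sec->start)
    return false;
  size_t offset = (size_t) (addr - sec->start);
  return offset <= sec->size && width <= sec->size - offset;
}

/* Sanity check before reading WIDTH bytes at ADDR.  For a fake CU the
   section check is skipped, which is the behaviour proposed above.  */
static bool
sketch_check_read (const struct sketch_cu *cu,
                   const struct sketch_section *sec,
                   const unsigned char *addr, size_t width)
{
  if (cu != NULL && cu->fake)
    return true;
  return sketch_in_section (sec, addr, width);
}
```

Whether skipping the check entirely is acceptable, or whether a fake CU
should be checked against a different section instead, is left open
here.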

> I'm currently at a loss about why this happens. One thing that strikes 
> me was that the additional dbg_ret mechanism was added in 2012 with
> commit 775375e3, but the check in READ_AND_RELOCATE was not adapted then.

The READ_AND_RELOCATE macro is hard to read because it captures the
names of some of the variables it uses instead of getting them passed as
arguments. It took me a couple of passes to double check that what it
does seems correct. The check in READ_AND_RELOCATE is against (dbg,
sec_index, addr): it checks that the address is inside the section for
the dbg Dwarf. The (dbg_ret, sec_ret, *ret) arguments of
__libdw_read_offset, on the other hand, describe the returned value in
the returned section in the returned Dwarf.
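
As an illustration of why a name-capturing macro is harder to follow
than a function, here is a small sketch. It is not the real
READ_AND_RELOCATE. The macro SKETCH_CHECK_CAPTURED and the functions
sketch_check_explicit and sketch_caller are hypothetical, as are the
variable names sec_start, sec_size and addr. The macro silently relies
on the caller having locals with exactly those names, while the
function takes everything as arguments.

```c
#include <stddef.h>

/* Hypothetical macro in the capturing style: it uses variables named
   `sec_start', `sec_size' and `addr' from the caller's scope without
   receiving them as arguments.  Only WIDTH is passed in.  */
#define SKETCH_CHECK_CAPTURED(width) \
  ((addr) >= (sec_start) \
   && (size_t) ((addr) - (sec_start)) <= (sec_size) \
   && (width) <= (sec_size) - (size_t) ((addr) - (sec_start)))

/* The same bounds check with every input passed explicitly.  */
static int
sketch_check_explicit (const unsigned char *sec_start, size_t sec_size,
                       const unsigned char *addr, size_t width)
{
  if (addr < sec_start)
    return 0;
  size_t offset = (size_t) (addr - sec_start);
  return offset <= sec_size && width <= sec_size - offset;
}

/* A caller of the capturing macro.  It only compiles because the
   parameter names match the names the macro expects.  */
static int
sketch_caller (const unsigned char *sec_start, size_t sec_size,
               const unsigned char *addr)
{
  return SKETCH_CHECK_CAPTURED (4);
}
```

In the macro version you have to read the macro body to find out which
section and which address are being checked. In the function version
that is visible at the call site.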

> However, the address is also not necessarily in dbg_ret at that point. 
> Checking dbg_ret in addition to dbg still fails sometimes, and also that 
> wouldn't explain the nondeterminism.

The nondeterminism is weird indeed.


