Bug 26043

Summary: eu-addr2line debuginfo-path option with relative path doesn't find file:line information
Product: elfutils
Reporter: devel.origin
Component: tools
Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---
Severity: normal
CC: elfutils-devel, fche, mark
Priority: P2
Version: unspecified
Target Milestone: ---
Host:
Target:
Build:
Last reconfirmed:
Attachments: Test that launches eu-addr2line with relative and absolute debuginfo-path

Description devel.origin 2020-05-26 09:37:26 UTC
$ eu-addr2line -x 0x7fecb0892b18 --debuginfo-path=../../demo/build --pid=18030
_ZN6nsdemo13forEachThreadEPFviPvES0_+0x1ce (.text)
??:0
$ eu-addr2line -x 0x7fecb0892b18 --debuginfo-path="$(realpath ../../demo/build)" --pid=18030
_ZN6nsdemo13forEachThreadEPFviPvES0_+0x1ce (.text)
/home/devel/demo/src/ForEachThread.cpp:40
$
Comment 1 Mark Wielaard 2020-05-26 10:11:46 UTC
Could you show what is under the demo/build directory?
I am trying to understand where the DWARF debug data is.
Is it in the main binaries, in separate .debug files in the directory of the main binary, or in a separate file in a separate directory? Are the main binaries using build-ids and/or .gnu_debuglink sections?

--debuginfo-path (for utilities using dwfl_standard_argp) will set the debuginfo_path of the Dwfl_Callbacks. This is then used to find debug files.

The way these paths are used is as follows (from libdwfl.h):

/* These standard find_elf and find_debuginfo callbacks are
   controlled by a string specifying directories to look in.
   If `debuginfo_path' is set in the Dwfl_Callbacks structure
   and the char * it points to is not null, that supplies the
   string.  Otherwise a default path is used.

   If the first character of the string is + or - that enables or
   disables CRC32 checksum validation when it's necessary.  The
   remainder of the string is composed of elements separated by
   colons.  Each element can start with + or - to override the
   global checksum behavior.  This flag is never relevant when
   working with build IDs, but it's always parsed in the path
   string.  The remainder of the element indicates a directory.

   Searches by build ID consult only the elements naming absolute
   directory paths.  They look under those directories for a link
   named ".build-id/xx/yy" or ".build-id/xx/yy.debug", where "xxyy"
   is the lower-case hexadecimal representation of the ID bytes.

   In searches for debuginfo by name, if the remainder of the
   element is empty, the directory containing the main file is
   tried; if it's an absolute path name, the absolute directory path
   (and any subdirectory of that path) containing the main file is
   taken as a subdirectory of this path; a relative path name is taken
   as a subdirectory of the directory containing the main file.
   Hence for /usr/bin/ls, the default string ":.debug:/usr/lib/debug"
   says to look in /usr/bin, then /usr/bin/.debug, then the path subdirs
   under /usr/lib/debug, in the order /usr/lib/debug/usr/bin, then
   /usr/lib/debug/bin, and finally /usr/lib/debug, for the file name in
   the .gnu_debuglink section (or "ls.debug" if none was found).  */
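
As a rough illustration of the by-name search order described in this comment, here is a minimal shell sketch. It is not the libdwfl implementation; the function name debuginfo_name_dirs is hypothetical, and it assumes the main file is given as an absolute path inside some directory.

```shell
# Illustration only (not libdwfl code): list the directories a by-name
# debuginfo search would try, following the libdwfl.h comment.
# Assumes $1 is an absolute path to the main file.
debuginfo_name_dirs() {
    main_dir=$(dirname "$1")      # directory containing the main file
    path=$2
    # A leading + or - on the whole string only sets the checksum flag.
    case $path in [+-]*) path=${path#?} ;; esac
    rest=$path:
    while [ -n "$rest" ]; do
        elem=${rest%%:*}          # next colon-separated element
        rest=${rest#*:}
        # A per-element + or - only overrides the checksum flag.
        case $elem in [+-]*) elem=${elem#?} ;; esac
        case $elem in
            '') # empty element: the main file's own directory
                printf '%s\n' "$main_dir" ;;
            /*) # absolute element: main file's directory, then each
                # shorter trailing part of it, as a subdirectory
                sub=$main_dir
                while :; do
                    printf '%s\n' "$elem$sub"
                    if [ -z "$sub" ]; then break; fi
                    sub=${sub#/}
                    case $sub in
                        */*) sub=/${sub#*/} ;;
                        *)   sub= ;;
                    esac
                done ;;
            *)  # relative element: subdirectory of the main file's directory
                printf '%s/%s\n' "$main_dir" "$elem" ;;
        esac
    done
}

debuginfo_name_dirs /usr/bin/ls ':.debug:/usr/lib/debug'
```

For /usr/bin/ls and the default string this lists /usr/bin, /usr/bin/.debug, /usr/lib/debug/usr/bin, /usr/lib/debug/bin and /usr/lib/debug, in that order, matching the example in the comment.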
Comment 2 devel.origin 2020-05-26 16:23:18 UTC
The debug info is in the .build-id/ dir: ../../demo/build/.build-id/ab/...
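
For reference, the .build-id layout can be sketched as follows. This is only an illustration of the naming scheme quoted from libdwfl.h above; the helper name build_id_links and the example ID are made up and are not taken from the reporter's build.

```shell
# Hypothetical helper, not part of elfutils: print the two link names a
# build-ID search would look for under a given directory.
build_id_links() {
    dir=$1
    # Lower-case the hex string, as the libdwfl.h comment describes.
    id=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    first=$(printf '%s' "$id" | cut -c1-2)   # "xx": first ID byte
    rest=$(printf '%s' "$id" | cut -c3-)     # "yy": remaining ID bytes
    printf '%s/.build-id/%s/%s\n' "$dir" "$first" "$rest"
    printf '%s/.build-id/%s/%s.debug\n' "$dir" "$first" "$rest"
}

# Made-up build ID, for illustration only.
build_id_links /usr/lib/debug ab0123456789cdef
```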

Seems consistent with what the comment says. Should I make a real test example (the current one is just a very good fictional one)?
Comment 3 Frank Ch. Eigler 2020-05-26 17:39:28 UTC
An strace of eu-addr2line would help show where it looked for debuginfo.
BTW another option is to run

$ debuginfod -F ../../demo/build &
$ sleep 10 # give it a bit of time if this is first time running it
$ DEBUGINFOD_URLS=http://localhost:8002/ eu-addr2line ....
Comment 4 Mark Wielaard 2020-05-26 20:11:45 UTC
(In reply to devel.origin from comment #2)
> The debug info is in the .build-id/ dir: ../../demo/build/.build-id/ab/...
> 
> Seems consistent with what the comment says. Should I make a real test
> example (the current is just a very good fictional one)?

If you could create a reproducer that would be great.

I think the issue is that the build-id searches simply skip relative directories. Specifically this in libdwfl/dwfl_build_id_find_elf.c:

      /* Only absolute directory names are useful to us.  */ 
      if (dir[0] != '/') 
        continue; 

While for find-debuginfo.c we have:

          /* A relative path says to try a subdirectory of that name 
             in the main file's directory.  */ 

I believe the idea is that lookups by build-id can be done without having a (main) file, while file-based lookup always starts with a (main) file, so relative paths can be resolved against the main file's directory.

It might make sense to try build-id lookups using the same relative path logic if we do have a (main) file to start with.

BTW. How did you end up with this setup?
Is there a development tool that creates a structure like that?
Comment 5 devel.origin 2020-05-27 07:38:05 UTC
Created attachment 12572 [details]
Test that launches eu-addr2line with relative and absolute debuginfo-path

I've also included the output in the README.md.

The testcase uses a CMake/bash script I wrote to separate debuginfo (so then later it's possible to package it into *-dbg packages, after installation on a target system they normally end up in /usr/lib/debug/.build-id/).

The relative paths come in while unit testing. My pipeline is running unit tests after debuginfo separation but before packaging, so I point it to the debug files in the build directory. It was set up like that almost exactly 4 years ago; I think that maybe relative paths were working then.
Comment 6 Mark Wielaard 2020-06-18 21:57:06 UTC
Thanks for the demo! I have been thinking about what to do about this issue. I think it does make sense to make the empty path and relative dirs work as they do for the debug file lookup. The only thing I am slightly worried about is that it increases the number of lookups with the default setup, where normally for system files the lookup would skip the empty path and .debug path and immediately find (or not) the build-id file under /usr/lib/debug/.

BTW. How does this setup work with gdb? How does gdb find the build-id based files?
Comment 7 devel.origin 2020-06-24 07:07:05 UTC
For that demo the debug files are ending up in a custom directory (the build-id-demo/.build-id/) for the package builder to pick up. So they can't be found by GDB. When packages are installed the debug info is naturally in /usr/lib/debug.

To make GDB work when launching a developer build from the build directory, the build script puts a copy of debug info of all libraries into one .build-id/ on the top level of the build directory (I haven't added this part to the testcase project). In .gdbinit there is an option to look in current dir: "set debug-file-directory /usr/lib/debug:.", so GDB finds debug files if the current dir has a .build-id/ subdir.
Comment 8 Mark Wielaard 2020-06-25 15:45:20 UTC
(In reply to devel.origin from comment #7)
> For that demo the debug files are ending up in a custom directory (the
> build-id-demo/.build-id/) for the package builder to pick up. So they can't
> be found by GDB. When packages are installed the debug info is naturally in
> /usr/lib/debug.
> 
> To make GDB work when launching a developer build from the build
> directory, the build script puts a copy of debug info of all libraries into
> one .build-id/ on the top level of the build directory (I haven't added this
> part to the testcase project). In .gdbinit there is an option to look in
> current dir: "set debug-file-directory /usr/lib/debug:.", so GDB finds debug
> files if the current dir has a .build-id/ subdir.

Thanks. I have a clear picture now.

I think it would make sense to change the search for .build-id based files in the default search from only checking absolute paths to also include relative paths. I am just pondering how to prevent an "explosion" of extra stats/checks. With the default setting it would add 2 extra stat calls (which would normally always fail). Maybe that isn't too bad?
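
The cost being weighed here can be sketched in shell. This is an illustration only, not libdwfl code: the function name build_id_search_dirs and its "absolute"/"relative" modes are hypothetical. The "absolute" mode mimics the skip quoted from dwfl_build_id_find_elf.c, and the "relative" mode assumes empty and relative elements would be resolved against the main file's directory, as in the by-name lookup.

```shell
# Illustration only: directories a build-ID search would consult.
#   $1 = mode ("absolute" = skip non-absolute elements,
#              "relative" = also resolve them against the main file's dir)
#   $2 = absolute path of the main file
#   $3 = debuginfo path string
build_id_search_dirs() {
    mode=$1 main_file=$2 path=$3
    main_dir=$(dirname "$main_file")
    case $path in [+-]*) path=${path#?} ;; esac
    rest=$path:
    while [ -n "$rest" ]; do
        elem=${rest%%:*}
        rest=${rest#*:}
        case $elem in [+-]*) elem=${elem#?} ;; esac
        case $elem in
            /*) printf '%s\n' "$elem" ;;
            '') if [ "$mode" = relative ]; then
                    printf '%s\n' "$main_dir"
                fi ;;
            *)  if [ "$mode" = relative ]; then
                    printf '%s/%s\n' "$main_dir" "$elem"
                fi ;;
        esac
    done
}

# Default path string from libdwfl.h, applied to /usr/bin/ls.
build_id_search_dirs absolute /usr/bin/ls ':.debug:/usr/lib/debug'
build_id_search_dirs relative /usr/bin/ls ':.debug:/usr/lib/debug'
```

With the default string, the first call lists only /usr/lib/debug, while the second lists /usr/bin and /usr/bin/.debug before it, which is where the two extra lookups come from. Passing a single relative element such as ../../demo/build in "absolute" mode yields no directory at all, consistent with the ??:0 result in the description.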
Comment 9 devel.origin 2020-07-04 06:42:53 UTC
Another option is just to change the manpage. Those who write this kind of script are probably reading it. Adding a realpath/readlink to a script is a one-time job for them.
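
A minimal sketch of that script-side workaround, assuming the realpath utility from coreutils is available. The helper name debuginfo_path_opt is hypothetical; it only wraps the realpath call already shown in the description.

```shell
# Hypothetical helper: turn a directory given relative to the current
# working directory into an absolute --debuginfo-path option.
debuginfo_path_opt() {
    printf '%s\n' "--debuginfo-path=$(realpath "$1")"
}

# Usage, with the address and PID from the description:
#   eu-addr2line -x 0x7fecb0892b18 "$(debuginfo_path_opt ../../demo/build)" --pid=18030
```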