This is the mail archive of the
elfutils-devel@sourceware.org
mailing list for the elfutils project.
Re: [patch 2/2] Fix loading core files without build-ids
- From: Mark Wielaard <mjw at redhat dot com>
- To: elfutils-devel at lists dot fedorahosted dot org
- Date: Sat, 18 May 2013 00:00:58 +0200
- Subject: Re: [patch 2/2] Fix loading core files without build-ids
Hi Jan,
Haven't gotten completely through the patches yet, but here is some
early feedback.
On Fri, 2013-04-26 at 20:57 +0200, Jan Kratochvil wrote:
> With my former patches I improved some cases and regressed different cases, so
> with this patch it should behave better in general - but not in 100% cases.
> The former code using ELF headers from segments could find more modules but
> their proper address was only a luck, sometimes wrong, as proven in the mail
> above ELF.
>
> "Fix loading core files without build-ids" - with build-id header the ELF
> headers from segments method almost always works, it has more wrong results
> when build-ids are missing, therefore even the ELF headers are missing.
BTW. How often does it happen that build-id are missing in core files?
Is that just when a distro "misconfigures" the kernel dumping settings?
We should of course handle it correctly anyway, just wondering.
> elfutils was always fragile by searching for ELF headers in core files
> segments first
> /* Now sniff segment contents for modules. */
> and only then trying to adjust them from libc link_map list.
> /* Next, we should follow the chain from DT_DEBUG. */
> +
> /* If content-sniffing already reported a module covering
> the same area, find that existing module to adjust.
> The l_ld address is the only one we know for sure
> to be within the module's own segments (its .dynamic). */
>
> But that adjustment currently does not work as it uses
> Dwfl_Module *mod = INTUSE(dwfl_addrmodule) (dwfl, l_ld);
> but if the module is placed incorrectly one cannot find the proper module by
> its address -- by its placement.
>
> IMO one should follow link_map first and fall back to the ELF headers only if
> it fails.
Yes, that order makes sense to me.
> Not sure if one should try to map address-non-conflicting files from ELF
> headers, this patch does not do it. (That could better fit case (2) below.)
>
> There are IMO two basic use cases with different pros/cons:
>
> (1) Map address -> symbol: eu-addr2line -S --core=xxx 0xaddress
> Exact modules placement is needed and it is better to report no symbol
> than a wrong symbol.
Agreed.
> (2) I want to know all build-ids: eu-unstrip -n --core=xxx
> Modules placement does not matter, excessive output does not matter,
> I need to only install needed debuginfos.
Thanks for this example. Now I finally understand why the current code
does the "sniffing". The build-ids found could be interesting, but we
don't know for sure in this case.
> The testcase run-unstrip-n.sh had to be updated:
> (a) It was currently reporting filenames like /lib/libc.so.6 but those were
> a false match, the system files do not match build-ids from the core file,
> therefore there should be no successful match to system files.
> (b) Current code outputs [exe] and [vdso] in slightly different order.
The order shouldn't really matter, so that part of the change is fine.
But now you don't know the name at all? That is a pity since they are a
good hint. Would be nice if we could provide them in some way as hint as
you see with /proc/self/exe -> /bin/foo (deleted), or by putting them
between [brackets].
> commit e88be1bac6665f8905ef1cb6e3003267f92e912d
> Author: Jan Kratochvil <jan.kratochvil@redhat.com>
> Date: Fri Apr 26 20:27:10 2013 +0200
>
> Use DT_DEBUG library search first.
>
> libdwfl/
> 2013-04-26 Jan Kratochvil <jan.kratochvil@redhat.com>
>
> * argp-std.c (parse_opt) <ARGP_KEY_SUCCESS> <opt->core> <opt->e>: Set
> executable_for_core before calling dwfl_core_file_report.
This makes sense to me.
> * core-file.c (dwfl_core_file_report): Move raw segments reporting
> lower. New variables vdso_bias, vdso_l_ld and exec_vaddr. Call also
> dwfl_segment_report_module for EXEC_VADDR and VDSO_L_LD. Call
> dwfl_segment_report_module for raw segments only if LISTED is zero.
Is the FIXME: Use vdso_bias. just informational, or were you going to do
that? So you don't sniff at all if dwfl_link_map_report () found
something. Would it hurt at this point to sniff? It is a pity
dwfl_segment_report_module () doesn't have documentation. I am still
staring at it to fully understand what it does.
> * dwfl_module_build_id.c (check_notes): Move into
> __libdwfl_find_elf_build_id.
> (__libdwfl_find_build_id): Rename to ...
> (__libdwfl_find_elf_build_id): ... here. Add parameters build_id_bits,
> build_id_elfaddr and build_id_len. Verify MOD vs. ELF.
> (__libdwfl_find_elf_build_id) (check_notes): Remove parameters mod and
> set, rename data_vaddr to data_elfaddr. Do not call found_build_id.
> (__libdwfl_find_elf_build_id): Update the check_notes caller, do not
> adjust its data_elfaddr parameter.
> (__libdwfl_find_build_id): New wrapper of __libdwfl_find_elf_build_id.
> * libdwflP.h (__libdwfl_find_elf_build_id): New declaration.
OK, as written here the moving around confused me a little. But reading
the code it looks correct. Since __libdwfl_find_build_id () adds the
main bias afterwards, you don't adjusting vaddr in the other places
anymore.
> (dwfl_link_map_report): Add parameters vdso_bias, vdso_l_ld and
> exec_vaddr.
> * link_map.c: Include system.h and fcntl.h.
> (report_r_debug): Add parameters vdso_bias, vdso_l_ld and exec_vaddr,
> describe them in the function comment. Delete dwfl_addrmodule call and
> its dependent code. Verify build-id before calling dwfl_report_elf,
> also supply executable_for_core to it and skip vDSO. Report vdso_bias
> and vdso_l_ld when found. Clear exec_vaddr when found.
> (dwfl_link_map_report): Add parameters vdso_bias, vdso_l_ld and
> exec_vaddr.
> (dwfl_link_map_report) (consider_phdr): Add parameter offset. Set
> exec_vaddr when found.
> (dwfl_link_map_report): New variable in_ok. Try to read IN from
> EXECUTABLE_FOR_CORE. Update consider_phdr caller parameters. Update
> report_r_debug caller parameters.
I still need to go through and understand the above.
Cheers,
Mark