Bug 29818

Summary: STAP probe map_failed is no longer generated.
Product: glibc Reporter: Andrew Burgess <aburgess>
Component: dynamic-linkAssignee: Not yet assigned to anyone <unassigned>
Status: WAITING ---    
Severity: normal CC: adhemerval.zanella, fche, lsix, pedro, simon.marchi
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:
Attachments: Remove the map_failed probe.

Description Andrew Burgess 2022-11-22 12:29:59 UTC
Created attachment 14472 [details]
Remove the map_failed probe.

While working on GDB, I noticed that in recent glibc, the map_failed probe was no longer being generated within the dynamic linker.
    
The reason is that the 'map_failed' probe is being optimised out by the compiler after this commit:

  commit ed3ce71f5c64c5f07cbde0ef03554ea8950d8f2c
  Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
  Date:   Thu Nov 11 09:28:21 2021 -0300

      elf: Move la_activity (LA_ACT_ADD) after _dl_add_to_namespace_list() (BZ #28062)

The problem is that the map_failed probe is guarded like this:

	  if (make_consistent && r != NULL)
	    {
	      r->r_state = RT_CONSISTENT;
	      _dl_debug_state ();
	      LIBC_PROBE (map_failed, 2, nsid, r);
	    }

After the above commit the make_consistent variable can never be true when this block of code is reached, as such, the compiler has optimised away this block, including the map_failed probe within it.
    
Both the map_start and map_failed probes are in the function _dl_map_object_from_fd.  Before the above commit it is was possible to reach the map_start probe, and then later in the function, decide that actually, we couldn't map the requested file for some reason.
    
We would then exit _dl_map_object_from_fd, but first we would pass through the map_failed probe, this would allow observers to know that the mapping, identified by the previous map_start probe, had failed.
    
After the above commit, the map_start probe is now only reached at the very end of the function _dl_map_object_from_fd, when all other error checks have already been completed.  As such, there is no longer a need for a map_failed probe.  If an observer sees map_start, then they can know that the mapping will proceed.

This bug is really just me asking for confirmation of the above understanding.  From the GDB side, we were never actually using map_start / map_failed, but we did check for both probes, so the absence of map_failed was causing GDB to not make use of the probes.  I already plan to remove the map_failed check from GDB given there are glibc in the wild without that probe.  But it would be nice to confirm that the probe is now deprecated.

I've attached a patch that removes the map_failed probe, but this is completely untested.
Comment 1 Adhemerval Zanella 2022-11-23 14:06:07 UTC
Patch looks ok, thanks. It also needs to remove the map_failed entry from elf/rtld-debugger-interface.txt, since it is not used anymore. Could you send the patch on libc-alpha please?
Comment 2 Pedro Alves 2022-11-28 15:35:09 UTC
IWBN if glibc had some kind of test that ensured that all the meant-to-be-exported probes are really exported, to catch such a problem from happening again.
Comment 3 Frank Ch. Eigler 2022-11-28 17:56:38 UTC
Interesting!

To Pedro's question, enumerating probes in a compiled libc is a matter of a perf or readelf or stap command, not a problem.  Such a thing could be a test case that matches the manual/probes.texi list against what turns out to be compiled.

Consumers such as gdb should not assume/assert the existence of any particular probe, if at all possible.
Comment 4 Adhemerval Zanella 2022-11-30 12:02:19 UTC
(In reply to Frank Ch. Eigler from comment #3)
> Interesting!
> 
> To Pedro's question, enumerating probes in a compiled libc is a matter of a
> perf or readelf or stap command, not a problem.  Such a thing could be a
> test case that matches the manual/probes.texi list against what turns out to
> be compiled.
> 
> Consumers such as gdb should not assume/assert the existence of any
> particular probe, if at all possible.

We have a internal script (scripts/glibcelf.py) which has ELF parse support to check against the elf/elf.h header definition, it should be feasible to extend it support stapsdt notes and compare against the ones described in manual/probes.texi (which could be obtained with another script).