[PATCH] Fix handling of null stap semaphores

Simon Marchi simark@simark.ca
Sun Dec 29 18:58:00 GMT 2019


On 2019-12-03 2:59 p.m., George Barrett wrote:
> According to the SystemTap documentation on user-space probes[0], stap
> probe points without semaphores are denoted by setting the semaphore
> address in the probe's note to zero. At present the code does do a
> comparison of the semaphore address against zero, but only after it's
> been relocated; as such it will (almost?) always fail, commonly
> resulting in GDB trying to overwrite the ELF magic located at the
> image's base address.
> 
> This commit tests the address as specified in the SDT note rather than
> the relocated value in order to correctly detect absent probe
> semaphores.
> 
> [0]: https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
> 
> gdb/Changelog:
> 
> 	* stap-probe.c: Fix handling of null stap semaphores

Hi George,

Thanks, the patch looks good to me.  Though the ChangeLog should mention the
modified functions, I propose to use this:

	* stap-probe.c (stap_modify_semaphore): Don't check for null
	semaphores.
	(stap_probe::set_semaphore, stap_probe::clear_semaphore): Check
	for null semaphores.

I'd really like if we could have a test for this, so that eventual refactors don't
re-introduce this bug.  Perhaps the gdb.base/stap-probe.exp test could be enhanced to
test that the ELF magic number hasn't been changed?

One difficulty is finding out where it is, I don't know if there's a GDB command that
will compute that directly.  One way is to take the address of a global variable before
and after starting the process, and see how it has been relocated, that would be the base
of the image:

(gdb) p &some_global
$1 = (int *) 0x402c <some_global>
(gdb) start
Temporary breakpoint 1 at 0x111d: file test.c, line 9.
Starting program: /home/simark/src/aoc/08/p2/a.out

Temporary breakpoint 1, main () at test.c:9
9	  for (i = 0; i < 1000; i++) {
(gdb) p &some_global
$2 = (int *) 0x55555555802c <some_global>
(gdb) p/x 0x55555555802c - 0x402c
$3 = 0x555555554000
(gdb) p (*(char*) 0x555555554000)@4
$4 = "\177ELF"

Also, the semaphore is set when the breakpoint is inserted and cleared when the breakpoint
is removed.  By default, GDB removes the breakpoints while the inferior is stopped, so if
we inspect the magic number while the inferior is stopped, it will appear OK.

However, we can use "set breakpoints always-inserted on" to make GDB leave the breakpoint
inserted when the inferior is stopped.  When doing this, we can observe the overwritten
magic number:

(gdb) set breakpoint always-inserted 1
(gdb) b -probe provider:name
Breakpoint 1 at 0x1137
(gdb) p &some_global
$1 = (int *) 0x402c <some_global>
(gdb) start
Temporary breakpoint 2 at 0x111d: file test.c, line 9.
Starting program: /home/simark/src/aoc/08/p2/a.out

Temporary breakpoint 2, main () at test.c:9
9	  for (i = 0; i < 1000; i++) {
(gdb) p &some_global
$2 = (int *) 0x55555555802c <some_global>
(gdb) p/x 0x55555555802c - 0x402c
$3 = 0x555555554000
(gdb) p ((unsigned char[4]) *0x555555554000)
$4 = "\200ELF"

I think all this only applies if the main program is a position-independent executable, so
this test should probably be skipped if we detect the relocation is 0.

So with all this it should be pretty straightforward to improve the test to check for this.

Note that I found what looks like to be a GDB bug while doing this:

https://sourceware.org/bugzilla/show_bug.cgi?id=25321

Simon



More information about the Gdb-patches mailing list