This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug releng/25581] USDT probes when /proc/[pid]/mem not writeable


https://sourceware.org/bugzilla/show_bug.cgi?id=25581

--- Comment #2 from Dale Hamel <dale.hamel at srvthe dot net> ---
This minimal proof of concept shows the main concept employed by this patch:

```c
#include <stdio.h>
#include <unistd.h>

void _check();

#define MY_ASM \
  do { \
    __asm__ __volatile__ ("_check:\n990: nop");\
  } while(0)

int main(int argc, char **argv)
{
  while(1) {
    MY_ASM;
    printf("%08X\n", (*(char *)_check) & 0x90);
    sleep(1);
  }
}
```

When I run the program, I see the output:

```
...
00000090
00000090
00000090
...
```

Now if I attach to it with bpftrace:

```
bpftrace -e 'uprobe:./a.out:0x1164 { printf("here\n") }'
```

Note that 0x1164 is from:

```
$ readelf -s ./a.out | grep _check
35: 0000000000001164     0 NOTYPE  LOCAL  DEFAULT   13 _check
```

I can see that the value changes in the program:

```
00000090
00000090
00000080
00000080
```

I was expecting 0xCC, not 0x80, but that comes from the `& 0x90` mask in the
printf above: 0xCC & 0x90 = 0x80, so the byte at `_check` is in fact INT3
(0xCC). The value toggles as I attach/detach with bpftrace.
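
The 0x80 reading can be reproduced from the mask alone. A minimal sketch, using
hypothetical helpers `masked_probe_byte` and `show_mask_effect` (not part of the
patch) that apply the same `& 0x90` as the printf in the proof of concept:

```c
#include <stdio.h>

/* Same expression as the printf in the proof of concept: the byte at the
 * probe site ANDed with 0x90. */
unsigned masked_probe_byte(unsigned char byte)
{
    return byte & 0x90u;
}

/* Hypothetical helper: print the masked value for both states of the site. */
void show_mask_effect(void)
{
    printf("%08X\n", masked_probe_byte(0x90)); /* NOP: no uprobe attached */
    printf("%08X\n", masked_probe_byte(0xCC)); /* INT3: uprobe attached   */
}
```

Calling `show_mask_effect()` prints 00000090 followed by 00000080, matching the
output seen before and after attaching.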



I compiled ruby with `--enable-dtrace` against the patched sys/sdt.h and
dtrace.py, and performed these tests:

Given the ruby process:

```
ruby -e 'TracePoint.new{}.enable; def foo; puts "hi"; sleep 1; end; while true do; foo;end'
```

Note that `TracePoint.new{}.enable` is needed after ruby 2.5. Aside from that,
we are just sleeping and calling `foo`.

To start with, I was perplexed to see no sdt notes on /proc/RUBYPID/exe, but
/proc/RUBYPID/maps shows that this ruby is built with libruby.so, and the
probes are in there. Checking the ELF notes on that library, I find
`method__entry`:

```
$ readelf --notes /proc/168586/root/usr/local/lib/libruby.so | grep method__entry -A1
    Name: cmethod__entry
    Location: 0x0000000000237e5a, Base: 0x00000000002f7305, Semaphore: 0x0000000000000000
--
    Name: method__entry
    Location: 0x000000000023819e, Base: 0x00000000002f7305, Semaphore: 0x0000000000000000
--
    Name: method__entry
    Location: 0x0000000000243ec3, Base: 0x00000000002f7305, Semaphore: 0x0000000000000000
--
    Name: cmethod__entry
    Location: 0x00000000002462e4, Base: 0x00000000002f7305, Semaphore: 0x0000000000000000
```

There are multiple addresses for this probe that we might attach to, but only
one of them is actually checked in the source. To determine which, we have to
find the symbol for the hack function:

```
$ readelf -s /proc/168586/root/usr/local/lib/libruby.so | grep ruby_method__entry_check
  3922: 000000000023819e     0 NOTYPE  LOCAL  DEFAULT   12 ruby_method__entry_check
```

So it is the one at address `0x000000000023819e`. To translate this into a
virtual address in the running process, we need the base address of
libruby.so, which we can read from /proc/RUBYPID/maps:

```
$ cat /proc/168586/maps | grep libruby
7f4f035f4000-7f4f0361f000 r--p 00000000 08:01 2637891                    /usr/local/lib/libruby.so.2.6.5
7f4f0361f000-7f4f03852000 r-xp 0002b000 08:01 2637891                    /usr/local/lib/libruby.so.2.6.5
7f4f03852000-7f4f03938000 r--p 0025e000 08:01 2637891                    /usr/local/lib/libruby.so.2.6.5
7f4f03938000-7f4f0393e000 r--p 00343000 08:01 2637891                    /usr/local/lib/libruby.so.2.6.5
7f4f0393e000-7f4f03941000 rw-p 00349000 08:01 2637891                    /usr/local/lib/libruby.so.2.6.5
```

Adding the symbol value to the start of the first mapping gives
0x7f4f035f4000 + 0x000000000023819e = 0x7f4f0382c19e.
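
The same arithmetic can be sketched in C. This is only an illustration:
`runtime_address` is a hypothetical helper, and it assumes the line passed in is
the library's first mapping (file offset 0) and that the library's lowest load
address is 0, so the symbol value is simply an offset from that base.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "start-end ..." from one /proc/PID/maps line and add the symbol
 * value taken from readelf. Returns 0 if the line cannot be parsed.
 * Assumption: maps_line is the first (offset 0) mapping of the library. */
uint64_t runtime_address(const char *maps_line, uint64_t symbol_value)
{
    uint64_t start = 0, end = 0;

    if (sscanf(maps_line, "%" SCNx64 "-%" SCNx64, &start, &end) != 2)
        return 0;
    return start + symbol_value;
}
```

Passing the first libruby line from the maps output above together with
`0x23819e` yields `0x7f4f0382c19e`.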

Now let's check the data in memory at that address without anything
attached:

```
$ dd if=/proc/168586/mem count=1 bs=1 skip=$(( 0x7F4F0382C19E )) 2> /dev/null | xxd
00000000: 90                                       .
```
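
For reference, the dd invocation amounts to seeking to the address in
/proc/PID/mem and reading one byte. A rough C equivalent, with
`read_remote_byte` as a hypothetical helper name, and assuming the caller has
permission to read the target's memory:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the byte (0..255) at addr in process pid, or -1 on any error.
 * Assumption: /proc/PID/mem is readable by the caller. */
int read_remote_byte(pid_t pid, uint64_t addr)
{
    char path[64];
    unsigned char byte = 0;
    int result = -1;

    snprintf(path, sizeof path, "/proc/%ld/mem", (long)pid);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Seek to the virtual address, then read a single byte. */
    if (lseek(fd, (off_t)addr, SEEK_SET) != (off_t)-1 &&
        read(fd, &byte, 1) == 1)
        result = byte;

    close(fd);
    return result;
}
```

With the values from this session, `read_remote_byte(168586, 0x7F4F0382C19E)`
would correspond to the dd command above.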

It's the NOP exactly as expected. Now for the real magic, we attach bpftrace in
one terminal:

```
$ bpftrace -e 'usdt::ruby:method__entry {printf("%s\n", str(arg1))}' -p 168586
```

And I immediately see it printing `foo`, a good sign. But my `ENABLED` check
could still be broken, right? So let's check the memory now that the probe is
enabled:

```
$ dd if=/proc/168586/mem count=1 bs=1 skip=$(( 0x7F4F0382C19E )) 2> /dev/null | xxd
00000000: cc                                       .
```

Exactly as expected, the kernel has overwritten the NOP (0x90) with INT3
(0xCC).

The macro that checks this reads the byte at the probe site, sees that it is
no longer the NOP (0xCC != 0x90), and returns true: the probe is enabled by
the uprobe itself.
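
The idea behind the check can be sketched as follows. This is not the macro
from the patch; `probe_site_enabled` and `X86_NOP` are hypothetical names, and
the sketch assumes x86, where the probe site is a single-byte NOP that the
kernel replaces with INT3 while a uprobe is attached.

```c
#include <stdint.h>

#define X86_NOP 0x90 /* single-byte NOP originally emitted at the probe site */

/* Hypothetical stand-in for the patch's check: the probe counts as enabled
 * when the byte at the probe site is no longer the original NOP. */
int probe_site_enabled(const volatile uint8_t *site)
{
    return *site != X86_NOP;
}
```

In the proof of concept, the argument would be the address of the `_check`
label, so no semaphore and no write to /proc/[pid]/mem is involved.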
