Bug 10318 - Bad address reading arg from mark probe
Summary: Bad address reading arg from mark probe
Status: RESOLVED DUPLICATE of bug 10601
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on: 10601
Blocks:
Reported: 2009-06-23 14:21 UTC by Mark Wielaard
Modified: 2011-07-20 21:19 UTC



Attachments
parameterize loc2c with a callback to emit what now are deref and store_deref macro uses (826 bytes, patch)
2009-06-26 18:14 UTC, Mark Wielaard

Description Mark Wielaard 2009-06-23 14:21:34 UTC
This is a continuation of bug #10289 comment #1:
> I don't yet have a test because this fix ends up failing with an unrelated
> problem: kernel read fault at 0xfffffffffffffe15 (addr) near identifier '$arg1'

This might or might not be related to bug #10305 where I see a failure to
resolve the arguments of a mark probe (after fixing the bias offsets that made
the finding of probes fail).

Stan, with which testcase did you see the above?
Comment 1 Mark Wielaard 2009-06-23 21:02:08 UTC
Here is an example from exelib.exe:

sourcing: /home/mark/src/systemtap/testsuite/systemtap.exelib/mark.tcl for
uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_uprobeslibgcc-O0-m32-debug
executing: stap /home/mark/src/systemtap/testsuite/systemtap.exelib/mark.stp
./uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe
./libuprobeslibgcc-O0-m32-debug.so -c
./uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe
FAIL:
mark-uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_uprobeslibgcc-O0-m32-debug
line 1: expected "main_count: 3"
Got "ERROR: kernel read fault at 0xfffffffffffffff4 (addr) near identifier
'$arg1' at /home/mark/src/systemtap/testsuite/systemtap.exelib/mark.stp:5:30"
Comment 2 Mark Wielaard 2009-06-24 09:07:24 UTC
Relevant verbose output from:

$ stap -vvv /home/mark/src/systemtap/testsuite/systemtap.exelib/mark.stp
./uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe
./libuprobeslibgcc-O0-m32-debug.so -c
./uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe

focused on module
'/home/mark/src/systemtap/testsuite/uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe'
selected function main_func
probe
main_func@/home/mark/src/systemtap/testsuite/systemtap.exelib/uprobes_exe.c:22
process=/home/mark/src/systemtap/testsuite/uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe
reloc=.absolute section=.text pc=0x80484c3
finding location for local 'arg1' near address 0x80484c3, module bias 0x0
focused on module
'/home/mark/src/systemtap/testsuite/libuprobeslibgcc-O0-m32-debug.so =
[0x10000-0x116e4, bias 0x0] file
/home/mark/src/systemtap/testsuite/libuprobeslibgcc-O0-m32-debug.so ELF machine
i?86|x86_64 (code 3)
focused on module
'/home/mark/src/systemtap/testsuite/libuprobeslibgcc-O0-m32-debug.so'
selected function lib_func
probe
lib_func@/home/mark/src/systemtap/testsuite/systemtap.exelib/uprobes_lib.c:19
process=/home/mark/src/systemtap/testsuite/libuprobeslibgcc-O0-m32-debug.so
reloc=.dynamic pc=0x467
finding location for local 'arg1' near address 0x467, module bias 0x10000

loc2c-test doesn't want to play along though...
$ ../loc2c-test -e ./uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe
0x80484c3 arg1
../loc2c-test: fetch supported only for base type or pointer
Comment 3 Mark Wielaard 2009-06-24 09:43:35 UTC
(In reply to comment #2)
> loc2c-test doesn't want to play along though...
> $ ../loc2c-test -e ./uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe
> 0x80484c3 arg1
> ../loc2c-test: fetch supported only for base type or pointer

loc2c-test needed to resolve through const and volatile types. I pushed a patch
for that. Now it does play along:


$ ../loc2c-test -e ./uprobesgcc-O0-m32-debug-uprobeslibgcc-O0-m32-debug_exe
0x80484c3 arg1
#define PROBEADDR 0x80484c3ULL
static void print_value(struct pt_regs *regs)
{
  intptr_t value;
  {
    intptr_t addr;
  intptr_t frame_base;
  { // DWARF expression: 0x75(8)
    {
      intptr_t s0;
        s0 = fetch_register (5) + 8L;
      frame_base = s0;
    }
  }
    { // DWARF expression: 0x91(-20)
      {
        intptr_t s0;
        s0 = frame_base + -20L;
        addr = s0;
      }
    }
    { int32_t value = deref (4, addr);value = value; }
  }
  printk (" ---> %ld\n", (unsigned long) value);
  return;

 deref_fault:
  printk (" => BAD ACCESS\n");
}
Comment 4 Mark Wielaard 2009-06-24 10:00:53 UTC
I suspect we do something wrong with the "fetch_register (5)".
gdb seems able to resolve the arg1 variable fine:

Breakpoint 1, 0x080484c3 in main_func (foo=3)
    at /home/mark/src/systemtap/testsuite/systemtap.exelib/uprobes_exe.c:25
25	  STAP_PROBE1(test, main_count, foo);
(gdb) print arg1
$1 = 3
(gdb) print &arg1
$2 = (volatile int *) 0xffffd2ec
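
For reference, the fault address from comment #1 is consistent with the two DWARF expressions in comment #3 if the register fetch yields zero. 0x75(8) is DW_OP_breg5 with offset 8 (register 5 is %ebp in the i386 DWARF numbering) and 0x91(-20) is DW_OP_fbreg with offset -20, so addr = reg5 + 8 - 20. The sketch below only checks that arithmetic; it is not a confirmed diagnosis. The helper names and the 0xffffd2f8 frame pointer (back-computed from the gdb address above) are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the two DWARF expressions from the loc2c-test output:
     0x75(8)   -> frame_base = reg5 + 8
     0x91(-20) -> addr = frame_base - 20
   arg1_addr is a hypothetical helper name, not part of loc2c. */
uint64_t arg1_addr(uint64_t reg5)
{
    uint64_t frame_base = reg5 + 8;
    return frame_base - 20;   /* unsigned arithmetic wraps modulo 2^64 */
}

/* Checks the formula against the two addresses quoted in this report. */
void check_against_report(void)
{
    /* A fetched register value of 0 reproduces the fault address. */
    assert(arg1_addr(0) == 0xfffffffffffffff4ULL);
    /* An assumed %ebp of 0xffffd2f8 reproduces gdb's &arg1. */
    assert(arg1_addr(0xffffd2f8ULL) == 0xffffd2ecULL);
}
```

If that reading is right, fetch_register (5) is not seeing the user-mode %ebp at all, rather than seeing a slightly wrong value.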
Comment 5 Roland McGrath 2009-06-24 10:53:55 UTC
The kernel-mode definitions of fetch_register() et al (not to mention deref!)
are wholly inappropriate for dealing with user-mode register states, especially
32-bit ones on 64-bit kernels.  You need an entirely different regime of runtime
calls (should use user_regset calls) for dealing with user-mode registers.
Comment 6 Mark Wielaard 2009-06-26 18:14:30 UTC
Created attachment 4023 [details]
parameterize loc2c with a callback to emit what now are deref and store_deref macro uses

We discussed this a bit on IRC (transcribed in this comment, mostly Roland
talking).

The attached patch by Roland is a sketch of the first step: parameterize loc2c
with a callback to emit what are now deref and store_deref macro uses. You
could also nix the used_deref tracking in loc2c and just make stap's emit_deref
callback set its tracking flag. Notice how in the patch emit_deref has the
"size" value at translation time (expr[i].number is the size, where expr[i] is
the DW_OP_deref or whatnot that we are translating), so the callback can get an
int rather than the string of the size number that we emit now. The point is
that the new callbacks should take that int, rather than have it hidden as a
literal string of C or whatnot.

deref/store_deref are the easy ones. For a proper interface, loc2c should have
a callback to emit them, but in fact the stap callback will differ only in the
name of the macro it emits for kernel vs user. Next, parameterize where it
emits the fetch_register and store_register macros so that it uses a callback
that takes the register number as an int argument. Hopefully this patch
illustrates how to tease apart the loc2c implementation macros like push(),
where you need to split up what was a simple
push("x = fetch_register (%d)", blah); line to have a callback in the middle.
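
A minimal sketch of the callback shape described above, not the attached patch itself. The typedef, the struct, translate_deref, and the macro name user_deref are all hypothetical; only deref and the idea of passing the size as an int come from this discussion.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical callback type: the translator hands over the access size
   as an int instead of formatting it into a C string itself. */
typedef int (*emit_deref_fn)(void *arg, char *out, size_t outlen,
                             const char *dest, int size, const char *addr);

/* Hypothetical stap-side state passed through the callback's arg. */
struct stap_deref_state {
    const char *macro;   /* "deref" for kernel; "user_deref" is an assumed name */
    int used_deref;      /* set by the callback instead of tracked in loc2c */
};

/* stap-side callback: kernel and user variants differ only in macro name. */
int stap_emit_deref(void *arg, char *out, size_t outlen,
                    const char *dest, int size, const char *addr)
{
    struct stap_deref_state *st = arg;
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    st->used_deref = 1;
    return snprintf(out, outlen, "%s = %s (%d, %s);",
                    dest, st->macro, size, addr);
}

/* Stand-in for the loc2c side translating one DW_OP_deref-like operation. */
int translate_deref(emit_deref_fn emit, void *arg,
                    char *out, size_t outlen, int size)
{
    return emit(arg, out, outlen, "value", size, "addr");
}
```

With macro set to "deref" this writes the text value = deref (4, addr); for a 4-byte access, and the same call with the assumed user-mode macro name changes only that identifier.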

After the parameterization, deref et al are simple: just want a different
runtime macro that does checking like vanilla get_user()/put_user() macros do.
For fetch/store_register what you want at runtime is calls to user_regset
functions, which take byte offset and length for regset layout. What you have
at translation time is a DWARF register number, so ideally you want to
translate that to a regset+offset at translation time and emit code that
calls the runtime for user_regset fetches from an emitted literal-number
offset. libebl has the DWARF reg # -> regset layout mapping, but it is not
exported prettily now. It is a fixed ABI, so for now probably easiest to put
some hard-coded tables in stap for some arch's (later elfutils will make this
translation easy for stap, for now the backends/*_corenote.c files have the
tables you could translate to hard-code something). The user case callback for
emit-register would map reg# to regset+offset, then emit a
"fetch_from_regset(regset, offset)" runtime call or suchlike.
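
A rough sketch of what such a hard-coded table could look like for i386. The offsets assume the general regset follows the user_regs_struct order ebx, ecx, edx, esi, edi, ebp, eax, ds, es, fs, gs, orig_eax, eip, cs, eflags, esp, ss with 4 bytes per slot; they should be checked against backends/i386_corenote.c before use. The function names and GENERAL_REGSET are hypothetical, and fetch_from_regset is just the placeholder name from the paragraph above, not an existing runtime call.

```c
#include <assert.h>
#include <stdio.h>

/* Byte offsets into the assumed i386 general regset, indexed by DWARF
   register number 0..8. */
static const int i386_dwarf_to_offset[] = {
    24,  /* 0: eax */
     4,  /* 1: ecx */
     8,  /* 2: edx */
     0,  /* 3: ebx */
    60,  /* 4: esp */
    20,  /* 5: ebp */
    12,  /* 6: esi */
    16,  /* 7: edi */
    48,  /* 8: eip */
};

#define GENERAL_REGSET 0   /* assumed index of the general-purpose regset */

/* Maps a DWARF register number to a regset byte offset, or -1 if unknown. */
int i386_regset_offset(int dwarf_regno)
{
    int n = (int) (sizeof i386_dwarf_to_offset / sizeof i386_dwarf_to_offset[0]);
    if (dwarf_regno < 0 || dwarf_regno >= n)
        return -1;
    return i386_dwarf_to_offset[dwarf_regno];
}

/* User-case emit-register callback body: resolves the offset at translation
   time and emits a call with literal numbers. Returns the number of
   characters written, or -1 for an unmapped register. */
int emit_user_fetch_register(char *out, size_t outlen,
                             const char *dest, int dwarf_regno)
{
    int off = i386_regset_offset(dwarf_regno);
    assert(out != NULL && outlen > 0);
    if (off < 0)
        return -1;
    return snprintf(out, outlen, "%s = fetch_from_regset (%d, %d);",
                    dest, GENERAL_REGSET, off);
}
```

For DWARF register 5 and destination s0 this emits s0 = fetch_from_regset (0, 20); so the generated module carries only the literal offset and needs no DWARF knowledge at runtime.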
Comment 7 Wenji Huang 2009-07-14 09:45:42 UTC
Not sure whether this is another instance of this bug. It's on FC11, 
latest git source.

$stap -vve 'probe kernel.function("sys_read"){print($fd)}'
SystemTap translator/driver (version 0.9.8/0.137 commit release-0.9.8-146-gec6fdef)
Copyright (C) 2005-2009 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
Session arch: i686 release: 2.6.29.4-167.fc11.i686.PAE
Created temporary directory "/tmp/stapZvtGh0"
Searched '/usr/local/share/systemtap/tapset/i686/*.stp', found 3
Searched '/usr/local/share/systemtap/tapset/*.stp', found 51
Pass 1: parsed user script and 54 library script(s) in 90usr/40sys/317real ms.
probe sys_read@fs/read_write.c:372 kernel reloc=.dynamic section=.text pc=0xc04a94c9
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) in
740usr/1220sys/9097real ms.
Pass 3: using cached
/home/wjhuang/.systemtap/cache/66/stapconf_66ed81ed4078196cbba85d4ef02c9350_446.h
probe_1746 locks nothing
dump_unwindsyms kernel index=0 base=0xc0400000
Found build-id in kernel, length 20, end at 0xc071afd4
Pass 3: translated to C into
"/tmp/stapZvtGh0/stap_99279a38a8b4c7286c326f231792e7bb_984.c" in
1010usr/890sys/22950real ms.
Running make -C "/lib/modules/2.6.29.4-167.fc11.i686.PAE/build"
M="/tmp/stapZvtGh0" modules >/dev/null
Pass 4: compiled C into "stap_99279a38a8b4c7286c326f231792e7bb_984.ko" in
2370usr/1860sys/21529real ms.
Copying /tmp/stapZvtGh0/stapconf_66ed81ed4078196cbba85d4ef02c9350_446.h to
/home/wjhuang/.systemtap/cache/66/stapconf_66ed81ed4078196cbba85d4ef02c9350_446.h
Copying /tmp/stapZvtGh0/stap_99279a38a8b4c7286c326f231792e7bb_984.ko to
/home/wjhuang/.systemtap/cache/99/stap_99279a38a8b4c7286c326f231792e7bb_984.ko
Copying /tmp/stapZvtGh0/stap_99279a38a8b4c7286c326f231792e7bb_984.ko.sgn to
/home/wjhuang/.systemtap/cache/99/stap_99279a38a8b4c7286c326f231792e7bb_984.ko.sgn
Copying /tmp/stapZvtGh0/stap_99279a38a8b4c7286c326f231792e7bb_984.c to
/home/wjhuang/.systemtap/cache/99/stap_99279a38a8b4c7286c326f231792e7bb_984.c
Pass 5: starting run.
Running /usr/local/bin/staprun -v
/tmp/stapZvtGh0/stap_99279a38a8b4c7286c326f231792e7bb_984.ko
ERROR: kernel read fault at 0x009880c0 (addr) near identifier '$fd' at <input>:1:41
WARNING: Number of errors: 1, skipped probes: 1
stapio:cleanup_and_exit:371 detach=0
stapio:cleanup_and_exit:388 closing control channel
Pass 5: run completed in 0usr/70sys/431real ms.
Running rm -rf /tmp/stapZvtGh0
Comment 8 Mark Wielaard 2009-07-17 18:19:12 UTC
(In reply to comment #7)
> Not sure whether this is another instance of this bug. It's on FC11,
> latest git source.
>
> $stap -vve 'probe kernel.function("sys_read"){print($fd)}'
> SystemTap translator/driver (version 0.9.8/0.137 commit release-0.9.8-146-gec6fdef)
> Copyright (C) 2005-2009 Red Hat, Inc. and others
> This is free software; see the source for copying conditions.
> Session arch: i686 release: 2.6.29.4-167.fc11.i686.PAE
> [...]
> Pass 5: starting run.
> Running /usr/local/bin/staprun -v
> /tmp/stapZvtGh0/stap_99279a38a8b4c7286c326f231792e7bb_984.ko
> ERROR: kernel read fault at 0x009880c0 (addr) near identifier '$fd' at <input>:1:41
> WARNING: Number of errors: 1, skipped probes: 1

This is most likely related to bug #10408
Comment 9 Mark Wielaard 2011-07-20 21:19:29 UTC
PR10601 was the root cause.

*** This bug has been marked as a duplicate of bug 10601 ***