Bug 1868 - fadvise64 / fadvise64_64 alias causes seg fault
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: translator
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks: 907
Reported: 2005-11-15 16:44 UTC by Kevin Stafford
Modified: 2011-03-16 21:19 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2005-11-24 22:49:07


Attachments
vmware backtrace screenshot (20.01 KB, image/png)
2005-11-25 21:31 UTC, Frank Ch. Eigler

Description Kevin Stafford 2005-11-15 16:44:42 UTC
On vanilla 2.6.14 the following alias causes a seg fault when probed:

probe kernel.syscall.fadvise64 =
   kernel.function("sys_fadvise64_64") {
      name = "fadvise64"
      fd = $fd
      offset = $offset
      len = $len
      advice = $advice
   }
Comment 1 Frank Ch. Eigler 2005-11-24 22:53:28 UTC
Please be more specific.  Is this a translator SEGV?  On which architecture?  (I
can't trigger it on an i686 or x86_64 with cvs systemtap against a recent FC4
kernel.)  Or is this a probe execution-time crash?  More details?
Comment 2 Frank Ch. Eigler 2005-11-25 21:29:28 UTC
OK, I can reproduce the segv; -22.17.EL i686 kernel, cvs systemtap.
% stap -e 'probe kernel.syscall.fadvise64 {}'

The crash occurs in loc2c.c, during the processing of the $len 64-bit target
variable.  I attach a vmware png screenshot (sorry) of the gdb backtrace.  The
given "location" pointer appears to be corrupt.

Another data point.  When the same test is run on a recentish FC3 kernel, the
translator succeeds.  Oddly, for an 8-byte value, only a single "piece" is
fetched.  If so, maybe the debuginfo is again crappy.  Still if so, loc2c should
live with that better.

  intptr_t value;
  {
    union {
      char bytes[8];
      struct {
        uint0_t p0;
      } pieces __attribute__ ((packed));
      uint64_t whole;
    } u;
    intptr_t addr;
    { // DWARF expression: 0x74
      {
        intptr_t s0;
        s0 = fetch_register (4) + 0L;
        addr = s0;
      }
    }
    u.pieces.p0 = deref (sizeof u.pieces.p0, addr);
    value = u.whole;
  }
  printk (" ---> %ld\n", (unsigned long) value);
Comment 3 Frank Ch. Eigler 2005-11-25 21:31:02 UTC
Created attachment 767
vmware backtrace screenshot
Comment 4 Roland McGrath 2005-11-26 23:47:43 UTC
Please point to the particular kernel build that produced the loc2c-test output
in comment #2.
Comment 5 Frank Ch. Eigler 2005-11-26 23:54:51 UTC
Unless there is more than one 2.6.9-22.17.EL i686 kernel,
it'd be the one under dist/4E-U3.
Comment 6 Roland McGrath 2005-11-26 23:56:33 UTC
That is not what I asked.  Supply details about the "recentish FC3 kernel" build
from which the loc2c-test output in comment #2 comes.
Comment 7 Frank Ch. Eigler 2005-11-26 23:58:52 UTC
Oh.  The FC3 kernel.  2.6.12-1.1381_FC3smp i686 
Comment 8 Roland McGrath 2005-11-27 02:08:16 UTC
It does appear that both failure modes for those different kernels were the same
bug.  I've checked in some fixes to loc2c to handle the problem case.  The bug
arose when a noncontiguous location was partially in memory and partially in
registers.

After the fix, the loc2c-test output for the 1.1381_FC3 kernel looks correct.
I was not able to test running probes on that kernel because the
kernel-smp-devel rpms for that version seem to be AWOL.

On the RHEL4 kernel, the fix cures the crash and I saw no testsuite failures.
Please verify the fix further.
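
For illustration only, here is a minimal sketch of the situation described
above: an 8-byte value whose location is split into pieces, one held in a
register and one in memory (in DWARF terms, a composite location built with
DW_OP_piece).  This is not the loc2c fix and not loc2c output.  The names
piece_kind, struct piece, assemble_pieces and the regs array are hypothetical
stand-ins, and the sketch assumes a little-endian target with 32-bit
registers, as on i686.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one piece of a noncontiguous location. */
enum piece_kind { PIECE_IN_REGISTER, PIECE_IN_MEMORY };

struct piece {
    enum piece_kind kind;
    unsigned regno;      /* register index, used for PIECE_IN_REGISTER */
    const void *addr;    /* source address, used for PIECE_IN_MEMORY */
    size_t size;         /* number of bytes this piece contributes */
};

/*
 * Assemble a value of up to 8 bytes from its pieces.  Pieces are listed in
 * order of increasing byte offset within the value.  Assumption: the target
 * is little-endian, so the first piece holds the low-order bytes.
 */
static uint64_t assemble_pieces(const struct piece *pieces, size_t n,
                                const uint32_t *regs)
{
    unsigned char bytes[8] = {0};
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        const struct piece *p = &pieces[i];

        /* A zero-sized or overflowing piece indicates a bad description. */
        assert(p->size > 0 && off + p->size <= sizeof bytes);

        if (p->kind == PIECE_IN_REGISTER) {
            /* A register piece cannot be wider than one register. */
            assert(p->size <= sizeof regs[0]);
            memcpy(bytes + off, &regs[p->regno], p->size);
        } else {
            memcpy(bytes + off, p->addr, p->size);
        }
        off += p->size;
    }

    uint64_t whole;
    memcpy(&whole, bytes, sizeof whole);
    return whole;
}
```

Calling assemble_pieces with a 4-byte register piece followed by a 4-byte
memory piece yields the full 64-bit value.  The size check on each piece is
the kind of sanity test that would reject a zero-width piece such as the
uint0_t member in the comment #2 output instead of silently producing a
partial value.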
Comment 9 Frank Ch. Eigler 2005-11-28 01:15:06 UTC
Works great, thanks.