Bug 18597 - long_arg() doesn't correctly handle negative values in 32-on-64 environment
Summary: long_arg() doesn't correctly handle negative values in 32-on-64 environment
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: tapsets (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-06-25 11:00 UTC by Martin Cermak
Modified: 2015-07-08 06:09 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
proposed patch (2.84 KB, patch)
2015-06-25 11:03 UTC, Martin Cermak
Details | Diff
alternative patch (3.47 KB, patch)
2015-06-30 15:59 UTC, David Smith
Details | Diff
better patch (426 bytes, patch)
2015-06-30 20:21 UTC, David Smith
Details | Diff
actual patch (2.72 KB, patch)
2015-06-30 20:46 UTC, David Smith
Details | Diff
testcase proposal (648 bytes, text/plain)
2015-07-03 14:18 UTC, Martin Cermak
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Cermak 2015-06-25 11:00:07 UTC
Currently (15f1b3dd85f3) long_arg() doesn't correctly handle negative values in compat tasks:

=======
 7.1 S x86_64 # gcc -g systemtap.syscall/aio.c -m32
 7.1 S x86_64 # stap -g -e 'probe kernel.function("compat_sys_io_submit"){printf("%d %d %d\n", @__compat_long($nr), __int32($nr), %{ /* pure */ _stp_is_compat_task() %})}' -c ./a.out
1
1 1 1
1 1 1
-1 -1 1
1 1 1
 7.1 S x86_64 # stap -g -e 'probe kprobe.function("compat_sys_io_submit"){printf("%d %d %d\n", long_arg(2), int_arg(2), %{ /* pure */ _stp_is_compat_task() %})}' -c ./a.out
1
1 1 1
1 1 1
4294967295 -1 1
1 1 1
 7.1 S x86_64 #
=======

See '-1' versus '4294967295'. This happens on x86_64 and powerpc, but not on s390x. Currently this issue is being worked around in tapsets, e.g.:

=======
# io_submit __________________________________________________
# long sys_io_submit(aio_context_t ctx_id, long nr, struct iocb __user * __user *iocbpp)
# long compat_sys_io_submit(aio_context_t ctx_id, int nr, u32 __user *iocb)
#
probe nd_syscall.io_submit = __nd_syscall.io_submit,
        __nd_syscall.compat_io_submit ?
{
        name = "io_submit"
        asmlinkage()
        ctx_id = ulong_arg(1)
        iocbpp_uaddr = pointer_arg(3)
        argstr = sprintf("%u, %d, %p", ctx_id, nr, iocbpp_uaddr)
}
probe __nd_syscall.io_submit = kprobe.function("sys_io_submit") ?
{
        @__syscall_gate(%{ __NR_io_submit %})
        asmlinkage()
        nr = long_arg(2)
}
probe __nd_syscall.compat_io_submit = kprobe.function("compat_sys_io_submit") ?
{
        asmlinkage()
        nr = int_arg(2)
}
probe nd_syscall.io_submit.return = __nd_syscall.io_submit.return,
        kprobe.function("compat_sys_io_submit").return ?
{
        name = "io_submit"
        retstr = returnstr(1)
}
probe __nd_syscall.io_submit.return = kprobe.function("sys_io_submit").return ?
{
        @__syscall_gate(%{ __NR_io_submit %})
}

=======

Fixing this issue would allow us to do changes like following ...

=======
$ git diff tapset/linux/nd_syscalls.stp                                            
diff --git a/tapset/linux/nd_syscalls.stp b/tapset/linux/nd_syscalls.stp           
index 1a5e486..d41ec34 100644                                                      
--- a/tapset/linux/nd_syscalls.stp                                                 
+++ b/tapset/linux/nd_syscalls.stp                                                 
@@ -3029,24 +3029,18 @@ probe __nd_syscall.io_setup.return = kprobe.function("sys_io_setup").return ?
 # long compat_sys_io_submit(aio_context_t ctx_id, int nr, u32 __user *iocb)       
 #                                                                                 
 probe nd_syscall.io_submit = __nd_syscall.io_submit,                              
-       __nd_syscall.compat_io_submit ?                                            
+       kprobe.function("compat_sys_io_submit") ?                                  
 {                                                                                 
        name = "io_submit"                                                         
        asmlinkage()                                                               
        ctx_id = ulong_arg(1)                                                      
+       nr = long_arg(2)                                                           
        iocbpp_uaddr = pointer_arg(3)                                              
        argstr = sprintf("%u, %d, %p", ctx_id, nr, iocbpp_uaddr)                   
 }                                                                                 
 probe __nd_syscall.io_submit = kprobe.function("sys_io_submit") ?                 
 {                                                                                 
        @__syscall_gate(%{ __NR_io_submit %})                                      
-       asmlinkage()                                                               
-       nr = long_arg(2)                                                           
-}                                                                                 
-probe __nd_syscall.compat_io_submit = kprobe.function("compat_sys_io_submit") ?
-{                                                                                 
-       asmlinkage()                                                               
-       nr = int_arg(2)                                                            
 }                                                                                 
 probe nd_syscall.io_submit.return = __nd_syscall.io_submit.return,                
        kprobe.function("compat_sys_io_submit").return ?                           
$  
=======

... and thus would allow us to clean up the tapset a bit. 

Following change seems to fix the issue on x86_64:

=======
$ git diff tapset/x86_64/registers.stp | nl
     1  diff --git a/tapset/x86_64/registers.stp b/tapset/x86_64/registers.stp
     2  index d7035e4..e662f9e 100644
     3  --- a/tapset/x86_64/registers.stp
     4  +++ b/tapset/x86_64/registers.stp
     5  @@ -167,7 +167,7 @@ function _stp_arg:long (argnum:long, sign_extend:long, truncate:long) %{ /* pure
     6          default:
     7                  goto bad_argnum;
     8          }
     9  -       if (STAP_ARG_truncate || argsz == sizeof(int)) {
    10  +       if (STAP_ARG_truncate || _stp_is_compat_task()) {
    11                  if (STAP_ARG_sign_extend)
    12                          STAP_RETVALUE = (int64_t) __stp_sign_extend32(val);
    13                  else
    14  @@ -215,12 +215,7 @@ function ulong_arg:long (argnum:long) {
    15   }
    16   
    17   function longlong_arg:long (argnum:long) {
    18  -       if (probing_32bit_app()) {
    19  -               lowbits = _stp_arg(argnum, 0, 1)
    20  -               highbits = _stp_arg(argnum+1, 0, 1)
    21  -               return ((highbits << 32) | lowbits)
    22  -       } else
    23  -               return _stp_arg(argnum, 0, 0)
    24  +       return _stp_arg(argnum, 0, 0)
    25   }
    26   
    27   function ulonglong_arg:long (argnum:long) {
$ 
=======

The actual fix is on lines 9-10. Lines 18-23 are just removal of redundant code that, based on my testing, appears to never be used. Note, that in case of probing compat functions, we usually build longlong value out of two 32 bit values passed to the function as separate parameters. An x86_64 example is sys32_sync_file_range(int fd, unsigned off_low, unsigned off_hi, unsigned n_low, unsigned n_hi,  int flags), in which case we fabricate offset = ((uint_arg(3) << 32) | uint_arg(2)). Based on my testing lines 18-23 are of no use and this analogically applies to respective powerpc and s390x snippets. So I'm removing this later on im my patch.

But back to the actual fix. For powerpc, following update seems to work:

=======
$ git diff tapset/powerpc/registers.stp | nl
     1  diff --git a/tapset/powerpc/registers.stp b/tapset/powerpc/registers.stp
     2  index 03609f8..eb80018 100644
     3  --- a/tapset/powerpc/registers.stp
     4  +++ b/tapset/powerpc/registers.stp
     5  @@ -149,7 +149,7 @@ function _stp_arg:long (argnum:long, sign_extend:long, truncate:long) {
     6          else if (argnum == 8)
     7                  val = u_register("r10")
     8   
     9  -       if (truncate) {
    10  +       if (truncate || @__compat_task) {
    11                  if (sign_extend)
    12                          val = _stp_sign_extend32(val)
    13                  else
    14  @@ -178,12 +178,7 @@ function ulong_arg:long (argnum:long) {
    15   }
    16   
    17   function longlong_arg:long (argnum:long) {
    18  -       if (probing_32bit_app()) {
    19  -               lowbits = _stp_arg(argnum, 0, 1)
    20  -               highbits = _stp_arg(argnum+1, 0, 1)
    21  -               return ((highbits << 32) | lowbits)
    22  -       } else
    23  -               return _stp_arg(argnum, 0, 0)
    24  +       return _stp_arg(argnum, 0, 0)
    25   }
    26   
    27   function ulonglong_arg:long (argnum:long) {
$ 
=======

As I said, this issue is not present on s390, because a change in this sense is already there. So related part of my patch reads:

=======
$ git diff tapset/s390/registers.stp | nl
     1  diff --git a/tapset/s390/registers.stp b/tapset/s390/registers.stp
     2  index c8cb304..c2ced72 100644
     3  --- a/tapset/s390/registers.stp
     4  +++ b/tapset/s390/registers.stp
     5  @@ -237,7 +237,7 @@ function _stp_arg:long (argnum:long, sign_extend:long, truncate:long)
     6          else if (argnum >= 6)
     7                  val = _stp_get_kernel_stack_param(argnum - 6)
     8   
     9  -       if (truncate || %{ /* pure */ _stp_is_compat_task() %}) {
    10  +       if (truncate || @__compat_task) {
    11                  /* High bits may be garbage. */
    12                  val = (val & 0xffffffff)
    13                  if (sign_extend)
    14  @@ -265,13 +265,7 @@ function ulong_arg:long (argnum:long) {
    15   }
    16   
    17   function longlong_arg:long (argnum:long) {
    18  -       if (probing_32bit_app()) {
    19  -               /* TODO verify if this is correct for 31bit apps */
    20  -               highbits = _stp_arg(argnum, 0, 1)
    21  -               lowbits = _stp_arg(argnum+1, 0, 1)
    22  -               return ((highbits << 32) | lowbits)
    23  -       } else
    24  -               return _stp_arg(argnum, 0, 0)
    25  +       return _stp_arg(argnum, 0, 0)
    26   }
    27   
    28   function ulonglong_arg:long (argnum:long) {
$ 
=======

Which actually is a cosmetical update only.

In general, applying aforementioned updates seems to cause regressions in pread, pwrite and sync_file_range. But after looking closer I think the problem is just that these probes rely on aforementioned misbehavior. Their fixes make significant part of patch I'm going to attach.
Comment 1 Martin Cermak 2015-06-25 11:03:42 UTC
Created attachment 8393 [details]
proposed patch

Fix reported misbehaviour of long_arg(); fix pread, pwrite and sync_file_range; simplify nd_syscall.io_submit as an example benefit of the fix.
Comment 2 Martin Cermak 2015-06-25 15:03:22 UTC
> =======
> $ git diff tapset/x86_64/registers.stp | nl
>      1  diff --git a/tapset/x86_64/registers.stp
> b/tapset/x86_64/registers.stp
>      2  index d7035e4..e662f9e 100644
>      3  --- a/tapset/x86_64/registers.stp
>      4  +++ b/tapset/x86_64/registers.stp
>      5  @@ -167,7 +167,7 @@ function _stp_arg:long (argnum:long,
> sign_extend:long, truncate:long) %{ /* pure
>      6          default:
>      7                  goto bad_argnum;
>      8          }
>      9  -       if (STAP_ARG_truncate || argsz == sizeof(int)) {
>     10  +       if (STAP_ARG_truncate || _stp_is_compat_task()) {
>     11                  if (STAP_ARG_sign_extend)
>     12                          STAP_RETVALUE = (int64_t)
> __stp_sign_extend32(val);
>     13                  else
>     14  @@ -215,12 +215,7 @@ function ulong_arg:long (argnum:long) {
>     15   }
>     16   
>     17   function longlong_arg:long (argnum:long) {
>     18  -       if (probing_32bit_app()) {
>     19  -               lowbits = _stp_arg(argnum, 0, 1)
>     20  -               highbits = _stp_arg(argnum+1, 0, 1)
>     21  -               return ((highbits << 32) | lowbits)
>     22  -       } else
>     23  -               return _stp_arg(argnum, 0, 0)
>     24  +       return _stp_arg(argnum, 0, 0)
>     25   }
>     26   
>     27   function ulonglong_arg:long (argnum:long) {
> $ 
> =======
> 
> Lines 18-23 are just removal of redundant code that, based on my testing, appears to never be used ... stuff deleted ...

Actually, I am not sure about user scripts. If there is a risk this might break them, then lines 18-26 (and respective snippets for powerpc and s390) can stay in place. It is not directly related to the reported issue. Maybe I shouldn't have mixed these two independent things here.
Comment 3 David Smith 2015-06-26 12:13:43 UTC
(In reply to Martin Cermak from comment #2)
> > Lines 18-23 are just removal of redundant code that, based on my testing, appears to never be used ... stuff deleted ...
> 
> Actually, I am not sure about user scripts. If there is a risk this might
> break them, then lines 18-26 (and respective snippets for powerpc and s390)
> can stay in place. It is not directly related to the reported issue. Maybe I
> shouldn't have mixed these two independent things here.

Hmm. If through your testing you don't see any regressions here, I'd say go ahead and delete that code. Be sure to test RHEL[567], f21, f22, and rawhide on the affected arch.
Comment 4 David Smith 2015-06-26 12:20:07 UTC
(In reply to Martin Cermak from comment #1)
> Created attachment 8393 [details]
> proposed patch
> 
> Fix reported misbehaviour of long_arg(); fix pread, pwrite and
> sync_file_range; simplify nd_syscall.io_submit as an example benefit of the
> fix.

OK, you lost me on this:

+# compat_sync_file_range _____________________________________
+# asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags,
+#                                             unsigned offset_hi, unsigned offset_lo,
+#                                             unsigned nbytes_hi, unsigned nbytes_lo)
+#
+probe nd_syscall.compat_sync_file_range =
+	kprobe.function("compat_sys_sync_file_range2") ?
+{
+	asmlinkage()
+	name = "sync_file_range"
+	fd = int_arg(1)
+	flags = uint_arg(2)
+	offset = (u32_arg(3) << 32) + u32_arg(4)
+	nbytes = (u32_arg(5) << 32) + u32_arg(6)
+	flags_str = _sync_file_range_flags_str(flags)
+	argstr = sprintf("%d, %d, %d, %s", fd, offset, nbytes,
+	                 _sync_file_range_flags_str(flags))
+}

If this code wasn't needed before, why is it needed now?
Comment 5 David Smith 2015-06-30 15:56:46 UTC
(In reply to David Smith from comment #4)
> If this code wasn't needed before, why is it needed now?

I've figured this out. This code is needed because stp_arg() (which long_arg() calls) had a bug. The bug was that in some circumstances, longlong_arg() would return a 64-bit value (from 1 register) when called on a 32-bit process. longlong_arg()/stp_arg() was designed to do the following:

If we're in a 32-bit compat process, then you can't put a 64-bit value in 1 register. Instead, you have to break the 64-bit value up and put it in 2 registers. This is why longlong_arg() had code in it like this:

-	if (probing_32bit_app()) {
-		lowbits = _stp_arg(argnum, 0, 1)
-		highbits = _stp_arg(argnum+1, 0, 1)
-		return ((highbits << 32) | lowbits)

However, this code doesn't really work correctly on a 64-bit OS. Let me give an example. Here's pread() and the x86_64 compat function for pread(). The 'pos' argument (or the combination of 'poslo' and 'poshi' arguments) is a 64-bit value.

====
ssize_t sys_pread64(unsigned int fd, char __user *buf, size_t count, loff_t pos)
{
  //...
}

asmlinkage long sys32_pread(unsigned int fd, char __user *ubuf, u32 count,
                            u32 poslo, u32 poshi)
{
        return sys_pread64(fd, ubuf, count,
                         ((loff_t)AA(poshi) << 32) | AA(poslo));
}
====

If we're probing a compat function, we'd never call longlong_arg(), we'd do the same thing that the wrapper function is doing - call ulong_arg() on arguments 4 and 5 and slam the values together. Since probing_32bit_app() never returned true, the supposed 32-bit code in longlong_arg() was never getting called.

Your patch has to add lots of "compat" probes because as a side effect of fixing the negative value problem it "broke" longlong_arg() returning a 64-bit value from 1 register even when in a 32-bit process (which shouldn't have worked in the first place).

I've been struggling with the right way to fix this. I went down the same path you tried, and ended up about in the same place. Then I decided to try a different path.

What if we decided that if you call longlong_arg() on a 64-bit OS on a 32-bit process you really *want* a 64-bit value from 1 register? In some ways this makes sense and it matches our old behavior. My theory here is that you know what you are doing and if you call longlong_arg() on a 32-bit process you must be in the true 64-bit function at this point.

I'm testing this now and it works well on everything but ppc64. I'm unsure of what is going on there.
Comment 6 David Smith 2015-06-30 15:59:26 UTC
Created attachment 8402 [details]
alternative patch

Here's the patch I'm testing.
Comment 7 Josh Stone 2015-06-30 17:21:32 UTC
(In reply to David Smith from comment #5)
> What if we decided that if you call longlong_arg() on a 64-bit OS on a
> 32-bit process you really *want* a 64-bit value from 1 register? In some
> ways this makes sense and it matches our old behavior. My theory here is
> that you know what you are doing and if you call longlong_arg() on a 32-bit
> process you must be in the true 64-bit function at this point.

That assumes you're only using this in the kernel.  For a uprobe, longlong_arg() still needs to follow the 2-register ABI, no?  That's when the probing_32bit_app() branch should be active.

So yes, longlong_arg() in 64-bit kernel context should use 1 register, regardless of the task, but a 32-bit user context still needs to use 2 registers.
Comment 8 David Smith 2015-06-30 17:36:50 UTC
(In reply to Josh Stone from comment #7)
> (In reply to David Smith from comment #5)
> > What if we decided that if you call longlong_arg() on a 64-bit OS on a
> > 32-bit process you really *want* a 64-bit value from 1 register? In some
> > ways this makes sense and it matches our old behavior. My theory here is
> > that you know what you are doing and if you call longlong_arg() on a 32-bit
> > process you must be in the true 64-bit function at this point.
> 
> That assumes you're only using this in the kernel.  For a uprobe,
> longlong_arg() still needs to follow the 2-register ABI, no?  That's when
> the probing_32bit_app() branch should be active.
> 
> So yes, longlong_arg() in 64-bit kernel context should use 1 register,
> regardless of the task, but a 32-bit user context still needs to use 2
> registers.

We can certainly add that code back, but do we know of anyone using the stp_arg()-based dwarfless parameters functions from uprobes? Do we test that?
Comment 9 David Smith 2015-06-30 20:21:01 UTC
Created attachment 8403 [details]
better patch

Here's a patch that is in a state ready to check in, assuming we like the direction. It works everywhere I've tried it.
Comment 10 David Smith 2015-06-30 20:22:00 UTC
(In reply to David Smith from comment #8)
> (In reply to Josh Stone from comment #7)
> > (In reply to David Smith from comment #5)
> > > What if we decided that if you call longlong_arg() on a 64-bit OS on a
> > > 32-bit process you really *want* a 64-bit value from 1 register? In some
> > > ways this makes sense and it matches our old behavior. My theory here is
> > > that you know what you are doing and if you call longlong_arg() on a 32-bit
> > > process you must be in the true 64-bit function at this point.
> > 
> > That assumes you're only using this in the kernel.  For a uprobe,
> > longlong_arg() still needs to follow the 2-register ABI, no?  That's when
> > the probing_32bit_app() branch should be active.
> > 
> > So yes, longlong_arg() in 64-bit kernel context should use 1 register,
> > regardless of the task, but a 32-bit user context still needs to use 2
> > registers.
> 
> We can certainly add that code back, but do we know of anyone using the
> stp_arg()-based dwarfless parameters functions from uprobes? Do we test that?

From what I can tell, we don't test using dwarfless parameters from uprobes.
Comment 11 Josh Stone 2015-06-30 20:32:09 UTC
(In reply to David Smith from comment #9)
> Created attachment 8403 [details]
> better patch
> 
> Here's a patch that is in a state ready to check in, assuming we like the
> direction. It works everywhere I've tried it.

Wrong patch?  These are runtime/stat changes that you already pushed...
Comment 12 David Smith 2015-06-30 20:46:04 UTC
Created attachment 8404 [details]
actual patch

Here's the (hopefully) correct patch.
Comment 13 Martin Cermak 2015-07-01 15:21:37 UTC
The aforementioned patch brings some testcase extensions, that fail on rhel5. For instance the pwrite testcase newly has following subtest:

=======
  pwrite(-1, "Hello Again", 11, 0x12345678deadbeefLL);                          
  //staptest// pwrite (-1, "Hello Again", 11, 1311768468603649775) = NNNN
=======

For the purpose of this comment, I reduced pwrite.c to this one single pwrite call only, and dompiled it with -m31. On x86_64, value of the fourth argument is being grabbed in _stp_get_arg32_by_number(n, nr_regargs, regs, &val), where n=4 and nr_regargs=6, effectively grabbing the value from RREG(cx, regs). This works fine except of rhel5. E.g. on rhel7 we have:

=======
 7.1 S x86_64 # stap -ge 'probe kernel.function("*pwrite*") {println(pp()); print_regs()}' -c ./a.out 
WARNING: probe kernel.function("C_SYSC_pwritev@fs/read_write.c:1072") (address 0xffffffff811c7d06) registration error (rc -84)
kernel.function("sys32_pwrite@arch/x86/ia32/sys_ia32.c:183")
RIP: ffffffff81062c10
RSP: ffff880094fa3f80  EFLAGS: 00000293
RAX: 00000000000000b5 RBX: 00000000ffffffff RCX: 00000000deadbeef
RDX: 000000000000000b RSI: 00000000080485bc RDI: 00000000ffffffff
RBP: 00000000080485bc R08: 0000000012345678 R09: 00000000ffeae768
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f749a2ae740(0000) GS:ffff88022fb00000(0063) knlGS:00000000f75e66c0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f76caa10 CR3: 0000000225946000 CR4: 00000000000006e0
kernel.function("SyS_pwrite64@fs/read_write.c:542")
RIP: ffffffff811c7180
RSP: ffff880094fa3f70  EFLAGS: 00000202
RAX: 00000000000000b5 RBX: 00000000ffffffff RCX: 12345678deadbeef
RDX: 000000000000000b RSI: 00000000080485bc RDI: 00000000ffffffff
RBP: ffff880094fa3f78 R08: 12345678deadbeef R09: 00000000ffeae768
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f749a2ae740(0000) GS:ffff88022fb00000(0063) knlGS:00000000f75e66c0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f76caa10 CR3: 0000000225946000 CR4: 00000000000006e0
kernel.function("SYSC_pwrite64@fs/read_write.c:542")
RIP: ffffffff811c71a7
RSP: ffff880094fa3f28  EFLAGS: 00000246
RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 12345678deadbeef
RDX: 000000000000000b RSI: 00000000080485bc RDI: 00000000ffffffff
RBP: ffff880094fa3f68 R08: 12345678deadbeef R09: 00000000ffeae768
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f749a2ae740(0000) GS:ffff88022fb00000(0063) knlGS:00000000f75e66c0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000f76caa10 CR3: 0000000225946000 CR4: 00000000000006e0
 7.1 S x86_64 # 
=======

Whereas on rhel5 I see:

=======
 5.11 S x86_64 # stap -ge 'probe kernel.function("*pwrite*") {println(pp()); print_regs()}' -c ./a.out
kernel.function("sys32_pwrite@arch/x86_64/ia32/sys_ia32.c:690")
RIP: ffffffff800860b2
RSP: ffff81015527ff80  EFLAGS: 00000283
RAX: 00000000000000b5 RBX: 00000000ffffffff RCX: 00000000deadbeef
RDX: 000000000000000b RSI: 0000000008048578 RDI: 00000000ffffffff
RBP: 0000000008048578 R08: 00000000ffffffff R09: 00000000ffaafd48
R10: ffff81015527e000 R11: 0000000000000297 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00002b70d78cdaf0(0000) GS:ffff810181caddc0(0063) knlGS:00000000f7eed6c0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000004889c0 CR3: 0000000061b4e000 CR4: 00000000000006e0
kernel.function("sys_pwrite64@fs/read_write.c:438")
RIP: ffffffff80044241
RSP: ffff81015527ff80  EFLAGS: 00000282
RAX: 00000000000000b5 RBX: 00000000ffffffff RCX: ffffffffdeadbeef
RDX: 000000000000000b RSI: 0000000008048578 RDI: 00000000ffffffff
RBP: 0000000008048578 R08: ffffffff00000000 R09: 00000000ffaafd48
R10: ffff81015527e000 R11: 0000000000000297 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00002b70d78cdaf0(0000) GS:ffff810181caddc0(0063) knlGS:00000000f7eed6c0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00000000004889c0 CR3: 0000000061b4e000 CR4: 00000000000006e0
 5.11 S x86_64 #
=======

On rhel5 sys32_pwrite looks like this:

=======
asmlinkage long
sys32_pwrite(unsigned int fd, char __user *ubuf, u32 count, u32 poslo, u32 poshi)
{
        return sys_pwrite64(fd, ubuf, count,
                          ((loff_t)AA(poshi) << 32) | AA(poslo));
}
=======

Which overall means that in this case sys32_pwrite() is only getting truncated argument and that is also what it passes to sys_pwrite64() via CX. Looks like it's glibc's choice to throw poshi away when calling sys32_pwrite().

And indeed, on rhel7 we have:

=======
 7.1 S x86_64 # stap -e 'probe process.syscall {if ($syscall==181) printf("%d, %x, %x, %x, %x, %x\n", $syscall, $arg1, $arg2, $arg3, $arg4, $arg5)}' -c ./a.out 
181, ffffffff, 80485bc, b, deadbeef, 12345678
 7.1 S x86_64 # 
=======

Whereas on rhel5:

=======
 5.11 S x86_64 # stap -e 'probe process.syscall {if ($syscall==181) printf("%d, %x, %x, %x, %x, %x\n", $syscall, $arg1, $arg2, $arg3, $arg4, $arg5)}' -c ./a.out
181, ffffffff, 8048578, b, deadbeef, ffffffff
 5.11 S x86_64 #
=======

So this is probably okay.


Now I'm going to run patched systemtap with original testcases to check for regressions this way.
Comment 14 David Smith 2015-07-01 16:13:59 UTC
(In reply to Martin Cermak from comment #13)
> The aforementioned patch brings some testcase extensions, that fail on
> rhel5. For instance the pwrite testcase newly has following subtest:
> 
> =======
>   pwrite(-1, "Hello Again", 11, 0x12345678deadbeefLL);                      
> 
>   //staptest// pwrite (-1, "Hello Again", 11, 1311768468603649775) = NNNN
> =======

I wasn't 100% sure what is going on here. I believe glibc is doing this, since strace isn't reporting the correct values either on RHEL5. syscall.pwrite was reporting the same values, so I didn't worry about it too much.

I believe I've figured this out. If you add the following line to the test case, it should work fine:

#define _XOPEN_SOURCE 500                                                       

I'm doing some more testing, but it appears to fix things.
Comment 15 Martin Cermak 2015-07-01 17:10:47 UTC
Good news - Applying patch from c#12 minus testcase updates shows no regressions here (compared to test results for unpatched bits).

(In reply to David Smith from comment #14)
> If you add the following line to the test
> case, it should work fine:
> 
> #define _XOPEN_SOURCE 500                                                   

This helps here.
Comment 16 David Smith 2015-07-01 17:29:35 UTC
The last patch (plus that test case fix) checked in as commit 7a563a0.
Comment 17 Martin Cermak 2015-07-03 14:18:03 UTC
Created attachment 8410 [details]
testcase proposal

(In reply to David Smith from comment #10)
> From what I can tell, we don't test using dwarfless parameters from uprobes.
Comment 18 David Smith 2015-07-06 13:45:22 UTC
(In reply to Martin Cermak from comment #17)
> Created attachment 8410 [details]
> testcase proposal
> 
> (In reply to David Smith from comment #10)
> > From what I can tell, we don't test using dwarfless parameters from uprobes.

The test is a reasonable start, but it only tests compat long long handling. We'd really want to test all types using dwarfless parameters from a uprobe.

Could you also open a new bug on this?
Comment 19 Martin Cermak 2015-07-08 06:09:10 UTC
(In reply to David Smith from comment #18)
> The test is a reasonable start, but it only tests compat long long handling.
> We'd really want to test all types using dwarfless parameters from a uprobe.
> 
> Could you also open a new bug on this?

Yup, filed PR18630 to track it.