This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation

From: Chris Metcalf <cmetcalf at mellanox dot com>
To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, <libc-alpha at sourceware dot org>, Mike Frysinger <vapier at gentoo dot org>
Date: Wed, 15 Jun 2016 14:21:29 -0400
Subject: Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation
Authentication-results: sourceware.org; auth=none
Authentication-results: spf=none (sender IP is ) smtp dot mailfrom=cmetcalf at mellanox dot com;
References: <1465941275-3459-1-git-send-email-adhemerval dot zanella at linaro dot org> <20160615053714 dot GM4053 at vapier dot lan> <57615BFB dot 2050101 at linaro dot org>
Spamdiagnosticmetadata: NSPM
Spamdiagnosticoutput: 1:99

On 6/15/2016 9:45 AM, Adhemerval Zanella wrote:

On 15/06/2016 02:37, Mike Frysinger wrote:

On 14 Jun 2016 18:54, Adhemerval Zanella wrote:

This patch fixes the p{readv,writev}{64} consolidation implementation
from commits 4e77815 and af5fdf5.  Different from pread/pwrite
implementation, preadv/pwritev implementation does not require
__ALIGNMENT_ARG because kernel syscall prototypes define
the high and low part of the off_t, if it is the case, directly
(different from pread/pwrite where the architecture ABI for passing
64-bit values must be in consideration for passsing the arguments).

i had looked at that specifically but thought it ok because the old code
was using the alignment arg.  was the old code broken too ?

this is what the preadv code looked like:
-ssize_t
-__libc_preadv (int fd, const struct iovec *vector, int count, off_t offset)
-{
-  assert (sizeof (offset) == 4);
-  return SYSCALL_CANCEL (preadv, fd,
-                         vector, count, __ALIGNMENT_ARG
-                         __LONG_LONG_PAIR (offset >> 31, offset));
-}

although i guess this isn't too surprising as this code was in the
generic sysdeps dir which currently doesn't have as many users as
we wish it did :).

I though it too, but before the consolidation patch only nios2 and tile 32-bits
used the generic preadv.c implementation.  And only tile 32-bits defines
__ASSUME_ALIGNED_REGISTER_PAIRS (so __ALIGNMENT_ARG will pad the argument with 0).
However since the syscall is defined as in linux source:

fs/read_write.c:

  991 SYSCALL_DEFINE5(preadv, unsigned long, fd, const struct iovec __user *, vec,
  992                 unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h)
  993 {
  994         loff_t pos = pos_from_hilo(pos_h, pos_l);
  995
  996         return do_preadv(fd, vec, vlen, pos, 0);
  997 }


Actually, the relevant code at fs/read_write.c:1162 is:

COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd,
        const struct compat_iovec __user *,vec,
        compat_ulong_t, vlen, u32, pos_low, u32, pos_high)
{
    loff_t pos = ((loff_t)pos_high << 32) | pos_low;

    return do_compat_preadv64(fd, vec, vlen, pos, 0);
}

but it amounts to the same thing as far the arguments are concerned.

The idea is not really to align the argument to zero pass, but rather to split
the possible 64-bits argument in high and low as required (as the default
implementation was doing [2]). On tile, it is working because the preadv.c
offset is 32-bits and thus the high word is indeed zero, but it is passing
one superfluous argument.


No, what happens is that instead of passing r3=pos_low=lo, r4=pos_high=0, we
are passing r3=pos_low=0, r4=pos_high=lo, r5=0.  This means that we get a crazy
high offset and either fail or (for a sufficiently large file) get the wrong data.
Filed as bug 20261.

It appears to be true for both preadv and pwritev, though I only tested preadv.

Also my understanding is this generic implementation
does work correctly in every architecture because __LONG_LONG_PAIR relies
on endianess.


In fact that's another bug; __LONG_LONG_PAIR is intended only to
be used if you are simulating passing a 64-bit value in 32-bit registers,
where the ABI naturally splits the 64 bits into a high and low part
according to endianness.

In this example we are calling into the kernel where it expects a pos_low
and a pos_high in a particular order.  On a big-endian machine, the
__LONG_LONG_PAIR will put the high part first and the kernel will
see the wrong offset.

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

Follow-Ups:
- Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation
  - From: Adhemerval Zanella

References:
- [PATCH] Fix p{readv,writev}{64} consolidation implementation
  - From: Adhemerval Zanella
- Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation
  - From: Mike Frysinger
- Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation
  - From: Adhemerval Zanella

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]