This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation

(+Chung-Lin Tang for possible nios2 bug.)

On 6/16/2016 11:49 AM, Adhemerval Zanella wrote:
On 16/06/2016 12:25, Chris Metcalf wrote:
On 6/15/2016 4:17 PM, Adhemerval Zanella wrote:
On 15/06/2016 15:21, Chris Metcalf wrote:
On 6/15/2016 9:45 AM, Adhemerval Zanella wrote:
On 15/06/2016 02:37, Mike Frysinger wrote:
On 14 Jun 2016 18:54, Adhemerval Zanella wrote:
This patch fixes the p{readv,writev}{64} consolidation implementation
from commits 4e77815 and af5fdf5.  Different from pread/pwrite
implementation, preadv/pwritev implementation does not require
__ALIGNMENT_ARG because kernel syscall prototypes define
the high and low part of the off_t, if it is the case, directly
(different from pread/pwrite where the architecture ABI for passing
64-bit values must be in consideration for passsing the arguments).
i had looked at that specifically but thought it ok because the old code
was using the alignment arg.  was the old code broken too ?

this is what the preadv code looked like:
-__libc_preadv (int fd, const struct iovec *vector, int count, off_t offset)
-  assert (sizeof (offset) == 4);
-  return SYSCALL_CANCEL (preadv, fd,
-                         vector, count, __ALIGNMENT_ARG
-                         __LONG_LONG_PAIR (offset >> 31, offset));

although i guess this isn't too surprising as this code was in the
generic sysdeps dir which currently doesn't have as many users as
we wish it did :).

The idea is not really to align the argument to zero pass, but rather to split
the possible 64-bits argument in high and low as required (as the default
implementation was doing [2]). On tile, it is working because the preadv.c
offset is 32-bits and thus the high word is indeed zero, but it is passing
one superfluous argument.
No, what happens is that instead of passing r3=pos_low=lo, r4=pos_high=0, we
are passing r3=pos_low=0, r4=pos_high=lo, r5=0.  This means that we get a crazy
high offset and either fail or (for a sufficiently large file) get the wrong data.
Filed as bug 20261.
I was referring to *old* behaviour (pre-consolidation) implementation
(the sysdeps/unix/sysv/linux/generic/wordsize-32/preadv.c), which did:

    return SYSCALL_CANCEL (preadv, fd,
                           vector, count, __ALIGNMENT_ARG
                           __LONG_LONG_PAIR (offset >> 31, offset));

It has 2 issue:

   1. It passed one superfluous argument to preadv. On tilepro build
      I noted that is using internal_syscall6, which means it is passing
      { fd, vector, cound, 0, offset, offset>>31 }.  It is not wrong,
      since for this code off_t will be the old 32-bit value, but the
      semantic is wrong.
No, it's definitely wrong :-)

We pass "offset" as the high part, and "0" as the low part. Accordingly, what
the kernel sees is a request for "offset << 32" rather than "offset".  The bug
report I filed contains a repro that does fail on tilegx32.
I am not following because I am referring to the old code that was *suppose*
to work on tile (either on 32-bits kernel or 64-bits kernel with compat
enabled). I rechecked the compilation against GLIBC 2.23 (which does not
contain this consolidation) and I am seeing tilepro passing:

"R00" (fd), "R01" (vector), "R02" (count), "R03" (0), "R04" (offset), "R05" (offset >> 31)

This is wrong, and this was indeed a bug prior to your consolidation patch.
This is a generic/wordsize-32 specific bug which apparently was not caught
during testing of tilepro or tilegx32.  In addition to the erroneous __ALIGNMENT_ARG,
we should be using LO_HI_LONG() here rather than __LONG_LONG_PAIR.
I suspect this is also a bug in the nios2 glibc.

Which indeed seems wrong with the issue you pointed out.  So how was this
suppose to work on previous GLIBC?

It doesn't.  That's why I filed the bug.

The new consolidation implementation indeed contains a bug and that's what
my v2 [1] is trying to fix.  With the v2 I am not seeing:

"R00" (fd), "R01" (vector), "R02" (count), "R03" ((long) (offset)), "R04" ((long) (((uint64_t) (offset)) >> 32))


When you say "not seeing", I'm assuming you mean "now seeing".

In any case, "vector, count, lo, hi" is what we should see here.

   2. __LONG_LONG_PAIR is not correct for big-endian.
Yes, this needs to be fixed on the kernel side for consistency. Many other
syscalls pass their arguments with __LONG_LONG_PAIR.  Handling it this way
in the kernel seems pretty standard at this point, e.g. regs_to_64 in
arch/arm64/include/asm/assembler.h, or merge_64 in
arch/mips/kernel/linux32.c.  Even the fixed-endian platforms do it
consistently, e.g. "hi, lo" on parisc/s390/sparc and "lo, hi" on metag/x86.

It's possible little-endian compat powerpc shares this bug since it seems to
take fixed "hi, lo" arguments, e.g. in compat_sys_readahead().  I'll
forward this email on to the powerpc kernel maintainers to make sure
they have a chance to think about it.

I *really* do not think this is correct approach mainly because it break an
already defined kernel ABI and old BE preadv call on GLIBC was not done
by using __LONG_LONG_PAIR.  Old GLIBC will just fail in this new kernel
interface.  Unfortunately this is something we will need to carry on IMHO.

That's why I just removed the wrong assumption in old
sysdeps/unix/sysv/linux/generic/wordsize-32/preadv{64} about __LONG_LONG_PAIR.

Yes, there are two issues here that I am conflating - sorry about that.

The preadv/pwritev stuff should be using LO_HI_LONG.  I think your
v2 patch is correct.  No kernel change is required here.

When I went and reviewed this stuff for tile, it led me to the tile kernel ABI
code that was not properly handling __LONG_LONG_PAIR in the kernel
for APIs that expect 64-bit values like truncate/ftruncate, pread64/pwrite64,
sync_file_range, and fallocate.

I think our installed base of "tilepro/tilegx32 big-endian" is so small as to
be effectively zero.  We had one or two customers interested in big-endian 64-bit,
and a couple of customers interested in tilegx compat 32-bit, but I don't think we
ever had customers interested in the cross product of those two features.

Generally speaking I would otherwise agree and say we should provide
tile-specific variants of all those broken syscalls in glibc to support the existing ABI.

Now for BZ#20261 I do not think it applicable since this is a fix for
consolidation done in development phase, it does not appear in any
released version.
It breaks on all versions, as far as I can tell.  I tested it on our RHEL 6 platform
(based on backported glibc 2.12) and it fails there.
What exactly breaks in all version? What I am referring is that not the
bug is 'invalid', but rather it does not apply to a *user visible* one
because the consolidation code is only in 2.24 master branch (it is not
in any *released* version).

Now, if it fails on older releases it is not due this change, but rather
due old sysdeps/unix/sysv/linux/generic/wordsize-32/preadv{64}.  Now, if
it is the case it is indeed a user visible bug.

Yes, that's exactly the case for preadv/pwritev, due to the bogus _ALIGNMENT_ARG;
it breaks on all 32-bit tile platforms with all public releases.

Chris Metcalf, Mellanox Technologies

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]