This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Chris Metcalf <cmetcalf at mellanox dot com>, libc-alpha at sourceware dot org, Mike Frysinger <vapier at gentoo dot org>, Chung-Lin Tang <cltang at codesourcery dot com>
- Date: Tue, 21 Jun 2016 10:17:48 -0300
- Subject: Re: [PATCH] Fix p{readv,writev}{64} consolidation implementation
- Authentication-results: sourceware.org; auth=none
- References: <1465941275-3459-1-git-send-email-adhemerval dot zanella at linaro dot org> <20160615053714 dot GM4053 at vapier dot lan> <57615BFB dot 2050101 at linaro dot org> <e9d7c072-9fd7-e4ab-5b65-6974a7597c77 at mellanox dot com> <5761B7ED dot 6040102 at linaro dot org> <fc3b9249-9fa8-ed16-980a-af69a65dacf9 at mellanox dot com> <5762CA72 dot 8080000 at linaro dot org> <cc81cdf8-b004-c9b4-d507-799c238f3a4e at mellanox dot com>
On 20/06/2016 16:26, Chris Metcalf wrote:
> (+Chung-Lin Tang for possible nios2 bug.)
>
> On 6/16/2016 11:49 AM, Adhemerval Zanella wrote:
>> On 16/06/2016 12:25, Chris Metcalf wrote:
>>> On 6/15/2016 4:17 PM, Adhemerval Zanella wrote:
>>>> On 15/06/2016 15:21, Chris Metcalf wrote:
>>>>> On 6/15/2016 9:45 AM, Adhemerval Zanella wrote:
>>>>>> On 15/06/2016 02:37, Mike Frysinger wrote:
>>>>>>> On 14 Jun 2016 18:54, Adhemerval Zanella wrote:
>>>>>>>> This patch fixes the p{readv,writev}{64} consolidation implementation
>>>>>>>> from commits 4e77815 and af5fdf5. Different from pread/pwrite
>>>>>>>> implementation, preadv/pwritev implementation does not require
>>>>>>>> __ALIGNMENT_ARG because kernel syscall prototypes define
>>>>>>>> the high and low part of the off_t, if it is the case, directly
>>>>>>>> (different from pread/pwrite where the architecture ABI for passing
>>>>>>>> 64-bit values must be in consideration for passsing the arguments).
>>>>>>> i had looked at that specifically but thought it ok because the old code
>>>>>>> was using the alignment arg. was the old code broken too ?
>>>>>>>
>>>>>>> this is what the preadv code looked like:
>>>>>>> -ssize_t
>>>>>>> -__libc_preadv (int fd, const struct iovec *vector, int count, off_t offset)
>>>>>>> -{
>>>>>>> - assert (sizeof (offset) == 4);
>>>>>>> - return SYSCALL_CANCEL (preadv, fd,
>>>>>>> - vector, count, __ALIGNMENT_ARG
>>>>>>> - __LONG_LONG_PAIR (offset >> 31, offset));
>>>>>>> -}
>>>>>>>
>>>>>>> although i guess this isn't too surprising as this code was in the
>>>>>>> generic sysdeps dir which currently doesn't have as many users as
>>>>>>> we wish it did :).
>>>>>>>
>>>>>> The idea is not really to align the argument to zero pass, but rather to split
>>>>>> the possible 64-bits argument in high and low as required (as the default
>>>>>> implementation was doing [2]). On tile, it is working because the preadv.c
>>>>>> offset is 32-bits and thus the high word is indeed zero, but it is passing
>>>>>> one superfluous argument.
>>>>> No, what happens is that instead of passing r3=pos_low=lo, r4=pos_high=0, we
>>>>> are passing r3=pos_low=0, r4=pos_high=lo, r5=0. This means that we get a crazy
>>>>> high offset and either fail or (for a sufficiently large file) get the wrong data.
>>>>> Filed as bug 20261.
>>>> I was referring to *old* behaviour (pre-consolidation) implementation
>>>> (the sysdeps/unix/sysv/linux/generic/wordsize-32/preadv.c), which did:
>>>>
>>>> --
>>>> return SYSCALL_CANCEL (preadv, fd,
>>>> vector, count, __ALIGNMENT_ARG
>>>> __LONG_LONG_PAIR (offset >> 31, offset));
>>>> --
>>>>
>>>> It has 2 issue:
>>>>
>>>> 1. It passed one superfluous argument to preadv. On tilepro build
>>>> I noted that is using internal_syscall6, which means it is passing
>>>> { fd, vector, cound, 0, offset, offset>>31 }. It is not wrong,
>>>> since for this code off_t will be the old 32-bit value, but the
>>>> semantic is wrong.
>>> No, it's definitely wrong :-)
>>>
>>> We pass "offset" as the high part, and "0" as the low part. Accordingly, what
>>> the kernel sees is a request for "offset << 32" rather than "offset". The bug
>>> report I filed contains a repro that does fail on tilegx32.
>> I am not following because I am referring to the old code that was *suppose*
>> to work on tile (either on 32-bits kernel or 64-bits kernel with compat
>> enabled). I rechecked the compilation against GLIBC 2.23 (which does not
>> contain this consolidation) and I am seeing tilepro passing:
>>
>> "R00" (fd), "R01" (vector), "R02" (count), "R03" (0), "R04" (offset), "R05" (offset >> 31)
>
> This is wrong, and this was indeed a bug prior to your consolidation patch.
> This is a generic/wordsize-32 specific bug which apparently was not caught
> during testing of tilepro or tilegx32. In addition to the erroneous __ALIGNMENT_ARG,
> we should be using LO_HI_LONG() here rather than __LONG_LONG_PAIR.
> I suspect this is also a bug in the nios2 glibc.
Ah right then.
>
>> Which indeed seems wrong with the issue you pointed out. So how was this
>> suppose to work on previous GLIBC?
>
> It doesn't. That's why I filed the bug.
Ack.
>
>> The new consolidation implementation indeed contains a bug and that's what
>> my v2 [1] is trying to fix. With the v2 I am not seeing:
>>
>> "R00" (fd), "R01" (vector), "R02" (count), "R03" ((long) (offset)), "R04" ((long) (((uint64_t) (offset)) >> 32))
>>
>> [1]https://sourceware.org/ml/libc-alpha/2016-06/msg00607.html
>
> When you say "not seeing", I'm assuming you mean "now seeing".
>
> In any case, "vector, count, lo, hi" is what we should see here.
Yes, s/not/now. Right, that is my understanding as well.
>
>>>> 2. __LONG_LONG_PAIR is not correct for big-endian.
>>> Yes, this needs to be fixed on the kernel side for consistency. Many other
>>> syscalls pass their arguments with __LONG_LONG_PAIR. Handling it this way
>>> in the kernel seems pretty standard at this point, e.g. regs_to_64 in
>>> arch/arm64/include/asm/assembler.h, or merge_64 in
>>> arch/mips/kernel/linux32.c. Even the fixed-endian platforms do it
>>> consistently, e.g. "hi, lo" on parisc/s390/sparc and "lo, hi" on metag/x86.
>>>
>>> It's possible little-endian compat powerpc shares this bug since it seems to
>>> take fixed "hi, lo" arguments, e.g. in compat_sys_readahead(). I'll
>>> forward this email on to the powerpc kernel maintainers to make sure
>>> they have a chance to think about it.
>>>
>>> Seehttps://lkml.kernel.org/g/1466019219-10462-1-git-send-email-cmetcalf@mellanox.com
>> I *really* do not think this is correct approach mainly because it break an
>> already defined kernel ABI and old BE preadv call on GLIBC was not done
>> by using __LONG_LONG_PAIR. Old GLIBC will just fail in this new kernel
>> interface. Unfortunately this is something we will need to carry on IMHO.
>>
>> That's why I just removed the wrong assumption in old
>> sysdeps/unix/sysv/linux/generic/wordsize-32/preadv{64} about __LONG_LONG_PAIR.
>
> Yes, there are two issues here that I am conflating - sorry about that.
>
> The preadv/pwritev stuff should be using LO_HI_LONG. I think your
> v2 patch is correct. No kernel change is required here.
Nice, I was with the wrong assumption you had concerns regarding preadv/writev
patch.
>
> When I went and reviewed this stuff for tile, it led me to the tile kernel ABI
> code that was not properly handling __LONG_LONG_PAIR in the kernel
> for APIs that expect 64-bit values like truncate/ftruncate, pread64/pwrite64,
> sync_file_range, and fallocate.
>
> I think our installed base of "tilepro/tilegx32 big-endian" is so small as to
> be effectively zero. We had one or two customers interested in big-endian 64-bit,
> and a couple of customers interested in tilegx compat 32-bit, but I don't think we
> ever had customers interested in the cross product of those two features.
>
> Generally speaking I would otherwise agree and say we should provide
> tile-specific variants of all those broken syscalls in glibc to support the existing ABI.
Right, I was assuming a more conservative approach regarding kernel/libc
API. But I see your point regarding the base installation.
>
>>>> Now for BZ#20261 I do not think it applicable since this is a fix for
>>>> consolidation done in development phase, it does not appear in any
>>>> released version.
>>> It breaks on all versions, as far as I can tell. I tested it on our RHEL 6 platform
>>> (based on backported glibc 2.12) and it fails there.
>> What exactly breaks in all version? What I am referring is that not the
>> bug is 'invalid', but rather it does not apply to a *user visible* one
>> because the consolidation code is only in 2.24 master branch (it is not
>> in any *released* version).
>>
>> Now, if it fails on older releases it is not due this change, but rather
>> due old sysdeps/unix/sysv/linux/generic/wordsize-32/preadv{64}. Now, if
>> it is the case it is indeed a user visible bug.
>
> Yes, that's exactly the case for preadv/pwritev, due to the bogus _ALIGNMENT_ARG;
> it breaks on all 32-bit tile platforms with all public releases.
>
Right, I think we now aligned about pread/pwritev and I will commit it soon. Thanks
for the reply!