This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v4] Fix Linux fcntl OFD locks for non-LFS architectures (BZ#20251)
On 09/07/2018 11:56, Stefan Liebler wrote:
> On 07/09/2018 04:20 PM, Adhemerval Zanella wrote:
>>
>>
>> On 09/07/2018 10:44, Stefan Liebler wrote:
>>> On 07/06/2018 04:45 PM, Adhemerval Zanella wrote:
>>>>
>>>>
>>>> On 06/07/2018 11:09, Adhemerval Zanella wrote:
>>>>>
>>>>>
>>>>> On 06/07/2018 10:00, Stefan Liebler wrote:
>>>>>> On 06/20/2018 11:43 PM, Adhemerval Zanella wrote:
>>>>>>> Changes from previous version:
>>>>>>>
>>>>>>> - Add a testcase for compat fcntl using OFD locks.
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> This patch fixes the OFD ("file private") locks for architectures that
>>>>>>> support non-LFS flock definition (__USE_FILE_OFFSET64 not defined). The
>>>>>>> issue in this case is both F_OFD_{GETLK,SETLK,SETLKW} and
>>>>>>> F_{SET,GET}L{W}K64 expects a flock64 argument and when using old
>>>>>>> F_OFD_* flags with a non LFS flock argument the kernel might interpret
>>>>>>> the underlying data wrongly. Kernel idea originally was to avoid using
>>>>>>> such flags in non-LFS syscall, but since GLIBC uses fcntl with LFS
>>>>>>> semantic as default it is possible to provide the functionality and
>>>>>>> avoid the bogus struct kernel passing by adjusting the struct manually
>>>>>>> for the required flags.
>>>>>>>
>>>>>>> The idea follows other LFS interfaces that provide two symbols:
>>>>>>>
>>>>>>> 1. A new LFS fcntl64 is added on default ABI with the usual macros to
>>>>>>> select it for FILE_OFFSET_BITS=64.
>>>>>>>
>>>>>>> 2. The Linux non-LFS fcntl use a stack allocated struct flock64 for
>>>>>>> F_OFD_{GETLK,SETLK,SETLKW} copy the results on the user provided
>>>>>>> struct.
>>>>>>>
>>>>>>> 3. Keep a compat symbol with old broken semantic for architectures
>>>>>>> that do not define __OFF_T_MATCHES_OFF64_T.
>>>>>>>
>>>>>>> So for architectures which defines __USE_FILE_OFFSET64, fcntl64 will
>>>>>>> aliased to fcntl and no adjustment would be required. So to actually
>>>>>>> use F_OFD_* with LFS support the source must be built with LFS support
>>>>>>> (_FILE_OFFSET_BITS=64).
>>>>>>>
>>>>>>> Also F_OFD_SETLKW command is handled a cancellation point, as for
>>>>>>> F_SETLKW{64}.
>>>>>>>
>>>>>>> Checked on x86_64-linux-gnu and i686-linux-gnu.
>>>>>>>
>>>>>> ...
>>>>>>
>>>>>> Hi Adhemerval,
>>>>>>
>>>>>> I'm running the new test misc/tst-ofdlocks-compat on s390-32.
>>>>>> If I run it on linux 4.17, the test succeeds and after the second fcntl call which returns zero, lck contains the region from the first fcntl call:
>>>>>> (gdb) p/x lck
>>>>>> $2 = {l_type = 0x1, l_whence = 0x0, l_start = 0x800003ff, l_len = 0x400, l_pid = 0xffffffff}
>>>>>>
>>>>>> If I run it on linux 4.14, the test fails. There the second fcntl returns -1 and errno = EOVERFLOW
>>>>>> In this case, lck is not updated:
>>>>>> p/x lck
>>>>>> $4 = {l_type = 0x1, l_whence = 0x0, l_start = 0x7ffffbff, l_len = 0x1000, l_pid = 0x0}
>>>>>>
>>>>>> In both cases struct flock64 is just passed to syscall fcntl64 via __old_libc_fcntl64.
>>>>>>
>>>>>> Are the different behaviours related to a change in kernel-code?
>>>>>
>>>>> I think it is due the patch 'fcntl: don't cap l_start and l_end values
>>>>> for F_GETLK64 in compat syscall' (4d2dc2cc766c3b51929658cacbc6e34fc8e242fb
>>>>> added on v4.15). Previously for COMPAT_SYSCALL_DEFINE3(fcntl64,...) kernel
>>>>> did:
>>>>>
>>>>> static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
>>>>> compat_ulong_t arg)
>>>>> [...]
>>>>> case F_GETLK64:
>>>>> case F_OFD_GETLK:
>>>>> err = get_compat_flock64(&flock, compat_ptr(arg));
>>>>> if (err)
>>>>> break;
>>>>> err = fixup_compat_flock(&flock);
>>>>> if (err)
>>>>> return err;
>>>>> err = put_compat_flock64(&flock, compat_ptr(arg));
>>>>> break;
>>>>> [....]
>>>>>
>>>>> It means that even if the fcntl(..., F_OFD_GETLK, struct flock64) did not
>>>>> fail, kernel will return EOVERFLOW due 'fixup_compat_flock'. The patch
>>>>> changed to:
>>>>>
>>>>> [...]
>>>>> case F_GETLK64:
>>>>> case F_OFD_GETLK:
>>>>> err = get_compat_flock64(&flock, compat_ptr(arg));
>>>>> if (!err)
>>>>> err= put_compat_flock64(&flock, compat_ptr(arg));
>>>>> break;
>>>>> [....]
>>>>>
>>> Yes, this sounds reasonable.
>>> It was introduced with commit 'fs/locks: don't mess with the address limit in compat_fcntl64' (94073ad77fff221b5e66b8b9863a546ba212d6a3)
>>>
>>>>> So if compat fcntl will just return if get_compat_flock64 succeed. Also
>>>>> afaiu only compat syscall is affected by it I think, the generic OFD
>>>>> locks for 32 bits go through fcntl_getlk64 and it does return EOVERFLOW
>>>>> in the aforementioned case.
>>>>>
>>>>> I am not sure which would be the best way to get around this kernel
>>>>> issue, any suggestion to harden the testcase? It also suggest to me that
>>>>> the possible usercase I assume in my testcase (OFD locks with struct
>>>>> flock64) never really worked on previous releases...
>>>>>
>>>>
>>>> Also, on x86 4.4 where actually tested the kernel OFD locks does:
>>>>
>>>> fs/compat.c:
>>>> [...]
>>>> case F_GETLK64:
>>>> case F_SETLK64:
>>>> case F_SETLKW64:
>>>> case F_OFD_GETLK:
>>>> case F_OFD_SETLK:
>>>> case F_OFD_SETLKW:
>>>> ret = get_compat_flock64(&f, compat_ptr(arg));
>>>> if (ret != 0)
>>>> break;
>>>> old_fs = get_fs();
>>>> set_fs(KERNEL_DS);
>>>> conv_cmd = convert_fcntl_cmd(cmd);
>>>> ret = sys_fcntl(fd, conv_cmd, (unsigned long)&f);
>>>> set_fs(old_fs);
>>>> if ((conv_cmd == F_GETLK || conv_cmd == F_OFD_GETLK) && ret == 0) {
>>>> /* need to return lock information - see above for commentary */
>>>> if (f.l_start > COMPAT_LOFF_T_MAX)
>>>> ret = -EOVERFLOW;
>>>> if (f.l_len > COMPAT_LOFF_T_MAX)
>>>> f.l_len = COMPAT_LOFF_T_MAX;
>>>> if (ret == 0)
>>>> ret = put_compat_flock64(&f, compat_ptr(arg));
>>>> }
>>>> break;
>>>> [...]
>>>>
>>>> Which is similar to 4.14, however COMPAT_LOFF_T_MAX is defined as
>>>> 0x7fffffffffffffffL for all architectures and l_start/l_len are
>>>> defined as __kernel_long_t (which for a 64 bits kernel is 'long').
>>>> So the tests are not actually checking for overflow.
>>> Yes. They do not test for overflowing 32bit long. But as fcntl64 with command F_OFD_GETLK assumes 64bit long, it is fine?
>>>>
>>>> Recent kernels do the right overflow check by using COMPAT_OFF_T_MAX
>>>> with a correct value (0x7fffffff on x86). It seems to be fixed by
>>>> 80f0cce6aadebf6caf74d1f8ceb4b008ca72a9e9 (v4.12).
>>>>
>>> Sure? This commit introduces the following implementation and I think those checks are the same as before:
>>> COMPAT_SYSCALL_DEFINE3(fcntl64,...
>>
>> Not sure, I just try to see if there a kernel change by inspecting the
>> kernel patches.
>>
>>> ...
>>> case F_GETLK:
>>> ... if (f.l_start > COMPAT_OFF_T_MAX)
>>> ret = -EOVERFLOW;
>>> ...
>>> case F_OFD_GETLK:
>>> ... if (f.l_start > COMPAT_LOFF_T_MAX)
>>> ret = -EOVERFLOW;
>>>
>>> In recent kernels, only the command F_GETLK is using fixup_compat_flock() which checks against COMPAT_OFF_T_MAX.
>>>
>>> The errno=EOVERFLOW for the second fcntl call in tst-ofdlocks.c is generated by the new glibc check, but not from the kernel.
>>
>> This is the expected behaviour, the new fcntl symbol for non default LFS
>> ABI returns EOVERFLOW.
>>
>>> I've got errno=EOVERFLOW with my kernel 4.14 due to the call of fixup_compat_flock.
>>> I've also retested it on a system with kernel 4.10. There the kernel does not return with errno=EOVERFLOW.
>>>
>>
>> Can you verify that for s390-32 tst-ofdlocks-compat.c is a really using the
>> old version (i.e passing the arguments to kernel unmodified)? Otherwise
>> I am not sure if this is a glibc bug.
>>
> Yes it is. I'm running the s390-32 tst-ofdlocks-compat testcase in 64bit gdb and we are here:
> 72 TEST_VERIFY (fcntl (fd, F_OFD_GETLK, &lck) == 0);
>
> (gdb) p sizeof (lck)
> $2 = 32
>
> (gdb) p/x lck
> $3 = {l_type = 0x1, l_whence = 0x0, l_start = 0x7ffffbff, l_len = 0x1000,
> l_pid = 0x0}
>
> (gdb) x/4gx &lck
> 0x7fffeae8: 0x0001000000000000 0x000000007ffffbff
> 0x7fffeaf8: 0x0000000000001000 0x0000000000000000
>
> Stepping to syscall: ...
>
> (gdb) where
> #0 __fcntl64_nocancel_adjusted (fd=fd@entry=4, cmd=cmd@entry=36, arg=arg@entry=0x7fffeae8)
> at ../sysdeps/unix/sysv/linux/fcntl_nocancel.c:64
> #1 0x7df1e076 in __GI___libc_fcntl64 (fd=4, cmd=36) at ../sysdeps/unix/sysv/linux/fcntl64.c:51
> #2 0x7df1e006 in __old_libc_fcntl64 (fd=<optimized out>, cmd=<optimized out>) at ../sysdeps/unix/sysv/linux/fcntl.c:114
> #3 0x00401a98 in do_test () at ../sysdeps/unix/sysv/linux/tst-ofdlocks-compat.c:72
> #4 0x00401f42 in run_test_function (config=0x7fffebe8, config=0x7fffebe8, argv=0x7fffed48, argc=<optimized out>)
> at support_test_main.c:156
> #5 support_test_main (argc=<optimized out>, argv=0x7fffed48, config=config@entry=0x7fffebe8) at support_test_main.c:322
> #6 0x004017b0 in main (argc=<optimized out>, argv=<optimized out>) at ../support/test-driver.c:168
>
> We are just before the syscall:
> >│0x7df25fe0 <__fcntl64_nocancel_adjusted+32> svc 221
>
> (gdb) x/4gx arg
> 0x7fffeae8: 0x0001000000000000 0x000000007ffffbff
> 0x7fffeaf8: 0x0000000000001000 0x0000000000000000
>
> (gdb) si
> (gdb) ir 2
> r2 0xffffffffffffffb5 18446744073709551541
>
> (gdb) x/4gx arg
> 0x7fffeae8: 0x0001000000000000 0x000000007ffffbff
> 0x7fffeaf8: 0x0000000000001000 0x0000000000000000
>
Right, so we are sure the test is actually doing what is intending to do.
Now back to kernel, afaiu commit 4d2dc2cc76 does reference it fixes 94073ad77fff2:
---
fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
Currently, we're capping the values too low in the F_GETLK64 case. The
fields in that structure are 64-bit values, so we shouldn't need to do
any sort of fixup there.
Make sure we check that assumption at build time in the future however
by ensuring that the sizes we're copying will fit.
With this, we no longer need COMPAT_LOFF_T_MAX either, so remove it.
Fixes: 94073ad77fff2 (fs/locks: don't mess with the address limit in compat_fcntl64)
---
And 94073ad77fff2 does add a fixup_compat_flock check for F_OFD_GETLK (not
only for F_GETLK64):
---
+ case F_GETLK64:
+ case F_OFD_GETLK:
+ err = get_compat_flock64(&flock, compat_ptr(arg));
+ if (err)
+ break;
+ err = fcntl_getlk(f.file, convert_fcntl_cmd(cmd), &flock);
+ if (err)
+ break;
+ err = fixup_compat_flock(&flock);
+ if (err)
+ return err;
+ err = put_compat_flock64(&flock, compat_ptr(arg));
+ break;
---
In any way, it is still a kernel issue because prior 94073ad77fff2 F_OFD_GETLK
were handle as:
---
case F_GETLK64:
case F_SETLK64:
case F_SETLKW64:
case F_OFD_GETLK:
case F_OFD_SETLK:
case F_OFD_SETLKW:
ret = get_compat_flock64(&f, compat_ptr(arg));
if (ret != 0)
break;
old_fs = get_fs();
set_fs(KERNEL_DS);
conv_cmd = convert_fcntl_cmd(cmd);
ret = sys_fcntl(fd, conv_cmd, (unsigned long)&f);
set_fs(old_fs);
if ((conv_cmd == F_GETLK || conv_cmd == F_OFD_GETLK) && ret == 0) {
/* need to return lock information - see above for commentary */
if (f.l_start > COMPAT_LOFF_T_MAX)
ret = -EOVERFLOW;
if (f.l_len > COMPAT_LOFF_T_MAX)
f.l_len = COMPAT_LOFF_T_MAX;
if (ret == 0)
ret = put_compat_flock64(&f, compat_ptr(arg));
}
break;
---
And COMPAT_LOFF_T_MAX was defined as 0x7fffffffffffffffL for all architectures.
So it seems that kernel between 4.13 through 4.15 have this issue with for
compat kernels and I do think it is a kernel issue because fcntl64 is the
expected way to use OFD locks. GLIBC returns EOVERFLOW because from
application standpoint, it should use LFS variant instead.