This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC v3 03/23] sysdeps/wait: Use waitid if avaliable
- From: ebiederm at xmission dot com (Eric W. Biederman)
- To: Arnd Bergmann <arnd at arndb dot de>
- Cc: Rich Felker <dalias at libc dot org>, Alistair Francis <alistair dot francis at wdc dot com>, GNU C Library <libc-alpha at sourceware dot org>, Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, Florian Weimer <fweimer at redhat dot com>, Palmer Dabbelt <palmer at sifive dot com>, macro at wdc dot com, Zong Li <zongbox at gmail dot com>, Alistair Francis <alistair23 at gmail dot com>, Andrew Morton <akpm at linux-foundation dot org>, Linus Torvalds <torvalds at linux-foundation dot org>, Al Viro <viro at zeniv dot linux dot org dot uk>, Christian Brauner <christian at brauner dot io>, "H. Peter Anvin" <hpa at zytor dot com>
- Date: Sun, 21 Jul 2019 07:15:24 -0500
- Subject: Re: [RFC v3 03/23] sysdeps/wait: Use waitid if avaliable
- References: <cover.1563321715.git.alistair.francis@wdc.com> <2d3359a195633b85e01f83bf536330d72f7bc8aa.1563321715.git.alistair.francis@wdc.com> <20190721040310.GA3283@brightrain.aerifal.cx> <CAK8P3a0bhS_d1SMyLQO_LDNvdgR+MKXmoESxWkrcz22mvjKcuw@mail.gmail.com>
Arnd Bergmann <arnd@arndb.de> writes:
> On Sun, Jul 21, 2019 at 6:03 AM Rich Felker <dalias@libc.org> wrote:
>> On Tue, Jul 16, 2019 at 05:08:48PM -0700, Alistair Francis wrote:
>> > #ifdef __NR_waitpid
>> > return SYSCALL_CANCEL (waitpid, pid, stat_loc, options);
>> > +#elif defined(__NR_waitid)
>> > + __pid_t ret;
>> > + idtype_t idtype = P_PID;
>> > + siginfo_t infop;
>> > +
>> > + if (pid < -1) {
>> > + idtype = P_PGID;
>> > + pid *= -1;
>> > + } else if (pid == -1) {
>> > + idtype = P_ALL;
>> > + } else if (pid == 0) {
>> > + idtype = P_PGID;
>> > + pid = getpgrp();
>> > + }
>> > +
>> > + options |= WEXITED;
>> > +
>> > + ret = SYSCALL_CANCEL (waitid, idtype, pid, &infop, options, NULL);
>>
>> This emulation has a fundamental race condition. Between getpgrp and
>> waitid, a signal handler may perform setpgrp, setsid, and/or fork in
>> ways that cause the wrong pgid to be passed to the waitid syscall.
>> There is no way around this because you cannot block signals for the
>> interval, since signals must be able to interrupt the waitid syscall.
>
> Interesting, I don't think anyone ever considered this race, as waitid()
> was clearly always /intended/ as a superset of wait(), waitpid(), wait3()
> and wait4(), as Roland McGrath described in the initial creation
> of the waitid() syscall, see
> https://lore.kernel.org/lkml/200408152303.i7FN3Vc3021030@magilla.sf.frob.com/
>
It is definitley a problem as setpid and setpgrp and setsid are all
required to be signal safe, and this waitpid emulation is clearly not.
> I originally planned to have a new waitid_time64() syscall to replace
> them in the y2038 series, but that got deferred, so at the moment
> the older 32-bit architectures don't have any changes for wait*(), and
> riscv32 just removed wait4() but kept waitid().
>
> It would be easy enough to restore wait4() for riscv32, or we could
> decide to extend waitid() in a backward-compatible way to close
> this race, e.g. by allowing P_PGID with pid=0 (this is currently
> -EINVAL), by adding a new P_SAME_PGID macro to cover that
> case, or by creating a new waitid replacement that does other things
> as well (nanosecond ru_*time, pidfd, ...).
>
> Adding a few more kernel developers that may have an opinion on this.
In a different reply it was mentioned that wait4 has had an issue since
2008, where it does not track changes in the current processes pgrp.
I would like to get resolution on if that is a real problem for anyone
or if it is a something theoretical found on code review, before we deal
with this issue. Because tracking pgrp changes will require a deeper
code change.
My gut feel says either fix waitid (with P_SAME_PGID) or clearly document
why the kernel's waitid is not a replacement for wait4, and add wait4 to
riscv.
Would adding wait4 to riscv allow us to get rid of the rusage parameter
of the kernel's waitid?
I don't oppose a new syscall to take advantage of pidfd but I don't
think there is any need to tie the two fixes together so I would rather
keep them separate. Just so we don't rush through fixing one to deal
with the other.
Eric