This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC v3 03/23] sysdeps/wait: Use waitid if avaliable


Rich Felker <dalias@libc.org> writes:

> On Sun, Jul 21, 2019 at 12:03:10AM -0400, Rich Felker wrote:
>> On Tue, Jul 16, 2019 at 05:08:48PM -0700, Alistair Francis wrote:
>> > If the waitid syscall is avaliable let's use that as waitpid
>> > and wait4 aren't always avaliable (they aren't avaliable on RV32).
>> > 
>> > Unfortunately waitid is substantially differnt to waitpid and wait4, so
>> > the conversion ends up being complex.
>> > 
>> > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
>> > ---
>> >  ChangeLog                                  |  3 ++
>> >  sysdeps/unix/sysv/linux/wait.c             | 39 ++++++++++++++++--
>> >  sysdeps/unix/sysv/linux/waitpid.c          | 46 ++++++++++++++++++++++
>> >  sysdeps/unix/sysv/linux/waitpid_nocancel.c | 45 +++++++++++++++++++++
>> >  4 files changed, 130 insertions(+), 3 deletions(-)
>> > [...]
>> >  
>> >  weak_alias (__libc_wait, __wait)
>> > diff --git a/sysdeps/unix/sysv/linux/waitpid.c b/sysdeps/unix/sysv/linux/waitpid.c
>> > index f0897574c0..7d4e0bb77d 100644
>> > --- a/sysdeps/unix/sysv/linux/waitpid.c
>> > +++ b/sysdeps/unix/sysv/linux/waitpid.c
>> > @@ -20,12 +20,58 @@
>> >  #include <sysdep-cancel.h>
>> >  #include <stdlib.h>
>> >  #include <sys/wait.h>
>> > +#include <unistd.h>
>> >  
>> >  __pid_t
>> >  __waitpid (__pid_t pid, int *stat_loc, int options)
>> >  {
>> >  #ifdef __NR_waitpid
>> >    return SYSCALL_CANCEL (waitpid, pid, stat_loc, options);
>> > +#elif defined(__NR_waitid)
>> > +  __pid_t ret;
>> > +  idtype_t idtype = P_PID;
>> > +  siginfo_t infop;
>> > +
>> > +  if (pid < -1) {
>> > +    idtype = P_PGID;
>> > +    pid *= -1;
>> > +  } else if (pid == -1) {
>> > +    idtype = P_ALL;
>> > +  } else if (pid == 0) {
>> > +    idtype = P_PGID;
>> > +    pid = getpgrp();
>> > +  }
>> > +
>> > +  options |= WEXITED;
>> > +
>> > +  ret = SYSCALL_CANCEL (waitid, idtype, pid, &infop, options, NULL);
>> 
>> This emulation has a fundamental race condition. Between getpgrp and
>> waitid, a signal handler may perform setpgrp, setsid, and/or fork in
>> ways that cause the wrong pgid to be passed to the waitid syscall.
>> There is no way around this because you cannot block signals for the
>> interval, since signals must be able to interrupt the waitid syscall.
>> 
>> Unless there's some trick I'm missing here, the kernel folks' removal
>> of the wait4 syscall is just a bug in the kernel that they need to
>> fix. It also makes it impossible to implement the wait4 function,
>> since there's no way to get rusage for the exited process.
>
> Reportedly (via Stefan O'Rear just now) there was a similar kernel bug
> introduced in 161550d74c07303ffa6187ba776f62df5a906a21 that makes
> wait4 fail to honor pgrp changes that happen while already in the
> syscall (e.g. performed on the caller by another thread or even
> another process).

I could not find the report from Stefan O'Rear.

Does that result in actual problems for programs or is this a
theoretical race noticed upon code review?

> But the race condition here in userspace is even
> more egregious I think, since it violates the contract in a case where
> there is a clear observable order between the pgrp change and the
> blocking wait -- for instance, the signal handler could change pgrp
> of itself and a child process, and then whether or not the signal
> handler had executed before the waitpid, the waitpid should catch the
> child's exit. But with the above race, it fails to.

Definitely a bigger issue.  I believe posix requires waitpid is required
to be signal safe.

Eric


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]