This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC v3 03/23] sysdeps/wait: Use waitid if avaliable


On Sun, Jul 21, 2019 at 12:03:10AM -0400, Rich Felker wrote:
> On Tue, Jul 16, 2019 at 05:08:48PM -0700, Alistair Francis wrote:
> > If the waitid syscall is avaliable let's use that as waitpid
> > and wait4 aren't always avaliable (they aren't avaliable on RV32).
> > 
> > Unfortunately waitid is substantially differnt to waitpid and wait4, so
> > the conversion ends up being complex.
> > 
> > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> > ---
> >  ChangeLog                                  |  3 ++
> >  sysdeps/unix/sysv/linux/wait.c             | 39 ++++++++++++++++--
> >  sysdeps/unix/sysv/linux/waitpid.c          | 46 ++++++++++++++++++++++
> >  sysdeps/unix/sysv/linux/waitpid_nocancel.c | 45 +++++++++++++++++++++
> >  4 files changed, 130 insertions(+), 3 deletions(-)
> > [...]
> >  
> >  weak_alias (__libc_wait, __wait)
> > diff --git a/sysdeps/unix/sysv/linux/waitpid.c b/sysdeps/unix/sysv/linux/waitpid.c
> > index f0897574c0..7d4e0bb77d 100644
> > --- a/sysdeps/unix/sysv/linux/waitpid.c
> > +++ b/sysdeps/unix/sysv/linux/waitpid.c
> > @@ -20,12 +20,58 @@
> >  #include <sysdep-cancel.h>
> >  #include <stdlib.h>
> >  #include <sys/wait.h>
> > +#include <unistd.h>
> >  
> >  __pid_t
> >  __waitpid (__pid_t pid, int *stat_loc, int options)
> >  {
> >  #ifdef __NR_waitpid
> >    return SYSCALL_CANCEL (waitpid, pid, stat_loc, options);
> > +#elif defined(__NR_waitid)
> > +  __pid_t ret;
> > +  idtype_t idtype = P_PID;
> > +  siginfo_t infop;
> > +
> > +  if (pid < -1) {
> > +    idtype = P_PGID;
> > +    pid *= -1;
> > +  } else if (pid == -1) {
> > +    idtype = P_ALL;
> > +  } else if (pid == 0) {
> > +    idtype = P_PGID;
> > +    pid = getpgrp();
> > +  }
> > +
> > +  options |= WEXITED;
> > +
> > +  ret = SYSCALL_CANCEL (waitid, idtype, pid, &infop, options, NULL);
> 
> This emulation has a fundamental race condition. Between getpgrp and
> waitid, a signal handler may perform setpgrp, setsid, and/or fork in
> ways that cause the wrong pgid to be passed to the waitid syscall.
> There is no way around this because you cannot block signals for the
> interval, since signals must be able to interrupt the waitid syscall.
> 
> Unless there's some trick I'm missing here, the kernel folks' removal
> of the wait4 syscall is just a bug in the kernel that they need to
> fix. It also makes it impossible to implement the wait4 function,
> since there's no way to get rusage for the exited process.

Reportedly (via Stefan O'Rear just now) there was a similar kernel bug
introduced in 161550d74c07303ffa6187ba776f62df5a906a21 that makes
wait4 fail to honor pgrp changes that happen while already in the
syscall (e.g. performed on the caller by another thread or even
another process). But the race condition here in userspace is even
more egregious I think, since it violates the contract in a case where
there is a clear observable order between the pgrp change and the
blocking wait -- for instance, the signal handler could change pgrp
of itself and a child process, and then whether or not the signal
handler had executed before the waitpid, the waitpid should catch the
child's exit. But with the above race, it fails to.

Rich


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]