[PATCH v8 0/7] Add pidfd and cgroupv2 support for process creation
Rich Felker
dalias@libc.org
Mon Aug 21 13:55:51 GMT 2023
On Mon, Aug 21, 2023 at 08:53:53AM +0200, Florian Weimer wrote:
> * Rich Felker:
>
> > On Fri, Aug 18, 2023 at 11:06:35AM -0300, Adhemerval Zanella via Libc-alpha wrote:
> >> The glibc 2.36 added wrappers for Linux syscall pidfd_open, pidfd_getfd,
> >> and pidfd_send_signal, and exported the P_PIDFD to use along with
> >> waitid. The pidfd is a race-free interface, however, the pidfd_open is
> >> subject to TOCTOU if the file descriptor is not obtained directly from
> >> the clone or clone3 syscall (there is still a small window between the
> >> clone return and the pidfd_getfd where the process can be reaped and the
> >> process ID reused).
> >
> > Unless I'm missing something, that window is purely programmer error.
> > The pid belongs to the parent process, that called fork, posix_spawn,
> > clone, or whatever, and is responsible for not freeing it until it's
> > done using it.
> >
> > Yes this can happen if you install a SIGCHLD handler that reaps
> > anything it sees, or if you're calling wait without a pid. This is
> > programming error. If you're stuck with code outside your control that
> > makes that mistake, you can already avoid it with clone by setting the
> > child exit signal to 0 rather than SIGCHLD. But it's best just not to
> > do that.
>
> I think clone3 with exit_signal set to 0 and CLONE_PIDFD allows the
> creation of subprocesses that are difficult to observe by accident from
> the rest of the process, while obtaining a stable identifier for the
> process. I do not think there is any other way to achieve that. I
> think it's desirable to expose this functionality in some way.

Indeed that seems like useful functionality to expose for cases where
you can't fix some bad code, but there are lots of issues with how
clone3 (and even clone) should behave with respect to the child
environment when you don't exec -- is it _Fork-like or fork-like? Can
you use AS-unsafe interfaces in the child of a MT parent? Etc. This
should all be discussed on libc-coord, not libc-alpha, IMO.
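
To make the exit-signal point concrete, here is a minimal sketch. It uses
glibc's existing clone() wrapper rather than clone3, since how a clone3
wrapper should behave is exactly the open question. Assumptions, none of
which come from the patch series: Linux 5.2 or later (for CLONE_PIDFD), a
downward-growing stack, a calling process with no other children, and the
hypothetical names child_main and run_hidden_child.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000  /* value from the Linux UAPI headers */
#endif

/* Hypothetical child body: its return value becomes the exit status. */
static int child_main(void *arg)
{
    return *(int *)arg;
}

/* Stack for the child. Without CLONE_VM the child runs on its own copy. */
static _Alignas(16) char child_stack[64 * 1024];

/* Hypothetical helper: returns the child's exit status, or -1 on error. */
static int run_hidden_child(int code)
{
    assert(code >= 0 && code <= 255);

    int pidfd = -1;
    /* The low byte of flags is the exit signal; leaving it 0 means the
       parent gets no SIGCHLD and plain wait() calls skip this child.
       Assumes the stack grows down, so the top of the buffer is passed. */
    pid_t pid = clone(child_main, child_stack + sizeof child_stack,
                      CLONE_PIDFD, &code, &pidfd);
    if (pid < 0)
        return -1;

    /* Demonstration only: a blanket wait (the pattern criticized in this
       thread) does not see the child, so it fails with ECHILD. */
    int status;
    pid_t any = waitpid(-1, &status, WNOHANG);
    int hidden = (any == -1 && errno == ECHILD);

    /* The owner reaps by pid. __WALL is needed because the exit signal
       is not SIGCHLD. */
    pid_t got;
    do {
        got = waitpid(pid, &status, __WALL);
    } while (got == -1 && errno == EINTR);
    close(pidfd);

    if (got != pid || !hidden || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Under those assumptions, run_hidden_child(7) returns 7. The child here
only returns a value, which sidesteps the question of what is safe to call
in the child of a multithreaded parent.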

Independent of that, though, I'd like to focus on the fact that
randomly reaping children with no regard to what part of your process
owned those pids has been wrong and broken practice for something like
4 decades. Just like looping over a range of fds and closing
everything is broken. It's a type of UAF bug and it's not a pattern we
suddenly need to support because systemd or whoever says so. It's a
bug that's easily detected with static analysis (any calls to wait or
to waitpid/waitid without the pid specified, or SIGCHLD handlers) and
usually easy to fix -- and then the fix works on every POSIX-ish
platform that's existed, pretty much ever, not just on GNU/Linux.
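
As an illustration of the portable fix, here is a minimal sketch in which
the code that creates a child is the only code that reaps it, and it does
so by pid. The helper names reap_owned_child and spawn_and_reap are
hypothetical, and the child simply exits with a given status instead of
calling exec.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap exactly one child that the caller created.
   Returns its exit status, or -1 if it did not exit normally. */
static int reap_owned_child(pid_t pid)
{
    int status;
    pid_t got;
    do {
        /* A specific pid, never -1, so pids owned by other parts of the
           process are left alone. */
        got = waitpid(pid, &status, 0);
    } while (got == -1 && errno == EINTR);

    if (got != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical helper: fork a child that exits with `code`, then reap it. */
static int spawn_and_reap(int code)
{
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);    /* child: async-signal-safe exit, no atexit handlers */

    /* The pid stays reserved until this call, so it cannot be reused
       while the owner still holds it. */
    return reap_owned_child(pid);
}
```

This uses only fork, waitpid, and _exit, so it is not tied to Linux.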

Rich