[PATCH v7 5/8] posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)

Adhemerval Zanella Netto adhemerval.zanella@linaro.org
Fri Aug 11 16:14:53 GMT 2023



On 11/08/23 08:45, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> diff --git a/NEWS b/NEWS
>> index 99824eab95..ff41443896 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -15,6 +15,13 @@ Major new features:
>>    set the cgroupv2 in the new process in a race free manner.  These functions
>>    are GNU extensions and require a kernel with clone3 support.
>>  
>> +* On Linux, the pidfd_spawn and pidfd_spawp functions have been added.
>> +  They have similar prototype and semantic as posix_spawn, but instead of
>> +  returning a process ID, they return a file descriptor that can be used
>> +  along other pidfd functions (like pidfd_send_signal, poll, or waitid).
>> +  The pidfd functionality avoid the issue of PID reuse with traditional
>> +  posix_spawn interface.
> 
> “avoid[s]”

Ack.

> 
>> diff --git a/posix/tst-posix_spawn-setsid.c b/posix/tst-posix_spawn-setsid.c
>> index 124d878ce2..751674165c 100644
>> --- a/posix/tst-posix_spawn-setsid.c
>> +++ b/posix/tst-posix_spawn-setsid.c
>> @@ -18,78 +18,158 @@
> 
>> +/* Called on process re-execution, write down the session id on PIDFILE.  */
>> +_Noreturn static void
>> +handle_restart (const char *pidfile)
>> +{
>> +  int pidfd = xopen (pidfile, O_WRONLY, 0);
>> +
>> +  char buf[INT_STRLEN_BOUND (pid_t)];
>> +  int s = snprintf (buf, sizeof buf, "%d", getsid (0));
>> +  size_t n = write (pidfd, buf, s);
>> +  TEST_VERIFY (n == s);
>> +
>> +  xclose (pidfd);
>> +
>> +  exit (EXIT_SUCCESS);
>> +}
> 

Indeed, I removed the exit.

> I suspect this has an issue with hiding test failures (mapping not
> shared after execve).
>> diff --git a/posix/tst-spawn3.c b/posix/tst-spawn3.c
>> index e7ce0fb386..bd21ac6c4b 100644
>> --- a/posix/tst-spawn3.c
>> +++ b/posix/tst-spawn3.c
>> @@ -16,6 +16,7 @@
> 
>> +  char buf[INT_STRLEN_BOUND (pid_t)];
> 
> This should be INT_BUFSIZE_BOUND.

Ack.

> 
>> diff --git a/posix/tst-spawn6.c b/posix/tst-spawn6.c
>> index 4e29d78168..ff36351cd6 100644
>> --- a/posix/tst-spawn6.c
>> +++ b/posix/tst-spawn6.c
> 
>> @@ -202,7 +201,7 @@ do_test (int argc, char *argv[])
>>    if (restart)
>>      return handle_restart (argv[1], argv[2]);
>>  
>> -  pid_t pid = xfork ();
>> +  PID_T_TYPE pid = xfork ();
>>    if (pid == 0)
>>      {
>>        /* Create a pseudo-terminal to avoid interfering with the one using by
> 
> I think the result of xfork remains pid_t, so that switch seems wrong?
> 

Indeed, I reverted that.

> 
>> diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
>> index 6d8a67039e..bd96ad12ad 100644
>> --- a/sysdeps/unix/sysv/linux/Versions
>> +++ b/sysdeps/unix/sysv/linux/Versions
>> @@ -324,6 +324,8 @@ libc {
>>    GLIBC_2.39 {
>>      posix_spawnattr_getcgroup_np;
>>      posix_spawnattr_setcgroup_np;
>> +    pidfd_spawn;
>> +    pidfd_spawnp;
>>    }
> 
> I'd prefer to maintain lexicographic order.

Ack.

> 
>> diff --git a/sysdeps/unix/sysv/linux/bits/spawn_ext.h b/sysdeps/unix/sysv/linux/bits/spawn_ext.h
>> index 3bc10ab477..ff8550f264 100644
>> --- a/sysdeps/unix/sysv/linux/bits/spawn_ext.h
>> +++ b/sysdeps/unix/sysv/linux/bits/spawn_ext.h
>> @@ -37,4 +37,35 @@ extern int posix_spawnattr_setcgroup_np
>>  (posix_spawnattr_t
>> +/* Spawn a new process executing PATH with the attributes describes in *ATTRP.
>> +   Before running the process perform the actions described in FACTS.  Return
>> +   a PID file descriptor in PIDFD if process creation was successful and the
>> +   argument is non-null.
>> +
>> +   This function is a possible cancellation point and therefore not
>> +   marked with __THROW. */
> 
> Missing space after .

Ack.

> 
>> +extern int pidfd_spawn (int *__restrict __pidfd,
>> +			const char *__restrict __path,
>> +			const posix_spawn_file_actions_t *__restrict __facts,
>> +			const posix_spawnattr_t *__restrict __attrp,
>> +			char *const __argv[__restrict_arr],
>> +			char *const __envp[__restrict_arr])
>> +    __nonnull ((2, 5));
>> +
>> +/* Similar to `pidfd_spawn' but search for FILE in the PATH.
>> +
>> +   This function is a possible cancellation point and therefore not
>> +   marked with __THROW. */
> 
> Missing space after .

Ack.

> 
>> +extern int pidfd_spawnp (int *__restrict __pidfd,
>> +			 const char *__restrict __file,
>> +			 const posix_spawn_file_actions_t *__restrict __facts,
>> +			 const posix_spawnattr_t *__restrict __attrp,
>> +			 char *const __argv[__restrict_arr],
>> +			 char *const __envp[__restrict_arr])
>> +    __nonnull ((2, 5));
> 
> I think we should mark PIDFD as nonnull.  If the caller ignores the
> descriptor, it just leaks.  In that case, the caller should use the
> non-descriptor variant of posix_spawn.

Sounds reasonable.  Another options would to just close the file descriptor
if clone is successful and pidfd is NULL.

> 
>> diff --git a/sysdeps/unix/sysv/linux/clone-pidfd-support.c b/sysdeps/unix/sysv/linux/clone-pidfd-support.c
>> new file mode 100644
>> index 0000000000..af2d213cc5
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/clone-pidfd-support.c
> 
>> +bool
>> +__clone_pidfd_supported (void)
>> +{
>> +  static int supported = 0;
> 
> I suggest to make this a file-level static with a non-colliding name,
> this way it's easier to print its value with a debugger.

Ack.

> 
>> +  int state = atomic_load_relaxed (&supported);
>> +  if (state == 0)
>> +    {
> 
>> +    }
>> +
>> +  return state == 1;
> 
> “return state > 0;” is probably more efficient.
> 

Ack.

>> index f0d4c62ae6..844abf1b0b 100644
>> --- a/sysdeps/unix/sysv/linux/spawni.c
>> +++ b/sysdeps/unix/sysv/linux/spawni.c
> 
>> @@ -319,6 +320,15 @@ __spawnix (pid_t * pid, const char *file,
>>    struct posix_spawn_args args;
>>    int ec;
>>  
>> +  bool use_pidfd = xflags & SPAWN_XFLAGS_RET_PIDFD;
>> +
>> +  /* For CLONE_PIDFD, older kernels might not fail with unsupported flags or
>> +     some versions might not support waitid (P_PIDFD).  So to avoid the need
>> +     to handle the error on the helper process, check for full pidfd
>> +     support.  */
>> +  if (use_pidfd && !__clone_pidfd_supported ())
>> +    return ENOSYS;
> 
> Why not EOPNOTSUPP?  I think ENOSYS can be justified because the pidfd
> functions are a separate family of functions, and not a sub-operation
> that is failing,  Maybe add this to the comment?

I think ENOSYS fits better here, since the ideia is that if pidfd waitid
is not supported there posix_spawn can not be used regardless of its
arguments.  This is different than posix_spawn with cgroupv2, that
depends where the kernel supports clone with CLONE_INTO_CGROUP.

I have added a comment about it.


More information about the Libc-alpha mailing list