[PATCH v3 2/4] Linux: Add close_range

Florian Weimer fweimer@redhat.com
Tue Mar 9 09:47:59 GMT 2021


* Adhemerval Zanella via Libc-alpha:

> diff --git a/manual/llio.texi b/manual/llio.texi
> index c0a53e1a6e..ceb18ac89a 100644
> --- a/manual/llio.texi
> +++ b/manual/llio.texi
> @@ -284,6 +284,44 @@ of trying to close its underlying file descriptor with @code{close}.
>  This flushes any buffered output and updates the stream object to
>  indicate that it is closed.
>  
> +@deftypefun int close_range (unsigned int @var{lowfd}, unsigned int @var{maxfd}, int @var{flags})
> +@standards{Linux, unistd.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{@acsfd{}}}
> +@c This is a syscall for Linux v5.9.  There is no fallback emulation for
> +@c older kernels.
> +
> +The function @code{clode_range} closes the file descriptor from @var{lowfd}
> +to @var{maxfd} (inclusive).  This function is similar to call @code{close} in
> +specified file descriptor range depending on the @var{flags}.
> +
> +The @var{flags} add options on how the files are closes.  Linux currently
> +supports:.

Spurios “.” at end of line.

> +@vtable @code
> +@item CLOSE_RANGE_UNSHARE
> +Unshare the file descriptor table before closing file descriptors.
> +@end vtable
> +
> +The normal return value from @code{close_range} is @math{0}; a value
> +of @math{-1} is returned in case of failure.  The following @code{errno} error
> +conditions are defined for this function:
> +
> +@table @code
> +@item EINVAL
> +The @var{lowfd} value is larger than @var{maxfd} or an unsupported @var{flags}
> +is used.
> +
> +@item ENOMEM
> +Either there is not enough memory for the operation, or the process is
> +out of address space.
> +
> +@item EMFILE
> +The process has too many files open.
> +The maximum number of file descriptors is controlled by the
> +@end table
> +@end deftypefun

The ENOMEM and EMFILE descriptions are not really clear in this context.
Are these failures only possible with CLOSE_RANGE_UNSHARE?

It may make sense to mention the lack of emulation (so that callers have
to be aware of ENOSYS for the time being) and that the function is
specific to Linux.

Based on the description, it is not clear what happens if the range
contains closed descriptors.  Maybe mention that already-closed
descriptors are simply skipped?

> index c35f783e2a..e2a6fa763f 100644
> --- a/sysdeps/unix/sysv/linux/Versions
> +++ b/sysdeps/unix/sysv/linux/Versions
> @@ -169,6 +169,9 @@ libc {
>    }
>    GLIBC_2.32 {
>    }
> +  GLIBC_2.33 {
> +    close_range;
> +  }

Needs to be GLIBC_2.34 now.  Also the copyright year needs adjusting.

> diff --git a/sysdeps/unix/sysv/linux/bits/unistd_ext.h b/sysdeps/unix/sysv/linux/bits/unistd_ext.h
> index c315cc5cb8..799c59512c 100644
> --- a/sysdeps/unix/sysv/linux/bits/unistd_ext.h
> +++ b/sysdeps/unix/sysv/linux/bits/unistd_ext.h
> @@ -33,4 +33,15 @@
>     not detached and has not been joined.  */
>  extern __pid_t gettid (void) __THROW;
>  
> +/* Unshare the file descriptor table before closing file descriptors.  */
> +#define CLOSE_RANGE_UNSHARE     (1U << 1)

Needs to be indented.

Please include <linux/close_range.h> if available.  It's a separate
header so that we can use it.

I think if you do that, the consistency check won't add much value
because it can't really test anything.

> +/* Close all file descriptors in the range FD up to MAX_FD.  The flag FLAGS
> +   are define by the CLOSE_RANGE prefix.  This function behaves like close
> +   on the range, but in a fail-safe where it will either fail and not close
> +   any file descriptor or close all of them.  Returns 0 on successor or -1
> +   for failure (and sets errno accordingly).  */
> +extern int close_range (unsigned int __fd, unsigned int __max_fd,
> +			int __flags) __THROW;
> +
>  #endif

Maybe add /* __USE_GNU */ to the #endif, given that it's now at a
distance from the #ifdef.

> diff --git a/sysdeps/unix/sysv/linux/tst-close_range.c b/sysdeps/unix/sysv/linux/tst-close_range.c
> new file mode 100644
> index 0000000000..131cf27c72
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/tst-close_range.c
> @@ -0,0 +1,222 @@

> +#define NFDS 100
> +
> +static int
> +open_multiple_temp_files (void)
> +{
> +  /* Check if the temporary file descriptor has no no gaps.  */
> +  int lowfd = xopen ("/dev/null", O_RDONLY, 0600);
> +  for (int i = 1; i <= NFDS; i++)
> +    TEST_COMPARE (xopen ("/dev/null", O_RDONLY, 0600),
> +		  lowfd + i);
> +  return lowfd;
> +}

Okay, this ensures that there are no gaps.

> +
> +static void
> +close_range_test_common (int lowfd, unsigned int flags)
> +{
> +  const int maximum_fd = lowfd + NFDS;
> +  const int half_fd = maximum_fd / 2;
> +  const int gap_1 = maximum_fd - 8;

I think half_fd should be should be lowfd + NFDS / 2.

> +_Noreturn static int
> +close_range_test_fn (void *arg)
> +{
> +  int lowfd = (int) ((uintptr_t) arg);
> +  close_range_test_common (lowfd, 0);
> +  exit (EXIT_SUCCESS);
> +}
> +
> +/* Check if a clone_range on a subprocess created with CLONE_FILES close
> +   the shared file descriptor table entries in the parent.  */
> +static void
> +close_range_test_subprocess (void)
> +{
> +  struct support_descriptors *descrs = support_descriptors_list ();
> +
> +  /* Check if the temporary file descriptor has no no gaps.  */
> +  int lowfd = open_multiple_temp_files ();
> +
> +  enum { stack_size = 4096 };
> +  DEFINE_STACK (stack, stack_size);
> +  pid_t pid = xclone (close_range_test_fn, (void*) (uintptr_t) lowfd, stack,
> +		      stack_size, CLONE_FILES | SIGCHLD);

stack_size is too small, I think.  Future error checking (possible with
clone3) would lead to failures.

> +static void
> +close_range_unshare_test (void)
> +{
> +  struct support_descriptors *descrs1 = support_descriptors_list ();
> +
> +  /* Check if the temporary file descriptor has no no gaps.  */
> +  int lowfd = open_multiple_temp_files ();
> +
> +  struct support_descriptors *descrs2 = support_descriptors_list ();
> +
> +  enum { stack_size = 4096 };
> +  DEFINE_STACK (stack, stack_size);
> +  pid_t pid = xclone (close_range_unshare_test_fn, (void*) (uintptr_t) lowfd,
> +		      stack, stack_size, CLONE_FILES | SIGCHLD);

Likewise.

Thanks,
Florian



More information about the Libc-alpha mailing list