[PATCH] manual: Add Descriptor-Relative Access section

Florian Weimer fw@deneb.enyo.de
Mon May 4 09:10:26 GMT 2020


* Michael Kerrisk:

>>>> +@end vtable
>>>> +
>>>> +There is no relationship between these flags and the type argument to
>>>
>>> Formatting for 'type'?
>> 
>> You mean, as in @var{type}?  I think this is frowned upon because it
>> would be a dangling meta-syntactic variable in this context.
>
> I don't quite understand your explanation, but yet I meant formatting
> as you suggest. Without it, it's not so obvious to the reader that 
> 'type' refers to an argument.

The context is this:

There is no relationship between these flags and the type argument to
the @code{getauxval} function (with @code{AT_@dots{}} constants defined
in @file{elf.h}).

I think it's reasonably clear?  I think I'm following the advice in
the Texinfo documentation:

| Use the @var command to indicate metasyntactic variables. A
| metasyntactic variable is something that stands for another piece of
| text. For example, you should use a metasyntactic variable in the
| documentation of a function to describe the arguments that are
| passed to that function.
|
| Do not use @var for the names of normal variables in computer
| programs. These are specific names, so @code is correct for them
| (@code). For example, the Emacs Lisp variable texinfo-tex-command is
| not a metasyntactic variable; it is properly formatted using @code.

“type” does not stand for any other text in the quoted sentence; it
refers to the single function argument.

>> +Many functions that accept file names have variants have
>
> Wording issue: have... have

Fixed.

>> +@code{@dots{}at} variants which accept a file descriptor and a file name
>> +argument instead of just a file name argument.  For example,
>> +@code{fstatat} is the descriptor-based variant of the @code{fstat}
>> +function.  Most such functions also accept an additional flags argument
>> +which changes the behavior of the file name lookup based on the
>> +@code{AT_@dots{}} flags specified.
>> +
>> +There are several reasons to use descriptor-relative access:
>> +
>> +@itemize @bullet
>> +@item
>> +The working directory is a process-wide resource, so individual threads
>> +cannot change it without affecting other threads in the process.
>> +Explicitly specifying the directory against which relative paths are
>> +resolved can be a thread-safe alternative to changing the working
>> +directory.
>> +
>> +@item
>> +If a progrem wishes to access a directory tree which is being modified
>> +concurrently, perhaps even by a different user on the system, the
>> +program must avoid looking up file names with multiple components, in
>> +order to detect symbolic links, using the @code{O_NOFOLLOW} flag
>> +(@pxref{Open-time Flags}) or the @code{AT_SYMLINK_FOLLOW} flag
>> +(described below).  Without directory-relative access, it is necessary
>> +to use the @code{fchdir} function to change the working directory
>> +(@pxref{Working Directory}), which is not thread-safe.
>
> I find the above a little hard to grok. I think it would be helpful
> to more explicitly point out that the problem is that the components
> in the dirname part of the pathname could be symlinks whose targets
> may change, which is a problem if the application will use the
> pathname in a series of syscalls.

Good point.  I tried to improve it somewhat:

@item
If a program wishes to access a directory tree which is being modified
concurrently, perhaps even by a different user on the system, the
program typically must avoid following symbolic links.  With POSIX
interfaces, this can be done using the @code{O_NOFOLLOW} flag
(@pxref{Open-time Flags}) or the @code{AT_SYMLINK_NOFOLLOW} flag
(described below), but these flags affect only the final component of a
file name (the basename).  Symbolic links in the parent directory part
are still followed.  Therefore, without directory-relative access, it is
necessary to use the @code{fchdir} function to change the working
directory (@pxref{Working Directory}) and use the basename for file
system access.  As explained before, this is not thread-safe.  Keeping a
file descriptor of the directory is also required to be able to return
to it later, so descriptor-based access is a natural fit.
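
As an illustration of the approach this item describes (not part of the
patch), here is a rough C sketch that walks a relative file name one
component at a time, holding a descriptor for the current directory
instead of calling fchdir.  The helper name open_beneath_nofollow is
hypothetical, and rejecting absolute names, ".." components and O_CREAT
is a simplifying assumption of the sketch, not something the manual
text requires:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open RELPATH below the directory DIRFD without
   following a symbolic link in any component.  Returns a new
   descriptor, or -1 with errno set.  O_CREAT is not supported because
   the sketch has no mode argument.  */
static int
open_beneath_nofollow (int dirfd, const char *relpath, int flags)
{
  char buf[4096];

  assert (relpath != NULL);
  if (relpath[0] == '/' || strlen (relpath) >= sizeof buf
      || (flags & O_CREAT) != 0)
    {
      errno = EINVAL;
      return -1;
    }
  strcpy (buf, relpath);

  /* Work on a private copy so the caller's descriptor stays open.  */
  int cur = fcntl (dirfd, F_DUPFD_CLOEXEC, 0);
  if (cur < 0)
    return -1;

  char *save;
  char *comp = strtok_r (buf, "/", &save);
  while (comp != NULL)
    {
      char *next = strtok_r (NULL, "/", &save);

      /* ".." could step outside the tree; refuse it in this sketch.  */
      if (strcmp (comp, "..") == 0)
        {
          close (cur);
          errno = EINVAL;
          return -1;
        }

      /* Each lookup names a single component, so O_NOFOLLOW (which
         only covers the final component) covers all of them.  */
      int fl = (next != NULL
                ? O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC
                : flags | O_NOFOLLOW | O_CLOEXEC);
      int fd = openat (cur, comp, fl);
      int saved_errno = errno;
      close (cur);
      if (fd < 0)
        {
          errno = saved_errno;
          return -1;
        }
      cur = fd;
      comp = next;
    }
  return cur;
}
```

A caller would obtain the starting descriptor with open and O_DIRECTORY
and then call, say, open_beneath_nofollow (root, "a/b/file", O_RDONLY).
The call fails if "a" or "b" is a symbolic link, and the working
directory is never touched, so other threads are unaffected.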

I've also switched to “effective basename”, from “effective final file
name component”, further below.
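
To show what the effective basename means in practice, here is a small
C sketch (again not part of the patch).  It assumes a hypothetical
helper is_symlink_at built on fstatat with AT_SYMLINK_NOFOLLOW.  The
flag only stops the lookup from following a link at the final
component; a link earlier in the file name is still followed:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if NAME, looked up relative to DIRFD,
   is itself a symbolic link, 0 if it is something else, and -1 if the
   lookup fails (errno is set by fstatat).  */
static int
is_symlink_at (int dirfd, const char *name)
{
  struct stat st;

  assert (name != NULL);
  /* AT_SYMLINK_NOFOLLOW applies to the effective basename only.  */
  if (fstatat (dirfd, name, &st, AT_SYMLINK_NOFOLLOW) != 0)
    return -1;
  return S_ISLNK (st.st_mode) ? 1 : 0;
}
```

If "dirlink" is a symbolic link to a directory containing a regular
file "file", then is_symlink_at (AT_FDCWD, "dirlink") reports a link,
while is_symlink_at (AT_FDCWD, "dirlink/file") does not, because the
link sits in the parent directory part and is resolved as usual.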

8<------------------------------------------------------------------8<
And document the functions openat, openat64, fstatat, fstatat64.
(The safety assessment for fstatat was already obsolete because
current glibc assumes kernel support for the underlying system call.)

-----
 manual/filesys.texi | 187 +++++++++++++++++++++++++++++++++++++++++++++++++---
 manual/llio.texi    |  28 ++++++++
 manual/startup.texi |   7 +-
 3 files changed, 210 insertions(+), 12 deletions(-)

diff --git a/manual/filesys.texi b/manual/filesys.texi
index 73e630842e..ad11dc2b26 100644
--- a/manual/filesys.texi
+++ b/manual/filesys.texi
@@ -15,6 +15,7 @@ access permissions and modification times.
 @menu
 * Working Directory::           This is used to resolve relative
 				 file names.
+* Descriptor-Relative Access::  Ways to control file name lookup.
 * Accessing Directories::       Finding out what files a directory
 				 contains.
 * Working with Directory Trees:: Apply actions to all files or a selectable
@@ -206,6 +207,148 @@ An I/O error occurred.
 @end table
 @end deftypefun
 
+@node Descriptor-Relative Access
+@section Descriptor-Relative Access
+@cindex file name lookup based on descriptors
+@cindex pathname resolution based on descriptors
+@cindex descriptor-based file name resolution
+@cindex @code{@dots{}at} functions
+
+Many functions that accept file names have @code{@dots{}at} variants
+which accept a file descriptor and a file name argument instead of just
+a file name argument.  For example, @code{fstatat} is the
+descriptor-based variant of the @code{fstat} function.  Most such
+functions also accept an additional flags argument which changes the
+behavior of the file name lookup based on the @code{AT_@dots{}} flags
+specified.
+
+There are several reasons to use descriptor-relative access:
+
+@itemize @bullet
+@item
+The working directory is a process-wide resource, so individual threads
+cannot change it without affecting other threads in the process.
+Explicitly specifying the directory against which relative paths are
+resolved can be a thread-safe alternative to changing the working
+directory.
+
+@item
+If a program wishes to access a directory tree which is being modified
+concurrently, perhaps even by a different user on the system, the
+program typically must avoid following symbolic links.  With POSIX
+interfaces, this can be done using the @code{O_NOFOLLOW} flag
+(@pxref{Open-time Flags}) or the @code{AT_SYMLINK_NOFOLLOW} flag
+(described below), but these flags affect only the final component of a
+file name (the basename).  Symbolic links in the parent directory part
+are still followed.  Therefore, without directory-relative access, it is
+necessary to use the @code{fchdir} function to change the working
+directory (@pxref{Working Directory}) and use the basename for file
+system access.  As explained before, this is not thread-safe.  Keeping a
+file descriptor of the directory is also required to be able to return
+to it later, so descriptor-based access is a natural fit.
+
+@item
+Listing directory contents using the @code{readdir} or @code{readdir64}
+functions (@pxref{Reading/Closing Directory}) does not provide full file
+name paths.  Using @code{@dots{}at} functions, it is possible to use the
+file names directly, without having to construct such full paths.
+
+@item
+Additional flags available with some of the @code{@dots{}at} functions
+provide access to functionality which is not available otherwise.
+@end itemize
+
+The file descriptor argument of these @code{@dots{}at} functions can be
+one of the following:
+
+@itemize @bullet
+@item
+It can be a file descriptor referring to a directory.  Such a descriptor
+can be created explicitly using the @code{open} function, with or
+without the @code{O_DIRECTORY} flag.  @xref{Opening and Closing Files}.
+Or it can be created implicitly by @code{opendir} and retrieved using
+the @code{dirfd} function.  @xref{Opening a Directory}.
+
+If a directory descriptor is used with one of the @code{@dots{}at}
+functions, a relative file name argument is resolved relative to the
+directory referred to by the directory descriptor, just as if that
+directory were the current working directory.  Absolute file name
+arguments (starting with @samp{/}) are resolved against the file system
+root, and the descriptor argument is effectively ignored for the
+purposes of file name lookup.
+
+This means that file name lookup is not constrained to the directory of
+the descriptor.  For example, it is possible to access a file
+@file{example} in the parent directory using a file name argument
+@code{"../example"}, or in the root directory using @code{"/example"}.
+
+@item
+@vindex @code{AT_FDCWD}
+The special value @code{AT_FDCWD}.  This means that the current working
+directory is used for the lookup if the file name is relative.  For
+@code{@dots{}at} functions with an @code{AT_@dots{}} flags argument,
+this provides a shortcut to use those flags with regular (not
+descriptor-based) file name lookups.
+
+@item
+An arbitrary file descriptor, along with an empty string @code{""} as
+the file name argument, and the @code{AT_EMPTY_PATH} flag.  In this case,
+the operation uses the file descriptor directly, without further
+file name resolution.  On Linux, this allows operations on descriptors
+opened with the @code{O_PATH} flag.  For regular descriptors (without
+@code{O_PATH}), the same functionality is also available through the
+plain descriptor-based functions (for example, @code{fstat} instead of
+@code{fstatat}).
+
+This is a GNU extension.
+@end itemize
+
+@cindex file name resolution flags
+@cindex @code{AT_*} file name resolution flags
+The flags argument in @code{@dots{}at} functions can be a combination of
+the following flags, defined in @file{fcntl.h}.  Not all such functions
+support all flags.  Some of the functions (such as @code{openat})
+completely lack an argument for the @code{AT_*} flags.
+
+In the flag descriptions below, the @dfn{effective basename} refers to
+the final component (basename) of the full path constructed from the
+descriptor and file name arguments, using file name lookup, as described
+above.
+
+@vtable @code
+@item AT_EMPTY_PATH
+This flag is used with an empty file name @code{""} and a descriptor
+which does not necessarily refer to a directory.  It is most useful with
+@code{O_PATH} descriptors, as described above.  This flag is a GNU
+extension.
+
+@item AT_NO_AUTOMOUNT
+If the effective basename refers to a potential file system mount point
+controlled by an auto-mounting service, the operation does not trigger
+auto-mounting and refers to the unmounted mount point instead.
+@xref{Mount-Unmount-Remount}.  If a file system has already been mounted
+at the effective basename, the operation applies to the mounted file
+system, not the underlying file system that was mounted over.  This flag
+is a GNU extension.
+
+@item AT_SYMLINK_FOLLOW
+If the effective basename is a symbolic link, the operation follows the
+symbolic link and operates on its target.  (For most functions, this is
+the default behavior.)
+
+@item AT_SYMLINK_NOFOLLOW
+If the effective basename is a symbolic link, the operation operates on
+the symbolic link, without following it.  The difference in behavior
+enabled by this flag is similar to the difference between the
+@code{lstat} and @code{stat} functions, or the behavior activated by the
+@code{O_NOFOLLOW} flag (@pxref{Open-time Flags}).  Even with the
+@code{AT_SYMLINK_NOFOLLOW} flag present, symbolic links in a non-final
+component of the file name are still followed.
+@end vtable
+
+There is no relationship between these flags and the type argument to
+the @code{getauxval} function (with @code{AT_@dots{}} constants defined
+in @file{elf.h}).
 
 @node Accessing Directories
 @section Accessing Directories
@@ -1250,10 +1393,11 @@ A hardware error occurred while trying to read or write the to filesystem.
 
 The @code{linkat} function is analogous to the @code{link} function,
 except that it identifies its source and target using a combination of a
-file descriptor (referring to a directory) and a pathname.  If a
-pathnames is not absolute, it is resolved relative to the corresponding
-file descriptor.  The special file descriptor @code{AT_FDCWD} denotes
-the current directory.
+file descriptor (referring to a directory) and a file name.
+@xref{Descriptor-Relative Access}.  For @code{linkat}, if a file name is
+not absolute, it is resolved relative to the corresponding file
+descriptor.  As usual, the special value @code{AT_FDCWD} denotes the
+current directory.
 
 The @var{flags} argument is a combination of the following flags:
 
@@ -2091,9 +2235,36 @@ function is available under the name @code{fstat} and so transparently
 replaces the interface for small files on 32-bit machines.
 @end deftypefun
 
-@c fstatat will call alloca and snprintf if the syscall is not
-@c available.
-@c @safety{@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
+@deftypefun int fstatat (int @var{filedes}, const char *@var{filename}, struct stat *@var{buf}, int @var{flags})
+@standards{POSIX.1, sys/stat.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is a descriptor-relative version of the @code{fstat}
+function above.  @xref{Descriptor-Relative Access}.  The @var{flags}
+argument can contain a combination of the flags @code{AT_EMPTY_PATH},
+@code{AT_NO_AUTOMOUNT}, and @code{AT_SYMLINK_NOFOLLOW}.
+
+Compared to @code{fstat}, the following additional error conditions can
+occur:
+
+@table @code
+@item EBADF
+The @var{filedes} argument is not a valid file descriptor.
+
+@item EINVAL
+The @var{flags} argument is not valid.
+
+@item ENOTDIR
+The descriptor @var{filedes} is not associated with a directory, and
+@var{filename} is a relative file name.
+@end table
+@end deftypefun
+
+@deftypefun int fstatat64 (int @var{filedes}, const char *@var{filename}, struct stat64 *@var{buf}, int @var{flags})
+@standards{GNU, sys/stat.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+This function is the large-file variant of @code{fstatat}, similar to
+how @code{fstat64} is the large-file variant of @code{fstat}.
+@end deftypefun
 
 @deftypefun int lstat (const char *@var{filename}, struct stat *@var{buf})
 @standards{BSD, sys/stat.h}
@@ -3582,12 +3753,10 @@ The @code{mkdtemp} function comes from OpenBSD.
 @c fchmodat
 @c fchownat
 @c futimesat
-@c fstatat (there's a commented-out safety assessment for this one)
 @c statx
 @c mkdirat
 @c mkfifoat
 @c name_to_handle_at
-@c openat
 @c open_by_handle_at
 @c readlinkat
 @c renameat
diff --git a/manual/llio.texi b/manual/llio.texi
index 6db4a70836..afbeca881e 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -180,6 +180,34 @@ new, extended API using 64 bit file sizes and offsets transparently
 replaces the old API.
 @end deftypefun
 
+@deftypefun int openat (int @var{filedes}, const char *@var{filename}, int @var{flags}[, mode_t @var{mode}])
+@standards{POSIX.1, fcntl.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{@acsfd{}}}
+This function is the descriptor-relative variant of the @code{open}
+function.  @xref{Descriptor-Relative Access}.
+
+Note that the @var{flags} argument of @code{openat} does not accept
+@code{AT_@dots{}} flags, only the flags described for the @code{open}
+function above.
+
+The @code{openat} function can fail for additional reasons:
+
+@table @code
+@item EBADF
+The @var{filedes} argument is not a valid file descriptor.
+
+@item ENOTDIR
+The descriptor @var{filedes} is not associated with a directory, and
+@var{filename} is a relative file name.
+@end table
+@end deftypefun
+
+@deftypefun int openat64 (int @var{filedes}, const char *@var{filename}, int @var{flags}[, mode_t @var{mode}])
+@standards{GNU, fcntl.h}
+This function is the large-file variant of @code{openat}, similar to how
+@code{open64} is the large-file variant of @code{open}.
+@end deftypefun
+
 @deftypefn {Obsolete function} int creat (const char *@var{filename}, mode_t @var{mode})
 @standards{POSIX.1, fcntl.h}
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{@acsfd{}}}
diff --git a/manual/startup.texi b/manual/startup.texi
index 21c48cd037..52126ef59e 100644
--- a/manual/startup.texi
+++ b/manual/startup.texi
@@ -664,9 +664,10 @@ basis there may be information that is not available any other way.
 @c Reads from hwcap or iterates over constant auxv.
 This function is used to inquire about the entries in the auxiliary
 vector.  The @var{type} argument should be one of the @samp{AT_} symbols
-defined in @file{elf.h}.  If a matching entry is found, the value is
-returned; if the entry is not found, zero is returned and @code{errno} is
-set to @code{ENOENT}.
+defined in @file{elf.h}.  (There is no relationship between these types
+and the file name lookup flags in @file{fcntl.h}.)  If a matching entry
+is found, the value is returned; if the entry is not found, zero is
+returned and @code{errno} is set to @code{ENOENT}.
 @end deftypefun
 
 For some platforms, the key @code{AT_HWCAP} is the easiest way to inquire

