Update mmap() flags and errors lists

DJ Delorie dj@redhat.com
Wed Jun 5 04:10:48 GMT 2024


v2 will follow...

Florian Weimer <fweimer@redhat.com> writes:
> While you are adding this, please avoid starting a sentence with @var,
> so something like:
>
>   [The] @var{flags} [parameter] contains …

Fixed.

>> -They include:
>> +Note that, aside from @code{MAP_PRIVATE} and @code{MAP_SHARED}, not
>> +all flags are supported on all versions of all operating systems.
>> +Consult the kernel-specific documenation for details.  The flags
>> +include:
>
> typo: documen[t]ation

Fixed.

>> +@item MAP_SHARED_VALIDATE
>> +Similar to @code{MAP_SHARED} except that additional flags will be
>> +validated by the kernel, and the call will fail if an unrecognized
>> +flag is provided.  With @code{MAP_SHARED} using a flag on a kernel
>> +that doesn't support it causes the flag to be ignored.
>> +@code{MAP_SHARED_VALIDATE} should be used when the behavior of all
>> +flags is required.
>
> This leads to the question what to do if you want this checking behavior
> with MAP_PRIVATE instead of MAP_SHARED.

I didn't write the spec ;-)

>>  @item MAP_FIXED
>>  This forces the system to use the exact mapping address specified in
>> -@var{address} and fail if it can't.
>> +@var{address} and fail if it can't.  Note that if the new mapping
>> +would overlap an existing mapping, the existing map is unmapped.
>
> This is misleading, I believe.  The overlapping part is replaced with
> the new mapping.  If the overlap is incomplete, part of the previous
> mapping remains.

Reworded.
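
For reference, the partial-replacement behavior Florian describes can be
checked with a small test.  This is only a sketch, assuming Linux and
anonymous mappings; the function name is made up for illustration and is
not part of the patch.  It maps two pages, fills them with a pattern,
maps one new page with MAP_FIXED over the second page, and verifies that
the first page kept its contents.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if MAP_FIXED replaced only the
   overlapped page, 0 if not, and -1 if a mapping call failed.  */
static int
fixed_replaces_only_overlap (void)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);

  /* Two anonymous pages filled with a recognizable pattern.  */
  unsigned char *old = mmap (NULL, 2 * page, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (old == MAP_FAILED)
    return -1;
  memset (old, 0xAA, 2 * page);

  /* One fresh page placed exactly on top of the second old page.  */
  unsigned char *fresh = mmap (old + page, page, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                               -1, 0);
  if (fresh == MAP_FAILED)
    {
      munmap (old, 2 * page);
      return -1;
    }

  /* The first page keeps its pattern; the second is zero-filled.  */
  int ok = (fresh == old + page
            && old[0] == 0xAA && old[page - 1] == 0xAA
            && fresh[0] == 0 && fresh[page - 1] == 0);

  munmap (old, 2 * page);
  return ok;
}
```

Calling fixed_replaces_only_overlap () should return 1 if the remainder
of the old mapping survives, which is the behavior the reworded text is
meant to describe.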

>> +@item MAP_HUGE_16KB
>> +@dots{}
>> +@item MAP_HUGE_16GB
>> +Some architectures support more than one size of ``huge'' pages for
>> +@code{MAP_HUGETLB}.  These flags allow the caller to choose amongst
>> +them.  Note that while the ABI allows the caller to specify arbitrary
>> +page sizes, not all sizes have corresponding defined macros, and not
>> +all defined macros correspond to sizes supported by the kernel.  It is
>> +up to the programmer to only ask for huge page sizes that are known to
>> +be supported.
>
> These we do not support?  (We probably should.)

The ABI is a 6-bit bitfield giving the biased bit width of the page
size.  Not all combinations have macros, and not all combinations are
honored by the kernel.  We do have macros for the combinations that the
kernel honors.  So if the kernel can do it, it works, but if the kernel
can't do it, either you get a runtime error or a compile time error ;-)

I suppose we could list all 64 possible macros in our headers, but at
the moment, we don't.
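
To make the encoding concrete, here is a rough sketch of how a page size
could be turned into those flag bits.  It assumes the Linux layout as I
understand it (base-2 logarithm of the page size, stored in a 6-bit
field starting at bit 26); the EX_ constants and function names are
local to the example, not glibc or kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the Linux MAP_HUGE_SHIFT and MAP_HUGE_MASK values;
   defined locally so the example does not depend on any header.  */
enum { EX_HUGE_SHIFT = 26, EX_HUGE_MASK = 0x3f };

/* Encode a power-of-two page size as mmap flag bits.  Returns 0 if
   PAGE_SIZE is zero or not a power of two.  */
static unsigned int
ex_huge_flag (uint64_t page_size)
{
  if (page_size == 0 || (page_size & (page_size - 1)) != 0)
    return 0;

  /* Count how far the single set bit is from bit 0.  */
  unsigned int bits = 0;
  while ((page_size >> bits) != 1)
    bits++;

  return (bits & EX_HUGE_MASK) << EX_HUGE_SHIFT;
}

/* Decode the 6-bit field back into a base-2 logarithm.  */
static unsigned int
ex_huge_log2 (unsigned int flags)
{
  return (flags >> EX_HUGE_SHIFT) & EX_HUGE_MASK;
}
```

Under these assumptions, ex_huge_flag (2 MiB) yields the same bits as
21 << 26.  A value computed this way still has to be a size the kernel
actually supports, otherwise the mmap call fails at run time.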

>> +@item MAP_32BIT
>> +Require addresses that can be accessed with a 32 bit pointer, i.e.,
>> +within the first 4 GiB.  Ignored if MAP_FIXED is specified.
>> +
>> +@item MAP_DENYWRITE
>> +@item MAP_EXECUTABLE
>> +@item MAP_FILE
>> +
>> +Provided for compatibility.  Ignored by the Linux kernel.
>
> I thought that some corner cases still handle MAP_DENYWRITE?

Nope, completely ignored by the kernel.

>> +@item MAP_FIXED_NOREPLACE
>> +Similar to @code{MAP_FIXED} except the call will fail with
>> +@code{EEXIST} if the new mapping would overwrite an existing mapping.
>
> How does this interact with MAP_SHARED_VALIDATE above?  Can it be
> combined with MAP_FIXED?

Superset of MAP_FIXED, so it's internally *always* combined:
        /* force arch specific MAP_FIXED handling in get_unmapped_area */
        if (flags & MAP_FIXED_NOREPLACE)
                flags |= MAP_FIXED;

I would assume it interacts with MAP_SHARED_VALIDATE exactly as
documented: it creates a shared fixed mapping, unless the kernel doesn't
support MAP_FIXED_NOREPLACE, in which case the call fails.
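
A caller-side sketch of the MAP_FIXED_NOREPLACE behavior, for
illustration only.  It assumes Linux; the fallback value for the macro
is the generic one and differs on some architectures, and the function
name is hypothetical.  Without MAP_SHARED_VALIDATE an older kernel may
simply ignore the flag and treat the address as a hint, so the returned
address has to be checked.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
/* Assumed generic Linux value; some architectures use another.  */
# define MAP_FIXED_NOREPLACE 0x100000
#endif

/* Hypothetical probe.  Returns 1 if the kernel refused the overlapping
   mapping with EEXIST, 0 if it apparently ignored the flag and placed
   the mapping elsewhere, and -1 on any other outcome.  */
static int
noreplace_probe (void)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  int prot = PROT_READ | PROT_WRITE;

  void *first = mmap (NULL, page, prot,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (first == MAP_FAILED)
    return -1;

  int result;
  /* Ask for the same address again without allowing replacement.  */
  void *second = mmap (first, page, prot,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
                       -1, 0);
  if (second == MAP_FAILED)
    result = (errno == EEXIST) ? 1 : -1;
  else if (second != first)
    {
      /* Flag not honored: the address was only used as a hint.  */
      munmap (second, page);
      result = 0;
    }
  else
    result = -1;   /* The existing mapping was replaced.  */

  munmap (first, page);
  return result;
}
```

On a kernel that knows the flag I would expect noreplace_probe () to
return 1; a return of 0 is the "flag ignored" case that
MAP_SHARED_VALIDATE is meant to rule out for shared mappings.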

>> +@item MAP_GROWSDOWN
>> +This flag is used to make stacks, and is typically only needed inside
>> +the program loader to set up the main stack and thread stacks for the
>> +running process.  The mapping is created according to the other flags,
>> +except an additional page just prior to the mapping is marked as a
>> +``guard page''.  If a write is attempted inside this guard page, that
>> +page is mapped, the mapping is extended, and a new guard page is
>> +created.  Thus, the mapping continues to grow towards lower addresses
>> +until it encounters some other mapping.
>
> Maybe reference -fstack-clash-protection, and note that @theglibc{} does
> not use this for thread stacks?

I took out the thread stack text.

Added text about -fstack-clash-protection.

>> +@item MAP_LOCKED
>> +Requests that mapped pages are locked in memory (i.e. not paged out).
>> +Note that this is a request and not a requirement; use @code{mlock} if
>> +locking is mandatory.
>> +
>> +@item MAP_POPULATE
>> +@item MAP_NONBLOCK
>> +These two are opposites.  @code{MAP_POPULATE} requests that the kernel
>> +read-ahead a file-backed mapping, causing more pages to be mapped
>> +before they're needed.  @code{MAP_NONBLOCK} requests that the kernel
>> +@emph{not} attempt such, only mapping pages when they're actually
>> +needed.
>
> MAP_POPULATE is just a hint, right?  And even with mlockall, or
> MAP_LOCKED, it does not guarantee the absence of future page faults.

Correct, which is why I said "requests" but I'll add better text.

>> +@item MAP_NORESERVE
>> +Asks the kernel to not reserve physical backing for a mapping.  This
>> +would be useful for, for example, a very large but sparsely used
>> +mapping which need not be limited in span by available RAM or swap.
>> +Note that writes to such a mapping may cause a @code{SIGSEGV} if the
>> +amount of backing required eventualy exceeds system resources.
>> +
>> +On Linux, this flag's behavior may be overwridden by
>> +@code{/proc/sys/vm/overcommit_memory} as documented in swap(5).
>
> Shoud @xref the man-pages section added in the other patch.  However,
> swap(5) does not appear to exist?

Should be proc(5).  I tweaked the wording to not need a reference, I
think.  We do *not* want to accidentally include-by-reference
documentation on /proc or /sys, just the system calls.

>> +@item MAP_SYNC
>> +This flag is used to map persistent memory devices into the running
>> +program in such a way that writes to the mapping are immediately
>> +written to the device as well.  Unlike most other flags, this one will
>> +fail unless @code{MAP_SHARED_VALIDATE} is also given.
>
> Is this about DAX?

Yes.

>> +@item EAGAIN
>> +
>> +Either the underlying file is locked, or the system has temporarily
>> +run out of resources.
>
> See below, I think the reference about locking is spurious.

Based on kernel code:

        if (!mlock_future_ok(mm, vm_flags, len))
                return -EAGAIN;
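
As a rough userspace illustration of that check: if the soft
RLIMIT_MEMLOCK limit is smaller than a MAP_LOCKED mapping and the
process lacks CAP_IPC_LOCK, I would expect mmap to fail with EAGAIN.
This is a sketch under those assumptions (Linux only, hypothetical
function name), not something taken from the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical probe.  Temporarily lowers the soft RLIMIT_MEMLOCK to
   one page, then requests a four-page MAP_LOCKED mapping.  Returns 0
   if the mapping succeeded (for example with CAP_IPC_LOCK), the errno
   value if mmap failed, or -1 if the limit could not be changed.  */
static int
locked_map_over_limit (void)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  struct rlimit saved, lowered;

  if (getrlimit (RLIMIT_MEMLOCK, &saved) != 0)
    return -1;
  lowered = saved;
  lowered.rlim_cur = page;
  if (setrlimit (RLIMIT_MEMLOCK, &lowered) != 0)
    return -1;

  int result = 0;
  void *p = mmap (NULL, 4 * page, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
  if (p == MAP_FAILED)
    result = errno;
  else
    munmap (p, 4 * page);

  /* Put the original soft limit back.  */
  setrlimit (RLIMIT_MEMLOCK, &saved);
  return result;
}
```

For an unprivileged process the return value should be EAGAIN, which is
the "locked" case the new EAGAIN entry refers to.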

>> +@item EBADF
>> +
>> +The @var{fd} passes is invalid, and a valid file descriptor is required.
>
> Is a file descriptor ever required?

If mapping a file, yes.  That's the default ;-)
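
A small sketch of the two cases, assuming Linux and made-up function
names: without MAP_ANONYMOUS the descriptor is used and an invalid one
gives EBADF, while with MAP_ANONYMOUS the descriptor is not needed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* File-backed request with an invalid descriptor.  Returns the errno
   value from mmap, or 0 if the call unexpectedly succeeded.  */
static int
map_with_bad_fd (void)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, page, PROT_READ, MAP_PRIVATE, -1, 0);
  if (p == MAP_FAILED)
    return errno;
  munmap (p, page);
  return 0;
}

/* Same request with MAP_ANONYMOUS, where no file is involved.
   Returns 0 on success or the errno value on failure.  */
static int
map_anonymous_without_fd (void)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, page, PROT_READ,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return errno;
  munmap (p, page);
  return 0;
}
```

Here map_with_bad_fd () should return EBADF and
map_anonymous_without_fd () should return 0.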

>> +@item EEXIST
>> +
>> +@code{MAP_FIXED_NOREPLACE} was specified and an existing mapping was
>> +found in the requested address range.
>
> See my comment above for MAP_FIXED_NOREPLACE.

Tweaked the wording.

>>  @item EINVAL
>>  
>>  Either @var{address} was unusable (because it is not a multiple of the
>> @@ -1663,28 +1761,35 @@ applicable page size), or inconsistent @var{flags} were given.
>>  If @code{MAP_HUGETLB} was specified, the file or system does not support
>>  large page sizes.
>>  
>> -@item EACCES
>> +@item ENFILE
>>  
>> -@var{filedes} was not open for the type of access specified in @var{protect}.
>> +There are too many open files in the system.
>
> Can this error actually happen?  It's a bit surprising.

No direct mention in the kernel sources, but the man page documents it.
Removed.

>> +@item ENODEV
>> +
>> +This file is of a type that doesn't support mapping.
>>  
>>  @item ENOMEM
>>  
>>  Either there is not enough memory for the operation, or the process is
>>  out of address space.
>
> This should probably reference vm.max_map_count.

Noted.

>> -@c On Linux, EAGAIN will appear if the file has a conflicting mandatory lock.
>> -@c However mandatory locks are not discussed in this manual.
>
> Mandatory locks are disabled in pretty much all kernels out there, no?

I wouldn't think we'd want to write documentation based on config
options; if you get this error, obviously the config option was turned
on ;-)

But I removed this comment because I mentioned locking (in general) in
the added EAGAIN entry.

>> +@item EOVERFLOW
>> +
>> +Either the offset into the file causes the page counts to exceed the
>> +range of a 32 bit number, or the offset requested exceeds the length
>> +of the file.
>
> The reference to page size may be incorrect.  I think it's a fixed
> offset regardless of page size on systems that can't pass a 64-bit file
> offset.

The code I was looking at was talking about having 2^32 *pages* mapped.

>> +@item ETXTBSY
>> +
>> +@code{MAP_DENYWRITE} was specified, but the file descriptor given was
>> +open for writing.
>
> This seems to contradict the earlier suggestion that MAP_DENYWRITE is
> ignored.

Kernel source says this is still returned if you try to map a swap file
for writing...  rewritten.


