This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Linux: Add <sys/direntries.h>
On 15/07/2019 10:34, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>>> Yes, I think there is no way aound memory allocation for this. We can
>>> make it rare enough if we pre-allocate just one place, though. I think
>>> it's still a reasonable solution for a bad fringe case.
>>
>> I think pre-allocating the d_off table entry is a 'pessimization' which
>> falls in the idea of pay for not what want. Due current telldir/seekdir
>> breakage for some filesystem, I would say it is not extensively used or usage
>> is limited that the issue does not hit very often. Also, since d_off for some
>> file system is a hashed 'cookie', even 64-bit call does requires this
>> (although I am not sure how is frequency the hash algorithm used set the
>> higher bit).
>
> We only need to preallocate the space for one mapping entry (number to
> 64-bit offset), so that telldir doesn't have to allocate anything and
> therefore cannot fail. If telldir is not called, we can overwrite the
> entry at the next readdir call, without to allocate anything. Only if
> telldir is actually called, the next readdir call needs to allocate
> something.
Although POSIX states that, for 'loc' used with seekdir that were not
obtained from telldir subsequent, calls to readdir() are unspecified, I
think we should also avoid calling lseek for invalid positions. This is
what FreeBSD does by allocating a mapping for each call to telldir and
thus checking on seekdir if the value is valid (it only need to check for
values that overflow 'long' though).
>
>>> But it inteferes with using readdir in an async-signal-safe (or at least
>>> malloc-unsafe) context, which is something people actually do. If we
>>> intend to tell people to move to safer interfaces (useful after vfork or
>>> just even multi-threaded fork), I think we are under an obligation to
>>> provide such interfaces.
>>
>> To summarize, opendir async-signal-safe issue are:
>>
>> 1. malloc call on opendir to allocate the initial buffer;
>
> And the DIR struct itself, which is not a problem inside glibc, but is
> externally, it would add some ABI which does not exist today.
That the malloc was referring in fact, at __alloc_dir.
>
>> 2. (mips64 only for kernel prior 3.10) __getdents calls malloc prior
>> getdents syscall (due initial opendir buffers malloc will always
>> be called);
>>
>> 3. readddir uses lock associate to DIR to provide thread-safey;
>>
>> 4. telldir/seekdir might require associated data-structure allocation
>> to handle the return value limitation (not implemented).
>>
>> For 1. we can either provide an alternative to allow an user-provided
>> buffer or use mmap directly. The user provided buffer also allows help
>> the buffer bloat due the preferred large block sizes from file system.
>>
>> For 2. we can get back to a VLA and avoid malloc altogether, the default
>> buffer used will trigger at most 32kb allocation and usage is pretty
>> limited in a specific scenario.
>
> That would still intefere with the posix_spawn code, I think.
We will need to provision a large stack size for mips64 (most likely by
defining it on arch-specific internal header).
>
>> For 3. we might provide an 'unlocked' version same as we do for stdio.
>> I am not sure how often opendir is used in a multithread environment
>> with concurrent accesses.
>
> I don't think this lock is a problem because it's specific to one DIR *.
> It would only affect the async-signal-safe context if the DIR * was
> allocated on the outside and reused from the async-signal-safe context.
Indeed it should not prevent if opendir is called within the context of
the signal handler.