This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Linux: Add <sys/direntries.h>



On 15/07/2019 09:10, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> On 08/07/2019 12:44, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>
>>>> Probably not many, even now you see on debian sparc maillist programs
>>>> that raise alignment issue due sparc restrictions. But masking wrong C
>>>> usage is not intention of such interfaces.
>>>
>>> I think code that's difficult to implement in a portable fashion may
>>> have a place in glibc, particularly if it is closely related to a glibc
>>> or kernel interface.
>>
>> But we do have dirent functions exactly to abstract it. If user wants to
>> optimize by using more direct function, such as getdents, its usage are
>> very specific and the auxiliary functions, as the one you intend to provide,
>> are just guesses of what user might want.
> 
> I think the interfaces cover this use case pretty well:
> 
>   <https://sourceware.org/ml/libc-help/2019-07/msg00007.html>
> 
>>>>> To support 64-bit directory offsets on 32-bit architectures, readdir and
>>>>> readdir64 must call malloc after all previous telldir call, to allocate
>>>>> space for the 64-to-32 mapping.  telldir cannot report failure and
>>>>> therefore cannot call malloc.  So extending opendir in this way looks
>>>>> rather problematic.
>>>>
>>>> I think you meant seekdir instead of telldir.
>>>
>>> No, I meant telldir.  telldir needs space to store a new mapping if it
>>> is called (which we do not know beforehand).  We can preallocate a fixed
>>> number of entries (one is quite enough), but if telldir has been called,
>>> and *then* readdir is called, we need to make sure that we have space
>>> for another mapping if telldir is called again.
>>>
>>> This is the proper fix for bug 23960 and is what FreeBSD does.
>>
>> Right, I see what you mean now. It was not clear you was referring to a
>> possible solution of the telldir/seekdir issues where there is an extra map 
>> between the fs d_off and the expected result long type for telldir. 
>>
>> But even the FreeBSD solution is to assume the return value will hold
>> on 'long' return and allocate the new maps entries on a linked list
>> external to the buffer used for getdirentries.
> 
> Yes, I think there is no way aound memory allocation for this.  We can
> make it rare enough if we pre-allocate just one place, though.  I think
> it's still a reasonable solution for a bad fringe case.

I think pre-allocating the d_off table entry is a 'pessimization' which
falls in the idea of pay for not what want.  Due current telldir/seekdir 
breakage for some filesystem, I would say it is not extensively used or usage 
is limited that the issue does not hit very often. Also, since d_off for some
file system is a hashed 'cookie', even 64-bit call does requires this
(although I am not sure how is frequency the hash algorithm used set the
higher bit).

> 
> But it inteferes with using readdir in an async-signal-safe (or at least
> malloc-unsafe) context, which is something people actually do.  If we
> intend to tell people to move to safer interfaces (useful after vfork or
> just even multi-threaded fork), I think we are under an obligation to
> provide such interfaces.

To summarize, opendir async-signal-safe issue are:

  1. malloc call on opendir to allocate the initial buffer;

  2. (mips64 only for kernel prior 3.10) __getdents calls malloc prior
     getdents syscall (due initial opendir buffers malloc will always
     be called);

  3. readddir uses lock associate to DIR to provide thread-safey;

  4. telldir/seekdir might require associated data-structure allocation
     to handle the return value limitation (not implemented).

For 1. we can either provide an alternative to allow an user-provided
buffer or use mmap directly.  The user provided buffer also allows help
the buffer bloat due the preferred large block sizes from file system.

For 2. we can get back to a VLA and avoid malloc altogether, the default 
buffer used will trigger at most 32kb allocation and usage is pretty
limited in a specific scenario.

For 3. we might provide an 'unlocked' version same as we do for stdio.
I am not sure how often opendir is used in a multithread environment
with concurrent accesses.

For 4. it is not an issue *now* for glibc, since it does not really
handle the d_off value correctly.  Once we fixed it, most likely by
using the similar strategy FreeBSD does, it will need to both a lock
and a malloc call. I am not sure if we would be able to provide 
async-signal-safe version with current constraint of using 'long' as
indication of position.

I give you that opendir interfaces has some issues and it is not really 
meant to be used on async scenarios or even in the limbo state after 
vfork/fork in multithread scenarios.  However, the main issue I see it why 
developers are relying on such interfaces in theses scenarios and which are 
the issues they are trying to solve.

Both from previous iterations about this topic and the recent libc-help,
it seems what we should provide instead is a safe way to spawn an process
while closing any lingering open file descriptor that caller can't guarantee
that has been opened with CLOCEXEC.  

So I prefer if we integrate your 'sys/direntries.h' on my proposed 
posix_spawn closefrom code instead of provide another GNU extension that 
mimics an already POSIX interface to solve a very specific issue. 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]