This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Fix readdir_r with long file names
- From: Florian Weimer <fweimer at redhat dot com>
- To: mtk dot manpages at gmail dot com, Siddhesh Poyarekar <siddhesh at redhat dot com>
- Cc: Rich Felker <dalias at aerifal dot cx>, "Carlos O'Donell" <carlos at redhat dot com>, KOSAKI Motohiro <kosaki dot motohiro at gmail dot com>, libc-alpha <libc-alpha at sourceware dot org>, Roland McGrath <roland at hack dot frob dot com>, linux-man <linux-man at vger dot kernel dot org>
- Date: Tue, 1 Mar 2016 17:59:37 +0100
- Subject: Re: [PATCH] Fix readdir_r with long file names
- References: <51B0B39F dot 4060202 at redhat dot com> <51B0BD36 dot 3030202 at redhat dot com> <CAHGf_=r9Rz63pho+84ORk0a_oDyJSj-MCnZ56uPrT3L6sVEfeQ at mail dot gmail dot com> <20130607013024 dot GO29800 at brightrain dot aerifal dot cx> <51B19203 dot 3070307 at redhat dot com> <20130607144143 dot GQ29800 at brightrain dot aerifal dot cx> <51B57E35 dot 4080403 at redhat dot com> <51B65EA7 dot 2020402 at redhat dot com> <20130611011324 dot GT29800 at brightrain dot aerifal dot cx> <51B8702D dot 2060505 at redhat dot com> <20130813040038 dot GE21795 at spoyarek dot pnq dot redhat dot com> <520C88A6 dot 9070501 at redhat dot com> <56D54DAD dot 1040306 at gmail dot com>
On 03/01/2016 09:07 AM, Michael Kerrisk (man-pages) wrote:
> I see that glibc 2.23 deprecates readdir_r(), which prompted me to catch
> up on this thread. I'd like to see the points you make documented in the
> readdir_r(3) man page also. Would you be willing to allow that text to
> be reused / reworked for the page, under that page's existing "verbatim"
> license (https://www.kernel.org/doc/man-pages/licenses.html#verbatim)?
Hi Michael,
thanks for keeping an eye on deprecations. The deprecation happened for
glibc 2.24 (unreleased).
I'm happy to report that I may grant your request.
> The text I'd propose to add to the man page would be (new material
> starting at ===>):
It may make sense to move this documentation to a separate manual page,
specific to readdir_r. This will keep the readdir documentation nice
and crisp. Most programmers will never have to consult all these details.
You should remove the example using pathconf because it is not correct.
The kernel does not return valid values for _PC_NAME_MAX on some file
systems (such as CIFS, and CD-ROMs with Joliet extensions once a kernel
bug is fixed). The CIFS limit is somewhere around 765, and not 255 as
reported by the kernel. If I recall correctly, Windows SMB servers can
actually exceed the 255 byte limit. The reason is that Windows NTFS has
a limit based on 16-bit UCS-2 characters, and after UTF-8 conversion,
the maximum length is more than 255 bytes.
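To make the arithmetic concrete, here is a rough sketch in C. It assumes
an NTFS-style limit of 255 UCS-2 code units per name and a worst case of
3 UTF-8 bytes per code unit. The function names are made up for
illustration and are not part of glibc or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one UCS-2 code unit (a BMP character) needs at most
   3 bytes once converted to UTF-8. */
enum { UTF8_MAX_BYTES_PER_UCS2_UNIT = 3 };

/* Hypothetical helper: worst-case byte length of a name made of
   ucs2_units code units after UTF-8 conversion. */
size_t worst_case_utf8_len(size_t ucs2_units)
{
    /* Guard against overflow in the multiplication below. */
    assert(ucs2_units <= SIZE_MAX / UTF8_MAX_BYTES_PER_UCS2_UNIT);
    return ucs2_units * UTF8_MAX_BYTES_PER_UCS2_UNIT;
}

/* Hypothetical helper: nonzero if a buffer sized from a reported limit
   (for example a value obtained via _PC_NAME_MAX) could be too small
   for a name of ucs2_units code units. */
int reported_limit_too_small(size_t reported_name_max, size_t ucs2_units)
{
    return worst_case_utf8_len(ucs2_units) > reported_name_max;
}
```

With these assumptions, worst_case_utf8_len(255) is 765, so a buffer
sized from a reported limit of 255 bytes can be too small.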
> ===> However, the above approach has problems, and it is recommended
> that applications use readdir() instead of readdir_r().
> Furthermore, since version 2.23, glibc deprecates readdir_r().
> The reasons are as follows:
>
> * On systems where NAME_MAX is undefined, calling readdir_r()
> may be unsafe because the interface does not allow the caller
> to specify the length of the buffer used for the returned
> directory entry.
>
> * On some systems, readdir_r() can't read directory entries
> with very long names. When the glibc implementation encounters
> such a name, readdir_r() fails with the error ENAMETOOLONG
> after the final directory entry has been read. On some
> other systems, readdir_r() may return a success status, but
> the returned d_name field may not be null terminated or may
> be truncated.
>
> * In the current POSIX.1 specification (POSIX.1-2008),
> readdir_r() is not required to be thread-safe. However, in
> modern implementations (including the glibc implementation),
> concurrent calls to readdir_r() that specify different
> directory streams are thread-safe. Therefore, the use of
These two references to readdir_r should be to readdir instead.
I believe there was a historic implementation which implemented
fdopendir (fd) as (DIR *) fd, and used a global static buffer for
readdir. This is about the only way readdir can be non-thread-safe.
> readdir_r() is generally unnecessary in multithreaded programs.
> In cases where multiple threads must read from the
> same directory stream, using readdir() with external
> synchronization is still preferable to the use of readdir_r(),
> for the reasons given in the points above.
>
> * It is expected that a future version of POSIX.1 will make
> readdir_r() obsolete, and require that readdir() be thread-
> safe when concurrently employed on different directory
> streams.
Okay.
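For the case where several threads share one directory stream, the
readdir-plus-lock approach could look roughly like the sketch below.
The helper name, the single global mutex, and the return convention are
illustrative assumptions, not something taken from glibc or from the
man page. The name is copied out while the lock is held, because the
entry returned by readdir may be overwritten by the next call on the
same stream.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Assumption: one lock guards every use of the shared stream. */
static pthread_mutex_t dir_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: copy the next entry name from a shared stream.
   Returns 1 if a name was copied, 0 at end of directory, and -1 on
   error with errno set (ENAMETOOLONG if buf is too small). */
int next_entry_name(DIR *dir, char *buf, size_t buflen)
{
    int result;
    int saved_errno;

    assert(dir != NULL && buf != NULL && buflen > 0);

    pthread_mutex_lock(&dir_lock);
    errno = 0;  /* lets us tell end of directory from an error */
    struct dirent *ent = readdir(dir);
    if (ent == NULL) {
        result = (errno == 0) ? 0 : -1;
    } else {
        size_t len = strlen(ent->d_name);
        if (len >= buflen) {
            /* The caller chose the buffer size, so report it here
               instead of truncating silently. */
            errno = ENAMETOOLONG;
            result = -1;
        } else {
            memcpy(buf, ent->d_name, len + 1);
            result = 1;
        }
    }
    saved_errno = errno;
    pthread_mutex_unlock(&dir_lock);
    errno = saved_errno;
    return result;
}
```

Unlike readdir_r, the caller states the buffer length here, so a long
name results in an explicit error rather than an overrun.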
Thanks,
Florian