This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] glob: Simplify the interface for the GLOB_ALTDIRFUNC callback gl_readdir
- From: Florian Weimer <fweimer at redhat dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 11 Apr 2016 18:57:17 +0200
- Subject: Re: [PATCH] glob: Simplify the interface for the GLOB_ALTDIRFUNC callback gl_readdir
- Authentication-results: sourceware.org; auth=none
- References: <57068276 dot 9000503 at redhat dot com> <20160408184422 dot A74BE2C3B21 at topped-with-meat dot com>
On 04/08/2016 08:44 PM, Roland McGrath wrote:
This is a potentially breaking ABI and API change. The documentation does
not say anything specific about what the gl_readdir function's protocol is.
It just says "an alternative implementation of readdir". The way I'd read
that, and the reality of the status quo ante, is that a trivial wrapper
around readdir will behave identically to not using GLOB_ALTDIRFUNC at all.
That means that d_ino==0 results must be ignored.
IMHO this means we need symbol versioning to make a change to the treatment
of d_ino.
I'm trying to get clarification if getdents on Linux can ever return
zero d_ino values. It's going to be difficult to work this out due to
the VFS layer. My hunch is that inode 0 is more likely to mean “I can't
tell you the inode number right now here”, and not “please skip this entry”.
In a sense, with glob64, supplying readdir64 as the callback does not
actually change behavior if the d_ino == 0 is left out because that
readdir64 implementation would not be POSIX-compliant (d_ino has to be
the file serial number if the member exists).
I think we have to filter in readdir in order to comply with POSIX if
the alleged historic race (d_ino is 0 during removal) can happen on
Linux or Hurd.
It's a sufficiently conservative change to start ignoring d_namlen,
so you could do that separately.
Indeed, this should be a different conversation. I tried to word the
new documentation in a way that leaves the door open for that.
Florian
glob: Simplify the interface for the GLOB_ALTDIRFUNC callback gl_readdir
Previously, application code had to set up d_ino and d_namlen members
if the platform supported them, involving conditional compilation.
DT_UNKNOWN is zero, so zero-initialization of struct dirent will set
the d_type member to a conservative value.
Changing the behavior with regards to d_ino is left to a future
cleanup.
2016-04-11 Florian Weimer <fweimer@redhat.com>
glob: Simplify and document the interface for the GLOB_ALTDIRFUNC
callback function gl_readdir.
* posix/glob.c (NAMELEN, CONVERT_D_NAMLEN): Remove.
(CONVERT_DIRENT_DIRENT64): Use strcpy instead of memcpy.
(glob_in_dir): Remove len. Use strdup instead of malloc and
memcpy to copy the name.
* posix/bug-glob2.c (my_readdir): Clear struct dirent object. Set
d_ino to 1.
* posix/tst-gnuglob.c (my_readdir): Likewise.
* manual/pattern.texi (Calling Glob): Document requirements for
implementations of the gl_readdir callback function.
diff --git a/manual/pattern.texi b/manual/pattern.texi
index d1b9275..fa5ecd7 100644
--- a/manual/pattern.texi
+++ b/manual/pattern.texi
@@ -237,6 +237,19 @@ function used to read the contents of a directory. It is used if the
@code{GLOB_ALTDIRFUNC} bit is set in the flag parameter. The type of
this field is @w{@code{struct dirent *(*) (void *)}}.
+An implementation of @code{gl_readdir} should initialize the
+@code{struct dirent} object to zero, up to the @code{d_name} member.
+The @code{d_ino} member must be set to a non-zero value, otherwise
+@code{glob} may ignore the returned directory entry. As an
+optimization, it may set the @code{d_type} member if the file type of
+the entry is known. The @code{d_name} member must be null-terminated;
+the entire string is used by @code{glob}.
+
+The @code{struct dirent} object can be overwritten by a subsequent call
+to the @code{gl_readdir} callback function on the same directory stream
+created by the @code{gl_opendir} callback function. It should be
+deallocated by the @code{gl_closedir} callback function.
+
This is a GNU extension.
@item gl_opendir
diff --git a/posix/bug-glob2.c b/posix/bug-glob2.c
index ddf5ec9..8c377b6 100644
--- a/posix/bug-glob2.c
+++ b/posix/bug-glob2.c
@@ -193,7 +193,8 @@ my_readdir (void *gdir)
return NULL;
}
- dir->d.d_ino = dir->idx;
+ memset (&dir->d, 0, offsetof (struct dirent, d_name));
+ dir->d.d_ino = 1; /* glob should not skip this entry. */
#ifdef _DIRENT_HAVE_D_TYPE
dir->d.d_type = filesystem[dir->idx].type;
@@ -202,13 +203,11 @@ my_readdir (void *gdir)
strcpy (dir->d.d_name, filesystem[dir->idx].name);
#ifdef _DIRENT_HAVE_D_TYPE
- PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_ino: %ld, d_type: %d, d_name: \"%s\" }\n",
- dir->level, (long int) dir->idx, dir->d.d_ino, dir->d.d_type,
- dir->d.d_name);
+ PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_type: %d, d_name: \"%s\" }\n",
+ dir->level, (long int) dir->idx, dir->d.d_type, dir->d.d_name);
#else
- PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_ino: %ld, d_name: \"%s\" }\n",
- dir->level, (long int) dir->idx, dir->d.d_ino,
- dir->d.d_name);
+ PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_name: \"%s\" }\n",
+ dir->level, (long int) dir->idx, dir->d.d_name);
#endif
++dir->idx;
diff --git a/posix/glob.c b/posix/glob.c
index 0c04c3c..9ae76ac 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -57,10 +57,8 @@
#if defined HAVE_DIRENT_H || defined __GNU_LIBRARY__
# include <dirent.h>
-# define NAMLEN(dirent) strlen((dirent)->d_name)
#else
# define dirent direct
-# define NAMLEN(dirent) (dirent)->d_namlen
# ifdef HAVE_SYS_NDIR_H
# include <sys/ndir.h>
# endif
@@ -76,12 +74,6 @@
#endif
-/* In GNU systems, <dirent.h> defines this macro for us. */
-#ifdef _D_NAMLEN
-# undef NAMLEN
-# define NAMLEN(d) _D_NAMLEN(d)
-#endif
-
/* When used in the GNU libc the symbol _DIRENT_HAVE_D_TYPE is available
if the `d_type' member for `struct dirent' is available.
HAVE_STRUCT_DIRENT_D_TYPE plays the same role in GNULIB. */
@@ -105,12 +97,6 @@
/* If the system has the `struct dirent64' type we use it internally. */
#if defined _LIBC && !defined COMPILE_GLOB64
-# if defined HAVE_DIRENT_H || defined __GNU_LIBRARY__
-# define CONVERT_D_NAMLEN(d64, d32)
-# else
-# define CONVERT_D_NAMLEN(d64, d32) \
- (d64)->d_namlen = (d32)->d_namlen;
-# endif
# if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
# define CONVERT_D_INO(d64, d32)
@@ -127,8 +113,7 @@
# endif
# define CONVERT_DIRENT_DIRENT64(d64, d32) \
- memcpy ((d64)->d_name, (d32)->d_name, NAMLEN (d32) + 1); \
- CONVERT_D_NAMLEN (d64, d32) \
+ strcpy ((d64)->d_name, (d32)->d_name); \
CONVERT_D_INO (d64, d32) \
CONVERT_D_TYPE (d64, d32)
#endif
@@ -1554,7 +1539,6 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
while (1)
{
const char *name;
- size_t len;
#if defined _LIBC && !defined COMPILE_GLOB64
struct dirent64 *d;
union
@@ -1622,12 +1606,10 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
names = newnames;
cur = 0;
}
- len = NAMLEN (d);
- names->name[cur] = (char *) malloc (len + 1);
+ names->name[cur] = strdup (d->d_name);
if (names->name[cur] == NULL)
goto memory_error;
- *((char *) mempcpy (names->name[cur++], name, len))
- = '\0';
+ ++cur;
++nfound;
}
}
diff --git a/posix/tst-gnuglob.c b/posix/tst-gnuglob.c
index 992b997..184e030 100644
--- a/posix/tst-gnuglob.c
+++ b/posix/tst-gnuglob.c
@@ -211,7 +211,8 @@ my_readdir (void *gdir)
return NULL;
}
- dir->d.d_ino = dir->idx;
+ memset (&dir->d, 0, offsetof (struct dirent, d_name));
+ dir->d.d_ino = 1; /* glob should not skip this entry. */
#ifdef _DIRENT_HAVE_D_TYPE
dir->d.d_type = filesystem[dir->idx].type;
@@ -220,13 +221,11 @@ my_readdir (void *gdir)
strcpy (dir->d.d_name, filesystem[dir->idx].name);
#ifdef _DIRENT_HAVE_D_TYPE
- PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_ino: %ld, d_type: %d, d_name: \"%s\" }\n",
- dir->level, (long int) dir->idx, dir->d.d_ino, dir->d.d_type,
- dir->d.d_name);
+ PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_type: %d, d_name: \"%s\" }\n",
+ dir->level, (long int) dir->idx, dir->d.d_type, dir->d.d_name);
#else
- PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_ino: %ld, d_name: \"%s\" }\n",
- dir->level, (long int) dir->idx, dir->d.d_ino,
- dir->d.d_name);
+ PRINTF ("my_readdir ({ level: %d, idx: %ld }) = { d_name: \"%s\" }\n",
+ dir->level, (long int) dir->idx, dir->d.d_name);
#endif
++dir->idx;