This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] glob: Avoid copying the d_name field of struct dirent [BZ #19779]
- From: Florian Weimer <fweimer at redhat dot com>
- To: Roland McGrath <roland at hack dot frob dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Wed, 30 Mar 2016 13:37:56 +0200
- Subject: Re: [PATCH] glob: Avoid copying the d_name field of struct dirent [BZ #19779]
- Authentication-results: sourceware.org; auth=none
- References: <56E339A7 dot 7060704 at redhat dot com> <20160311222757 dot DB90C2C3C24 at topped-with-meat dot com>
On 03/11/2016 11:27 PM, Roland McGrath wrote:
>> --- a/posix/bug-glob2.c
>> +++ b/posix/bug-glob2.c
> [...]
>> @@ -75,7 +87,7 @@ typedef struct
>> int level;
>> int idx;
>> struct dirent d;
>> - char room_for_dirent[NAME_MAX];
>> + char room_for_dirent[1000];
>> } my_DIR;
>
> There should be some comment about where this arbitrary size comes from.
I changed it to a sizeof.
>> +/* A representation of a directory entry which does not depend on the
>> + layout of struct dirent, or the size of ino_t. */
>> +struct abstract_dirent
>> +{
>> + const char *d_name;
>> + int d_type;
>> + bool skip_entry;
>> +};
>
> I'm not convinced that this is a good name for the struct. It's not an
> abstract form of 'struct dirent', because the name member points into some
> other buffer whose lifetime is not intrinsically related to this struct.
It's now struct readdir_result.
> uint8_t is big enough for d_type, and will make the struct smaller on
> ILP32.
I assume the struct is turned into scalars anyway, but I added the
__typeof__.
>> +/* Extract name and type from directory entry. No copy of the name is
>> + made. */
>> +static void
>> +convert_dirent (const struct dirent *source, struct abstract_dirent *target)
>
> I like Paul's suggestion of using a struct-returning function here. The
> comment should be explicit that the result will point to SOURCE->d_name and
> so SOURCE must be kept live.
I made these changes.
>> +#if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
>> + target->d_type = source->d_type;
>> +#else
>> + target->d_type = 0;
>> +#endif
>> +#if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
>> + /* Posix does not require that the d_ino field be present, and some
>> + systems do not provide it. */
>> + target->skip_entry = false;
>> +#else
>> + target->skip_entry = source->d_ino == 0;
>> +#endif
>
> For both of these I'd prefer to see an inline or macro that encapsulates
> the #if stuff without other repetition, e.g.:
There are now separate macros for that.
> target->type = D_TYPE (source);
> target->skip_entry = D_SKIP_ME (source);
>
>> + struct abstract_dirent e;
>> + {
>> + struct dirent *d = (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0)
>> + ? ((struct dirent *)
>> + (*pglob->gl_readdir) (stream))
>> + : __readdir (stream));
>
> Convert to __glibc_unlikely while you're here (unless the gnulib sharing
> rules say not to, not clear on that off hand).
>> + if (d == NULL)
>> + break;
>> + convert_dirent (d, &e);
>> + }
>
> This could all be nicely broken out into a function that takes STREAM,
> FLAGS, PGLOB, and returns E.
It's slightly more complicated than that because I accidentally removed
the interesting parts.
Further cleanups should wait until we have a decision whether we will
ever try to synchronize with gnulib again.
>> - len = NAMLEN (d);
>> - names->name[cur] = (char *) malloc (len + 1);
>> + names->name[cur] = strdup (e.d_name);
>> if (names->name[cur] == NULL)
>> goto memory_error;
>> - *((char *) mempcpy (names->name[cur++], name, len))
>> - = '\0';
>> + ++cur;
>
> In the _DIRENT_HAVE_D_NAMLEN case, this assumes that strdup is as efficient
> as malloc+mempcpy (with no strlen required). Perhaps it's close enough,
> but that is a subtle change you didn't mention as intended.
I want to avoid conditionalized code because these things tend to break
after a while (or never work in the first place).
I have tested the attached patch with fake-disabled d_type support, and
on i386-redhat-linux-gnu and x86_64-redhat-linux-gnu.
Florian
2016-03-30 Florian Weimer <fweimer@redhat.com>
[BZ #19779]
CVE-2016-1234
Avoid copying names of directory entries.
* posix/glob.c (NAMELEN, DIRENT_MUST_BE, DIRENT_MIGHT_BE_SYMLINK)
(DIRENT_MIGHT_BE_DIR, CONVERT_D_NAMLEN, CONVERT_D_INO)
(CONVERT_D_TYPE, CONVERT_DIRENT_DIRENT64, REAL_DIR_ENTRY)
(NAME_MAX): Remove macros.
(struct readdir_result): New type.
(D_TYPE_TO_RESULT, D_TYPE_TO_RESULT, GL_READDIR): New macros.
(readdir_result_might_be_symlink, readdir_result_might_be_dir):
New functions.
(convert_dirent, convert_dirent64): New function.
(glob_in_dir): Use struct dirent_result. Call convert_dirent or
convert_dirent64. Adjust references to the readdir result. Use
strdup instead of NAMELEN and mempcpy.
* sysdeps/unix/sysv/linux/i386/glob64.c:
(convert_dirent, GL_READDIR): Redefine for second file inclusion.
* posix/bug-glob2.c (LONG_NAME): Define.
(filesystem): Add LONG_NAME.
(my_DIR): Increase the size of room_for_dirent.
diff --git a/posix/bug-glob2.c b/posix/bug-glob2.c
index ddf5ec9..fca048b 100644
--- a/posix/bug-glob2.c
+++ b/posix/bug-glob2.c
@@ -40,6 +40,17 @@
# define PRINTF(fmt, args...)
#endif
+#define LONG_NAME \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+ "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
static struct
{
@@ -58,6 +69,7 @@ static struct
{ ".", 3, DT_DIR, 0755 },
{ "..", 3, DT_DIR, 0755 },
{ "a", 3, DT_REG, 0644 },
+ { LONG_NAME, 3, DT_REG, 0644 },
{ "unreadable", 2, DT_DIR, 0111 },
{ ".", 3, DT_DIR, 0111 },
{ "..", 3, DT_DIR, 0755 },
@@ -75,7 +87,7 @@ typedef struct
int level;
int idx;
struct dirent d;
- char room_for_dirent[NAME_MAX];
+ char room_for_dirent[sizeof (LONG_NAME)];
} my_DIR;
diff --git a/posix/glob.c b/posix/glob.c
index 0c04c3c..e043edc 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -24,6 +24,7 @@
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
+#include <stdbool.h>
#include <stddef.h>
/* Outcomment the following line for production quality code. */
@@ -57,10 +58,8 @@
#if defined HAVE_DIRENT_H || defined __GNU_LIBRARY__
# include <dirent.h>
-# define NAMLEN(dirent) strlen((dirent)->d_name)
#else
# define dirent direct
-# define NAMLEN(dirent) (dirent)->d_namlen
# ifdef HAVE_SYS_NDIR_H
# include <sys/ndir.h>
# endif
@@ -75,82 +74,8 @@
# endif /* HAVE_VMSDIR_H */
#endif
-
-/* In GNU systems, <dirent.h> defines this macro for us. */
-#ifdef _D_NAMLEN
-# undef NAMLEN
-# define NAMLEN(d) _D_NAMLEN(d)
-#endif
-
-/* When used in the GNU libc the symbol _DIRENT_HAVE_D_TYPE is available
- if the `d_type' member for `struct dirent' is available.
- HAVE_STRUCT_DIRENT_D_TYPE plays the same role in GNULIB. */
-#if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
-/* True if the directory entry D must be of type T. */
-# define DIRENT_MUST_BE(d, t) ((d)->d_type == (t))
-
-/* True if the directory entry D might be a symbolic link. */
-# define DIRENT_MIGHT_BE_SYMLINK(d) \
- ((d)->d_type == DT_UNKNOWN || (d)->d_type == DT_LNK)
-
-/* True if the directory entry D might be a directory. */
-# define DIRENT_MIGHT_BE_DIR(d) \
- ((d)->d_type == DT_DIR || DIRENT_MIGHT_BE_SYMLINK (d))
-
-#else /* !HAVE_D_TYPE */
-# define DIRENT_MUST_BE(d, t) false
-# define DIRENT_MIGHT_BE_SYMLINK(d) true
-# define DIRENT_MIGHT_BE_DIR(d) true
-#endif /* HAVE_D_TYPE */
-
-/* If the system has the `struct dirent64' type we use it internally. */
-#if defined _LIBC && !defined COMPILE_GLOB64
-# if defined HAVE_DIRENT_H || defined __GNU_LIBRARY__
-# define CONVERT_D_NAMLEN(d64, d32)
-# else
-# define CONVERT_D_NAMLEN(d64, d32) \
- (d64)->d_namlen = (d32)->d_namlen;
-# endif
-
-# if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
-# define CONVERT_D_INO(d64, d32)
-# else
-# define CONVERT_D_INO(d64, d32) \
- (d64)->d_ino = (d32)->d_ino;
-# endif
-
-# ifdef _DIRENT_HAVE_D_TYPE
-# define CONVERT_D_TYPE(d64, d32) \
- (d64)->d_type = (d32)->d_type;
-# else
-# define CONVERT_D_TYPE(d64, d32)
-# endif
-
-# define CONVERT_DIRENT_DIRENT64(d64, d32) \
- memcpy ((d64)->d_name, (d32)->d_name, NAMLEN (d32) + 1); \
- CONVERT_D_NAMLEN (d64, d32) \
- CONVERT_D_INO (d64, d32) \
- CONVERT_D_TYPE (d64, d32)
-#endif
-
-
-#if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
-/* Posix does not require that the d_ino field be present, and some
- systems do not provide it. */
-# define REAL_DIR_ENTRY(dp) 1
-#else
-# define REAL_DIR_ENTRY(dp) (dp->d_ino != 0)
-#endif /* POSIX */
-
#include <stdlib.h>
#include <string.h>
-
-/* NAME_MAX is usually defined in <dirent.h> or <limits.h>. */
-#include <limits.h>
-#ifndef NAME_MAX
-# define NAME_MAX (sizeof (((struct dirent *) 0)->d_name))
-#endif
-
#include <alloca.h>
#ifdef _LIBC
@@ -195,8 +120,104 @@
static const char *next_brace_sub (const char *begin, int flags) __THROWNL;
+/* A representation of a directory entry which does not depend on the
+ layout of struct dirent, or the size of ino_t. */
+struct readdir_result
+{
+ const char *name;
+ bool skip_entry;
+# if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
+ __typeof__ (((struct dirent) {}).d_type) type;
+# endif
+};
+
+# if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
+/* Designated initializer based on the d_type member of struct
+ dirent. */
+# define D_TYPE_TO_RESULT(source) .type = (source)->d_type
+
+/* True if the directory entry D might be a symbolic link. */
+static inline bool
+readdir_result_might_be_symlink (struct readdir_result d)
+{
+ return d.type == DT_UNKNOWN || d.type == DT_LNK;
+}
+
+/* True if the directory entry D might be a directory. */
+static inline bool
+readdir_result_might_be_dir (struct readdir_result d)
+{
+ return d.type == DT_DIR || readdir_result_might_be_symlink (d);
+}
+# else
+# define D_TYPE_TO_RESULT(source)
+
+/* If we do not have type information, symbolic links and directories
+ are always a possibility. */
+
+static inline bool
+readdir_result_might_be_symlink (struct readdir_result d)
+{
+ return true;
+}
+
+static inline bool
+readdir_result_might_be_dir (struct readdir_result d)
+{
+ return true;
+}
+
+# endif
+
+# if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
+/* Designated initializer for skip_entry. POSIX does not require that
+ the d_ino field be present, and some systems do not provide it. */
+# define D_INO_TO_RESULT(source) .skip_entry = false
+# else
+# define D_INO_TO_RESULT(source) .skip_entry = (source)->d_ino == 0
+# endif
+
#endif /* !defined _LIBC || !defined GLOB_ONLY_P */
+/* Call gl_readdir on STREAM. This macro can be overridden to reduce
+ type safety if an old interface version needs to be supported. */
+#ifndef GL_READDIR
+# define GL_READDIR(pglob, stream) (pglob)->gl_readdir ((stream))
+#endif
+
+/* Extract name and type from directory entry. No copy of the name is
+ made. If SOURCE is NULL, result name is NULL. */
+static struct readdir_result
+convert_dirent (const struct dirent *source)
+{
+ if (source == NULL)
+ return (struct readdir_result) {};
+ else
+ return (struct readdir_result)
+ {
+ .name = source->d_name,
+ D_INO_TO_RESULT (source),
+ D_TYPE_TO_RESULT (source)
+ };
+}
+
+#ifndef COMPILE_GLOB64
+/* Like convert_dirent, but works on struct dirent64 instead. */
+static struct readdir_result
+convert_dirent64 (const struct dirent64 *source)
+{
+ if (source == NULL)
+ return (struct readdir_result) {};
+ else
+ return (struct readdir_result)
+ {
+ .name = source->d_name,
+ D_INO_TO_RESULT (source),
+ D_TYPE_TO_RESULT (source)
+ };
+}
+#endif
+
#ifndef attribute_hidden
# define attribute_hidden
#endif
@@ -1553,56 +1574,36 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
while (1)
{
- const char *name;
- size_t len;
-#if defined _LIBC && !defined COMPILE_GLOB64
- struct dirent64 *d;
- union
- {
- struct dirent64 d64;
- char room [offsetof (struct dirent64, d_name[0])
- + NAME_MAX + 1];
- }
- d64buf;
-
- if (__glibc_unlikely (flags & GLOB_ALTDIRFUNC))
- {
- struct dirent *d32 = (*pglob->gl_readdir) (stream);
- if (d32 != NULL)
- {
- CONVERT_DIRENT_DIRENT64 (&d64buf.d64, d32);
- d = &d64buf.d64;
- }
- else
- d = NULL;
- }
- else
- d = __readdir64 (stream);
+ struct readdir_result d;
+ {
+ if (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0))
+ d = convert_dirent (GL_READDIR (pglob, stream));
+ else
+ {
+#ifdef COMPILE_GLOB64
+ d = convert_dirent (__readdir (stream));
#else
- struct dirent *d = (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0)
- ? ((struct dirent *)
- (*pglob->gl_readdir) (stream))
- : __readdir (stream));
+ d = convert_dirent64 (__readdir64 (stream));
#endif
- if (d == NULL)
+ }
+ }
+ if (d.name == NULL)
break;
- if (! REAL_DIR_ENTRY (d))
+ if (d.skip_entry)
continue;
/* If we shall match only directories use the information
provided by the dirent call if possible. */
- if ((flags & GLOB_ONLYDIR) && !DIRENT_MIGHT_BE_DIR (d))
+ if ((flags & GLOB_ONLYDIR) && !readdir_result_might_be_dir (d))
continue;
- name = d->d_name;
-
- if (fnmatch (pattern, name, fnm_flags) == 0)
+ if (fnmatch (pattern, d.name, fnm_flags) == 0)
{
/* If the file we found is a symlink we have to
make sure the target file exists. */
- if (!DIRENT_MIGHT_BE_SYMLINK (d)
- || link_exists_p (dfd, directory, dirlen, name, pglob,
- flags))
+ if (!readdir_result_might_be_symlink (d)
+ || link_exists_p (dfd, directory, dirlen, d.name,
+ pglob, flags))
{
if (cur == names->count)
{
@@ -1622,12 +1623,10 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
names = newnames;
cur = 0;
}
- len = NAMLEN (d);
- names->name[cur] = (char *) malloc (len + 1);
+ names->name[cur] = strdup (d.name);
if (names->name[cur] == NULL)
goto memory_error;
- *((char *) mempcpy (names->name[cur++], name, len))
- = '\0';
+ ++cur;
++nfound;
}
}
diff --git a/sysdeps/unix/sysv/linux/i386/glob64.c b/sysdeps/unix/sysv/linux/i386/glob64.c
index b4fcd1a..bc0fabc 100644
--- a/sysdeps/unix/sysv/linux/i386/glob64.c
+++ b/sysdeps/unix/sysv/linux/i386/glob64.c
@@ -1,3 +1,21 @@
+/* Two glob variants with 64-bit support, for dirent64 and __olddirent64.
+ Copyright (C) 1998-2016 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
#include <dirent.h>
#include <glob.h>
#include <sys/stat.h>
@@ -38,11 +56,15 @@ int __old_glob64 (const char *__pattern, int __flags,
#undef dirent
#define dirent __old_dirent64
+#undef GL_READDIR
+# define GL_READDIR(pglob, stream) \
+ ((struct __old_dirent64 *) (pglob)->gl_readdir ((stream)))
#undef __readdir
#define __readdir(dirp) __old_readdir64 (dirp)
#undef glob
#define glob(pattern, flags, errfunc, pglob) \
__old_glob64 (pattern, flags, errfunc, pglob)
+#define convert_dirent __old_convert_dirent
#define glob_in_dir __old_glob_in_dir
#define GLOB_ATTRIBUTE attribute_compat_text_section