- To: bugs@gnu.org
- Subject: libc/1067: [glibc-2.1.1] Redundant lseeks from Linux __getdirentries, with fix
- From: Jamie Lokier <jamie.lokier@cern.ch>
- Date: Wed, 7 Apr 1999 05:24:43 +0200
- Xref: arthur.rhein-neckar.de mail.gnats-libc-bugs:2927
>Number: 1067
>Category: libc
>Synopsis: Redundant lseeks from Linux __getdirentries, with fix
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: libc-gnats
>State: open
>Class: change-request
>Submitter-Id: unknown
>Arrival-Date: Tue Apr 06 23:30:02 EDT 1999
>Last-Modified:
>Originator: Jamie Lokier
>Organization:
CERN, Geneva
>Release: libc-2.1.1
>Environment:
Host type: i386-redhat-linux-gnu
System: Linux pcep-jamie 2.2.5 #9 Wed Mar 31 18:35:38 CEST 1999 i686 unknown
Architecture: i686
Addons: crypt glibc-compat linuxthreads
Build CFLAGS: -O3 -Wall -Winline -Wstrict-prototypes -Wwrite-strings -g
Build CC: egcs
Compiler version: egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
Kernel headers: 2.2.5
Symbol versioning: yes
Build static: yes
Build shared: yes
Build pic-default: no
Build profile: yes
Build omitfp: no
Build bounded: no
Build static-nss: no
Stdio: libio
>Description:
On Linux, __getdirentries does redundent lseek system calls.
The overhead is quite significant (4.5% of the running time of
a program which is similar to `find'), due entirely to the
redundant calls. And it is easily avoided.
>How-To-Repeat:
strace any program which calls __getdirentries.
Watch the call to lseek to get the current file position,
before each call to getdents which is what getdirentries
uses on Linux to read a directory.
>Fix:
The lseek call is redundant, except in the case where the caller
explicitly requests the original file position. However a little care
is needed to make sure it can be removed. Hopefully the comments I've
included make it clear.
Here's the patch.
--- glibc/sysdeps/unix/sysv/linux/getdents.c.jamie Thu Oct 22 15:57:51 1998
+++ glibc/sysdeps/unix/sysv/linux/getdents.c Wed Apr 7 01:17:26 1999
@@ -1,4 +1,4 @@
-/* Copyright (C) 1993, 1995, 1996, 1997, 1998 Free Software Foundation, Inc.
+/* Copyright (C) 1993,1995,1996,1997,1998,1999 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -65,17 +65,30 @@
ssize_t
__getdirentries (int fd, char *buf, size_t nbytes, off_t *basep)
{
- off_t base = __lseek (fd, (off_t) 0, SEEK_CUR);
+ off_t base = basep ? __lseek (fd, (off_t) 0, SEEK_CUR) : 0;
off_t last_offset = base;
size_t red_nbytes;
struct kernel_dirent *skdp, *kdp;
struct dirent *dp;
int retval;
+ const size_t alignment = __alignof__ (struct dirent);
const size_t size_diff = (offsetof (struct dirent, d_name)
- offsetof (struct kernel_dirent, d_name));
+ const size_t divisor = ((offsetof (struct dirent, d_name) + 14 + alignment)
+ & ~(alignment - 1));
- red_nbytes = nbytes - ((nbytes / (offsetof (struct dirent, d_name) + 14))
- * size_diff);
+ /* To avoid a call to __lseek (when basep == 0), we have to avoid
+ requiring an initial value for last_offset. Do this by ensuring
+ that if the kernel returns any entries, we always have room to
+ copy at least one. Ensure this by:
+ nbytes >= ((red_nbytes + size_diff + alignment - 1) & ~(alignment - 1))
+ or equivalently:
+ red_nbytes <= (nbytes & ~(alignment - 1)) - size_diff
+ Hence the checks. */
+ nbytes &= ~(alignment - 1);
+ red_nbytes = nbytes - (nbytes / divisor) * size_diff;
+ if (red_nbytes + size_diff > nbytes)
+ red_nbytes = nbytes > size_diff ? nbytes - size_diff : 0;
dp = (struct dirent *) buf;
skdp = kdp = __alloca (red_nbytes);
@@ -84,7 +97,6 @@
while ((char *) kdp < (char *) skdp + retval)
{
- const size_t alignment = __alignof__ (struct dirent);
/* Since kdp->d_reclen is already aligned for the kernel structure
this may compute a value that is bigger than necessary. */
size_t new_reclen = ((kdp->d_reclen + size_diff + alignment - 1)
>Audit-Trail:
>Unformatted: