This is the mail archive of the libc-alpha@cygnus.com mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

[Jamie Lokier <jamie.lokier@cern.ch>] libc/1067: [glibc-2.1.1] Redundant lseeks from Linux __getdirentries, with fix



Jamie send us another report about getdirentries.  This one optimizes
away a number of lseek calls.  I didn't check all assertions of the
patch.

Andreas





>Number:         1067
>Category:       libc
>Synopsis:       Redundant lseeks from Linux __getdirentries, with fix
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    libc-gnats
>State:          open
>Class:          change-request
>Submitter-Id:   unknown
>Arrival-Date:   Tue Apr 06 23:30:02 EDT 1999
>Last-Modified:
>Originator:     Jamie Lokier
>Organization:
 CERN, Geneva
>Release:        libc-2.1.1
>Environment:
	
Host type: i386-redhat-linux-gnu
System: Linux pcep-jamie 2.2.5 #9 Wed Mar 31 18:35:38 CEST 1999 i686 unknown
Architecture: i686

Addons: crypt glibc-compat linuxthreads
Build CFLAGS: -O3 -Wall -Winline -Wstrict-prototypes -Wwrite-strings -g
Build CC: egcs
Compiler version: egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
Kernel headers: 2.2.5
Symbol versioning: yes
Build static: yes
Build shared: yes
Build pic-default: no
Build profile: yes
Build omitfp: no
Build bounded: no
Build static-nss: no
Stdio: libio

>Description:
        On Linux, __getdirentries does redundent lseek system calls.
	The overhead is quite significant (4.5% of the running time of
	a program which is similar to `find'), due entirely to the
	redundant calls.  And it is easily avoided.

>How-To-Repeat:
	strace any program which calls __getdirentries.
	Watch the call to lseek to get the current file position,
	before each call to getdents which is what getdirentries
	uses on Linux to read a directory.

>Fix:

The lseek call is redundant, except in the case where the caller
explicitly requests the original file position.  However a little care
is needed to make sure it can be removed.  Hopefully the comments I've
included make it clear.

Here's the patch.

--- glibc/sysdeps/unix/sysv/linux/getdents.c.jamie	Thu Oct 22 15:57:51 1998
+++ glibc/sysdeps/unix/sysv/linux/getdents.c	Wed Apr  7 01:17:26 1999
@@ -1,4 +1,4 @@
-/* Copyright (C) 1993, 1995, 1996, 1997, 1998 Free Software Foundation, Inc.
+/* Copyright (C) 1993,1995,1996,1997,1998,1999 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -65,17 +65,30 @@
 ssize_t
 __getdirentries (int fd, char *buf, size_t nbytes, off_t *basep)
 {
-  off_t base = __lseek (fd, (off_t) 0, SEEK_CUR);
+  off_t base = basep ? __lseek (fd, (off_t) 0, SEEK_CUR) : 0;
   off_t last_offset = base;
   size_t red_nbytes;
   struct kernel_dirent *skdp, *kdp;
   struct dirent *dp;
   int retval;
+  const size_t alignment = __alignof__ (struct dirent);
   const size_t size_diff = (offsetof (struct dirent, d_name)
 			    - offsetof (struct kernel_dirent, d_name));
+  const size_t divisor = ((offsetof (struct dirent, d_name) + 14 + alignment)
+			  & ~(alignment - 1));
 
-  red_nbytes = nbytes - ((nbytes / (offsetof (struct dirent, d_name) + 14))
-			 * size_diff);
+  /* To avoid a call to __lseek (when basep == 0), we have to avoid
+     requiring an initial value for last_offset.  Do this by ensuring
+     that if the kernel returns any entries, we always have room to
+     copy at least one.  Ensure this by:
+       nbytes >= ((red_nbytes + size_diff + alignment - 1) & ~(alignment - 1))
+     or equivalently:
+       red_nbytes <= (nbytes & ~(alignment - 1)) - size_diff
+     Hence the checks.  */
+  nbytes &= ~(alignment - 1);
+  red_nbytes = nbytes - (nbytes / divisor) * size_diff;
+  if (red_nbytes + size_diff > nbytes)
+    red_nbytes = nbytes > size_diff ? nbytes - size_diff : 0;
 
   dp = (struct dirent *) buf;
   skdp = kdp = __alloca (red_nbytes);
@@ -84,7 +97,6 @@
 
   while ((char *) kdp < (char *) skdp + retval)
     {
-      const size_t alignment = __alignof__ (struct dirent);
       /* Since kdp->d_reclen is already aligned for the kernel structure
 	 this may compute a value that is bigger than necessary.  */
       size_t new_reclen = ((kdp->d_reclen + size_diff + alignment - 1)

>Audit-Trail:
>Unformatted:




-- 
 Andreas Jaeger   aj@arthur.rhein-neckar.de    jaeger@informatik.uni-kl.de
  for pgp-key finger ajaeger@aixd1.rhrk.uni-kl.de

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]