This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[patch] Avoid the use of mmap as buffer for stdio streams


currently glibc uses mmap() to allocate the (anonymous) buffers that back a stdio stream. This is non-optimal in several ways:
* mmap and munmap are very expensive operations in a threaded/multi
cpu envinvironment: the kernel needs to take "big" locks and also
needs to do a cross-cpu TLB invalidate on every munmap.
* even though the standard buffer is 1Kb, mmap() causes this to take
4Kb of memory and TLB space; this is wasting 300% !
* there is no recycling of the memory going on on frequent
fopen/fclose cycles, each cycle leads to the expensive system calls
without any sharing; sharing is good for avoiding work,
tlb pressure, cpu cache utilization etc.
* unlike malloc()/free(), mmap() and munmap() have very little sanity
checking and bugs in the implementation or use get hidden most of
the time but can lead to sporadic and hard to debug corruption or
other weird effects [*]

This is caused by ALLOC_BUF and FREE_BUF to be hardcoded to mmap/munmap in libio/libioP.h in case the operating system has mmap; in the same file an malloc/free based implementation is available as well for the non-mmap case. I suspect this came from a time where mmap was thought to be much faster than malloc.... which isn't the case on nowadays glibc/Linux.

The attached patch just removes the mmap based implementation, and causes glibc to always use the malloc/free based implementation.

This has the advantage of solving all the downsides of mmap/munmap, but has 2 potential disadvantages of it's own:
1) it's now no longer a good idea to use fopen and friends inside the malloc() implementation (that doesn't seem to happen anyway)

[*] 2) stricter error checking in free() compared to malloc will uncover bugs. For example setvbuf() is known to cause a "junk" pointer to be passed to FREE_BUF(), with this patch this will cause a program abort, rather than the existing "call munmap() on a junk pointer and watch the kernel give -EINVAL" behavior.
( )

I feel that these potential disadvantages outweigh the advantages and so I'd like to ask for the patch to be applied (possibly after the setvbuf bug is fixed first)

   Arjan van de Ven
Index: glibc-20061120T1000/libio/libioP.h
--- glibc-20061120T1000.orig/libio/libioP.h
+++ glibc-20061120T1000/libio/libioP.h
@@ -825,26 +825,7 @@ extern void _IO_un_link_internal (struct
 # define ROUND_TO_PAGE(_S) \
        (((_S) + EXEC_PAGESIZE - 1) & ~(EXEC_PAGESIZE - 1))
-# define FREE_BUF(_B, _S) \
-       munmap ((_B), ROUND_TO_PAGE (_S))
-# define ALLOC_BUF(_B, _S, _R) \
-       do {								      \
-	  (_B) = (char *) mmap (0, ROUND_TO_PAGE (_S),			      \
-				PROT_READ | PROT_WRITE,			      \
-				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);	      \
-	  if ((_B) == (char *) MAP_FAILED)				      \
-	    return (_R);						      \
-       } while (0)
-# define ALLOC_WBUF(_B, _S, _R) \
-       do {								      \
-	  (_B) = (wchar_t *) mmap (0, ROUND_TO_PAGE (_S),		      \
-				   PROT_READ | PROT_WRITE,		      \
-				   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);	      \
-	  if ((_B) == (wchar_t *) MAP_FAILED)				      \
-	    return (_R);						      \
-       } while (0)
-#else /* _G_HAVE_MMAP */
+#endif /* _G_HAVE_MMAP */
 # define FREE_BUF(_B, _S) \
@@ -861,7 +842,6 @@ extern void _IO_un_link_internal (struct
 	    return (_R);						      \
        } while (0)
-#endif /* _G_HAVE_MMAP */
 #ifndef OS_FSTAT
 # define OS_FSTAT fstat

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]