asprintf has two bugs. The first is a performance bug - when collecting a
string more than BUFSIZ bytes, realloc is potentially called for every byte
added to the string. If every call to realloc results in a relocation of the
existing memory contents, you have O(n^2) behavior. On the other hand, if
realloc were called with a 1.5 growth factor, rather than to the currently
needed size, you have reduced the complexity to O(n log n).
The second is a fatal off-by-one bug. For a demonstration of the bug, in
cygwin, this command causes bash to coredump, due to its use of asprintf to
implement the bash printf builtin. Basically, asprintf is writing beyond the
array bounds of any printed string greater than BUFSIZ bytes, and when that
write happens to be beyond a malloc buffer boundary, it corrupts data such that
free'ing the allocated string goes haywire:
$ bash
$ printf '%s\n' `perl -e 'printf "a"x1028'` > /dev/null
Aborted (core dumped)
$
One other thing to consider, but that this patch does not address: currently,
asprintf ALWAYS allocates a buffer of at least BUFSIZ bytes (1024 on cygwin),
even for something like asprintf(&s,"") that only uses 1 byte, as part of the
cantwrite() check at the start of vfprintf. Is it worth trying to tweak things
so that it initially allocates a smaller size, such as 64 bytes, to decrease
wasted memory overhead on small strings, not to mention the fact that some
malloc implementations have better strategies for small strings than for large
strings, in an effort to improve performance?
2006-11-29 Eric Blake <ebb9@byu.net>
* libc/stdio/fvwrite.c (__sfvwrite_r): Avoid off-by-one error in
asprintf, as well as quadratic realloc behavior.
Index: libc/stdio/fvwrite.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdio/fvwrite.c,v
retrieving revision 1.8
diff -u -u -p -r1.8 fvwrite.c
--- libc/stdio/fvwrite.c 14 Jun 2006 20:49:11 -0000 1.8
+++ libc/stdio/fvwrite.c 29 Nov 2006 17:54:15 -0000
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 1990 The Regents of the University of California.
+ * Copyright (c) 1990, 2006 The Regents of the University of California.
* All rights reserved.
*
* Redistribution and use in source and binary forms are permitted
@@ -127,13 +127,23 @@ _DEFUN(__sfvwrite_r, (ptr, fp, uio),
w = fp->_w;
if (fp->_flags & __SSTR)
{
- if (len > w && fp->_flags & __SMBF)
+ if (len >= w && fp->_flags & __SMBF)
{ /* must be asprintf family */
unsigned char *ptr;
int curpos = (fp->_p - fp->_bf._base);
+ /* Choose a geometric growth factor to avoid
+ quadratic realloc behavior, but use a rate less
+ than (1+sqrt(5))/2 to accomodate malloc
+ overhead. asprintf EXPECTS us to overallocate, so
+ that it can add a trailing \0 without
+ reallocating. The new allocation should thus be
+ max(prev_size*1.5, curpos+len+1). */
+ int newsize = fp->_bf._size * 3 / 2;
+ if (newsize < curpos + len + 1)
+ newsize = curpos + len + 1;
ptr = (unsigned char *)_realloc_r (_REENT,
fp->_bf._base,
- curpos + len);
+ newsize);
if (!ptr)
{
/* Free buffer which is no longer used. */
@@ -142,8 +152,9 @@ _DEFUN(__sfvwrite_r, (ptr, fp, uio),
}
fp->_bf._base = ptr;
fp->_p = ptr + curpos;
- fp->_bf._size = curpos + len;
- w = fp->_w = len;
+ fp->_bf._size = newsize;
+ w = len;
+ fp->_w = newsize - curpos;
}
if (len < w)
w = len;