This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[RFC][BZ #16009] Memory handling in strxfrm_l()
- From: Leonhard Holz <leonhard dot holz at web dot de>
- To: libc-alpha at sourceware dot org
- Date: Thu, 20 Nov 2014 12:10:29 +0100
- Subject: [RFC][BZ #16009] Memory handling in strxfrm_l()
- Authentication-results: sourceware.org; auth=none
Hello everybody,
by reading through the strxfrm_l function to solve a possible buffer
overflow bug (https://sourceware.org/bugzilla/show_bug.cgi?id=16009) I
found some other improvable things and would like to have some feedback
how to proceed.
1. Line 133: Empty string handling. This could be placed at the very
start of the function. Instead of determining the strlen() of the input
one could check if the first byte is \0 -> faster special path.
2. Line 151: Size of needed buffer for weights and rule cache. This is
calculated as (srclen + 1) * (sizeof (int32_t) + 1). Since srclen and
the malloc parameter are both size_t this will cause an integer overflow
for large enough strings and result in a too-small buffer which is
written beyond its boundary (this is actually bug 16009).
3. Lines 153, 169, 170: Repetition of buffer size calculation. I would
like to introduce a variable to calculate it just ones.
4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l
tries to allocate the memory on the stack. This looks a bit weird to me
since if the needed memory is too large for the heap then it's probably
too large for the stack. Also alloca has a bad fail behaviour, best case
it terminates the program with a stack overflow error, otherwise the
"program behavior is undefined" (man page). So I think one should avoid
allocation attempts larger than __libc_use_alloca recommends in any case.
5. Handling of failed memory allocation. Since the last ressort in the
current implementation is a stack overflow there is currently no
"function could not compute" path. I'd like to add that in case the
needed buffer is too large at all (regarding size_t) or too large for
alloca + malloc fails. For the return value the man page states: "The
strxfrm() function returns the number of bytes required to store the
transformed string in dest excluding the terminating null byte ('\0').
If the value returned is n or more, the contents of dest are
indeterminate." - Does that imply that returning n is a good idea to
communicate an error?
6. Use the given parameter n instead of srclen = strlen(src) for
computing the buffer size. Since only n bytes are written into dst the
number of weights and rules to cache should be limited by this parameter
and calling strlen could be avoided. Does this sound reasonable?
Best,
Leonhard