This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[RFC][BZ #16009] Memory handling in strxfrm_l()

Hello everybody,

by reading through the strxfrm_l function to solve a possible buffer overflow bug ( I found some other improvable things and would like to have some feedback how to proceed.

1. Line 133: Empty string handling. This could be placed at the very start of the function. Instead of determining the strlen() of the input one could check if the first byte is \0 -> faster special path.

2. Line 151: Size of needed buffer for weights and rule cache. This is calculated as (srclen + 1) * (sizeof (int32_t) + 1). Since srclen and the malloc parameter are both size_t this will cause an integer overflow for large enough strings and result in a too-small buffer which is written beyond its boundary (this is actually bug 16009).

3. Lines 153, 169, 170: Repetition of buffer size calculation. I would like to introduce a variable to calculate it just ones.

4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l tries to allocate the memory on the stack. This looks a bit weird to me since if the needed memory is too large for the heap then it's probably too large for the stack. Also alloca has a bad fail behaviour, best case it terminates the program with a stack overflow error, otherwise the "program behavior is undefined" (man page). So I think one should avoid allocation attempts larger than __libc_use_alloca recommends in any case.

5. Handling of failed memory allocation. Since the last ressort in the current implementation is a stack overflow there is currently no "function could not compute" path. I'd like to add that in case the needed buffer is too large at all (regarding size_t) or too large for alloca + malloc fails. For the return value the man page states: "The strxfrm() function returns the number of bytes required to store the transformed string in dest excluding the terminating null byte ('\0'). If the value returned is n or more, the contents of dest are indeterminate." - Does that imply that returning n is a good idea to communicate an error?

6. Use the given parameter n instead of srclen = strlen(src) for computing the buffer size. Since only n bytes are written into dst the number of weights and rules to cache should be limited by this parameter and calling strlen could be avoided. Does this sound reasonable?


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]