This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
- From: Leonhard Holz <leonhard dot holz at web dot de>
- To: libc-alpha at sourceware dot org
- Date: Fri, 21 Nov 2014 11:23:31 +0100
- Subject: Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
- Authentication-results: sourceware.org; auth=none
- References: <546DCC25 dot 4020808 at web dot de> <20141121040317 dot GT22465 at brightrain dot aerifal dot cx>
Am 21.11.2014 05:03, schrieb Rich Felker:
On Thu, Nov 20, 2014 at 12:10:29PM +0100, Leonhard Holz wrote:
4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l
tries to allocate the memory on the stack. This looks a bit weird to
me since if the needed memory is too large for the heap then it's
probably too large for the stack. Also alloca has a bad fail
behaviour, best case it terminates the program with a stack overflow
error, otherwise the "program behavior is undefined" (man page). So
I think one should avoid allocation attempts larger than
__libc_use_alloca recommends in any case.
Indeed, this is a serious bug.
I took a look at __libc_use_alloca and it's just
extern inline int __libc_use_alloca (size_t size)
{
return size <= __MAX_ALLOCA_CUTOFF;
}
Should'nt it consider the currently used stack size to see if there is
enough stack memory left? Or do I have a too simple idea of OS stack
memory handling...
5. Handling of failed memory allocation. Since the last ressort in
the current implementation is a stack overflow there is currently no
"function could not compute" path. I'd like to add that in case the
needed buffer is too large at all (regarding size_t) or too large
for alloca + malloc fails. For the return value the man page states:
"The strxfrm() function returns the number of bytes required to
store the transformed string in dest excluding the terminating null
byte ('\0'). If the value returned is n or more, the contents of
dest are indeterminate." - Does that imply that returning n is a
good idea to communicate an error?
No, the function is not permitted to return an error; it's required by
ISO C to produce a result. Falsely reporting that it needs more space
for the result, and thereby causing the caller to keep allocating
larger and larger buffers until it runs out of memory itself, is not
valid; in particular, it could report different needed lengths for the
same string at different calls in the ame program with the same
locale.
If strcoll_l is using an algorithm that requires allocation, this
needs to be fixed -- there's no fundamental reason it needs to
allocate.
Ok. It is no big deal to add a none-allocating path but the question
than is when to use it. We could stick to the current implementation and
just try to malloc() if the stack is not available but personally I
would not want strxfrm to even try to allocate memory beyond a certain
amount. Considering that __MAX_ALLOCA_CUTOFF is actually 64KB so that
strings up to 12.8KB could have a stack based index & rules cache one
could maybe avoid malloc() at all without hurting most real world use cases.
So there are six possible scenarios:
1. Do not allocate a cache. This slows performance but avoids a call to
strlen() and duplicate code (cached & uncached version).
2. Do allocate a cache on the stack only.
3. Do allocate on the stack first and otherwise malloc.
4. Do allocate on the stack first and otherwise malloc if the cache size
is below a treshold.
5. Allocate the cache via malloc only.
6. Allocate the cache via malloc only if the size is below a treshold.
Leonhard