This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC][BZ #16009] Memory handling in strxfrm_l()



Am 21.11.2014 05:03, schrieb Rich Felker:
On Thu, Nov 20, 2014 at 12:10:29PM +0100, Leonhard Holz wrote:
4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l
tries to allocate the memory on the stack. This looks a bit weird to
me since if the needed memory is too large for the heap then it's
probably too large for the stack. Also alloca has a bad fail
behaviour, best case it terminates the program with a stack overflow
error, otherwise the "program behavior is undefined" (man page). So
I think one should avoid allocation attempts larger than
__libc_use_alloca recommends in any case.

Indeed, this is a serious bug.


I took a look at __libc_use_alloca and it's just

extern inline int __libc_use_alloca (size_t size)
{
  return size <= __MAX_ALLOCA_CUTOFF;
}

Should'nt it consider the currently used stack size to see if there is enough stack memory left? Or do I have a too simple idea of OS stack memory handling...

5. Handling of failed memory allocation. Since the last ressort in
the current implementation is a stack overflow there is currently no
"function could not compute" path. I'd like to add that in case the
needed buffer is too large at all (regarding size_t) or too large
for alloca + malloc fails. For the return value the man page states:
"The strxfrm() function returns the number of bytes required to
store the transformed string in dest excluding the terminating null
byte ('\0'). If the value returned is n or more, the contents of
dest are indeterminate." - Does that imply that returning n is a
good idea to communicate an error?

No, the function is not permitted to return an error; it's required by
ISO C to produce a result. Falsely reporting that it needs more space
for the result, and thereby causing the caller to keep allocating
larger and larger buffers until it runs out of memory itself, is not
valid; in particular, it could report different needed lengths for the
same string at different calls in the ame program with the same
locale.

If strcoll_l is using an algorithm that requires allocation, this
needs to be fixed -- there's no fundamental reason it needs to
allocate.


Ok. It is no big deal to add a none-allocating path but the question than is when to use it. We could stick to the current implementation and just try to malloc() if the stack is not available but personally I would not want strxfrm to even try to allocate memory beyond a certain amount. Considering that __MAX_ALLOCA_CUTOFF is actually 64KB so that strings up to 12.8KB could have a stack based index & rules cache one could maybe avoid malloc() at all without hurting most real world use cases.

So there are six possible scenarios:

1. Do not allocate a cache. This slows performance but avoids a call to strlen() and duplicate code (cached & uncached version).

2. Do allocate a cache on the stack only.

3. Do allocate on the stack first and otherwise malloc.

4. Do allocate on the stack first and otherwise malloc if the cache size is below a treshold.

5. Allocate the cache via malloc only.

6. Allocate the cache via malloc only if the size is below a treshold.

Leonhard


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]