This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
- From: OndÅej BÃlka <neleai at seznam dot cz>
- To: Leonhard Holz <leonhard dot holz at web dot de>
- Cc: libc-alpha at sourceware dot org
- Date: Sun, 23 Nov 2014 00:39:07 +0100
- Subject: Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
- Authentication-results: sourceware.org; auth=none
- References: <546DCC25 dot 4020808 at web dot de> <20141121040317 dot GT22465 at brightrain dot aerifal dot cx> <546F12A3 dot 5030204 at web dot de>
On Fri, Nov 21, 2014 at 11:23:31AM +0100, Leonhard Holz wrote:
>
> Am 21.11.2014 05:03, schrieb Rich Felker:
> >On Thu, Nov 20, 2014 at 12:10:29PM +0100, Leonhard Holz wrote:
> >>4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l
> >>tries to allocate the memory on the stack. This looks a bit weird to
> >>me since if the needed memory is too large for the heap then it's
> >>probably too large for the stack. Also alloca has a bad fail
> >>behaviour, best case it terminates the program with a stack overflow
> >>error, otherwise the "program behavior is undefined" (man page). So
> >>I think one should avoid allocation attempts larger than
> >>__libc_use_alloca recommends in any case.
> >
> >Indeed, this is a serious bug.
> >
>
> I took a look at __libc_use_alloca and it's just
>
> extern inline int __libc_use_alloca (size_t size)
> {
> return size <= __MAX_ALLOCA_CUTOFF;
> }
>
> Should'nt it consider the currently used stack size to see if there
> is enough stack memory left? Or do I have a too simple idea of OS
> stack memory handling...
>
> >>5. Handling of failed memory allocation. Since the last ressort in
> >>the current implementation is a stack overflow there is currently no
> >>"function could not compute" path. I'd like to add that in case the
> >>needed buffer is too large at all (regarding size_t) or too large
> >>for alloca + malloc fails. For the return value the man page states:
> >>"The strxfrm() function returns the number of bytes required to
> >>store the transformed string in dest excluding the terminating null
> >>byte ('\0'). If the value returned is n or more, the contents of
> >>dest are indeterminate." - Does that imply that returning n is a
> >>good idea to communicate an error?
> >
> >No, the function is not permitted to return an error; it's required by
> >ISO C to produce a result. Falsely reporting that it needs more space
> >for the result, and thereby causing the caller to keep allocating
> >larger and larger buffers until it runs out of memory itself, is not
> >valid; in particular, it could report different needed lengths for the
> >same string at different calls in the ame program with the same
> >locale.
> >
> >If strcoll_l is using an algorithm that requires allocation, this
> >needs to be fixed -- there's no fundamental reason it needs to
> >allocate.
> >
>
> Ok. It is no big deal to add a none-allocating path but the question
> than is when to use it. We could stick to the current implementation
> and just try to malloc() if the stack is not available but
> personally I would not want strxfrm to even try to allocate memory
> beyond a certain amount. Considering that __MAX_ALLOCA_CUTOFF is
> actually 64KB so that strings up to 12.8KB could have a stack based
> index & rules cache one could maybe avoid malloc() at all without
> hurting most real world use cases.
>
> So there are six possible scenarios:
>
> 1. Do not allocate a cache. This slows performance but avoids a call
> to strlen() and duplicate code (cached & uncached version).
>
> 2. Do allocate a cache on the stack only.
>
> 3. Do allocate on the stack first and otherwise malloc.
>
> 4. Do allocate on the stack first and otherwise malloc if the cache
> size is below a treshold.
>
> 5. Allocate the cache via malloc only.
>
> 6. Allocate the cache via malloc only if the size is below a treshold.
>
You could also only cache last 16k characters on stack and if function
goes beyond that then recompute these / switch to uncached version.