This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
No, the function is not permitted to return an error; it's required by ISO C to produce a result. Falsely reporting that it needs more space for the result, and thereby causing the caller to keep allocating larger and larger buffers until it runs out of memory itself, is not valid; in particular, it could report different needed lengths for the same string at different calls in the same program with the same locale. If strcoll_l is using an algorithm that requires allocation, this needs to be fixed -- there's no fundamental reason it needs to allocate.

Ok. It is no big deal to add a non-allocating path, but the question then is when to use it. We could stick to the current implementation and just try to malloc() if the stack is not available, but personally I would not want strxfrm to even try to allocate memory beyond a certain amount. Considering that __MAX_ALLOCA_CUTOFF is actually 64KB, so that strings up to 12.8KB could have a stack-based index & rules cache, one could maybe avoid malloc() altogether without hurting most real-world use cases.

You could also cache only the last 16k characters on the stack, and if the function goes beyond that, recompute them or switch to the uncached version.
Thank you all for the feedback. There are two things I overlooked: strxfrm needs to process the whole src string because it has to return the needed dest length in any case, and the weight-indices cache is modified while traversing the string. So it's not possible to use a sliding-window approach or to restrict the cache size based on dest length.
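For readers less familiar with the contract being discussed, here is a minimal sketch of the usual two-call pattern: ask strxfrm for the needed length with a zero-sized buffer, allocate, then transform. The helper name xfrm_dup is hypothetical and not part of glibc; only strxfrm, malloc and free are real library calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): returns a malloc'd buffer
   holding the transformed form of src, or NULL on failure.  */
char *xfrm_dup(const char *src)
{
    assert(src != NULL);

    /* With n == 0 the destination may be NULL; the return value is the
       length needed, not counting the terminating null byte.  */
    size_t needed = strxfrm(NULL, src, 0);

    char *dest = malloc(needed + 1);
    if (dest == NULL)
        return NULL;

    /* Same string, same locale: the second call is expected to report
       the same length as the first one.  */
    size_t written = strxfrm(dest, src, needed + 1);
    if (written != needed) {
        free(dest);
        return NULL;
    }
    return dest;
}
```

A caller would compare two such buffers with strcmp() and free() them afterwards. Because the first call must already return the full length, the implementation has to walk the entire source string even when no output is written.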
I also agree that strxfrm is a function for pre-computing things that need to be fast somewhere else, so performance is not the highest priority. Anyway, the "faster" approach is already implemented, so why not reuse it?
My proposal now is the following:
* allocate a fixed-size cache array on the stack (e.g. 20KB, supporting strings up to 4000 characters)
* fill it with values until either the end of the string is reached or the cache is full
* go with the cached version if the end of the string is reached
* go with the uncached version if not

This avoids strlen() + malloc() and is "fast" for standard real-world cases like word sorting and solid for large strings.
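The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the glibc code: toy_index and toy_rule are hypothetical stand-ins for the locale table lookups, the two-level output format is invented for the example, and the 4-byte index plus 1-byte rule per character is only an assumption chosen to match the 20KB / 4000 characters figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CACHE_ENTRIES = 4000 };   /* 4000 * (4 + 1) bytes, about 20KB */

/* Hypothetical stand-ins for the collation table lookups.  */
static int32_t toy_index(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

static unsigned char toy_rule(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? '2' : '1';
}

/* Store one byte if it fits (leaving room for the terminator) and
   count it either way, so the full length is always known.  */
static void put(char *dest, size_t destsize, size_t *len, char byte)
{
    if (*len + 1 < destsize)
        dest[*len] = byte;
    ++*len;
}

static void terminate(char *dest, size_t destsize, size_t len)
{
    if (destsize > 0)
        dest[len < destsize ? len : destsize - 1] = '\0';
}

/* Both passes read the cached values; no table lookups are repeated.  */
static size_t xfrm_cached(const int32_t *idx, const unsigned char *rule,
                          size_t n, char *dest, size_t destsize)
{
    size_t len = 0;
    for (size_t i = 0; i < n; i++)
        put(dest, destsize, &len, (char)idx[i]);
    put(dest, destsize, &len, '\1');          /* level separator */
    for (size_t i = 0; i < n; i++)
        put(dest, destsize, &len, (char)rule[i]);
    terminate(dest, destsize, len);
    return len;
}

/* Same output, but every pass redoes the lookups from src.  */
static size_t xfrm_uncached(const char *src, char *dest, size_t destsize)
{
    size_t len = 0;
    for (const char *p = src; *p != '\0'; p++)
        put(dest, destsize, &len, (char)toy_index((unsigned char)*p));
    put(dest, destsize, &len, '\1');
    for (const char *p = src; *p != '\0'; p++)
        put(dest, destsize, &len, (char)toy_rule((unsigned char)*p));
    terminate(dest, destsize, len);
    return len;
}

/* Fill the fixed stack cache; pick the path depending on whether the
   end of the string was reached before the cache filled up.  */
size_t sketch_xfrm(char *dest, const char *src, size_t destsize,
                   int *used_cache)
{
    int32_t idx[CACHE_ENTRIES];
    unsigned char rule[CACHE_ENTRIES];
    size_t n = 0;

    assert(src != NULL);
    while (src[n] != '\0' && n < CACHE_ENTRIES) {
        unsigned char c = (unsigned char)src[n];
        idx[n] = toy_index(c);
        rule[n] = toy_rule(c);
        n++;
    }

    int fits = (src[n] == '\0');
    if (used_cache != NULL)
        *used_cache = fits;
    return fits ? xfrm_cached(idx, rule, n, dest, destsize)
                : xfrm_uncached(src, dest, destsize);
}
```

Neither path calls strlen() or malloc(), and both return the full needed length even when destsize is 0. The used_cache out-parameter exists only so the example can show which path was taken.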
Leonhard