In the code for bsearch, if l and u are both very, very large, this line in libc/stdlib/bsearch.c calculates the midpoint incorrectly due to an integer overflow: idx = (l + u) / 2;
You do not even understand how binary searching works, do you? The sum can never exceed nmemb and nmemb obviously fits into an size_t.
I mean a valid nmemb size. It can never be too large. If you pass in garbage, you get out garbage. There is nothing wrong with that.
The point is, the sum l + u obviously can exceed nmemb, because if the searched-for value is at the end, on the first iteration l is increased and u stays equal to nmemb. At this point in the execution, _prior_ to dividing by 2, integer overflow can occur.
No, l + u cannot overflow since a) arrays of 1 byte values don't make any sense (same for 2 bytes and probably 3 bytes) and b) for every record size >= 2 we don't overflow. I'm not adding useless changes for bogus code.
For what it's worth, here's a test case that causes bsearch() to hang when sizeof(size_t) == 4: #include <stdlib.h> #include <string.h> #define LEN 0x80000001 static int compar(const void *a, const void *b) { return *(char*)a - *(char*)b; } int main() { char *arr = malloc(LEN), key = 1; memset(arr, 0, LEN); bsearch(&key, arr, LEN, 1, compar); } The bug could be fixed by replacing '__idx = (__l + __u) / 2' with '__idx = __l + (__u - __l)/2' in bits/stdlib-bsearch.h. I don't see a good reason not to.
> The bug could be fixed by replacing '__idx = (__l + __u) / 2' with '__idx = __l > + (__u - __l)/2' in bits/stdlib-bsearch.h. I don't see a good reason not to. A good reason is that it causes a performance penalty. To trigger overflow you need to fill more than half of address space with bytes. As it needs to be sorted one could just simulate it by array with 256 indexes where that number starts. Anyone that triggers that scenario should be fired for incompetence.
Created attachment 8275 [details] possible patch an preformance optimization (In reply to Ondrej Bilka from comment #6) > > The bug could be fixed by replacing '__idx = (__l + __u) / 2' with '__idx = __l > + (__u - __l)/2' in bits/stdlib-bsearch.h. I don't see a good reason not to. > > A good reason is that it causes a performance penalty. To trigger overflow > you need to fill more than half of address space with bytes. As it needs to > be sorted one could just simulate it by array with 256 indexes where that > number starts. Anyone that triggers that scenario should be fired for > incompetence. I agree with you on the performance issue. But upon careful review of the code a few performance optimizations makes this a more relevant issue. By hoisting the calculation of array addresses from the loop the implementation can be made slightly faster, but as a result math is done on pointer addresses directly causing the (__l+__u)/2 to __l-(__u-__l)/2 'bug' to become a mandatory fix.