Bug 2753 - Integer overflow in bsearch
Summary: Integer overflow in bsearch
Status: REOPENED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: 2.20
: P2 normal
Target Milestone: ---
Assignee: Ulrich Drepper
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-10 20:38 UTC by Robin Green
Modified: 2016-01-12 14:27 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
possible patch an preformance optimization (689 bytes, text/x-csrc)
2015-04-27 19:45 UTC, will bradshaw
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Robin Green 2006-06-10 20:38:47 UTC
In the code for bsearch, if l and u are both very, very large, this line in
libc/stdlib/bsearch.c calculates the midpoint incorrectly due to an integer
overflow:

 idx = (l + u) / 2;
Comment 1 Ulrich Drepper 2006-06-13 15:27:05 UTC
You do not even understand how binary searching works, do you?  The sum can
never exceed nmemb and nmemb obviously fits into an size_t.
Comment 2 Ulrich Drepper 2006-06-13 15:29:36 UTC
I mean a valid nmemb size.  It can never be too large.  If you pass in garbage,
you get out garbage.  There is nothing wrong with that.
Comment 3 Robin Green 2006-06-13 17:45:31 UTC
The point is, the sum l + u obviously can exceed nmemb, because if the
searched-for value is at the end, on the first iteration l is increased and u
stays equal to nmemb. At this point in the execution, _prior_ to dividing by 2,
integer overflow can occur.
Comment 4 Ulrich Drepper 2006-08-12 20:33:29 UTC
No, l + u cannot overflow since a) arrays of 1 byte values don't make any sense
(same for 2 bytes and probably 3 bytes) and b) for every record size >= 2 we
don't overflow.  I'm not adding useless changes for bogus code.
Comment 5 Ulf Magnusson 2014-11-13 10:13:43 UTC
For what it's worth, here's a test case that causes bsearch() to hang when sizeof(size_t) == 4:


#include <stdlib.h>
#include <string.h>

#define LEN 0x80000001

static int compar(const void *a, const void *b) {
	return *(char*)a - *(char*)b;
}

int main() {
	char *arr = malloc(LEN), key = 1;
	memset(arr, 0, LEN);
	bsearch(&key, arr, LEN, 1, compar);
}


The bug could be fixed by replacing '__idx = (__l + __u) / 2' with '__idx = __l + (__u - __l)/2' in bits/stdlib-bsearch.h. I don't see a good reason not to.
Comment 6 Ondrej Bilka 2014-11-25 21:54:07 UTC
> The bug could be fixed by replacing '__idx = (__l + __u) / 2' with '__idx = __l > + (__u - __l)/2' in bits/stdlib-bsearch.h. I don't see a good reason not to.

A good reason is that it causes a performance penalty. To trigger overflow you need to fill more than half of address space with bytes. As it needs to be sorted one could just simulate it by array with 256 indexes where that number starts. Anyone that triggers that scenario should be fired for incompetence.
Comment 7 will bradshaw 2015-04-27 19:45:39 UTC
Created attachment 8275 [details]
possible patch an preformance optimization

(In reply to Ondrej Bilka from comment #6)
> > The bug could be fixed by replacing '__idx = (__l + __u) / 2' with '__idx = __l > + (__u - __l)/2' in bits/stdlib-bsearch.h. I don't see a good reason not to.
> 
> A good reason is that it causes a performance penalty. To trigger overflow
> you need to fill more than half of address space with bytes. As it needs to
> be sorted one could just simulate it by array with 256 indexes where that
> number starts. Anyone that triggers that scenario should be fired for
> incompetence.

I agree with you on the performance issue. But upon careful review of the code a few performance optimizations makes this a more relevant issue.

By hoisting the calculation of array addresses from the loop the implementation can be made slightly faster, but as a result math is done on pointer addresses directly causing the  (__l+__u)/2 to __l-(__u-__l)/2 'bug' to become a mandatory fix.