As pointed out by someone at <http://marc.info/?l=gimp-developer&m=129567990905823&w=2>, the scanf implementation of glibc will crash when given input containing a lot of digits.
This is the sample code copied from the post mentioned above:
Expected output none; actual output:
$ perl -e 'print "5"x21000000' | ./a.out
Tested and reproduced on:
RHEL 5.7 (x86_64)
Debian Squeeze (armv5tel)
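The report boils down to a single numeric conversion fed an extremely long digit string. A minimal sketch of such a reader follows. It is not the code from the linked post: the `%ld` conversion is an assumption (the conversion used there is not shown here), and `scan_long` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the code from the linked post: read one
 * decimal integer from `in` with a single fscanf conversion.
 * Returns the fscanf result: 1 on a successful conversion, 0 on a
 * matching failure, EOF if input ends before any conversion.
 */
static int scan_long(FILE *in, long *out)
{
    assert(in != NULL && out != NULL);
    /* "%ld" is an assumption. The value stored for out-of-range
       input is not something this sketch relies on. */
    return fscanf(in, "%ld", out);
}
```

Calling `scan_long(stdin, &value)` from `main` and piping the `perl` output above into the resulting binary should exercise the same kind of long digit run described in the report.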
Same thing on Fedora 14 (x86_64).
Good news everyone:
ISO C99 §7.19.6.2 item 10 says:
"[...] the result of the conversion is placed in the object pointed to by [...]. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined."
So the standard permits the crash; problem solved.
I'll leave this bug open though, so that alternatives to segfaulting can be considered.
POSIX uses the clearer wording "or if the result of the conversion cannot be represented in the space provided" rather than "... in the object". In either case, I believe this refers to string conversions that overflow the destination buffer, not to numeric conversions. I can't find any language covering what happens when a numeric value is outside the range of the type, but the expected form of the input is specified in terms of strtol, etc., so it would not be unreasonable to expect scanf to behave the same way as those functions.
By the way, can the bug be reproduced with a huge string of zeros? If so, the numeric overflow issue is irrelevant and the behavior is definitely well-defined by the standard.
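For comparison, this is how strtol itself treats the two inputs discussed here. The sketch below uses two hypothetical helpers, `repeat_digit` and `parse_long`. It shows only the documented strtol behaviour: an out-of-range value yields LONG_MAX with errno set to ERANGE, while a long run of zeros is simply the value 0. It says nothing about what scanf does internally.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocated string of n copies of `digit`. */
static char *repeat_digit(char digit, size_t n)
{
    char *s = malloc(n + 1);
    if (s == NULL)
        return NULL;
    memset(s, digit, n);
    s[n] = '\0';
    return s;
}

/*
 * Hypothetical helper: parse `text` as a base-10 long with strtol.
 * Returns 0 if the whole string was consumed, -1 otherwise.
 * *range_error is set to 1 when strtol reported ERANGE, else 0.
 */
static int parse_long(const char *text, long *value, int *range_error)
{
    char *end;

    assert(text != NULL && value != NULL && range_error != NULL);
    errno = 0;                      /* strtol only sets errno on error */
    *value = strtol(text, &end, 10);
    *range_error = (errno == ERANGE);
    return (end != text && *end == '\0') ? 0 : -1;
}
```

With a string of 100000 fives, `parse_long` reports a range error and stores LONG_MAX; with 100000 zeros it stores 0 and reports no error, which is why the zeros case takes numeric overflow out of the picture.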
(In reply to comment #3)
> By the way, can the bug be reproduced with a huge string of zeros?
Yes it can.
I checked in a patch.
The fix is part of glibc 2.15. The issue had been present since the dawn of time.
Follow-up fix: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=20b38e0
(In reply to Andreas Schwab from comment #7)
> Follow-up fix: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=20b38e0
I think this is just a performance fix: the buffer is populated one character at a time, so __libc_use_alloca is always true on the first buffer extension, and after that we are in the use_malloc case.
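The general pattern under discussion, a buffer filled one character at a time that starts on the stack and moves to the heap once it outgrows it, can be sketched as follows. This is a generic illustration and not the glibc code: `struct charbuf`, its functions, and the `INLINE_CAP` threshold are all hypothetical, and a fixed inline array stands in for alloca.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_CAP 64               /* hypothetical stack-side capacity */

struct charbuf {
    char inline_storage[INLINE_CAP]; /* stands in for stack allocation */
    char *data;                      /* inline_storage or a heap block */
    size_t len;
    size_t cap;
    int on_heap;                     /* nonzero once data is malloc'ed */
};

static void charbuf_init(struct charbuf *b)
{
    b->data = b->inline_storage;
    b->len = 0;
    b->cap = INLINE_CAP;
    b->on_heap = 0;
}

/* Append one character, growing the buffer if needed.
   Returns 0 on success, -1 if memory could not be allocated. */
static int charbuf_push(struct charbuf *b, char c)
{
    assert(b != NULL);
    if (b->len == b->cap) {
        size_t new_cap = b->cap * 2;
        char *p;

        if (b->on_heap) {
            /* Already on the heap: grow in place if possible. */
            p = realloc(b->data, new_cap);
            if (p == NULL)
                return -1;
        } else {
            /* First extension: move from inline storage to the heap. */
            p = malloc(new_cap);
            if (p == NULL)
                return -1;
            memcpy(p, b->data, b->len);
            b->on_heap = 1;
        }
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    return 0;
}

static void charbuf_free(struct charbuf *b)
{
    if (b->on_heap)
        free(b->data);
}
```

The point of the heap fallback is that the stack side stays bounded however many digits arrive; an allocation failure surfaces as an error return instead of a stack overflow.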
It's still a logic error.
(In reply to Andreas Schwab from comment #9)
> It's still a logic error.
What do you mean? Clearly, the code does not do what is intended, but as far as I can tell, there is no observable impact whatsoever (not even performance).
A logic error is an error in the logic.