Bug 15929 - scanf doesn't set errno to ERANGE for unsigned short overflows
Summary: scanf doesn't set errno to ERANGE for unsigned short overflows
Status: RESOLVED INVALID
Alias: None
Product: glibc
Classification: Unclassified
Component: stdio (show other bugs)
Version: 2.18
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-04 11:55 UTC by Filipe Gonçalves
Modified: 2016-09-07 22:26 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Sample code that is able to reproduce the bug (222 bytes, text/x-csrc)
2013-09-04 11:55 UTC, Filipe Gonçalves
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Filipe Gonçalves 2013-09-04 11:55:07 UTC
Created attachment 7186 [details]
Sample code that is able to reproduce the bug

Overview:
When scanf is called with %hu to parse an unsigned short from input, ERANGE is not being set for values that do not fit inside an unsigned short. 

Steps to reproduce:
Compile the attached program. Run the program and try entering different values. On a machine with 16-bit shorts, entering a number that a short can't store, like 70000, will not trigger errno to be set to ERANGE.

Actual results:
errno is not set to ERANGE. The value read wraps around and there's no way for the caller to know if there has been an overflow. In a system with n-bit shorts, if number x is read, the actual value stored is x%(2^n) (it wraps around).

Expected results:
errno should have been set to ERANGE.

Build date & hardware:
Tested with debian's built-in glibc 2.13, and also with the latest version, 2.18, compiled from official source.
filipe@debian:~/glibc-build$ uname -a
Linux debian 3.2.0-4-486 #1 Debian 3.2.46-1 i686 GNU/Linux

Additional information:
I happened to notice that errno is correctly set if the value entered is greater than INT_MAX. So, apparently, it looks like scanf ignores the fact that it is given a pointer to unsigned short, treating it as pointer to int instead. Thus, every value between USHRT_MAX and INT_MAX is assumed to fit into a short. Here's an example in my system:
$ ./bug
Enter a number: 65535
You entered 65535
Enter a number: 65536
You entered 0
Enter a number: 2000000
You entered 33920
Enter a number: 2000000000000000000000000000000000000000000000000
Value is too large, try again.
Enter a number:
Comment 1 Andreas Schwab 2013-09-04 12:42:18 UTC
POSIX does not specify ERANGE for fscanf.  The ERANGE error comes from strtoul which checks against ULONG_MAX (not UINT_MAX).  If you want to do range checking you have to do it yourself.
Comment 2 Filipe Gonçalves 2013-09-04 13:59:20 UTC
(In reply to Andreas Schwab from comment #1)
> POSIX does not specify ERANGE for fscanf.  The ERANGE error comes from
> strtoul which checks against ULONG_MAX (not UINT_MAX).  If you want to do
> range checking you have to do it yourself.

Well, the manpage contradicts itself, it mentions that ERANGE is used when "The result of an integer conversion would exceed the size that can be stored in the corresponding integer type.", but later on it says that the standards to which it conforms to do not mention ERANGE. What do we have then?

If I call scanf, I can't do range checking, by the time it returns, how am I supposed to know if there was an overflow?

If ERANGE is not mentioned in the standards and that serves as an excuse on why scanf doesn't deal with it properly, then it might be better to remove that from the "ERRORS" section, otherwise, we have an awkward situation here, where it kind of works, but when it doesn't, it's not a bug because the standard doesn't mention it.

But ok then, I'll just code my own function to read an unsigned short and forget that scanf exists.
Comment 3 Andreas Schwab 2013-09-04 20:32:21 UTC
There are no manpages as part of glibc, you should report this to the provider of them (probably the man-pages project).

scanf is really only useful for very simple interaction.  If you want more sophisticated behaviour you need to parse the input yourself.
Comment 4 Carlos O'Donell 2013-09-06 06:15:46 UTC
Closing this as invalid since the code and manual are correct. Changes to the POSIX standard need to go through the Austin Group.

Please note that manual pages, likely the Linux kernel manual pages, are not the canonical source of documentation for glibc.
Comment 5 benjaminmoody 2016-09-07 22:26:49 UTC
POSIX doesn't specify anything.  Both POSIX and the C standard are
transparently idiotic on this point.  (What do they actually say?
They say that this is *undefined behavior*.  Not
implementation-defined.  Undefined, as in "pass the wrong string to
sscanf, and the library is permitted to set your printer on fire.")

Presumably somebody, somewhere, once wrote a broken implementation of
sscanf that would crash when given a value too large as input.  And it
was decided that permitting that implementation to be called
"conformant" was more important than having a sane standard.  That's
no justification for glibc not to behave sanely.

Note that glibc already goes beyond what the standards require, in
that it properly detects range errors for 'long', 'unsigned long',
'float', and 'double' values.  If all we cared about was following the
letter of the standards, the code could easily be simplified to use
'long long', 'unsigned long long', and 'long double' for everything.

(I suppose somebody might interject at this point and yell
"performance!", but seriously, you don't use sscanf for performance.
You use it because it's more convenient and easier to read than a
pagefull of strtol's.)

*Every* use of %d, %hd, %hhd, etc., is currently a bug, unless:

 - a maximum field width is used that makes overflows impossible, or

 - the input is so well constrained that you can guarantee ahead of
   time that it can never overflow (in which case, why are you using
   scanf?)

Isn't that enough of a reason to fix it?

As to documentation, the glibc manual is not "correct": it currently
makes no mention (that I can see) of how *scanf behaves on integer or
floating-point overflows.  If you refuse to fix this bug on the
grounds that programmers are expected to do their own range-checking,
then programmers need to know which conversions are safe to use in
that way (%ld good, %d bad.)