Bug 15917

Summary: scanf %f doesn't parse "0e+0" correctly
Product: glibc Reporter: Adam Sampson <ats-sourceware>
Component: stdio Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: bugdal, erik
Priority: P2 Flags: fweimer: security-
Version: 2.18   
Target Milestone: 2.19   

Description Adam Sampson 2013-08-31 23:22:08 UTC
With glibc 2.18, if you try to parse "0e+0" with scanf's %f format, it will stop at the "e"; it ought to read the whole thing.

Here's an example:

#include <stdio.h>
int main(void) {
    float a = 0, b = 0;  /* initialized so an unparsed value still prints as 0 */
    int r = sscanf("0e+0 42", "%f %f", &a, &b);
    printf("r=%d a=%f b=%f\n", r, a, b);
    return 0;
}

With glibc 2.18, this prints:

r=1 a=0.000000 b=0.000000

With glibc 2.13 (for example), it prints what I'd expect:

r=2 a=0.000000 b=42.000000

This is caused by an oversight in this commit, which introduced got_digit etc. to the vfscanf %f code:
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=6ecec3b616aeaf121c68c1053cd17fdcf0cdb5a2
("Don't accept exp char without preceding digits in scanf float parsing")

When parsing "0e+0", the initial 0 is eaten by the code that checks for a 0x hex prefix ("if (width != 0 && c == L_('0'))"), but that code doesn't set got_digit to say it's seen a digit, so the "e" doesn't get treated as an exponent marker.

Adding "got_digit = 1;" in that block fixes it for me, but you may prefer to do something more subtle -- e.g. following the intention of the patch that introduced the bug, it should presumably reject "0xe+0" as nonsense, so you'd only want to set got_digit if the 0 was actually treated as a digit.
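
The interaction described above can be sketched with a toy scanner. This is not the glibc source: toy_scan_float and fix_leading_zero are hypothetical names made up for illustration, hex handling is left out, and only the got_digit flag is taken from the report. The function returns how many characters a decimal-float scanner of this shape would consume.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Toy model, NOT the glibc implementation. Returns the number of
   characters of s consumed as a decimal float. fix_leading_zero
   (hypothetical parameter) selects whether the leading-'0' branch
   records that a digit was seen. */
static size_t toy_scan_float(const char *s, int fix_leading_zero)
{
    size_t i = 0;
    int got_digit = 0;

    assert(s != NULL);

    if (s[i] == '+' || s[i] == '-')
        i++;

    /* A leading '0' is consumed early so that a following 'x' could be
       checked; hex floats themselves are out of scope for this sketch. */
    if (s[i] == '0') {
        i++;
        if (fix_leading_zero)
            got_digit = 1;  /* the one-line change suggested in the report */
    }

    /* Integer part, then an optional fractional part. */
    while (isdigit((unsigned char)s[i])) { i++; got_digit = 1; }
    if (s[i] == '.') {
        i++;
        while (isdigit((unsigned char)s[i])) { i++; got_digit = 1; }
    }

    /* The exponent is accepted only if a digit was recorded before it
       and at least one digit follows the optional sign. */
    if (got_digit && (s[i] == 'e' || s[i] == 'E')) {
        size_t j = i + 1;
        if (s[j] == '+' || s[j] == '-')
            j++;
        if (isdigit((unsigned char)s[j])) {
            while (isdigit((unsigned char)s[j]))
                j++;
            i = j;
        }
    }
    return i;
}
```

In this model, toy_scan_float("0e+0 42", 0) stops after the single "0", while toy_scan_float("0e+0 42", 1) consumes all of "0e+0". An input such as "1e+0" is unaffected either way, since its first digit goes through the normal digit loop.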

(In case it helps anyone searching for this problem: I spotted this because it broke ATLAS's autotuning code, because masrch wrote something like "0e+00 0e+00 0e+00" to a file with printf and then failed to scanf it back in. The error this resulted in during the ATLAS build was "xmasrch: /src/math/atlas/work/ATLAS//tune/sysinfo/masrch.c:116: matime: Assertion `fscanf(fp, "%lf", &mflop[i])' failed.")
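
For anyone wanting to check whether their libc is affected by the same round-trip problem, a minimal sketch follows. count_doubles is a hypothetical helper, not anything from ATLAS or glibc; it counts how many values "%lf" can read back from a string, using "%n" to advance past each one.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts how many doubles sscanf can read from s.
   It stops at the first matching failure or at end of input. */
static int count_doubles(const char *s)
{
    int count = 0;
    int used = 0;
    double v;

    assert(s != NULL);

    /* %n stores the number of characters consumed so far, which lets
       the next call resume right after the value just read. */
    while (sscanf(s, "%lf%n", &v, &used) == 1) {
        count++;
        s += used;
    }
    return count;
}
```

On a libc without this bug, count_doubles("0e+00 0e+00 0e+00") should return 3. With the behaviour described in this report, the first call would consume only the "0", the next call would fail on the "e", and the count would come back as 1.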
Comment 1 Adam Sampson 2013-08-31 23:26:52 UTC
Erm, it should reject something like 0x.e+0 as nonsense, that is. 0xe+0 isn't nonsense!
Comment 2 Rich Felker 2013-09-01 03:42:09 UTC
I'm confused about why this issue even happened, since parsing of hex floats should be completely separate from parsing of decimal floats (the allowed forms are sufficiently different, not to mention the computations needed to produce the stored values). I guess the leading 0 is already eaten before it's determined that the input is not hex.

Anyway, 0xe+0 is not nonsense, but parsing of course should stop just before the plus sign. Also, it should be noted that 0x.e+0 is invalid for scanf but valid for strtod (here, strtod only reads the first character).
Comment 3 Andreas Schwab 2013-10-29 13:32:00 UTC
0x.e+0 is as valid as 0xe+0, with the subject sequence being "0x.e".
Comment 4 Rich Felker 2013-10-30 01:48:08 UTC
Indeed, I missed that. My own implementation agrees with you, as does the specification, of course.
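
The subject sequences discussed in comments 2 to 4 can be checked against a given libc with a small strtod wrapper. This is a sketch only: strtod_consumed is a hypothetical helper name, and it covers strtod, not scanf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: converts the start of s with strtod, stores the
   value in *out, and returns how many characters were consumed. */
static size_t strtod_consumed(const char *s, double *out)
{
    char *end;

    assert(s != NULL && out != NULL);

    *out = strtod(s, &end);
    return (size_t)(end - s);
}
```

In a hexadecimal float the exponent marker is "p", so "e" is just a hex digit. On a C99-conforming strtod, "0xe+0" should therefore consume 3 characters and yield 14, and "0x.e+0" should consume 4 characters ("0x.e") and yield 0.875, stopping just before the plus sign in both cases.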
Comment 5 Andreas Schwab 2013-10-31 11:58:59 UTC
Fixed by a4966c6.
Comment 6 ats 2013-10-31 12:00:38 UTC
On Thu, Oct 31, 2013 at 11:58:59AM +0000, schwab@linux-m68k.org wrote:
>          Resolution|---                         |FIXED

Awesome -- thanks very much. :)