On a string input containing "nan()" with parentheses (possibly with n-char-sequence), the scanf functions assume that the subject sequence is just "nan". Note that strtod is correct, i.e. it takes the parentheses into account.

Consider the following testcase:

#include <stdio.h>
#include <stdlib.h>

static void test_strtod (const char *s)
{
  char *endptr;
  double d;

  printf ("strtod test on %s\n", s);
  d = strtod (s, &endptr);
  printf ("d = %g \"%s\"\n", d, endptr);
}

int main (void)
{
  int r;
  double a, b, c;

  test_strtod ("nan*");
  test_strtod ("nan()*");

  r = sscanf ("nan nan() 1", "%lf%lf%lf", &a, &b, &c);
  printf ("sscanf return value: %d\n", r);
  if (r >= 1)
    printf ("a = %g\n", a);
  if (r >= 2)
    printf ("b = %g\n", b);
  if (r >= 3)
    printf ("c = %g\n", c);

  r = fscanf (stdin, "%lf%lf%lf", &a, &b, &c);
  printf ("fscanf return value: %d\n", r);
  if (r >= 1)
    printf ("a = %g\n", a);
  if (r >= 2)
    printf ("b = %g\n", b);
  if (r >= 3)
    printf ("c = %g\n", c);

  return 0;
}

I get the following output with GNU libc 2.31 and 2.37 on Debian:

$ printf "nan nan() 1" | ./naninput
strtod test on nan*
d = nan "*"
strtod test on nan()*
d = nan "*"
sscanf return value: 2
a = nan
b = nan
fscanf return value: 2
a = nan
b = nan

instead of

strtod test on nan*
d = nan "*"
strtod test on nan()*
d = nan "*"
sscanf return value: 3
a = nan
b = nan
c = 1
fscanf return value: 3
a = nan
b = nan
c = 1

(as obtained with MacOS X 12.6 and Android 13).
Note that if the string starts with "nan(" but does not match nan(n-char-sequence_opt), then scanf must reject the conversion (after reading the longest prefix). Examples:

* "nan(foo" (no closing parenthesis)
* "nan(a b)" (the space is not valid in n-char-sequence)

Currently it doesn't, because it stops at "nan" (it does not read the longest prefix).

These cases are similar to issues mentioned in bug 12701, but currently this is not the same bug.
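To make the expected behavior concrete, here is a minimal sketch of the matching rule described above. It is not the glibc implementation: match_nan is a hypothetical helper, it assumes leading whitespace has already been skipped, and it assumes the "C" locale so that isalnum accepts only [a-zA-Z0-9]. It reports whether the input starts with a valid NaN subject sequence and how many characters scanf would have read either way.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not glibc code).  Returns 1 if S starts with
   [sign] "nan" or [sign] "nan(n-char-sequence_opt)", matched
   case-insensitively, and 0 on a matching failure.  *CONSUMED receives
   the length of the longest prefix that was read in both cases.  */
static int
match_nan (const char *s, size_t *consumed)
{
  static const char word[] = "nan";
  size_t i = 0;

  assert (s != NULL && consumed != NULL);

  /* Optional sign.  */
  if (s[i] == '+' || s[i] == '-')
    i++;

  /* The three letters of "nan".  */
  for (size_t k = 0; word[k] != '\0'; k++)
    {
      if (tolower ((unsigned char) s[i]) != word[k])
        {
          *consumed = i;
          return 0;
        }
      i++;
    }

  /* Once '(' is read, the item is only valid if the n-char-sequence
     (digits, letters, underscore) is closed by ')'.  */
  if (s[i] == '(')
    {
      i++;
      while (isalnum ((unsigned char) s[i]) || s[i] == '_')
        i++;
      if (s[i] != ')')
        {
          *consumed = i;   /* e.g. "nan(foo" or "nan(a": rejected.  */
          return 0;
        }
      i++;
    }

  *consumed = i;
  return 1;
}
```

With this helper, "nan*" and "nan()*" both match (3 and 5 characters read), while "nan(foo" and "nan(a b)" are rejected after reading 7 and 5 characters respectively, instead of being accepted as a plain "nan".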
I agree this does look like a conformance issue with the scanf family of functions, which use the __vfscanf_internal() implementation.
I am trying to debug the issue. I have an additional question. For the examples, strtod returns:

* "nan(foo" : nan "(foo"
* "nan(foo bar)" : nan "(foo bar)"

And sscanf should return a conversion error for both of these cases, effectively no output, am I right?
(In reply to Avinal Kumar from comment #3)
> And sscanf should return conversion error for both of these cases,
> effectively no output, am I right?

Yes, a conversion error after reading the longest prefix.
And for fscanf, footnote 289 of ISO C17 says: "fscanf pushes back at most one input character onto the input stream. Therefore, some sequences that are acceptable to strtod, strtol, etc., are unacceptable to fscanf." The bug on the general behavior of parsing numbers is bug 12701. So, to completely fix the glibc behavior on NaN strings, both this bug 30647 and bug 12701 need to be fixed.
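The pushback limit quoted above can be illustrated with a small sketch. It is not glibc code: read_nan_item is a hypothetical function that assumes leading whitespace has already been skipped and that the "C" locale is in effect. It reads a NaN input item from a stream one character at a time and pushes back only the single character that ended the item, so on a matching failure the characters already consumed are not restored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical sketch (not glibc code).  Returns 1 if a valid NaN input
   item was read from FP and 0 on a matching failure.  At most one
   character is ever pushed back with ungetc.  */
static int
read_nan_item (FILE *fp)
{
  static const char word[] = "nan";
  int c;

  assert (fp != NULL);

  c = getc (fp);
  if (c == '+' || c == '-')
    c = getc (fp);

  for (size_t i = 0; word[i] != '\0'; i++)
    {
      if (c == EOF || tolower (c) != word[i])
        {
          if (c != EOF)
            ungetc (c, fp);
          return 0;
        }
      c = getc (fp);
    }

  if (c != '(')
    {
      /* Plain "nan": C is the first character after the item.  */
      if (c != EOF)
        ungetc (c, fp);
      return 1;
    }

  /* n-char-sequence_opt: digits, letters and underscore.  */
  do
    c = getc (fp);
  while (c != EOF && (isalnum (c) || c == '_'));

  if (c == ')')
    return 1;

  /* Matching failure: "nan(" and what followed stay consumed; only the
     offending character goes back to the stream.  */
  if (c != EOF)
    ungetc (c, fp);
  return 0;
}
```

On a stream containing "nan(a b) 1", this returns 0 and the next character read is the space after "a". strtod, working on a string, can instead fall back to "nan" and leave "(a b)" unparsed.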
I have submitted a patch, along with tests, to fix this bug. Please take a look: https://sourceware.org/pipermail/libc-alpha/2024-October/160708.html
The master branch has been updated by Adhemerval Zanella <azanella@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=04e8698fcca7d1e932bc54f5b60e1bbce2e87601

commit 04e8698fcca7d1e932bc54f5b60e1bbce2e87601
Author: Avinal Kumar <avinal.xlvii@gmail.com>
Date:   Fri Oct 25 15:48:27 2024 +0530

    stdio-common: Fix scanf parsing for NaN types [BZ #30647]

    The scanf family of functions like sscanf and fscanf currently
    ignore nan() and nan(n-char-sequence). This happens because
    __vfscanf_internal only checks for 'nan'.

    This commit adds support for all valid nan types i.e. nan, nan()
    and nan(n-char-sequence), where n-char-sequence can be
    [a-zA-Z0-9_]+, thus fixing the bug 30647. Any other representation
    of NaN should result in conversion error.

    New tests are also added to verify the correct parsing of NaN types
    for float, double and long double formats.

    Signed-off-by: Avinal Kumar <avinal.xlvii@gmail.com>
    Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fixed in 2.41.