glibc's scanf function incorrectly handles cases where it reads a sequence of characters which are an initial subsequence of a matching sequence, but not actually a matching sequence, for the conversion specifier. Examples include: sscanf("abc", "%4c", buf) returns 1 instead of 0 or EOF (not sure which is correct) and leaves no way for the caller to know buf[3] is unfilled. sscanf("0xz", "%x%c", &x, &c) returns 2 instead of 0. sscanf("1.0e+!", "%f%c", &x, &c) returns 2 instead of 0. etc.
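For anyone who wants to check their own libc, the three calls above can be wrapped as below. This is only a sketch: the function names are made up for illustration, and the values returned depend on the C library and its version, so no expected output is claimed here.

```c
#include <stdio.h>

/* Each function runs one call from the report and returns sscanf's result.
   All names here are illustrative, not part of any library. */

int case_short_c(void)
{
    char buf[4] = {0};          /* room for the 4 characters %4c may store */
    return sscanf("abc", "%4c", buf);
}

int case_hex_prefix(void)
{
    unsigned int x = 0;
    char c = 0;
    return sscanf("0xz", "%x%c", &x, &c);
}

int case_float_exponent(void)
{
    float x = 0;
    char c = 0;
    return sscanf("1.0e+!", "%f%c", &x, &c);
}

/* Print the result of every case so it can be compared with the report. */
void report_cases(void)
{
    printf("%%4c   on \"abc\"    -> %d\n", case_short_c());
    printf("%%x%%c  on \"0xz\"    -> %d\n", case_hex_prefix());
    printf("%%f%%c  on \"1.0e+!\" -> %d\n", case_float_exponent());
}
```

Calling report_cases() from main prints one line per case; the report argues that the second and third results should be 0.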
All of these cases are correctly handled. scanf is badly designed, just don't use it if you cannot live with these results.
They are not correctly handled. Please refer to C99, 7.19.6.2, paragraph 9, which defines an input item as: "the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence" Paragraph 10 then reads: "If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure." Clearly in the case of sscanf("0xz", "%x%c", &x, &c), the first "input item" is "0x", and it is not a matching sequence for the %x conversion (see the specification of strtoul, in terms of which scanf %x is specified), so the result must be a matching failure. If you're going to wrongly mark this bug as "RESOLVED", at least mark it "WONTFIX" rather than "INVALID" and acknowledge that it's a bug that you're unwilling to fix, and that glibc is intentionally non-conformant in this matter.
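To make the two paragraphs concrete, here is a rough sketch of how the input item and the match test for %x could be computed by hand. It is not glibc's code: both helper names are hypothetical, leading white space is assumed to have been skipped already, and a width of 0 is used to mean "no field width".

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: length of the input item for %x, i.e. the longest
   prefix of s that is, or is a prefix of, a matching sequence and that
   does not exceed the field width (0 means unlimited). */
size_t hex_input_item_len(const char *s, size_t width)
{
    size_t max = width ? width : (size_t)-1;
    size_t i = 0;

    if (i < max && (s[i] == '+' || s[i] == '-'))
        i++;                                   /* optional sign */
    if (i < max && s[i] == '0') {
        i++;
        if (i < max && (s[i] == 'x' || s[i] == 'X'))
            i++;                               /* optional 0x / 0X prefix */
    }
    while (i < max && isxdigit((unsigned char)s[i]))
        i++;                                   /* hexadecimal digits */
    return i;
}

/* Hypothetical helper: is the item s[0..len) a complete matching sequence?
   After the optional sign and optional 0x prefix, at least one hexadecimal
   digit must remain. */
bool hex_item_matches(const char *s, size_t len)
{
    size_t i = 0;

    if (i < len && (s[i] == '+' || s[i] == '-'))
        i++;
    if (len - i >= 2 && s[i] == '0' && (s[i + 1] == 'x' || s[i + 1] == 'X'))
        i += 2;
    if (i == len)
        return false;                          /* nothing left to convert */
    for (; i < len; i++)
        if (!isxdigit((unsigned char)s[i]))
            return false;
    return true;
}
```

Under these assumptions, "0xz" yields an input item of length 2 ("0x") that is not a matching sequence, which is the matching failure paragraph 10 describes, while "0x5" yields an item of length 3 that does match.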
They are handled correctly. You don't understand the limit of push backs.
Yes I understand pushbacks. Scanning "0xz" for %x results in an input item of "0x" with "z" pushed back into the unread buffer. The bug has nothing to do with pushbacks, because the right data is pushed back. The bug is that a non-matching input item is treated as a match rather than a matching error. Perhaps you thought I was saying the input item should be "0", successfully converted, with "x" as the next unread character in the buffer. Of course this is wrong and I do not believe such a thing. Perhaps you should try reading the actual language standard rather than assuming you're right.
(In reply to comment #4) > Yes I understand pushbacks. You apparently don't. This is no place to get a free education. Don't reopen the bug, there will be no change.
OK if you insist that I don't reopen it, I'm fixing the resolution to "WONTFIX".
Reopening since I found a statement from an official source (Fred J. Tydeman, Vice-chair of PL22.11) that the glibc behavior is incorrect: http://newsgroups.derkeiler.com/Archive/Comp/comp.std.c/2009-09/msg00045.html Sorry I don't have a better newsgroup archive link.
What on earth are you talking about. Fred said exactly the same: 0xz causes the z to be rejected for the %x and therefore used for the %c. Stop wasting my time.
Apparently you only read the first quoted paragraph and not the second: > > - the input item "0x" is not a matching sequence, so the execution of > > the whole directive fails; > > Correct What part of "the execution of the whole directive fails" are you not understanding? When a directive fails, scanf stops and returns the number of directives successfully converted and stored. This number is zero, not two. The %c is never processed. glibc is wrong. Please fix it. If you insist on keeping compatibility with hypothetical existing binaries that depend on the wrong behavior, that's what glibc has symbol versioning for...
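The "stop at the first failed directive" rule can be seen with an input that no implementation disputes. The sketch below uses a made-up wrapper name and a deliberately uncontroversial format string; it does not exercise the disputed "0x" case.

```c
#include <stdio.h>

/* Illustrative wrapper: returns the number of directives that were
   successfully converted and stored. */
int count_converted(const char *input, int *a, int *b, char *c)
{
    /* For "12 abc": the first %d stores 12, the second %d skips the space,
       sees 'a' and fails to match, so scanning stops there and the %c
       directive is never processed. */
    return sscanf(input, "%d%d%c", a, b, c);
}
```

With the input "12 abc" the call returns 1 and leaves *b and *c untouched. The argument in this report is that "0xz" with "%x%c" should stop in the same way at the %x directive and return 0.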
The behavior is correct and wanted. Now stop wasting people's time.
Fred Tydeman (vice chair of PL22.11/J11) has stated clearly and directly that the current glibc behavior is NOT correct. Whether it's wanted is a more subjective question, but I have not seen anyone but yourself who wants scanf to behave incorrectly in this manner. Please fix this bug.
Ping. Would somebody other than Mr. Drepper be willing to review this bug report?
This bug report appears to be correct, and the erroneous behavior described is still present with current glibc (tested x86_64).
Created attachment 6345 [details] scanf test cases I recently wrote a set of test cases for verifying my scanf implementation, and running it against glibc reproduces A LOT of instances of this bug... See attached test program.
I think this bug report is correct, at least in relation to the '%x/0xz' sample. There's a big difference between an input item, which *may* be an initial subset of a properly scanned directive, and the *properly scanned directive* itself. Pushback controls how far you can back up the "input stream pointer" and is the reason why scanf is usually not used by professionals, who prefer a fgets/sscanf combo so they can back up to the start of the line themselves. However, the pushback is only relevant here in that context. The failure of '0x' when scanning '%x' will not be able to push back all the way to the '0' because of this limitation. The function call sscanf ("a0xz", "%c%x%c") should return 1, not 3. The controlling part of the standard is the bit dealing with the 'x' directive itself: ===== Matches an optionally signed hexadecimal integer, whose format is the same as expected for the subject sequence of the strtoul function with the value 16 for the base argument. ===== The strtoul stuff states: ===== If the value of base is zero, the expected form of the subject sequence is that of an integer constant as described in 6.4.4.1, optionally preceded by a plus or minus sign, but not including an integer suffix. If the value of base is between 2 and 36 (inclusive), the expected form of the subject sequence is a sequence of letters and digits representing an integer with the radix specified by base, optionally preceded by a plus or minus sign, but not including an integer suffix. The letters from a (or A) through z (or Z) are ascribed the values 10 through 35; only letters and digits whose ascribed values are less than that of base are permitted. If the value of base is 16, the characters 0x or 0X may optionally precede the sequence of letters and digits, following the sign if present. 
===== The controlling part there would be "a sequence of letters and digits representing an integer" - you may argue that such a sequence may consist of zero characters but I don't think anyone in their right mind would suggest that definition represented an integer. In any case, the '0x' string fails on strtoul: char *x; int rc = 42; rc = strtoul ("0x", &x, 16); printf ("%d [%s]\n", rc, x); produces: 0 [0x] So even though rc is set to 0, the fact that the pointer points to the first bad character means that the '0x' itself is not a valid hex number. Putting in '0x5' as the string gives you: 5 [] so that the first bad character is the end of the string (ie, there WERE no bad characters).
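The end-pointer check described above can be packaged as a small helper. This is a sketch with a hypothetical function name; it only relies on the end pointer not reaching the end of the string for "0x", which holds whether an implementation leaves the pointer at the '0' or at the 'x'.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: accept s only if the whole string is a valid
   hexadecimal number according to strtoul with base 16. */
bool parse_hex_strict(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(s, &end, 16);
    if (end == s)            /* nothing was converted */
        return false;
    if (*end != '\0')        /* a "bad character" remains, e.g. the x of "0x" */
        return false;
    if (errno == ERANGE)     /* value does not fit in unsigned long */
        return false;
    *out = v;
    return true;
}
```

With this helper, "0x5" is accepted with the value 5 and "0x" is rejected, which mirrors the distinction drawn in the comment above.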
(In reply to Rich Felker from comment #0) > sscanf("abc", "%4c", buf) returns 1 instead of 0 or EOF (not sure which is > correct) and leaves no way for the caller to know buf[3] is unfilled. So this is an information leak.
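Until the behavior changes, a caller can guard against the partially filled buffer with %n. The following is a sketch under that assumption, with a made-up function name; it treats anything other than exactly four consumed characters as a failure, regardless of what sscanf itself returns.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy exactly 4 characters from input into out,
   or report failure without exposing a partially filled buffer. */
bool read_exact4(const char *input, char out[4])
{
    char tmp[4];
    int consumed = -1;       /* stays -1 if the %n directive is never reached */
    int r;

    memset(tmp, 0, sizeof tmp);
    r = sscanf(input, "%4c%n", tmp, &consumed);
    if (r != 1 || consumed != 4)
        return false;        /* short input: do not trust tmp */
    memcpy(out, tmp, 4);
    return true;
}
```

For "abc" this returns false whether the library reports 1 (with only three characters consumed) or reports a failure; for "abcd" it returns true.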
*** Bug 12437 has been marked as a duplicate of this bug. ***
*** Bug 22672 has been marked as a duplicate of this bug. ***
*** Bug 23274 has been marked as a duplicate of this bug. ***
*** Bug 1765 has been marked as a duplicate of this bug. ***
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, release/2.19/master has been updated via 10d268070a8aa9a878668e7f060e92ed668de146 (commit) via c08e8bd0ef1d16d0139dbc80a976e2cbf2517f02 (commit) from 762aafec34478bcef01a16acf1959732ab8bb2b6 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.
Note that scanf also accepts "nan(" while it shouldn't (because "nan()" is valid), but for a different reason. See bug 30647 for issues related to scanf with nan.
(In reply to Rich Felker from comment #7) > Reopening since I found a statement from an official source (Fred J. > Tydeman, Vice-chair of PL22.11) that the glibc behavior is incorrect: > > http://newsgroups.derkeiler.com/Archive/Comp/comp.std.c/2009-09/msg00045.html That link no longer exists, but here's another one: https://comp.std.c.narkive.com/7JdevQ08/fscanf-strtol-and-the-parsing-of-numbers
I was able to fix scanf behavior on NaN in bug 30647 and have been looking into this issue for a few weeks now. Opinion seems divided, so I would like to get a final nod before I start working on it. To put everything in simple words, the input should always match the given specifier, that is:
- For "%[width]specifier", the input must be wide enough, or it is a failure.
- For "%x" (or another specifier whose input may carry a prefix such as 0x), the input must match fully: 0x0 is valid, 0x is not.
- To support the width requirement, the input "0x123" should fail on the "%2x" specifier. Or is the format wrong?
Other questions:
- Would fixing this bug break existing applications that might be depending on this buggy behavior?
(In reply to Avinal Kumar from comment #26) > Other questions: > - Would fixing this bug break existing applications that might be depending > on this buggy behavior? We already have scanf variants per C standard. One possibility is to make the change for recently added variants only, so that they impact recently built applications only (which are presumably more easily fixed than historic applications).
The field width is a maximum, not a minimum. If the input stream ends before width characters have been read, those characters read so far become the input item. If the width is 2 then the input item can only consist of one or two characters. If the input stream contains 0x123 and the directive is %2x then the input item is 0x and this becomes a matching failure.
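The "maximum, not minimum" point can be checked with a directive whose behavior is not in dispute. This is a sketch only: the wrapper name is invented for illustration and %2d is used instead of %2x to stay clear of the prefix question.

```c
#include <stdio.h>

/* Illustrative wrapper: %2d takes at most two characters for the first
   number, and the second %d takes whatever digits follow. */
int split_digits(const char *input, int *head, int *tail)
{
    return sscanf(input, "%2d%d", head, tail);
}
```

For "12345" the call returns 2 with head 12 and tail 345, because the width cuts the first item off after two characters. For "7" it returns 1 with head 7: the input ended before the width was reached, and the single character read still forms a valid input item.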
I think this is simply a bug that should be fixed as such (for all standard versions), not something applications are at all likely to be relying on. (We'll need to add more strtol/scanf versions for C2Y because of 0o / 0O octal input, but I'd expect that, and any other incompatible changes in C2Y scanf in future, to be the only difference in how those new versions behave.)