RFC: *scanf vs. overflow

Sat May 23 17:18:31 GMT 2020

On 5/23/20 9:45 AM, Rich Felker wrote:

> It's relevant because you want to propose this for standardization.

I don't think it's ready for standardization now. I'm merely proposing it for
glibc. If it works well there, great; if not, then glibc should do something
more ambitious, such as Eric's proposal.

Doing nothing is not a good option; this is a real problem that affects many
real programs.

> that's syntax. It's /[^ ]{1,n}"/.

I'll concede that. That being said, the difference between syntax and semantics
is always somewhat arbitrary, and there's little point to slicing and dicing the
fuzzy boundary here. Regardless of whether the change is to "syntax" or
"semantics", it would be easy to change POSIX to allow the proposed behavior;
it's not a fundamental change to the spec.

> *Any* use of scanf on untrusted input is "vulnerable
> to the integer-overflow issue" in the sense that overflow is UB.

Absolutely, but that was not the point. The issue is what's the best thing to do
for these programs. Many carefully-written but imperfect programs have these
bugs, and although glibc is within its rights to dump core or worse when these
programs run, it's better if glibc's behavior improves their overall
reliability. Letting these errors pass silently, only to cause further
corruption later, is not a good strategy for improving overall reliability.
Pointing fingers at developers and telling them not to use scanf is not likely
to be a good strategy either.

> Any value that could be produced via overflow
> could also be produced via non-overflowing input, and you have to
> validate data either way.

Sure, but silently treating 2**32 as if it were zero is more likely to cause
real problems later. We need a better way for *scanf to reflect these errors
back to the calling code.
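
To make "reflecting these errors back to the calling code" concrete, here is a
minimal sketch of what a caller can do today without *scanf. It is not the
glibc proposal discussed in this thread. The helper name parse_uint and its
return codes are hypothetical, chosen only for illustration. It relies on the
standard strtoull and errno, and it assumes a typical platform where unsigned
int is 32 bits, so that 2**32 is out of range.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of glibc, POSIX, or any proposal here):
 * parse a decimal unsigned int from s and report overflow explicitly.
 *
 * Returns  0 on success, storing the value in *out;
 *         -1 if s does not start with a number;
 *         -2 if the value does not fit in unsigned int.
 * On failure *out is left untouched.
 */
static int parse_uint(const char *s, unsigned int *out)
{
    assert(s != NULL && out != NULL);

    char *end;
    errno = 0;
    unsigned long long v = strtoull(s, &end, 10);

    if (end == s)
        return -1;              /* no digits were consumed */

    /* ERANGE covers values too big even for unsigned long long;
       the UINT_MAX check covers values such as 2**32 when
       unsigned int is 32 bits (assumed here). */
    if (errno == ERANGE || v > UINT_MAX)
        return -2;

    *out = (unsigned int)v;
    return 0;
}
```

With this sketch, parse_uint("4294967296", &n) returns -2 and leaves n alone,
whereas converting 4294967296 to a 32-bit unsigned type yields 0, which is the
silent wraparound described above.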