Re: [PATCH] vfprintf: validate nargs and positional offsets
On Thu, Feb 02, 2012 at 09:56:21PM -0800, Paul Eggert wrote:
> On 02/02/2012 03:15 PM, Kees Cook wrote:
>
> > +  /* Check for potential integer overflow. */
> > +  if (nargs > SIZE_MAX / (2 * sizeof (int) + sizeof (union printf_arg)))
> > +    {
> > +      done = -1;
> > +      goto all_done;
> > +    }
>
> I don't see how this is reliable in general. First, there's no
> integer calculation later such that integer overflow will occur
> if nargs == X + 1 but will not occur if nargs == X,
> where X is SIZE_MAX / (2 * sizeof (int) + sizeof (union printf_arg))).
> So why bother to check whether nargs > X?
Technically the overflow boundary is at "SIZE_MAX / sizeof (int)", but I felt
it was better to take the entire allocation into account instead of just the
first one.
Old flaw, with nargs = SIZE_MAX / sizeof (int) + 1:
  allocate 0 bytes due to overflow (nargs * 4):
    args_type = alloca (nargs * sizeof (int));
  memset 0 bytes due to overflow (nargs * 4):
    memset (args_type, s->_flags2 & _IO_FLAGS2_FORTIFY ? '\xff' : '\0',
            nargs * sizeof (int));
  allocate 0 bytes due to overflow (nargs * 12):
    args_value = alloca (nargs * sizeof (union printf_arg));
  allocate 0 bytes due to overflow (nargs * 4):
    args_size = alloca (nargs * sizeof (int));
  write NULL value (PA_INT) to stack + width_arg * 4 (int array) location:
    if (specs[cnt].width_arg != -1)
      args_type[specs[cnt].width_arg] = PA_INT;
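Purely as an illustration (not part of the patch), here's a small standalone
program that mimics the 32-bit arithmetic above; union printf_arg_sketch and
the uint32_t simulation of a 32-bit size_t are stand-ins of mine, not glibc's
types:

/* Simulate a 32-bit size_t with uint32_t; union printf_arg_sketch is a
   made-up 12-byte stand-in for union printf_arg on a 32-bit ABI.  */
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

union printf_arg_sketch { int i; char pad[12]; };

int
main (void)
{
  const uint32_t size_max = UINT32_MAX;            /* SIZE_MAX on 32-bit */
  uint32_t nargs = size_max / sizeof (int) + 1;    /* 0x40000000 */

  /* The three sizes the old code fed to alloca/memset all wrap to 0.  */
  printf ("args_type : %" PRIu32 " bytes\n", nargs * (uint32_t) sizeof (int));
  printf ("args_value: %" PRIu32 " bytes\n",
          nargs * (uint32_t) sizeof (union printf_arg_sketch));
  printf ("args_size : %" PRIu32 " bytes\n", nargs * (uint32_t) sizeof (int));

  /* The patch's single check: if nargs <= SIZE_MAX / (2 * sizeof (int) +
     sizeof (union printf_arg)), every individual product is at most
     SIZE_MAX and cannot wrap.  This nargs fails the check.  */
  uint32_t per_arg = 2 * sizeof (int) + sizeof (union printf_arg_sketch);
  assert (nargs > size_max / per_arg);
  return 0;
}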
> Second, the check is immediately followed by several calls to alloca,
> and on many hosts alloca has untoward behavior well before integer
> overflow occurs; all one needs to do is to alloca enough memory that the
> stack pointer grows past the allocated stack (jumping past any guard page
> at the end of the stack). The above code doesn't prevent this stack-overflow
> problem, so how is it attacking the bug in question?
As shown above, the overflow actually happens when the alloca sizes are
calculated. With my patch, the full-sized allocation must really take place,
and because it is a real allocation, the memset happens as well. If the
stack has been exceeded, the memset will trip over the guard page. If not,
things are fine, because we know that every args_type dereference lands
inside the allocated and memset region.
However, you're right that args_value and args_size are not unconditionally
written to (and so may never touch the guard page). It may be possible to
line up the alloca calls so that the guard page lands in an unused portion
of args_value and args_size ends up overlapping the heap in such a way that
s->_flags2 coincides with an args_size[specs[cnt].data_arg] assignment.
I see various potential solutions:
- Since we're trying to protect the FORTIFY case, we just need to cache
the s->_flags2 FORTIFY check earlier in the function, before the
potential heap write. Then, when the args_type array walk finally hits
an unused position, the program aborts on the fortify check. I can't
imagine a format string that could both trigger the stack exhaustion
and still be long enough to use all those specs, but it still seems
like a fragile solution.
- Call memset on the remaining two allocated regions. Given the size of
normal vfprintf operations, this should not cause too much pain. This is
robust on systems with a guard page, but potentially makes the situation
worse on those without.
- Replace alloca with calloc and malloc. This solves the problem but
probably incurs too high a performance cost.
- Hybrid: use calloc/malloc when more than a page worth of memory would
be needed, otherwise just stick with alloca (a rough sketch follows
below). This would mean that only format strings with more than 204
arguments (on 32-bit, or 102 on 64-bit) would need to hit the heap.
- Have alloca magically detect heap collision. The layout is arch and
rlimit specific, but this information is, technically, available. The
max stack size is known to getrlimit, the initial stack location could
be recorded by ld.so, and max heap location could be recorded by glibc.
Seems grossly fragile.
Are there some others? Seems like hybrid might be the way forward? Man is
alloca gross. :)
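To make the hybrid option concrete, here's a rough standalone sketch of just
the allocation strategy (this is not the actual patch; union
printf_arg_sketch, process_args_sketch, and the one-page threshold are
placeholders of mine for illustration):

#include <alloca.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

union printf_arg_sketch { int i; double d; void *p; };  /* stand-in type */

/* Take the per-argument bookkeeping from the stack when it is small,
   from the heap (zeroed by calloc) otherwise.  */
static int
process_args_sketch (size_t nargs)
{
  const size_t per_arg = 2 * sizeof (int) + sizeof (union printf_arg_sketch);

  /* Same overflow check as the patch, against the combined footprint.  */
  if (nargs > SIZE_MAX / per_arg)
    return -1;

  bool use_heap = nargs * per_arg > 4096;  /* bigger than ~a page: heap */
  void *block;
  if (use_heap)
    {
      block = calloc (nargs, per_arg);     /* already zeroed */
      if (block == NULL)
        return -1;
    }
  else
    {
      block = alloca (nargs * per_arg);    /* small, stays on the stack */
      memset (block, 0, nargs * per_arg);  /* alloca memory is not zeroed */
    }

  /* Carve args_type / args_value / args_size out of block and walk the
     specs here, exactly as the positional code does today.  */
  int *args_type = block;
  if (nargs > 0)
    args_type[nargs - 1] = 0;              /* touch the far end as a demo */

  if (use_heap)
    free (block);
  return 0;
}

int
main (void)
{
  return process_args_sketch (500);        /* 500 args takes the heap path */
}

Whether one page is the right cut-off, and whether the heap path should
allocate the three arrays separately, is of course up for debate.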
> >    /* Allocate memory for the argument descriptions. */
> >    args_type = alloca (nargs * sizeof (int));
> >    memset (args_type, s->_flags2 & _IO_FLAGS2_FORTIFY ? '\xff' : '\0',
> > @@ -1715,13 +1722,17 @@ do_positional:
> >    for (cnt = 0; cnt < nspecs; ++cnt)
> >      {
> >        /* If the width is determined by an argument this is an int. */
> > -      if (specs[cnt].width_arg != -1)
> > +      if (specs[cnt].width_arg > -1 && specs[cnt].width_arg < nargs)
> >          args_type[specs[cnt].width_arg] = PA_INT;
> >
> >        /* If the precision is determined by an argument this is an int. */
> > -      if (specs[cnt].prec_arg != -1)
> > +      if (specs[cnt].prec_arg > -1 && specs[cnt].prec_arg < nargs)
> >          args_type[specs[cnt].prec_arg] = PA_INT;
>
> I don't see why the above two changes are needed. Doesn't
> __parse_one_specwc guarantee that specs[cnt].width_arg < max_ref_arg
> (and therefore specs[cnt].width_arg < nargs), and similarly for
> specs[cnt].prec_arg?
This is true (and in fact the test case can't exercise the failure condition
of these changes because of that), but it seemed trivial to add the check
so that, if the behavior of read_int or __parse_one_spec* ever changed,
this portion of the code would remain robust.
-Kees
--
Kees Cook @outflux.net