Bug 17829 - Incorrect handling of precision specifier in printf family
Summary: Incorrect handling of precision specifier in printf family
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: stdio (show other bugs)
Version: 2.19
: P2 critical
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-12 04:32 UTC by nfxjfg
Modified: 2020-07-07 14:54 UTC (History)
7 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Fix computations for max precision in vfprintf (286 bytes, patch)
2015-12-23 00:23 UTC, Alexander Cherepanov
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description nfxjfg 2015-01-12 04:32:47 UTC
The following program shows no output:

#include <stdio.h>
#include <limits.h>

int main(int argc, char **argv)
{
    printf("%.*s\n", INT_MAX, "hi");
    return 0;
}

The precision given is INT_MAX; this should turn the precision specifier into a no-OP, equivalent to 'printf("%s\n", "hi");'. Making the precision value somewhat lower seems to make it work.

snprintf() seems to have a similar issue, but the failure seems to start with even lower precision values.

Other libcs handle this correctly (I tested mingw-w64 and musl).

Using glibc 2.19-13 on Debian 32 bit x86.
Comment 1 Florian Weimer 2015-02-18 14:26:56 UTC
Carlos, do you remember what the “32” in stdio-common/vfprintf.c guards against?  (You helped to fix some overflow-related issues in this area.)

   1574       if (prec > width
   1575           && prec > sizeof (work_buffer) / sizeof (work_buffer[0]) - 32)
   1576         {
   1577           if (__glibc_unlikely (prec >= INT_MAX / sizeof (CHAR_T) - 32))
   1578             {
   1579               __set_errno (EOVERFLOW);
   1580               done = -1;
   1581               goto all_done;
   1582             }
   1583           size_t needed = ((size_t) prec + 32) * sizeof (CHAR_T);

I'm a bit at a loss here.  Certainly, this use is not recommended because printf will allocate tons of memory as part of the format processing.
Comment 2 Carlos O'Donell 2015-02-18 14:32:58 UTC
(In reply to Florian Weimer from comment #1)
> Carlos, do you remember what the “32” in stdio-common/vfprintf.c guards
> against?  (You helped to fix some overflow-related issues in this area.)
> 
>    1574       if (prec > width
>    1575           && prec > sizeof (work_buffer) / sizeof (work_buffer[0]) -
> 32)
>    1576         {
>    1577           if (__glibc_unlikely (prec >= INT_MAX / sizeof (CHAR_T) -
> 32))
>    1578             {
>    1579               __set_errno (EOVERFLOW);
>    1580               done = -1;
>    1581               goto all_done;
>    1582             }
>    1583           size_t needed = ((size_t) prec + 32) * sizeof (CHAR_T);
> 
> I'm a bit at a loss here.  Certainly, this use is not recommended because
> printf will allocate tons of memory as part of the format processing.

The +32 is an arbitrarily selected value to make the buffer large enough to be OK for the largest precision we need. It is an artifact of sloppy accounting for how much would be needed. The correct fix is to be more precise in computing what we need.
Comment 3 nfxjfg 2015-02-18 17:26:50 UTC
>Certainly, this use is not recommended because printf will allocate tons of memory as part of the format processing.

There's literally no reason why it'd need to allocate memory of the size of the maximum _possible_ length of the string. In fact, I'd argue printf doesn't need to do unbounded memory allocations at all.
Comment 4 Florian Weimer 2015-02-20 09:20:56 UTC
(In reply to nfxjfg from comment #3)
> >Certainly, this use is not recommended because printf will allocate tons of memory as part of the format processing.
> 
> There's literally no reason why it'd need to allocate memory of the size of
> the maximum _possible_ length of the string. In fact, I'd argue printf
> doesn't need to do unbounded memory allocations at all.

But it's what the code does today.  It could be implemented differently, sure, but until someone writes the code, submits it, and gets it through review, it's how things are.
Comment 5 Alexander Cherepanov 2015-12-23 00:05:21 UTC
To add to two already mentioned problems (unnecessary malloc and fail for INT_MAX):

C11 seems not to specify any limit for a precision and values larger than INT_MAX could be useful. The variable for precision is declared as int[1] and extracted from a format string as int with overflow control[2]. A larger type like size_t and something like saturated processing of the number is a better fit.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/vfprintf.c;h=ae0145295479d0da11ae6bd496bd7039546419de;hb=HEAD#l1359
[2] https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/printf-parse.h;h=e9903549997b113c5a38be3c2a463c2d83954c51;hb=HEAD#l68
Comment 6 Alexander Cherepanov 2015-12-23 00:23:14 UTC
Created attachment 8862 [details]
Fix computations for max precision in vfprintf

While the full fix requires some work allowing INT_MAX on 64-bit platforms seems easy.

The computation of the amount of the required memory is done in size_t but the check uses INT_MAX for some reason. Replacing it with SIZE_MAX looks like a right thing.
Comment 7 Alexander Cherepanov 2016-01-07 18:46:07 UTC
1. The same problem with the width. Consider the program:

#include <stdio.h>
#include <errno.h>

int main()
{
   int res =  printf("%2147483647d", 1);
   fprintf(stderr, "res = %d, errno = %d\n", res, errno);
}

printf should succeed but it fails: "res = -1, errno = 75". There are 
two problematic checks similar to the one for precision:

https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/vfprintf.c;h=6829d4dc8e7fe7c066a06f1857ee926e0f48c379;hb=HEAD#l1459
https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/vfprintf.c;h=6829d4dc8e7fe7c066a06f1857ee926e0f48c379;hb=HEAD#l1491

2. When the width is taken from an argument via asterisk, INT_MIN is not 
handled properly -- signed integer overflow:

1451         /* Negative width means left justified.  */
1452         if (width < 0)
1453           {
1454             width = -width;
1455             pad = L_(' ');
1456             left = 1;
1457           }
Comment 8 Vincent Lefèvre 2017-05-22 13:11:13 UTC
Similarly, assuming the defect in the C standard mentioned in bug 21360, the following program:

#include <stdio.h>

int main(void)
{
  long n = -1;

  snprintf (NULL, 0, "%100000000000s%ln", "", &n);
  printf ("%ld\n", n);
  return 0;
}

outputs -1 instead of 100000000000 on x86_64 (64-bit long's).

Note: This may be regarded as being above the environmental limits, but nothing seems to be documented about them, and it would be better to get the expected result anyway.
Comment 9 Vincent Lefèvre 2017-05-22 14:10:12 UTC
(In reply to Vincent Lefèvre from comment #8)
[...]
> outputs -1 instead of 100000000000 on x86_64 (64-bit long's).

Actually, I'm not sure of what is expected in case of overflow on the return value. I've just asked in the Austin Group mailing-list: https://www.mail-archive.com/austin-group-l@opengroup.org/msg01038.html
Comment 10 Florian Weimer 2017-05-22 14:14:26 UTC
(In reply to Vincent Lefèvre from comment #9)
> (In reply to Vincent Lefèvre from comment #8)
> [...]
> > outputs -1 instead of 100000000000 on x86_64 (64-bit long's).
> 
> Actually, I'm not sure of what is expected in case of overflow on the return
> value. I've just asked in the Austin Group mailing-list:
> https://www.mail-archive.com/austin-group-l@opengroup.org/msg01038.html

See bug 14771 comment 4.
Comment 11 Vincent Lefèvre 2017-05-22 14:24:30 UTC
(In reply to Florian Weimer from comment #10)
> See bug 14771 comment 4.

This is a different issue. I'm not talking about the size argument, but the 'n' format specifier.
Comment 12 Florian Weimer 2017-05-22 14:35:32 UTC
(In reply to Vincent Lefèvre from comment #11)
> (In reply to Florian Weimer from comment #10)
> > See bug 14771 comment 4.
> 
> This is a different issue. I'm not talking about the size argument, but the
> 'n' format specifier.

They are related.  I'm not sure if it is reasonable to expect that if snprintf fails with EOVERFLOW, %n output has been written.
Comment 13 Vincent Lefèvre 2017-05-22 14:53:43 UTC
(In reply to Florian Weimer from comment #12)
> (In reply to Vincent Lefèvre from comment #11)
> > (In reply to Florian Weimer from comment #10)
> > > See bug 14771 comment 4.
> > 
> > This is a different issue. I'm not talking about the size argument, but the
> > 'n' format specifier.
> 
> They are related.

Bug 14771 comment 4 was about a conflict between ISO C and POSIX, and doesn't give a hint here.

> I'm not sure if it is reasonable to expect that if snprintf fails with
> EOVERFLOW, %n output has been written.

There is no %n output. %n allows one to get information about the length of some sequence of characters. According to bug 21360, one should get a "meaningful" value, even for something that would *never* be output. So, if one follows this idea, it could be reasonable to expect a meaningful value if int is too small for the return value, e.g. to overcome the limitation of the int type.
Comment 14 cvs-commit@gcc.gnu.org 2020-07-07 14:54:49 UTC
The master branch has been updated by Joseph Myers <jsm28@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6caddd34bd7ffb5ac4f36c8e036eee100c2cc535

commit 6caddd34bd7ffb5ac4f36c8e036eee100c2cc535
Author: Joseph Myers <joseph@codesourcery.com>
Date:   Tue Jul 7 14:54:12 2020 +0000

    Remove most vfprintf width/precision-dependent allocations (bug 14231, bug 26211).
    
    The vfprintf implementation (used for all printf-family functions)
    contains complicated logic to allocate internal buffers of a size
    depending on the width and precision used for a format, using either
    malloc or alloca depending on that size, and with consequent checks
    for size overflow and allocation failure.
    
    As noted in bug 26211, the version of that logic used when '$' plus
    argument number formats are in use is missing the overflow checks,
    which can result in segfaults (quite possibly exploitable, I didn't
    try to work that out) when the width or precision is in the range
    0x7fffffe0 through 0x7fffffff (maybe smaller values as well in the
    wprintf case on 32-bit systems, when the multiplication by sizeof
    (CHAR_T) can overflow).
    
    All that complicated logic in fact appears to be useless.  As far as I
    can tell, there has been no need (outside the floating-point printf
    code, which does its own allocations) for allocations depending on
    width or precision since commit
    3e95f6602b226e0de06aaff686dc47b282d7cc16 ("Remove limitation on size
    of precision for integers", Sun Sep 12 21:23:32 1999 +0000).  Thus,
    this patch removes that logic completely, thereby fixing both problems
    with excessive allocations for large width and precision for
    non-floating-point formats, and the problem with missing overflow
    checks with such allocations.  Note that this does have the
    consequence that width and precision up to INT_MAX are now allowed
    where previously INT_MAX / sizeof (CHAR_T) - EXTSIZ or more would have
    been rejected, so could potentially expose any other overflows where
    the value would previously have been rejected by those removed checks.
    
    I believe this completely fixes bugs 14231 and 26211.
    
    Excessive allocations are still possible in the floating-point case
    (bug 21127), as are other integer or buffer overflows (see bug 26201).
    This does not address the cases where a precision larger than INT_MAX
    (embedded in the format string) would be meaningful without printf's
    return value overflowing (when it's used with a string format, or %g
    without the '#' flag, so the actual output will be much smaller), as
    mentioned in bug 17829 comment 8; using size_t internally for
    precision to handle that case would be complicated by struct
    printf_info being a public ABI.  Nor does it address the matter of an
    INT_MIN width being negated (bug 17829 comment 7; the same logic
    appears a second time in the file as well, in the form of multiplying
    by -1).  There may be other sources of memory allocations with malloc
    in printf functions as well (bug 24988, bug 16060).  From inspection,
    I think there are also integer overflows in two copies of "if ((width
    -= len) < 0)" logic (where width is int, len is size_t and a very long
    string could result in spurious padding being output on a 32-bit
    system before printf overflows the count of output characters).
    
    Tested for x86-64 and x86.