[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Alternative nSelectors patch (Was: bzip2 1.0.7 released)



On Tue, 2019-07-02 at 08:34 +0200, Julian Seward wrote:
> This seems to me like a better patch than my proposal, so I retract
> my proposal and vote for this one instead.

I did keep your cleanup of the compress.c assert. And I don't think I
would have proposed something different if I didn't just happen to
stumble upon that odd 32767.bz2 testcase. It seems unlikely someone
would use so many unused selectors. But the file format allows for it,
so lets just handle it.

> The one thing that concerned me was that, it would be a disaster -- having
> ignored all selectors above 18002 -- if subsequent decoding actually *did*
> manage somehow to try to read more than 18002 selectors out of s->selectorMtf,
> because we'd be reading uninitialised memory.  But this seems to me can't
> happen because, after the selector-reading loop, you added
> 
> +      if (nSelectors > BZ_MAX_SELECTORS)
> +        nSelectors = BZ_MAX_SELECTORS;
> 
> and the following loop:
> 
>        /*--- Undo the MTF values for the selectors. ---*/
>        ...
> 
> is the only place that reads s->selectorMtf, and then only for the range
> 0 .. nSelectors-1.
> 
> So it seems good to me.  Does this sync with your analysis?

Yes. There is also the other BZ_MAX_SELECTORS sized array s->selector.
But that is similarly guarded. First it is filled from the selectorMtf
array with that for loop 0 .. nSelectors-1. Then it is accessed through
the GET_MTF_VAL macro as selector[groupNo], but that access is guarded
with:

      if (groupNo >= nSelectors)                  \
         RETURN(BZ_DATA_ERROR);                   \

So that prevents any bad access.

Attached is the patch with a commit message that hopefully explains why
the change is correct (and why the CVE, although a source code bug,
wasn't really exploitable in the first place). Hope it makes sense.

Cheers,

Mark
From b357f4ec14a8b5b11b37621ee9f2a10f518b6c65 Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mark@klomp.org>
Date: Wed, 3 Jul 2019 01:28:11 +0200
Subject: [PATCH] Accept as many selectors as the file format allows.

But ignore any larger that the theoretical maximum, BZ_MAX_SELECTORS.

The theoretical maximum number of selectors depends on the maximum
blocksize (900000 bytes) and the number of symbols (50) that can be
encoded with a different Huffman tree. BZ_MAX_SELECTORS is 18002.

But the bzip2 file format allows the number of selectors to be encoded
with 15 bits (because 18002 isn't a factor of 2 and doesn't fit in
14 bits). So the file format maximum is 32767 selectors.

Some bzip2 encoders might actually have written out more selectors
than the theoretical maximum because they rounded up the number of
selectors to some convenient factor of 8.

The extra 14766 selectors can never be validly used by the decompression
algorithm. So we can read them, but then discard them.

This is effectively what was done (by accident) before we added a
check for nSelectors to be at most BZ_MAX_SELECTORS to mitigate
CVE-2019-12900.

The extra selectors were written out after the array inside the
EState struct. But the struct has extra space allocated after the
selector arrays of 18060 bytes (which is larger than 14766).
All of which will be initialized later (so the overwrite of that
space with extra selector values would have been harmless).
---
 compress.c   |  2 +-
 decompress.c | 10 ++++++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/compress.c b/compress.c
index 237620d..76adee6 100644
--- a/compress.c
+++ b/compress.c
@@ -454,7 +454,7 @@ void sendMTFValues ( EState* s )
 
    AssertH( nGroups < 8, 3002 );
    AssertH( nSelectors < 32768 &&
-            nSelectors <= (2 + (900000 / BZ_G_SIZE)),
+            nSelectors <= BZ_MAX_SELECTORS,
             3003 );
 
 
diff --git a/decompress.c b/decompress.c
index 20ce493..3303499 100644
--- a/decompress.c
+++ b/decompress.c
@@ -287,7 +287,7 @@ Int32 BZ2_decompress ( DState* s )
       GET_BITS(BZ_X_SELECTOR_1, nGroups, 3);
       if (nGroups < 2 || nGroups > BZ_N_GROUPS) RETURN(BZ_DATA_ERROR);
       GET_BITS(BZ_X_SELECTOR_2, nSelectors, 15);
-      if (nSelectors < 1 || nSelectors > BZ_MAX_SELECTORS) RETURN(BZ_DATA_ERROR);
+      if (nSelectors < 1) RETURN(BZ_DATA_ERROR);
       for (i = 0; i < nSelectors; i++) {
          j = 0;
          while (True) {
@@ -296,8 +296,14 @@ Int32 BZ2_decompress ( DState* s )
             j++;
             if (j >= nGroups) RETURN(BZ_DATA_ERROR);
          }
-         s->selectorMtf[i] = j;
+         /* Having more than BZ_MAX_SELECTORS doesn't make much sense
+            since they will never be used, but some implementations might
+            "round up" the number of selectors, so just ignore those. */
+         if (i < BZ_MAX_SELECTORS)
+           s->selectorMtf[i] = j;
       }
+      if (nSelectors > BZ_MAX_SELECTORS)
+        nSelectors = BZ_MAX_SELECTORS;
 
       /*--- Undo the MTF values for the selectors. ---*/
       {
-- 
1.8.3.1