This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: qsort O(n^2) could be DoS attack

From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
To: libc-alpha at sourceware dot org
Date: Wed, 21 Feb 2018 09:34:10 -0300
Subject: Re: qsort O(n^2) could be DoS attack
Authentication-results: sourceware.org; auth=none
References: <20180210163703.GA27574@domone>

On 10/02/2018 14:37, Ondřej Bílka wrote:
> Like was argumented in other functions main problem with deterministic
> worst case is when attacker sends input to web server that makes it
> unresponsive because of processing O(n^2) requests.
> 
> Here we deleted mergesort and kept simple qsort without addressing it.

I assume you are referring my qsort refactor patchset [1].

> 
> Here O(n ln n) is possible with randomization of pivot for example add
> check that if one part has more than 4/5 elements get next pivot as
> number from /dev/urandom.
> 

However for this reference I think you missed to send the actual link.

Although standard not specify the expected or desirable complexity of
qsort and users should expect naive underlying implementation for
portability, I agree trying to get O(n*ln(n)) statistically for 
quicksort should be a QoI.

However I am not sure issuing potentially syscalls for every qsort is
the better approach (getrandom or multiple open/read/close).  I would
rather use a feasible PRNG with per thread state or even a different
implementation (even a simple heapsort or a more specialized one like
smoothsort if sorted arrays should be prioritize [1]).

Current algorithm already selects a pivot element using a
median-of-tree decision tree which reduces the probability of selecting
a bad pivot.  This is shown in benchmarks for pathological quicksort
cases where already sorted elements do not triger O(n^2) comparisons.
It does not however prevent the worse case from a well crafted input,
but I am not sure if this is really a problem with current optimizations. 

I implemented a random pivot on top of current implementation using
the Mersenne Twister [3] in my patch for libsupport and performance
seems reasonable:

Results for member size 4
  Sorted
  nmemb   |      base |   patched | diff
        32|      1257 |      3589 | 185.52
      4096|    302235 |    559485 | 85.12
     32768|   3020728 |   5284233 | 74.93
    524288|  59306436 |  96959396 | 63.49

  Repeated
  nmemb   |      base |   patched | diff
        32|      1873 |      4497 | 140.10
      4096|    904864 |   1020450 | 12.77
     32768|   8542801 |   9765228 | 14.31
    524288| 168426795 | 187274393 | 11.19

  MostlySorted
  nmemb   |      base |   patched | diff
        32|      1776 |      4138 | 133.00
      4096|    709937 |    951225 | 33.99
     32768|   6855890 |   8857712 | 29.20
    524288| 129385161 | 160719331 | 24.22

  Unsorted
  nmemb   |      base |   patched | diff
        32|      1941 |      5013 | 158.27
      4096|    969021 |   1127083 | 16.31
     32768|   9462116 |  10822292 | 14.37
    524288| 186560908 | 204820907 | 9.79

Results for member size 8
  Sorted
  nmemb   |      base |   patched | diff
        32|      1205 |      2740 | 127.39
      4096|    325554 |    568386 | 74.59
     32768|   3264125 |   5338112 | 63.54
    524288|  66107684 | 105387174 | 59.42

  Repeated
  nmemb   |      base |   patched | diff
        32|      2118 |      5196 | 145.33
      4096|    979114 |   1123651 | 14.76
     32768|   9028606 |  10304952 | 14.14
    524288| 174903867 | 193789921 | 10.80

  MostlySorted
  nmemb   |      base |   patched | diff
        32|      2346 |      5028 | 114.32
      4096|    759158 |    990094 | 30.42
     32768|   7810444 |   9180116 | 17.54
    524288| 135900146 | 170539570 | 25.49

  Unsorted
  nmemb   |      base |   patched | diff
        32|      2301 |      5056 | 119.73
      4096|   1014018 |   1159721 | 14.37
     32768|   9888078 |  10958201 | 10.82
    524288| 192479957 | 211803437 | 10.04

Results for member size 32
  Sorted
  nmemb   |      base |   patched | diff
        32|      1184 |      2764 | 133.45
      4096|    325865 |    573273 | 75.92
     32768|   3331750 |   5429113 | 62.95
    524288|  69067176 | 109235098 | 58.16

  Repeated
  nmemb   |      base |   patched | diff
        32|      4813 |      7390 | 53.54
      4096|   1624137 |   1902677 | 17.15
     32768|  15896739 |  17687038 | 11.26
    524288| 316328778 | 341595100 | 7.99

  MostlySorted
  nmemb   |      base |   patched | diff
        32|      5332 |      7289 | 36.70
      4096|   1312703 |   1610082 | 22.65
     32768|  12360726 |  14561200 | 17.80
    524288| 231603294 | 276966905 | 19.59

  Unsorted
  nmemb   |      base |   patched | diff
        32|      6047 |      8238 | 36.23
      4096|   1695241 |   1936552 | 14.23
     32768|  16430388 |  18756835 | 14.16
    524288| 329496913 | 352380198 | 6.94

It is still faster than a simple heapsort (implemented based on Linux
kernel version) and faster than smoothsort for non-sorted inputs.
The drawback it is requires 2.5KB of TLS space (we can tune it with a
better suite PRNG if the case).

So I am not sure whether we should either focus in tune qsort quicksort
implementation to behave O(n*ln(n)) statistically with random pivot, if
we just should use a different slower algorithm with O(n*ln(n)) guarantee
complexity, or if we should handle it as a not really an issue due current
optimization which minimizes the selection of bad pivot.

[1] https://sourceware.org/ml/libc-alpha/2018-01/msg00626.html
[2] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/qsort-smooth
[3] https://sourceware.org/ml/libc-alpha/2018-01/msg00627.html

References:
- qsort O(n^2) could be DoS attack
  - From: Ondřej Bílka

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]