This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2 0/7] Refactor qsort implementation
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Sergey Senozhatsky <sergey dot senozhatsky dot work at gmail dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Tue, 4 Sep 2018 09:09:21 -0300
- Subject: Re: [PATCH v2 0/7] Refactor qsort implementation
- References: <20180831204238.10626-1-adhemerval.zanella@linaro.org> <20180903082928.GA387@jagdpanzerIV>
On 03/09/2018 05:29, Sergey Senozhatsky wrote:
> On (08/31/18 17:42), Adhemerval Zanella wrote:
>> The quicksort have the disvantage of O(n^2) as worse case, however
>> current glibc implementation seems to have handle the pivot selection
>> in suitable way. Ondřej Bílka has raised questioning regarding it could
>> be DoS attack [2], and although I do not dismiss this potentially issue
>> (although unlikely due the median pivot selection) I think it would be
>> worth to discuss it on another thread along with possible alternatives.
>
> Hi,
>
> So I don't know if there will be another discussion thread, but, just
> in case, I ran across Orson Peters' Pattern-defeating quicksort a while
> ago
> https://github.com/orlp/pdqsort
>
> A TL;DR description - it's a quicksort which fallbacks to a heap sort
> whenever it detects the worst case O(n ^ 2) pattern.
>
> As far as I can tell, the RUST programming language does use pdqsort:
> https://github.com/rust-lang/rust/blob/master/src/libcore/slice/sort.rs
>
> -ss
Thanks for bring this up, still reading its paper [1] but the idea seems
reasonable. From what I could understand it tries to leverage the heuristics
to change over introsort/heapsort, improve pivot selection by using Tukey's
ninther (which seems to be used on other implementation as well, such as
Go), and add some branchless operation on partition by adding data-dependent
moves.
I did a sniff performance evaluation using the reference implementation
in C++ adapted to provide a qsort signature and number do look interesting
(it is faster than our current implementation).
However currently I would like to focus first in the issue regarding
correctness and usability in your implementation. I think we can add
introsort to remove the worst case, at cost of some performance.
[1] https://drive.google.com/file/d/0B1-vl-dPgKm_T0Fxeno1a0lGT0E/view