This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
quicksort
- From: Carsten Hey <carsten at debian dot org>
- To: libc-alpha at sourceware dot org
- Date: Tue, 29 Nov 2011 01:00:48 +0100
- Subject: quicksort
Hi,
I created some patches [0][1] against qsort.c, i.e., _quicksort():
68ea995 qsort.c: correctly place the pivot element after partitioning
8aec01b qsort.c: simplify partitioning code
8e77619 qsort.c: minimize number of swaps in median of three pivot selection
dfe08ac qsort.c: add pseudo-median of nine pivot selection
6810136 qsort.c: move equal elements on the left side
a0d2eb3 qsort.c: sort multisets in O(N*k)
3fa9ccc qsort.c: swap long-wise instead of char-wise if possible
b644ca6 qsort.c: update documentation (HEAD, github/master, master)
[0] https://github.com/heyc/glibc/commits/master
[1] git://github.com/heyc/glibc.git
Everything in these patches that would be copyrightable by me (if there
is anything at all) is licensed under the terms of the WTFPL 2.0 [3].
[3] http://sam.zoy.org/wtfpl/
I created a simple program [4] to show that _quicksort() fails to sort
{2,0,0,1,2,0,0}; to reproduce this, you just need to make qsort.c print
the value of the pivot element and return after partitioning. The fix
in the first patch actually belongs to the partitioning code introduced
in the following commit, but it seems to work with the current
partitioning code too.
[4] http://paste.debian.net/downloadh/e7d21104
Instead of the 'split end' partitioning used in "Engineering a Sort
Function", I use a scheme based on the work of Justin Peel [5]. He only
published a much simplified version of his way of efficiently sorting
multisets; that version is very fast with very high probability. My
variant guarantees a worst case of O(N*k) (where k is the number of
distinct elements) and works in the following way:
* Check whether the chosen pivot equals a previous one by comparing the
element right of the subfile's rightmost element to the new pivot.
This is done while selecting the median. If a median of three is
used, it does not require an additional comparison on distinct
elements, since they can only be equal if the pivot also equals the
largest element of the sample (I considered this optimization to be
of only little value for pseudo-median of nine selection and thus did
not apply it there).
* If the pivot equals a previously selected pivot, partition with
equal elements on the right side; otherwise partition with equal
elements on the left side.
* After partitioning with equal elements on the right side, all
elements in the right subfile have reached their final positions (they
are left of the earlier chosen pivot and right of the current pivot)
and can thus be skipped in further sorting.
[5] http://stackoverflow.com/questions/2105737/has-anyone-seen-this-improvement-to-quicksort-before
The main advantage of using this instead of 'split end' partitioning
is, in my opinion, that the removed overhead makes it easier to
implement possible future enhancements efficiently:
* Quicksort with multiple pivots (which is patented [6][7] despite
prior art [8]) to reduce the number of element exchanges needed.
* Handle mostly sorted data specially.
[6] http://www.google.com/patents/about?id=RwCkAAAAEBAJ
[7] http://www.faqs.org/patents/app/20100121848
[8] http://www.angelfire.com/space2/m_kaykobad/publications/J13.pdf
Reducing overhead in the inner loops also improves performance
slightly, but in my tests this overhead cost only one or two percent on
pseudo-random data if compiled as separate objects. A drawback is that
sequences containing every element twice are sorted more slowly without
'split end' partitioning.
I wondered whether there is a reason why you are using a hand-coded
stack besides "Sedgewick said so". Recursing into the smallest subfile
first and a hand-coded tail-recursive call (to avoid depending on
compilers to do this) address the general memory problem too, and
saving fewer than N/4 function calls can hardly be a compelling
argument if at least lg(N!) calls to the comparison function are needed
on distinct data.
A remaining reason for bloating the code with a stack could be keeping
the memory usage at an absolute minimum, if this is desired.
Regards
Carsten