This is the mail archive of the
`libc-alpha@sourceware.org`
mailing list for the glibc project.


*From*: Carsten Hey <carsten at debian dot org>
*To*: libc-alpha at sourceware dot org
*Date*: Tue, 29 Nov 2011 01:00:48 +0100
*Subject*: quicksort

Hi,

I created some patches [0][1] against qsort.c, i.e., _quicksort():

    68ea995 qsort.c: correctly place the pivot element after partitioning
    8aec01b qsort.c: simplify partitioning code
    8e77619 qsort.c: minimize number of swaps in median of three pivot selection
    dfe08ac qsort.c: add pseudo-median of nine pivot selection
    6810136 qsort.c: move equal elements on the left side
    a0d2eb3 qsort.c: sort multisets in O(N*k)
    3fa9ccc qsort.c: swap long-wise instead of char-wise if possible
    b644ca6 qsort.c: update documentation (HEAD, github/master, master)

[0] https://github.com/heyc/glibc/commits/master
[1] git://github.com/heyc/glibc.git

Everything in these patches that would be copyrightable by me (if there is anything at all) is licensed under the terms of the WTFPL 2.0 [3].

[3] http://sam.zoy.org/wtfpl/

I created a simple program [4] to show that _quicksort() fails to sort {2,0,0,1,2,0,0}; you just need to add printing the value of the pivot element and returning after partitioning in qsort.c. The fix in the first patch actually belongs to the partitioning code introduced in the following commit, but it seems to work with the current partitioning code too.

[4] http://paste.debian.net/downloadh/e7d21104

Instead of the 'split end' partitioning used in "Engineering a Sort Function", I use a scheme based on the work of Justin Peel [5]. He only published a much simplified version of his way to efficiently sort multisets; that version is very fast with very high probability. My variant guarantees a worst case of O(N*k) (where k is the number of distinct elements) and works in the following way:

* Check whether the chosen pivot equals a previous one by comparing the element right of the subfile's rightmost element to the new pivot. This is done whilst selecting the median. If a median of three is used, it does not require an additional comparison on distinct elements, since they can only be equal if the pivot also equals the largest element of the sample (I considered this optimization to be of only little value for pseudo-median of nine selection and thus did not apply it there).

* If the pivot equals a previously selected pivot, partition with equal elements on the right side; otherwise partition with equal elements on the left side.

* After partitioning with equal elements on the right side, all elements in the right subfile have reached their final position (they are left of the earlier chosen pivot and right of the current pivot) and can thus be skipped in further sorting.

[5] http://stackoverflow.com/questions/2105737/has-anyone-seen-this-improvement-to-quicksort-before

The main advantage of using this instead of 'split end' partitioning is, in my opinion, that the removed overhead eases implementing possible future enhancements efficiently:

* Quicksort with multiple pivots (which is patented [6][7] despite prior art [8]) to reduce the number of needed variable exchanges.
* Handling mostly sorted data specially.

[6] http://www.google.com/patents/about?id=RwCkAAAAEBAJ
[7] http://www.faqs.org/patents/app/20100121848
[8] http://www.angelfire.com/space2/m_kaykobad/publications/J13.pdf

Reducing overhead in the inner loops also improves performance slightly, but in my tests this overhead cost only one or two percent on pseudo-random data if compiled as separate objects. The drawback is that sequences containing every element twice are sorted more slowly without 'split end' partitioning.

I wondered whether there is a reason why you are using a hand-coded stack besides "Sedgewick said so". Recursing into the smallest subfile first and a hand-coded tail-recursive call (to avoid depending on compilers to do this) address the general memory problem too, and saving fewer than N/4 function calls can hardly be a compelling argument if at least lg(N!) calls to the comparison function are needed on distinct data. A remaining reason for bloating the code with a stack could be keeping the memory usage at an absolute minimum, if this is desired.

Regards
Carsten

**Follow-Ups**: **Re: quicksort**, *From*: Mike Frysinger
