memcpy-optimized version of qsort.

Wolfgang Glas wolfgang.glas@ev-i.at
Mon Jun 7 13:40:00 GMT 2004


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

  I recently studied the implementation of the merge sort algorithm, which is 
used by glibc's qsort() implementation, because I have been writing a 
program, which uses qsort() very frequently and I came across it during 
profiling my program. I use glibc-2.3.3 as shipped together with SuSE-9.1 (as 
far as I know a CVS snapshot from April 2004).

  Inside msort_with_tmp() in stdlib/msort.c I recognized, that the two sorted 
subsequences are merged into a temporary array and afterwards, the temporary 
array is copied back as a whole to the original place. This way each array 
element is copied twice per iteration, which makes n*log2(n)*2 copy 
operations for sorting n elements.

  I felt, that there must be a way to copy each element only once per 
iteration and after a few hours wtching my children play, I got the solution. 
(Please don't pledge me, if this is described in the standard literature:)

  The key to to my optimized memory bookkepping is the observation, that if 
you want to merge two sorted sequences into a resulting sequence, the second 
sequence could as well be stored in the upper part of the resulting array, 
because these elemnts are only copied from a higher address to a lower 
address, thus merge proess does not destroy any valuable information.

  So the attached patch implements msort_with_tmp() involving the following 
steps:

given: sort sequence b with n elements, tmp is temp. storage of size n 
elements.

n1:=n/2

1) sort first half of b from b into tmp.
2) sort second half of b from b+n1 into b+n1
3) merge tmp and b+n1 into b.

The minor bookkeeping issues, that have to be handled are best seen in the 
source code. I'd like to especially mention the switch for the trivial cases 
n==0 or n==1 at the start of the code. This gave a minor speedup on my 
Linux/P4 laptop copmared to the 'if (n<=0) return;' version.

  Benchmarks results on my P4 2GHz laptop:

$ gcc -static qsort_test.c -o qsort_test
$ time ./qsort_test 10000000 > out

real    0m13.084s
user    0m12.197s
sys     0m0.750s
$ gcc -static -L/usr/src/packages/BUILD/glibc-2.3/cc qsort_test.c -o 
qsort_test
$ time ./qsort_test 10000000 > out2

real    0m11.033s
user    0m10.046s
sys     0m0.817s

$ gcc -static qsort_test2.c -o qsort_test2
$ time ./qsort_test2 10000000 > out

real    0m49.588s
user    0m47.356s
sys     0m1.284s
$ gcc -static -L/usr/src/packages/BUILD/glibc-2.3/cc qsort_test2.c -o 
qsort_test2
$ time ./qsort_test2 10000000 > out2

real    0m44.355s
user    0m42.948s
sys     0m1.179s

So depending on the test case I observe a speedup of 10-20% for sorting 10 
Million keys. The optimized version of the lib is placed 
in /usr/src/packages/BUILD/glibc-2.3/cc. Executables linked w/o -L use the 
stock glibc from my distro. I linked the attached benchmark programs with 
- -static, so I didn't have to install the new shared library during the 
development process.

  Please perform further benchmarks with my patch and consider it for 
inclusion in forthcoming glibc releases.

  Best regards,

   Wolfgang
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAxHBeTif4DCz+puQRAhtGAJ4/qig+Ur+SdsE5RK9BdTvlEJxCPwCg0yHH
qzaWcT02xXsMSt71Xq6GW0w=
=EX6z
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glibc-2.3.3-msort-memopt.diff
Type: text/x-diff
Size: 3140 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msort.c
Type: text/x-csrc
Size: 4783 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: qsort_test.c
Type: text/x-csrc
Size: 636 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: qsort_test2.c
Type: text/x-csrc
Size: 698 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment-0003.bin>


More information about the Libc-alpha mailing list