memcpy-optimized version of qsort.
Wolfgang Glas
wolfgang.glas@ev-i.at
Mon Jun 7 13:40:00 GMT 2004
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi all,
I recently studied the implementation of the merge sort algorithm, which is
used by glibc's qsort() implementation, because I have been writing a
program, which uses qsort() very frequently and I came across it during
profiling my program. I use glibc-2.3.3 as shipped together with SuSE-9.1 (as
far as I know a CVS snapshot from April 2004).
Inside msort_with_tmp() in stdlib/msort.c I recognized, that the two sorted
subsequences are merged into a temporary array and afterwards, the temporary
array is copied back as a whole to the original place. This way each array
element is copied twice per iteration, which makes n*log2(n)*2 copy
operations for sorting n elements.
I felt, that there must be a way to copy each element only once per
iteration and after a few hours wtching my children play, I got the solution.
(Please don't pledge me, if this is described in the standard literature:)
The key to to my optimized memory bookkepping is the observation, that if
you want to merge two sorted sequences into a resulting sequence, the second
sequence could as well be stored in the upper part of the resulting array,
because these elemnts are only copied from a higher address to a lower
address, thus merge proess does not destroy any valuable information.
So the attached patch implements msort_with_tmp() involving the following
steps:
given: sort sequence b with n elements, tmp is temp. storage of size n
elements.
n1:=n/2
1) sort first half of b from b into tmp.
2) sort second half of b from b+n1 into b+n1
3) merge tmp and b+n1 into b.
The minor bookkeeping issues, that have to be handled are best seen in the
source code. I'd like to especially mention the switch for the trivial cases
n==0 or n==1 at the start of the code. This gave a minor speedup on my
Linux/P4 laptop copmared to the 'if (n<=0) return;' version.
Benchmarks results on my P4 2GHz laptop:
$ gcc -static qsort_test.c -o qsort_test
$ time ./qsort_test 10000000 > out
real 0m13.084s
user 0m12.197s
sys 0m0.750s
$ gcc -static -L/usr/src/packages/BUILD/glibc-2.3/cc qsort_test.c -o
qsort_test
$ time ./qsort_test 10000000 > out2
real 0m11.033s
user 0m10.046s
sys 0m0.817s
$ gcc -static qsort_test2.c -o qsort_test2
$ time ./qsort_test2 10000000 > out
real 0m49.588s
user 0m47.356s
sys 0m1.284s
$ gcc -static -L/usr/src/packages/BUILD/glibc-2.3/cc qsort_test2.c -o
qsort_test2
$ time ./qsort_test2 10000000 > out2
real 0m44.355s
user 0m42.948s
sys 0m1.179s
So depending on the test case I observe a speedup of 10-20% for sorting 10
Million keys. The optimized version of the lib is placed
in /usr/src/packages/BUILD/glibc-2.3/cc. Executables linked w/o -L use the
stock glibc from my distro. I linked the attached benchmark programs with
- -static, so I didn't have to install the new shared library during the
development process.
Please perform further benchmarks with my patch and consider it for
inclusion in forthcoming glibc releases.
Best regards,
Wolfgang
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
iD8DBQFAxHBeTif4DCz+puQRAhtGAJ4/qig+Ur+SdsE5RK9BdTvlEJxCPwCg0yHH
qzaWcT02xXsMSt71Xq6GW0w=
=EX6z
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glibc-2.3.3-msort-memopt.diff
Type: text/x-diff
Size: 3140 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msort.c
Type: text/x-csrc
Size: 4783 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: qsort_test.c
Type: text/x-csrc
Size: 636 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: qsort_test2.c
Type: text/x-csrc
Size: 698 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/libc-alpha/attachments/20040607/17812616/attachment-0003.bin>
More information about the Libc-alpha
mailing list