This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 6/7] stdlib: Optimization qsort{_r} swap implementation



On 23/01/2018 04:04, Paul Eggert wrote:
> Adhemerval Zanella wrote:
>> At the cost of larger text size and slightly more code:
> 
> Yes, that's a common tradeoff for this sort of optimization. My guess is that most glibc users these days would like to spend 4 kB of text space to gain a 2%-or-so CPU speedup. (But it's just a guess. :-)
>> I still prefer my version, which generates a shorter text segment and also
>> optimizes for uint32_t.
> 
> The more-inlined version could also optimize for uint32_t. Such an optimization should not change the machine code on platforms with 32-bit pointers (since uint32_t has the same size and alignment restrictions as void *, and GCC should be smart enough to figure this out) but should speed up the size-4 case on platforms with 64-bit pointers.
> 
> Any thoughts on why the more-inlined version is a bit slower when input is already sorted?

Again, do we really need to over-engineer it? Profiling GCC shows that 95% of
all qsort calls are issued with up to 9 elements, and 92% use a key size of 8.
Firefox is somewhat more diverse, with 72% of calls using up to 17 elements and
95% using a key size of 8.  I think that adding even more code complexity by
parametrizing the qsort calls to inline the swap operations won't really make
much difference in the aforementioned use cases.
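
For reference, the kind of size-specialized swap being discussed can be
sketched roughly as below.  This is only an illustration, not the code from
the patch: the names swap_kind, select_swap and swap_elems are hypothetical,
and it relies on the assumption that compilers lower a constant-size memcpy to
plain loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not the patch itself: the swap strategy is chosen
   once per sort call, based on the element size.  */
enum swap_kind { SWAP_U32, SWAP_U64, SWAP_BYTES };

static enum swap_kind
select_swap (size_t size)
{
  if (size == sizeof (uint64_t))
    return SWAP_U64;
  if (size == sizeof (uint32_t))
    return SWAP_U32;
  return SWAP_BYTES;
}

static void
swap_elems (void *a, void *b, size_t size, enum swap_kind kind)
{
  if (kind == SWAP_U64)
    {
      /* Constant-size memcpy avoids alignment and aliasing problems.  */
      uint64_t ta, tb;
      memcpy (&ta, a, sizeof ta);
      memcpy (&tb, b, sizeof tb);
      memcpy (a, &tb, sizeof tb);
      memcpy (b, &ta, sizeof ta);
    }
  else if (kind == SWAP_U32)
    {
      uint32_t ta, tb;
      memcpy (&ta, a, sizeof ta);
      memcpy (&tb, b, sizeof tb);
      memcpy (a, &tb, sizeof tb);
      memcpy (b, &ta, sizeof ta);
    }
  else
    {
      /* Generic fallback: swap one byte at a time.  */
      unsigned char *pa = a, *pb = b;
      while (size-- > 0)
        {
          unsigned char t = *pa;
          *pa++ = *pb;
          *pb++ = t;
        }
    }
}
```

With key size 8 dominating the profiles above, nearly every call would take
the SWAP_U64 path, which is why further inlining seems unlikely to pay off.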

I would rather add specialized sort implementations, as the BSD family does
with heapsort and mergesort, to provide different algorithms for different
constraints (mergesort for stable sorting, heapsort/mergesort to avoid the
worst case of quicksort). We might even extend it to add something like
introsort.
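
To make the introsort idea concrete, here is a minimal sketch under some
simplifying assumptions: it sorts a plain int array rather than taking the
qsort interface, and the names (introsort, SMALL_CUTOFF and the helpers) are
made up for illustration.  It runs quicksort with a recursion budget of about
2*log2(n), switches to heapsort when the budget runs out, and finishes small
ranges with insertion sort.

```c
#include <stddef.h>

#define SMALL_CUTOFF 16  /* Hypothetical threshold for insertion sort.  */

static void
swap_int (int *a, int *b)
{
  int t = *a; *a = *b; *b = t;
}

static void
insertion_sort (int *v, size_t n)
{
  for (size_t i = 1; i < n; i++)
    {
      int key = v[i];
      size_t j = i;
      for (; j > 0 && v[j - 1] > key; j--)
        v[j] = v[j - 1];
      v[j] = key;
    }
}

/* Restore the max-heap property below ROOT in a heap of N elements.  */
static void
sift_down (int *v, size_t root, size_t n)
{
  for (size_t child; (child = 2 * root + 1) < n; root = child)
    {
      if (child + 1 < n && v[child] < v[child + 1])
        child++;
      if (v[root] >= v[child])
        break;
      swap_int (&v[root], &v[child]);
    }
}

static void
heap_sort (int *v, size_t n)
{
  for (size_t i = n / 2; i-- > 0;)
    sift_down (v, i, n);
  for (size_t end = n; end-- > 1;)
    {
      swap_int (&v[0], &v[end]);
      sift_down (v, 0, end);
    }
}

static void
introsort_loop (int *v, size_t n, unsigned depth)
{
  while (n > SMALL_CUTOFF)
    {
      if (depth-- == 0)
        {
          /* Budget exhausted: fall back to guaranteed O(n log n).  */
          heap_sort (v, n);
          return;
        }
      /* Median of three, moved to the end to serve as the pivot.  */
      size_t mid = n / 2;
      if (v[mid] < v[0]) swap_int (&v[mid], &v[0]);
      if (v[n - 1] < v[0]) swap_int (&v[n - 1], &v[0]);
      if (v[n - 1] < v[mid]) swap_int (&v[n - 1], &v[mid]);
      swap_int (&v[mid], &v[n - 1]);
      int pivot = v[n - 1];
      /* Lomuto partition: elements below the pivot go to the front.  */
      size_t store = 0;
      for (size_t i = 0; i + 1 < n; i++)
        if (v[i] < pivot)
          swap_int (&v[i], &v[store++]);
      swap_int (&v[store], &v[n - 1]);
      /* Recurse on the left part, iterate on the right part.  */
      introsort_loop (v, store, depth);
      v += store + 1;
      n -= store + 1;
    }
  insertion_sort (v, n);
}

static void
introsort (int *v, size_t n)
{
  unsigned depth = 0;
  for (size_t m = n; m > 1; m >>= 1)
    depth += 2;  /* Roughly 2 * floor (log2 (n)).  */
  introsort_loop (v, n, depth);
}
```

For the small inputs seen in the profiles, such a routine would go straight
to the insertion sort path, while the heapsort fallback bounds the worst case
on adversarial inputs.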

