This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Status of strcmp.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Wed, 14 Aug 2013 23:51:11 +0200
- Subject: Status of strcmp.
- References: <20130807140911 dot GA31968 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ926EE-MYDJR5Eftf+DUefBg-Gox0pw57vZ7XUwsO3OPJg at mail dot gmail dot com> <20130808190716 dot GA4589 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ92+C6uXyrUhTd3OWuoa6v2SeUaKLBuqaNX5Sqtn4ANBdg at mail dot gmail dot com> <CAHjhQ90S-1uBhwV44KODTcQkr=0U-P+_9Pu0O=RbYYY9e82JCA at mail dot gmail dot com> <20130809164420 dot GB4972 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ91rFwppQ4ixhPNuB9xe8FH9OrEoz3=eFrTQTscwOvSBCA at mail dot gmail dot com>
On Wed, Aug 14, 2013 at 11:46:23AM +0400, Liubov Dmitrieva wrote:
Sorry, but there was an error that I did not notice. When I generated
the implementation, gcc optimized away my check for crossing a page
boundary. Without that check, performance was much better on big sizes.
I uploaded a fixed version. It is better when there are unaligned loads,
but not when it competes with the ssse3 one.
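
For illustration, a page-cross check of the kind mentioned above could
look roughly like this in C. This is only a sketch, not the code from the
uploaded implementation; the 4096-byte page size, the 16-byte load width
and the name crosses_page are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed values: 4096-byte pages and 16-byte vector loads.  */
#define PAGE_SIZE 4096u
#define VEC_SIZE 16u

/* Hypothetical helper: returns nonzero when a VEC_SIZE-byte load
   starting at p would touch the following page.  */
static int
crosses_page (const void *p)
{
  uintptr_t offset = (uintptr_t) p & (PAGE_SIZE - 1);
  return offset > PAGE_SIZE - VEC_SIZE;
}
```

A load starting at offset 4080 within a page covers bytes 4080..4095 and
stays inside it, while one starting at offset 4081 does not.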
The fix did not change the critical part, which is when the strings
differ in the first 16 characters. In practice most of the time is spent
there, and the new implementation considerably improves this case. On the
gcc benchmark the performance ratios were left almost unchanged by
slowing the loop down.
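
To make the "header" idea concrete, here is a rough C sketch that handles
a difference in the first 16 bytes with one SSE2 compare and otherwise
falls back to a plain byte loop. It is not the implementation under
discussion: the name strcmp_sketch, the 4096-byte page size and the
byte-loop tail are assumptions for the example. The 16-byte loads may read
past the terminating NUL (staying inside the page), which real
implementations rely on but which is not strictly conforming C.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical sketch: 16-byte header, then a simple byte loop.  */
static int
strcmp_sketch (const char *a, const char *b)
{
#if defined(__SSE2__)
  /* Use the header only when neither 16-byte load can touch the
     next page (4096-byte pages assumed).  */
  if (((uintptr_t) a & 4095) <= 4096 - 16
      && ((uintptr_t) b & 4095) <= 4096 - 16)
    {
      __m128i va = _mm_loadu_si128 ((const __m128i *) a);
      __m128i vb = _mm_loadu_si128 ((const __m128i *) b);
      /* Bit i of eq is set where a[i] == b[i].  */
      unsigned eq = (unsigned) _mm_movemask_epi8 (_mm_cmpeq_epi8 (va, vb));
      /* Bit i of nul is set where a[i] is the terminator.  */
      unsigned nul = (unsigned) _mm_movemask_epi8
        (_mm_cmpeq_epi8 (va, _mm_setzero_si128 ()));
      /* Bit i of mask is set where the bytes differ or a ends.  */
      unsigned mask = (~eq | nul) & 0xffffu;
      if (mask != 0)
        {
          unsigned i = (unsigned) __builtin_ctz (mask);
          return (unsigned char) a[i] - (unsigned char) b[i];
        }
      /* First 16 bytes are equal and contain no NUL.  */
      a += 16;
      b += 16;
    }
#endif
  /* Fallback and tail: plain byte-by-byte comparison.  */
  while (*a != '\0' && *a == *b)
    {
      a++;
      b++;
    }
  return (unsigned char) *a - (unsigned char) *b;
}
```

When the strings differ within the first 16 bytes and no page boundary is
near, the function returns straight from the header without entering any
loop.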
The reason these sizes are important is empirical: when I did
measurements, around 90% of calls were of this type. You could also
consider the likely use cases. In sorting and searching, strings will
likely differ in the first character; in checking against a fixed word,
there are only a few words with 16 characters or more. Also, when I
cross-checked this against how large the strings passed to strlen are,
they are most of the time shorter than 80 characters, which also supports
the importance of the header.
It is better for sizes up to 64 bytes, which means that the header does
a good job.
This also touches on a limitation of my benchmark, which is the selection
of the distribution. These checks tend to be corner cases, and from them
alone it is hard to say whether more regressions are introduced than
fixed.
Basically, for each pair of implementations I could find a distribution
which says that A is a regression, but also a distribution which says
that B is a regression.
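
A small worked example of this effect, as a C sketch. All numbers here are
made up for illustration and are not measurements; the three size buckets,
the per-call costs and the two distributions are assumptions.

```c
#include <stddef.h>

/* Expected cost per call: sum over buckets of cost * weight.  */
static double
expected_cost (const double *cost, const double *weight, size_t n)
{
  double total = 0.0;
  for (size_t i = 0; i < n; i++)
    total += cost[i] * weight[i];
  return total;
}

/* Hypothetical per-call costs for three buckets: difference within
   16 bytes, within 17..64 bytes, and beyond 64 bytes.  */
static const double cost_a[3] = { 10.0, 30.0, 200.0 };
static const double cost_b[3] = { 14.0, 32.0, 120.0 };

/* Two hypothetical call distributions over the same buckets.  */
static const double short_heavy[3] = { 0.90, 0.08, 0.02 };
static const double long_heavy[3] = { 0.20, 0.20, 0.60 };
```

With these numbers A costs 15.4 against 17.56 for B under short_heavy,
but 128 against 81.2 under long_heavy, so each one looks like a
regression under one of the two distributions.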
The best course could be to use a 16-byte ssse3 loop or something else;
I do not know yet.