This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [RFC] faster strcmp by avoiding sse42.

From: Ondřej Bílka <neleai at seznam dot cz>
Cc: libc-alpha at sourceware dot org
Date: Wed, 7 Aug 2013 14:28:03 +0200
Subject: Re: [RFC] faster strcmp by avoiding sse42.
References: <20130806213033 dot GA5290 at domone dot kolej dot mff dot cuni dot cz>

On Tue, Aug 06, 2013 at 11:30:33PM +0200, Ondřej Bílka wrote:
> Hi,
> 
> Continuing to improve implementations that needlessly use sse42, we move
> to strcmp. strcmp_sse42 is actually faster than the existing
> implementations. This is mostly caused by the lack of unrolling in the other
> implementations rather than by sse4 itself.
>
Hi, I recalled that in strlen I could improve loop speed by using memory operands instead of registers.
This improves performance, especially when the data is in L2 cache or further out.
Updated graphs are at

http://kam.mff.cuni.cz/~ondra/benchmark_string/strcmp_profile.html

http://kam.mff.cuni.cz/~ondra/strcmp_profile070813.tar.bz2 

The optimized loop that I use is the following:

.L17:
        addq    $64, %rdi
        addq    $64, %rsi
.L12:
        movdqu  (%rsi), %xmm4
        pcmpeqb (%rdi), %xmm4
        pminub  (%rdi), %xmm4
        movdqu  16(%rsi), %xmm3
        pcmpeqb 16(%rdi), %xmm3
        pminub  16(%rdi), %xmm3
        movdqu  32(%rsi), %xmm2
        pcmpeqb 32(%rdi), %xmm2
        pminub  32(%rdi), %xmm2
        movdqu  48(%rsi), %xmm0
        pcmpeqb 48(%rdi), %xmm0
        pminub  48(%rdi), %xmm0
        pminub  %xmm4, %xmm0
        pminub  %xmm3, %xmm0
        pminub  %xmm2, %xmm0
        pcmpeqb %xmm6, %xmm0
        pmovmskb        %xmm0, %eax
        testl   %eax, %eax
        je      .L17
        jmp     .L15
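
For readers who prefer intrinsics, the check done by one iteration of the loop above can be sketched roughly in C as follows. This is only an illustration, not code from the patch: the name block64_stop_mask is hypothetical, and it assumes that %xmm6 holds zero and that %rdi is 16-byte aligned (legacy SSE memory operands such as "pcmpeqb (%rdi), %xmm4" require that alignment), neither of which is shown in the mail. The idea is that pcmpeqb yields 0xFF for equal bytes and 0x00 for different ones, so taking pminub with the %rdi data gives a byte that is zero exactly when the bytes differ or the string ends.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper, not part of the original mail.
   Returns nonzero if the 64 bytes at s1 and s2 contain a mismatch
   or if s1 contains a NUL byte, and zero otherwise.
   s1 plays the role of %rdi and must be 16-byte aligned;
   s2 plays the role of %rsi and may be unaligned (movdqu). */
static int block64_stop_mask(const unsigned char *s1, const unsigned char *s2)
{
    assert(((uintptr_t)s1 & 15) == 0);

    __m128i acc = _mm_set1_epi8(-1);  /* all bytes 0xFF, neutral for min */
    for (int off = 0; off < 64; off += 16) {
        __m128i a  = _mm_load_si128((const __m128i *)(s1 + off));
        __m128i b  = _mm_loadu_si128((const __m128i *)(s2 + off));
        __m128i eq = _mm_cmpeq_epi8(a, b);  /* 0xFF if equal, 0x00 if not */
        __m128i t  = _mm_min_epu8(eq, a);   /* a[i] if equal, 0 if not */
        acc = _mm_min_epu8(acc, t);         /* fold the four 16-byte chunks */
    }

    /* Assumed equivalent of "pcmpeqb %xmm6, %xmm0" with %xmm6 == 0. */
    __m128i z = _mm_cmpeq_epi8(acc, _mm_setzero_si128());
    return _mm_movemask_epi8(z);
}
```

A zero return corresponds to taking the "je .L17" branch and advancing both pointers by 64. A nonzero return corresponds to the jump to .L15, where the exact position still has to be found, because folding the four chunks with pminub discards which chunk contained the zero byte.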

Follow-Ups:
- Re: [RFC] faster strcmp by avoiding sse42.
  - From: Richard Henderson

References:
- [RFC] faster strcmp by avoiding sse42.
  - From: Ondřej Bílka
