This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 3/2] Use strspn/strcspn/strpbrk ifunc in internal calls.
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Fri, 6 Mar 2015 05:22:48 -0800
- Subject: Re: [PATCH 3/2] Use strspn/strcspn/strpbrk ifunc in internal calls.
- Authentication-results: sourceware.org; auth=none
- References: <20140227123238 dot GA26291 at domone dot podge> <20140227124206 dot GA26474 at domone dot podge> <5318A03D dot 3000705 at redhat dot com> <20140306163241 dot GA11843 at domone dot podge> <5318B58B dot 5040704 at redhat dot com> <20140306205212 dot GB11843 at domone dot podge> <53192422 dot 2050101 at redhat dot com> <20140318100138 dot GC8415 at domone dot podge> <20150306020353 dot GB12857 at vapier>
On Thu, Mar 5, 2015 at 6:03 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> On 18 Mar 2014 11:01, Ondřej Bílka wrote:
>> To make strtok faster and improve performance in general, we need to make
>> one additional change.
>>
>> A comment:
>>
>> /* It doesn't make sense to send libc-internal strcspn calls through a PLT.
>> The speedup we get from using SSE4.2 instruction is likely eaten away
>> by the indirect call in the PLT. */
>>
>> Does not make sense at all, because nobody bothered to check it. The gap
>> between these implementations is quite big: when the haystack is empty, the
>> sse2 version is around 40 cycles slower because it needs to populate a
>> lookup table, and the difference only increases with size. That is much
>> bigger than the PLT slowdown, which is a few cycles.
>>
>> Even the benchtest shows a gap, which may also be reversed by branch
>> misprediction, but my internal benchmark has shown it too.
>>
>>                                    simple_strspn  stupid_strspn  __strspn_sse42  __strspn_sse2
>> Length 0, alignment 0, acc len 6:        18.6562        35.2344         17.0469        61.6719
>> Length 6, alignment 0, acc len 6:        59.5469        72.5781         16.4219        73.625
>>
>> This patch also handles strpbrk, which is implemented by including the
>> x86_64/multiarch/strcspn.S file.
>>
>> * sysdeps/x86_64/multiarch/strspn.S: Remove plt indirection.
>> * sysdeps/x86_64/multiarch/strcspn.S: Likewise.
>
> since H.J. wrote the code, he probably should be the one approving this change
> -mike
Looks good to me. Please commit. Sorry for the long delay.
Thanks.
--
H.J.
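
For readers following the thread: the quoted message attributes the sse2 slowdown on an empty haystack to populating a lookup table before any input is scanned. The sketch below is a plain C illustration of that general technique, not glibc's actual implementation; the function name table_strspn is hypothetical. It only shows why a table-based strspn has a fixed setup cost that is paid even when the string is empty.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration of a table-based strspn; not glibc's code. */
size_t table_strspn(const char *s, const char *accept)
{
    unsigned char table[256];

    /* Fixed setup cost: clear the table and mark every accepted byte.
       This work is done before looking at s, even if s is empty. */
    memset(table, 0, sizeof table);
    for (const unsigned char *a = (const unsigned char *)accept; *a != '\0'; ++a)
        table[*a] = 1;

    /* Scan: table[0] is always 0, so the loop stops at the terminating NUL. */
    const unsigned char *p = (const unsigned char *)s;
    while (table[*p])
        ++p;

    return (size_t)(p - (const unsigned char *)s);
}
```

With an empty s the scan loop exits immediately, so the whole call is dominated by the table setup; that setup is the kind of constant overhead the quoted message compares against the cost of a PLT call.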