This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
On 18 Mar 2014 11:01, Ondřej Bílka wrote:
> To make a strtok faster and improve performance in general we need to do one
> additional change.
>
> A comment:
>
> /* It doesn't make sense to send libc-internal strcspn calls through a PLT.
>    The speedup we get from using SSE4.2 instruction is likely eaten away
>    by the indirect call in the PLT. */
>
> Does not make sense at all because nobody bothered to check it. Gap
> between these implementations is quite big, when haystack is empty a
> sse2 is around 40 cycles slower because it needs to populate a lookup
> table and difference only increases with size. That is much bigger than
> plt slowdown which is few cycles.
>
> Even benchtest show a gap which also may be reverse by branch
> misprediction but my internal benchmark shown.
>
>                                      simple_strspn  stupid_strspn  __strspn_sse42  __strspn_sse2
> Length 0, alignment 0, acc len 6:    18.6562        35.2344        17.0469         61.6719
> Length 6, alignment 0, acc len 6:    59.5469        72.5781        16.4219         73.625
>
> This patch also handles strpbrk which is implemented by including a
> x86_64/multiarch/strcspn.S file.
>
> 	* sysdeps/x86_64/multiarch/strspn.S: Remove plt indirection.
> 	* sysdeps/x86_64/multiarch/strcspn.S: Likewise.

since H.J. wrote the code, he probably should be the one approving this change
-mike
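
For readers of the archive, the fixed cost that the quoted message attributes to the sse2 variant ("it needs to populate a lookup table") can be illustrated with a minimal C sketch. This is not the glibc implementation, and `table_strspn` is a hypothetical name used only here. It assumes the general table-driven approach: clear a 256-entry table, mark the accepted bytes, then scan. The setup runs even when the haystack is empty, which is the kind of overhead visible in the "Length 0" row of the quoted benchmark.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration, not glibc code: a table-driven strspn.
   Returns the length of the initial segment of s that consists only
   of bytes found in accept. */
size_t table_strspn(const char *s, const char *accept)
{
    unsigned char table[256];

    /* Fixed setup cost: paid on every call, even if s is "". */
    memset(table, 0, sizeof table);
    for (const unsigned char *a = (const unsigned char *)accept; *a != '\0'; a++)
        table[*a] = 1;

    /* Scan: table[0] is always 0, so the terminating NUL stops the loop. */
    const unsigned char *p = (const unsigned char *)s;
    while (table[*p])
        p++;

    return (size_t)(p - (const unsigned char *)s);
}
```

With an empty haystack the scan loop exits at once, so essentially all of the call's time goes to the table setup. A variant that avoids building a table does not pay that up-front cost.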