This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug string/23709] glibc 2.25 lacks sse2 optimized strstr()


https://sourceware.org/bugzilla/show_bug.cgi?id=23709

--- Comment #22 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to paul.borile from comment #21)
> So if cpu has the AVX set we loose all the SSE2 optimizations but we keep
> the avx ones, correct ?

Not really. In fact, only Haswell chips will have this behavior [1]:

Microarchitecture  Core       Ext. Family  Family  Ext. Model  Model  CPUID
Haswell (Client)   GT3E       0            0x6     0x4         0x6    Family 6 Model 70
                   ULT        0            0x6     0x4         0x5    Family 6 Model 69
                   S          0            0x6     0x3         0xC    Family 6 Model 60

Haswell (Server)   E, EP, EX  0            0x6     0x3         0xF    Family 6 Model 63
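
As a rough illustration of how the "Family 6 Model N" values relate to the hex fields listed above, here is a minimal sketch. It assumes the two hex columns just before "Family 6 Model N" are the extended model and model fields, which is consistent with the arithmetic (0x46 = 70, 0x3C = 60). The function names are hypothetical and are not part of glibc.

```c
#include <assert.h>

/* Combine the CPUID fields into the displayed model number.  For
   families 0x6 and 0xF the extended model supplies the high nibble.  */
static unsigned int
display_model (unsigned int family, unsigned int ext_model,
               unsigned int model)
{
  /* Each field is a 4-bit value.  */
  assert (family <= 0xF && ext_model <= 0xF && model <= 0xF);
  if (family == 0x6 || family == 0xF)
    return (ext_model << 4) | model;
  return model;
}

/* Return 1 if the fields match one of the four Haswell entries
   listed above (models 60, 63, 69, 70 of family 6), else 0.  */
static int
is_listed_haswell (unsigned int family, unsigned int ext_model,
                   unsigned int model)
{
  if (family != 0x6)
    return 0;
  switch (display_model (family, ext_model, model))
    {
    case 60:
    case 63:
    case 69:
    case 70:
      return 1;
    default:
      return 0;
    }
}
```

For example, display_model (0x6, 0x3, 0xC) yields 60, matching the Haswell "S" row.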

On these chips the internal glibc flags bit_arch_Fast_Rep_String,
bit_arch_Fast_Unaligned_Load, bit_arch_Fast_Unaligned_Copy, and
bit_arch_Prefer_PMINUB_for_stringop won't be set:

The bit_arch_Fast_Rep_String flag is only used for ifunc selection on i686
(32-bit), where it selects the *ssse3_rep* memcpy, memmove, bcopy, and mempcpy.
If it is not set, the *ssse3* variant is used instead (I am not sure what the
performance difference between them is).

The bit_arch_Fast_Unaligned_Load flag influences both i686 and x86_64. On
x86_64 it controls the selection of the SSE2 unaligned variants of stpncpy,
strcpy, strncpy, stpcpy, strncat, strcat, and strstr.  When it is not set, all
of these except strstr use an ssse3 or sse2 variant instead (I am not sure what
the performance difference between them is either).
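
The way a flag such as Fast_Unaligned_Load steers the choice of a strstr variant can be sketched roughly as below. This is a simplified illustration of flag-based dispatch, not glibc's actual ifunc code: the selector, the flag parameter, and both stub functions are hypothetical, and the stubs simply forward to the standard strstr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef char *(*strstr_fn) (const char *, const char *);

/* Hypothetical stand-in for an SSE2 unaligned variant.  */
static char *
strstr_sse2_unaligned_stub (const char *haystack, const char *needle)
{
  assert (haystack != NULL && needle != NULL);
  return strstr (haystack, needle);
}

/* Hypothetical stand-in for whatever fallback is used otherwise.  */
static char *
strstr_fallback_stub (const char *haystack, const char *needle)
{
  assert (haystack != NULL && needle != NULL);
  return strstr (haystack, needle);
}

/* Pick an implementation once, based on a CPU feature flag
   (assumed here to be passed in as a plain boolean).  */
static strstr_fn
select_strstr (bool fast_unaligned_load)
{
  return fast_unaligned_load ? strstr_sse2_unaligned_stub
                             : strstr_fallback_stub;
}
```

With the flag cleared, as described for the Haswell models above, select_strstr (false) returns the fallback stub; the result is still a correct strstr, only the implementation differs.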

The bit_arch_Fast_Unaligned_Copy flag influences mempcpy, memmove, and memcpy.
If the chip does not have SSE3, the bit selects either an ERMS or an unaligned
variant. For Haswell the *_avx_unaligned_erms variants will be selected, so
this bit won't interfere with the best selection.

The bit_arch_Prefer_PMINUB_for_stringop flag is not used for ifunc selection.

[1] https://en.wikichip.org/wiki/intel/cpuid

