This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: Make string/memory functions optimized for unaligned SSE2 as default


On Thu, Aug 27, 2015 at 10:02:36AM -0700, H.J. Lu wrote:
> The current default string/memory functions for x86-64 in libc and ld.so were
> implemented before SSE is allowed in ld.so and unaligned SSE load/store
> is faster on most processors.  Today, we can use the same string/memory
> functions in libc and ld.so, most of x86-64 processors have fast unaligned
> SSE load/store.  We should update the default string/memory functions for
> x86-64 to unaligned SSE2 version.  Those functions are
> 
> memcpy-sse2-unaligned.S   strcat-sse2-unaligned.S  strncat-sse2-unaligned.S
> stpcpy-sse2-unaligned.S   strcmp-sse2-unaligned.S  strncpy-sse2-unaligned.S
> stpncpy-sse2-unaligned.S  strcpy-sse2-unaligned.S  strstr-sse2-unaligned.S
> 
I would like that. 

As I recall, these are also faster on older processors in practice. I do
not have data right now; I omitted the sse2 implementations from my tests
because they were too slow to consider, so I would need to retest. I will
write again once I have retested.

I had patches that improve the ssse3 implementations by handling sizes up
to 64 bytes as in the unaligned case and then using ssse3 to align; the
same applies to sse2. So until these patches go in, the unaligned versions
should be the default, except perhaps for memset and memcpy.
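
To illustrate what "handling sizes up to 64 bytes as in the unaligned
case" could look like, here is a rough C sketch using SSE2 intrinsics. It
is not the glibc assembly and not the patches mentioned above; the name
copy_upto_64 is made up for this example. The idea it shows is the usual
one: load a few possibly overlapping unaligned blocks from the front and
the back of the buffer, then store them, so no byte loop is needed. The
path for larger sizes (where aligning would come in) is left out.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_loadu_si128, _mm_storeu_si128 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only: copy n bytes, 0 <= n <= 64.
   All loads are issued before any store in each branch. */
static void copy_upto_64(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n >= 32) {
        /* 32..64 bytes: two blocks from the front, two from the back.
           The blocks overlap in the middle when n < 64. */
        __m128i a = _mm_loadu_si128((const __m128i *)s);
        __m128i b = _mm_loadu_si128((const __m128i *)(s + 16));
        __m128i c = _mm_loadu_si128((const __m128i *)(s + n - 32));
        __m128i e = _mm_loadu_si128((const __m128i *)(s + n - 16));
        _mm_storeu_si128((__m128i *)d, a);
        _mm_storeu_si128((__m128i *)(d + 16), b);
        _mm_storeu_si128((__m128i *)(d + n - 32), c);
        _mm_storeu_si128((__m128i *)(d + n - 16), e);
    } else if (n >= 16) {
        /* 16..31 bytes: first and last 16 bytes, possibly overlapping. */
        __m128i a = _mm_loadu_si128((const __m128i *)s);
        __m128i b = _mm_loadu_si128((const __m128i *)(s + n - 16));
        _mm_storeu_si128((__m128i *)d, a);
        _mm_storeu_si128((__m128i *)(d + n - 16), b);
    } else if (n >= 8) {
        /* 8..15 bytes: same trick with 8-byte scalars. */
        uint64_t a, b;
        memcpy(&a, s, 8);
        memcpy(&b, s + n - 8, 8);
        memcpy(d, &a, 8);
        memcpy(d + n - 8, &b, 8);
    } else if (n >= 4) {
        /* 4..7 bytes. */
        uint32_t a, b;
        memcpy(&a, s, 4);
        memcpy(&b, s + n - 4, 4);
        memcpy(d, &a, 4);
        memcpy(d + n - 4, &b, 4);
    } else if (n >= 2) {
        /* 2..3 bytes. */
        uint16_t a, b;
        memcpy(&a, s, 2);
        memcpy(&b, s + n - 2, 2);
        memcpy(d, &a, 2);
        memcpy(d + n - 2, &b, 2);
    } else if (n == 1) {
        d[0] = s[0];
    }
}
```

Whether this beats an aligned loop depends on the processor having fast
unaligned SSE load/store, which is the premise of the proposal.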


