This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug libc/1262] New: Use larger-block comparisons for memcmp on x86?
- From: "greenrd at greenrd dot org" <sourceware-bugzilla at sources dot redhat dot com>
- To: glibc-bugs at sources dot redhat dot com
- Date: 29 Aug 2005 00:32:25 -0000
- Subject: [Bug libc/1262] New: Use larger-block comparisons for memcmp on x86?
- Reply-to: sourceware-bugzilla at sources dot redhat dot com
I noticed an string-comparison inefficiency in libgcj (the support library for
the GNU Compiler for Java), and it turns out that memcmp in glibc is even slower
than the inefficient code I found in libgcj (when code is compiled with
-march=386 and run on athlon, anyway).
Currently, memcmp compares 8 bits at a time. This could be quadrupled to 32 bits
at a time (i.e. sizeof (void *)) for -march=386, and even 128 bits at a time for
chips that support SSE2/altivec! However, I haven't tested the latter.
Here's snippets that show the general idea (although it does not conform to GNU
coding style). I *don't* claim that this is optimal, but I do claim that it is
functionally correct! Rdg is my initials. Note that jchar is the Java char which
is always 16 bits. Of course in glibc, sizeof(jchar) would be eliminated, or
replaced with 2 for the case of comparing wide chars.
#define RDG_ARRAYS_EQUAL { while (--i >= 0) \
{ \
if (*a1++ != *a2++) \
return false; \
} \
return true; \
}
#define CHARS_PER_PTR (sizeof (void *) / sizeof (jchar))
inline jboolean RdgPtrArraysEqual (void **a1, void **a2, jint i) RDG_ARRAYS_EQUAL
inline jboolean RdgJCharArraysEqual (jchar *a1, jchar *a2, jint i) RDG_ARRAYS_EQUAL
...
return
(RdgPtrArraysEqual ((void **) xptr, (void **) yptr, i /= CHARS_PER_PTR)
&& RdgJCharArraysEqual (xptr + i * CHARS_PER_PTR, yptr + i * CHARS_PER_PTR,
count % CHARS_PER_PTR));
With glibc-2.3.90-8 on fedora rawhide, this results in 51% speedups for 220 byte
equal strings for -march=i386 running on athlon. (The actual memcmp improvement
is bigger, because the method is doing more than just memcmp.) Perhaps
surprisingly, there is no noticeable slowdown for this algorithm when comparing
strings that differ in the first character.
--
Summary: Use larger-block comparisons for memcmp on x86?
Product: glibc
Version: unspecified
Status: NEW
Severity: enhancement
Priority: P2
Component: libc
AssignedTo: gotom at debian dot or dot jp
ReportedBy: greenrd at greenrd dot org
CC: glibc-bugs at sources dot redhat dot com
http://sources.redhat.com/bugzilla/show_bug.cgi?id=1262
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.