This is the mail archive of the mailing list for the glibc project.


GNU C Library master sources branch master updated. glibc-2.26.9000-970-g2bce01e

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  2bce01ebbaf8db52ba4a5635eb5744f989cdbf69 (commit)
      from  243b63337c2c02f30ec3a988ecc44bc0f6ffa0ad (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------;a=commitdiff;h=2bce01ebbaf8db52ba4a5635eb5744f989cdbf69

commit 2bce01ebbaf8db52ba4a5635eb5744f989cdbf69
Author: Siddhesh Poyarekar <>
Date:   Wed Dec 13 18:50:27 2017 +0530

    aarch64: Improve strcmp unaligned performance

    Replace the simple byte-wise compare in the misaligned case with a
    dword compare with page boundary checks in place.  For simplicity I've
    chosen a 4K page boundary so that we don't have to query the actual
    page size on the system.

    This results in up to 3x improvement in performance in the unaligned
    case on falkor and about 2.5x improvement on mustang as measured using

    	* sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a
    	time whenever possible.

diff --git a/ChangeLog b/ChangeLog
index 22df17b..a5419e1 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2017-12-13  Siddhesh Poyarekar  <>
+	* sysdeps/aarch64/strcmp.S (misaligned8): Compare dword at a
+	time whenever possible.
 2017-12-12  Carlos O'Donell <>
 	* elf/Makefile [$(nss-crypt)$(static-nss-crypt) == yesno]
diff --git a/sysdeps/aarch64/strcmp.S b/sysdeps/aarch64/strcmp.S
index e99d662..c260e1d 100644
--- a/sysdeps/aarch64/strcmp.S
+++ b/sysdeps/aarch64/strcmp.S
@@ -72,6 +72,7 @@ L(start_realigned):
 	cbz	syndrome, L(loop_aligned)
 	/* End of performance-critical section  -- one 64B cache line.  */
+L(end):
 #ifndef	__AARCH64EB__
 	rev	syndrome, syndrome
 	rev	data1, data1
@@ -145,12 +146,38 @@ L(mutual_align):
 	b	L(start_realigned)
 L(misaligned8):
-	/* We can do better than this.  */
+	/* Align SRC1 to 8 bytes and then compare 8 bytes at a time, always
+	   checking to make sure that we don't access beyond page boundary in
+	   SRC2.  */
+	tst	src1, #7
+	b.eq	L(loop_misaligned)
+L(do_misaligned):
 	ldrb	data1w, [src1], #1
 	ldrb	data2w, [src2], #1
 	cmp	data1w, #1
 	ccmp	data1w, data2w, #0, cs	/* NZCV = 0b0000.  */
-	b.eq	L(misaligned8)
+	b.ne	L(done)
+	tst	src1, #7
+	b.ne	L(misaligned8)
+L(loop_misaligned):
+	/* Test if we are within the last dword of the end of a 4K page.  If
+	   yes then jump back to the misaligned loop to copy a byte at a time.  */
+	and	tmp1, src2, #0xff8
+	eor	tmp1, tmp1, #0xff8
+	cbz	tmp1, L(do_misaligned)
+	ldr	data1, [src1], #8
+	ldr	data2, [src2], #8
+	sub	tmp1, data1, zeroones
+	orr	tmp2, data1, #REP8_7f
+	eor	diff, data1, data2	/* Non-zero if differences found.  */
+	bic	has_nul, tmp1, tmp2	/* Non-zero if NUL terminator.  */
+	orr	syndrome, diff, has_nul
+	cbz	syndrome, L(loop_misaligned)
+	b	L(end)
+L(done):
 	sub	result, data1, data2


Summary of changes:
 ChangeLog                |    5 +++++
 sysdeps/aarch64/strcmp.S |   31 +++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 2 deletions(-)
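
For readers who do not read AArch64 assembly, the misaligned path in the
patch can be sketched in C as below.  This is only an illustration of the
idea, not the glibc implementation: the names has_nul, near_page_end and
sketch_strcmp are hypothetical, and the sketch assumes both buffers are
readable for at least 8 bytes past the terminator (portable C cannot rely
on page granularity the way the assembly does).  The page test mirrors the
and/eor/cbz sequence: it is true when bits 3..11 of the address are all
set, i.e. the pointer is in the last dword of a 4K page.

```c
#include <stdint.h>
#include <string.h>

#define REP8_01 0x0101010101010101ULL
#define REP8_7F 0x7f7f7f7f7f7f7f7fULL

/* Non-zero iff some byte of x is 0x00; this is the sub/orr/bic sequence
   from the patch: (x - 0x01..01) & ~(x | 0x7f..7f).  */
static uint64_t has_nul(uint64_t x)
{
    return (x - REP8_01) & ~(x | REP8_7F);
}

/* True if p falls in the last 8-byte slot of a 4K page, where an 8-byte
   load might run into the next page.  4K is an assumption, as in the
   commit message.  */
static int near_page_end(const void *p)
{
    return ((uintptr_t)p & 0xff8) == 0xff8;
}

/* Hypothetical sketch of the misaligned path.  Assumes both buffers are
   readable for 8 bytes past the NUL terminator.  */
static int sketch_strcmp(const char *s1, const char *s2)
{
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;

    for (;;) {
        /* Dword step: only when a is 8-byte aligned and b is not in the
           last dword of its page.  */
        if (((uintptr_t)a & 7) == 0 && !near_page_end(b)) {
            uint64_t x, y;
            memcpy(&x, a, 8);
            memcpy(&y, b, 8);
            if (((x ^ y) | has_nul(x)) == 0) {
                a += 8;
                b += 8;
                continue;
            }
            /* A difference or NUL is in this dword; the byte steps
               below locate it.  */
        }
        /* Byte step: also used to align a and to cross page ends.  */
        if (*a == 0 || *a != *b)
            return *a - *b;
        a++;
        b++;
    }
}
```

Unlike this sketch, the assembly does not fall back to byte steps once a
dword differs; it branches to the existing L(end) code, which derives the
result from the syndrome word directly.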

GNU C Library master sources
