This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] ARM: Add optimized ARMv7 strcmp implementation
- From: Richard Henderson <rth at twiddle dot net>
- To: Will Newton <will dot newton at linaro dot org>, libc-alpha at sourceware dot org
- Cc: Richard Earnshaw <rearnsha at arm dot com>
- Date: Wed, 23 Apr 2014 09:13:24 -0700
- Subject: Re: [PATCH] ARM: Add optimized ARMv7 strcmp implementation
- References: <1398256338-24784-1-git-send-email-will dot newton at linaro dot org>
On 04/23/2014 05:32 AM, Will Newton wrote:
> + clz tmp1, \synd
> + lsl r1, \d2, tmp1
> + .if \restore_r6
> + ldrd r6, r7, [sp, #8]
> + .endif
> + cfi_restore (r6)
> + cfi_restore (r7)
> + lsl \d1, \d1, tmp1
> + cfi_remember_state
> + lsr result, \d1, #24
> + ldrd r4, r5, [sp], #16
> + cfi_restore (r4)
> + cfi_restore (r5)
> + sub result, result, r1, lsr #24
You can minimize the size of the unwind info (avoiding DW_CFA_advance opcodes)
by delaying restores until the previous storage may be clobbered, or until some
other adjustment must be made. Here the storage is clobbered, obviously, by
the SP post-increment that deallocates it.
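For instance, a sketch of what that batching looks like (illustrative, not the
patch's exact sequence; assumes the 16-byte frame established at entry):

    ldrd r6, r7, [sp, #8]     @ slots still valid; no CFI needed yet
    ldrd r4, r5, [sp], #16    @ SP post-increment deallocates the frame
    cfi_def_cfa_offset (0)    @ one advance covers everything below
    cfi_restore (r4)
    cfi_restore (r5)
    cfi_restore (r6)
    cfi_restore (r7)

Each point in the instruction stream where directives are emitted needs its own
DW_CFA_advance; grouped after the final SP adjustment they share a single one.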
> +#else
> + /* To use the big-endian trick we'd have to reverse all three words.
> + that's slower than this approach. */
> + rev \synd, \synd
> + clz tmp1, \synd
> + bic tmp1, tmp1, #7
> + lsr r1, \d2, tmp1
> + cfi_remember_state
> + .if \restore_r6
> + ldrd r6, r7, [sp, #8]
> + .endif
> + cfi_restore (r6)
> + cfi_restore (r7)
Also, it would appear that you're remembering two different states between the
big and little endian versions of this code. The LE version remembers before
restoring r6/r7; the BE version remembers afterward.
I suspect you want
ldrd r4, r5, [sp], #16
cfi_remember_state
cfi_def_cfa_offset(0)
cfi_restore (r4)
cfi_restore (r5)
cfi_restore (r6)
cfi_restore (r7)
which has a single advance within this code sequence, keeps r6/r7 remembered as
saved, and actually describes the stack adjustment.
> + strd r4, r5, [sp, #-16]!
> + cfi_def_cfa_offset (16)
> + cfi_offset (r4, -16)
> + cfi_offset (r5, -12)
> + orr tmp1, src1, src2
> + strd r6, r7, [sp, #8]
> + cfi_offset (r6, -8)
> + cfi_offset (r7, -4)
FWIW, I find using cfi_rel_offset much easier to reason with, since you get to
match up numbers in the cfi data with numbers in the assembly. E.g.
cfi_rel_offset(r6, 8)
cfi_rel_offset(r7, 12)
but perhaps not that helpful in this case.
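For reference, cfi_rel_offset takes the save slot's offset from the current CFA
register (here sp), so with the CFA at sp+16 the two spellings describe the
same slots (offsets as in the hunk above):

    strd r6, r7, [sp, #8]
    cfi_rel_offset (r6, 8)    @ same as cfi_offset (r6, 8 - 16)
    cfi_rel_offset (r7, 12)   @ same as cfi_offset (r7, 12 - 16)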
> +.Lmisaligned_exit:
> + cfi_remember_state
> + mov result, tmp1
> + ldr r4, [sp], #16
> + cfi_restore (r4)
> + bx lr
Even though only r4 was modified and reloaded, r5-r7 are still described as
saved. After popping all of the storage, that's not correct -- you need to
restore all 4 registers. Again, you're not describing the stack adjustment.
Again, no point in doing the remember 2 insns before the restores.
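I.e. something like this (a sketch only; it keeps the remember adjacent to the
restores, on the assumption that later code still wants the saved state):

    .Lmisaligned_exit:
        mov result, tmp1
        ldr r4, [sp], #16
        cfi_remember_state
        cfi_def_cfa_offset (0)
        cfi_restore (r4)
        cfi_restore (r5)
        cfi_restore (r6)
        cfi_restore (r7)
        bx lr

Since r5-r7 were never clobbered, cfi_restore merely reverts them to the
initial rule (value lives in the register), which is exactly right once the
frame is gone.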
Alternatively, since this unwind info isn't for correctness of exception
handling, you could simply drop all this restore business and let the unwind
info be incorrect for the 1-2 insns before the return.
r~