This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [Patch, MIPS] Modify memcpy.S for mips32r6/mips64r6
- From: Richard Henderson <rth at twiddle dot net>
- To: sellcey at imgtec dot com, Joseph Myers <joseph at codesourcery dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Tue, 23 Dec 2014 09:52:56 -0800
- Subject: Re: [Patch, MIPS] Modify memcpy.S for mips32r6/mips64r6
- References: <7ec2bf7e-fc1e-428b-ac0a-747f2a3ab3e6 at BAMAIL02 dot ba dot imgtec dot org> <alpine dot DEB dot 2 dot 10 dot 1412221758190 dot 5278 at digraph dot polyomino dot org dot uk> <1419354526 dot 27606 dot 73 dot camel at ubuntu-sellcey>
On 12/23/2014 09:08 AM, Steve Ellcey wrote:
> + andi t8,a0,7
> + lapc t9,L(atable)
> + PTR_LSA t9,t8,t9,2
> + jrc t9
> +L(atable):
> + bc L(lb0)
> + bc L(lb7)
> + bc L(lb6)
> + bc L(lb5)
> + bc L(lb4)
> + bc L(lb3)
> + bc L(lb2)
> + bc L(lb1)
> +L(lb7):
> + lb a3, 6(a1)
> + sb a3, 6(a0)
> +L(lb6):
> + lb a3, 5(a1)
> + sb a3, 5(a0)
> +L(lb5):
> + lb a3, 4(a1)
> + sb a3, 4(a0)
> +L(lb4):
> + lb a3, 3(a1)
> + sb a3, 3(a0)
> +L(lb3):
> + lb a3, 2(a1)
> + sb a3, 2(a0)
> +L(lb2):
> + lb a3, 1(a1)
> + sb a3, 1(a0)
> +L(lb1):
> + lb a3, 0(a1)
> + sb a3, 0(a0)
L(lbx):
> +
> + li t9,8
> + subu t8,t9,t8
> + PTR_SUBU a2,a2,t8
> + PTR_ADDU a0,a0,t8
> + PTR_ADDU a1,a1,t8
> +L(lb0):
This table is regular enough that I wonder if it wouldn't be better to do some
arithmetic instead of a branch-to-branch. E.g.
andi t7,a0,7
li t8,L(lb0)-L(lbx)
lsa t8,t7,t8,3
lapc t9,L(lb0)
selnez t8,t8,t7
PTR_SUBU t9,t9,t8
jrc t9
This is certainly smaller than your 12 insns (7 instead), unlikely to be slower on
any conceivable hardware, and probably faster on most.
r~
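
A minimal C model of the address arithmetic discussed above, offered as a sketch rather than as code from the patch or the reply. It assumes each lb/sb pair occupies 8 bytes (two 4-byte instructions, no microMIPS). The names head_bytes, entry_offset_before_lbx, copy_head and check_all_alignments are hypothetical helpers invented for illustration. One point worth checking in any arithmetic version: L(atable) maps a misalignment of k to L(lb(8-k)), so the quantity to scale by 8 is the complement of k, not k itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: one lb/sb pair is two 4-byte instructions. */
enum { PAIR_BYTES = 8 };

/* L(atable) from the quoted patch: index is dst & 7, value is N in L(lbN). */
static const unsigned char atable[8] = { 0, 7, 6, 5, 4, 3, 2, 1 };

/* The same mapping by arithmetic: bytes needed to make dst 8-byte aligned. */
static unsigned head_bytes(uintptr_t dst)
{
    return (unsigned)((8u - (dst & 7u)) & 7u);
}

/* Distance in bytes from the entry label L(lbN) forward to L(lbx).
   A result of 0 means dst is already aligned; the assembly then jumps
   to L(lb0) rather than L(lbx), which is the case selnez handles. */
static unsigned entry_offset_before_lbx(uintptr_t dst)
{
    return head_bytes(dst) * PAIR_BYTES;
}

/* Fall-through chain mirroring L(lb7) .. L(lb1). Returns bytes copied. */
static size_t copy_head(unsigned char *dst, const unsigned char *src)
{
    size_t n = head_bytes((uintptr_t)dst);
    switch (n) {
    case 7: dst[6] = src[6]; /* fall through */
    case 6: dst[5] = src[5]; /* fall through */
    case 5: dst[4] = src[4]; /* fall through */
    case 4: dst[3] = src[3]; /* fall through */
    case 3: dst[2] = src[2]; /* fall through */
    case 2: dst[1] = src[1]; /* fall through */
    case 1: dst[0] = src[0]; /* fall through */
    default: break;
    }
    return n;
}

/* Check every misalignment against the table; returns 1 on success. */
static int check_all_alignments(void)
{
    _Alignas(8) unsigned char dst[16];
    unsigned char src[16];
    for (unsigned k = 0; k < 8; k++) {
        for (unsigned i = 0; i < 16; i++)
            src[i] = (unsigned char)(i + 1);
        memset(dst, 0, sizeof dst);
        size_t n = copy_head(dst + k, src);
        assert(n == atable[k]);                       /* arithmetic == table */
        assert(memcmp(dst + k, src, n) == 0);         /* head bytes copied */
        assert(((uintptr_t)(dst + k + n) & 7) == 0);  /* now aligned */
    }
    return 1;
}
```

Calling check_all_alignments() from a test harness exercises all eight misalignments. For example, a destination with (dst & 7) == 3 needs 5 head bytes, so entry_offset_before_lbx gives 40, i.e. the entry point is L(lb5).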