This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PowerPC LE strlen
- From: Alan Modra <amodra at gmail dot com>
- To: Will Schmidt <will_schmidt at vnet dot ibm dot com>
- Cc: libc-alpha at sourceware dot org, ryan dot arnold at gmail dot com
- Date: Wed, 14 Aug 2013 08:51:29 +0930
- Subject: Re: PowerPC LE strlen
- References: <20130809051815 dot GH3294 at bubble dot grove dot modra dot org> <1376427228 dot 3823 dot 17 dot camel at brimstone>
On Tue, Aug 13, 2013 at 03:53:48PM -0500, Will Schmidt wrote:
> > - nor rTMP1, rTMP2, rTMP1
> > - and. rWORD1, rTMP1, rMASK
>
> > + nor rTMP3, rTMP2, rTMP1
> > + and. rTMP3, rTMP3, rMASK
>
> ^ For this and related changes, is this clean-up such that it's easier
> to read, or is there an underlying improvement in how we were using the
> involved registers?
The LE tail uses the result of the "and"s in the main loop. Since
they were both originally in rTMP1, I changed the "and" result for the
second word, thinking that was necessary. It isn't necessary for
non-power7 strlen (a fact I only just realised) because we have two
distinct tails, in contrast with many other string/memory functions
that handle two or more words in the main loop yet have a single exit
from the loop. However, it is necessary to renumber rTMP1 to
something other than r0, since I want to subtract one from rTMP1 in the
LE tail and "addi" is preferable to "addic". ("addi" reads an r0
source operand as the constant zero, while "addic" accepts r0 but also
writes the carry bit.) It's also necessary to change the regs used in
the entry path so that the "and" results are in the same reg as in the
loop.
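
For readers following along, here is a minimal C sketch of the kind of
little-endian tail computation described above. It is not the glibc
assembly: the function names (load_le32, zero_byte_mask,
first_zero_byte_le) and the 32-bit word size are assumptions made for
illustration. The "m - 1" step stands in for the subtract-one on rTMP1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: assemble four bytes as a little-endian word, so
   the first byte in memory ends up in the least significant position. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return a word with 0x80 in every byte position where w holds a zero
   byte and 0x00 elsewhere.  No carry crosses a byte boundary because
   0x7f + 0x7f = 0xfe fits in one byte. */
static uint32_t zero_byte_mask(uint32_t w)
{
    const uint32_t k7f = 0x7f7f7f7fu;
    uint32_t t1 = (w & k7f) + k7f;  /* bit 7 set if the low 7 bits are nonzero */
    uint32_t t2 = w | k7f;          /* keeps bit 7 of w, forces the low bits to 1 */
    return ~(t1 | t2);              /* the "nor" of the two temporaries */
}

/* Given a nonzero mask m from zero_byte_mask, return the index (0..3)
   of the first zero byte in memory order for a little-endian word. */
static unsigned first_zero_byte_le(uint32_t m)
{
    assert(m != 0);
    /* Subtracting one and clearing the bits of m leaves ones only
       below the lowest set bit of m. */
    uint32_t below = (m - 1) & ~m;
    unsigned bits = 0;
    while (below != 0) {            /* portable population count */
        bits += below & 1u;
        below >>= 1;
    }
    return bits >> 3;               /* 7 + 8*i ones gives byte index i */
}
```

With these definitions,
first_zero_byte_le(zero_byte_mask(load_le32(p))) gives the offset of the
terminator within the four bytes at p. A big-endian tail would instead
count leading zeros of the mask, which is why the LE path needs the
extra subtract.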
So one of the changes here is a consequence of poking at a number of
other functions before I looked at non-power7 strlen. I'm not aware
of any case where using a different gpr produces different timing.
--
Alan Modra
Australia Development Lab, IBM