This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [ARM] Optimised strchr and strlen


On 24 December 2011 21:01, Richard Henderson <rth@twiddle.net> wrote:
> On 12/23/2011 12:31 PM, David Gilbert wrote:
>> Sure; it's pretty much the same trick as my strlen routine.
> ...
>> OK, so I gave that a go - and the results are:
>
> I can't help but wonder if just the one branch in the first loop is best.

Yes.

> Also, it appears one can use uqadd8 and do the aligned two words in parallel
> rather than having everything serialize on the GT flags and SEL.
>
> I've run this through glibc's test-strchr, but haven't gotten around to
> benchmarking it at all. Since you've already got that set up, perhaps
> you could give it a whirl.

Here we go - your code is the green line, rth_strchr. Your uqadd8
trick is very nice; the peak speed is a good bit higher than my version
using a set of uadd8's and sel (you get one instruction fewer in the
main loop).

https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/InitialStrchr?action=AttachFile&do=view&target=strchr-withrth-strchr-abs.png

The simple routine is still easily winning below 32 bytes though, and
there is still that odd notch at 16.

(I think your uqadd8 trick would be a nice improvement on my strlen
and memchr routines).
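
For anyone following along, here is a rough C sketch of how a byte-wise
saturating add can flag zero bytes (for strlen) or matching bytes (for
strchr/memchr) in a whole word, with no dependence on the GE flags or
sel. It is not the code from this thread: uqadd8_emul is a portable
stand-in for the UQADD8 instruction, and the 0x7f7f7f7f constant and the
helper names are assumptions chosen purely for illustration.

```c
#include <stdint.h>

/* Hypothetical portable emulation of ARM UQADD8: four independent
   unsigned 8-bit additions, each saturating at 0xff. */
static uint32_t uqadd8_emul(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t s = ((a >> (8 * i)) & 0xffu) + ((b >> (8 * i)) & 0xffu);
        if (s > 0xffu)
            s = 0xffu;              /* saturate instead of wrapping */
        r |= s << (8 * i);
    }
    return r;
}

/* Adding 0x7f to a byte sets its top bit unless the byte was zero
   (saturation stops large bytes from wrapping back below 0x80).
   The result has 0x80 in every byte position where w held a zero. */
static uint32_t zero_byte_mask(uint32_t w)
{
    return ~uqadd8_emul(w, 0x7f7f7f7fu) & 0x80808080u;
}

/* For strchr/memchr: XOR with the repeated character turns matching
   bytes into zero bytes, which are then flagged the same way. */
static uint32_t char_match_mask(uint32_t w, unsigned char c)
{
    return zero_byte_mask(w ^ (c * 0x01010101u));
}

/* The checks on two words do not depend on each other, so they can
   be computed side by side and combined with a single OR. */
static uint32_t any_zero_in_pair(uint32_t w0, uint32_t w1)
{
    return zero_byte_mask(w0) | zero_byte_mask(w1);
}
```

Since each mask is an ordinary register value, a main loop built this
way only needs one test on the OR of the two masks per pair of words.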

Dave

