This is the mail archive of the mailing list for the libc-ports project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [ARM] Optimised strchr and strlen

On 20 December 2011 23:07, Joseph S. Myers <> wrote:
> On Mon, 19 Dec 2011, Dr. David Alan Gilbert wrote:
>> +@ A very simple strchr routine, from benchmarks on A9 it's a bit faster than
>> +@ the current version in eglibc.
>> +@ While I have a version that does 8 bytes/loop and is a lot faster on long
>> +@ strings, it is slower on short strings, and short strings seem more common
>> +@ in strchr usage.
> That sounds like a possible case for a hybrid function, with an unrolled
> initial part testing some number of characters to cover short strings (it
> might be possible to get things aligned at the same time) and the more
> complicated version for longer strings.

Of course the difficulty with strchr (compared with memchr) is that you have
no length parameter to hint at how much is left, and thus whether it's
worth making that switch; so it has to be a heuristic and it's going to cost
you something in the small case.

> ?What's the actual size distribution you see in strchr use?

It varies heavily by program.  I only found one of the SPEC benchmarks using
it to a measurable amount (i.e. showed up in profile), and that was
mostly in 24byte strings
with the match at random positions within the string.

From some embedded benchmarks I found that almost all the uses of strchr
were calls with less than 8 byte strings (mostly unexpected fallout
from within libc rather
than the meat of the benchmark).
One of the things people commonly seem to do with strchr is see whether a
character is in a set, and run that strchr along a string - e.g. something like

char *mystr=....
char tmp;

for(tmp=*mystr;tmp;mystr++) {
   if (strchr(" \t\n\r", tmp)!=NULL) { do something }

With the string being searched depressingly small.   There are some places where
you might hope the compiler could be smart and do something with that,
although often the string being searched is actually a variable (think
$IFS in shell).
(I fancy something like a strchr_short with the compiler calling that
when it knows
the string is less than some length - but how do you define that
length between the
compiler and library?)

The other thing I found in some examples was splitting env strings; so
searching for
the '=' in NAME=VALUE, the NAME parts tend not to be particularly long.

The worst case tends to be where you're using strchr on a long string and you
don't actually find the match.

A ltrace of gcc's cc1 shows a few of the cases - e.g.

Mostly short identifiers:

strchr("nothrow", ' ')                           = NULL
strchr("final", ' ')                             = NULL
strchr("dfinish", ' ')                           = NULL

This is a case that can get long - finding '.' in filenames; this can
be one where
you get longer strings without  a match.

strchr("/usr/local/include", '.')                = NULL
strchr("/usr/local/include", '.')                = NULL
'.') = ".5.2/include"

I suspect something like parsing a format string:
strchr("-+ #0", 's')                             = NULL

Splitting assignments:
strchr("__UINT16_C(c)=c", '=')                   = "=c"
strchr("__UINT_LEAST32_MAX__=4294967295U", '=')  = "=4294967295U"

Here's a profile graph of different strlen's on an ARM:

That 'simple' one is showing the benefit at the short lengths,
the 'smarter' one I have is doing 8 bytes/loop and is nice on the long
strings - but as you can see worse at the short ones.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]