This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PING][PATCH] More effective generic strrchr.


On Wed, Dec 04, 2013 at 01:13:34PM +0000, Will Newton wrote:
> On 4 December 2013 12:58, OndÅej BÃlka <neleai@seznam.cz> wrote:
> > ping
> > On Sat, Oct 05, 2013 at 09:15:42AM +0200, OndÅej BÃlka wrote:
> >>
> >> A generic implementation of strrchr could be improved by following
> >> patch. As it is likely that last occurrence of c will be near end we
> >> could quickly find it. This is also much faster for degenerate inputs
> >> like:
> >>
> >> strrchr("xyxyxyxyxyxyxyxyxyxyxyxyx...",'x');
> >>
> >> where a loop slows us down.
> >>
> >>       * string/strrchr.c: Improve implementation.
> 
> This changelog is not really adequate or correct.
> 
> >>
> >> ---
> >>  string/strrchr.c | 19 ++-----------------
> >>  1 file changed, 2 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/string/strrchr.c b/string/strrchr.c
> >> index bdec841..ad242b6 100644
> >> --- a/string/strrchr.c
> >> +++ b/string/strrchr.c
> >> @@ -23,23 +23,8 @@
> >>  char *
> >>  strrchr (const char *s, int c)
> >>  {
> >> -  const char *found, *p;
> >> -
> >> -  c = (unsigned char) c;
> >> -
> >> -  /* Since strchr is fast, we use it rather than the obvious loop.  */
> >> -
> >> -  if (c == '\0')
> >> -    return strchr (s, '\0');
> >> -
> >> -  found = NULL;
> >> -  while ((p = strchr (s, c)) != NULL)
> >> -    {
> >> -      found = p;
> >> -      s = p + 1;
> >> -    }
> >> -
> >> -  return (char *) found;
> >> +  /* We need to include terminating zero for case c == 0.  */
> >> +  return __memrchr (s, c, strlen (s) + 1);
> 
> Do you have benchmark results for this? It would be good to test on
> systems that have fast memrchr (e.g. i386/powerpc/x86_64) and ones
> that don't (the rest).
> 
> -- 
> Will Newton
> Toolchain Working Group, Linaro
Yes I did benchmarking and implementation is around 60 times faster than
old string one on x86.

New implementation is around three times slower than old one.

And there is no difference between implementations.

This variance was caused by choosing different inputs (string consisting
in 50% from searched character, string with character only at start and
string with character only at end) so you need to find which factors are
important in practice.

One of current implementations problems is that its bit dosable, main
application of strrchr is finding last / for basename so attacker could
supply path ///////////////...///

Actual times of attached benchmark are

frequent - old string implementation

real	0m28.205s
user	0m28.215s
sys	0m0.000s

frequent - new string implementation

real	0m0.476s
user	0m0.475s
sys	0m0.000s

frequent - libc implementation

real	0m0.517s
user	0m0.516s
sys	0m0.000s

start - old string implementation

real	0m0.471s
user	0m0.471s
sys	0m0.000s

start - new string implementation

real	0m1.445s
user	0m1.445s
sys	0m0.000s

start - libc implementation

real	0m0.479s
user	0m0.479s
sys	0m0.000s

end - old string implementation

real	0m0.474s
user	0m0.474s
sys	0m0.000s

end - new string implementation

real	0m0.473s
user	0m0.473s
sys	0m0.000s

new - libc implementation

real	0m0.479s
user	0m0.479s
sys	0m0.000s

Attachment: strrchr.tar.bz2
Description: Binary data


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]