This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

strstr benchtest.

From: OndÅej BÃlka <neleai at seznam dot cz>
To: libc-alpha at sourceware dot org
Date: Tue, 12 May 2015 23:09:46 +0200
Subject: strstr benchtest.
Authentication-results: sourceware.org; auth=none
References: <20150512205311 dot GA24057 at domone>

On Tue, May 12, 2015 at 10:53:11PM +0200, OndÅej BÃlka wrote:
> Hi,
>
As I mentioned strstr benchmarking I used following.

Current benchmark is as useless as rest of string benchmarks. To compile
you need to delete benchtest/bench-strstr.o as this is tested by header
inclusion and benchtest Makefile doesn't detect that strstr.c changed.

Main problem of current benchtest was very biased input. I replaced that
with randomly generated ones that are closer to average case.

A problem with periodic input with period 16 is that it would skew
branch prediction a lot. It also creates periodic pattern for branches
and they would be predicted perfectly while in reality performance would
be magnitude worse. 

Second problem is that it was a worst-case scenario for this heuristic
instead average as first three characters form searched triplet instead
randomly distributed when characters are selected independently.

I couldn't get definite answer. As I said before this aproach is good
when string has high entropy and problematic on lower-entropy ones.
Selection which is better depends on entropy of strings used by given
application. It isn't always better but benefits outweigth the cost.

diff --git a/benchtests/bench-strstr.c b/benchtests/bench-strstr.c
index 74f3ee8..0412c1b 100644
--- a/benchtests/bench-strstr.c
+++ b/benchtests/bench-strstr.c
@@ -81,31 +81,15 @@ do_test (size_t align1, size_t align2, size_t len1, size_t len2,
   char *s1 = (char *) (buf1 + align1);
   char *s2 = (char *) (buf2 + align2);

-  static const char d[] = "1234567890abcdef";
-#define dl (sizeof (d) - 1)
-  char *ss2 = s2;
-  for (size_t l = len2; l > 0; l = l > dl ? l - dl : 0)
+  for (size_t l = 0; l < len2; l++)
     {
-      size_t t = l > dl ? dl : l;
-      ss2 = mempcpy (ss2, d, t);
+      s2[l] = (rand () % 128) + 1;
     }
   s2[len2] = '\0';

-  if (fail)
+  for (size_t l = 0; l < len1; l++)
     {
-      char *ss1 = s1;
-      for (size_t l = len1; l > 0; l = l > dl ? l - dl : 0)
-	{
-	  size_t t = l > dl ? dl : l;
-	  memcpy (ss1, d, t);
-	  ++ss1[len2 > 7 ? 7 : len2 - 1];
-	  ss1 += t;
-	}
-    }
-  else
-    {
-      memset (s1, '0', len1);
-      memcpy (s1 + len1 - len2, s2, len2);
+      s1[l] = (rand () % 128) + 1;
     }
   s1[len1] = '\0';

Follow-Ups:
- Re: strstr benchtest.
  - From: Florian Weimer

References:
- [PATCH] Improve generic strstr performance.
  - From: OndÅej BÃlka

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]