This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] Improve strstr performance

From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
To: libc-alpha at sourceware dot org
Date: Fri, 29 Jun 2018 16:51:39 -0300
Subject: Re: [PATCH] Improve strstr performance
References: <DB5PR08MB1030DA9F862D70A49C3FA40383780@DB5PR08MB1030.eurprd08.prod.outlook.com>


On 11/06/2018 07:31, Wilco Dijkstra wrote:
> Improve strstr performance.  Strstr tends to be slow because it uses
> many calls to memchr and a slow byte loop to scan for the next match.
> Performance is significantly improved by using strnlen on larger blocks
> and using strchr to search for the next matching character.  strcasestr
> can also use strnlen to scan ahead, and memmem can use memchr to check
> for the next match.
> 
> On the GLIBC bench tests the overall performance gains on Cortex-A72 are:
> strstr: +25%
> strcasestr: +4.3%
> memmem: +18%
> 
> On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%.
> 
> Tested on AArch64, passes GLIBC tests.

I am checking if it is worth to keep the implementation in sync with gnulib,
since it contains some changes over the years which did not make into glibc.

I create a branch with these changes [1] along with your changes, but
results are mixed:

   - For strstr/strcasestr performance is the same for neddles larger than
     4, but faster for shorter needles.

   - For memmem performance is faster for needle of size 2, but slower for
     large haystack.

My plan is to check if we can simplify the code a bit, as gnulib one has
done; but in overall the change should be ok (modulo one fix below).

[1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/str-two-way-optimization

> diff --git a/string/strcasestr.c b/string/strcasestr.c
> index 90ba189790481a58eb7627eb8c396530f988f985..d54c4b56853735fb8f5ebd59419b222ca8827651 100644
> --- a/string/strcasestr.c
> +++ b/string/strcasestr.c
> @@ -37,8 +37,8 @@
>  /* Two-Way algorithm.  */
>  #define RETURN_TYPE char *
>  #define AVAILABLE(h, h_l, j, n_l)			\
> -  (!memchr ((h) + (h_l), '\0', (j) + (n_l) - (h_l))	\
> -   && ((h_l) = (j) + (n_l)))
> +  (((j) + (n_l) <= (h_l)) || ((h_l) += strnlen ((void*)((h) + (h_l)), 512), \
> +			      (j) + (n_l) <= (h_l)))

Why did it not trigger linknamespace issues with strnlen usage? For instance:

$ cat conform/ISO/assert.h/linknamespace.out 
[initial] __assert_fail -> [libc.a(assert.o)] __dcgettext -> [libc.a(dcgettext.o)] __dcigettext -> [libc.a(dcigettext.o)] strstr -> [libc.a(strstr.o)] strnlen

On the branch I created it contains a fix for this [2].

[2] https://sourceware.org/git/?p=glibc.git;a=commit;h=6cc58a9708ff256323c34d200fd9362f25bf66f7

>  #define CHECK_EOL (1)
>  #define RET0_IF_0(a) if (!a) goto ret0
>  #define CANON_ELEMENT(c) TOLOWER (c)
> diff --git a/string/strstr.c b/string/strstr.c
> index b3b5deb673758510616d32849bcf2fc15ce941cd..346f303c7423c95de0fbcdeb1a9f31293b303372 100644
> --- a/string/strstr.c
> +++ b/string/strstr.c
> @@ -33,10 +33,11 @@
>  
>  #define RETURN_TYPE char *
>  #define AVAILABLE(h, h_l, j, n_l)			\
> -  (!memchr ((h) + (h_l), '\0', (j) + (n_l) - (h_l))	\
> -   && ((h_l) = (j) + (n_l)))
> +  (((j) + (n_l) <= (h_l)) || ((h_l) += strnlen ((void*)((h) + (h_l)), 512), \
> +			      (j) + (n_l) <= (h_l)))
>  #define CHECK_EOL (1)
>  #define RET0_IF_0(a) if (!a) goto ret0
> +#define FASTSEARCH(S,C,N) (void*) strchr ((void*)(S), (C))
>  #include "str-two-way.h"
>  
>  #undef strstr
>

References:
- [PATCH] Improve strstr performance
  - From: Wilco Dijkstra

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]