This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH][BZ #12100] strstr with unaligned loads.
- From: Rich Felker <dalias at aerifal dot cx>
- To: OndÅej BÃlka <neleai at seznam dot cz>
- Cc: libc-alpha at sourceware dot org
- Date: Mon, 26 Aug 2013 12:24:46 -0400
- Subject: Re: [PATCH][BZ #12100] strstr with unaligned loads.
- Authentication-results: sourceware.org; auth=none
- References: <20130825090758 dot GA23686 at domone dot kolej dot mff dot cuni dot cz> <20130825155406 dot GV20515 at brightrain dot aerifal dot cx> <20130826070925 dot GA4292 at domone dot kolej dot mff dot cuni dot cz>
On Mon, Aug 26, 2013 at 09:09:25AM +0200, OndÅej BÃlka wrote:
> On Sun, Aug 25, 2013 at 11:54:06AM -0400, Rich Felker wrote:
> > tatus: O
> > Content-Length: 651
> > Lines: 17
> > On Sun, Aug 25, 2013 at 11:07:58AM +0200, OndÅej BÃlka wrote:
> > > Hi,
> > >
> > > This patch continues to remove ineffective sse4.2 functions. Now we
> > > focus on strstr.
> > >
> > > I use a techique of doing vectorized search for first pair of characters
> > > which is fastest in practice.
> > What do you do after finding the first pair? As far as I can tell,
> It enumerates all occurences of first pair.
> > this approach is only useful at the start of the string. Once you've
> > gotten into two-way, searching for the next occurrence of the first
> > two characters is not a useful operation. Of course it looks to me
> Not very useful? A two way algorithm spend most time in ineffective
> looking for these (or calculating critical factorization which is also
> slow.). Actual code is following
No, before you make any such claims, you should read up on and
understand the algorithm. You may be able to apply the optimized "find
first character" or maybe even "find first pair" to one case here, but
it is not searching for the beginning of the needle. The code you
> phaystack = &haystack[i + j];
> while (i < needle_len && (CANON_ELEMENT (*pneedle++)
> == CANON_ELEMENT (*phaystack++)))
> if (needle_len <= i)
> j += i - suffix + 1;
> memory = 0;
is checking how many characters of the second factor of the needle
match at the current point, and if it's not all of them, advancing by
that many characters plus 1, and repeating.
Note that the above-quoted code is only for the short-needle case
without the bad character table, which I also claim should be removed
(empirically, glibc is much slower than musl on short needles,
presumably because musl always uses the bad-character table). There
may be additional logic considerations to integrating any optimized
"next pair" search with the bad character table; I have not checked
> And most time first character will not match and in cases when it will
> second won't most of time.
The case you have to worry about is when they do match every single
time, and whether this increases the runtime from linear to quadratic.