This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Use Quicksearch in strstr


Hi,

Alexander Monakov wrote:
> On Mon, 29 Oct 2018, Wilco Dijkstra wrote:
>
>> The needle length is limited to 254 - combined with a hashed shift table
>> this reduces the shift table memory 16 to 32 times, lowering preprocessing
>> overhead and minimizing cache effects.
>
> This is a bit unclear: given needle length limit, unhashed table would require
> 256 bytes, hashing as implemented in the patch reduces it to 64 bytes. I think
> unhashed variant should be even faster, is saving 192 bytes on stack worth it?

The startup overhead is significant for small haystacks, so reducing that is
actually faster. The hashing uses a simple AND instruction which is hidden
across the memcmp call, so doesn't cost much on a modern core. 

> How does the worst case change look like (I think it's searching for "a{251}ba"
> in "aaa...aaa" if memcmp works left-to-right)?  Does it make sense to move
> needle length cutoff further below 254?

To be fair you have to compare the worst case of one algorithm vs the worst
case of another since they are different... Two-way is slow on searching for
ba{254}b in a long sequence of ba{253}ba{253}. Two-way is only about 3 times
faster on this case despite having a theoretical factor of 256/2 advantage.

I don't think there is any gain slowing down common cases in order to
reduce the worst-case difference (assuming that is feasibe).

> +      do
> +     {
> +       hs--;

> Doesn't this move hs out of bounds and thus invoke undefined behavior?

That's a theoretical issue given *(a+(i-1)), *((a-1)+i) and *((a+i)-1) behave
identically.

Wilco
    

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]