This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2] Improve memmem.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Paul Eggert <eggert at cs dot ucla dot edu>
- Cc: libc-alpha at sourceware dot org
- Date: Sun, 24 May 2015 16:03:19 +0200
- Subject: Re: [PATCH v2] Improve memmem.
- Authentication-results: sourceware.org; auth=none
- References: <20150513000329 dot GA23595 at domone> <55537C4A dot 20001 at cs dot ucla dot edu> <20150513185107 dot GA4100 at domone> <5553F532 dot 6060604 at cs dot ucla dot edu> <20150514092926 dot GA7949 at domone> <55558D6A dot 3000004 at cs dot ucla dot edu>
On Thu, May 14, 2015 at 11:08:42PM -0700, Paul Eggert wrote:
> Ondřej Bílka wrote:
> >I am using different end here
> >
> >+ const unsigned char *haystack_end = (const unsigned char *)
> >+ haystack_start + haystack_len
> >+ - needle_len + 1;
>
> Ah, sorry, didn't see that. But in that case the name
> 'haystack_end' is misleading -- that's not the haystack's end, but
> is something else. So a renaming would appear to be in order.
>
Do you have a better suggestion?
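To make the meaning of that pointer concrete, here is a minimal sketch
using a naive search. It is not the algorithm from the patch; the names
naive_memmem and start_limit are hypothetical and used only for
illustration. The pointer is one past the last position at which a
match can still start, not the end of the haystack.

```c
#include <stddef.h>
#include <string.h>

/* Naive search, for illustration only; NOT the algorithm in the patch.
   The names naive_memmem and start_limit are hypothetical.  */
static void *
naive_memmem (const void *haystack_start, size_t haystack_len,
              const void *needle_start, size_t needle_len)
{
  const unsigned char *haystack = (const unsigned char *) haystack_start;

  /* An empty needle matches at the start of the haystack.  */
  if (needle_len == 0)
    return (void *) haystack;

  /* A needle longer than the haystack cannot match; checking this first
     also keeps the pointer arithmetic below inside the buffer.  */
  if (haystack_len < needle_len)
    return NULL;

  /* One past the last position where a match can still start.  */
  const unsigned char *start_limit
    = haystack + haystack_len - needle_len + 1;

  for (const unsigned char *p = haystack; p < start_limit; p++)
    if (memcmp (p, needle_start, needle_len) == 0)
      return (void *) p;

  return NULL;
}
```

With an 11-byte haystack and a 5-byte needle, start_limit is
haystack + 7, so offsets 0 through 6 are tried.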
> >Main motivation is that pairs are still too common
>
> Too common where? Do we have traces of actual programs?
In the applications I actually use, most haystacks are shorter than 64
bytes, so it doesn't make a difference there.
However, it's better to be prepared in case a programmer uses
kilobyte-length haystacks, where it would happen. The English digraph
"th" has a frequency of around 1%, so you will likely switch within the
first 1/10 of the input. For triplets there could be the same problem,
but I decided to keep it simple; alternatively I could add a quadruple
check. I am open as to which to use.
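A rough way to check the digraph figure against one's own data is to
count how often a given byte pair occurs in a buffer. Below is a minimal
sketch; count_pair is a hypothetical helper and is not part of the
patch.

```c
#include <stddef.h>

/* Count the positions i with buf[i] == a and buf[i + 1] == b.
   Hypothetical helper for measuring pair frequency; not from the patch.  */
static size_t
count_pair (const unsigned char *buf, size_t len,
            unsigned char a, unsigned char b)
{
  size_t count = 0;

  /* i + 1 < len keeps buf[i + 1] in bounds and handles len == 0.  */
  for (size_t i = 0; i + 1 < len; i++)
    if (buf[i] == a && buf[i + 1] == b)
      count++;

  return count;
}
```

Dividing the result by the buffer length gives the pair frequency. At a
rate of about 1%, a 1 KiB haystack would contain roughly 10 positions
where the pair matches.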