17087 – Implement memcasemem()

Status:	RESOLVED INVALID

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	string
Version:	unspecified

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	Not yet assigned to anyone

Reported:	2014-06-25 15:19 UTC by Ken at MIT
Modified:	2016-10-24 18:09 UTC
CC List:	2 users

Flags:	fweimer: security-

Attachments
attachment-1825-0.html (948 bytes, text/html) 2014-06-25 18:30 UTC, Ken at MIT

Description Ken at MIT 2014-06-25 15:19:01 UTC

The function strstr() has a case-insensitive version, strcasestr(), but memmem() has no case-insensitive counterpart, memcasemem().

A memcasemem() function would be useful for optimizing some programs, for example the Suricata intrusion prevention and detection code.
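
For concreteness, the requested behaviour could look roughly like the naive sketch below. It assumes single-byte case folding through tolower() and memmem()-like handling of an empty needle. The name memcasemem_sketch() is hypothetical; no such function exists in glibc, and a real implementation would use a faster search algorithm.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, illustration only: case-insensitive search for a
   length-delimited needle in a length-delimited haystack.  Neither
   buffer needs a terminating NUL, and both may contain NUL bytes. */
static void *memcasemem_sketch(const void *haystack, size_t hlen,
                               const void *needle, size_t nlen)
{
    const unsigned char *h = haystack;
    const unsigned char *n = needle;

    if (nlen == 0)
        return (void *)h;       /* assumed: same as memmem() for an empty needle */
    if (nlen > hlen)
        return NULL;

    /* Quadratic scan: try every start position that leaves room for the needle. */
    for (size_t i = 0; i <= hlen - nlen; i++) {
        size_t j = 0;
        while (j < nlen && tolower(h[i + j]) == tolower(n[j]))
            j++;
        if (j == nlen)
            return (void *)(h + i);
    }
    return NULL;
}
```

Because both lengths are passed explicitly, the search keeps going past embedded NUL bytes, which is the case strcasestr() cannot cover.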

Comment 1 Ondrej Bilka 2014-06-25 16:40:44 UTC

On Wed, Jun 25, 2014 at 03:19:01PM +0000, kenatmit at gmail dot com wrote:
> 
> The function strstr() has a case-insensitive version, strcasestr(), but
> memmem() has no case-insensitive counterpart, memcasemem().
> 
> A memcasemem() function would be useful for optimizing some programs, for
> example the Suricata intrusion prevention and detection code.
> 
That would not help as an optimization: case conversion is more expensive
than detecting the end condition, and in vectorized implementations it is
faster to detect the terminating null than to add special-casing for
whether a detected byte lies before or after the end.

Comment 2 Ken at MIT 2014-06-25 18:30:18 UTC

Created attachment 7657
attachment-1825-0.html

The case conversion can be implemented with SIMD instructions and takes
less than one instruction per byte. The same can be done in strcasestr().
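
To illustrate the idea of converting many bytes per operation, here is a minimal sketch that lowercases ASCII eight bytes at a time using ordinary 64-bit integer arithmetic (SWAR). This is a portable stand-in for real SIMD instructions, not what any glibc routine does; the names swar_tolower8() and ascii_tolower_buf() are hypothetical. With 16- or 32-byte vector registers the same compare-and-mask pattern should need fewer operations per byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Lowercase the ASCII letters in eight bytes packed into one word.
   Bytes outside 'A'..'Z' (including bytes >= 0x80) are left unchanged. */
static uint64_t swar_tolower8(uint64_t x)
{
    const uint64_t ones = 0x0101010101010101ULL;
    uint64_t low7  = x & (0x7F * ones);              /* drop each byte's top bit */
    uint64_t gt_Z  = low7 + (0x7F - 'Z') * ones;     /* top bit set if byte > 'Z' */
    uint64_t ge_A  = low7 + (0x80 - 'A') * ones;     /* top bit set if byte >= 'A' */
    uint64_t ascii = ~x & (0x80 * ones);             /* top bit set if byte < 0x80 */
    uint64_t upper = ascii & (ge_A ^ gt_Z);          /* 0x80 in each uppercase byte */
    return x | (upper >> 2);                         /* 0x80 >> 2 == 0x20: set the case bit */
}

/* Lowercase a buffer in place: whole words first, then a scalar tail. */
static void ascii_tolower_buf(unsigned char *buf, size_t len)
{
    size_t i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, buf + i, 8);     /* memcpy avoids unaligned-access issues */
        w = swar_tolower8(w);
        memcpy(buf + i, &w, 8);
    }
    for (; i < len; i++)
        if (buf[i] >= 'A' && buf[i] <= 'Z')
            buf[i] |= 0x20;
}
```

The additions never carry between bytes because each byte is first reduced to seven bits, so the result does not depend on byte order.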

Detecting whether a matched byte lies past the end can be done using a
count-leading-zeros operation, a shift and a mask. This is also
possible (and required) in memmem().
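
A rough sketch of that bounds check follows, under several assumptions: a 16-byte block, a match bitmask in which bit i corresponds to byte i, and the GCC/Clang builtin __builtin_ctz(). With that bit order the first match is found by counting trailing zeros; a count-leading-zeros version would apply to the opposite bit order. The helper names are hypothetical, and the scalar loop merely stands in for the compare-plus-movemask step of a vector implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a bitmask with bit i set when block[i] == c, for a 16-byte block.
   A vectorized version would obtain this from one compare and one
   movemask-style instruction instead of a loop. */
static uint32_t match_bits16(const unsigned char *block, unsigned char c)
{
    uint32_t bits = 0;
    for (unsigned i = 0; i < 16; i++)
        bits |= (uint32_t)(block[i] == c) << i;
    return bits;
}

/* Return the index of the first match among the first `valid` bytes
   of the block (valid <= 16), or -1 if every match lies past the end. */
static int first_match_in_bounds(uint32_t bits, unsigned valid)
{
    /* Shift builds a mask covering only the in-bounds byte positions. */
    uint32_t in_bounds = (valid >= 16) ? 0xFFFFu : ((1u << valid) - 1u);

    bits &= in_bounds;              /* discard matches beyond the end */
    if (bits == 0)
        return -1;
    return __builtin_ctz(bits);     /* lowest set bit == earliest byte */
}
```

The cost of the length check is one shift, one subtraction and one AND per block, independent of how many bytes matched.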

The strcasestr() function cannot be used in place of memcasemem(), as that
would require adding NUL terminators to both the needle and the haystack,
which is not always possible.


Comment 3 Adhemerval Zanella 2016-10-23 14:05:45 UTC

As per comment #2 on BZ#17879, new features should be proposed on libc-alpha, not in Bugzilla.  Please restart the discussion there, including a careful synthesis of the arguments from previous libc-alpha discussions of the issue, to help the community reach consensus.

Comment 4 jsm-csl@polyomino.org.uk 2016-10-24 17:31:03 UTC

On Sun, 23 Oct 2016, adhemerval.zanella at linaro dot org wrote:

> As from comment #2 at BZ#17879, new features should be proposed on 
> libc-alpha, not in Bugzilla.  Please restart the discussions there, 
> including a careful synthesis of the arguments from previous libc-alpha 
> discussions of the issue to help the community in reaching consensus.

This comment is only appropriate where there are previous discussions for 
which such a synthesis is useful.  I'm not aware of any relevant 
discussions for this proposed API.

Comment 5 Adhemerval Zanella 2016-10-24 17:49:24 UTC

My understanding is that Bugzilla should not act as a backlog for new features that lack a prior libc-alpha thread discussing them.  I say this to avoid adding to the many current Bugzilla entries where the reporter expects someone else to take on the job of pushing the implementation, the discussion, or both forward.

For such cases I would expect a discussion on libc-alpha first; if the new feature is desirable in any way (either by trying to push it into some standard or by adding it as a GNU extension), then we can proceed to open a bug with the appropriate mailing-list history.

In this case, should the bug continue to linger in Bugzilla, or should we ask the bug opener to bring it to the glibc mailing list?

Comment 6 jsm-csl@polyomino.org.uk 2016-10-24 17:54:48 UTC

My point is that you seem to have taken some text I once wrote about one 
such feature request in Bugzilla and cut-and-pasted it in a different 
issue to which it does not apply.  When asking someone to start a 
discussion on libc-alpha, it's nonsensical to request that they include a 
synthesis of previous libc-alpha discussions if there were no such 
relevant discussions.

Comment 7 Adhemerval Zanella 2016-10-24 17:58:17 UTC

Right, I see your point now.  I apologize for reusing your response from the other bug.  In this specific case, should we reopen the bug and ask for a follow-up on the mailing list?

Comment 8 jsm-csl@polyomino.org.uk 2016-10-24 18:09:08 UTC

I don't see the use in having bugs open for feature requests lacking 
consensus.  More speculative ideas can go at 
<https://sourceware.org/glibc/wiki/Development_Todo/Master> (preferably 
making clear that consensus would need to be established) in the absence 
of someone actively working on building consensus and then on an 
implementation.  That page can point to prior discussions where any exist.