Bug 14613 - missed optimization in strcspn
Summary: missed optimization in strcspn
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: string (show other bugs)
Version: unspecified
: P2 normal
Target Milestone: ---
Assignee: Ondrej Bilka
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-24 20:20 UTC by Eric Blake
Modified: 2016-08-11 16:48 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Blake 2012-09-24 20:20:11 UTC
I ran into an instance of someone using strcspn(var,"=") to find the offset of a single rejection byte, and arguing that glibc should know better:
https://www.redhat.com/archives/libvir-list/2012-September/msg01630.html

The current glibc code is rather inefficient in this case; the naive C fallback does a strchr(reject,s[i]) for each byte of s, and even the assembly versions in the various .S files tend to start by computing a table of 256 bits (8 bytes) to learn which characters are in reject, before even starting to make a pass through s.

But when the second argument of strcspn is exactly one byte long, it seems like it should be much more efficient to do the equivalent of a single pass:
 strchrnul(s,*reject)-s
Since there is real code out there that uses strcspn() on single-byte rejections in order to avoid a subtraction outside of the library code, it seems like glibc should be catering to this optimization.
Comment 1 Ondrej Bilka 2012-12-19 21:08:42 UTC
I will address this in my optimized strcspn patch that is in my TODO list. 

Upto three character needles I am faster than sse4.2 version.
Comment 2 Adhemerval Zanella 2016-08-11 16:48:32 UTC
Commit d3496c9f4f27d3009b71be87f6108b4fed7314bd implements the table using a 256 bytes table (it is faster than a more compact one) and also adds header optmization to use builtin strcspn,