I ran into an instance of someone using strcspn(var, "=") to find the offset of a single rejection byte, and arguing that glibc should know better: https://www.redhat.com/archives/libvir-list/2012-September/msg01630.html The current glibc code is rather inefficient in this case: the naive C fallback does a strchr(reject, s[i]) for each byte of s, and even the assembly versions in the various .S files tend to start by computing a table of 256 bits (32 bytes) recording which characters are in reject, before making any pass through s. But when the second argument of strcspn is exactly one byte long, it should be much more efficient to do the equivalent of a single pass: strchrnul(s, *reject) - s. Since there is real code out there that uses strcspn() with single-byte reject sets in order to avoid doing the subtraction outside of library code, it seems like glibc should cater to this case.
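To make the single-pass idea above concrete, here is a minimal sketch of a single-byte fast path. The names strcspn_fast1 and find_byte_or_end are hypothetical and not part of glibc; find_byte_or_end is only a portable stand-in for the GNU strchrnul() extension, so the example builds without _GNU_SOURCE. It is an illustration of the request, not the way glibc implements it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: portable stand-in for GNU strchrnul().
   Returns a pointer to the first occurrence of c in s, or to the
   terminating NUL if c does not occur. */
static const char *find_byte_or_end(const char *s, char c)
{
    const char *p = strchr(s, c);
    return p != NULL ? p : s + strlen(s);
}

/* Hypothetical wrapper (an assumption for illustration, not glibc code):
   dispatch on the length of reject before falling back to strcspn(). */
static size_t strcspn_fast1(const char *s, const char *reject)
{
    /* Empty reject set: nothing can stop the scan early. */
    if (reject[0] == '\0')
        return strlen(s);

    /* Exactly one reject byte: a single pass over s, no table setup. */
    if (reject[1] == '\0')
        return (size_t)(find_byte_or_end(s, reject[0]) - s);

    /* General case: defer to the library routine. */
    return strcspn(s, reject);
}
```

With this sketch, strcspn_fast1(var, "=") gives the same offset as strcspn(var, "="), while the one-byte case never builds a lookup table.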
I will address this in my optimized strcspn patch, which is on my TODO list. For needles of up to three characters, I am faster than the sse4.2 version.
Commit d3496c9f4f27d3009b71be87f6108b4fed7314bd implements the table as a 256-byte table (it is faster than a more compact one) and also adds a header optimization to use the builtin strcspn.
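For readers unfamiliar with the table approach, the following is a rough sketch of a strcspn built on a 256-byte lookup table, one byte per possible character value. It is not the code from the commit; the name strcspn_table is hypothetical, and the real implementation may differ in how it fills and scans the table.

```c
#include <stddef.h>

/* Hypothetical illustration of a byte-table strcspn (not the commit's code). */
static size_t strcspn_table(const char *s, const char *reject)
{
    /* One byte per possible character value; 1 means "stop here". */
    unsigned char table[256] = {0};
    const unsigned char *r = (const unsigned char *)reject;
    const unsigned char *p = (const unsigned char *)s;

    /* Marking NUL as a stop byte lets the scan loop end at the
       terminator without a separate end-of-string check. */
    table[0] = 1;
    while (*r != '\0')
        table[*r++] = 1;

    /* Single pass over s: one table lookup per byte. */
    while (!table[*p])
        p++;

    return (size_t)(p - (const unsigned char *)s);
}
```

A byte-per-entry table costs more memory than a 256-bit bitmap, but each lookup is a plain indexed load with no shifting or masking, which is one plausible reason for the speed difference noted above.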