Bug 5807 - strlen() not effective
Summary: strlen() not effective
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: unspecified
: P3 minor
Target Milestone: ---
Assignee: Ulrich Drepper
URL:
Keywords:
Depends on:
Blocks: 5806
  Show dependency treegraph
 
Reported: 2008-02-29 00:42 UTC by Egmont Koblinger
Modified: 2014-07-02 06:59 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
Alan Mycroft's hack for strlen() (1.62 KB, patch)
2008-04-17 19:55 UTC, Stas Yakovlev
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Egmont Koblinger 2008-02-29 00:42:44 UTC
Most of the functions similar to strlen() that have to detect whether any bytes
of an integer is zero are very efficient. However, in glibc-2.7/string/strlen.c
this efficient code that's used in lots of other functions is surrounded by an
#if 0, and instead a trivial code is used which exits the loop and examines each
bytes separately if any of the bytes is within the range 129-255 or 0. That is
roughly 15/16 of all random cases in 4-bit architecture and even more in 8-bit.
Hence I think this function is hardly any more efficient than if you read one
long int and then simply examined all its bytes separately.

Is there any reason for the code that looks way more effective and is being used
in many other source files to be commented out here?
Comment 1 Carlos O'Donell 2008-03-02 03:29:47 UTC
The math is wrong. 

It looks glibc has a broken version of Alan Mycroft's HAKMEMC postings.
See: http://www.cl.cam.ac.uk/~am21/progtricks.html

The solution is "((x - 0x01010101) & ~x & 0x80808080)", but the "& ~x" is
missing from the glibc version.

The "#if 0" was added Tue Jan 21 03:39:54 1992 UTC (16 years, 1 month ago) by
roland, and the patch looked like this:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/generic/Attic/strlen.c.diff?r1=1.1&r2=1.2&cvsroot=glibc

I can reproduce this on cvs head. The generic strlen function is horribly
inefficient.

Roland can you comment on this? 

What's the legal status of using that algorithm?
Comment 2 Carlos O'Donell 2008-03-02 15:12:14 UTC
Note: All the string/* operations should use the corrected algorithm, and the
old comments should be removed.
Comment 3 Stas Yakovlev 2008-04-17 19:55:59 UTC
Created attachment 2703 [details]
Alan Mycroft's hack for strlen()

Hi All.

I know that you only accept patches if
contributor signs an assignment.

But do you accept small bug fix without an
assignment, like that I attached to the bug ?

This is a fix for strlen() only
(I can do it for all other functions if you can accept
patches like this). It deletes old comments,
and adds Alan Mycroft's hack.

Stas.
Comment 4 Carlos O'Donell 2008-04-22 13:03:57 UTC
Stas,

The changes are more than trivial, you would need a copyright assignment for
glibc on file with the FSF.
Comment 5 Stas Yakovlev 2008-04-23 15:48:26 UTC
Carlos,

Thanks for reply.
Comment 6 Ulrich Drepper 2009-03-15 08:50:01 UTC
I changed the code.  Although noboy seems to use it.