This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Ping: [PATCH v4] faster strlen on x64


I tried to run that test suite on a Haswell machine we have, to
compare the _revised and _new versions, but got a segmentation fault.
I downloaded the archive, extracted everything, and ran "make" and then
"./benchmarks" in the test directory.
The program crashed when the ./benchmarks script called the ./report binary.

The stack is:
Program received signal SIGSEGV, Segmentation fault.
0x000000301524d4d8 in __printf_fp () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.15-57.fc17.x86_64
(gdb) bt
#0  0x000000301524d4d8 in __printf_fp () from /lib64/libc.so.6
#1  0x000000301524a748 in vfprintf () from /lib64/libc.so.6
#2  0x000000301526e124 in vsprintf () from /lib64/libc.so.6
#3  0x0000003015250987 in sprintf () from /lib64/libc.so.6
#4  0x0000000000402984 in report_fn (smp=0x7ffff7fee000, fname=0x403d47 "function", flags=0, binaries=0x7ffff7ffbd20) at report.c:91
#5  0x0000000000403603 in main () at functions.h:1


--
Liubov Dmitrieva

2013/2/25 Ondřej Bílka <neleai@seznam.cz>:
> Ping,
>
>
> On Wed, Feb 13, 2013 at 12:38:40PM +0100, Ondřej Bílka wrote:
>> Hello,
>>
>> I wrote about the previous version that an unaligned read of the first
>> 16 bytes is a bad tradeoff. When I wrote a faster strcpy header I
>> realized that this was because I was doing a separate check for
>> whether it crosses a page.
>>
>> When I only check that the next 64 bytes do not cross a page and do
>> an unaligned 16-byte load first, it causes only a small overhead for
>> larger strings. This makes my implementation faster for a wider family
>> of workloads. It speeds up the gcc benchmark and most other programs.
>>
>> On unit tests the revised version is somewhat slower than the previous
>> version. This is caused by the first-16-bytes path being taken only
>> rarely, which causes branch misprediction.
>>
>> I made two additional small improvements. The first is squashing in
>> the padding patch. The second is that the page-cross test can be done
>> as x%4096 < 4096-48 instead of x%4096 <= 4096-64 because I align x to
>> 16 bytes.
>>
>> I updated the benchmarks; the difference between the new and revised
>> versions is at
>> http://kam.mff.cuni.cz/~ondra/benchmark_string/strlen_profile.html
>> http://kam.mff.cuni.cz/~ondra/strlen_profile.tar.bz2
>>
>>
>> Ondra
>
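
For reference, the two page-cross tests quoted above can be compared
with a small stand-alone sketch. This is not code from the patch: the
function names and the SKETCH_PAGE_SIZE macro are made up for
illustration, and the 4096-byte page size is taken from the quoted
x%4096 expressions. It only checks that both forms agree whenever x is
aligned to 16 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size, matching the x%4096 in the quoted message. */
#define SKETCH_PAGE_SIZE 4096u

/* Original form: the 64 bytes starting at x stay inside one page. */
static bool no_cross_64(uintptr_t x)
{
    return x % SKETCH_PAGE_SIZE <= SKETCH_PAGE_SIZE - 64;
}

/* Revised form, meant only for x already aligned to 16 bytes. */
static bool no_cross_aligned(uintptr_t x)
{
    return x % SKETCH_PAGE_SIZE < SKETCH_PAGE_SIZE - 48;
}

/* Count the 16-byte-aligned page offsets where the two tests disagree. */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (uintptr_t x = 0; x < SKETCH_PAGE_SIZE; x += 16)
        if (no_cross_64(x) != no_cross_aligned(x))
            mismatches++;
    return mismatches;
}

/* Aborts if the two forms ever differ on an aligned offset. */
static void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_equivalence() passes because an aligned offset is a
multiple of 16, so "less than 4096-48" and "at most 4096-64" select the
same offsets. For an unaligned x such as 4040 the two forms differ, so
the revised test is only valid after the alignment step.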

