This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Massive performance regression of glibc string functions


I am using the rdtsc timing from the glibc string tests. Here is the strlen
data (in TSC ticks, lower is better) on

Intel(R) Xeon(R) CPU           X3350  @ 2.66GHz

                    	strlen_2_11	builtin_strlen	strlen in glibc 2.9
LAT: Pos    1, alignment  0:	8	16	16
LAT: Pos    2, alignment  0:	8	24	16
LAT: Pos    3, alignment  0:	8	24	16
LAT: Pos    4, alignment  0:	8	24	16
LAT: Pos    5, alignment  0:	8	24	16
LAT: Pos    6, alignment  0:	8	24	24
LAT: Pos    7, alignment  0:	8	24	16
LAT: Pos    1, alignment  1:	8	16	8
LAT: Pos    2, alignment  2:	8	24	16
LAT: Pos    3, alignment  3:	8	24	16
LAT: Pos    4, alignment  4:	8	32	24
LAT: Pos    5, alignment  5:	8	32	24
LAT: Pos    6, alignment  6:	16	32	24
LAT: Pos    7, alignment  7:	16	32	24
LAT: Pos    4, alignment  0:	8	24	16
LAT: Pos    4, alignment  1:	16	24	16
LAT: Pos    8, alignment  0:	8	24	16
LAT: Pos    8, alignment  1:	8	40	32
LAT: Pos   16, alignment  0:	16	24	24
LAT: Pos   16, alignment  1:	16	40	32
LAT: Pos   32, alignment  0:	16	32	24
LAT: Pos   32, alignment  1:	16	48	40
LAT: Pos   64, alignment  0:	24	40	40
LAT: Pos   64, alignment  1:	24	56	56
LAT: Pos  128, alignment  0:	32	64	64
LAT: Pos  128, alignment  1:	32	80	80
LAT: Pos  256, alignment  0:	56	136	128
LAT: Pos  256, alignment  1:	56	152	136
LAT: Pos  512, alignment  0:	96	264	256
LAT: Pos  512, alignment  1:	96	272	264
LAT: Pos 1024, alignment  0:	224	512	504
LAT: Pos 1024, alignment  1:	224	528	520
LAT: Pos    1, alignment  0:	8	16	16
LAT: Pos    2, alignment  0:	8	24	16
LAT: Pos    3, alignment  0:	8	24	16
LAT: Pos    4, alignment  0:	8	24	16
LAT: Pos    5, alignment  0:	8	24	16
LAT: Pos    6, alignment  0:	8	24	24
LAT: Pos    7, alignment  0:	8	24	16
LAT: Pos    1, alignment  1:	16	16	8
LAT: Pos    2, alignment  2:	8	24	16
LAT: Pos    3, alignment  3:	8	24	16
LAT: Pos    4, alignment  4:	8	32	24
LAT: Pos    5, alignment  5:	16	32	24
LAT: Pos    6, alignment  6:	8	32	24
LAT: Pos    7, alignment  7:	16	32	24
LAT: Pos    4, alignment  0:	8	24	16
LAT: Pos    4, alignment  1:	8	24	16
LAT: Pos    8, alignment  0:	8	24	16
LAT: Pos    8, alignment  1:	8	40	32
LAT: Pos   16, alignment  0:	16	24	24
LAT: Pos   16, alignment  1:	16	40	32
LAT: Pos   32, alignment  0:	16	32	24
LAT: Pos   32, alignment  1:	16	48	40
LAT: Pos   64, alignment  0:	24	40	40
LAT: Pos   64, alignment  1:	24	56	56
LAT: Pos  128, alignment  0:	32	64	64
LAT: Pos  128, alignment  1:	32	80	80
LAT: Pos  256, alignment  0:	56	136	128
LAT: Pos  256, alignment  1:	56	152	136
LAT: Pos  512, alignment  0:	96	264	256
LAT: Pos  512, alignment  1:	96	272	264
LAT: Pos 1024, alignment  0:	224	512	504
LAT: Pos 1024, alignment  1:	224	528	520
LAT: Pos    0, alignment  0:	8	16	16
LAT: Pos    1, alignment  0:	8	16	16
LAT: Pos    1, alignment  1:	8	16	8
LAT: Pos    2, alignment  0:	8	24	16
LAT: Pos    2, alignment  1:	16	24	8
LAT: Pos    2, alignment  2:	8	24	16
LAT: Pos    3, alignment  0:	8	24	16
LAT: Pos    3, alignment  1:	8	24	16
LAT: Pos    3, alignment  2:	16	24	16
LAT: Pos    3, alignment  3:	16	24	16
LAT: Pos    4, alignment  0:	8	24	16
LAT: Pos    4, alignment  1:	8	24	16
LAT: Pos    4, alignment  2:	16	24	16
LAT: Pos    4, alignment  3:	8	24	16
LAT: Pos    4, alignment  4:	16	32	24
LAT: Pos    5, alignment  0:	8	24	16
LAT: Pos    5, alignment  1:	8	32	24
LAT: Pos    5, alignment  2:	16	32	24
LAT: Pos    5, alignment  3:	16	32	24
LAT: Pos    5, alignment  4:	16	32	24
LAT: Pos    5, alignment  5:	8	32	24
LAT: Pos    6, alignment  0:	8	24	24
LAT: Pos    6, alignment  1:	16	32	24
LAT: Pos    6, alignment  2:	16	32	24
LAT: Pos    6, alignment  3:	8	32	24
LAT: Pos    6, alignment  4:	16	32	24
LAT: Pos    6, alignment  5:	16	32	24
LAT: Pos    6, alignment  6:	16	32	24
LAT: Pos    7, alignment  0:	8	24	16
LAT: Pos    7, alignment  1:	8	40	32
LAT: Pos    7, alignment  2:	16	32	32
LAT: Pos    7, alignment  3:	16	32	24
LAT: Pos    7, alignment  4:	8	32	24
LAT: Pos    7, alignment  5:	16	32	24
LAT: Pos    7, alignment  6:	8	32	24
LAT: Pos    7, alignment  7:	16	32	24
LAT: Pos    8, alignment  0:	8	24	16
LAT: Pos    8, alignment  1:	8	40	32
LAT: Pos    8, alignment  2:	16	32	32
LAT: Pos    8, alignment  3:	16	32	24
LAT: Pos    8, alignment  4:	8	32	32
LAT: Pos    8, alignment  5:	8	32	24
LAT: Pos    8, alignment  6:	8	32	24
LAT: Pos    8, alignment  7:	16	24	24
LAT: Pos    8, alignment  8:	16	24	16
LAT: Pos    9, alignment  0:	8	24	16
LAT: Pos    9, alignment  1:	16	40	32
LAT: Pos    9, alignment  2:	8	40	32
LAT: Pos    9, alignment  3:	16	32	24
LAT: Pos    9, alignment  4:	8	32	32
LAT: Pos    9, alignment  5:	16	32	24
LAT: Pos    9, alignment  6:	8	32	24
LAT: Pos    9, alignment  7:	16	24	16
LAT: Pos    9, alignment  8:	16	24	16
LAT: Pos    9, alignment  9:	8	40	32
LAT: Pos   10, alignment  0:	8	24	16
LAT: Pos   10, alignment  1:	16	40	32
LAT: Pos   10, alignment  2:	8	40	32
LAT: Pos   10, alignment  3:	16	40	32
LAT: Pos   10, alignment  4:	16	32	32
LAT: Pos   10, alignment  5:	8	32	24
LAT: Pos   10, alignment  6:	16	32	16
LAT: Pos   10, alignment  7:	16	24	24
LAT: Pos   10, alignment  8:	16	24	16
LAT: Pos   10, alignment  9:	16	40	32
LAT: Pos   10, alignment 10:	16	40	32
LAT: Pos   11, alignment  0:	8	24	16
LAT: Pos   11, alignment  1:	8	40	32
LAT: Pos   11, alignment  2:	8	40	32
LAT: Pos   11, alignment  3:	8	40	32
LAT: Pos   11, alignment  4:	8	32	32
LAT: Pos   11, alignment  5:	16	32	24
LAT: Pos   11, alignment  6:	16	32	24
LAT: Pos   11, alignment  7:	16	24	24
LAT: Pos   11, alignment  8:	16	24	16
LAT: Pos   11, alignment  9:	16	40	32
LAT: Pos   11, alignment 10:	16	40	32
LAT: Pos   11, alignment 11:	16	40	32
LAT: Pos   12, alignment  0:	8	24	16
LAT: Pos   12, alignment  1:	8	40	32
LAT: Pos   12, alignment  2:	8	40	32
LAT: Pos   12, alignment  3:	8	40	32
LAT: Pos   12, alignment  4:	16	32	32
LAT: Pos   12, alignment  5:	16	32	24
LAT: Pos   12, alignment  6:	16	32	24
LAT: Pos   12, alignment  7:	16	24	24
LAT: Pos   12, alignment  8:	16	24	16
LAT: Pos   12, alignment  9:	16	40	40
LAT: Pos   12, alignment 10:	16	40	32
LAT: Pos   12, alignment 11:	16	40	32
LAT: Pos   12, alignment 12:	16	32	32
LAT: Pos   13, alignment  0:	8	24	24
LAT: Pos   13, alignment  1:	8	40	40
LAT: Pos   13, alignment  2:	8	40	32
LAT: Pos   13, alignment  3:	16	32	32
LAT: Pos   13, alignment  4:	16	32	32
LAT: Pos   13, alignment  5:	16	32	24
LAT: Pos   13, alignment  6:	16	32	24
LAT: Pos   13, alignment  7:	16	24	24
LAT: Pos   13, alignment  8:	16	24	16
LAT: Pos   13, alignment  9:	16	40	40
LAT: Pos   13, alignment 10:	16	40	32
LAT: Pos   13, alignment 11:	16	32	32
LAT: Pos   13, alignment 12:	8	32	32
LAT: Pos   13, alignment 13:	16	32	24
LAT: Pos   14, alignment  0:	8	24	24
LAT: Pos   14, alignment  1:	16	40	32
LAT: Pos   14, alignment  2:	16	40	32
LAT: Pos   14, alignment  3:	16	32	32
LAT: Pos   14, alignment  4:	16	32	32
LAT: Pos   14, alignment  5:	16	32	24
LAT: Pos   14, alignment  6:	16	32	24
LAT: Pos   14, alignment  7:	16	32	24
LAT: Pos   14, alignment  8:	16	32	24
LAT: Pos   14, alignment  9:	16	40	32
LAT: Pos   14, alignment 10:	16	40	32
LAT: Pos   14, alignment 11:	16	40	32
LAT: Pos   14, alignment 12:	16	32	32
LAT: Pos   14, alignment 13:	16	32	24
LAT: Pos   14, alignment 14:	16	32	24
LAT: Pos   15, alignment  0:	8	24	24
LAT: Pos   15, alignment  1:	16	40	32
LAT: Pos   15, alignment  2:	16	40	32
LAT: Pos   15, alignment  3:	16	40	32
LAT: Pos   15, alignment  4:	16	32	32
LAT: Pos   15, alignment  5:	16	32	32
LAT: Pos   15, alignment  6:	16	32	32
LAT: Pos   15, alignment  7:	16	24	24
LAT: Pos   15, alignment  8:	16	24	24
LAT: Pos   15, alignment  9:	16	40	32
LAT: Pos   15, alignment 10:	16	40	32
LAT: Pos   15, alignment 11:	16	40	32
LAT: Pos   15, alignment 12:	8	32	32
LAT: Pos   15, alignment 13:	16	32	32
LAT: Pos   15, alignment 14:	16	32	32
LAT: Pos   15, alignment 15:	16	32	24
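
For readers who want to reproduce this kind of measurement outside the glibc
tree, here is a minimal sketch of timing one strlen call with rdtsc at a given
terminator position and alignment. It is not the glibc test harness: the names
read_ticks and time_strlen are hypothetical, taking the minimum over several
runs is my own choice, the 64-byte base alignment is an assumption, and the
non-x86 fallback clock exists only so the sketch builds elsewhere.

```c
#define _POSIX_C_SOURCE 200809L  /* posix_memalign, clock_gettime */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>           /* __rdtsc() with GCC or Clang */
#endif

/* Read a tick counter: the TSC on x86, a nanosecond clock elsewhere. */
static uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical helper: smallest tick count observed for one strlen call on a
   string whose terminator is at offset `pos` and whose first byte lies
   `align` bytes past a 64-byte boundary.  Returns UINT64_MAX on allocation
   failure. */
static uint64_t time_strlen(size_t pos, size_t align, int runs)
{
    assert(align < 64 && runs > 0);

    void *mem = NULL;
    if (posix_memalign(&mem, 64, align + pos + 1) != 0)
        return UINT64_MAX;

    char *s = (char *)mem + align;
    memset(s, 'a', pos);
    s[pos] = '\0';

    /* Call through a volatile pointer so the compiler cannot inline the call
       or fold it into a constant. */
    size_t (*volatile fn)(const char *) = strlen;
    volatile size_t sink = 0;
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < runs; i++) {
        uint64_t t0 = read_ticks();
        sink = fn(s);
        uint64_t t1 = read_ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }

    assert(sink == pos);         /* strlen must report the terminator offset */
    free(mem);
    return best;
}
```

Calling time_strlen(16, 1, 1000) corresponds to a "Pos 16, alignment 1" row.
Absolute numbers will differ from the table, since the tick source, the
serialization around rdtsc, and the surrounding loop are not the same.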

Data on memcmp and strcmp show similar results: the new implementations
in glibc 2.11 are much faster than the old ones in glibc 2.9.

If you believe there is a regression, please provide the lengths and
alignments of the input data, and I will take a look.
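
A minimal sketch of how the length and alignment of an input string could be
collected for such a report. The helper names pointer_alignment and
describe_input are hypothetical, and measuring alignment against a 16-byte
boundary is an assumption; use whatever boundary matters for the routine in
question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Offset of p past the previous `boundary`-byte boundary.
   `boundary` must be a power of two. */
static size_t pointer_alignment(const void *p, size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (size_t)((uintptr_t)p & (boundary - 1));
}

/* Format a one-line description of a string argument into `out`.
   Returns the value of snprintf. */
static int describe_input(const char *s, char *out, size_t outsz)
{
    return snprintf(out, outsz, "length %zu, alignment %zu",
                    strlen(s), pointer_alignment(s, 16));
}
```

Logging one such line per call site that looks slow gives the two numbers
asked for above.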

Thanks.


H.J.
----
On Fri, Nov 6, 2009 at 6:04 AM, Petr Baudis <pasky@suse.cz> wrote:
>  Hi!
>
>  I have been doing some benchmarking of several string functions and
> discovered that some of them are *much* slower than in the past; the
> regressions are measured against glibc-2.9. I'm testing on small
> strings (4..128, though for 128 a much bigger sample of calls would be
> needed for a good comparison), following the common wisdom that operations
> on small strings are the bulk of the calls.
>
>  In case of strlen(), there seems to be a regression only with very small
> strings on AMD, so this is probably fine.
>
>  In case of memcmp(), strcmp() and strncmp(), glibc-2.10.1 seems to
> improve performance somewhat, especially for larger strings, but
> glibc-2.11 has a massive performance drop across all vendors!
> (Interestingly, glibc-2.10.1 is also slightly slower than glibc-2.9 in
> these functions on Core i7.)
>
>  In case of strcmp() and strncmp(), glibc-2.10.1 seems to improve performance
> somewhat, especially for larger strings, but glibc-2.11 has a massive
> performance drop on all vendors.
>
>  I'd like to ask how the string routine changes were benchmarked,
> for which architectures and string sizes they are supposed to be
> optimized, and why. I think it would be good to do something about this
> regression. ;-)
>
>  For the benchmarking, I'm using
>
>        http://pasky.or.cz/~pasky/dev/glibc/strbench/
>
> that I quickly hacked together. Here is the data I have collected
> on various x86_64 systems, running with 2048 iterations; apply
> reasonable error margins, of course:
>
>
> model name      : AMD Opteron (tm) Processor 848
> cache size      : 1024 KB
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow rep_good nopl
>
> func,size       2.9-vanilla     2.10.1-vanilla  2.11-vanilla    2.11-amd
> strlen4         5.630000        6.890000        7.060000        5.660000
> strlen8         4.940000        3.580000        3.700000        4.170000
> strlen32        2.220000        1.340000        1.490000        2.310000
> strlen128       1.220000        0.830000        0.900000        1.330000
> memcmp4         3.350000        3.330000        4.400000        3.310000
> memcmp8         1.840000        1.740000        2.660000        2.140000
> memcmp32        0.970000        0.800000        1.770000        1.300000
> memcmp128       0.330000        0.310000        1.050000        0.650000
> strcmp4         2.400000        2.290000        5.620000        2.470000
> strcmp8         1.600000        1.280000        3.260000        1.560000
> strcmp32        0.950000        0.600000        1.630000        0.870000
> strcmp128       0.350000        0.210000        1.010000        0.310000
> strncmp4        2.560000        2.250000        5.880000        2.960000
> strncmp8        1.400000        1.410000        3.230000        1.700000
> strncmp32       0.710000        0.770000        1.370000        0.940000
> strncmp128      0.270000        0.270000        0.670000        0.350000
>
>
> model name      : Dual Core AMD Opteron(tm) Processor 165
> cache size      : 1024 KB
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
>
> func,size       2.9-vanilla     2.10.1-vanilla  2.11-vanilla    2.11-amd
> strlen4         6.780000        8.350000        8.580000        6.850000
> strlen8         5.920000        4.300000        4.420000        5.010000
> strlen32        2.570000        1.440000        1.430000        2.660000
> strlen128       1.260000        0.910000        0.850000        1.240000
> memcmp4         3.960000        4.040000        5.160000        2.840000
> memcmp8         2.020000        2.060000        3.000000        1.890000
> memcmp32        0.770000        0.720000        1.350000        0.980000
> memcmp128       0.260000        0.240000        0.540000        0.430000
> strcmp4         2.740000        2.750000        6.790000        2.910000
> strcmp8         1.410000        1.410000        3.600000        1.620000
> strcmp32        0.630000        0.580000        1.260000        0.700000
> strcmp128       0.200000        0.180000        0.620000        0.230000
> strncmp4        3.080000        2.720000        7.180000        3.540000
> strncmp8        1.580000        1.440000        3.940000        1.880000
> strncmp32       0.720000        0.670000        1.310000        0.840000
> strncmp128      0.240000        0.220000        0.550000        0.280000
>
>
> model name      : Intel(R) Xeon(R) CPU           X3220  @ 2.40GHz
> cache size      : 4096 KB
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
>
> func,size       2.9-vanilla     2.10.1-vanilla  2.11-vanilla    2.11-amd
> strlen4         3.870000        3.050000        3.270000        3.870000
> strlen8         2.370000        1.530000        1.640000        3.450000
> strlen32        1.040000        0.480000        0.470000        1.520000
> strlen128       0.600000        0.290000        0.280000        0.680000
> memcmp4         2.080000        2.260000        2.680000        1.800000
> memcmp8         1.040000        1.130000        1.460000        1.860000
> memcmp32        0.270000        0.270000        0.350000        0.770000
> memcmp128       0.070000        0.070000        0.090000        0.190000
> strcmp4         1.910000        1.910000        3.480000        1.920000
> strcmp8         0.960000        0.950000        1.200000        0.960000
> strcmp32        0.240000        0.240000        0.290000        0.240000
> strcmp128       0.060000        0.060000        0.080000        0.060000
> strncmp4        2.030000        1.690000        4.240000        2.810000
> strncmp8        1.020000        0.850000        1.610000        1.410000
> strncmp32       0.260000        0.210000        0.380000        0.360000
> strncmp128      0.070000        0.060000        0.100000        0.080000
>
>
> model name      : Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
> cache size      : 6144 KB
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority
>
> func,size       2.9-vanilla     2.10.1-vanilla  2.11-vanilla    2.11-amd
> strlen4         3.090000        2.960000        2.750000        3.450000
> strlen8         1.890000        1.230000        1.360000        3.140000
> strlen32        0.810000        0.370000        0.340000        1.220000
> strlen128       0.460000        0.220000        0.200000        0.660000
> memcmp4         2.160000        1.820000        2.500000        1.800000
> memcmp8         1.100000        0.910000        1.500000        1.170000
> memcmp32        0.310000        0.220000        0.320000        0.380000
> memcmp128       0.090000        0.060000        0.090000        0.110000
> strcmp4         1.860000        1.910000        3.530000        1.570000
> strcmp8         0.960000        0.960000        1.170000        0.840000
> strcmp32        0.280000        0.250000        0.300000        0.270000
> strcmp128       0.050000        0.050000        0.090000        0.070000
> strncmp4        1.740000        1.750000        3.790000        2.840000
> strncmp8        0.940000        0.850000        1.380000        1.380000
> strncmp32       0.220000        0.220000        0.320000        0.400000
> strncmp128      0.050000        0.050000        0.090000        0.080000
>
>
> model name      : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
> cache size      : 8192 KB
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm ida
>
> func,size       2.9-vanilla     2.10.1-vanilla  2.11-vanilla    2.11-amd
> strlen4         3.440000        3.500000        2.780000        3.320000
> strlen8         2.260000        1.750000        1.440000        2.220000
> strlen32        0.850000        0.500000        0.380000        0.900000
> strlen128       0.470000        0.260000        0.200000        0.500000
> memcmp4         2.180000        2.060000        2.500000        1.840000
> memcmp8         1.100000        1.050000        1.320000        1.060000
> memcmp32        0.270000        0.260000        0.350000        0.330000
> memcmp128       0.080000        0.070000        0.090000        0.090000
> strcmp4         1.660000        1.930000        2.250000        1.640000
> strcmp8         0.830000        0.970000        1.140000        0.840000
> strcmp32        0.210000        0.240000        0.240000        0.210000
> strcmp128       0.050000        0.070000        0.080000        0.060000
> strncmp4        1.740000        1.830000        2.490000        2.570000
> strncmp8        0.870000        0.920000        1.220000        1.300000
> strncmp32       0.220000        0.230000        0.260000        0.320000
> strncmp128      0.050000        0.050000        0.090000        0.080000
>
>
>  * numbers after function names indicate string sizes
>  ** 2.11-amd is a very old AMD-provided x86_64 string routines patch
> (it doesn't implement some of the new things like bounded pointers
> checks support) that we still use in SUSE glibc:
>
>        http://pasky.or.cz/~pasky/dev/glibc/amd64-string-2.11.diff
>
> If the regression against 2.10.1 is fixed, it is probably not very
> interesting; it performs better only at very short memcmp()s.
>
>  *** I can't seem to find newer AMD processors to test on right now,
> sorry. If you have any, feel free to run the benchmark there - just
> get the /strbench/ directory and run `./strbench.sh outfile`.
>
>  Kind regards,
>
> --
>                                Petr "Pasky" Baudis
> A lot of people have my books on their bookshelves.
> That's the problem, they need to read them. -- Don Knuth
>
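
To put the quoted tables in relative terms, here is a small sketch that turns
two columns into a slowdown factor. The values are copied from the Opteron 848
table (2.9-vanilla and 2.11-vanilla, size 4 rows). The helper names are
hypothetical, and it assumes, as the quoted text implies, that a larger number
means a slower routine.

```c
#include <assert.h>
#include <stdio.h>

struct sample {
    const char *name;
    double v2_9;     /* 2.9-vanilla column */
    double v2_11;    /* 2.11-vanilla column */
};

/* Size 4 rows from the quoted AMD Opteron 848 table. */
static const struct sample opteron_848[] = {
    { "strlen4",  5.63, 7.06 },
    { "memcmp4",  3.35, 4.40 },
    { "strcmp4",  2.40, 5.62 },
    { "strncmp4", 2.56, 5.88 },
};

/* Ratio of candidate to baseline; above 1.0 means the candidate is slower
   under the assumption that larger values are worse. */
static double slowdown(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return candidate / baseline;
}

/* Print one slowdown factor per row of the table above. */
static void print_slowdowns(FILE *out)
{
    size_t n = sizeof opteron_848 / sizeof opteron_848[0];
    for (size_t i = 0; i < n; i++)
        fprintf(out, "%-10s %.2fx\n", opteron_848[i].name,
                slowdown(opteron_848[i].v2_9, opteron_848[i].v2_11));
}
```

For strcmp4 this gives 5.62 / 2.40, a factor of roughly 2.3 between 2.9 and
2.11 on that machine.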



-- 
H.J.

