This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
RE: String Functions for x86-64 (memcmp)
- From: "Menezes, Evandro" <evandro dot menezes at amd dot com>
- To: "Ulrich Drepper" <drepper at redhat dot com>, libc-alpha at sourceware dot org
- Cc: "Meissner, Michael" <michael dot meissner at amd dot com>
- Date: Mon, 15 May 2006 17:18:54 -0500
- Subject: RE: String Functions for x86-64 (memcmp)
The results for the proposed memcmp are on the left; those for the current one are on the right. I filtered out duplicated lines with sort.
First, on an Athlon 64:
memcmp simple_memcmp | memcmp simple_memcmp
Length 1, alignment 0/ 0: 9 8 | Length 1, alignment 0/ 0: 16 8
Length 1, alignment 1/ 1: 9 8 | Length 1, alignment 1/ 1: 16 8
Length 2, alignment 0/ 0: 16 13 | Length 2, alignment 0/ 0: 19 13
Length 2, alignment 2/ 2: 16 13 | Length 2, alignment 2/ 2: 19 13
Length 3, alignment 0/ 0: 19 17 | Length 3, alignment 0/ 0: 23 17
Length 3, alignment 3/ 3: 19 17 | Length 3, alignment 3/ 3: 23 17
Length 4, alignment 0/ 0: 21 21 | Length 4, alignment 0/ 0: 25 21
Length 4, alignment 4/ 4: 21 21 | Length 4, alignment 4/ 4: 25 21
Length 5, alignment 0/ 0: 24 38 | Length 5, alignment 0/ 0: 28 38
Length 5, alignment 5/ 5: 39 38 | Length 5, alignment 5/ 5: 28 38
Length 6, alignment 0/ 0: 42 42 | Length 6, alignment 0/ 0: 51 42
Length 6, alignment 6/ 6: 41 42 | Length 6, alignment 6/ 6: 51 42
Length 7, alignment 0/ 0: 44 47 | Length 7, alignment 0/ 0: 52 47
Length 7, alignment 7/ 7: 44 47 | Length 7, alignment 7/ 7: 52 47
Length 8, alignment 0/ 0: 12 50 | Length 8, alignment 0/ 0: 56 50
Length 9, alignment 0/ 0: 16 49 | Length 9, alignment 0/ 0: 61 49
Length 9, alignment 1/ 1: 16 49 | Length 9, alignment 1/ 1: 61 49
Length 10, alignment 0/ 0: 18 56 | Length 10, alignment 0/ 0: 64 56
Length 10, alignment 2/ 2: 18 56 | Length 10, alignment 2/ 2: 64 56
Length 11, alignment 0/ 0: 23 58 | Length 11, alignment 0/ 0: 63 58
Length 11, alignment 3/ 3: 23 58 | Length 11, alignment 3/ 3: 63 58
Length 12, alignment 0/ 0: 40 62 | Length 12, alignment 0/ 0: 70 62
Length 12, alignment 4/ 4: 26 62 | Length 12, alignment 4/ 4: 70 62
Length 13, alignment 0/ 0: 42 66 | Length 13, alignment 0/ 0: 72 66
Length 13, alignment 5/ 5: 28 66 | Length 13, alignment 5/ 5: 72 66
Length 14, alignment 0/ 0: 45 71 | Length 14, alignment 0/ 0: 76 71
Length 14, alignment 6/ 6: 46 71 | Length 14, alignment 6/ 6: 76 71
Length 15, alignment 0/ 0: 48 74 | Length 15, alignment 0/ 0: 80 74
Length 15, alignment 7/ 7: 48 74 | Length 15, alignment 7/ 7: 80 74
Length 16, alignment 0/ 0: 16 73 | Length 16, alignment 0/ 0: 28 73
Length 16, alignment 1/ 2: 15 75 | Length 16, alignment 1/ 2: 62 75
Length 32, alignment 0/ 0: 16 130 | Length 32, alignment 0/ 0: 35 130
Length 32, alignment 2/ 4: 19 124 | Length 32, alignment 2/ 4: 67 124
Length 32, alignment 7/ 2: 19 119 | Length 32, alignment 7/ 2: 67 119
Length 64, alignment 0/ 0: 23 242 | Length 64, alignment 0/ 0: 50 242
Length 64, alignment 3/ 6: 31 215 | Length 64, alignment 3/ 6: 104 215
Length 64, alignment 6/ 4: 31 215 | Length 64, alignment 6/ 4: 87 215
Length 128, alignment 0/ 0: 33 457 | Length 128, alignment 0/ 0: 80 457
Length 128, alignment 4/ 0: 41 409 | Length 128, alignment 4/ 0: 110 409
Length 128, alignment 5/ 6: 55 410 | Length 128, alignment 5/ 6: 149 410
Length 256, alignment 0/ 0: 72 898 | Length 256, alignment 0/ 0: 153 809
Length 256, alignment 4/ 0: 93 793 | Length 256, alignment 4/ 0: 210 793
Length 256, alignment 5/ 2: 119 791 | Length 256, alignment 5/ 2: 231 791
Length 512, alignment 0/ 0: 124 1778 | Length 512, alignment 0/ 0: 273 1778
Length 512, alignment 3/ 2: 215 1564 | Length 512, alignment 3/ 2: 400 1564
Length 512, alignment 6/ 4: 215 1559 | Length 512, alignment 6/ 4: 398 1559
Length 1024, alignment 0/ 0: 228 3529 | Length 1024, alignment 0/ 0: 513 3529
Length 1024, alignment 2/ 4: 407 3100 | Length 1024, alignment 2/ 4: 734 3100
Length 1024, alignment 7/ 6: 407 3101 | Length 1024, alignment 7/ 6: 751 3101
Length 2048, alignment 0/ 0: 436 7042 | Length 2048, alignment 0/ 0: 993 7042
Length 2048, alignment 1/ 6: 791 6167 | Length 2048, alignment 1/ 6: 1421 6167
Length 4096, alignment 0/ 0: 731 12328 | Length 4096, alignment 0/ 0: 1953 13479
Now, on a P4:
memcmp simple_memcmp | memcmp simple_memcmp
Length 1, alignment 0/ 0: 0 8 | Length 1, alignment 0/ 0: 0 0
Length 1, alignment 1/ 1: 8 8 | Length 1, alignment 1/ 1: 0 0
Length 2, alignment 0/ 0: 0 8 | Length 2, alignment 0/ 0: 0 0
Length 2, alignment 2/ 2: 0 0 | Length 2, alignment 2/ 2: 0 40
Length 3, alignment 0/ 0: 0 8 | Length 3, alignment 0/ 0: 0 40
Length 3, alignment 3/ 3: 0 8 | Length 3, alignment 3/ 3: 0 0
Length 4, alignment 0/ 0: 8 0 | Length 4, alignment 0/ 0: 16 0
Length 4, alignment 4/ 4: 8 0 | Length 4, alignment 4/ 4: 16 0
Length 5, alignment 0/ 0: 0 16 | Length 5, alignment 0/ 0: 24 8
Length 5, alignment 5/ 5: 8 16 | Length 5, alignment 5/ 5: 24 8
Length 6, alignment 0/ 0: 16 24 | Length 6, alignment 0/ 0: 32 16
Length 6, alignment 6/ 6: 16 24 | Length 6, alignment 6/ 6: 32 16
Length 7, alignment 0/ 0: 24 32 | Length 7, alignment 0/ 0: 32 24
Length 7, alignment 7/ 7: 24 32 | Length 7, alignment 7/ 7: 40 24
Length 8, alignment 0/ 0: 0 40 | Length 8, alignment 0/ 0: 40 32
Length 9, alignment 0/ 0: 8 136 | Length 9, alignment 0/ 0: 48 128
Length 9, alignment 1/ 1: 8 136 | Length 9, alignment 1/ 1: 48 128
Length 10, alignment 0/ 0: 0 80 | Length 10, alignment 0/ 0: 56 72
Length 10, alignment 2/ 2: 0 80 | Length 10, alignment 2/ 2: 56 72
Length 11, alignment 0/ 0: 8 88 | Length 11, alignment 0/ 0: 152 80
Length 11, alignment 3/ 3: 8 88 | Length 11, alignment 3/ 3: 152 80
Length 12, alignment 0/ 0: 8 88 | Length 12, alignment 0/ 0: 104 80
Length 12, alignment 4/ 4: 16 88 | Length 12, alignment 4/ 4: 104 80
Length 13, alignment 0/ 0: 24 96 | Length 13, alignment 0/ 0: 112 160
Length 13, alignment 5/ 5: 24 168 | Length 13, alignment 5/ 5: 112 160
Length 14, alignment 0/ 0: 32 176 | Length 14, alignment 0/ 0: 120 160
Length 14, alignment 6/ 6: 32 168 | Length 14, alignment 6/ 6: 120 168
Length 15, alignment 0/ 0: 40 168 | Length 15, alignment 0/ 0: 128 160
Length 15, alignment 7/ 7: 32 168 | Length 15, alignment 7/ 7: 120 160
Length 16, alignment 0/ 0: 0 176 | Length 16, alignment 0/ 0: 0 168
Length 16, alignment 1/ 2: 8 96 | Length 16, alignment 1/ 2: 64 88
Length 32, alignment 0/ 0: 8 312 | Length 32, alignment 0/ 0: 16 296
Length 32, alignment 2/ 4: 8 152 | Length 32, alignment 2/ 4: 72 144
Length 32, alignment 7/ 2: 8 312 | Length 32, alignment 7/ 2: 72 304
Length 64, alignment 0/ 0: 8 432 | Length 64, alignment 0/ 0: 24 432
Length 64, alignment 3/ 6: 8 288 | Length 64, alignment 3/ 6: 104 280
Length 64, alignment 6/ 4: 8 440 | Length 64, alignment 6/ 4: 104 432
Length 128, alignment 0/ 0: 32 688 | Length 128, alignment 0/ 0: 40 688
Length 128, alignment 4/ 0: 32 544 | Length 128, alignment 4/ 0: 152 536
Length 128, alignment 5/ 6: 32 688 | Length 128, alignment 5/ 6: 192 688
Length 256, alignment 0/ 0: 72 1056 | Length 256, alignment 0/ 0: 120 1048
Length 256, alignment 4/ 0: 96 1056 | Length 256, alignment 4/ 0: 352 1048
Length 256, alignment 5/ 2: 112 1056 | Length 256, alignment 5/ 2: 392 1048
Length 512, alignment 0/ 0: 192 2080 | Length 512, alignment 0/ 0: 184 2072
Length 512, alignment 3/ 2: 384 2072 | Length 512, alignment 3/ 2: 688 2064
Length 512, alignment 6/ 4: 392 2080 | Length 512, alignment 6/ 4: 672 2064
Length 1024, alignment 0/ 0: 344 4128 | Length 1024, alignment 0/ 0: 312 4120
Length 1024, alignment 2/ 4: 792 4128 | Length 1024, alignment 2/ 4: 1256 4128
Length 1024, alignment 7/ 6: 784 4128 | Length 1024, alignment 7/ 6: 1248 4112
Length 2048, alignment 0/ 0: 664 8224 | Length 2048, alignment 0/ 0: 568 8216
Length 2048, alignment 1/ 6: 1592 8224 | Length 2048, alignment 1/ 6: 2416 8216
Length 4096, alignment 0/ 0: 1112 16408 | Length 4096, alignment 0/ 0: 1080 16408
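For anyone who wants to post-process the tables above, here is a minimal sketch in Python. It assumes each row keeps the "Length N, alignment A/ B: x y | ..." layout shown above, with the proposed memcmp in the first column on the left and the current memcmp in the first column on the right. The names parse_row and speedup are hypothetical helpers for illustration only and are not part of the test harness that produced these numbers.

```python
import re

# Hypothetical helpers (not from the original post) for reading one
# side-by-side result row of the tables above.
ROW = re.compile(
    r"Length\s+(\d+), alignment\s+(\d+)/\s*(\d+):\s+(\d+)\s+(\d+)"
)


def parse_row(line):
    """Return (length, align1, align2, proposed, current) for one row.

    'proposed' is the memcmp column left of the '|' and 'current' is the
    memcmp column right of it; the simple_memcmp columns are ignored.
    """
    left, right = line.split("|")
    l = ROW.search(left)
    r = ROW.search(right)
    if l is None or r is None:
        raise ValueError("not a result row: %r" % line)
    length, a1, a2, proposed, _simple = map(int, l.groups())
    current = int(r.group(4))
    return length, a1, a2, proposed, current


def speedup(line):
    """Ratio current/proposed, or None when the proposed reading is 0."""
    _length, _a1, _a2, proposed, current = parse_row(line)
    if proposed == 0:
        return None  # several P4 rows report 0, so no ratio is defined
    return current / proposed
```

A value above 1 from speedup means the proposed memcmp took the lower reading for that row. Rows with a zero reading, which occur in the P4 table, return None rather than a ratio.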
Thanks,
--
_______________________________________________________
Evandro Menezes AMD Austin, TX