This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Intel's new rte_memcpy()


On Fri, Jan 30, 2015 at 5:52 AM, Luke Gorrie <luke@snabb.co> wrote:
> Howdy!
>
> I am hoping for some feedback and advice for me as an application developer.
>
> Intel have recently posted a couple of memcpy() implementations and
> suggested that these have significant advantages for networking
> applications. There is one for Sandy Bridge and one for Haswell. The
> proposal is that networking application developers would statically
> link one or both of these into their applications instead of
> dynamically linking with glibc. The proposal is part of their Data
> Plane Development Kit (dpdk.org).
>
> They explain it much better than I do:
> http://dpdk.org/ml/archives/dev/2014-November/008158.html
>
> and their code is here:
> https://gist.github.com/lukego/efc82a15bde5ec83cb1b
>
> My question to the list is this:
>
> Should networking application developers adopt Intel's custom
> implementation if (like me) they are absolutely dependent on good and
> consistent performance of memcpy on all recent hardware (>= Sandy
> Bridge) and Linux distributions? (and then -- what to do about
> memmove?)
>
> I have done some cursory benchmarks with cachebench:
> http://dpdk.org/ml/archives/dev/2015-January/011574.html
>
> ... with a correction to the rte_memcpy on Haswell results:
> http://dpdk.org/ml/archives/dev/2015-January/011691.html
>

I imported it into the hjl/memcpy branch at

https://sourceware.org/git/?p=glibc.git;a=summary

Here is the bench-memcpy comparison against __memcpy_avx_unaligned
on Haswell:

                          __memcpy_rte_avx __memcpy_avx_unaligned
Length 1, alignment 0/ 0:    9.64062        10.5625
Length 1, alignment 0/ 0:    9.26562        9.54688
Length 1, alignment 0/ 0:    8.75        9.45312
Length 1, alignment 0/ 0:    8.65625        9.125
Length 2, alignment 0/ 0:    10.7969        10
Length 2, alignment 1/ 0:    9.48438        8.98438
Length 2, alignment 0/ 1:    9.82812        8.89062
Length 2, alignment 1/ 1:    9.125        8.89062
Length 4, alignment 0/ 0:    11.3594        9.73438
Length 4, alignment 2/ 0:    9.64062        9.125
Length 4, alignment 0/ 2:    9.35938        8.79688
Length 4, alignment 2/ 2:    9.3125        8.20312
Length 8, alignment 0/ 0:    10.4375        9.17188
Length 8, alignment 3/ 0:    9.07812        7.59375
Length 8, alignment 0/ 3:    9.53125        7.95312
Length 8, alignment 3/ 3:    9.53125        7.95312
Length 16, alignment 0/ 0:    9.45312        10.7969
Length 16, alignment 4/ 0:    8.28125        10
Length 16, alignment 0/ 4:    8.51562        10.0469
Length 16, alignment 4/ 4:    8.09375        9.67188
Length 32, alignment 0/ 0:    7.40625        10.9375
Length 32, alignment 5/ 0:    8.32812        12.9844
Length 32, alignment 0/ 5:    7.40625        11.9219
Length 32, alignment 5/ 5:    7.21875        12
Length 64, alignment 0/ 0:    9.92188        14.0156
Length 64, alignment 6/ 0:    8.75        21.4062
Length 64, alignment 0/ 6:    8.98438        18.9531
Length 64, alignment 6/ 6:    8.46875        18.75
Length 128, alignment 0/ 0:    13.2188        24.75
Length 128, alignment 7/ 0:    13.2188        32.4844
Length 128, alignment 0/ 7:    15.7344        32.5312
Length 128, alignment 7/ 7:    13.4062        36.9531
Length 256, alignment 0/ 0:    14.9844        19.7812
Length 256, alignment 8/ 0:    17.6875        24.1094
Length 256, alignment 0/ 8:    37.7969        22.2969
Length 256, alignment 8/ 8:    17.5469        20.5781
Length 512, alignment 0/ 0:    20.6094        24.5312
Length 512, alignment 9/ 0:    22.6562        27.9219
Length 512, alignment 0/ 9:    71.6719        27.7031
Length 512, alignment 9/ 9:    30.8125        26.9062
Length 1024, alignment 0/ 0:    39.4219        36.3125
Length 1024, alignment 10/ 0:    37.6562        42.0312
Length 1024, alignment 0/10:    44.875        41.9375
Length 1024, alignment 10/10:    39.4219        43.3281
Length 2048, alignment 0/ 0:    64.6562        97.0469
Length 2048, alignment 11/ 0:    65.25        97.0781
Length 2048, alignment 0/11:    82.7969        607.281
Length 2048, alignment 11/11:    69.4844        138.047
Length 4096, alignment 0/ 0:    122.453        153.781
Length 4096, alignment 12/ 0:    158.016        181.328
Length 4096, alignment 0/12:    218.609        1104.64
Length 4096, alignment 12/12:    174.172        300.797
Length 8192, alignment 0/ 0:    243.156        275.859
Length 8192, alignment 13/ 0:    311.406        330.312
Length 8192, alignment 0/13:    394.359        1802.88
Length 8192, alignment 13/13:    289.312        532.922
Length 16384, alignment 0/ 0:    568.938        553.203
Length 16384, alignment 14/ 0:    683.812        671.844
Length 16384, alignment 0/14:    859.125        3364.7
Length 16384, alignment 14/14:    611.469        1001.5
Length 32768, alignment 0/ 0:    3704.61        3683.7
Length 32768, alignment 15/ 0:    3793.03        3845.58
Length 32768, alignment 0/15:    3776.97        5321.25
Length 32768, alignment 15/15:    3742.62        3986.92
Length 65536, alignment 0/ 0:    7831.95        7480.41
Length 65536, alignment 16/ 0:    8018.58        7784.28
Length 65536, alignment 0/16:    7914.61        10203.6
Length 65536, alignment 16/16:    7902.78        8019.75
Length 0, alignment 0/ 0:    11.8594        12.0938
Length 0, alignment 0/ 0:    10.0938        11.4062
Length 0, alignment 0/ 0:    9.5        11.3125
Length 0, alignment 0/ 0:    9.8125        11.0781
Length 1, alignment 0/ 0:    10.0938        10.2969
Length 1, alignment 1/ 0:    8.79688        9.07812
Length 1, alignment 0/ 1:    8.65625        9.17188
Length 1, alignment 1/ 1:    8.65625        9.21875
Length 2, alignment 0/ 0:    10.8438        10.2344
Length 2, alignment 2/ 0:    9.78125        8.75
Length 2, alignment 0/ 2:    9.03125        9.03125
Length 2, alignment 2/ 2:    9.03125        8.51562
Length 3, alignment 0/ 0:    9.5        9.5
Length 3, alignment 3/ 0:    8.9375        8.32812
Length 3, alignment 0/ 3:    8.98438        8.28125
Length 3, alignment 3/ 3:    8.9375        8.60938
Length 4, alignment 0/ 0:    12.7031        8.84375
Length 4, alignment 4/ 0:    10.1406        8.70312
Length 4, alignment 0/ 4:    9.40625        8.28125
Length 4, alignment 4/ 4:    9.71875        8.51562
Length 5, alignment 0/ 0:    9.59375        9.21875
Length 5, alignment 5/ 0:    9.03125        8.23438
Length 5, alignment 0/ 5:    8.70312        8.28125
Length 5, alignment 5/ 5:    8.75        8.51562
Length 6, alignment 0/ 0:    10.6562        9.03125
Length 6, alignment 6/ 0:    8.84375        8.23438
Length 6, alignment 0/ 6:    8.84375        8.28125
Length 6, alignment 6/ 6:    9.125        8.46875
Length 7, alignment 0/ 0:    10.6562        8.98438
Length 7, alignment 7/ 0:    8.70312        8.79688
Length 7, alignment 0/ 7:    8.84375        8.28125
Length 7, alignment 7/ 7:    9.07812        8.46875
Length 8, alignment 0/ 0:    11.0781        8.5625
Length 8, alignment 8/ 0:    9.07812        7.48438
Length 8, alignment 0/ 8:    9.07812        7.17188
Length 8, alignment 8/ 8:    9.07812        8.04688
Length 9, alignment 0/ 0:    10.2344        8.65625
Length 9, alignment 9/ 0:    8.84375        8.04688
Length 9, alignment 0/ 9:    8.375        7.53125
Length 9, alignment 9/ 9:    8.46875        8.04688
Length 10, alignment 0/ 0:    11.125        8.46875
Length 10, alignment 10/ 0:    9.5        7.07812
Length 10, alignment 0/10:    9.03125        7.53125
Length 10, alignment 10/10:    9.07812        7.90625
Length 11, alignment 0/ 0:    11.3594        8.375
Length 11, alignment 11/ 0:    8.79688        7.17188
Length 11, alignment 0/11:    8.79688        7.17188
Length 11, alignment 11/11:    9.71875        7.90625
Length 12, alignment 0/ 0:    10.9844        8.65625
Length 12, alignment 12/ 0:    8.79688        7.17188
Length 12, alignment 0/12:    8.89062        7.17188
Length 12, alignment 12/12:    8.10938        7.90625
Length 13, alignment 0/ 0:    10.8906        8.75
Length 13, alignment 13/ 0:    8.65625        8.04688
Length 13, alignment 0/13:    8.9375        7.21875
Length 13, alignment 13/13:    9.71875        7.45312
Length 14, alignment 0/ 0:    10.8906        8.5625
Length 14, alignment 14/ 0:    8.98438        7.625
Length 14, alignment 0/14:    8.84375        7.21875
Length 14, alignment 14/14:    8.79688        7.45312
Length 15, alignment 0/ 0:    11.3125        8.46875
Length 15, alignment 15/ 0:    9.03125        7.625
Length 15, alignment 0/15:    9.07812        7.57812
Length 15, alignment 15/15:    9.5        8.375
Length 16, alignment 0/ 0:    9.625        10.0469
Length 16, alignment 16/ 0:    6.9375        9.76562
Length 16, alignment 0/16:    6.46875        9.07812
Length 16, alignment 16/16:    8.0625        9.67188
Length 17, alignment 0/ 0:    8.75        10.0625
Length 17, alignment 17/ 0:    6.51562        9.59375
Length 17, alignment 0/17:    6.46875        9.03125
Length 17, alignment 17/17:    8.1875        9.67188
Length 18, alignment 0/ 0:    8.46875        10.0156
Length 18, alignment 18/ 0:    7.03125        9.54688
Length 18, alignment 0/18:    6.46875        9.125
Length 18, alignment 18/18:    7.92188        9.35938
Length 19, alignment 0/ 0:    8.20312        10.1406
Length 19, alignment 19/ 0:    6.51562        9.90625
Length 19, alignment 0/19:    6.46875        9.07812
Length 19, alignment 19/19:    8.09375        9.26562
Length 20, alignment 0/ 0:    8.79688        10.4219
Length 20, alignment 20/ 0:    6.51562        9.54688
Length 20, alignment 0/20:    6.51562        9.39062
Length 20, alignment 20/20:    8.1875        9.625
Length 21, alignment 0/ 0:    8.375        9.26562
Length 21, alignment 21/ 0:    7.07812        9.03125
Length 21, alignment 0/21:    7.07812        9.07812
Length 21, alignment 21/21:    8.09375        9.67188
Length 22, alignment 0/ 0:    8.28125        9.6875
Length 22, alignment 22/ 0:    6.46875        9.07812
Length 22, alignment 0/22:    6.46875        9.90625
Length 22, alignment 22/22:    8.09375        9.67188
Length 23, alignment 0/ 0:    8.375        10.2344
Length 23, alignment 23/ 0:    7.34375        9.95312
Length 23, alignment 0/23:    6.46875        9.125
Length 23, alignment 23/23:    8.28125        9.26562
Length 24, alignment 0/ 0:    8.89062        9.73438
Length 24, alignment 24/ 0:    6.46875        9.59375
Length 24, alignment 0/24:    6.42188        9.07812
Length 24, alignment 24/24:    8.1875        9.26562
Length 25, alignment 0/ 0:    8.15625        10.4219
Length 25, alignment 25/ 0:    6.46875        10.2344
Length 25, alignment 0/25:    6.42188        9.5
Length 25, alignment 25/25:    8.23438        10.1406
Length 26, alignment 0/ 0:    8.5625        9.64062
Length 26, alignment 26/ 0:    6.42188        9.90625
Length 26, alignment 0/26:    6.46875        9.4375
Length 26, alignment 26/26:    8.14062        9.21875
Length 27, alignment 0/ 0:    9.25        9.82812
Length 27, alignment 27/ 0:    6.5625        9.59375
Length 27, alignment 0/27:    6.51562        9.07812
Length 27, alignment 27/27:    8.09375        9.625
Length 28, alignment 0/ 0:    8.5625        9.59375
Length 28, alignment 28/ 0:    6.89062        9.5
Length 28, alignment 0/28:    6.46875        9.53125
Length 28, alignment 28/28:    7.73438        9.71875
Length 29, alignment 0/ 0:    8.375        10.375
Length 29, alignment 29/ 0:    6.46875        9.90625
Length 29, alignment 0/29:    6.42188        9.95312
Length 29, alignment 29/29:    8.04688        9.54688
Length 30, alignment 0/ 0:    8.5625        9.78125
Length 30, alignment 30/ 0:    7.03125        9.125
Length 30, alignment 0/30:    6.46875        9.125
Length 30, alignment 30/30:    7.78125        9.59375
Length 31, alignment 0/ 0:    8.60938        9.78125
Length 31, alignment 31/ 0:    7.03125        9.54688
Length 31, alignment 0/31:    6.51562        9.125
Length 31, alignment 31/31:    8.23438        9.3125
Length 48, alignment 0/ 0:    10.8906        10.2969
Length 48, alignment 3/ 0:    9.48438        11.0312
Length 48, alignment 0/ 3:    8.84375        11.0312
Length 48, alignment 3/ 3:    8.65625        10.4688
Length 80, alignment 0/ 0:    16.8906        13.9219
Length 80, alignment 5/ 0:    14.1875        22.2969
Length 80, alignment 0/ 5:    21.1719        18.4375
Length 80, alignment 5/ 5:    17.5469        15.6875
Length 96, alignment 0/ 0:    12.1406        13.6719
Length 96, alignment 6/ 0:    12.0625        21.3594
Length 96, alignment 0/ 6:    14.6562        19.0781
Length 96, alignment 6/ 6:    12.1406        19.3594
Length 112, alignment 0/ 0:    12.8438        12.75
Length 112, alignment 7/ 0:    14.2812        18.7969
Length 112, alignment 0/ 7:    17.125        17.875
Length 112, alignment 7/ 7:    14        17.3594
Length 144, alignment 0/ 0:    15.0312        25.2812
Length 144, alignment 9/ 0:    15.3125        32.5781
Length 144, alignment 0/ 9:    16.75        30.7188
Length 144, alignment 9/ 9:    15.5938        30.7188
Length 160, alignment 0/ 0:    12.8438        23.9688
Length 160, alignment 10/ 0:    12.9844        30.1562
Length 160, alignment 0/10:    20.6562        32.5312
Length 160, alignment 10/10:    13.0312        34.3438
Length 176, alignment 0/ 0:    14.1094        23.7812
Length 176, alignment 11/ 0:    16.5781        29.3281
Length 176, alignment 0/11:    25.5625        31.3281
Length 176, alignment 11/11:    16.0625        30.4844
Length 192, alignment 0/ 0:    14.8906        22.3438
Length 192, alignment 12/ 0:    17.7344        30.4375
Length 192, alignment 0/12:    25.4688        29.8281
Length 192, alignment 12/12:    14.8906        29.4219
Length 208, alignment 0/ 0:    15.875        21.4062
Length 208, alignment 13/ 0:    19.1719        29.9688
Length 208, alignment 0/13:    20.25        29.6875
Length 208, alignment 13/13:    17.1719        26.9844
Length 224, alignment 0/ 0:    14.9844        20.6562
Length 224, alignment 14/ 0:    16.0625        28.625
Length 224, alignment 0/14:    33.1875        27.375
Length 224, alignment 14/14:    13.7344        29.375
Length 240, alignment 0/ 0:    15.7344        19.4062
Length 240, alignment 15/ 0:    21.5        29.75
Length 240, alignment 0/15:    40.0625        27.6406
Length 240, alignment 15/15:    17.7812        24.25
Length 272, alignment 0/ 0:    17.2656        19.4531
Length 272, alignment 17/ 0:    20.2031        23.0938
Length 272, alignment 0/17:    20.2969        30.5781
Length 272, alignment 17/17:    19.0781        28.4844
Length 288, alignment 0/ 0:    14.25        24.5781
Length 288, alignment 18/ 0:    19.6875        31.1406
Length 288, alignment 0/18:    22.1562        28.5312
Length 288, alignment 18/18:    17.2188        26.4375
Length 304, alignment 0/ 0:    16.8906        23.7812
Length 304, alignment 19/ 0:    19.2656        30.3438
Length 304, alignment 0/19:    43.375        28.4844
Length 304, alignment 19/19:    20.5781        25.7812
Length 320, alignment 0/ 0:    18        23.5938
Length 320, alignment 20/ 0:    19.5469        30.0312
Length 320, alignment 0/20:    43.6562        27.2812
Length 320, alignment 20/20:    20.3906        25.0469
Length 336, alignment 0/ 0:    19.3594        22.5781
Length 336, alignment 21/ 0:    21.0312        28.8594
Length 336, alignment 0/21:    50.9688        26.3438
Length 336, alignment 21/21:    31.1406        24.9531
Length 352, alignment 0/ 0:    15.5469        21.9219
Length 352, alignment 22/ 0:    20.2031        29.0312
Length 352, alignment 0/22:    51.5781        27
Length 352, alignment 22/22:    20.5312        25.125
Length 368, alignment 0/ 0:    18.2969        21.5469
Length 368, alignment 23/ 0:    20.7188        25.375
Length 368, alignment 0/23:    56.3281        24.5781
Length 368, alignment 23/23:    24.0625        22.0625
Length 384, alignment 0/ 0:    18.0156        21.3594
Length 384, alignment 24/ 0:    21.8281        25.875
Length 384, alignment 0/24:    49.8438        24.25
Length 384, alignment 24/24:    24.1094        22.2031
Length 400, alignment 0/ 0:    20.3281        20.1094
Length 400, alignment 25/ 0:    22.4219        23.8281
Length 400, alignment 0/25:    51.0938        32.25
Length 400, alignment 25/25:    30.4844        32.7656
Length 416, alignment 0/ 0:    16.7969        25.875
Length 416, alignment 26/ 0:    21.7812        32.3438
Length 416, alignment 0/26:    52.2188        30.5312
Length 416, alignment 26/26:    24.5312        32.6719
Length 432, alignment 0/ 0:    19.5938        25.5938
Length 432, alignment 27/ 0:    22.7656        34.2031
Length 432, alignment 0/27:    67.5312        30.2031
Length 432, alignment 27/27:    26.8594        29.6406
Length 448, alignment 0/ 0:    18.625        24.5312
Length 448, alignment 28/ 0:    23.125        31.5156
Length 448, alignment 0/28:    66.6094        29.0781
Length 448, alignment 28/28:    27.0938        27.6562
Length 464, alignment 0/ 0:    21.0469        24.3438
Length 464, alignment 29/ 0:    22.0625        31.2188
Length 464, alignment 0/29:    63.7656        28.5781
Length 464, alignment 29/29:    36.5781        29.5
Length 480, alignment 0/ 0:    17.6875        24.1094
Length 480, alignment 30/ 0:    21.3125        31.1875
Length 480, alignment 0/30:    68.1875        28.2969
Length 480, alignment 30/30:    27.875        28.8594
Length 496, alignment 0/ 0:    21.2812        24.0625
Length 496, alignment 31/ 0:    22.1562        28.0625
Length 496, alignment 0/31:    72.2344        26.625
Length 496, alignment 31/31:    31.0469        27.6875
Length 4096, alignment 0/ 0:    123.391        154.516

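__memcpy_rte_avx is faster in most cases. The main exceptions are
lengths 2 to 15, where __memcpy_avx_unaligned is usually slightly
faster, and most of the 0/N alignment cases from 224 to 512 bytes.
For the 0/N cases from 2048 to 16384 bytes, __memcpy_rte_avx is
roughly 4x to 7x faster.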
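
For anyone who wants to try this kind of length/alignment sweep outside
the glibc tree, here is a minimal sketch. It is not the glibc
bench-memcpy harness. The names time_copy, bytewise_copy and print_sweep
are made up for illustration. It times with clock_gettime, so the
absolute numbers are not comparable with the table above. bytewise_copy
is only a placeholder for the second implementation, for example a
statically linked rte_memcpy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical common signature for the implementations under test. */
typedef void *(*memcpy_fn)(void *dst, const void *src, size_t n);

/* Placeholder second implementation: a plain byte-by-byte copy. */
void *bytewise_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Monotonic clock in nanoseconds (assumption: a stand-in for a cycle counter). */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Round p up to the next 64-byte boundary, then add off bytes. */
unsigned char *place(unsigned char *p, size_t off)
{
    uintptr_t a = ((uintptr_t)p + 63) & ~(uintptr_t)63;
    return (unsigned char *)a + off;
}

/* Average nanoseconds per copy of len bytes, with src and dst placed
   src_off and dst_off bytes past a 64-byte boundary.  Returns a negative
   value if allocation fails or the copy produced wrong data. */
double time_copy(memcpy_fn fn, size_t src_off, size_t dst_off,
                 size_t len, unsigned iters)
{
    assert(iters > 0);
    unsigned char *raw_src = malloc(len + src_off + 64);
    unsigned char *raw_dst = malloc(len + dst_off + 64);
    if (raw_src == NULL || raw_dst == NULL) {
        free(raw_src);
        free(raw_dst);
        return -1.0;
    }
    unsigned char *src = place(raw_src, src_off);
    unsigned char *dst = place(raw_dst, dst_off);
    memset(src, 0x5a, len);

    /* Calling through a volatile pointer keeps the compiler from
       dropping or merging the repeated copies. */
    memcpy_fn volatile call = fn;
    call(dst, src, len);                 /* warm-up run, not timed */
    uint64_t t0 = now_ns();
    for (unsigned i = 0; i < iters; ++i)
        call(dst, src, len);
    uint64_t t1 = now_ns();

    int ok = memcmp(dst, src, len) == 0;
    free(raw_src);
    free(raw_dst);
    return ok ? (double)(t1 - t0) / iters : -1.0;
}

/* Sweep lengths 1 << i with offsets 0/0, i/0, 0/i and i/i, the same
   pattern as the first part of the table above. */
void print_sweep(FILE *out)
{
    fprintf(out, "%-34s %12s %14s\n", "", "memcpy", "bytewise_copy");
    for (size_t i = 0; i <= 12; ++i) {
        size_t len = (size_t)1 << i;
        const size_t offs[4][2] = { {0, 0}, {i, 0}, {0, i}, {i, i} };
        for (int k = 0; k < 4; ++k)
            fprintf(out, "Length %zu, alignment %2zu/%2zu: %12.2f %14.2f\n",
                    len, offs[k][0], offs[k][1],
                    time_copy(memcpy, offs[k][0], offs[k][1], len, 1000),
                    time_copy(bytewise_copy, offs[k][0], offs[k][1], len, 1000));
    }
}
```

Calling print_sweep(stdout) from main prints one row per
length/alignment pair. Replacing bytewise_copy with another function of
the same signature compares that implementation against memcpy instead.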

-- 
H.J.

