This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v2 2/2] aarch64: Optimized memcpy and memmove for Kunpeng processor


Hi Wilco, sorry for the delay in replying; we ran a lot of tests after modifying memmove.

> In order to select the right memmove implementation,
> multiarch/memmove.c needs similar changes as multiarch/memcpy.c.

That's true, we missed this change and will submit it in the next patch.

> Also since the memmove entry sequence does both check for medium
> and large cases, the full overlap check should be done in both.
> Currently only sizes 96-512 benefit, not the move_long case:

Yes, we will add the full overlap check to the move_long case in the next patch.
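
For readers following the thread, the overlap check discussed above can be illustrated in C. This is only a minimal sketch of the general technique, not the aarch64 assembly from the patch; the names needs_backward_copy and sketch_memmove are hypothetical and exist only for this illustration. It assumes the usual trick of comparing the unsigned difference dst - src against the length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): returns 1 when dst lies
   inside [src, src + n), so a forward copy would overwrite source
   bytes before they are read.  The unsigned subtraction wraps to a
   large value when dst < src, so a single comparison also covers the
   case where dst is below src.  */
static int needs_backward_copy(const void *dst, const void *src, size_t n)
{
    return (uintptr_t)dst - (uintptr_t)src < n;
}

/* Byte-wise illustration only; a real implementation copies in large
   blocks and chooses between medium and long paths by size.  */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (!needs_backward_copy(dst, src, n)) {
        /* No harmful overlap: a plain forward (memcpy-style) copy is safe.  */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst overlaps the tail of src: copy from the end backwards.  */
        while (n--)
            d[n] = s[n];
    }
    return dst;
}
```

The point of the comparison is that every size class, not only the medium one, can take the forward memcpy path whenever it returns 0.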

What confuses us now is the following. We removed the dst_unaligned code from memcpy according to the previous comments, and our testing showed no performance impact in the memcpy cases. However, when memmove is called and takes the memcpy path, the unaligned cases are significantly slower than the aligned cases, according to the results of the first half of memmove-walk shown at the bottom. Do you think we should still remove the dst_unaligned code?

Our analysis is that the additional checks at the start of memmove may weaken the processor's ability to handle this case, which is why the dst_unaligned code makes a difference.

> Well it looks the dst_unaligned code (which deals with a specific
> issue on ThunderX2) ...

I remember you mentioned the specific issue on ThunderX2 before; could you tell us more about it?

Function: memmove
Variant: walk
                    __memmove_thunderx	__memmove_thunderx2	__memmove_falkor	__memmove_kunpeng2	__memmove_generic
========================================================================================================================
      length=128:        33.99 (-73.69%)	       18.65 (  4.67%)	       17.75 (  9.29%)	       19.21 (  1.80%)	       19.57	
      length=129:        35.41 (  2.08%)	       37.43 ( -3.51%)	       35.87 (  0.79%)	       34.71 (  4.01%)	       36.16	
      length=256:        45.55 (-37.95%)	       32.61 (  1.23%)	       35.59 ( -7.79%)	       32.95 (  0.20%)	       33.02	
      length=257:        66.36 (  4.20%)	       69.50 ( -0.33%)	       68.03 (  1.80%)	       68.53 (  1.08%)	       69.27	
      length=512:        82.77 (-34.10%)	       65.67 ( -6.41%)	       65.61 ( -6.30%)	       60.13 (  2.57%)	       61.72	
      length=513:       146.19 (  3.90%)	      132.98 ( 12.59%)	      132.28 ( 13.05%)	      151.50 (  0.41%)	      152.12	
     length=1024:       155.75 (-26.13%)	      142.74 (-15.60%)	      126.97 ( -2.83%)	      121.58 (  1.53%)	      123.48	
     length=1025:       289.15 (  4.72%)	      318.71 ( -5.02%)	      262.97 ( 13.35%)	      307.00 ( -1.16%)	      303.48	
     length=2048:       298.85 (-22.16%)	      233.98 (  4.35%)	      249.71 ( -2.08%)	      245.37 ( -0.30%)	      244.63	
     length=2049:       409.46 ( 14.62%)	      399.08 ( 16.78%)	      508.64 ( -6.07%)	      465.79 (  2.87%)	      479.54	
     length=4096:       543.10 (-11.30%)	      445.35 (  8.73%)	      491.40 ( -0.71%)	      435.61 ( 10.73%)	      487.95	
     length=4097:       680.95 ( 18.96%)	      593.99 ( 29.31%)	      990.52 (-17.89%)	      882.91 ( -5.08%)	      840.23	
     length=8192:      1047.46 ( -8.01%)	      867.03 ( 10.59%)	      977.80 ( -0.83%)	      850.57 ( 12.29%)	      969.74	
     length=8193:      1224.46 ( 21.97%)	      979.34 ( 37.59%)	     1981.71 (-26.29%)	     1714.96 ( -9.29%)	     1569.12	
    length=16384:      2055.73 ( -5.42%)	     1701.01 ( 12.77%)	     1944.38 (  0.29%)	     1683.51 ( 13.67%)	     1950.11	
    length=16385:      2314.62 ( 23.38%)	     1774.44 ( 41.26%)	     3967.45 (-31.34%)	     3385.52 (-12.07%)	     3020.82	
    length=32768:      5153.99 (-32.25%)	     3426.50 ( 12.08%)	     3875.16 (  0.56%)	     3338.91 ( 14.32%)	     3897.16	
    length=32769:      5343.41 (  9.64%)	     3375.50 ( 42.92%)	     7925.06 (-34.01%)	     6716.28 (-13.57%)	     5913.72	
    length=65536:     10361.70 (-35.90%)	     6768.32 ( 11.23%)	     7759.75 ( -1.78%)	     6658.73 ( 12.66%)	     7624.32	
    length=65537:     10284.00 ( 12.00%)	     6528.85 ( 44.13%)	    15844.40 (-35.58%)	    13437.90 (-14.98%)	    11686.80	
   length=131072:     20539.30 (-34.71%)	    13672.50 ( 10.33%)	    15567.10 ( -2.10%)	    13325.60 ( 12.60%)	    15247.50	
   length=131073:     20868.20 ( 10.97%)	    12807.80 ( 45.36%)	    31605.90 (-34.83%)	    26788.20 (-14.28%)	    23440.70	
   length=262144:     41304.50 (-35.25%)	    26883.30 ( 11.97%)	    31038.70 ( -1.63%)	    26533.40 ( 13.12%)	    30539.40	
   length=262145:     41157.90 ( 12.84%)	    25568.20 ( 45.85%)	    63229.00 (-33.90%)	    53525.00 (-13.35%)	    47220.50	
   length=524288:     81777.00 (-32.88%)	    54133.00 ( 12.04%)	    61853.30 ( -0.51%)	    52869.40 ( 14.09%)	    61542.20	
   length=524289:     81986.90 ( 14.71%)	    50562.00 ( 47.40%)	   126255.00 (-31.33%)	   105969.00 (-10.23%)	    96132.70	
  length=1048576:    163628.00 (-33.00%)	   107776.00 ( 12.00%)	   123819.00 ( -1.00%)	   105831.00 ( 14.00%)	   123170.00	
  length=1048577:    177503.00 ( 12.00%)	    98680.60 ( 51.09%)	   253068.00 (-26.00%)	   211155.00 ( -5.00%)	   201763.00	
  length=2097152:    336756.00 (-34.00%)	   224097.00 ( 11.00%)	   254575.00 ( -1.00%)	   219864.00 ( 13.00%)	   253124.00	
  length=2097153:    373590.00 (  9.00%)	   214822.00 ( 48.00%)	   506479.00 (-23.00%)	   426299.00 ( -3.00%)	   414899.00	
  length=4194304:    662606.00 (-35.00%)	   437195.00 ( 11.00%)	   497288.00 ( -2.00%)	   427614.00 ( 13.00%)	   491729.00	
  length=4194305:    697910.00 (  9.00%)	   417656.00 ( 45.00%)	  1020670.00 (-32.62%)	   856051.00 (-12.00%)	   769599.00	
  length=8388608:   1307990.00 (-34.88%)	   852030.00 ( 12.00%)	   983092.00 ( -2.00%)	   834918.00 ( 13.00%)	   969712.00	
  length=8388609:   1416420.00 (  8.70%)	   821262.00 ( 47.06%)	  2030660.00 (-30.89%)	  1708360.00 (-10.11%)	  1551450.00	
 length=16777216:   2586380.00 (-33.02%)	  1702120.00 ( 12.46%)	  1970000.00 ( -1.32%)	  1676900.00 ( 13.76%)	  1944360.00	
 length=16777217:   2796060.00 ( 13.29%)	  1627720.00 ( 49.52%)	  4079100.00 (-26.51%)	  3410640.00 ( -5.77%)	  3224440.00	
 length=33554432:   5241680.00 (-33.96%)	  3488860.00 ( 10.84%)	  4890730.00 (-24.99%)	  3474630.00 ( 11.20%)	  3912900.00	
 length=33554433:   5666550.00 ( 14.71%)	  3357520.00 ( 49.46%)	  8039630.00 (-21.01%)	  6824230.00 ( -2.72%)	  6643780.00
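
A note on reading the table: the percentage in parentheses appears to be the improvement of each variant relative to the __memmove_generic column, with negative values meaning slower than generic. The sketch below shows that assumed formula; the function name improvement_pct is hypothetical, and the formula is inferred from the numbers rather than taken from the benchmark scripts.

```c
/* Assumed formula (inferred from the table, not from the benchmark
   tooling): relative improvement over the generic baseline, in
   percent.  Positive means the variant took less time than generic.  */
static double improvement_pct(double variant_time, double generic_time)
{
    return (generic_time - variant_time) / generic_time * 100.0;
}
```

For the length=128 row, improvement_pct(33.99, 19.57) gives roughly -73.7, in line with the value listed for __memmove_thunderx; small differences are expected because the printed timings are rounded.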

