[PATCH] x86: Optimize atomic_compare_and_exchange_[val|bool]_acq [BZ #28537]

H.J. Lu hjl.tools@gmail.com
Wed Nov 3 15:04:15 GMT 2021


From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap will grab the cache line exclusively and cause
excessive cache line bouncing.  Check the current memory value first and
return immediately, without issuing the atomic operation, if the
comparison would fail anyway; this reduces cache line bouncing on
contended locks.

This fixes BZ #28537.
---
 sysdeps/x86/atomic-machine.h | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86/atomic-machine.h b/sysdeps/x86/atomic-machine.h
index 2692d94a92..92c7cf58b7 100644
--- a/sysdeps/x86/atomic-machine.h
+++ b/sysdeps/x86/atomic-machine.h
@@ -73,9 +73,19 @@ typedef uintmax_t uatomic_max_t;
 #define ATOMIC_EXCHANGE_USES_CAS	0
 
 #define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \
-  __sync_val_compare_and_swap (mem, oldval, newval)
+  ({ __typeof (*(mem)) oldmem = *(mem), ret;				\
+     ret = (oldmem == (oldval)						\
+	    ? __sync_val_compare_and_swap (mem, oldval, newval)		\
+	    : oldmem);							\
+     ret; })
 #define atomic_compare_and_exchange_bool_acq(mem, newval, oldval) \
-  (! __sync_bool_compare_and_swap (mem, oldval, newval))
+  ({ __typeof (*(mem)) old = *(mem);					\
+     int ret;								\
+     if (old != (oldval))						\
+       ret = 1;								\
+     else								\
+       ret = !__sync_bool_compare_and_swap (mem, oldval, newval);	\
+     ret; })
 
 
 #define __arch_c_compare_and_exchange_val_8_acq(mem, newval, oldval) \
-- 
2.33.1


