[PATCH] x86: Optimize atomic_compare_and_exchange_[val|bool]_acq [BZ #28537]

Arjan van de Ven arjan@linux.intel.com
Wed Nov 3 19:22:37 GMT 2021


On 11/3/2021 10:55 AM, Oleh Derevenko wrote:
> Arjan,
> 
>> eh I am not sure I understand what you say since cmpxchg uses the exact same
>> cache protocol/etc to do its read...
> Well, if you are sure... I did not have that information.
> 
> One last question, if you permit.
> What, in your opinion, are the reasons they did it this way in the
> hardware implementation? This thing, I mean:
>> The full compare and swap will grab the cache line exclusive and cause excessive cache line bouncing.
> Why was this optimization not implemented in the CPU to begin with?


an instruction that needs the same cacheline for a read and maybe for exclusive access
will get it exclusive; in cpu microarchitecture there usually isn't some fast way (and we
all want locks to be fast, clearly) to do a two phase cache protocol on the same line

> 
> Well, I guess I can explain it myself. Because in the success cases
> these extra checks would add to execution time. And cmpxchg can be
> used for many purposes, not only for polling from four threads in
> parallel.


