This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Benchmarking __libc_single_threaded


* Wilco Dijkstra:

> I benchmarked this on several AArch64 systems. On Cortex-A72 and Cortex-A53
> there is an 8-15% gain for the hidden variant; however, on modern cores there
> is practically no difference despite 6% more instructions for this benchmark.
> I got stable and repeatable results in all cases.
>
>> Basically, it demonstrates the performance overhead of passing a
>> std::shared_ptr down a somewhat arbitrarily nested call chain.  Only
>> single-threaded mode is benchmarked, the multi-threaded mode is quite
>> slow no matter what.
>
> Indeed, the single-threaded optimization easily makes a 3-4x difference.
> The extra performance gain from the hidden DSO symbol is tiny in comparison,
> even on older cores, and it only applies to DSOs. So there isn't a
> justification for the extra complexity of a per-DSO hidden symbol.
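
For readers who have not followed the earlier messages, here is a minimal
sketch of the kind of single-threaded fast path being discussed. It is not the
benchmark code and not the libstdc++ implementation. RefCount and
is_single_threaded are hypothetical names, and the fallback branch simply
assumes a multi-threaded process when <sys/single_threaded.h> is not available.

```cpp
#include <atomic>
#include <cassert>

// Use glibc's flag when the header exists (glibc 2.32 or later).
#if defined(__has_include)
# if __has_include(<sys/single_threaded.h>)
#  include <sys/single_threaded.h>
#  define HAVE_LIBC_SINGLE_THREADED 1
# endif
#endif

// Hypothetical helper: true only if the process is known to be single-threaded.
static inline bool is_single_threaded() {
#ifdef HAVE_LIBC_SINGLE_THREADED
  return __libc_single_threaded != 0;
#else
  return false;  // Assumption: without the flag, always take the atomic path.
#endif
}

// Hypothetical reference counter in the style of a shared_ptr control block.
struct RefCount {
  std::atomic<long> count{1};

  void acquire() {
    if (is_single_threaded()) {
      // Plain load and store: no atomic read-modify-write is needed.
      count.store(count.load(std::memory_order_relaxed) + 1,
                  std::memory_order_relaxed);
    } else {
      count.fetch_add(1, std::memory_order_relaxed);
    }
  }

  // Returns true when the last reference has been dropped.
  bool release() {
    assert(count.load(std::memory_order_relaxed) > 0);
    if (is_single_threaded()) {
      long remaining = count.load(std::memory_order_relaxed) - 1;
      count.store(remaining, std::memory_order_relaxed);
      return remaining == 0;
    }
    return count.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }

  long value() const { return count.load(std::memory_order_relaxed); }
};
```

Passing a reference-counted pointer by value through a nested call chain hits
acquire and release once per level. That is why skipping the atomic
read-modify-write matters much more than how the flag itself is addressed.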

Thanks for doing the additional benchmarking.

The global symbol approach also has the advantage that we can control
the placement of the variable (along with other rarely-written
variables) in libc.so, once we find a way to tell GCC that it shouldn't
generate code that needs a copy relocation.  With the hidden symbol, we
would have to pad the flag to the size of a cache line.
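
To illustrate the padding cost, here is a rough sketch of what a per-DSO
hidden flag might look like if it had to be isolated from neighbouring data.
The 64-byte cache line size and the names kCacheLine, PaddedFlag,
dso_single_threaded and flag_is_isolated are assumptions for this example and
not anything in glibc. The visibility attribute is the GCC/Clang spelling.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines; the real value is target-specific.
constexpr std::size_t kCacheLine = 64;

// alignas rounds sizeof up to the alignment, so the flag owns a whole line
// and cannot share it with frequently written data in the same DSO.
struct alignas(kCacheLine) PaddedFlag {
  char single_threaded = 1;
};

static_assert(sizeof(PaddedFlag) == kCacheLine,
              "flag should occupy exactly one cache line");
static_assert(alignof(PaddedFlag) == kCacheLine,
              "flag should start on a cache line boundary");

// Hidden visibility: each DSO gets its own copy, referenced without the GOT.
__attribute__((visibility("hidden"))) PaddedFlag dso_single_threaded;

// Hypothetical check that the object really starts on a line boundary.
static inline bool flag_is_isolated() {
  return reinterpret_cast<std::uintptr_t>(&dso_single_threaded) % kCacheLine == 0;
}
```

Every DSO that defines such a flag pays for a full cache line. A single
variable in libc.so can instead sit next to other rarely-written data.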

Thanks,
Florian

