This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH v2] x86-64: Optimize strcmp/wcscmp with AVX2
* Alexander Monakov:
> this does not. The whole point was that frequency behavior means the
> slowdown on programs making *occasional* calls to strcmp will not be
> captured by microbenchmarks. What good is saving dozens of cycles on
> strcmp calls if the remaining program is slowed down by 5%?
>
> I was missing that AVX frequency limits kick in only if "heavy" operations
> are used -- on recent generations. I'm not sure that's true for older, e.g.
> Haswell, generations. Intel's white paper explaining Haswell AVX clocks
> makes no distinction of "light" vs. "heavy" operations:
>
> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/performance-xeon-e5-v3-advanced-vector-extensions-paper.pdf
This should be easy to measure. Aren't there perf counters for that,
namely CORE_POWER.LVL0_TURBO_LICENSE, CORE_POWER.LVL1_TURBO_LICENSE,
and CORE_POWER.LVL2_TURBO_LICENSE?
Run the benchmark in parallel with itself, and then with other compute
loads, and see which of the counters increase.
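
A rough sketch of how that measurement could be scripted, in case it helps. It assumes Linux perf accepts the three events under the lowercase names below (check with `perf list`; they may not be available on every microarchitecture or perf build), and that `perf stat -x,` prints the count in the first CSV field and the event name in the third. The helper names (`parse_perf_csv`, `license_shares`, `measure`) are made up for illustration.

```python
import subprocess
import sys

# Event names as spelled in the message above, lowercased. Whether a given
# CPU and perf build expose them is an assumption; verify with `perf list`.
EVENTS = [
    "core_power.lvl0_turbo_license",
    "core_power.lvl1_turbo_license",
    "core_power.lvl2_turbo_license",
]


def parse_perf_csv(text):
    """Map event name -> count from `perf stat -x,` output.

    Lines whose value is not a plain integer (for example
    "<not supported>") are skipped.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split(",")
        if len(fields) < 3:
            continue
        value, event = fields[0], fields[2].lower()
        if event in EVENTS and value.isdigit():
            counts[event] = int(value)
    return counts


def license_shares(counts):
    """Return each counter's share of the summed license-level counts."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {event: n / total for event, n in counts.items()}


def measure(command):
    """Run `command` (a list of argv strings) under perf stat."""
    perf = ["perf", "stat", "-x,", "-e", ",".join(EVENTS), "--"] + command
    # perf stat writes its counter report to stderr by default.
    result = subprocess.run(perf, stderr=subprocess.PIPE, text=True)
    return license_shares(parse_perf_csv(result.stderr))


if __name__ == "__main__":
    # Usage: python3 license_shares.py ./benchmark args...
    for event, share in measure(sys.argv[1:]).items():
        print(f"{event}: {share:.1%}")
```

Running it once on the benchmark alone, then with a second copy or another compute load running alongside, and comparing the shares would show whether the LVL1 or LVL2 counters move.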