This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v2] x86-64: Optimize strcmp/wcscmp with AVX2


* Alexander Monakov:

> this does not. The whole point was that frequency behavior means the
> slowdown on programs making *occasional* calls to strcmp will not be
> captured by microbenchmarks. What good is saving dozens of cycles on
> strcmp calls if the remaining program is slowed down by 5%?
>
> I was missing that AVX frequency limits kick in only if "heavy" operations
> are used -- on recent generations. I'm not sure that's true for older, e.g.
> Haswell, generations. Intel's white paper explaining Haswell AVX clocks
> makes no distinction of "light" vs. "heavy" operations:
>
> https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/performance-xeon-e5-v3-advanced-vector-extensions-paper.pdf

This should be easy to measure.  Aren't there perf counters for that,
namely CORE_POWER.LVL0_TURBO_LICENSE, CORE_POWER.LVL1_TURBO_LICENSE,
and CORE_POWER.LVL2_TURBO_LICENSE?

Run the benchmark in parallel with itself, and then alongside other
compute loads, and see which of the counters increase.
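
To make the comparison concrete, here is a minimal sketch of how the
counter readings could be summarized.  It assumes the three events are
exposed by perf under lower-case names on the machine in question (that
may not hold on older parts such as Haswell) and that perf was run with
CSV output (perf stat -x,).  The helper name license_shares is made up
for illustration.

```python
import csv
import io

# Event names as given in the message, in perf's lower-case spelling.
# Whether they exist on a given CPU is an assumption; unsupported
# events are skipped below rather than treated as zero.
LICENSE_EVENTS = (
    "core_power.lvl0_turbo_license",
    "core_power.lvl1_turbo_license",
    "core_power.lvl2_turbo_license",
)


def license_shares(perf_csv):
    """Return the fraction of counted cycles spent in each license level.

    perf_csv is the text written by `perf stat -x,`, where each line
    starts with: count, unit, event name, ...
    """
    counts = {}
    for row in csv.reader(io.StringIO(perf_csv)):
        if len(row) < 3:
            continue  # blank lines and '#' comment lines
        value = row[0].strip()
        # Drop event modifiers such as ":u" before matching the name.
        event = row[2].strip().lower().split(":")[0]
        # perf prints "<not supported>" or "<not counted>" instead of a
        # number when an event cannot be read; ignore those rows.
        if event in LICENSE_EVENTS and value.isdigit():
            counts[event] = counts.get(event, 0) + int(value)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {event: count / total for event, count in counts.items()}
```

The input would come from something like

    perf stat -x, -o counts.csv -e core_power.lvl0_turbo_license,core_power.lvl1_turbo_license,core_power.lvl2_turbo_license -- ./bench

where ./bench stands in for whatever benchmark is being run.  Comparing
the shares from a solo run with those from a run next to other compute
loads would show whether the higher license levels are ever entered.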

