This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: glibc benchmarks' results can be unreliable for short runtimes (on Aarch64)


Wilco,

On 6/21/2019 2:01 PM, Wilco Dijkstra wrote:
> Hi Anton,
>
> > Recently I was working on an optimized implementation of memcpy/memmove for TX2. While running internal microbenchmarks I noticed that for the "fast" benchmarks (~10ms runtime) the results vary quite significantly across runs (5%-20%). It is possible to find two runs that show my implementation actually significantly worsened the performance. There are also (quite common) cases where the "baseline" implementation gets worse and the "tested" implementation gets better (or vice versa) across runs.
>
> Yes, this is certainly possible for any short-running benchmark, which is why I recently increased the minimum iteration count 128 times. I ran it on a fixed-frequency server and got quite stable results. However, if your CPU does frequency scaling then 10ms is likely too short for consistent results.

I think that we can assume frequency throttling to be a general rule these days.

> > The first solution to this that comes to mind is to increase the runtime for the "fast" benchmarks. If I increase the bench-memcpy runtime 32x (the actual runtime for TX2 would be ~2s), the results for a particular implementation are always within a 5% range. The effect where one benchmark gains and another loses across different runs, while not as significant, still remains. So, are there any reasons not to bump up the runtime of the "fast" benchmarks to 1s-2s?
>
> 1 second per benchmark sounds reasonable. However, if you just increase INNER_LOOP_ITERS a lot, then various benchmarks will become way too slow, so you may need to move them to INNER_LOOP_ITERS_MEDIUM or something similar. If you use "time $(run-bench)" in the benchtests makefile, it prints out the time for each benchmark.

OK, I understand this, thanks. I will use INNER_LOOP_ITERS_MEDIUM then.
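
For illustration only, here is a minimal sketch of how the run-to-run spread discussed above could be measured outside the benchtests framework. This is not the glibc benchtests code: the helper names time_memcpy_ns and spread_percent, the 4096-byte buffers, and the use of clock_gettime with a GCC/Clang compiler barrier are all assumptions made for this example.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of benchtests): elapsed wall-clock
   nanoseconds for `iters` calls of memcpy with `len` bytes.  */
double time_memcpy_ns(size_t len, size_t iters)
{
    static unsigned char src[4096], dst[4096];
    struct timespec t0, t1;

    assert(len <= sizeof src);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < iters; i++) {
        memcpy(dst, src, len);
        /* Compiler barrier (GCC/Clang syntax) so the copy is not
           optimized away.  */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec) * 1e9
           + (double)(t1.tv_nsec - t0.tv_nsec);
}

/* Hypothetical helper: spread of `n` timings as a percentage of the
   fastest one, i.e. (max - min) / min * 100.  */
double spread_percent(const double *t, size_t n)
{
    assert(n > 0);
    double lo = t[0], hi = t[0];
    for (size_t i = 1; i < n; i++) {
        if (t[i] < lo) lo = t[i];
        if (t[i] > hi) hi = t[i];
    }
    assert(lo > 0.0);
    return (hi - lo) / lo * 100.0;
}
```

One way to use this would be to call time_memcpy_ns several times with the same iteration count, pass the timings to spread_percent, and then repeat with an iteration count 32 times larger to see whether the spread narrows on a machine with frequency scaling.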

--
  Thanks,
  Anton

