glibc benchmarks' results can be unreliable for short runtimes (on Aarch64)


Folks,

Recently I was working on an optimized implementation of memcpy/memmove
for TX2. While running internal microbenchmarks I noticed that for the
"fast" benchmarks (~10ms runtime) the results vary quite significantly
across runs (5%-20%). It is possible to find two runs that show my
implementation actually worsened the performance significantly. There
are also quite common cases where the "baseline" implementation gets
worse and the "tested" implementation gets better (or vice versa)
across runs.

The first solution to this that comes to mind is to increase the runtime
for the "fast" benchmarks. If I increase bench-memcpy runtime 32x (the
actual runtime for TX2 would be ~2s) the results for a particular
implementation always stay within a 5% range. The effect where one
benchmark gains and another loses across different runs, while not as
significant, still remains.
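
For illustration, a minimal standalone sketch along the lines below
(not the actual benchtests code; the buffer size, iteration counts and
the measurement loop are just assumptions for the example) reproduces
the effect: run it a few times and the "short" figure moves around
noticeably more from run to run than the 32x-longer one.

#include <stdio.h>
#include <string.h>
#include <time.h>

/* Standalone sketch only: time memcpy at a "short" iteration count,
   roughly comparable to a benchmark run of a few milliseconds, and at
   a 32x longer one.  All sizes and counts are illustrative.  */

#define BUF_SIZE    8192
#define SHORT_ITERS 20000L   /* on the order of a few ms total */
#define SCALE       32L      /* the factor from the experiment above */

static double
now_ns (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return ts.tv_sec * 1e9 + ts.tv_nsec;
}

static double
bench_ns_per_call (char *dst, const char *src, size_t size, long iters)
{
  double start = now_ns ();
  for (long i = 0; i < iters; i++)
    {
      memcpy (dst, src, size);
      /* Compiler barrier so the copies are not hoisted or dropped.  */
      __asm__ __volatile__ ("" : : "r" (dst) : "memory");
    }
  return (now_ns () - start) / (double) iters;
}

int
main (void)
{
  static char src[BUF_SIZE], dst[BUF_SIZE];
  memset (src, 0x5a, sizeof src);

  printf ("short run: %8.2f ns/call\n",
          bench_ns_per_call (dst, src, sizeof src, SHORT_ITERS));
  printf ("long run:  %8.2f ns/call\n",
          bench_ns_per_call (dst, src, sizeof src, SHORT_ITERS * SCALE));

  return dst[0] == 0x5a ? 0 : 1;
}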

So, are there any reasons not to bump up the runtime of the
"fast" benchmarks to 1s-2s?

--
  Thanks,
  Anton

