This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH] Improve string benchtests
>>> Same as before for wcpncpy: instead of reimplementing the generic implementation
>>> in benchtests we can just include it. And it also leads to a possible
>>> optimization of the generic implementation for wcpncpy.
>> The point is to enable useful comparisons of string implementations. If we include
>> the generic implementation then we just compare the generic implementation with
>> itself in many cases. And that isn't useful. If I change a generic implementation I
>> want to see the difference that makes in the benchmark comparison rather than
>> showing no difference.
> My understanding is we have the generic implementation as the baseline
> where arch-specific optimization might be applied and the idea of the
> comparison is to check against it. I see no point in using a different
> implementation on benchtests, it should compare against exactly what
> glibc is currently providing.
I have to disagree: we cannot do an exact comparison unless we build the generic
string functions as part of GLIBC and call them via the PLT. Including source
files with lots of #define magic is never going to be equivalent.
The goal here is not an accurate comparison with generic string functions but
to enable a realistic comparison with an efficient baseline - the existing
byte-oriented implementations provide a baseline but are too slow to be useful.
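
To make the distinction concrete, here is a minimal sketch of the two kinds of baseline for wcpncpy. The names simple_wcpncpy, baseline_wcpncpy and check_agree are hypothetical, and composing the faster variant from wcsnlen, wmemcpy and wmemset is an assumption made for illustration, not the code in the patch:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical slow reference: copies one wide character per iteration.  */
static wchar_t *
simple_wcpncpy (wchar_t *dst, const wchar_t *src, size_t n)
{
  while (n > 0 && *src != L'\0')
    {
      *dst++ = *src++;
      n--;
    }
  wchar_t *end = dst;		/* First padding position, or dst + n.  */
  while (n-- > 0)
    *dst++ = L'\0';
  return end;
}

/* Hypothetical faster baseline: reuse wcsnlen, wmemcpy and wmemset, which
   may themselves be optimized by the C library.  */
static wchar_t *
baseline_wcpncpy (wchar_t *dst, const wchar_t *src, size_t n)
{
  size_t len = wcsnlen (src, n);
  wmemcpy (dst, src, len);
  wmemset (dst + len, L'\0', n - len);
  return dst + len;
}

/* Check that both variants write the same N wide characters and return
   the same offset.  N must not exceed the local buffer size.  */
static void
check_agree (const wchar_t *src, size_t n)
{
  wchar_t a[64], b[64];
  assert (n <= 64);
  wchar_t *ra = simple_wcpncpy (a, src, n);
  wchar_t *rb = baseline_wcpncpy (b, src, n);
  assert (ra - a == rb - b);
  assert (wmemcmp (a, b, n) == 0);
}
```

Calling check_agree (L"abc", 8) exercises the padding path and check_agree (L"abcdef", 3) the truncating path. Both variants have the same behaviour, so only their speed differs when they appear side by side in a benchmark.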
> If you want to check if your changes improve the generic, you can
> compare against multiple glibc builds.
That doesn't work so well given it takes a long time to rebuild GLIBC and
benchmarks. For all benchmarking I do, I always create a direct comparison of
old vs new in a single run so it shows the differences and can be run repeatedly
to confirm. The string bench is set up to do this already, so why remove this?
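
A rough sketch of what I mean by a direct old-vs-new comparison in a single run. This is not the benchtests harness: old_strlen, new_strlen, the impls table and time_impl are hypothetical names, strlen is only a stand-in for whichever function is being changed, and the timing loop is deliberately simplistic:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

typedef size_t (*strlen_fn) (const char *);

/* Hypothetical "old" variant: one byte per iteration.  */
static size_t
old_strlen (const char *s)
{
  const char *p = s;
  while (*p != '\0')
    p++;
  return (size_t) (p - s);
}

/* Hypothetical "new" variant: here simply the C library's strlen.  */
static size_t
new_strlen (const char *s)
{
  return strlen (s);
}

/* Every variant listed here is timed in the same run.  */
struct impl { const char *name; strlen_fn fn; };
static const struct impl impls[] = {
  { "old_strlen", old_strlen },
  { "new_strlen", new_strlen },
};

/* Time ITERS calls of FN on S and return the elapsed nanoseconds.  */
static double
time_impl (strlen_fn fn, const char *s, size_t iters)
{
  struct timespec t0, t1;
  volatile size_t sink = 0;	/* Keeps the calls from being optimized out.  */
  clock_gettime (CLOCK_MONOTONIC, &t0);
  for (size_t i = 0; i < iters; i++)
    sink += fn (s);
  clock_gettime (CLOCK_MONOTONIC, &t1);
  (void) sink;
  return (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
}

/* Verify each variant against strlen, then print one timing line each.  */
static void
run_comparison (const char *s, size_t iters)
{
  for (size_t i = 0; i < sizeof impls / sizeof impls[0]; i++)
    {
      assert (impls[i].fn (s) == strlen (s));
      printf ("%s\t%.0f ns\n", impls[i].name,
	      time_impl (impls[i].fn, s, iters));
    }
}
```

A call such as run_comparison ("some test string", 1000000) prints both timings next to each other, and rerunning it only requires rebuilding this one file rather than all of GLIBC.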
>> Maybe the name generic_xxx is confusing? It's meant to be the baseline,
>> something which you should beat in all cases with the actual implementation.
> My understanding is the baseline should be the generic implementation which
> is selected if the architecture does not provide an optimized one.
That means you never compare the generic implementation against a baseline.
Given that is what we do today, I don't see why we should stop doing that.