Proposal for CPU dispatching in libc
Agner Fog
agner@agner.org
Thu Jul 2 10:51:00 GMT 2009
Looks great! CPU dispatching in the loader is a new invention, and much
more efficient than what is done in Intel's compiler and libraries. I
hope you didn't patent this invention :-)
The new @gnu_indirect_function attribute needs to be documented for gas.
Is it also available in gcc as a function attribute?
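To make the mechanism concrete, here is a minimal sketch in portable C of
what an indirect function amounts to: a resolver picks one implementation
the first time the function is used, and every later call goes straight
to that choice. This only emulates in user code what the loader does for
an indirect-function symbol; it does not use @gnu_indirect_function
itself. The names strcmp_baseline, strcmp_fast, cpu_has_fast_path and
my_strcmp are hypothetical, and both "versions" simply forward to the
library strcmp so that the example stays portable.

```c
#include <assert.h>
#include <string.h>

typedef int (*strcmp_fn)(const char *, const char *);

/* Hypothetical stand-ins for two tuned versions of the same function.
   Both forward to the library strcmp to keep the sketch portable. */
static int strcmp_baseline(const char *a, const char *b) { return strcmp(a, b); }
static int strcmp_fast(const char *a, const char *b) { return strcmp(a, b); }

/* Hypothetical CPU probe. A real one would query CPUID; this stub
   always reports that the fast path is unavailable. */
static int cpu_has_fast_path(void) { return 0; }

/* Counts resolver invocations so the "resolve once" behavior is visible. */
static int resolver_calls = 0;

/* The resolver: returns the address of the version to use. */
static strcmp_fn resolve_strcmp(void) {
    resolver_calls++;
    return cpu_has_fast_path() ? strcmp_fast : strcmp_baseline;
}

static int strcmp_first_call(const char *a, const char *b);

/* The dispatch slot initially points at a stub that runs the resolver. */
static strcmp_fn strcmp_slot = strcmp_first_call;

static int strcmp_first_call(const char *a, const char *b) {
    strcmp_slot = resolve_strcmp();           /* patch the slot once */
    assert(strcmp_slot != strcmp_first_call); /* resolver must pick a real version */
    return strcmp_slot(a, b);
}

/* Public entry point: one indirect call through the slot. */
int my_strcmp(const char *a, const char *b) { return strcmp_slot(a, b); }
```

Calling my_strcmp any number of times runs resolve_strcmp only once. The
advantage of doing this in the loader rather than in user code is that
the choice is made when the symbol is resolved, so no first-call stub is
needed.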
Is this method testable? I looked at the strcmp implementation. If we
want to test all versions of the function on the same computer, then we
need access to the individual entries. I see entries named __strcmp_sse2
and __strcmp_sse42. We need a standard for how to name the different
versions. Both Intel and AMD have a history of using weird, confusing
and misleading names for their instruction sets. We should be careful
about simply adopting whatever weird name Intel or AMD may come up with.
Right now we can foresee an issue with the FMA instruction set, which
differs between Intel and AMD.
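The kind of test I have in mind could be sketched as follows, assuming
each version is reachable under its own name. The table of versions is
walked and every entry is checked against the library strcmp on the same
inputs, regardless of which version the dispatcher would pick on this
machine. The names strcmp_variant_a, strcmp_variant_b and
check_all_variants are hypothetical; they are plain C stand-ins, not the
actual __strcmp_sse2 and __strcmp_sse42 entries.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*strcmp_fn)(const char *, const char *);

/* Hypothetical version A: pointer-walking loop. */
static int strcmp_variant_a(const char *a, const char *b) {
    while (*a && *a == *b) { a++; b++; }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Hypothetical version B: index-based loop returning -1, 0 or 1. */
static int strcmp_variant_b(const char *a, const char *b) {
    for (size_t i = 0;; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];
        if (ca != cb) return ca < cb ? -1 : 1;
        if (ca == 0) return 0;
    }
}

/* Every version is listed by name so that a test can reach all of them. */
struct variant { const char *name; strcmp_fn fn; };
static const struct variant variants[] = {
    { "strcmp_variant_a", strcmp_variant_a },
    { "strcmp_variant_b", strcmp_variant_b },
};

/* Only the sign of a strcmp result is specified, so compare signs. */
static int sign(int x) { return (x > 0) - (x < 0); }

/* Runs every version on every input pair and returns the number of
   results whose sign disagrees with the library strcmp. */
int check_all_variants(void) {
    static const char *inputs[][2] = {
        { "", "" }, { "abc", "abc" }, { "abc", "abd" },
        { "abcd", "abc" }, { "", "a" }, { "\xff", "a" },
    };
    int failures = 0;
    for (size_t v = 0; v < sizeof variants / sizeof variants[0]; v++) {
        for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; i++) {
            int expected = sign(strcmp(inputs[i][0], inputs[i][1]));
            int got = sign(variants[v].fn(inputs[i][0], inputs[i][1]));
            if (got != expected) {
                fprintf(stderr, "%s: mismatch on input pair %zu\n",
                        variants[v].name, i);
                failures++;
            }
        }
    }
    return failures;
}
```

A real test would also have to skip versions that need instructions the
test machine does not support, which is one more reason to agree on the
names.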
I see a section named .text.sse4.2, but no .text.sse2. Using different
section names for different instruction set versions is a good idea
because it improves code cache locality: the versions that are actually
used on a given CPU end up close together. But this will work only if we
have a standard for how to name these sections.
Is it possible to build a slim version of libc with certain instruction
set versions excluded? This could be useful for embedded applications.
Petr Baudis wrote:
> This work is already being done in glibc git tree by Ulrich Drepper
> and H.J. Lu; a new ELF symbol type STT_GNU_IFUNC is used; if symbol
> of this type is called, the function returns address of the actual
> function that should be called on this and subsequent calls.
>
> Currently, SSE4.2-optimized strcmp() is already committed.