Proposal for CPU dispatching in libc

Agner Fog agner@agner.org
Thu Jul 2 10:51:00 GMT 2009


Looks great! CPU dispatching in the loader is a new invention, and much
more efficient than what Intel's compiler and libraries do. I hope you
didn't patent it :-)

The new @gnu_indirect_function symbol type needs to be documented for
gas. Is it also available in gcc as a function attribute?

Is this method testable? I looked at the strcmp implementation. If we 
want to test all versions of the function on the same computer, then we 
need access to the individual entries. I see entries named __strcmp_sse2 
and __strcmp_sse42. We need a standard for how to name the different 
versions. Both Intel and AMD have a history of giving their instruction
sets weird, confusing and misleading names. We should be careful about
simply adopting whatever name Intel or AMD come up with. Right now we
can foresee an issue with the FMA instruction set, which is different
for Intel and AMD.
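
To make the testing question concrete, here is a minimal sketch of how
every version could be exercised on one machine if the individual
entries are reachable by name. The names my_strcmp_bytewise,
my_strcmp_unrolled, strcmp_impls and count_failing_impls are
hypothetical stand-ins, not glibc symbols, and both variants are plain
C rather than real SSE2/SSE4.2 code.

```c
#include <assert.h>   /* for the check shown in the usage note */
#include <stddef.h>
#include <string.h>

typedef int (*strcmp_fn)(const char *, const char *);

/* Hypothetical variant 1: compare one byte per iteration. */
static int my_strcmp_bytewise(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Hypothetical variant 2: same result, two bytes per iteration. */
static int my_strcmp_unrolled(const char *a, const char *b)
{
    for (;;) {
        if (a[0] != b[0] || a[0] == '\0')
            return (unsigned char)a[0] - (unsigned char)b[0];
        if (a[1] != b[1] || a[1] == '\0')
            return (unsigned char)a[1] - (unsigned char)b[1];
        a += 2;
        b += 2;
    }
}

/* Every version is listed under a predictable name. */
struct strcmp_impl {
    const char *name;
    strcmp_fn fn;
};

static const struct strcmp_impl strcmp_impls[] = {
    { "my_strcmp_bytewise", my_strcmp_bytewise },
    { "my_strcmp_unrolled", my_strcmp_unrolled },
};

static int sign(int x)
{
    return (x > 0) - (x < 0);
}

/* Count the versions that disagree in sign with the library strcmp. */
int count_failing_impls(void)
{
    static const char *const samples[][2] = {
        { "", "" }, { "a", "" }, { "", "a" }, { "abc", "abc" },
        { "abc", "abd" }, { "abcd", "abc" }, { "\xff", "a" },
    };
    size_t n_impls = sizeof strcmp_impls / sizeof strcmp_impls[0];
    size_t n_samples = sizeof samples / sizeof samples[0];
    int failing = 0;

    for (size_t i = 0; i < n_impls; i++) {
        int ok = 1;
        for (size_t j = 0; j < n_samples; j++) {
            int expected = sign(strcmp(samples[j][0], samples[j][1]));
            int got = sign(strcmp_impls[i].fn(samples[j][0], samples[j][1]));
            if (got != expected)
                ok = 0;
        }
        if (!ok)
            failing++;
    }
    return failing;
}
```

A test program would call assert(count_failing_impls() == 0). A real
test would also have to skip any version whose instruction set the
test machine does not support.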

I see a section named .text.sse4.2, but no .text.sse2. Using different
section names for different instruction set versions is a good idea
because it keeps the code that actually runs on a given CPU close
together, which improves code cache locality. But this will work only
if we have a standard for how to name these sections.

Is it possible to build a slim version of libc with certain instruction 
set versions excluded? This could be useful for embedded applications.
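
One way such a slim build might work is plain conditional compilation.
The sketch below assumes a hypothetical build macro
MYLIB_EXCLUDE_OPTIMIZED and hypothetical my_strlen_* names; it says
nothing about how the glibc build system is actually organized.

```c
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

/* Baseline version, always compiled in. */
static size_t my_strlen_generic(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

#ifndef MYLIB_EXCLUDE_OPTIMIZED
/* Stand-in for an instruction-set-specific version; a slim build drops it. */
static size_t my_strlen_optimized(const char *s)
{
    return strlen(s);
}
#endif

/* Versions in order of preference; the generic one is always last. */
static const strlen_fn my_strlen_versions[] = {
#ifndef MYLIB_EXCLUDE_OPTIMIZED
    my_strlen_optimized,
#endif
    my_strlen_generic,
};

/* Number of versions that were compiled into this build. */
size_t my_strlen_version_count(void)
{
    return sizeof my_strlen_versions / sizeof my_strlen_versions[0];
}

/* Use the most preferred version present in this build. */
size_t my_strlen(const char *s)
{
    return my_strlen_versions[0](s);
}
```

Compiling with -DMYLIB_EXCLUDE_OPTIMIZED leaves only the generic
version in the object file, so my_strlen_version_count() drops from 2
to 1 while my_strlen() keeps working.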


Petr Baudis wrote:
> This work is already being done in glibc git tree by Ulrich Drepper
> and H.J. Lu; a new ELF symbol type STT_GNU_IFUNC is used; if symbol
> of this type is called, the function returns address of the actual
> function that should be called on this and subsequent calls.
>
>   Currently, SSE4.2-optimized strcmp() is already committed.
>
>   
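
The resolve-once behaviour described in the quoted text can be imitated
in portable C with a function pointer, which may help when reasoning
about it. This is only a sketch under stated assumptions: with
STT_GNU_IFUNC the dynamic loader does the resolving, not a stub in the
program, and cpu_has_fast_path, resolve_my_strcmp and the other my_*
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*cmp_fn)(const char *, const char *);

/* Counts resolver runs so the resolve-once behaviour can be observed. */
static int resolver_calls = 0;

/* Baseline version that runs on any CPU. */
static int my_strcmp_generic(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Stand-in for an instruction-set-specific version. */
static int my_strcmp_fast(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Hypothetical feature probe; real code would query CPUID here. */
static int cpu_has_fast_path(void)
{
    return 0;
}

/* The resolver returns the address of the version to use from now on. */
static cmp_fn resolve_my_strcmp(void)
{
    cmp_fn chosen = cpu_has_fast_path() ? my_strcmp_fast : my_strcmp_generic;
    resolver_calls++;
    assert(chosen != NULL);
    return chosen;
}

static int my_strcmp_first_call(const char *a, const char *b);

/* Starts out pointing at the stub; overwritten after the first call. */
static cmp_fn my_strcmp_ptr = my_strcmp_first_call;

/* First call: resolve, remember the result, then forward the call. */
static int my_strcmp_first_call(const char *a, const char *b)
{
    my_strcmp_ptr = resolve_my_strcmp();
    return my_strcmp_ptr(a, b);
}

/* Public entry point: one indirect call on every use. */
int my_strcmp(const char *a, const char *b)
{
    return my_strcmp_ptr(a, b);
}
```

After any number of calls to my_strcmp(), resolver_calls stays at 1:
the CPU check runs once and every later call goes straight to the
chosen version.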


