Presently AArch64, ARM, s390x, s390, ppc32, ppc64, sparc32, and sparc64 all support passing dl_hwcap to the resolver function. This allows the resolver to do something interesting based on the hardware capabilities. On i386 and x86_64 we do not pass dl_hwcap to the resolver functions, and therefore they have no safe way to access this information. We should immediately bring i386 and x86_64 up to parity with the other architectures. Once this is done, we should follow up by adding a symbol dependency to all ifunc attribute functions compiled by gcc, so that new binaries that need this feature are assured they run in an environment that provides dl_hwcap and not just garbage in the incoming argument register. Without this feature, i386 and x86_64 ifuncs don't have much use.
Why? On i386/x86_64 it can (and usually does) just use the cpuid instruction to query the same information.
(In reply to Jakub Jelinek from comment #1)
> Why? On i386/x86_64 it can (and usually does) just use the cpuid
> instruction to query the same information.

It's faster than cpuid and doesn't serialize the instruction stream. If you really need cpuid then you're free to call it and process the information yourself, but AT_HWCAP (merged with AT_HWCAP2 on 64-bit platforms) contains useful information the ifunc resolver might need.
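For reference, here is a minimal sketch of what reading AT_HWCAP looks like from ordinary (non-resolver) code, using glibc's getauxval. The helper names hwcap_has_bit and print_hwcap and the macro HWCAP_X86_SSE2_BIT are made up for illustration. The bit position assumes that on x86 Linux AT_HWCAP mirrors the CPUID leaf 1 EDX feature word, where bit 26 is SSE2; on other architectures the bit has a different meaning. Calling getauxval from inside an ifunc resolver is not something this sketch claims to be safe.

```c
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */

/* Assumption: on x86 Linux, AT_HWCAP carries the CPUID leaf 1 EDX bits,
   so bit 26 corresponds to SSE2. Hypothetical macro name. */
#define HWCAP_X86_SSE2_BIT 26

/* Return 1 if the given bit is set in a hwcap word, 0 otherwise. */
int hwcap_has_bit(unsigned long hwcap, unsigned int bit)
{
    return (int)((hwcap >> bit) & 1UL);
}

/* Read AT_HWCAP from the auxiliary vector and report one bit.
   Intended for ordinary code, not for use inside an ifunc resolver. */
void print_hwcap(void)
{
    unsigned long hwcap = getauxval(AT_HWCAP);
    printf("AT_HWCAP = %#lx, bit %d set: %d\n",
           hwcap, HWCAP_X86_SSE2_BIT,
           hwcap_has_bit(hwcap, HWCAP_X86_SSE2_BIT));
}
```

With dl_hwcap passed as the resolver argument, the resolver would test the same word without having to call anything.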
(In reply to Carlos O'Donell from comment #2)
> (In reply to Jakub Jelinek from comment #1)
> > Why? On i386/x86_64 it can (and usually does) just use the cpuid
> > instruction to query the same information.
>
> It's faster than cpuid and doesn't serialize the instruction stream. If you
> really need cpuid then you're free to call it and process the information
> yourself, but AT_HWCAP (merged with AT_HWCAP2 on 64-bit platforms) contains
> useful information the ifunc resolver might need.

So one problem is that AT_HWCAP can't tell you about XSAVE/OSXSAVE, which is required to enable AVX512 variants. So you're probably right that most people doing modern hardware detection will have to rely on cpuid, which is problematic. I wonder if we can just document the safe functions to use in the IFUNC and that would be enough? For example, __builtin_cpu_init, __builtin_cpu_is, and __builtin_cpu_supports should be safe because gcc already uses them in multiversioning on x86.
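To make the suggestion concrete, here is a rough sketch of an x86 ifunc resolver that uses only the gcc builtins named above. The names add, add_generic, add_avx2, add_fn, and resolve_add are hypothetical, and add_avx2 is just a stand-in with the same body as the generic version. The sketch assumes gcc (or a compatible compiler) on an ELF target with ifunc support; elsewhere it falls back to a plain function.

```c
/* Hypothetical function-pointer type for the implementations. */
typedef int (*add_fn)(int, int);

static int add_generic(int a, int b) { return a + b; }

/* Stand-in for an AVX2-tuned variant; same result as the generic one. */
static int add_avx2(int a, int b) { return a + b; }

#if (defined(__x86_64__) || defined(__i386__)) && defined(__ELF__)
/* The resolver runs during relocation processing, so it must call
   __builtin_cpu_init explicitly before __builtin_cpu_supports. */
static add_fn resolve_add(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? add_avx2 : add_generic;
}

/* Bind the public symbol to whichever implementation the resolver picks. */
int add(int a, int b) __attribute__((ifunc("resolve_add")));
#else
/* Fallback for targets without x86 builtins or ifunc support. */
int add(int a, int b) { return add_generic(a, b); }
#endif
```

A caller just uses add(2, 3) as usual; the selection happens once, when the symbol is resolved.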
Either way, from a documentation perspective it's easier when the arches are the same, so adding dl_hwcap to the calling convention, and then documenting safe functions, should make this easier.