This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: libc-alpha at sourceware dot org
- Date: Tue, 09 Jun 2015 16:17:56 -0300
- Subject: Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB
- References: <55760314 dot 6070601 at linux dot vnet dot ibm dot com> <20150609163835 dot GI17573 at brightrain dot aerifal dot cx> <557726CA dot 9030100 at linaro dot org> <20150609183320 dot 5C3552C3BE6 at topped-with-meat dot com> <1433875886 dot 21101 dot 59 dot camel at sjmunroe-ThinkPad-W500>
On 09-06-2015 15:51, Steven Munroe wrote:
> On Tue, 2015-06-09 at 11:33 -0700, Roland McGrath wrote:
>>> I believe the idea is to provide a fast way to emulate a functionality
>>> similar to __builtin_cpu_supports for powerpc. For x86, this builtin
>>> will create 'cpuid' instruction, but since powerpc lacks a similar one
>>> it should rely on hardware capability information provided by kernel.
>>
>> On x86 using cpuid is quite slow as instruction-level overheads go.
>> It's certainly nowhere near as fast as doing a direct load from memory.
>> So this analogue does not suggest anything like justification for the
>> kind of microoptimization being discussed.
>
> In the X86 implementation the cpuid is cached by __builtin_cpu_init(). I
> suspect the result is saved in static or TLS.
>
> That said the x86/x86_64 ISA and micro arch are different from POWER
> with different tradeoffs.
>
> It would be inappropriate to impose these assumptions on other platforms.
>
> Our proposal is appropriate for the reality of POWER and using the
> HWCAP.
>
In fact, on x86_64 __builtin_cpu_supports generates a read from
a static struct defined in libgcc:
* libgcc/config/i386/cpuinfo.c:

  struct __processor_model
  {
    unsigned int __cpu_vendor;
    unsigned int __cpu_type;
    unsigned int __cpu_subtype;
    unsigned int __cpu_features[1];
  } __cpu_model = { };

It is initialized in a constructor (__cpu_indicator_init) using
cpuid. Either way, on powerpc even the same mechanism would incur
a static GOT relocation, since the struct is defined in a dynamic
library (with the difference that it would not have a dynamic
relocation).
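
To make the comparison concrete, here is a minimal sketch of the same
"fill a static struct in a constructor, then test bits with a plain
load" pattern, using the kernel-provided HWCAP words instead of cpuid.
The names hwcap_cache, hwcap_cache_init, hwcap_cached and hwcap_has are
hypothetical and appear neither in libgcc nor in the proposed patch;
the only real interface used is getauxval() from <sys/auxv.h> (Linux,
glibc 2.16 or later). This is not the TCB scheme under discussion, only
an illustration of the libgcc-style alternative.

```c
#include <sys/auxv.h>   /* getauxval, AT_HWCAP, AT_HWCAP2 (Linux/glibc) */

/* Hypothetical cache, loosely modelled on libgcc's __cpu_model. */
struct hwcap_cache {
    unsigned long hwcap;
    unsigned long hwcap2;
};

static struct hwcap_cache hwcap_cache;

/* Runs before main(), in the same way __cpu_indicator_init does on x86.
   The auxiliary vector is read once and the result kept in memory. */
__attribute__((constructor))
void hwcap_cache_init(void)
{
    hwcap_cache.hwcap = getauxval(AT_HWCAP);
#ifdef AT_HWCAP2
    hwcap_cache.hwcap2 = getauxval(AT_HWCAP2);
#endif
}

/* Return the cached AT_HWCAP word. */
unsigned long hwcap_cached(void)
{
    return hwcap_cache.hwcap;
}

/* Feature test: one load and a mask, with no call into the kernel.
   Returns nonzero when every bit in 'mask' is set in the cached word. */
int hwcap_has(unsigned long mask)
{
    return (hwcap_cache.hwcap & mask) == mask;
}
```

A caller would pass one of the platform's feature masks to hwcap_has().
When the cache lives in the executable itself the access is a direct
load; when it lives in a shared library the access goes through the
GOT, which is the cost referred to above.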