This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: FMA4 detection fails.
- From: Carlos O'Donell <carlos at redhat dot com>
- To: "Pawar, Amit" <Amit dot Pawar at amd dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Date: Thu, 2 Jun 2016 20:20:55 -0400
- Subject: Re: FMA4 detection fails.
- Authentication-results: sourceware.org; auth=none
- References: <SN1PR12MB073357908CD9467919A8814797580 at SN1PR12MB0733 dot namprd12 dot prod dot outlook dot com>
On 06/02/2016 05:35 AM, Pawar, Amit wrote:
> After the git commit "f4b6d20366aac66070f1cf50552cf2951991a1e5" FMA4 detection is failing in trunk.
You need to explain _why_ it's not working.
> This commit was related to Bugzilla bug https://sourceware.org/bugzilla/show_bug.cgi?id=19214. Is it OK to fix this in the following way in cpu-features.c?
>
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index a5fa81f..384e328 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -87,10 +87,6 @@ get_common_indeces (struct cpu_features *cpu_features,
>            if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>              cpu_features->feature[index_arch_FMA_Usable]
>                |= bit_arch_FMA_Usable;
> -          /* Determine if FMA4 is usable. */
> -          if (CPU_FEATURES_CPU_P (cpu_features, FMA4))
> -            cpu_features->feature[index_arch_FMA4_Usable]
> -              |= bit_arch_FMA4_Usable;
>          }
You move the feature check from the subroutine get_common_indeces...
>      }
>  }
> @@ -230,6 +226,25 @@ init_cpu_features (struct cpu_features *cpu_features)
>                 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx,
>                 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx);
> 
> +      /* Can we call xgetbv? */
> +      if (CPU_FEATURES_CPU_P (cpu_features, OSXSAVE))
> +        {
> +          unsigned int xcrlow;
> +          unsigned int xcrhigh;
> +
> +          asm ("xgetbv" : "=a" (xcrlow), "=d" (xcrhigh) : "c" (0));
> +
> +          /* Is YMM and XMM state usable? */
> +          if ((xcrlow & (bit_YMM_state | bit_XMM_state)) ==
> +              (bit_YMM_state | bit_XMM_state))
> +            {
> +              /* Determine if FMA4 is usable. */
> +              if (CPU_FEATURES_CPU_P (cpu_features, FMA4))
> +                cpu_features->feature[index_arch_FMA4_Usable]
> +                  |= bit_arch_FMA4_Usable;
... and up into init_cpu_features (the caller of get_common_indeces).
As you move it up the call stack, you end up duplicating the OSXSAVE/xgetbv
checks that get_common_indeces was already doing.
As far as I can tell, your patch just moves the check earlier. Why?
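To make the dependency concrete, here is a minimal sketch of the gating
logic under discussion: FMA4 is only marked usable when OSXSAVE is set and
XCR0 reports both XMM and YMM state enabled. This is not glibc code. The
function name fma4_usable and the BIT_* macros are made up for illustration;
the bit positions are the architectural ones (CPUID.1:ECX bit 27 for OSXSAVE,
CPUID.0x80000001:ECX bit 16 for FMA4, XCR0 bits 1 and 2 for XMM and YMM
state). It takes the register values as arguments so it can be exercised
without executing cpuid or xgetbv.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; the macro names are illustrative only.  */
#define BIT_OSXSAVE   (1u << 27)  /* CPUID leaf 1, ECX */
#define BIT_FMA4      (1u << 16)  /* CPUID leaf 0x80000001, ECX */
#define BIT_XMM_STATE (1u << 1)   /* XCR0 */
#define BIT_YMM_STATE (1u << 2)   /* XCR0 */

/* Hypothetical helper, not a glibc function.  xcr0_low is the low 32 bits
   that xgetbv would return for XCR0.  Real code must only execute xgetbv
   when OSXSAVE is set, so the value is ignored when that bit is clear.  */
static bool
fma4_usable (uint32_t leaf1_ecx, uint32_t leaf80000001_ecx,
             uint32_t xcr0_low)
{
  const uint32_t need = BIT_YMM_STATE | BIT_XMM_STATE;

  /* Can we call xgetbv at all?  */
  if ((leaf1_ecx & BIT_OSXSAVE) == 0)
    return false;

  /* Has the OS enabled both XMM and YMM state?  */
  if ((xcr0_low & need) != need)
    return false;

  /* Only now does the CPU's FMA4 feature bit make FMA4 usable.  */
  return (leaf80000001_ecx & BIT_FMA4) != 0;
}
```

For example, fma4_usable (BIT_OSXSAVE, BIT_FMA4, 0x7) is true, while the
same call with xcr0_low set to 0x3 (YMM state not enabled by the OS) is
false.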
> +            }
> +        }
> +
>        if (family == 0x15)
>          {
>  #if index_arch_Fast_Unaligned_Load != index_arch_Fast_Copy_Backward
>
>
> Thanks,
> Amit Pawar
--
Cheers,
Carlos.