This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] BZ #14059 - HAS_FMA4 check needs to also check for AVX
On 05/11/2012 09:10 AM, Allan McRae wrote:
On 11/05/12 16:48, Andreas Jaeger wrote:
On 05/11/2012 08:34 AM, Andreas Jaeger wrote:
On 05/11/2012 12:18 AM, Carlos O'Donell wrote:
On Thu, May 10, 2012 at 6:08 PM, Carlos O'Donell
<carlos@systemhalted.org> wrote:
On Tue, May 8, 2012 at 4:23 AM, Andreas Jaeger<aj@suse.de> wrote:
There are two things broken in checking for HAS_FMA4/AVX on x86-64 in
the multiarch code:
1. We should disable FAM4 if AVX is not available (due to disabled
OSXSAVE)
2. commit 56f6f6a2403cfa7267cad722597113be35ecf70d reverted some changes
from commit 08cf777f9e7f6d826658a99c7d77a359f73a45bf but forgot to
revert the change for AVX. We really have to disable AVX if it's
The issue is easy reproduceable under Xen for the reporter.
Also, the two commits introduced YMM_Usable and removed its usage but
did not remove the code from the headers, I'm cleaning this up as well.
I'm appending a patch that I tested on Linux/x86-64 (without Xen).
Ok to commit?
No.
I actually think the code Ulrich wrote originally was almost correct,
but missing this:
If I had to fix it it would look like this:
Thanks, this looks great - but misses the check for FMA4.
Here's a combined patch that compiled fine for me on
Linux/x86-64. Jeff, could you test it under Xen, please?
Andreas
2012-05-11 Andreas Jaeger<aj@suse.de>
Carlos O'Donell<carlos_odonell@mentor.com>
[BZ #14059]
* sysdeps/x86_64/multiarch/init-arch.c (__init_cpu_features):
Fix check for AVX, enable FMA4 only if it exists and if AVX is
usable.
* sysdeps/x86_64/multiarch/init-arch.h (bit_AVX_Usable): Renamed
from bit_YMM_usable.
(bit_FMA4_Usable): New macro.
(bit_XMM_state): New macro.
(bit_YMM_state): New macro.
(index_AVX_Usable): Renamed from index_YMM_Usable.
(index_FMA4_Usable): New macro.
(HAS_YMM_USABLE): Delete.
(HAS_AVX): Use HAS_ARCH_FEATURE.
(HAS_FMA4): Likewise.
That version above contained a typo. Here's an update
Allan, does this work now?
No. I still get an illegal instruction in __strcasecmp_l_avx with this
version.
Allan, thanks for testing!
Indeed, it fails:
The code tests the bits in cpuid and does not use the HAS_AVX flag:
# ifdef HAVE_AVX_SUPPORT
leaq __strcasecmp_avx(%rip), %rax
testl $bit_AVX, __cpu_features+CPUID_OFFSET+index_AVX(%rip)
jnz 2f
# endif
Please test the following.
I'm now offline until Monday, so won't be able to continue this, if
anybody likes to finish and commit the patch, I would appreciate it,
Andreas
2012-05-11 Andreas Jaeger <aj@suse.de>
* sysdeps/x86_64/multiarch/strcmp.S: Use bit_AVX_Usable.
diff --git a/sysdeps/x86_64/multiarch/strcmp.S
b/sysdeps/x86_64/multiarch/strcmp.S
index 2b9870b..eaf852b 100644
--- a/sysdeps/x86_64/multiarch/strcmp.S
+++ b/sysdeps/x86_64/multiarch/strcmp.S
@@ -1,5 +1,5 @@
/* strcmp with SSE4.2
- Copyright (C) 2009, 2010, 2011 Free Software Foundation, Inc.
+ Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by Intel Corporation.
This file is part of the GNU C Library.
@@ -106,7 +106,7 @@ ENTRY(__strcasecmp)
1:
# ifdef HAVE_AVX_SUPPORT
leaq __strcasecmp_avx(%rip), %rax
- testl $bit_AVX, __cpu_features+CPUID_OFFSET+index_AVX(%rip)
+ testl $bit_AVX_Usable,
__cpu_features+FEATURE_OFFSET+index_AVX_Usable(%rip)
jnz 2f
# endif
leaq __strcasecmp_sse42(%rip), %rax
@@ -129,7 +129,7 @@ ENTRY(__strncasecmp)
1:
# ifdef HAVE_AVX_SUPPORT
leaq __strncasecmp_avx(%rip), %rax
- testl $bit_AVX, __cpu_features+CPUID_OFFSET+index_AVX(%rip)
+ testl $bit_AVX_Usable,
__cpu_features+FEATURE_OFFSET+index_AVX_Usable(%rip)
jnz 2f
# endif
leaq __strncasecmp_sse42(%rip), %rax
--
Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126