Bug 14059 - HAS_FMA4 check needs to also check for AVX
Summary: HAS_FMA4 check needs to also check for AVX
Status: RESOLVED FIXED
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: 2.15
Importance: P2 normal
Target Milestone: ---
Assignee: Carlos O'Donell
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-04 16:35 UTC by Jim Westfall
Modified: 2014-06-25 11:06 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments
patch sent to ubuntu folks to fix their glibc build (334 bytes, patch)
2012-05-07 20:01 UTC, Jim Westfall
patch that fixes avx/fma4 runtime detection in master (513 bytes, patch)
2012-05-07 22:50 UTC, Jim Westfall

Description Jim Westfall 2012-05-04 16:35:41 UTC
Hi

In a number of the sysdeps/x86_64/fpu/multiarch/*.c files there is code like this:


libm_ifunc (__ieee754_exp,
            HAS_FMA4 ? __ieee754_exp_fma4
            : (HAS_AVX ? __ieee754_exp_avx : __ieee754_exp_sse2));

The HAS_FMA4 check looks only at the FMA4 bit from cpuid() to decide whether to run the fma4 version of the function, but FMA4 instructions depend on AVX instructions.  This can result in the following invalid opcode if AVX isn't available.

(gdb) exec-file python-dbg
(gdb) run
Starting program: /usr/bin/python-dbg
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Python 2.7.3 (default, Apr 20 2012, 22:01:19)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print test

Program received signal SIGILL, Illegal instruction.
__ieee754_exp_fma4 (x=-0.5) at ../sysdeps/ieee754/dbl-64/e_exp.c:63
63 ../sysdeps/ieee754/dbl-64/e_exp.c: No such file or directory.
(gdb) bt
#0 __ieee754_exp_fma4 (x=-0.5) at ../sysdeps/ieee754/dbl-64/e_exp.c:63
#1 0x000000000058509f in ?? ()
#2 0x00000000009bde80 in ?? ()
#3 0x0000000100c52a10 in ?? ()
#4 0x0000000000417380 in ?? ()
#5 0x00000000009d2970 in ?? ()
#6 0x0000000000000000 in ?? ()
(gdb) info address __ieee754_exp_fma4
Symbol "__ieee754_exp_fma4" is a function at address 0x7ffff6cc35f0.
(gdb) disassemble 0x7ffff6cc35f0,+30
Dump of assembler code from 0x7ffff6cc35f0 to 0x7ffff6cc360e:
   0x00007ffff6cc35f0 <__ieee754_exp_fma4+0>: push %rbp
   0x00007ffff6cc35f1 <__ieee754_exp_fma4+1>: mov %rsp,%rbp
   0x00007ffff6cc35f4 <__ieee754_exp_fma4+4>: and $0xffffffffffffffe0,%rsp
   0x00007ffff6cc35f8 <__ieee754_exp_fma4+8>: add $0x10,%rsp
=> 0x00007ffff6cc35fc <__ieee754_exp_fma4+12>: vmovsd %xmm0,-0x20(%rsp)
   0x00007ffff6cc3602 <__ieee754_exp_fma4+18>: mov -0x20(%rsp),%rax
   0x00007ffff6cc3607 <__ieee754_exp_fma4+23>: mov %rax,%rcx
   0x00007ffff6cc360a <__ieee754_exp_fma4+26>: shr $0x20,%rcx


thanks
jim
Comment 1 Andreas Jaeger 2012-05-07 19:06:59 UTC
This looks related to http://sourceware.org/bugzilla/show_bug.cgi?id=13583 which was fixed with commit afc5ed09cbce5d6fd48b3a8c5ec427b31f996880.

Can you confirm that that patch - most probably together with 6ee65ed6ddbf04402fad0bec6aa9c73b9d982ae4 - fixes your problem? Or is this really a separate issue?
Comment 2 Jim Westfall 2012-05-07 20:01:11 UTC
Created attachment 6398 [details]
patch sent to ubuntu folks to fix their glibc build
Comment 3 Jim Westfall 2012-05-07 20:03:21 UTC
Hi

My backtrace is actually from 2.15ubuntu10 which has
afc5ed09cbce5d6fd48b3a8c5ec427b31f996880 applied.  

Something similar to that would be needed for disabling FMA4 if AVX is
unavailable.  Attached is the patch I submitted to the ubuntu folks to fix the
issue.

I will do a fresh build against master and will report back.

I suspect that AVX runtime detection is broken in master because of
the following 2 commits

08cf777f9e7f6d826658a99c7d77a359f73a45bf
56f6f6a2403cfa7267cad722597113be35ecf70d

Specifically, 56f6f6a2403cfa7267cad722597113be35ecf70d reverts HAS_YMM_USABLE
back to HAS_AVX, but it didn't also restore the code in
sysdeps/x86_64/multiarch/init-arch.c that unsets the AVX bit.  So it now appears
that HAS_AVX will be true when OSXSAVE is disabled, which would
re-introduce the issue from bug 13583.

thanks
jim
Comment 4 Jim Westfall 2012-05-07 22:46:18 UTC
Hi

Here is what I see with master/head.  My test case is simply this

#include <math.h>

int main(void) {
  double x = 0.5;
  exp(x);
  return 0;
}

gcc -o exp-test exp-test.c -lm

Intel i5-2405S dom0 under xen.  AVX runtime detection is re-broken.

Program terminated with signal 4, Illegal instruction.
#0  0x00007f00d708ff71 in __ieee754_exp_avx (x=0.5) at ../sysdeps/ieee754/dbl-64/e_exp.c:55
55	__ieee754_exp(double x) {
(gdb) info address __ieee754_exp_avx
Symbol "__ieee754_exp_avx" is a function at address 0x7f00d708ff70.
(gdb) disassemble __ieee754_exp_avx,+30
Dump of assembler code from 0x7f00d708ff70 to 0x7f00d708ff8e:
   0x00007f00d708ff70 <__ieee754_exp_avx+0>:	push   %rbx
=> 0x00007f00d708ff71 <__ieee754_exp_avx+1>:	vmovapd %xmm0,%xmm2
   0x00007f00d708ff75 <__ieee754_exp_avx+5>:	sub    $0x20,%rsp
   0x00007f00d708ff79 <__ieee754_exp_avx+9>:	vstmxcsr 0x1c(%rsp)
   0x00007f00d708ff7f <__ieee754_exp_avx+15>:	mov    0x1c(%rsp),%ebx
   0x00007f00d708ff83 <__ieee754_exp_avx+19>:	mov    %ebx,%eax
   0x00007f00d708ff85 <__ieee754_exp_avx+21>:	and    $0x9f,%ah
   0x00007f00d708ff88 <__ieee754_exp_avx+24>:	mov    %eax,0x1c(%rsp)
   0x00007f00d708ff8c <__ieee754_exp_avx+28>:	vldmxcsr 0x1c(%rsp)
End of assembler dump.

AMD 6272 dom0 under xen
Program terminated with signal 4, Illegal instruction.
#0  0x00007fe07cf29d31 in __ieee754_exp_fma4 (x=0.5)
    at ../sysdeps/ieee754/dbl-64/e_exp.c:55
55	../sysdeps/ieee754/dbl-64/e_exp.c: No such file or directory.
(gdb) disassemble __ieee754_exp_fma4,+30
Dump of assembler code from 0x7fe07cf29d30 to 0x7fe07cf29d4e:
   0x00007fe07cf29d30 <__ieee754_exp_fma4+0>:	push   %rbx
=> 0x00007fe07cf29d31 <__ieee754_exp_fma4+1>:	vmovapd %xmm0,%xmm1
   0x00007fe07cf29d35 <__ieee754_exp_fma4+5>:	sub    $0x20,%rsp
   0x00007fe07cf29d39 <__ieee754_exp_fma4+9>:	vstmxcsr 0x1c(%rsp)
   0x00007fe07cf29d3f <__ieee754_exp_fma4+15>:	mov    0x1c(%rsp),%ebx
   0x00007fe07cf29d43 <__ieee754_exp_fma4+19>:	mov    %ebx,%eax
   0x00007fe07cf29d45 <__ieee754_exp_fma4+21>:	and    $0x9f,%ah
   0x00007fe07cf29d48 <__ieee754_exp_fma4+24>:	mov    %eax,0x1c(%rsp)
   0x00007fe07cf29d4c <__ieee754_exp_fma4+28>:	vldmxcsr 0x1c(%rsp)
End of assembler dump.

jim
Comment 5 Jim Westfall 2012-05-07 22:50:27 UTC
Created attachment 6399 [details]
patch that fixes avx/fma4 runtime detection in master

This patch fixes a (missed?) revert as part of 08cf777f9e7f6d826658a99c7d77a359f73a45bf and 56f6f6a2403cfa7267cad722597113be35ecf70d, and also disables fma4 if avx is unavailable.
Comment 6 Andreas Jaeger 2012-05-08 08:14:38 UTC
Thanks for the patch.
Comment 7 Carlos O'Donell 2012-05-17 14:10:55 UTC
Fixed with this commit:

http://sources.redhat.com/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=1a0994f5356214e8af8a1c1cc33fbf74a7ac8993

We've cleaned this up considerably, since technically FMA4 doesn't depend on AVX; it depends only on YMM/XMM state being usable (saved and restored by the OS) and on the feature being present.

We also fix the AVX detection with this patch.

Tested on AVX-enabled and non-AVX-enabled hosts, and for good measure we now have a regression test for this.

I'm going to backport this to 2.15.1 to fix the mess there, see BZ313753 for the backport request.