This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug math/21967] When 512-bit AVX2 wrapper functions in mathvec are used?

From: "andrew.n.senkevich at gmail dot com" <sourceware-bugzilla at sourceware dot org>
To: glibc-bugs at sourceware dot org
Date: Mon, 21 Aug 2017 11:09:56 +0000
Subject: [Bug math/21967] When 512-bit AVX2 wrapper functions in mathvec are used?
Auto-submitted: auto-generated
References: <bug-21967-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=21967

--- Comment #2 from Andrew Senkevich <andrew.n.senkevich at gmail dot com> ---
(In reply to H.J. Lu from comment #0)
> 512-bit AVX2 wrapper functions in mathvec, like svml_d_log8_core-avx2.S, have
> 
> #define _ZGVeN8v_log _ZGVeN8v_log_avx2_wrapper
> #include "../svml_d_log8_core.S"
> 
> It is used by svml_d_log8_core.c with
> 
> static inline void *
> IFUNC_SELECTOR (void)
> {
>   const struct cpu_features* cpu_features = __get_cpu_features ();
> 
>   if (CPU_FEATURES_ARCH_P (cpu_features, AVX512DQ_Usable))
>     return OPTIMIZE (skx);
> 
>   if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable))
>     return OPTIMIZE (knl);
> 
>   return OPTIMIZE (avx2_wrapper);
> }
> 
> So if AVX512 isn't available, _ZGVeN8v_log_avx2_wrapper will be used.  It
> is compiled into
> 
> 0000000000000000 <_ZGVeN8v_log_avx2_wrapper>:
>    0:	55                   	push   %rbp
>    1:	48 89 e5             	mov    %rsp,%rbp
>    4:	48 83 e4 c0          	and    $0xffffffffffffffc0,%rsp
>    8:	48 81 ec 80 00 00 00 	sub    $0x80,%rsp
>    f:	62 f1 7c 48 11 04 24 	vmovups %zmm0,(%rsp)
>   16:	c5 fd 10 04 24       	vmovupd (%rsp),%ymm0
>   1b:	e8 00 00 00 00       	callq  20 <_ZGVeN8v_log_avx2_wrapper+0x20>
> 			1c: R_X86_64_PC32	__GI__ZGVdN4v_log-0x4
>   20:	c5 fd 11 44 24 40    	vmovupd %ymm0,0x40(%rsp)
>   26:	c5 fd 10 44 24 20    	vmovupd 0x20(%rsp),%ymm0
>   2c:	e8 00 00 00 00       	callq  31 <_ZGVeN8v_log_avx2_wrapper+0x31>
> 			2d: R_X86_64_PC32	__GI__ZGVdN4v_log-0x4
>   31:	c5 fd 11 44 24 60    	vmovupd %ymm0,0x60(%rsp)
>   37:	62 f1 7c 48 10 44 24 01 	vmovups 0x40(%rsp),%zmm0
>   3f:	48 89 ec             	mov    %rbp,%rsp
>   42:	5d                   	pop    %rbp
>   43:	c3                   	retq   
> 
> But
> 
>    f:	62 f1 7c 48 11 04 24 	vmovups %zmm0,(%rsp)
> 
> is an AVX512F instruction.  How is it supposed to work?

Using avx2_wrapper in the AVX512 ifunc selector is a logic bug. It looks like
there is no way to extract the high half of a zmm register without an AVX512
instruction. The mathvec runtime tests have a runtime ISA check, so this
illegal instruction wasn't caught in testing.
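
The fallthrough in the quoted selector can be modeled in plain C to make the
problem explicit. This is only a sketch: the enum, select_log8 and
needs_avx512f are hypothetical names for illustration, not glibc code, and the
feature bits are passed in as booleans instead of being read from cpu_features.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three variants the selector can return.  */
enum log8_impl { IMPL_SKX, IMPL_KNL, IMPL_AVX2_WRAPPER };

/* Mirrors the order of the checks in the quoted IFUNC_SELECTOR.  */
static enum log8_impl
select_log8 (bool avx512dq_usable, bool avx512f_usable)
{
  if (avx512dq_usable)
    return IMPL_SKX;
  if (avx512f_usable)
    return IMPL_KNL;
  /* Reached exactly when no AVX512 feature is usable.  */
  return IMPL_AVX2_WRAPPER;
}

/* Whether a variant executes AVX512F instructions.  */
static bool
needs_avx512f (enum log8_impl impl)
{
  switch (impl)
    {
    case IMPL_SKX:
    case IMPL_KNL:
      return true;   /* Native AVX512 implementations.  */
    case IMPL_AVX2_WRAPPER:
      return true;   /* Spills %zmm0 with an EVEX-encoded vmovups.  */
    }
  return false;
}
```

With both flags false, select_log8 returns IMPL_AVX2_WRAPPER, and
needs_avx512f is still true for it, which is the contradiction described
above.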

As I remember, those wrappers were originally created for two use cases: a
Glibc build with an assembler that does not support AVX512, and a Glibc build
with --disable-multi-arch.

AVX512 mathvec functions are called only if the binary is compiled for the
AVX512 ISA. So if such a binary runs on hardware without AVX512, it will fail
earlier than inside the vector function.
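
For code that cannot assume the ISA at build time, a run-time check before
taking a 512-bit path might look like the sketch below. It assumes GCC or
Clang on x86 and uses the compiler builtin __builtin_cpu_supports;
cpu_has_avx512f is a hypothetical helper name, and this is not how the mathvec
tests implement their own check.

```c
#include <stdbool.h>

/* Returns true when the running CPU reports AVX512F.  Uses the GCC/Clang
   x86 builtin; on other targets or compilers it conservatively returns
   false.  */
static bool
cpu_has_avx512f (void)
{
#if (defined (__x86_64__) || defined (__i386__)) && defined (__GNUC__)
  /* Safe to call more than once; it initializes the CPU model data the
     next builtin reads.  */
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx512f") != 0;
#else
  return false;
#endif
}
```

A caller would use the 512-bit routine only when cpu_has_avx512f returns true
and fall back to a narrower one otherwise.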

I think we simply need to remove
>   return OPTIMIZE (avx2_wrapper);


References:
- [Bug math/21967] New: When 512-bit AVX2 wrapper functions in mathvec are used?
  - From: hjl.tools at gmail dot com
