This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug math/21967] When 512-bit AVX2 wrapper functions in mathvec are used?
- From: "andrew.n.senkevich at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Mon, 21 Aug 2017 11:09:56 +0000
- Subject: [Bug math/21967] When 512-bit AVX2 wrapper functions in mathvec are used?
- Auto-submitted: auto-generated
- References: <bug-21967-131@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=21967
--- Comment #2 from Andrew Senkevich <andrew.n.senkevich at gmail dot com> ---
(In reply to H.J. Lu from comment #0)
> 512-bit AVX2 wrapper functions in mathvec, like svml_d_log8_core-avx2.S, have
>
> #define _ZGVeN8v_log _ZGVeN8v_log_avx2_wrapper
> #include "../svml_d_log8_core.S"
>
> It is used by svml_d_log8_core.c with
>
> static inline void *
> IFUNC_SELECTOR (void)
> {
>   const struct cpu_features* cpu_features = __get_cpu_features ();
>
>   if (CPU_FEATURES_ARCH_P (cpu_features, AVX512DQ_Usable))
>     return OPTIMIZE (skx);
>
>   if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable))
>     return OPTIMIZE (knl);
>
>   return OPTIMIZE (avx2_wrapper);
> }
>
> So if AVX512 isn't available, _ZGVeN8v_log_avx2_wrapper will be used. It
> is compiled into
>
> 0000000000000000 <_ZGVeN8v_log_avx2_wrapper>:
>    0: 55                       push   %rbp
>    1: 48 89 e5                 mov    %rsp,%rbp
>    4: 48 83 e4 c0              and    $0xffffffffffffffc0,%rsp
>    8: 48 81 ec 80 00 00 00     sub    $0x80,%rsp
>    f: 62 f1 7c 48 11 04 24     vmovups %zmm0,(%rsp)
>   16: c5 fd 10 04 24           vmovupd (%rsp),%ymm0
>   1b: e8 00 00 00 00           callq  20 <_ZGVeN8v_log_avx2_wrapper+0x20>
>                        1c: R_X86_64_PC32 __GI__ZGVdN4v_log-0x4
>   20: c5 fd 11 44 24 40        vmovupd %ymm0,0x40(%rsp)
>   26: c5 fd 10 44 24 20        vmovupd 0x20(%rsp),%ymm0
>   2c: e8 00 00 00 00           callq  31 <_ZGVeN8v_log_avx2_wrapper+0x31>
>                        2d: R_X86_64_PC32 __GI__ZGVdN4v_log-0x4
>   31: c5 fd 11 44 24 60        vmovupd %ymm0,0x60(%rsp)
>   37: 62 f1 7c 48 10 44 24 01  vmovups 0x40(%rsp),%zmm0
>   3f: 48 89 ec                 mov    %rbp,%rsp
>   42: 5d                       pop    %rbp
>   43: c3                       retq
>
> But
>
> f: 62 f1 7c 48 11 04 24 vmovups %zmm0,(%rsp)
>
> is an AVX512F instruction. How is it supposed to work?
Using avx2_wrapper in the AVX512 ifunc is a logic bug. It looks like there is
no way to extract the high part of a zmm register without an AVX512 instruction.
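
To make the data flow of the quoted listing easier to follow, here is a
minimal C model of what the wrapper does. It is only a sketch: kernel4 is a
hypothetical stand-in for __GI__ZGVdN4v_log (it squares each lane so the
example needs no libm), wrapper8 and spill are made-up names, and plain
memcpy stands in for the vector moves.

```c
#include <string.h>

/* Hypothetical stand-in for the 4-lane kernel (__GI__ZGVdN4v_log in the
   listing).  It squares each lane instead of taking a logarithm.  */
static void
kernel4 (const double in[4], double out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = in[i] * in[i];
}

/* Model of the 8-lane wrapper.  spill plays the role of the 0x80-byte
   stack frame: 16 doubles, input in the first half, results in the
   second half.  */
static void
wrapper8 (const double in[8], double out[8])
{
  double spill[16];

  /* Models "vmovups %zmm0,(%rsp)": a 64-byte store of the whole input.  */
  memcpy (spill, in, 8 * sizeof (double));

  /* Low half at offset 0x00, result stored at offset 0x40.  */
  kernel4 (spill, spill + 8);

  /* High half at offset 0x20, result stored at offset 0x60.  */
  kernel4 (spill + 4, spill + 12);

  /* Models "vmovups 0x40(%rsp),%zmm0": a 64-byte load of the result.  */
  memcpy (out, spill + 8, 8 * sizeof (double));
}
```

The two memcpy calls correspond to the only 64-byte moves in the listing,
and those are the instructions that need AVX512F. The two kernel calls in
between use 32-byte (ymm) accesses only.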
The mathvec runtime tests have a runtime ISA check, so this illegal instruction
wasn't caught in testing.
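
A rough sketch of why such a check hides the problem follows. This is not
the glibc test harness: have_avx512f, run_guarded, avx512_test_body and the
TEST_* values are hypothetical names for illustration. The only real
interfaces used are the GCC/Clang builtins __builtin_cpu_init and
__builtin_cpu_supports; on other compilers or targets the sketch just
reports "not supported".

```c
/* Hypothetical result codes for this sketch.  */
enum { TEST_PASSED = 0, TEST_SKIPPED = 1 };

/* Counts how often the test body actually ran.  */
static int calls;

/* Hypothetical test body; a real one would call the 8-lane function.  */
static int
avx512_test_body (void)
{
  calls++;
  return TEST_PASSED;
}

/* Returns 1 if the CPU reports AVX512F, 0 otherwise.  */
static int
have_avx512f (void)
{
#if (defined (__x86_64__) || defined (__i386__)) && defined (__GNUC__)
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx512f") != 0;
#else
  return 0;
#endif
}

/* Runs the test only when the ISA is available.  On a machine without
   AVX512F the body is never entered, so the selector's fallback branch
   is never exercised by the test.  */
static int
run_guarded (int (*test) (void))
{
  if (!have_avx512f ())
    return TEST_SKIPPED;
  return test ();
}
```

With a guard of this shape the test is skipped on exactly the hardware
where the selector would pick the wrapper.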
As I remember, those wrappers were originally created for two use cases: a
glibc build with an assembler that does not support AVX512, and a glibc build
with --disable-multi-arch.
AVX512 mathvec functions are called only if the binary is compiled for the
AVX512 ISA. So if such a binary runs on hardware without AVX512, it will fail
earlier than inside the vector function.
I think we simply need to remove
> return OPTIMIZE (avx2_wrapper);
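
For illustration only, one shape the selector could take once that line is
gone, modeled outside glibc. The feature bits, the impl enum and select_log8
are hypothetical names, not glibc interfaces. The sketch also assumes the
AVX512F (knl) variant becomes the unconditional last choice, on the
reasoning above that a binary compiled for AVX512 fails earlier on other
hardware anyway; the comment itself does not say what should be returned
instead.

```c
/* Hypothetical feature bits standing in for glibc's cpu_features.  */
enum
{
  FEAT_AVX512F = 1 << 0,
  FEAT_AVX512DQ = 1 << 1
};

/* Hypothetical labels for the two remaining implementations.  */
enum impl
{
  IMPL_SKX,
  IMPL_KNL
};

/* Model of the selector without the avx2_wrapper fallback.  */
static enum impl
select_log8 (unsigned int features)
{
  if (features & FEAT_AVX512DQ)
    return IMPL_SKX;

  /* Assumption: the AVX512F variant is the final fallback.  */
  return IMPL_KNL;
}
```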