Bug 23249

Summary: Epyc and other current AMD CPUs do not select the "haswell" platform subdirectory
Product: glibc Reporter: Allan Jensen <linux>
Component: libc Assignee: Florian Weimer <fweimer>
Status: RESOLVED FIXED    
Severity: enhancement CC: arthur200126, david.abdurachmanov, drepper.fsp, fweimer, josephriches, pmenzel+sourceware.org-bugzilla, tcl_de
Priority: P2 Flags: fweimer: security-
Version: 2.27   
Target Milestone: 2.33   
Host: Target:
Build: Last reconfirmed: 2020-04-22 00:00:00
Attachments: What I think the discussion is going for

Description Allan Jensen 2018-05-30 11:15:55 UTC
Recently a "haswell" sub-arch was introduced, similar to the old i686 sub-arch for x86. It is documented as requiring BMI1, BMI2, LZCNT, MOVBE, POPCNT, AVX2 and FMA, but there is also an undocumented check that the CPU is an Intel CPU before the faster paths are used.
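
To make the complaint concrete, here is a minimal sketch of the selection rule as described above. It is not glibc's code: the function name, the `intel_only` switch, and the feature spellings are assumptions made for illustration. Only the feature list and the vendor restriction come from this report.

```python
# Illustrative model of the "haswell" platform check described in this report.
# NOT glibc source: names and structure are hypothetical.

# Feature list as documented for the "haswell" sub-arch (from the description).
HASWELL_REQUIRED = frozenset(
    {"BMI1", "BMI2", "LZCNT", "MOVBE", "POPCNT", "AVX2", "FMA"}
)


def selects_haswell(vendor, features, intel_only=True):
    """Return True if the 'haswell' subdirectory would be selected.

    vendor     -- CPUID vendor string, e.g. "GenuineIntel" or "AuthenticAMD"
    features   -- iterable of feature names the CPU reports
    intel_only -- models the undocumented vendor restriction (assumed flag)
    """
    if intel_only and vendor != "GenuineIntel":
        return False
    return HASWELL_REQUIRED <= set(features)


if __name__ == "__main__":
    feats = set(HASWELL_REQUIRED)
    print(selects_haswell("GenuineIntel", feats))
    print(selects_haswell("AuthenticAMD", feats))
    print(selects_haswell("AuthenticAMD", feats, intel_only=False))
```

With `intel_only=True`, an AMD CPU that reports every required feature is still rejected, which is the behaviour this bug is about.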

Considering this is very similar to the old scandal of the Intel compiler, I would suggest glibc fixes that before it becomes public knowledge.
Comment 1 Florian Weimer 2018-05-31 11:27:29 UTC
We fix performance issues as they are identified, see bug 19467 for an example.

However, changes in this area require a deep understanding of CPU architecture (and, preferably, future plans, so that deployed code does not take a performance hit when switching to newer CPUs).  Just because something is implemented at the instruction set level doesn't mean the implementation is efficient.  For example, the first CPU generation with wider vector registers often still uses old, narrower execution units, so not using the wider registers can be more efficient.
Comment 2 Allan Jensen 2018-05-31 14:48:41 UTC
What does that comment have to do with the fact that you are providing the optimized path to all Intel CPUs including future ones if they implement the necessary extensions, but deliberately do not give AMD processors the same fast path?

Current gen AMD cpus are more likely to benefit from Haswell optimized binaries than some future Intel chip, or mobile Intel chip.
Comment 3 Allan Jensen 2018-05-31 14:52:21 UTC
And please note that I am talking about how third-party libraries that could potentially use this new sub-arch dir 'lib64/haswell' get picked up. I am not talking about glibc-internal CPU scheduling, which is completely separate and very much optimized for every arch.

It is one specific line of code that is wrong here: cpu_features.c:400.
Comment 4 Florian Weimer 2018-06-01 09:03:56 UTC
Okay, feel free to bring this up on libc-alpha.  Let's see what the Intel and AMD maintainers think.
Comment 5 Florian Weimer 2018-09-26 17:05:05 UTC
FWIW, I verified that with glibc 2.28, the "haswell" subdirectories are not searched on an AMD EPYC 7251 CPU.
Comment 6 Florian Weimer 2019-09-09 07:15:15 UTC
*** Bug 24979 has been marked as a duplicate of this bug. ***
Comment 7 Joey Riches 2020-02-25 16:26:31 UTC
Sorry to necro an old thread but I wanted to know if there was any more discussion around this?

I patched cpu-features.c to allow arch_kind_amd CPUs to match the haswell platform definition and have been getting very good performance results with the glibc-bench benchmarks, as well as with a haswell-optimized openblas, which in turn improved results for R and octave (on a znver2 CPU).

It seems Intel has already done most of the hard work for AMD here?
Comment 8 Florian Weimer 2020-02-25 16:33:18 UTC
(In reply to Joey Riches from comment #7)
> Sorry to necro an old thread but I wanted to know if there was any more
> discussion around this?

I'm still blocked until I get definitive feedback from AMD.
Comment 9 Florian Weimer 2020-04-22 15:41:32 UTC
I'm happy to report that I've been in contact with the right people at AMD for a while.

I do not know yet what the exact outcome will be (if the “haswell” directory will be used), but there will be a way to automatically load AVX2-optimized libraries on AMD CPUs as well.
Comment 10 Florian Weimer 2020-05-08 18:43:04 UTC
I surveyed the existing code and wrote a summary:

hwcaps subdirectory selection in the dynamic loader
<https://sourceware.org/pipermail/libc-alpha/2020-May/113757.html>
Comment 11 Mingye Wang 2020-05-10 00:41:46 UTC
I agree that a "generational" revision scheme would be helpful in the long run. For that information we can consult compiler databases (gcc/gcc/common/config/i386/i386-common.c + gcc/config/i386/i386.h or llvm/clang/include/clang/Basic/X86Target.def) and make some sort of table to work with.

The LLVM project's version seems to already imply a generational scheme, although they appear to be taking a lot of liberties by eliding a lot of the stuff they don't use. They also don't seem to care about less-used stuff like VIA.

The GCC database does include the VIA CPUs, but the part about eden-x4 not having any sort of AVX is a bit dubious. Hell, their documentation mentions it having AVX2...
Comment 12 tcl_de 2020-06-24 21:55:55 UTC
Regarding VIA, the GCC onlinedocs say that eden-x4 supports AVX and AVX2, but as you have pointed out, they are not enabled in GCC's i386-common.c. I have had a look at a CPUID dump from the instlatx64 project for an eden-x4 (see http://users.atw.hu/instlatx64/CentaurHauls/CentaurHauls00006FE_CNR_Isaiah_CPUID.txt ) and if I decode it correctly, the GCC onlinedocs also miss some supported instruction sets: MOVBE, POPCNT, AES, PCLMUL, FSGSBASE, RDRND, BMI, BMI2 and F16C.
This means that all instruction sets listed for Haswell should be supported except for FMA.
As far as I can tell, if RDRND is also removed from the list, this is the least common denominator among all AVX2 cpus (technically bdver4 supports RDRND, but Linux disables it by default due to buggy BIOS support, see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c49a0a80137c7ca7d6ced4c812c9e07a949f6f24)
Comment 13 Mingye Wang 2020-06-26 15:27:23 UTC
> Regarding VIA, the GCC onlinedocs say that eden-x4 supports AVX and AVX2, but as you have pointed out, they are not enabled in GCC's i386-common.c.

I have reported the problem to gcc as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95030. Let me forward your cpuid links.
Comment 14 tcl_de 2020-06-26 19:29:46 UTC
(In reply to Mingye Wang from comment #13)
> > Regarding VIA, the GCC onlinedocs say that eden-x4 supports AVX and AVX2, but as you have pointed out, they are not enabled in GCC's i386-common.c.
> 
> I have reported the problem to gcc as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95030. Let me forward your
> cpuid links.

Thanks for forwarding the information. I just realized that when I had a look at the CPUID dump, I overlooked something: I only focussed on the instruction sets listed for haswell, but the eden-x4 actually supports a few more that are not in that list: PREFETCHW, RDSEED, ADX, ABM, XSAVE, and maybe also XSAVEOPT and XSAVEC, but I am not sure about the latter two.
Comment 15 Mingye Wang 2020-08-31 05:03:12 UTC
Created attachment 12811
What I think the discussion is going for
Comment 16 Florian Weimer 2020-08-31 13:13:52 UTC
Sorry, I didn't update the ticket. I think the consensus is not to make further changes to the existing hwcaps mechanism due to the issues mentioned in comment 10.

Instead, we are focusing on a new approach:

https://sourceware.org/pipermail/libc-alpha/2020-June/115250.html

The psABI document has been updated:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9a6b9396884b67c7c
Comment 17 Florian Weimer 2020-12-04 08:50:37 UTC
This is effectively fixed in glibc 2.33 by:

commit f267e1c9dd7fb8852cc32d6eafd96bbcfd5cbb2b
Author: Florian Weimer <fweimer@redhat.com>
Date:   Fri Dec 4 09:13:43 2020 +0100

    x86_64: Add glibc-hwcaps support
    
    The subdirectories match those in the x86-64 psABI:
    
    https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9a6b9396884b67c7c
    
    Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

Libraries for Epyc and other current AMD CPUs need to be installed into the glibc-hwcaps/x86-64-v3 subdirectory in order to be picked up.
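
As a rough illustration of what this means for packagers, the sketch below lists the glibc-hwcaps subdirectories that would be tried for a given CPU level, best match first, before falling back to the plain library directory. This is a simplified model, not the loader's implementation: `hwcaps_search_dirs` is a hypothetical helper, it assumes each level also accepts libraries built for lower levels, and it leaves out the legacy hwcaps directories such as "haswell".

```python
# Simplified model of glibc-hwcaps directory preference on x86-64.
# NOT the dynamic loader's code; the helper below is hypothetical.

# psABI level names, from lowest to highest.
LEVELS = ["x86-64-v2", "x86-64-v3", "x86-64-v4"]


def hwcaps_search_dirs(libdir, cpu_level):
    """Return candidate directories for `libdir`, most specific first.

    cpu_level is the highest psABI level the CPU satisfies; anything
    not in LEVELS (e.g. the "x86-64" baseline) yields only `libdir`.
    """
    if cpu_level in LEVELS:
        usable = LEVELS[: LEVELS.index(cpu_level) + 1]
    else:
        usable = []
    dirs = [f"{libdir}/glibc-hwcaps/{name}" for name in reversed(usable)]
    dirs.append(libdir)  # unoptimized fallback
    return dirs


if __name__ == "__main__":
    for d in hwcaps_search_dirs("/usr/lib64", "x86-64-v3"):
        print(d)
```

Under these assumptions, an AVX2-optimized build of a library would be installed as `/usr/lib64/glibc-hwcaps/x86-64-v3/libfoo.so` (the library name is a placeholder) and would be selected on any vendor's CPU that meets the v3 level.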
Comment 18 Joey Riches 2021-01-03 19:19:53 UTC
Will glibc still load from `/usr/lib64/haswell` or will that functionality be stripped out?

Unfortunately x86_64-v3 includes XSAVE instructions which excludes haswell so moving to x86_64-v3 will be a performance regression for our haswell users.
Comment 19 Florian Weimer 2021-01-04 08:02:06 UTC
(In reply to Joey Riches from comment #18)
> Will glibc still load from from `/usr/lib64/haswell` or will that
> functionality be stripped out? 

No formal decision has been made about this.

> Unfortunately x86_64-v3 includes XSAVE instructions which excludes haswell
> so moving to x86_64-v3 will be a performance regression for our haswell
> users.

XSAVE is *required* for AVX register support. If a Haswell-based system does not support this feature, the kernel has been booted with the “noxsave” option, or a hypervisor has been misconfigured.
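
The point can be sketched as a set check. The feature lists below are written from memory of the psABI levels and should be verified against the psABI commit linked in comment 16; the function name and the spelling of the feature names are assumptions for illustration only.

```python
# Sketch: why a Haswell system without OS XSAVE support does not reach v3.
# Feature lists are an assumption based on the x86-64 psABI levels;
# check them against the psABI document before relying on them.

X86_64_V2 = frozenset(
    {"CMPXCHG16B", "LAHF-SAHF", "POPCNT", "SSE3", "SSE4_1", "SSE4_2", "SSSE3"}
)
X86_64_V3 = X86_64_V2 | frozenset(
    {"AVX", "AVX2", "BMI1", "BMI2", "F16C", "FMA", "LZCNT", "MOVBE", "OSXSAVE"}
)


def meets_x86_64_v3(features):
    """True if every feature assumed for x86-64-v3 is present."""
    return X86_64_V3 <= set(features)


if __name__ == "__main__":
    haswell = set(X86_64_V3)
    print(meets_x86_64_v3(haswell))
    # Models booting with "noxsave" or a misconfigured hypervisor:
    print(meets_x86_64_v3(haswell - {"OSXSAVE"}))
```

A normally configured Haswell system reports OSXSAVE, so it satisfies the level. Only the misconfigured cases described above fall short.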
Comment 20 Joey Riches 2021-01-05 19:27:11 UTC
Thanks for the clarification, the confusion stemmed from gcc docs not listing xsave for haswell. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

Thanks again for all the work on this issue.
Comment 21 Florian Weimer 2021-01-05 20:03:29 UTC
(In reply to Joey Riches from comment #20)
> Thanks for the clarification, the confusion stemmed from gcc docs not
> listing xsave for haswell.
> https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

Ah, right, that's one of the issues raised in the sister bug 24080.