This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
> -----Original Message----- > From: Carlos O'Donell <carlos@redhat.com> > Sent: Friday, July 6, 2018 1:40 AM > To: Pawar, Amit <Amit.Pawar@amd.com>; libc-alpha@sourceware.org > Cc: H.J. Lu <hjl.tools@gmail.com> > Subject: Re: Enable AVX_Fast_Unaligned_Load by default for Zen > > On 07/02/2018 05:56 AM, Pawar, Amit wrote: > > Hi all, > > > > Attached patch enables AVX_Fast_Unaligned_Load flag by default for Zen > and Zen+ except Excavator case. Identifying this flag is moved to common > path now. > > > > OK to commit ? if so please do it from my side. > > > > Thanks > > Amit Pawar > > > > > > 0001-Preferring-AVX_Fast_Unaligned_Load-as-default-from-Z.patch > > > > > > From d751c3ba0b98bbdc0c7363980eacb7b29657cb59 Mon Sep 17 00:00:00 > 2001 > > From: Amit Pawar <Amit.Pawar@amd.com> > > Date: Mon, 2 Jul 2018 14:52:36 +0530 > > Subject: [PATCH] Preferring AVX_Fast_Unaligned_Load as default from > Zen. > > > > From Zen onwards this will be enabled. It was disabled for Excavator > > case and same will unchanged. > > > > * sysdeps/x86/cpu-features.c (get_common_indeces): > > AVX_Fast_Unaligned_Load is enabled when AVX2 is detected. > > * sysdeps/x86/cpu-features.c (init_cpu_features): > > AVX_Fast_Unaligned_Load is disabled for Excavator core. > > --- > > sysdeps/x86/cpu-features.c | 16 +++++++++++----- > > 1 file changed, 11 insertions(+), 5 deletions(-) > > > > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c > > index 0fc3674..01525db 100644 > > --- a/sysdeps/x86/cpu-features.c > > +++ b/sysdeps/x86/cpu-features.c > > @@ -78,8 +78,15 @@ get_common_indeces (struct cpu_features > *cpu_features, > > /* The following features depend on AVX being usable. */ > > /* Determine if AVX2 is usable. */ > > if (CPU_FEATURES_CPU_P (cpu_features, AVX2)) > > + { > > cpu_features->feature[index_arch_AVX2_Usable] > > |= bit_arch_AVX2_Usable; > > + > > + /* Unaligned load with 256-bit AVX registers are faster on > > + * Intel/AMD processors with AVX2. */ > > + cpu_features- > >feature[index_arch_AVX_Fast_Unaligned_Load] > > + |= bit_arch_AVX_Fast_Unaligned_Load; > > OK. This moves the check to the common path, so Intel and AMD checks can > use the feature. > > > + } > > /* Determine if FMA is usable. */ > > if (CPU_FEATURES_CPU_P (cpu_features, FMA)) > > cpu_features->feature[index_arch_FMA_Usable] > > @@ -298,11 +305,6 @@ init_cpu_features (struct cpu_features > *cpu_features) > > } > > } > > > > - /* Unaligned load with 256-bit AVX registers are faster on > > - Intel processors with AVX2. */ > > - if (CPU_FEATURES_ARCH_P (cpu_features, AVX2_Usable)) > > - cpu_features->feature[index_arch_AVX_Fast_Unaligned_Load] > > - |= bit_arch_AVX_Fast_Unaligned_Load; > > OK. Removes it from the Intel-only path. > > > > > /* Since AVX512ER is unique to Xeon Phi, set Prefer_No_VZEROUPPER > > if AVX512ER is available. Don't use AVX512 to avoid lower > > CPU @@ -354,6 +356,10 @@ init_cpu_features (struct cpu_features > *cpu_features) > > cpu_features->feature[index_arch_Fast_Unaligned_Load] > > |= (bit_arch_Fast_Unaligned_Load > > | bit_arch_Fast_Copy_Backward); > > + > > + /* Unaligned AVX loads are slower.*/ > > + cpu_features->feature[index_arch_AVX_Fast_Unaligned_Load] > > + &= ~bit_arch_AVX_Fast_Unaligned_Load; > > > This disables it for all family == 0x15. > > Existing code notes Excavator is model >= 0x06 && model <= 0x7f. > > If that's the case then the above code should be in a new set of brackets. Thanks for reviewing and only Excavator model supports AVX2 under family == 0x15. > > e.g. > > 352 /* "Excavator" */ > 353 if (model >= 0x60 && model <= 0x7f) > { > > 354 cpu_features->feature[index_arch_Fast_Unaligned_Load] > 355 |= (bit_arch_Fast_Unaligned_Load > 356 | bit_arch_Fast_Copy_Backward); > > /* Unaligned AVX loads are slower.*/ > cpu_features->feature[index_arch_AVX_Fast_Unaligned_Load] > &= ~bit_arch_AVX_Fast_Unaligned_Load; > } > > Can you confirm that this is correct? Yes, this is required and new attached patch contains the required change. > > Which set of conditionals identifies Excavator? This model range under family == 0x15 is only for Excavator core. Is OK to commit ? > > > } > > } > > else > > -- 2.7.4 > > > -- > Cheers, > Carlos.
Attachment:
0001-Preferring-AVX_Fast_Unaligned_Load-as-default-from-Z.patch
Description: 0001-Preferring-AVX_Fast_Unaligned_Load-as-default-from-Z.patch
Attachment:
ChangeLog
Description: ChangeLog
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |