This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: Enable AVX_Fast_Unaligned_Load by default for Zen



> -----Original Message-----
> From: Carlos O'Donell <carlos@redhat.com>
> Sent: Friday, July 6, 2018 1:40 AM
> To: Pawar, Amit <Amit.Pawar@amd.com>; libc-alpha@sourceware.org
> Cc: H.J. Lu <hjl.tools@gmail.com>
> Subject: Re: Enable AVX_Fast_Unaligned_Load by default for Zen
> 
> On 07/02/2018 05:56 AM, Pawar, Amit wrote:
> > Hi all,
> >
> > Attached patch enables AVX_Fast_Unaligned_Load flag by default for Zen
> and Zen+ except Excavator case. Identifying this flag is moved to common
> path now.
> >
> > OK to commit ? if so please do it from my side.
> >
> > Thanks
> > Amit Pawar
> >
> >
> > 0001-Preferring-AVX_Fast_Unaligned_Load-as-default-from-Z.patch
> >
> >
> > From d751c3ba0b98bbdc0c7363980eacb7b29657cb59 Mon Sep 17 00:00:00
> 2001
> > From: Amit Pawar <Amit.Pawar@amd.com>
> > Date: Mon, 2 Jul 2018 14:52:36 +0530
> > Subject: [PATCH] Preferring AVX_Fast_Unaligned_Load as default from
> Zen.
> >
> > From Zen onwards this will be enabled. It was disabled for Excavator
> > case and same will unchanged.
> >
> > 	* sysdeps/x86/cpu-features.c (get_common_indeces):
> > 	AVX_Fast_Unaligned_Load is enabled when AVX2 is detected.
> > 	* sysdeps/x86/cpu-features.c (init_cpu_features):
> > 	AVX_Fast_Unaligned_Load is disabled for Excavator core.
> > ---
> >  sysdeps/x86/cpu-features.c | 16 +++++++++++-----
> >  1 file changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> > index 0fc3674..01525db 100644
> > --- a/sysdeps/x86/cpu-features.c
> > +++ b/sysdeps/x86/cpu-features.c
> > @@ -78,8 +78,15 @@ get_common_indeces (struct cpu_features
> *cpu_features,
> >  	      /* The following features depend on AVX being usable.  */
> >  	      /* Determine if AVX2 is usable.  */
> >  	      if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
> > +	      {
> >  		cpu_features->feature[index_arch_AVX2_Usable]
> >  		  |= bit_arch_AVX2_Usable;
> > +
> > +                /* Unaligned load with 256-bit AVX registers are faster on
> > +		 * Intel/AMD processors with AVX2.  */
> > +		cpu_features-
> >feature[index_arch_AVX_Fast_Unaligned_Load]
> > +		  |= bit_arch_AVX_Fast_Unaligned_Load;
> 
> OK. This moves the check to the common path, so Intel and AMD checks can
> use the feature.
> 
> > +	      }
> >  	      /* Determine if FMA is usable.  */
> >  	      if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> >  		cpu_features->feature[index_arch_FMA_Usable]
> > @@ -298,11 +305,6 @@ init_cpu_features (struct cpu_features
> *cpu_features)
> >  	    }
> >  	}
> >
> > -      /* Unaligned load with 256-bit AVX registers are faster on
> > -	 Intel processors with AVX2.  */
> > -      if (CPU_FEATURES_ARCH_P (cpu_features, AVX2_Usable))
> > -	cpu_features->feature[index_arch_AVX_Fast_Unaligned_Load]
> > -	  |= bit_arch_AVX_Fast_Unaligned_Load;
> 
> OK. Removes it from the Intel-only path.
> 
> >
> >        /* Since AVX512ER is unique to Xeon Phi, set Prefer_No_VZEROUPPER
> >           if AVX512ER is available.  Don't use AVX512 to avoid lower
> > CPU @@ -354,6 +356,10 @@ init_cpu_features (struct cpu_features
> *cpu_features)
> >  	    cpu_features->feature[index_arch_Fast_Unaligned_Load]
> >  	      |= (bit_arch_Fast_Unaligned_Load
> >  		  | bit_arch_Fast_Copy_Backward);
> > +
> > +	    /* Unaligned AVX loads are slower.*/
> > +	    cpu_features->feature[index_arch_AVX_Fast_Unaligned_Load]
> > +		  &= ~bit_arch_AVX_Fast_Unaligned_Load;
> 
> 
> This disables it for all family == 0x15.
> 
> Existing code notes Excavator is model >= 0x06 && model <= 0x7f.
> 
> If that's the case then the above code should be in  a new set of brackets.
Thanks for reviewing and only Excavator model supports AVX2 under family == 0x15.
> 
> e.g.
> 
> 352           /* "Excavator"   */
> 353           if (model >= 0x60 && model <= 0x7f)
>                 {
> 
> 354               cpu_features->feature[index_arch_Fast_Unaligned_Load]
> 355			|= (bit_arch_Fast_Unaligned_Load
> 356			| bit_arch_Fast_Copy_Backward);
> 
>                   /* Unaligned AVX loads are slower.*/
>                   cpu_features->feature[index_arch_AVX_Fast_Unaligned_Load]
> 			&= ~bit_arch_AVX_Fast_Unaligned_Load;
>                 }
> 
> Can you confirm that this is correct?
Yes, this is required and new attached patch contains the required change.
> 
> Which set of conditionals identifies Excavator?
This model range under family == 0x15 is only for Excavator core.
Is OK to commit ?
> 
> >  	}
> >      }
> >    else
> > -- 2.7.4
> 
> 
> --
> Cheers,
> Carlos.

Attachment: 0001-Preferring-AVX_Fast_Unaligned_Load-as-default-from-Z.patch
Description: 0001-Preferring-AVX_Fast_Unaligned_Load-as-default-from-Z.patch

Attachment: ChangeLog
Description: ChangeLog


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]