This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][X86_64] Set bit_Fast_Unaligned_Load for Excavator family CPU's


On 12/16/2015 05:03 AM, Pawar, Amit wrote:
> For latest AMD Excavator family 0x15h CPU's, it is found that by
> running GLIBC benchtest SSE2_Unaligned based implementations are
> performing faster compare to SSE2 based implementations for routines
> mentioned below. Based on this observation bit_Fast_Unaligned_Load
> flag is set for Excavator family CPU's. This makes SSE2_Unaligned
> implementations as default for these routines.
> 
> Routines: strcmp, strcat, strncat, stpcpy, stpncpy, strcpy, strncpy and strstr. 
> 
> Is attached patch OK for trunk ?

The implementation looks correct, but it needs some cleanup.

Next steps:

(1) Please review the contribution checklist:
https://sourceware.org/glibc/wiki/Contribution%20checklist

(2) This is a user visible change and needs a bug filed for it.
    File a bug please. Once you fix the bug you mark the bug fixed
    and set the milestone to the devel release you fixed it in e.g.
    if you fix it now, you set the milestone to 2.23.
    Include your bug number as requested below.

(3) Cleanup nits e.g. unused ecx, wrong indentation etc.

(4) Repost v2 and indicate changes from v1.

> From dba107d0483fff9a1440216fdb17a0a4cab28bfa Mon Sep 17 00:00:00 2001
> From: Amit Pawar <Amit.Pawar@amd.com>
> Date: Wed, 16 Dec 2015 15:23:24 +0530
> Subject: [PATCH] X86_64: Set bit_Fast_Unaligned_Load for family 0x15h
>  Excavator CPU's.

Add "(Bug XXXX)" to subject when you have a bug number.

> 
> GLIBC benchtest testcases shows SSE2_Unaligned based implementations
> are performing faster compare to SSE2 based implementations for
> routines: strcmp, strcat, strncat, stpcpy, stpncpy, strcpy, strncpy
> and strstr. Flag bit_Fast_Unaligned_Load is set for Excavator family
> 0x15h CPU's. This makes SSE2_Unaligned based implementations as
> default for these routines.
> 
> 	* sysdeps/x86/cpu-features.c (init_cpu_features):
> 	Set bit_Fast_Unaligned_Load flag for Excavator family 0x15h CPU's.
> 	This makes SSE2_Unaligned version as default for string routines like
> 	strcmp, strcat, strncat, stpcpy, stpncpy, strcpy, strncpy and strstr.

Just the canonical changes:

	* sysdeps/x86/cpu-features.c (init_cpu_features): If CPU is Excavator
	then set index_Fast_Unaligned_Load.

Your comment above includes all the information you need.

> ---
>  ChangeLog                  |  7 +++++++
>  sysdeps/x86/cpu-features.c | 11 +++++++++++
>  2 files changed, 18 insertions(+)
> 

Add the BZ # to the ChangeLog.

> +2015-12-16  Amit Pawar  <amit.pawar@amd.com>
> +

	[BZ #XXXX]
> +	* sysdeps/x86/cpu-features.c (init_cpu_features):
> +	Set bit_Fast_Unaligned_Load flag for Excavator family 0x15h CPU's.
> +	This makes SSE2_Unaligned version as default for string routines like
> +	strcmp, strcat, strncat, stpcpy, stpncpy, strcpy, strncpy and strstr.

Again, just the canonical changes:

	* sysdeps/x86/cpu-features.c (init_cpu_features): If CPU is Excavator
	then set index_Fast_Unaligned_Load.

Your git commit message should include these details, but the ChangeLog should
be more terse to help people find related changes.

> +
>  2015-12-15  H.J. Lu  <hongjiu.lu@intel.com>
>  
>  	[BZ #19367]
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index aff894c..cedbc45 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -152,6 +152,17 @@ init_cpu_features (struct cpu_features *cpu_features)
>  		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ebx,
>  		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx,
>  		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx);
> +
> +      if (family == 0x15)
> +	{
> +	  ecx = cpu_features->cpuid[COMMON_CPUID_INDEX_1].ecx;

Why set ecx? What uses it?

> +
> +	  /* "Excavator"   */
> +	  if (model >= 0x60 && model <= 0x7f)
> +	      cpu_features->feature[index_Fast_Unaligned_Load]
> +		|= bit_Fast_Unaligned_Load;

Wrong indentation?

Should be (with leading indentation removed):

if (model >= 0x60 && model <= 0x7f)
  cpu_features->feature[index_Fast_Unaligned_Load]
    |= bit_Fast_Unaligned_Load;

And then indent that as required for the braces.

> +
> +	}
>      }
>    else
>      kind = arch_kind_other;
> -- 1.8.3.1

Cheers,
Carlos.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]