This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] x86_64: memcpy/memmove family optimized with AVX512

From: "Carlos O'Donell" <carlos at redhat dot com>
To: "Senkevich, Andrew" <andrew dot senkevich at intel dot com>, Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
Cc: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, "Carlos O'Donell" <codonell at redhat dot com>
Date: Wed, 13 Jan 2016 13:22:22 -0500
Subject: Re: [PATCH] x86_64: memcpy/memmove family optimized with AVX512
Authentication-results: sourceware.org; auth=none
References: <569547D3 dot 1070002 at linaro dot org> <D373487E0338A646909492FF43BA8BE32C436954 at CDSMSX102 dot ccr dot corp dot intel dot com>

On 01/13/2016 01:10 PM, Senkevich, Andrew wrote:
>> On 12-01-2016 12:13, Andrew Senkevich wrote:
>>> Hi,
>>> 
>>> here is AVX512 implementations of memcpy, mempcpy, memmove, 
>>> memcpy_chk, mempcpy_chk, memmove_chk. It shows average
>>> improvement more than 30% over AVX versions on KNL hardware,
>>> performance results attached. Ok for trunk?
>> 
>> It is too late for 2.23, but ok after review for 2.24.
> 
> We would like this patch to be considered for glibc 2.23 since the
> functionality completes AVX-512 improvements of mem* routines. Memset
> tuned for AVX-512 is already checked in so it looks reasonable to
> have full support in 2.23. Also the changes are strongly AVX-512
> specific, not adding any new interfaces so potential risk of the
> patch is pretty low.
> 
> We already got review comments without any major questions to the
> patch and fixed version will be ready today.
> 
> Given all this can the patch go to current glibc trunk after review
> is finished?

I've completed my review and the patches look good. I had one question,
which should not block acceptance for 2.23: I wanted to know how AVX512
scales across threads in a process, and how these routines perform
versus the others from a scalability perspective (e.g. some functional
groups might not scale well due to interlocks and other reasons).
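
To make the scalability question concrete, here is a minimal sketch of
the kind of measurement I have in mind. It is not part of the patch or
of the glibc benchtests; the names copy_job, copy_worker and
copy_throughput are hypothetical, and it assumes POSIX threads,
clock_gettime and a GCC-compatible compiler (for the empty asm barrier).
Each thread copies between its own private buffers, and the function
reports aggregate bytes per second.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical per-thread work description (illustration only). */
struct copy_job {
  size_t bytes; /* size of each private buffer */
  int iters;    /* number of memcpy calls to issue */
  int ok;       /* set to 1 once the thread finished its copies */
};

static void *copy_worker(void *arg)
{
  struct copy_job *job = arg;
  char *src = malloc(job->bytes);
  char *dst = malloc(job->bytes);

  if (src != NULL && dst != NULL) {
    memset(src, 0x5a, job->bytes);
    for (int i = 0; i < job->iters; i++) {
      memcpy(dst, src, job->bytes);
      /* Compiler barrier (GCC extension) so the copy is not elided. */
      __asm__ volatile("" : : "r"(dst) : "memory");
    }
    job->ok = 1;
  }
  free(src);
  free(dst);
  return NULL;
}

/* Run nthreads concurrent copiers and return aggregate bytes/second,
   or -1.0 on bad arguments or failure.  The timed region includes
   thread startup and buffer setup, so use a large iters value. */
double copy_throughput(int nthreads, size_t bytes, int iters)
{
  if (nthreads <= 0 || bytes == 0 || iters <= 0)
    return -1.0;

  pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
  struct copy_job *jobs = calloc((size_t)nthreads, sizeof *jobs);
  if (tids == NULL || jobs == NULL) {
    free(tids);
    free(jobs);
    return -1.0;
  }

  struct timespec t0, t1;
  clock_gettime(CLOCK_MONOTONIC, &t0);

  int started = 0;
  for (int i = 0; i < nthreads; i++) {
    jobs[i].bytes = bytes;
    jobs[i].iters = iters;
    if (pthread_create(&tids[i], NULL, copy_worker, &jobs[i]) != 0)
      break;
    started++;
  }

  int all_ok = (started == nthreads);
  for (int i = 0; i < started; i++) {
    pthread_join(tids[i], NULL);
    if (!jobs[i].ok)
      all_ok = 0;
  }
  clock_gettime(CLOCK_MONOTONIC, &t1);

  free(tids);
  free(jobs);

  double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
  if (!all_ok || secs <= 0.0)
    return -1.0;
  return (double)nthreads * (double)iters * (double)bytes / secs;
}
```

Comparing the result for one thread against the result for N threads,
once with each memcpy variant in use, would give a rough picture of
whether the AVX512 routines scale as well as the existing ones.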

Given that the AVX512 work is already half complete (the AVX512 memset
is checked in), I think we should accept Intel's patches and complete
the work for 2.23.

A NEWS note is required to highlight the new support for AVX512.

Cheers,
Carlos.
