This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765]
- From: Stefan Liebler <stli at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Wed, 27 Apr 2016 10:14:56 +0200
- Subject: Re: [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765]
- Authentication-results: sourceware.org; auth=none
- References: <1461672469-2107-1-git-send-email-stli at linux dot vnet dot ibm dot com> <1461672469-2107-4-git-send-email-stli at linux dot vnet dot ibm dot com> <571F6E07 dot 6010403 at linaro dot org>
On 04/26/2016 03:32 PM, Adhemerval Zanella wrote:
On 26/04/2016 09:07, Stefan Liebler wrote:
There exist optimized memcpy functions on s390, but no optimized mempcpy.
This patch adds mempcpy entry points in memcpy.S files, which
use the memcpy implementation. Now mempcpy itself is also an IFUNC function
as memcpy is and the variants are listed in ifunc-impl-list.c.
The s390 string.h defines _HAVE_STRING_ARCH_mempcpy and __mempcpy/mempcpy
definitions. If n is constant and small enough, __mempcpy/mempcpy is an inlined
function, which uses memcpy and returns dest + n. In this constant case,
GCC emits instructions like mvi or mvc and avoids the function call to memcpy.
If n is not constant, then __mempcpy is called.
AFAIk GLIBC does not have any ports that implements mempcpy and not
memcpy and IMHO the inline version to convert mempcpy to memcpy is
always a gain.
So I would suggest to just avoid adding arch-specific bits and move the
mempcpy inline optimization at string/string.h to string/string2.h as
--
#define mempcpy(dest, src, n) __mempcpy_inline (dest, src, n)
#define __mempcpy(dest, src, n) __mempcpy_inline (dest, src, n)
__extern_always_inline void *
__mempcpy_inline (void *__restrict __dest,
const void *__restrict __src, size_t __n)
{
return (char *) memcpy (__dest, __src, __n) + __n;
}
--
This is convert any mempcpy to mempcy plus length and provide
the symbol only for older binaries.
According to the current behaviour on s390, a call to memcpy with
constant byte-length below 65536 is done by emitting mvc, mvi or
a suitable instruction sequence.
If gcc don't know about a constant byte-length, then memcpy function
in glibc is called.
For mempcpy cascades like in iconv/gconv_conf.c:
466: __mempcpy (__mempcpy (__mempcpy (gconv_path, __gconv_path_envvar,
user_len),
":", 1),
default_gconv_path, sizeof (default_gconv_path));
The current behaviour is to perform a call to memcpy(), add user_len to
the return value of the call and use mvi, mvc for the two calls with
constant byte-lengths.
If only _HAVE_STRING_ARCH_mempcpy is defined, there would be three calls
to mempcpy. Here you are right, this is not what you want.
The behaviour with this patch is to call mempcpy() once and use mvi, mvc
directly without the need to add user_len after the mempcpy call.
This way, there is no need to store user_len in an extra register or on
stack.
I don't know the conditions on other archs, whether memcpy is inlined or
called, thus I added the s390-specific condition to s390-string.h.