This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


mempcpy performance.

From: Ondřej Bílka <neleai at seznam dot cz>
To: Wilco Dijkstra <wdijkstr at arm dot com>
Cc: eggert at cs dot ucla dot edu, libc-alpha at sourceware dot org
Date: Fri, 19 Dec 2014 23:06:14 +0100
Subject: mempcpy performance.
Authentication-results: sourceware.org; auth=none
References: <000f01d01ad0$17e381d0$47aa8570$ at com>

On Thu, Dec 18, 2014 at 02:37:04PM -0000, Wilco Dijkstra wrote:
> Ondřej Bílka wrote:
> >> On Tue, Dec 16, 2014 at 12:50:38PM -0800, Paul Eggert wrote:
> >> Thanks, this is much better than worrying about how to pacify GCC.
> >> The code could be made a bit shorter and clearer with mempcpy, and
> >> there's no longer any need to distinguish between s and s1, so I
> >> suggest the following minor rewrite, which shrinks the code size by
> >> another 26 bytes (16%) on my x86-64 platform.
> > 
> >> char *
> >> STRNCAT (char *s1, const char *s2, size_t n)
> >> {
> >>   char *s1_end = mempcpy (s1 + strlen (s1), s2, __strnlen (s2, n));
> >>   *s1_end = '\0';
> >>   return s1;
> >> }
> > That looks better (with minor fix s/mempcpy/__mempcpy/), anybody objects
> > using this instead?
> 
> I don't think mempcpy is better, few targets define it (ARM, MIPS, AARCH64
> don't for example), so it means an extra call and return. Mempcpy is also 
> non-standard and rarely used, so would not be cache resident even if you 
> have an optimized implementation.
> 
Bit off topic,

Fixing that is on my todo list: on architectures without an assembly
implementation, change the definition to

#define mempcpy(dest, src, n) (memcpy (dest, src, n) + n)

which would remove the extra call and possibly allow additional compiler
optimizations.
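
For illustration only, here is a minimal sketch of the same idea as an
inline function rather than a macro. The names my_mempcpy, my_strncat and
self_check are hypothetical and not part of glibc. The cast to char *
avoids arithmetic on void *, which is a GNU extension, and a function
evaluates n only once, where the macro form evaluates it twice. The
length is computed with memchr so the sketch does not depend on strnlen
being declared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name, chosen so it cannot clash with a libc mempcpy.  */
static inline void *
my_mempcpy (void *dest, const void *src, size_t n)
{
  /* memcpy returns dest; adding n gives the byte just past the copy.  */
  return (char *) memcpy (dest, src, n) + n;
}

/* strncat written along the lines of the rewrite quoted above.
   Hypothetical name; the length is bounded by n via memchr.  */
static char *
my_strncat (char *s1, const char *s2, size_t n)
{
  const char *nul = memchr (s2, '\0', n);
  size_t len = nul != NULL ? (size_t) (nul - s2) : n;
  char *s1_end = my_mempcpy (s1 + strlen (s1), s2, len);
  *s1_end = '\0';
  return s1;
}

/* Small self-check of both helpers.  */
static void
self_check (void)
{
  char buf[16] = "foo";
  assert (my_strncat (buf, "barbaz", 3) == buf);
  assert (strcmp (buf, "foobar") == 0);
}
```

Because the wrapper is visible to the compiler, the call can be inlined
and the return value computed without a separate call and return.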

With an assembly implementation, cache residency is not a big problem:
most of the time it is just a few instructions that set up the return
value and then jump to memcpy.

I have a patch that does that instead of wasting space. It has gone
unreviewed for a year, and it also needs to be fixed on i386 and powerpc.

Follow-Ups:
- RE: mempcpy performance.
  - From: Wilco Dijkstra

References:
- Re: [PATCH] Simplify strncat.
  - From: Wilco Dijkstra
