This is the mail archive of the mailing list for the glibc project.


mempcpy performance.

On Thu, Dec 18, 2014 at 02:37:04PM -0000, Wilco Dijkstra wrote:
> Ondřej Bílka wrote:
> >> On Tue, Dec 16, 2014 at 12:50:38PM -0800, Paul Eggert wrote:
> >> Thanks, this is much better than worrying about how to pacify GCC.
> >> The code could be made a bit shorter and clearer with mempcpy, and
> >> there's no longer any need to distinguish between s and s1, so I
> >> suggest the following minor rewrite, which shrinks the code size by
> >> another 26 bytes (16%) on my x86-64 platform.
> > 
> >> char *
> >> STRNCAT (char *s1, const char *s2, size_t n)
> >> {
> >>   char *s1_end = mempcpy (s1 + strlen (s1), s2, __strnlen (s2, n));
> >>   *s1_end = '\0';
> >>   return s1;
> >> }
> > That looks better (with minor fix s/mempcpy/__mempcpy/), anybody objects
> > using this instead?
> I don't think mempcpy is better, few targets define it (ARM, MIPS, AARCH64
> don't for example), so it means an extra call and return. Mempcpy is also 
> non-standard and rarely used, so would not be cache resident even if you 
> have an optimized implementation.
A bit off topic:

I have fixing that on my todo list: on architectures without an assembly
implementation, change the definition to

#define mempcpy(dest, src, n) (memcpy (dest, src, n) + n)

which would remove the extra call and possibly allow extra compiler
optimizations.
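
As a rough sketch of that fallback, here it is as an inline function rather
than a macro. The names my_mempcpy and my_strncat are hypothetical, picked
only to avoid clashing with the real symbols; this is not the actual glibc
patch. The cast to char * is there because arithmetic on void * is a GNU
extension, and a function evaluates n only once, which the macro form does
not. The length is computed with memchr instead of __strnlen purely to stay
within ISO C.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fallback for targets without an assembly mempcpy:
   copy with memcpy and return a pointer just past the last byte written.  */
static inline void *
my_mempcpy (void *dest, const void *src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}

/* strncat along the lines of the version quoted above, built on the
   fallback.  Hypothetical name; the real code uses STRNCAT/__strnlen.  */
static char *
my_strncat (char *s1, const char *s2, size_t n)
{
  /* Length of s2, capped at n (what strnlen would return).  */
  const char *nul = memchr (s2, '\0', n);
  size_t len = nul != NULL ? (size_t) (nul - s2) : n;

  char *s1_end = my_mempcpy (s1 + strlen (s1), s2, len);
  *s1_end = '\0';
  return s1;
}
```

With this, the compiler sees a plain memcpy call followed by an addition, so
it can inline or expand the copy as it would for any other memcpy.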

With an assembly implementation, cache residency is not a big problem; most
of the time it is just a few instructions that set up the return value and
then jump to memcpy.

I have a patch that does that instead of wasting space; it has gone
unreviewed for a year. It also needs to be fixed on i386 and powerpc.
