This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 4/4] S390: Implement mempcpy with help of memcpy. [BZ #19765]


Adhemerval Zanella wrote:
> Right, but I *think* the compiler would be smart enough to avoid the extra spilling.
> Take this example for instance [1]: using GCC 5.3 for s390x, I see no difference in the
> generated assembly if I compare the strategy I proposed (-DMEMPCPY_TO_MEMCPY) to
> the s390-specific one you are suggesting.  In the end, I am proposing that
> architecture-specific micro-optimizations should be avoided in favor of a more specific one.
> Especially the ones that tend to avoid one or two extra spills based on quite complex
> macro expansion.
>
> [1] http://pastie.org/10824072

You need to use something like this to show the difference:

return __mempcpy (__mempcpy (__mempcpy (p1, s, len), p2, 1), p3, 16);

GCC doesn't even optimize mempcpy of constant size (PR70140), so even if you do have
an optimized mempcpy, like the s390 one here, you *still* need to use memcpy for small
immediate sizes (so they get inlined), and only use mempcpy for unknown or very large sizes.
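
A minimal sketch of that kind of header trick, under stated assumptions: the
names sketch_mempcpy, sketch_mempcpy_call and SKETCH_SMALL_LIMIT are
hypothetical and not taken from glibc, the 16-byte threshold is arbitrary, and
the out-of-line function is only a stand-in for a real optimized mempcpy. It
relies on the GCC/Clang builtin __builtin_constant_p.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an optimized, out-of-line mempcpy.  A real
   target would provide its own assembly implementation here.  */
static void *
sketch_mempcpy_call (void *dst, const void *src, size_t n)
{
  return (char *) memcpy (dst, src, n) + n;
}

/* Arbitrary threshold, chosen only for illustration.  */
#define SKETCH_SMALL_LIMIT 16

/* For small compile-time-constant sizes, expand to memcpy + n so the
   compiler can inline the copy; otherwise call the out-of-line routine.
   Note: this macro evaluates N more than once.  */
#define sketch_mempcpy(dst, src, n)                                  \
  ((__builtin_constant_p (n) && (n) <= SKETCH_SMALL_LIMIT)           \
   ? (void *) ((char *) memcpy ((dst), (src), (n)) + (n))            \
   : sketch_mempcpy_call ((dst), (src), (n)))

/* Chained use in the style of the example above: one copy of unknown
   length followed by two copies of constant length.  P3 must point to
   at least 16 readable bytes.  */
static char *
sketch_join3 (char *p1, const char *s, size_t len,
              const char *p2, const char *p3)
{
  assert (p1 != NULL);
  return sketch_mempcpy (sketch_mempcpy (sketch_mempcpy (p1, s, len),
                                         p2, 1),
                         p3, 16);
}
```

Both branches return the same pointer, so the choice only affects which code
the compiler gets to inline; whether that helps on a given target would need
to be checked in the generated assembly.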

We end up having to do these header tricks because GCC doesn't implement mempcpy
as a first-class builtin or allow targets to defer to memcpy.

There are similar issues with strchr (s, 0) being used instead of the faster strlen (s) + s.
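
For illustration, the two forms compute the same pointer to the terminating
NUL. The helper names below are hypothetical; which form is faster depends on
the compiler and the libc implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: find the end of S by searching for the NUL byte.  */
static const char *
end_via_strchr (const char *s)
{
  return strchr (s, '\0');
}

/* Hypothetical helper: find the end of S by adding its length.  */
static const char *
end_via_strlen (const char *s)
{
  return s + strlen (s);
}

/* Both helpers must agree for any valid NUL-terminated string.  */
static void
check_same_end (const char *s)
{
  assert (end_via_strchr (s) == end_via_strlen (s));
}
```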

Wilco

