This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH 0/6] PowerPC64 ELFv2 PPC64_OPT_LOCALENTRY


ELFv2 functions with localentry:0 are those with a single entry point,
i.e. global entry == local entry, that have no requirement on r2 or
r12, and guarantee r2 is unchanged on return.  Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions.  That
makes the optimization attractive: the TOC pointer load-hit-store is a
major reason why calls to small functions are slow on powerpc64le,
namely functions that need no register saves or that, with
shrink-wrapping, need none on a fast path.
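
To make the localentry encoding concrete, here is a minimal sketch of
how the three localentry bits in a symbol's st_other can be decoded.
glibc's elf.h has its own macros for this; the names below
(LOCALENTRY_SHIFT, LOCALENTRY_MASK and the three helpers) are made up
for illustration so the fragment stands alone.

```c
/* The localentry field occupies the top three bits of st_other in the
   ELFv2 ABI.  Illustrative names, not the ones used in glibc.  */
#define LOCALENTRY_SHIFT 5
#define LOCALENTRY_MASK  (7u << LOCALENTRY_SHIFT)

/* Extract the raw 3-bit localentry field from st_other.  */
static unsigned int
localentry_field (unsigned char st_other)
{
  return (st_other & LOCALENTRY_MASK) >> LOCALENTRY_SHIFT;
}

/* Byte offset from the global entry point to the local entry point.
   Fields 0 and 1 give offset 0; fields 2..6 give 4, 8, 16, 32 and 64.
   Field 7 is reserved and is not treated specially here.  */
static unsigned int
localentry_offset (unsigned char st_other)
{
  return ((1u << localentry_field (st_other)) >> 2) << 2;
}

/* Non-zero when the symbol carries the localentry:0 designation.  */
static int
is_localentry0 (unsigned char st_other)
{
  return localentry_field (st_other) == 0;
}
```

Note that a zero offset alone is not enough to identify localentry:0,
since field value 1 also yields offset 0; the sketch therefore tests
the field itself.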

This patch series is the glibc part of this optimization: the checks
in ld.so necessary to ensure that functions with non-zero st_other
localentry are not called by code expecting localentry:0.  Also, lots
of powerpc64 glibc assembly didn't use the proper localentry:0
designation for functions that don't use r2, so the series fixes that
too, along with some other assorted problems I noticed along the way.
Many of the mem and str functions benefit.
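
The condition the ld.so checks enforce can be sketched roughly as
below.  This is only an illustration of the rule stated above, not the
code in the series: plt_call_is_safe and its
caller_expects_localentry0 parameter are hypothetical, and how ld.so
actually learns what the caller expects is not shown.

```c
#include <stdbool.h>

/* Top three bits of st_other hold the localentry field (ELFv2).
   Illustrative name, not the one used in glibc.  */
#define LOCALENTRY_MASK (7u << 5)

/* Hypothetical helper: return true when a PLT call may proceed.
   A caller that was built expecting localentry:0 does not save or
   restore r2, so the callee must have a zero localentry field.  */
static bool
plt_call_is_safe (bool caller_expects_localentry0,
                  unsigned char callee_st_other)
{
  if (!caller_expects_localentry0)
    return true;   /* Ordinary stub saves r2; any callee is fine.  */
  return (callee_st_other & LOCALENTRY_MASK) == 0;
}
```

A caller that makes no such assumption is unaffected; only the
combination of an optimized call site and a callee with non-zero
localentry is rejected in this sketch.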

Note that building a multiarch glibc kills this optimization for most
functions.  IFUNCs can't be called with the optimized stub without
changing the ABI.

Regression tested on powerpc64le using --with-cpu=power8,
--with-cpu=power7, and --with-cpu=power8 --enable-multiarch.

Alan Modra (6):
  PowerPC64, fix calls to _mcount
  PowerPC64 FRAME_PARM_SAVE
  PowerPC64 sysdep.h tidy
  PowerPC64 strncpy, stpncpy and strstr fixes
  PowerPC64 ENTRY_TOCLESS
  PowerPC64 ELFv2 PPC64_OPT_LOCALENTRY

 ChangeLog                                          | 195 +++++++++++++++++++
 elf/dl-runtime.c                                   |   3 +-
 elf/elf.h                                          |   3 +-
 elf/testobj6.c                                     |   3 +
 sysdeps/aarch64/dl-machine.h                       |   1 +
 sysdeps/alpha/dl-machine.h                         |   5 +-
 sysdeps/arm/dl-machine.h                           |   1 +
 sysdeps/generic/dl-machine.h                       |   7 +-
 sysdeps/hppa/dl-machine.h                          |   5 +-
 sysdeps/i386/dl-machine.h                          |   1 +
 sysdeps/ia64/dl-machine.h                          |   3 +-
 sysdeps/m68k/dl-machine.h                          |   1 +
 sysdeps/microblaze/dl-machine.h                    |   1 +
 sysdeps/mips/dl-machine.h                          |   1 +
 sysdeps/nios2/dl-machine.h                         |   1 +
 sysdeps/powerpc/powerpc32/dl-machine.h             |   1 +
 sysdeps/powerpc/powerpc64/a2/memcpy.S              |   2 +-
 sysdeps/powerpc/powerpc64/addmul_1.S               |   2 +-
 sysdeps/powerpc/powerpc64/cell/memcpy.S            |   2 +-
 sysdeps/powerpc/powerpc64/dl-machine.c             |  22 ++-
 sysdeps/powerpc/powerpc64/dl-machine.h             |  54 +++---
 sysdeps/powerpc/powerpc64/dl-trampoline.S          |   4 +-
 sysdeps/powerpc/powerpc64/fpu/s_ceil.S             |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_ceilf.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_copysign.S         |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_copysignl.S        |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_fabsl.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_floor.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_floorf.S           |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_isnan.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_llrint.S           |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_llrintf.S          |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S        |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S       |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_rint.S             |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_rintf.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_round.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_roundf.S           |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_trunc.S            |   2 +-
 sysdeps/powerpc/powerpc64/fpu/s_truncf.S           |   2 +-
 sysdeps/powerpc/powerpc64/lshift.S                 |   2 +-
 sysdeps/powerpc/powerpc64/memcpy.S                 |   2 +-
 sysdeps/powerpc/powerpc64/memset.S                 |   2 +-
 sysdeps/powerpc/powerpc64/mul_1.S                  |   2 +-
 .../powerpc/powerpc64/multiarch/stpncpy-power7.S   |   3 +
 .../powerpc/powerpc64/multiarch/stpncpy-power8.S   |   5 +
 .../powerpc/powerpc64/multiarch/strncpy-power7.S   |   3 +
 .../powerpc/powerpc64/multiarch/strncpy-power8.S   |   3 +
 .../powerpc/powerpc64/multiarch/strrchr-power8.S   |  21 +-
 .../powerpc/powerpc64/multiarch/strstr-power7.S    |   5 +
 sysdeps/powerpc/powerpc64/power4/memcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power4/memcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power4/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power4/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S     |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S   |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S  |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_round.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S   |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_trunc.S    |   2 +-
 sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S   |   2 +-
 sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S  |   2 +-
 sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power6/memcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power6/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S    |   2 +-
 sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S   |   2 +-
 sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S  |   2 +-
 sysdeps/powerpc/powerpc64/power7/add_n.S           |   2 +-
 sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S    |   2 +-
 sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S     |   2 +-
 sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power7/memchr.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/memcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/memcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/memmove.S         |   4 +-
 sysdeps/powerpc/powerpc64/power7/mempcpy.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/memrchr.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power7/rawmemchr.S       |   2 +-
 sysdeps/powerpc/powerpc64/power7/strcasecmp.S      |   3 +-
 sysdeps/powerpc/powerpc64/power7/strchr.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/strchrnul.S       |   2 +-
 sysdeps/powerpc/powerpc64/power7/strcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/strlen.S          |   2 +-
 sysdeps/powerpc/powerpc64/power7/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/strncpy.S         |   9 +-
 sysdeps/powerpc/powerpc64/power7/strnlen.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/strrchr.S         |   2 +-
 sysdeps/powerpc/powerpc64/power7/strstr.S          |  16 +-
 sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S    |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S     |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S     |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S    |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S   |   2 +-
 sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/memcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/memset.S          |   4 +-
 sysdeps/powerpc/powerpc64/power8/strcasestr.S      |   2 +-
 sysdeps/powerpc/powerpc64/power8/strchr.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strcpy.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strlen.S          |   2 +-
 sysdeps/powerpc/powerpc64/power8/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/power8/strncpy.S         |  10 +-
 sysdeps/powerpc/powerpc64/power8/strnlen.S         |   2 +-
 sysdeps/powerpc/powerpc64/power8/strrchr.S         |   2 +-
 sysdeps/powerpc/powerpc64/power8/strspn.S          |   2 +-
 sysdeps/powerpc/powerpc64/power9/strcmp.S          |   2 +-
 sysdeps/powerpc/powerpc64/power9/strncmp.S         |   2 +-
 sysdeps/powerpc/powerpc64/ppc-mcount.S             |   4 +-
 sysdeps/powerpc/powerpc64/start.S                  |   4 +-
 sysdeps/powerpc/powerpc64/strchr.S                 |   2 +-
 sysdeps/powerpc/powerpc64/strcmp.S                 |   2 +-
 sysdeps/powerpc/powerpc64/strlen.S                 |   2 +-
 sysdeps/powerpc/powerpc64/strncmp.S                |   2 +-
 sysdeps/powerpc/powerpc64/sysdep.h                 | 213 ++++++++++-----------
 sysdeps/s390/s390-32/dl-machine.h                  |   1 +
 sysdeps/s390/s390-64/dl-machine.h                  |   1 +
 sysdeps/sh/dl-machine.h                            |   1 +
 sysdeps/sparc/sparc32/dl-machine.h                 |   1 +
 sysdeps/sparc/sparc64/dl-machine.h                 |   1 +
 sysdeps/tile/dl-machine.h                          |   3 +-
 .../sysv/linux/powerpc/powerpc64/makecontext.S     |  26 +--
 sysdeps/x86_64/dl-machine.h                        |   1 +
 130 files changed, 562 insertions(+), 274 deletions(-)


-- 
Alan Modra
Australia Development Lab, IBM

