This is the mail archive of the
libc-ports@sources.redhat.com
mailing list for the libc-ports project.
[PATCH 00/26] ARM improvements
- From: Richard Henderson <rth at twiddle dot net>
- To: libc-ports at sourceware dot org
- Cc: Joseph Myers <joseph at codesourcery dot com>
- Date: Tue, 26 Feb 2013 19:16:00 -0800
- Subject: [PATCH 00/26] ARM improvements
The first two patches are required to get glibc to build with gcc 4.8
on armv7. Otherwise it doesn't actually notice armv7 and configures
for the default armv4.
The third patch I thought I was going to need for an armv6 (not t2)
implementation of addmul_1 (using umaal). But in the end I managed
to match the speed of the umaal version with the default umlal version,
so I opted not to submit the umaal version at all. The third patch
could be dropped, but I think it makes sense to keep it.
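
For readers unfamiliar with the two instructions, their arithmetic can be
modelled in C. This is only a sketch of the instruction semantics as
documented in the ARM architecture manual; the names umaal_model and
umlal_model are made up for illustration and do not appear in the patches.

```c
#include <stdint.h>

/* Model of UMAAL RdLo, RdHi, Rn, Rm:
     RdHi:RdLo = Rn * Rm + RdLo + RdHi
   The sum can never overflow 64 bits, because
     (2^32 - 1)^2 + 2 * (2^32 - 1) == 2^64 - 1.  */
static void
umaal_model (uint32_t *lo, uint32_t *hi, uint32_t n, uint32_t m)
{
  uint64_t r = (uint64_t) n * m + *lo + *hi;
  *lo = (uint32_t) r;
  *hi = (uint32_t) (r >> 32);
}

/* Model of UMLAL RdLo, RdHi, Rn, Rm:
     RdHi:RdLo = Rn * Rm + (RdHi:RdLo), wrapping modulo 2^64.
   Only one 64-bit addend is folded in, so a second 32-bit addend
   (such as an incoming carry) needs separate add/adc instructions.  */
static void
umlal_model (uint32_t *lo, uint32_t *hi, uint32_t n, uint32_t m)
{
  uint64_t acc = ((uint64_t) *hi << 32) | *lo;
  acc += (uint64_t) n * m;
  *lo = (uint32_t) acc;
  *hi = (uint32_t) (acc >> 32);
}
```

The model says nothing about cycle counts; it only shows why umaal looks
attractive for a multiply-accumulate loop that carries a limb between
iterations.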
Patches 4-18 improve the ability to build libc as a thumb2 binary.
In the end, almost all assembly is done in thumb2 mode if -mthumb
is present in ASFLAGS. It's that last condition that's the sticky part: by
default we copy only a couple of flags over from CFLAGS. I'm not
sure why we're not passing them all to the assembler. So at the
moment I'm just putting ASFLAGS on the make command-line to get
what I want.
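
As a hedged illustration of what -mthumb changes from the source's point of
view: GCC predefines __thumb__ when generating Thumb code and additionally
__thumb2__ when Thumb-2 is available, and a preprocessed .S file passed
through the compiler driver sees the same macros. The function name
arm_insn_mode below is hypothetical; the sketch is not taken from the
patches and only shows how the mode can be detected at compile time.

```c
/* Report which instruction set this translation unit is being built
   for, based on the compiler's predefined macros.  __thumb2__ is
   checked first because __thumb__ is also defined in that case.  */
static const char *
arm_insn_mode (void)
{
#if defined (__thumb2__)
  return "thumb2";
#elif defined (__thumb__)
  return "thumb1";
#elif defined (__arm__)
  return "arm";
#else
  return "not-arm";
#endif
}
```

Building the same file with and without -mthumb on an armv7 toolchain
should change the reported mode, which makes it a quick check of whether
a flag actually reached the compiler driver.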
Patches 19-23 add improved string routines for armv6t2. I've had these
hanging around for almost 2 years without properly submitting them,
which is perhaps a bit silly. But the A8 host I was originally doing
testing on has a dreadfully low resolution clock, so it was hard to get
real numbers, whereas the A15 has a 1ns resolution CLOCK_MONOTONIC_RAW.
I can post the benchmarks under separate cover if you like.
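
The clock resolution mentioned above can be checked with clock_getres. The
following is a minimal sketch, assuming a Linux/glibc host; the helper
names clock_resolution_ns and elapsed_ns are invented for illustration and
this is not the benchmark harness used for the numbers referred to here.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1
#endif
#include <time.h>

/* Return the advertised resolution of clock ID in nanoseconds,
   or -1 if the clock is not available on this system.  */
static long long
clock_resolution_ns (clockid_t id)
{
  struct timespec res;
  if (clock_getres (id, &res) != 0)
    return -1;
  return (long long) res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Return END - START in nanoseconds.  Intended for two samples taken
   with clock_gettime on the same clock, START first.  */
static long long
elapsed_ns (const struct timespec *start, const struct timespec *end)
{
  return (long long) (end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}
```

Calling clock_resolution_ns (CLOCK_MONOTONIC_RAW) before timing anything
indicates whether short string routines can be measured per call or need
to be run in a loop to rise above the clock's granularity.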
Patches 24-26 add improved gmp routines for armv4. They're written
from scratch, as I understand that glibc is LGPL2.1, and gmp is GPL3.
They're significantly faster than the generic defaults, and they're
of similar performance to gmp on the A15 (though probably not on the
in-order cores). For the sizes of multiplies that we're going to
encounter inside glibc, they're probably sufficient.
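
For context, the operation that addmul_1 performs can be written as a
portable C reference. This is a sketch of the conventional mpn-style
contract (multiply a limb vector by one limb, accumulate in place, return
the carry limb), assuming 32-bit limbs as on ARM; the name addmul_1_ref is
hypothetical and the code is not the assembly from these patches.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply the N-limb number at UP by V, add the product to the N-limb
   number at RP in place, and return the carry-out limb.  Limbs are
   stored least significant first.  */
static uint32_t
addmul_1_ref (uint32_t *rp, const uint32_t *up, size_t n, uint32_t v)
{
  uint32_t carry = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* up[i] * v + rp[i] + carry always fits in 64 bits, so no
         intermediate overflow check is needed.  */
      uint64_t t = (uint64_t) up[i] * v + rp[i] + carry;
      rp[i] = (uint32_t) t;
      carry = (uint32_t) (t >> 32);
    }
  return carry;
}
```

submul_1 follows the same shape with the product subtracted and a borrow
returned instead. A reference like this is mainly useful for checking an
assembly implementation against random inputs.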
r~
Richard Henderson (26):
Sync config.guess and config.sub with upstream
arm: Update preconfigure fragment for gcc 4.8
arm: Handle armv6 in preconfigure
arm: Include libc-do-syscall in sysdep-rtld-routines
arm: Introduce thumb helpers s and pc_ofs
arm: Use pc_ofs
arm: Introduce and use GET_TLS
arm: Add IT insns for thumb mode
arm: Mark assembly files that will not use thumb mode
arm: Introduce and use LDST_PCREL
arm: Introduce and use NEGOFF series of macros
arm: Enable thumb2 mode in assembly files
arm: Store lr in r2 around GET_TLS
arm: Use push/pop mnemonics
arm: Delete LOADREGS macro
arm: Commonize BX conditionals
arm: Unless arm4t, pop return address directly into pc
arm: Use GET_TLS more often
arm: Add optimized ffs for armv6t2
arm: Implement armv6t2 optimized strlen
arm: Implement armv6t2 optimized strcpy
arm: Implement armv6t2 optimized strchr, strrchr, rawmemchr
arm: Rewrite armv6t2 memchr with uqadd8
arm: Add optimized addmul_1
arm: Add optimized submul_1
arm: Add optimized add_n and sub_n
ports/sysdeps/arm/__longjmp.S | 8 +-
ports/sysdeps/arm/add_n.S | 83 ++++++++
ports/sysdeps/arm/addmul_1.S | 60 ++++++
ports/sysdeps/arm/arm-mcount.S | 19 +-
ports/sysdeps/arm/armv6t2/ffs.S | 34 ++++
ports/sysdeps/arm/armv6t2/ffsll.S | 49 +++++
ports/sysdeps/arm/armv6t2/memchr.S | 216 ++++++++++-----------
ports/sysdeps/arm/armv6t2/rawmemchr.S | 81 ++++++++
ports/sysdeps/arm/armv6t2/stpcpy.S | 1 +
ports/sysdeps/arm/armv6t2/strchr.S | 138 +++++++++++++
ports/sysdeps/arm/armv6t2/strcpy.S | 213 ++++++++++++++++++++
ports/sysdeps/arm/armv6t2/strlen.S | 93 +++++++++
ports/sysdeps/arm/armv6t2/strrchr.S | 137 +++++++++++++
ports/sysdeps/arm/crti.S | 6 +-
ports/sysdeps/arm/crtn.S | 10 +-
ports/sysdeps/arm/dl-tlsdesc.S | 49 +++--
ports/sysdeps/arm/dl-trampoline.S | 15 +-
ports/sysdeps/arm/memcpy.S | 60 +++---
ports/sysdeps/arm/memmove.S | 60 +++---
ports/sysdeps/arm/memset.S | 2 +
ports/sysdeps/arm/preconfigure | 7 +-
ports/sysdeps/arm/setjmp.S | 6 +-
ports/sysdeps/arm/start.S | 10 +-
ports/sysdeps/arm/strlen.S | 2 +
ports/sysdeps/arm/sub_n.S | 2 +
ports/sysdeps/arm/submul_1.S | 67 +++++++
ports/sysdeps/arm/sysdep.h | 99 +++++++---
ports/sysdeps/unix/arm/sysdep.S | 39 ++--
ports/sysdeps/unix/sysv/linux/arm/Makefile | 2 +-
.../sysdeps/unix/sysv/linux/arm/____longjmp_chk.S | 6 +-
ports/sysdeps/unix/sysv/linux/arm/aeabi_read_tp.S | 6 +
ports/sysdeps/unix/sysv/linux/arm/clone.S | 17 +-
ports/sysdeps/unix/sysv/linux/arm/getcontext.S | 2 +-
ports/sysdeps/unix/sysv/linux/arm/mmap.S | 9 +-
ports/sysdeps/unix/sysv/linux/arm/mmap64.S | 14 +-
ports/sysdeps/unix/sysv/linux/arm/nptl/pt-vfork.S | 23 +--
.../unix/sysv/linux/arm/nptl/sysdep-cancel.h | 57 +++---
.../unix/sysv/linux/arm/nptl/unwind-forcedunwind.c | 4 +-
.../unix/sysv/linux/arm/nptl/unwind-resume.c | 4 +-
ports/sysdeps/unix/sysv/linux/arm/nptl/vfork.S | 26 ++-
ports/sysdeps/unix/sysv/linux/arm/setcontext.S | 4 +-
ports/sysdeps/unix/sysv/linux/arm/syscall.S | 5 +-
ports/sysdeps/unix/sysv/linux/arm/sysdep.h | 53 +++--
ports/sysdeps/unix/sysv/linux/arm/vfork.S | 3 +-
scripts/config.guess | 31 ++-
scripts/config.sub | 72 +++----
46 files changed, 1462 insertions(+), 442 deletions(-)
create mode 100644 ports/sysdeps/arm/add_n.S
create mode 100644 ports/sysdeps/arm/addmul_1.S
create mode 100644 ports/sysdeps/arm/armv6t2/ffs.S
create mode 100644 ports/sysdeps/arm/armv6t2/ffsll.S
create mode 100644 ports/sysdeps/arm/armv6t2/rawmemchr.S
create mode 100644 ports/sysdeps/arm/armv6t2/stpcpy.S
create mode 100644 ports/sysdeps/arm/armv6t2/strchr.S
create mode 100644 ports/sysdeps/arm/armv6t2/strcpy.S
create mode 100644 ports/sysdeps/arm/armv6t2/strlen.S
create mode 100644 ports/sysdeps/arm/armv6t2/strrchr.S
create mode 100644 ports/sysdeps/arm/sub_n.S
create mode 100644 ports/sysdeps/arm/submul_1.S
mode change 100755 => 100644 scripts/config.guess
mode change 100755 => 100644 scripts/config.sub
--
1.8.1.2