This is the mail archive of the
newlib@sourceware.org
mailing list for the newlib project.
Re: [PATCH, newlib] Fix strlen using Thumb-2 with -Os -marm
- From: Pavel Pisa <ppisa4lists at pikron dot com>
- To: newlib at sourceware dot org, Thomas Preudhomme <thomas dot preudhomme at foss dot arm dot com>
- Cc: Sebastian Huber <sebastian dot huber at embedded-brains dot de>
- Date: Mon, 2 May 2016 23:06:00 +0200
- Subject: Re: [PATCH, newlib] Fix strlen using Thumb-2 with -Os -marm
- Authentication-results: sourceware.org; auth=none
- References: <38472988 dot FY0Xj2vxR9 at e108577-lin>
Hello Thomas and others,
On Thursday 28 of April 2016 18:06:45 Thomas Preudhomme wrote:
> Currently, the selection of which strlen routine to use for ARM processors
> is as follows (see newlib/libc/machine/arm/strlen.S or
> newlib/libc/machine/arm/strlen-stub.c):
>
> #if optimize for size
> #if defined __thumb__ && !defined __thumb2__
> Thumb-1 routine
> #else
> Thumb-2 routine
> #endif
> #else
> something
> #endif
>
> When compiling for ARM ISA, __thumb__ would be 0 and thus the Thumb-2
> routine would be chosen, no matter the architecture selected. This patch
> changes the logic to only look for capabilities of the processor. It also
> fallback to the C implementation for targets with only ARM execution state
> (armv4 and armv5).
I have already fight with that problem on RTEMS-4.12
on raspberry Pi v1 and want to propose allmost the same
patch today and found your proposal. So I want to confirm
the problem and that your patch should help there.
There is my diagnostics which confirm your finding.
RTEMS core and application for RPi1 BSP is compiled with
arm-rtems4.12-gcc -mcpu=arm1176jzf-s
which is mapped for RTEMS GCC compile to the plain ARM 32-bit mode
and non-thumb/default/"." multilib variant. But application
fails. I have found that cause is the call to the strlen()
which is not 32-bit ARM code as the rest and includes Thumb-2
instruction unknown to ARMv6T1 based CPU.
Source of the problem is
libc/machine/arm/lib_a-strlen.o
which is reposted by readelf as
No version information found in this file.
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "6T2"
Tag_CPU_arch: v6T2
Tag_THUMB_ISA_use: Thumb-2
All other library files has next attributes
No version information found in this file.
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "ARM7TDMI"
Tag_CPU_arch: v4T
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-1
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: small
Tag_ABI_optimization_goals: Aggressive Size
The strlen uses really unsupported Thumb-2 code
00018994 <strlen>:
18994: 4603 mov r3, r0
18996: f813 2b01 ldrb.w r2, [r3], #1
1899a: 2a00 cmp r2, #0
1899c: d1fb bne.n 18996 <strlen+0x2>
1899e: 1a18 subs r0, r3, r0
189a0: 3801 subs r0, #1
189a2: 4770 bx lr
GCC-6.0.1-RC-20160415 + Newlib-2.4.0 have been configured as
../../../src/gcc-6.1/configure --target=arm-rtems4.12 --prefix=/usr \
--build=x86_64-pc-linux-gnu \
--enable-languages=c,c++ \
--disable-libstdcxx-pch \
--with-gnu-ld \
--with-gnu-as \
--enable-threads \
--enable-target-optspace \
--with-system-zlib \
--verbose \
--disable-nls --without-included-gettext \
--disable-win32-registry \
--with-newlib \
--enable-plugin \
--enable-newlib-io-c99-formats \
--enable-version-specific-runtime-libs \
--enable-newlib-iconv \
--disable-lto \
--enable-libgomp \
--enable-newlib-iconv \
--enable-newlib-iconv-encodings="iso_8859_1,utf_8" \
When Newlib is compiled for default multilib variant (ARMv4 for RTEMS) then next command
is used (some includes and unimportant flags removed)
arm-rtems4.12-gcc -nostdinc -I . -Os \
-D_COMPILING_NEWLIB -DMALLOC_PROVIDED -DEXIT_PROVIDED -DSIGNAL_PROVIDED -DREENTRANT_SYSCALLS_PROVIDED \
-DHAVE_NANOSLEEP -DHAVE_BLKSIZE -DHAVE_FCNTL -DHAVE_ASSERT_FUNC -D_NO_GETLOGIN -D_NO_GETPWENT -D_NO_GETUT \
-D_NO_GETPASS -D_NO_SIGSET -D_NO_WORDEXP -D_NO_POPEN -D_NO_POSIX_SPAWN -D_I386MACH_ALLOW_HW_INTERRUPTS \
-fno-builtin -Os -D_COMPILING_NEWLIB -DMALLOC_PROVIDED -DEXIT_PROVIDED -DSIGNAL_PROVIDED \
-DREENTRANT_SYSCALLS_PROVIDED -DHAVE_NANOSLEEP -DHAVE_BLKSIZE -DHAVE_FCNTL -DHAVE_ASSERT_FUNC \
-D_NO_GETLOGIN -D_NO_GETPWENT -D_NO_GETUT -D_NO_GETPASS -D_NO_SIGSET -D_NO_WORDEXP -D_NO_POPEN \
-D_NO_POSIX_SPAWN -D_I386MACH_ALLOW_HW_INTERRUPTS -fno-builtin \
-g -O2 -g -Os -c -o lib_a-strlen.o strlen.S
My tested and attached proposal should be equivalent to yours.
But I have found some difference. When GCC is used for RPi
arm-rtems4.12-gcc -mcpu=arm1176jzf-s
then next Thumb related defines are set
__ARM_ARCH_ISA_THUMB 1
__THUMB_INTERWORK__ 1
but because build is 32-bit then none of __thumb__ and __thumb2__ is defined.
But because you use __ARM_ARCH_ISA_THUMB which is defined even for non Thumb
build
+#if __ARM_ARCH_ISA_THUMB == 2
+#include "strlen-thumb2-Os.S"
+
+#elif defined (__ARM_ARCH_ISA_THUMB)
then I expect that strlen() would be Thumb-1 assembler code in your case
but common simple C code in my case. The both options should be valid
for ARMv6T1. GCC default options leads to the same Thumb related defines
set so that means that even for ARMv4T based chips (i.e. LPC2148) Thumb 1
strlen is used even for non thumb build. It still should work but if my
analysis is the right then it would lead to inclusion of Thumb to the
applications which are expected to be non-Thumb builds. May it be that
it would lead to speed advantage if this is intentional or considered
non-harmfull.
Thanks for taking care for Newlib,
Pavel
diff --git a/newlib/libc/machine/arm/strlen-stub.c b/newlib/libc/machine/arm/strlen-stub.c
index bcd3d2d..01e0196 100644
--- a/newlib/libc/machine/arm/strlen-stub.c
+++ b/newlib/libc/machine/arm/strlen-stub.c
@@ -32,7 +32,8 @@
#include <limits.h>
#if defined __OPTIMIZE_SIZE__ || defined PREFER_SIZE_OVER_SPEED
-#if defined __thumb__ && !defined __thumb2__
+#if defined __thumb__
+#if !defined __thumb2__
/* Implemented in strlen.S. */
#else
@@ -40,6 +41,11 @@
#endif
+#else
+/* No thumb support and optimize for size */
+#include "../../string/strlen.c"
+#endif
+
#else /* defined __OPTIMIZE_SIZE__ || defined PREFER_SIZE_OVER_SPEED */
#if defined __thumb__ && ! defined __thumb2__
#include "../../string/strlen.c"
diff --git a/newlib/libc/machine/arm/strlen.S b/newlib/libc/machine/arm/strlen.S
index 5737145..f24b2ac 100644
--- a/newlib/libc/machine/arm/strlen.S
+++ b/newlib/libc/machine/arm/strlen.S
@@ -27,7 +27,8 @@
#include "acle-compat.h"
#if defined __OPTIMIZE_SIZE__ || defined PREFER_SIZE_OVER_SPEED
-#if defined __thumb__ && !defined __thumb2__
+#if defined __thumb__
+#if !defined __thumb2__
#include "strlen-thumb1-Os.S"
#else
@@ -35,6 +36,11 @@
#endif
+#else
+ /* ARMv4 or other variant without thumb support optimized for size */
+ /* Implemented in strlen-stub.c. */
+#endif
+
#else /* defined __OPTIMIZE_SIZE__ || defined PREFER_SIZE_OVER_SPEED */
#if defined __thumb__ && ! defined __thumb2__
/* Implemented in strlen-stub.c. */