This is the mail archive of the
newlib@sourceware.org
mailing list for the newlib project.
Re: [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations
- From: Christos Gentsos <christos dot gentsos at cern dot ch>
- To: "Richard Earnshaw (lists)" <Richard dot Earnshaw at arm dot com>, <newlib at sourceware dot org>
- Date: Mon, 30 Sep 2019 19:16:24 +0200
- Subject: Re: [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations
- References: <20190930145510.2796395-1-christos.gentsos@cern.ch> <ed461284-fb27-f6c6-d505-f963bbd78aff@arm.com> <054d50a0-e6a0-79e8-37d1-e6af6e12d0c6@arm.com>
On Mon, Sep 30 2019 at 16:37:26 +0100, Richard Earnshaw (lists) wrote:
> On 30/09/2019 16:31, Richard Earnshaw (lists) wrote:
>> On 30/09/2019 15:55, Christos Gentsos wrote:
>>> The same pop instruction that is used to restore registers can be used
>>> to return from the function (as it is already done in other function
>>> implementations).
>>> ---
>>> newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 +---
>>> newlib/libc/machine/arm/aeabi_memset-thumb.S | 4 +---
>>> 2 files changed, 2 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> index 61a72581..a0aad852 100644
>>> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> @@ -49,9 +49,7 @@ __aeabi_memmove:
>>> subs r3, r3, #1
>>> bcs 1b
>>> 2:
>>> - pop {r4}
>>> - pop {r1}
>>> - bx r1
>>> + pop {r4, pc}
>>> 3:
>>> movs r3, #0
>>> cmp r2, #0
>>> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> index aa8f2719..5bb80b20 100644
>>> --- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> @@ -110,9 +110,7 @@ __aeabi_memset:
>>> cmp r4, r3
>>> bne 8b
>>> 9:
>>> - pop {r4, r5, r6}
>>> - pop {r1}
>>> - bx r1
>>> + pop {r4, r5, r6, pc}
>>> 10:
>>> movs r3, r0
>>> movs r4, r1
>>>
>>
>> No. That isn't interworking clean on armv4t, which we still need to
>> support.
>>
>> Sorry.
>>
>> R.
>
> However, a patch that tests __ARM_ARCH >= 5 and uses your improved
> sequence only in that case (preserving the old code otherwise) would
> probably be OK :-)
>
> R.
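For readers unfamiliar with the interworking objection above, the following is a minimal sketch in C of the behaviour in question. It is a toy model, not newlib code: the names cpu_state, ret_insn and state_after_thumb_return are hypothetical and exist only for illustration. It assumes the architectural rule that a Thumb-state "pop {..., pc}" ignores bit 0 of the loaded address on ARMv4T (so the core stays in Thumb state), whereas "bx" always selects the state from bit 0, and "pop {..., pc}" does the same from ARMv5T onwards.

```c
/* Toy model of returning from a Thumb-state function.
 * All names here are hypothetical and for illustration only. */

typedef enum { STATE_ARM, STATE_THUMB } cpu_state;
typedef enum { RET_POP_PC, RET_BX } ret_insn;

/* Returns the state the core is in after a Thumb-state function
 * returns to return_addr using the given instruction.
 * Bit 0 of return_addr is set when the caller is Thumb code and
 * clear when the caller is ARM code. */
cpu_state state_after_thumb_return(int arch, ret_insn insn,
                                   unsigned return_addr)
{
    /* bx interworks on every architecture that has Thumb;
     * pop {..., pc} only interworks from ARMv5T onwards. */
    int interworks = (insn == RET_BX) || (arch >= 5);

    if (!interworks)
        return STATE_THUMB; /* bit 0 ignored: core stays in Thumb state */

    return (return_addr & 1u) ? STATE_THUMB : STATE_ARM;
}
```

In this model, state_after_thumb_return(4, RET_POP_PC, 0x8000u) yields STATE_THUMB even though the even return address means the caller was ARM code, which is the failure being described. The same call with RET_BX, or with arch set to 5, yields STATE_ARM.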
Sorry, I wasn't aware of that; thanks for the correction. I have
reworked the patch so that it now checks __ARM_ARCH, as you
suggested. Does it look better?
Thanks,
Christos
---
newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 ++++
newlib/libc/machine/arm/aeabi_memset-thumb.S | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
index 61a72581..465a5a19 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
@@ -49,9 +49,13 @@ __aeabi_memmove:
subs r3, r3, #1
bcs 1b
2:
+#if __ARM_ARCH >= 5
+ pop {r4, pc}
+#else
pop {r4}
pop {r1}
bx r1
+#endif
3:
movs r3, #0
cmp r2, #0
diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S b/newlib/libc/machine/arm/aeabi_memset-thumb.S
index aa8f2719..52094a7b 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
@@ -110,9 +110,13 @@ __aeabi_memset:
cmp r4, r3
bne 8b
9:
+#if __ARM_ARCH >= 5
+ pop {r4, r5, r6, pc}
+#else
pop {r4, r5, r6}
pop {r1}
bx r1
+#endif
10:
movs r3, r0
movs r4, r1
--
2.23.0
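
The conditional in the patch above can also be exercised outside the assembler. The following is a small C sketch of the same preprocessor selection; the macro and function names (THUMB_RETURN_SEQ, THUMB_RETURN_INSNS, thumb_return_seq, thumb_return_insns) are hypothetical and not part of newlib. It adds an explicit defined() guard, which is an assumption made here so that the sketch also builds cleanly on non-ARM hosts and under -Wundef; an undefined identifier in #if evaluates to 0, so the unguarded form used in the patch selects the same fallback branch.

```c
/* Sketch of the epilogue selection made by the patch.
 * All macro and function names are hypothetical. */

#if defined(__ARM_ARCH) && (__ARM_ARCH >= 5)
/* ARMv5T and later: pop into pc interworks, one instruction suffices. */
#define THUMB_RETURN_SEQ   "pop {r4, pc}"
#define THUMB_RETURN_INSNS 1
#else
/* ARMv4T (or __ARM_ARCH not defined): keep the bx-based return. */
#define THUMB_RETURN_SEQ   "pop {r4}; pop {r1}; bx r1"
#define THUMB_RETURN_INSNS 3
#endif

/* Text of the return sequence chosen at compile time. */
const char *thumb_return_seq(void)
{
    return THUMB_RETURN_SEQ;
}

/* Number of instructions in the chosen return sequence. */
int thumb_return_insns(void)
{
    return THUMB_RETURN_INSNS;
}
```

Which branch is taken depends on the compiler's target: a toolchain targeting ARMv4T, or one that does not define __ARM_ARCH at all, gets the three-instruction sequence, so the older code path remains the default.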