This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Optimize epilogue in thumb __aeabi_{memmove,memset} implementations


On Mon, Sep 30 2019 at 16:37:26 +0100, Richard Earnshaw (lists) wrote:
> On 30/09/2019 16:31, Richard Earnshaw (lists) wrote:
>> On 30/09/2019 15:55, Christos Gentsos wrote:
>>> The same pop instruction that is used to restore registers can be used
>>> to return from the function (as it is already done in other function
>>> implementations).
>>> ---
>>>   newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 +---
>>>   newlib/libc/machine/arm/aeabi_memset-thumb.S  | 4 +---
>>>   2 files changed, 2 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S 
>>> b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> index 61a72581..a0aad852 100644
>>> --- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> +++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
>>> @@ -49,9 +49,7 @@ __aeabi_memmove:
>>>       subs    r3, r3, #1
>>>       bcs    1b
>>>   2:
>>> -    pop    {r4}
>>> -    pop    {r1}
>>> -    bx    r1
>>> +    pop    {r4, pc}
>>>   3:
>>>       movs    r3, #0
>>>       cmp    r2, #0
>>> diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S 
>>> b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> index aa8f2719..5bb80b20 100644
>>> --- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> +++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
>>> @@ -110,9 +110,7 @@ __aeabi_memset:
>>>       cmp    r4, r3
>>>       bne    8b
>>>   9:
>>> -    pop    {r4, r5, r6}
>>> -    pop    {r1}
>>> -    bx    r1
>>> +    pop    {r4, r5, r6, pc}
>>>   10:
>>>       movs    r3, r0
>>>       movs    r4, r1
>>>
>> 
>> No.  That isn't interworking clean on armv4t, which we still need to 
>> support.
>> 
>> Sorry.
>> 
>> R.
>
> However, a patch that tests __ARM_ARCH >=5 and uses your improved 
> sequence only in that case (preserving the old code otherwise) would 
> probably be OK :-)
>
> R.

Oh sorry then, I wasn't aware of that, thanks for the correction. I
re-made the patch such that it now checks for __ARM_ARCH, as per your
suggestion. Does it look better?

Thanks,
Christos
---
 newlib/libc/machine/arm/aeabi_memmove-thumb.S | 4 ++++
 newlib/libc/machine/arm/aeabi_memset-thumb.S  | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/newlib/libc/machine/arm/aeabi_memmove-thumb.S b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
index 61a72581..465a5a19 100644
--- a/newlib/libc/machine/arm/aeabi_memmove-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memmove-thumb.S
@@ -49,9 +49,13 @@ __aeabi_memmove:
 	subs	r3, r3, #1
 	bcs	1b
 2:
+#if __ARM_ARCH >= 5
+	pop	{r4, pc}
+#else
 	pop	{r4}
 	pop	{r1}
 	bx	r1
+#endif
 3:
 	movs	r3, #0
 	cmp	r2, #0
diff --git a/newlib/libc/machine/arm/aeabi_memset-thumb.S b/newlib/libc/machine/arm/aeabi_memset-thumb.S
index aa8f2719..52094a7b 100644
--- a/newlib/libc/machine/arm/aeabi_memset-thumb.S
+++ b/newlib/libc/machine/arm/aeabi_memset-thumb.S
@@ -110,9 +110,13 @@ __aeabi_memset:
 	cmp	r4, r3
 	bne	8b
 9:
+#if __ARM_ARCH >= 5
+	pop	{r4, r5, r6, pc}
+#else
 	pop	{r4, r5, r6}
 	pop	{r1}
 	bx	r1
+#endif
 10:
 	movs	r3, r0
 	movs	r4, r1
-- 
2.23.0


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]