[patch,arm] strcmp implementation using LDRD

Corinna Vinschen vinschen@redhat.com
Thu Feb 9 10:43:00 GMT 2012

On Feb  8 09:58, Greta Yorsh wrote:
> The attached patch provides a new implementation of strcmp for ARM, using
> LDRD instead of LDR whenever possible.
> For older architectures that do not support LDRD, this implementation uses
> the same algorithm as before. 
> This patch replaces strcmp.c with strcmp.S. The huge inline assembly from
> strcmp.c was converted into plain assembly and included in strcmp.S under
> the appropriate predefines.
> Testing and benchmarking:
> * Validation: successfully passes a test that compares different strings of
> length 1-128 and offsets 0-8 from a word boundary. Checked on qemu/A15/A9,
> ARM/Thumb mode, Big/Little Endian. This test is also added to newlib
> testsuite as part of this patch.
> * Integration with gcc: no regression on qemu for arm-none-eabi --with-cpu
> a15/a9 --with-mode arm/thumb.
> * Performance (relative to the current strcmp in newlib, only in ARM mode): 
> On Dhrystone, the new implementation (ldrd) is 22% faster on Cortex-A15
> FPGA, and 16% on Cortex-A9 VE2. 
> On synthetic benchmarks that measure the average number of cycles for
> strcmp on equal strings of length 4-128K at offsets 0,1,2,3,4,8 from a word
> boundary, the new implementation is three times faster for long strings
> when both inputs have the same offset from a word boundary, and up to 30%
> faster in other cases, on both A15 FPGA and A9 VE2.
> newlib/ChangeLog
> 2012-02-08  Greta Yorsh  <Greta.Yorsh@arm.com>
> 	* libc/machine/arm/strcmp.S: New file.
> 	* libc/machine/arm/strcmp.c: Deleted.
> 	* libc/machine/arm/Makefile.am: Replace strcmp.c with strcmp.S.
> 	* libc/machine/arm/Makefile.in: Regenerated.
> 	* testsuite/newlib.string/strcmp-1.c: New file.

Thanks for the patch.  Applied.
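
For readers of the archive: the validation described in the quoted message
(strings of length 1-128 at offsets 0-8 from a word boundary) can be sketched
roughly as below. This is not the strcmp-1.c added by the patch. The names
check_strcmp_grid, ref_sign and sign, the fill pattern, and the choice of
mismatch positions are all assumptions made for illustration. The sketch
compares the sign of the library strcmp against a plain byte-by-byte loop.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LEN    128  /* string lengths 1..128, as in the quoted description */
#define MAX_OFFSET 8    /* offsets 0..8 from an aligned base address */

/* Hypothetical reference: byte-by-byte comparison, returns -1, 0 or 1. */
static int ref_sign(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    while (*p != '\0' && *p == *q) {
        p++;
        q++;
    }
    return (*p > *q) - (*p < *q);
}

/* Reduce a strcmp result to -1, 0 or 1; only the sign is specified. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Returns the number of cases where strcmp disagrees with the reference. */
int check_strcmp_grid(void)
{
    /* The unions force an aligned base so the offsets are meaningful. */
    static union { long long align; char bytes[MAX_OFFSET + MAX_LEN + 1]; } b1, b2;
    int failures = 0;

    for (size_t len = 1; len <= MAX_LEN; len++) {
        for (size_t o1 = 0; o1 <= MAX_OFFSET; o1++) {
            for (size_t o2 = 0; o2 <= MAX_OFFSET; o2++) {
                char *s1 = b1.bytes + o1;
                char *s2 = b2.bytes + o2;

                /* Build two equal strings of non-zero bytes. */
                for (size_t i = 0; i < len; i++)
                    s1[i] = s2[i] = (char)('a' + i % 26);
                s1[len] = s2[len] = '\0';
                if (strcmp(s1, s2) != 0)
                    failures++;

                /* Make the last byte differ and compare in both directions. */
                s2[len - 1] = '~';
                if (sign(strcmp(s1, s2)) != ref_sign(s1, s2))
                    failures++;
                if (sign(strcmp(s2, s1)) != ref_sign(s2, s1))
                    failures++;

                /* Make s2 a proper prefix of s1. */
                s2[len - 1] = '\0';
                if (sign(strcmp(s1, s2)) != ref_sign(s1, s2))
                    failures++;
            }
        }
    }
    return failures;
}
```

Calling check_strcmp_grid() from a test's main and treating a non-zero
result as failure exercises whichever strcmp the C library provides for the
target, so the same source could be built for ARM and Thumb mode and for
either endianness.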


