[PATCH] aarch64: MTE compatible strlen
Andrea Corallo
andrea.corallo@arm.com
Wed Jun 3 09:53:11 GMT 2020
Hi all,
I'd like to submit this patch introducing an Arm MTE compatible strlen
implementation.
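For readers less familiar with the constraint, here is a rough C sketch of what "MTE compatible" means for strlen. It is not the code from the attached patch. MTE tags memory in 16-byte granules, so a compatible routine must not read from any granule other than those holding string bytes or the terminator. The function name granule_safe_strlen and the GRANULE macro are hypothetical, chosen only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE 16  /* MTE tag granule size in bytes */

/* Hypothetical illustration, not the patch itself: a strlen that only
   ever touches 16-byte granules containing string bytes or the NUL. */
size_t granule_safe_strlen(const char *s)
{
    assert(s != NULL);
    const char *p = s;

    /* Go byte by byte until p is 16-byte aligned, so that the chunk
       reads below never straddle a granule boundary. */
    while (((uintptr_t)p & (GRANULE - 1)) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Each iteration reads exactly one aligned granule. The read may
       cover bytes after the terminator, but only inside the granule
       that holds the terminator. Strict ISO C does not bless reading
       past the end of an object; this sketch assumes the usual
       hardware-level argument that an aligned 16-byte read cannot
       cross a page or tag-granule boundary. */
    for (;;) {
        unsigned char chunk[GRANULE];
        memcpy(chunk, p, GRANULE);
        for (size_t i = 0; i < GRANULE; i++) {
            if (chunk[i] == '\0')
                return (size_t)(p - s) + i;
        }
        p += GRANULE;
    }
}
```

A tuned implementation would replace the inner byte loop with vector compares, but the access pattern (aligned chunks, nothing beyond the granule holding the terminator) is the part that matters for MTE.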
Below is a performance comparison from the strlen benchmark run on
Cortex-A72, Cortex-A53 and Neoverse N1.
| length | alignment | perf-uplift A72 | perf-uplift A53 | perf-uplift N1 |
|--------+-----------+-----------------+-----------------+----------------|
| 1 | 1 | 1.00x | 0.96x | 1.13x |
| 1 | 0 | 2.15x | 0.96x | 1.00x |
| 2 | 2 | 1.16x | 0.95x | 1.09x |
| 2 | 0 | 1.17x | 0.93x | 1.00x |
| 3 | 3 | 1.30x | 0.95x | 1.09x |
| 3 | 0 | 1.32x | 0.96x | 1.00x |
| 4 | 4 | 1.14x | 0.87x | 0.99x |
| 4 | 0 | 1.14x | 0.96x | 1.00x |
| 5 | 5 | 1.15x | 0.89x | 1.09x |
| 5 | 0 | 1.19x | 0.96x | 1.00x |
| 6 | 6 | 1.14x | 0.96x | 1.39x |
| 6 | 0 | 1.14x | 0.95x | 1.00x |
| 7 | 7 | 1.03x | 0.90x | 1.09x |
| 7 | 0 | 1.14x | 0.95x | 1.27x |
| 4 | 0 | 1.15x | 0.87x | 1.00x |
| 4 | 7 | 1.15x | 0.96x | 1.10x |
| 4 | 2 | 1.27x | 0.95x | 1.39x |
| 2 | 2 | 1.14x | 0.96x | 1.09x |
| 8 | 0 | 1.15x | 0.96x | 1.00x |
| 8 | 7 | 1.14x | 0.96x | 1.09x |
| 8 | 3 | 1.17x | 0.96x | 1.39x |
| 5 | 3 | 1.14x | 0.96x | 1.39x |
| 16 | 0 | 1.15x | 0.83x | 1.48x |
| 16 | 7 | 1.14x | 0.80x | 1.43x |
| 16 | 4 | 1.15x | 0.83x | 1.48x |
| 10 | 4 | 1.15x | 0.96x | 1.27x |
| 32 | 0 | 1.04x | 0.88x | 1.16x |
| 32 | 7 | 1.02x | 0.84x | 1.19x |
| 32 | 5 | 1.04x | 0.84x | 1.23x |
| 21 | 5 | 1.14x | 0.83x | 1.60x |
| 64 | 0 | 1.17x | 0.80x | 1.75x |
| 64 | 7 | 1.17x | 0.77x | 1.83x |
| 64 | 6 | 1.17x | 0.77x | 1.57x |
| 42 | 6 | 1.00x | 0.80x | 1.42x |
| 128 | 0 | 0.96x | 0.68x | 1.80x |
| 128 | 7 | 0.96x | 0.66x | 1.85x |
| 128 | 7 | 0.96x | 0.67x | 1.86x |
| 85 | 7 | 1.05x | 0.75x | 1.87x |
| 256 | 0 | 0.98x | 0.69x | 1.88x |
| 256 | 7 | 0.98x | 0.68x | 1.92x |
| 256 | 8 | 0.99x | 0.69x | 1.88x |
| 170 | 8 | 0.96x | 0.72x | 1.86x |
| 512 | 0 | 0.99x | 0.65x | 1.90x |
| 512 | 7 | 0.98x | 0.65x | 1.92x |
| 512 | 9 | 0.99x | 0.65x | 1.92x |
| 341 | 9 | 0.98x | 0.68x | 1.99x |
| 1024 | 0 | 0.99x | 0.63x | 1.90x |
| 1024 | 7 | 0.99x | 0.62x | 1.92x |
| 1024 | 10 | 0.99x | 0.62x | 1.92x |
| 682 | 10 | 0.99x | 0.64x | 1.96x |
| 2048 | 0 | 0.99x | 0.61x | 1.92x |
| 2048 | 7 | 1.01x | 0.61x | 1.93x |
| 2048 | 11 | 1.00x | 0.61x | 1.95x |
| 1365 | 11 | 1.00x | 0.62x | 1.94x |
| 4096 | 0 | 1.00x | 0.61x | 1.93x |
| 4096 | 7 | 1.00x | 0.61x | 1.94x |
| 4096 | 12 | 1.00x | 0.61x | 1.95x |
| 2730 | 12 | 1.00x | 0.61x | 1.94x |
This patch passes the glibc tests.
Regards
Andrea
8< --- 8< --- 8<
Introduce an Arm MTE compatible strlen implementation.
Benchmarks on Cortex-A72, Cortex-A53 and Neoverse N1 do not show
performance regressions.
Co-authored-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: strlen.patch
Type: text/x-diff
Size: 8030 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/libc-alpha/attachments/20200603/63286801/attachment.bin>