This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: V3 [PATCH] aarch64: optimized memcpy implementation for thunderx2
- From: Richard Henderson <rth at twiddle dot net>
- To: Anton Youdkevitch <anton dot youdkevitch at bell-sw dot com>, Steve Ellcey <sellcey at cavium dot com>, libc-alpha at sourceware dot org
- Date: Wed, 10 Oct 2018 23:20:46 -0700
- Subject: Re: V3 [PATCH] aarch64: optimized memcpy implementation for thunderx2
- References: <20181010170052.GA30058@bell-sw.com>
On 10/10/18 10:00 AM, Anton Youdkevitch wrote:
> +
> +L(ext_table):
> + /* The first entry is for the alignment of 0 and is never
> + actually used (could be any value), the second is for
> + the alignment of 1 and the offset is zero as the first
> + code chunk follows the dispatching branch immediately */
> + .quad 0
> + .quad 0
> + .quad L(ext_size_2) - L(load_and_merge)
> + .quad L(ext_size_3) - L(load_and_merge)
> + .quad L(ext_size_4) - L(load_and_merge)
> + .quad L(ext_size_5) - L(load_and_merge)
> + .quad L(ext_size_6) - L(load_and_merge)
> + .quad L(ext_size_7) - L(load_and_merge)
> + .quad L(ext_size_8) - L(load_and_merge)
> + .quad L(ext_size_9) - L(load_and_merge)
> + .quad L(ext_size_10) - L(load_and_merge)
> + .quad L(ext_size_11) - L(load_and_merge)
> + .quad L(ext_size_12) - L(load_and_merge)
> + .quad L(ext_size_13) - L(load_and_merge)
> + .quad L(ext_size_14) - L(load_and_merge)
> + .quad L(ext_size_15) - L(load_and_merge)
There's no good reason to have this table in .text.
You should put it in .rodata. That does mean you'd
need one extra insn to load the table address, but that should
be unmeasurable compared to the load itself. In that case I do
suggest making the entries properly pc-relative, though,
i.e. "L(foo) - .".
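
Roughly what I have in mind, as an untested sketch. The register
choices (x5 holding the alignment, x6/x7 as scratch) are assumptions
on my part, as is L(load_and_merge) being the start of the first
code chunk; adjust to whatever the patch really uses.

```
	.section .rodata
	.p2align 3
L(ext_table):
	/* Each entry is "target - address of this entry".  */
	.quad	0				/* alignment 0, never used */
	.quad	L(load_and_merge) - .		/* assumed: first chunk */
	.quad	L(ext_size_2) - .
	.quad	L(ext_size_3) - .
	/* ... and so on through L(ext_size_15).  */

	.text
	/* Dispatch; x5 is assumed to hold the alignment (1..15).  */
	adrp	x6, L(ext_table)
	add	x6, x6, :lo12:L(ext_table)	/* the one extra insn */
	add	x6, x6, x5, lsl #3		/* address of the entry */
	ldr	x7, [x6]			/* target - entry address */
	add	x6, x6, x7			/* = target */
	br	x6
```

Since each entry is relative to its own address, the dispatch adds
the entry's address back in rather than the address of
L(load_and_merge).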
r~