This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] aarch64: optimized memcpy implementation for thunderx2


Richard,

Thanks for the suggestion.

I should probably have stated more clearly that the
updated patch uses the jump table, so I'm going to
change the alignment of the code chunks to the "standard"
16 bytes (.p2align 4).


On 03.10.2018 18:06, Richard Henderson wrote:
On 10/1/18 11:22 AM, Anton Youdkevitch wrote:
+.p2align 7
+#define EXT_SIZE 2

Oh, and my other suggestion is not to use

	.p2align 7

which merely aligns to 128 bytes, and so could produce a 256-byte gap, but use

	.org L(load_and_merge)+(EXT_SIZE-1)*128

which will advance the pc to a multiple of 128 and will also generate an
assembler error if that "advance" moves backward.  I.e. you'll reliably error
out if changes to the code overflow the space reserved.


r~
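
To illustrate the difference between the two directives, here is a minimal sketch in GNU as syntax. The label `chunks` and the `nop` bodies are hypothetical placeholders, not taken from the patch; the only assumption is that each chunk is meant to occupy one fixed 128-byte slot.

```asm
	.text
	.p2align 7		/* align the base of the chunk table to 128 bytes */
chunks:				/* hypothetical base label */
	nop			/* chunk 0 body: must fit in 128 bytes */

	/* Chunk 1 must start exactly 128 bytes after the base.  If chunk 0
	   ever grows past 128 bytes, this .org would have to move the
	   location counter backward, and the assembler rejects that.  */
	.org	chunks + 1*128
	nop			/* chunk 1 body */

	/* With ".p2align 7" here instead, an oversized chunk 1 would be
	   padded silently to the next 128-byte boundary, so chunk 2 would
	   land at offset 384 rather than 256, with no diagnostic.  */
	.org	chunks + 2*128
	nop			/* chunk 2 body */
```

One difference to keep in mind: `.org` fills the skipped bytes with zeros unless a fill value is given as its second operand, so this layout is only appropriate if execution never falls through into the padding.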


