This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: V3 [PATCH] aarch64: optimized memcpy implementation for thunderx2




On 11.10.2018 17:34, Richard Henderson wrote:
On 10/11/18 6:32 AM, Anton Youdkevitch wrote:
be unmeasurable compared to the load.  I do suggest you use
properly pc-relative addresses in that case though.
I.e. "L(foo) - .".
Now I do not follow. Why is the existing addressing not a
proper pc-relative one, apart from the fact that it relies on
the distance being small, so that adrp is not needed?
Or is this what you actually meant?

I suppose it doesn't matter, now that I write it out and count instructions,
but it would be the difference between

	adrp	tmp2, L(ext_table)
	add	tmp2, tmp2, :lo12:L(ext_table)
	ldr	tmp2, [tmp2, tmp1, LSL #3]
	adr	tmp3, L(load_and_merge)
	add	tmp2, tmp2, tmp3
	br	tmp2

and

	adrp	tmp2, L(ext_table)
	add	tmp2, tmp2, :lo12:L(ext_table)
	add	tmp2, tmp2, tmp1, LSL #3
	ldr	tmp3, [tmp2]
	add	tmp2, tmp2, tmp3
	br	tmp2

If you're going to subtract L(load_and_merge), you might even save memory by
noting that the displacements fit in bytes instead of quads.
But isn't it a matter of clarity now? I mean, unless we really
care about the additional ~100 bytes, clarity is more important.
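
The trade-off discussed above can be sketched in C. This is only an illustrative model, not code from the patch: the 16-entry table length, the function names, and the addresses in the usage note are all assumptions. It shows how much space a table of one-byte displacements would save over eight-byte entries, and how a base-relative entry is turned back into a branch target (on AArch64 such entries would be read with a byte load such as ldrb).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table length; the real size of ext_table is not stated here. */
enum { N_ENTRIES = 16 };

/* Bytes saved by storing each displacement in 1 byte instead of 8. */
static size_t table_bytes_saved(size_t n_entries)
{
    return n_entries * (sizeof(uint64_t) - sizeof(uint8_t));
}

/* Base-relative dispatch: target = base + table[index].
   Only valid while every displacement from the base fits in 0..255. */
static uintptr_t dispatch_target(uintptr_t base, const uint8_t *table,
                                 size_t n_entries, size_t index)
{
    assert(index < n_entries);
    return base + table[index];
}
```

With the assumed 16 entries, table_bytes_saved(N_ENTRIES) gives 112 bytes, which is in line with the "~100 bytes" mentioned above. The cost is the constraint that all targets lie within 255 bytes of the base label.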


Also, the "dot" cannot be used for cross-section address
generation.

Absolutely it can.  It is in fact exactly R_AARCH64_PREL64.
Oh... And the linker will fix the relocations in the resulting
(shared) library? OK then.
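
A small C model of what a PREL64-style entry holds may help here. The AArch64 ELF ABI defines R_AARCH64_PREL64 as S + A - P (symbol address, plus addend, minus the address of the place being patched). The function names and the addresses in the usage note below are made up for illustration. The point of the sketch is that the stored value does not change when the symbol and the place are moved by the same load bias, so an entry of this kind between two locations in the same module can be fixed at static link time.

```c
#include <assert.h>
#include <stdint.h>

/* Value a PREL64-style relocation stores in the slot: S + A - P. */
static int64_t prel64_value(uint64_t sym, int64_t addend, uint64_t place)
{
    return (int64_t)(sym + (uint64_t)addend - place);
}

/* Run-time decode: add the stored displacement to the slot's own address. */
static uint64_t resolve_entry(uint64_t place, int64_t stored)
{
    return place + (uint64_t)stored;
}

/* Checks that the same stored value recovers the target both at the
   link-time addresses and after both are shifted by a common load bias. */
static void check_round_trip(uint64_t sym, uint64_t place, uint64_t bias)
{
    int64_t stored = prel64_value(sym, 0, place);
    assert(resolve_entry(place, stored) == sym);
    assert(resolve_entry(place + bias, stored) == sym + bias);
}
```

For example, with a hypothetical slot at 0x20010 and a target at 0x10400, the stored value is -0xfc10, and check_round_trip(0x10400, 0x20010, bias) passes for any bias.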

