This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 0/3] aarch64: Update ld.so for vector abi
- From: Carlos O'Donell <carlos at redhat dot com>
- To: rth at twiddle dot net, libc-alpha at sourceware dot org
- Cc: szabolcs dot nagy at arm dot com, Richard Henderson <richard dot henderson at linaro dot org>
- Date: Thu, 2 Aug 2018 11:24:23 -0400
- Subject: Re: [PATCH 0/3] aarch64: Update ld.so for vector abi
- References: <20180801222347.18903-1-rth@twiddle.net>
On 08/01/2018 06:23 PM, rth@twiddle.net wrote:
> From: Richard Henderson <richard.henderson@linaro.org>
>
> There is a new calling convention defined for vectorized functions [1].
Correct me if I'm wrong.
(a) We have a lot of stuff to save/restore in SVE.
(b) It appears Szabolcs really wants to avoid the PLT entirely with the
new vector procedure call standard, since this avoids ever
having to save/restore the large amounts of register data.
(b.1) This assumes that saving and restoring SVE state has serious
negative performance consequences, both in userspace and in kernel
save/restore for context switches.
(c) A PLT generally has only one kind of save/restore ABI that it follows,
and it pessimistically follows the worst case to support all possible
calling conventions.
(d) The compiler you are using is generating calls using the new ABI, and
those calls are going through the PLT. Something in the dynamic loader
is also using these registers and corrupting call results; otherwise
you would never have made this patch to fix the problem.
If all this is true, I think this is the wrong solution.
The better solution for aarch64 is:
(1) All new-style SVE calls do *not* go through the PLT by default, but
indirect through the GOT and are always bind-now.
(2) By default ld.so does not save/restore the SVE registers during
lazy binding.
(3) If ld.so detects LD_AUDIT in use, or BIND_NOW=0, or lazy binding
is being forced, then it flips to PLT save/restore sequences that
do save all the required SVE registers, and routes the GOT entries
to the PLT entries, and we get *slow* lazy binding semantics that
work.
I don't expect you signed up for this, but that's my analysis.
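As an illustration only, the policy in (1)-(3) can be sketched roughly as
below. None of these names (struct loader_config, choose_resolver_mode,
vector_call_binds_now) exist in glibc; they are hypothetical, and the
real decision would have to live in the dynamic loader's startup and
relocation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; nothing below is a glibc interface.  */

/* How the resolver trampoline treats vector state.  */
enum resolver_mode
{
  RESOLVER_FAST,  /* Default: SVE registers are not saved/restored.  */
  RESOLVER_SLOW   /* Save/restore every register the vector ABI requires.  */
};

/* Inputs the loader is assumed to collect at startup.  */
struct loader_config
{
  bool audit_in_use;  /* LD_AUDIT modules are loaded.  */
  bool lazy_forced;   /* Lazy binding was explicitly requested.  */
};

/* Point (3): take the slow path only when auditing or forced lazy
   binding is active; otherwise point (2) applies.  */
enum resolver_mode
choose_resolver_mode (const struct loader_config *cfg)
{
  assert (cfg != NULL);
  if (cfg->audit_in_use || cfg->lazy_forced)
    return RESOLVER_SLOW;
  return RESOLVER_FAST;
}

/* Point (1): vector-ABI calls go through the GOT and bind at load time,
   unless the slow path is in effect, in which case the GOT entry is
   routed to a PLT entry and resolved lazily.  */
bool
vector_call_binds_now (const struct loader_config *cfg)
{
  return choose_resolver_mode (cfg) == RESOLVER_FAST;
}
```

The point of splitting it this way is that the common case never pays
for the SVE save/restore, and only the audit or forced-lazy cases take
the slow trampoline.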
> I have *not* attempted to extend the <bits/link.h> interface for
> the new ABI. This should be done with more discussion on list.
> I have instead simply saved and restored registers as the abi
> requires, so that the actual callee gets the correct data.
We *should* adjust bits/link.h at the same time and extend it like
we did for x86_64. LD_AUDIT should work.
--
Cheers,
Carlos.