[PATCH] riscv: Resolve symbols directly for symbols with STO_RISCV_VARIANT_CC.

Jessica Clarke jrtc27@jrtc27.com
Wed Aug 11 22:51:51 GMT 2021


On 11 Aug 2021, at 23:11, Palmer Dabbelt <palmer@dabbelt.com> wrote:
> 
> On Sun, 08 Aug 2021 20:47:07 PDT (-0700), kai.wang@sifive.com wrote:
>> In some cases, we do not want to go through the resolver for function
>> calls. For example, functions with vector arguments will use vector
>> registers to pass arguments. In the resolver, we do not save/restore the
>> vector argument registers for lazy binding efficiency. To avoid ruining
>> the vector arguments, functions with vector arguments will not go
>> through the resolver.
>> 
>> To achieve the goal, we will annotate the function symbols with
>> STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
>> section. In the first pass on PLT relocations, we do not set up to call
>> _dl_runtime_resolve. Instead, we resolve the functions directly.
>> 
>> The related discussion could be found in
>> https://github.com/riscv/riscv-elf-psabi-doc/pull/190.
> 
> I'm trying to stay away from the foundation stuff these days, but from reading that spec it doesn't look like the variant CC has anything to do specifically with the V extension but instead is about allowing for arbitrary calling conventions for X and F registers as well.  Handling non-standard X-register calling conventions is really a whole different beast than handling non-standard F or V register calling conventions.  It's not crazy to allow for these, but lumping them all into a single bit just makes this unnecessarily difficult.

Whilst increased granularity would be nice to have, the reality is that st_other is an 8-bit field, 2 of which are currently allocated by the gABI, with proposals to grow that to 3, so there are only 5 bits to play with, thus anything more than just a single bit for this needs an extremely good justification beyond “it would be less difficult to use”, unfortunately. This bit is sufficient to address the issue with the V calling convention, and is also sufficient for any other non-standard calling convention, whether it involves X registers, F registers or something that hasn’t even been thought up yet.

The second reason why it’s done this way is I have a strong belief that RISC-V should not reinvent the wheel except for problems unique to RISC-V. AArch64 faced exactly the same problem with SVE a few years ago and this is the exact same solution they use and is deployed everywhere, so by following their lead we can be confident it works and reuse existing implementation code for the various tools and runtimes affected.

> In theory we can handle arbitrary F or V register calling conventions with the standard resolver, we just need to ensure we deal with save/restore of those registers.  IMO the right way to go is to just ban F and V register use from the resolver, as there's not much of a reason for them to be in there (none for F, and you probably don't want to spin up the V registers).  That just requires sorting out the build system, and would still allow us to have lazy binding for the variants.  At a bare minimum we can support arbitrary V register calling conventions for free until we start using the V registers in glibc, which is likely quite a way off.

It’s a nice idea in theory, but in practice it doesn’t work, the resolver path is far too complicated for that. Run-time linkers like to call into libc for various things (e.g. on FreeBSD, if you are in a multi-threaded program, all the locking and unlocking guarding the shared data structures ends up calling into libthr via hooks, and in glibc similarly it uses libpthread’s real mutex functions), so you end up needing to make sure those parts of your (extended) libc are also free from vectors and floats (including any auto-vectorisation). But the real kicker is IFUNCs; with lazy binding, the IFUNC resolver isn’t called until you first call the function, and you have no clue what the resolver is going to do other than that it’ll follow the ABI (we should specify that, but IFUNCs as a whole are underspecified in the psABI, we don’t currently document what the arguments are). That means the resolver is free to use F and V registers. Yes, all of these can be individually addressed via various extremely-targeted means, but experience suggests that it will break sooner or later. As a simple example, none of the libc functions called by the run-time linker could ever use memcpy, be it explicit or implicit (compiler-inserted instead of a large inline struct assignment), since that will eventually be vectorised, even if you compiled the directly-called libc functions themselves without V.

> Handling arbitrary X-register calling conventions is a lot tricker: for the fully arbitrary case we can't even have PLT entries, for example.  Is there actually a use case for these?  If so we'll have to support it, it just seems odd that one would care enough about the X ABI to want a different one while still being able to deal with the overhead of a dynamic call.

You’re correct that as it stands PLT entries are technically broken by this. We should fix that so even VARAINT_CC functions can’t use those registers as argument registers (AArch64 has that implicitly as it has IP0 and IP1 as reserved registers for PLTs and veneers, but since we do linker relaxation rather than range-extending with veneers we haven’t yet needed to explicitly reserve the PLT registers).

Jess

> So I'm not opposed to doing something like this, we just need some way to make sure it's actually solving a problem.
> 
>> ---
>> elf/elf.h                    |  7 +++++++
>> sysdeps/riscv/dl-dtprocnum.h | 21 +++++++++++++++++++++
>> sysdeps/riscv/dl-machine.h   | 26 ++++++++++++++++++++++++++
>> 3 files changed, 54 insertions(+)
>> create mode 100644 sysdeps/riscv/dl-dtprocnum.h
>> 
>> diff --git a/elf/elf.h b/elf/elf.h
>> index 4738dfa..0de29bf 100644
>> --- a/elf/elf.h
>> +++ b/elf/elf.h
>> @@ -3889,6 +3889,13 @@ enum
>> 
>> #define R_TILEGX_NUM		130
>> 
>> +/* RISC-V specific values for the Dyn d_tag field.  */
>> +#define DT_RISCV_VARIANT_CC	(DT_LOPROC + 1)
>> +#define DT_RISCV_NUM		2
>> +
>> +/* RISC-V specific values for the st_other field.  */
>> +#define STO_RISCV_VARIANT_CC	0x80
>> +
>> /* RISC-V ELF Flags */
>> #define EF_RISCV_RVC 			0x0001
>> #define EF_RISCV_FLOAT_ABI 		0x0006
>> diff --git a/sysdeps/riscv/dl-dtprocnum.h b/sysdeps/riscv/dl-dtprocnum.h
>> new file mode 100644
>> index 0000000..f189fd7
>> --- /dev/null
>> +++ b/sysdeps/riscv/dl-dtprocnum.h
>> @@ -0,0 +1,21 @@
>> +/* Configuration of lookup functions.  RISC-V version.
>> +   Copyright (C) 2019-2021 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library.  If not, see
>> +   <https://www.gnu.org/licenses/>.  */
>> +
>> +/* Number of extra dynamic section entries for this architecture.  By
>> +   default there are none.  */
>> +#define DT_THISPROCNUM	DT_RISCV_NUM
>> diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
>> index aedf69f..6488483 100644
>> --- a/sysdeps/riscv/dl-machine.h
>> +++ b/sysdeps/riscv/dl-machine.h
>> @@ -54,6 +54,9 @@
>> #define ELF_MACHINE_NO_REL 1
>> #define ELF_MACHINE_NO_RELA 0
>> 
>> +/* Translate a processor specific dynamic tag to the index in l_info array.  */
>> +#define DT_RISCV(x) (DT_RISCV_##x - DT_LOPROC + DT_NUM)
>> +
>> /* Return nonzero iff ELF header is compatible with the running host.  */
>> static inline int __attribute_used__
>> elf_machine_matches_host (const ElfW(Ehdr) *ehdr)
>> @@ -299,6 +302,29 @@ elf_machine_lazy_rel (struct link_map *map, ElfW(Addr) l_addr,
>>   /* Check for unexpected PLT reloc type.  */
>>   if (__glibc_likely (r_type == R_RISCV_JUMP_SLOT))
>>     {
>> +      if (__glibc_unlikely (map->l_info[DT_RISCV (VARIANT_CC)] != NULL))
>> +	{
>> +	  /* Check the symbol table for variant CC symbols.  */
>> +	  const Elf_Symndx symndx = ELFW(R_SYM) (reloc->r_info);
>> +	  const ElfW(Sym) *symtab =
>> +	    (const void *)D_PTR (map, l_info[DT_SYMTAB]);
>> +	  const ElfW(Sym) *sym = &symtab[symndx];
>> +	  if (__glibc_unlikely (sym->st_other & STO_RISCV_VARIANT_CC))
>> +	    {
>> +	      /* Avoid lazy resolution of variant CC symbols.  */
>> +	      const struct r_found_version *version = NULL;
>> +	      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
>> +		{
>> +		  const ElfW(Half) *vernum =
>> +		    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
>> +		  version = &map->l_versions[vernum[symndx] & 0x7fff];
>> +		}
>> +	      elf_machine_rela (map, reloc, sym, version, reloc_addr,
>> +				skip_ifunc);
>> +	      return;
>> +	    }
>> +	}
>> +
>>       if (__glibc_unlikely (map->l_mach.plt == 0))
>> 	{
>> 	  if (l_addr)



More information about the Libc-alpha mailing list