[PATCH v6 5/5] riscv: Add and use alignment-ignorant memcpy
Evan Green
evan@rivosinc.com
Mon Aug 7 22:10:51 GMT 2023
On Thu, Aug 3, 2023 at 3:30 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/3/23 11:42, Evan Green wrote:
> > On Thu, Aug 3, 2023 at 10:50 AM Richard Henderson
> > <richard.henderson@linaro.org> wrote:
> >> Outside libc something is required.
> >>
> >> An extra parameter to ifunc is surprising though, and clearly not ideal per the extra
> >> hoops above. I would hope for something with hidden visibility in libc_nonshared.a that
> >> could always be called directly.
> >
> > My previous spin took that approach, defining a
> > __riscv_hwprobe_early() in libc_nonshared that could route to the real
> > function if available, or make the syscall directly if not. But that
> > approach had the drawback that ifunc users couldn't take advantage of
> > the vDSO, and then all users had to comprehend the difference between
> > __riscv_hwprobe() and __riscv_hwprobe_early().
>
> I would define __riscv_hwprobe such that it could take advantage of the vDSO once
> initialization reaches a certain point, but cope with being run earlier than that point by
> falling back to the syscall.
>
> That constrains the implementation, I guess, in that it can't set errno, but just
> returning the negative errno from the syscall seems fine.
>
> It might be tricky to get a reference to GLRO(dl_vdso_riscv_hwprobe) very early, but I
> would hope that some application of __attribute__((weak)) might correctly get you a NULL
> prior to full relocations being complete.
Right, this is what we had in the previous iteration of this series,
and it did work ok. But it wasn't as good since it meant ifunc
selectors always got stuck in the null/fallback case and were forced
to make the syscall. With this mechanism they get to take advantage of
the vDSO.
>
>
> > In contrast, IMO this approach is much nicer. Ifunc writers are
> > already used to getting hwcap info via a parameter. Adding this second
> > parameter, which also provides hwcap-like things, seems like a natural
> > extension. I didn't quite follow what you meant by the "extra hoops
> > above".
>
> The check for null function pointer, for sure. But also consider how __riscv_hwprobe is
> going to be used.
>
> It might be worth defining some helper functions for probing a single key or a single
> field. E.g.
>
> uint64_t __riscv_hwprobe_one_key(int64_t key, unsigned int flags)
> {
> struct riscv_hwprobe pair = { .key = key };
> int err = __riscv_hwprobe(&pair, 1, 0, NULL, flags);
> if (err)
> return err;
> if (pair.key == -1)
> return -ENOENT;
> return pair.value;
> }
>
> This implementation requires that no future hwprobe key define a value which as a valid
> value in the errno range (or better, bit 63 unused). Alternately, or additionally:
>
> bool __riscv_hwprobe_one_mask(int64_t key, uint64_t mask, uint64_t val, int flags)
> {
> struct riscv_hwprobe pair = { .key = key };
> return (__riscv_hwprobe(&pair, 1, 0, NULL, flags) == 0
> && pair.key != -1
> && (pair.value & mask) == val);
> }
>
> These yield either
>
> int64_t v = __riscv_hwprobe_one_key(CPUPERF_0, 0);
> if (v >= 0 && (v & MISALIGNED_MASK) == MISALIGNED_FAST)
> return __memcpy_noalignment;
> return __memcpy_generic;
>
> or
>
> if (__riscv_hwprobe_one_mask(CPUPERF_0, MISALIGNED_MASK, MISALIGNED_FAST, 0))
> return __memcpy_noalignment;
> return __memcpy_generic;
>
> which to my mind looks much better for a pattern you'll be replicating so very many times
> across all of the ifunc implementations in the system.
Ah, I see. I could make a static inline function in the header that
looks something like this (mangled by gmail, sorry):
/* Helper function usable from ifunc selectors that probes a single key. */
static inline int __riscv_hwprobe_one(__riscv_hwprobe_t hwprobe_func,
signed long long int key,
unsigned long long int *value)
{
struct riscv_hwprobe pair;
int rc;
if (!hwprobe_func)
return -ENOSYS;
pair.key = key;
rc = hwprobe_func(&pair, 1, 0, NULL, 0);
if (rc) {
return rc;
}
if (pair.key < 0) {
return -ENOENT;
}
*value = pair.value;
return 0;
}
The ifunc selector would then be significantly cleaned up, looking
something like:
if (__riscv_hwprobe_one(hwprobe_func, RISCV_HWPROBE_KEY_CPUPERF_0, &value))
return __memcpy_generic;
if (value & RISCV_HWPROBE_MISALIGNED_MASK) == RISCV_HWPROBE_MISALIGNED_FAST)
return __memcpy_noalignment;
-Evan
More information about the Libc-alpha
mailing list