[PATCH] AArch64: Check kernel version for SVE ifuncs
Adhemerval Zanella Netto
adhemerval.zanella@linaro.org
Thu Mar 14 13:47:01 GMT 2024
On 14/03/24 05:35, Szabolcs Nagy wrote:
> The 03/13/2024 16:55, Adhemerval Zanella Netto wrote:
>> On 13/03/24 16:25, Szabolcs Nagy wrote:
>>> The 03/13/2024 15:12, Adhemerval Zanella Netto wrote:
>>>> On 13/03/24 11:31, Wilco Dijkstra wrote:
>>>>>
>>>>> Older Linux kernels may disable SVE after certain syscalls. Calling the
>>>>> SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
>>>>> As a result, applications with a high use of syscalls may run slower with
>>>>> the SVE memcpy. Avoid this by checking the kernel version and enable the
>>>>> SVE-optimized memcpy/memmove ifuncs only on Linux kernel 6.2 or newer.
>>>>>
>>>>> Passes regress, OK for commit?
>>>>
>>>> How does it handle backports? Parsing kernel version is really not ideal
>>>> (and that's why we removed it on previous version), can't kernel provide
>>>> this information through any means (as powerpc does for HTM without
>>>> suspend, PPC_FEATURE2_HTM_NO_SUSPEND) ?
>>>
>>> the check is for performance only, not correctness.
>>>
>>> if we detect a bad kernel as good, that can cause significant
>>> slowdown due to the traps depending on the workload.
>>> (at least this is the claim, i haven't verified, but plausible)
>>>
>>> if we detect a good kernel as bad (e.g. because of kernel
>>> backports), that is acceptable amount of slowdown (the fallback
>>> memcpy is reasonably fast).
>>>
>>> i think the kernel version check is ugly, but in this case it
>>> makes sense: direct detection of the behaviour is hard, the
>>> version check should reliably avoid detecting a bad kernel as
>>> good, and it is better than always assuming a bad kernel.
>>
>> Yes, I understand this. My point it won't be possible to backport this
>> kernel fix to get the performance improvement of SVE routines without
>> hacking the glibc as well. It is not really a blocker, but I would
>> expect kernel to do proper advertise for such performance change that
>> might interfere with ifunc selection. Maybe we can add a tunable to
>> force SVE selection, but I don't have a strong opinion.
>
> right.
>
> i can ask linux devs, if they are happy introducing an easy
> check in future kernels or if the change is easy to backport
> at all or we don't want to encourage that.
>
>>
>> And I don't think __ASSUME_FAST_SVE would work well here, it means it
>> would always detect a good kernel even when running on a older one
>
> why? if it is set based on min supported kernel version then
> running on older kernel is a bug. if it can be overriden by
> a tunable then it is a user error if the tunable is wrong.
>
>> (I am not sure how usual this is). The minimum supported kernel
>> version can work to ensure that this check won't be necessary, but in
>> this case we won't really need this test anyway.
>
> you mean after min version is increased the check can be removed?
> i expect all __ASSUME* based on min linux version works like that
> and it's useful to have the __ASSUME exactly to be able to find
> which code can be removed after a min version increase.
>
> i don't know if distros actually adjust the min version in their
> glibc, i guess that would be risky if it should work in containers,
> so 6.2 min version is probably far in the future.
All major distros I am aware of does not set --enable-kernel, so SVE
only will be selected either someone builds a glibc with
--enable-kernel=6.2 or when we raise the minimum version to 6.2. Not
a deal breaker, but the SVE routines will ended up not being actively
used for a long time.
>
>>
>> The _dl_discover_osversion (removed by b46d250656794e63a2946c481fda)
>> used to check the PT_NOTE, and fallback to uname or /proc/sys/kernel/osrelease.
>> I think we will need at least the osrelease fallback in the case of
>> uname syscall filtering.
>
> i didn't know about the vdso PT_NOTE.
>
> we could check osrelease there i guess. fallback is not critical
> as slightly slower memcpy is not the end of the world, and there
> is a balance how complicated code we want to maintain here.
I agree, maybe uname alone should be suffice.
More information about the Libc-alpha
mailing list