[RFC PATCH 00/10] arm64/sve: Add userspace vector length control API

Dave Martin Dave.Martin@arm.com
Thu Jan 12 11:26:00 GMT 2017

This is an add-on series for the Scalable Vector Extension (SVE) core
patches [1], adding an interface to allow userspace tasks to control
what vector length they use for SVE instructions.

(For an architectural overview of SVE, and an explanation of what a
"vector length" is, see [2].)

The amount of SVE register state depends on the vector length, so using
large vector lengths with SVE can require the userspace signal frame to
grow.  This leads to some ABI impacts in common with other arches that
may have to grow their signal frame.

In this series I do not dictate any particular policy for the SVE vector
length: the kernel provides a default, but userspace can set the vector
length as it likes, provided the hardware supports the chosen length.

For example, userspace may pick a length that the images being loaded
have been validated against or optimised for.


The API proposed here consists of prctl() calls to allow a task to
set/query its vector length and related control flags, and corresponding
ptrace extensions to allow a debugger to do the same for a traced task:


ptrace() NT_ARM_SVE regset:
 * user_sve_header.flags & SVE_PT_VL_*
 * user_sve_header.vl

(I follow the existing convention of not including the arch name in
prctl names).
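
As a rough illustration of the prctl() half of this interface, here is a
minimal sketch of how a task might query and set its own vector length.
The constant names and values (PR_SVE_SET_VL, PR_SVE_GET_VL,
PR_SVE_VL_LEN_MASK) and the return convention are assumptions taken from
the interface as it was later merged into mainline Linux; the numbering
and flags in this RFC may differ.  The helper names are hypothetical.

```c
#include <sys/prctl.h>

/*
 * Fallbacks for older headers.  These are the mainline Linux values,
 * which is an assumption here: this RFC may use different numbers.
 */
#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL 50
#endif
#ifndef PR_SVE_GET_VL
#define PR_SVE_GET_VL 51
#endif
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK 0xffff
#endif

/*
 * Hypothetical helper: extract the vector length in bytes from a
 * non-negative PR_SVE_{SET,GET}_VL return value.  Negative (error)
 * values are passed through unchanged.
 */
long sve_vl_from_ret(long ret)
{
	return ret < 0 ? ret : (ret & PR_SVE_VL_LEN_MASK);
}

/* Current vector length in bytes, or -1 if the query fails. */
long sve_get_vl(void)
{
	return sve_vl_from_ret(prctl(PR_SVE_GET_VL));
}

/*
 * Request a vector length in bytes.  Returns the length reported by
 * the kernel afterwards, or -1 on error.
 */
long sve_set_vl(unsigned long vl)
{
	return sve_vl_from_ret(prctl(PR_SVE_SET_VL, vl & PR_SVE_VL_LEN_MASK));
}
```

On a kernel or CPU without SVE support the query is expected to fail, so
callers should check for -1 rather than assume a length.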

The context switch logic is also extended to set the correct vector
length when scheduling a task in, since the vector length may now differ
between tasks.

The expected users of this API are libc startup, dynamic linker and
runtime environment plumbing code.  I don't expect ordinary user code to
change its own vector length on-the-fly, partly because it is generally
The Wrong Thing To Do with respect to the SVE programming model, and
partly because of ABI subtleties which make it difficult to do this.

ABI Impact

The current arm64 signal frame size is not sufficient to save all SVE
state for larger vector lengths.

This won't affect existing binary distros, since the signal frame is not
extended unless some SVE instructions are executed by the user task.

However, non-SVE code executing in the same process as SVE-aware code
may start to see the kernel using more than MINSIGSTKSZ bytes of stack
to deliver a signal, which may lead to stack overruns.  Other ABI
breakages are also possible if we were to simply increase the
MINSIGSTKSZ #define.  SVE-aware code will need to move to a new
mechanism to discover the signal frame size: perhaps a new prctl() (not
implemented in this series).
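
To make the size problem concrete, the following sketch computes the raw
amount of SVE register state for a given vector length.  It relies only
on the architectural register file (32 Z registers of VL bytes each, 16 P
registers and the FFR of VL/8 bytes each) and on the arm64 MINSIGSTKSZ
value of 5120 bytes.  The helper names are hypothetical, and the figure
deliberately excludes frame headers, padding and all non-SVE context, so
it is a lower bound rather than the real signal frame size.

```c
#include <stddef.h>

#define N_ZREGS 32	/* Z0-Z31, VL bytes each */
#define N_PREGS 16	/* P0-P15, VL/8 bytes each */

/* arm64 MINSIGSTKSZ, hard-coded so the sketch builds on any host. */
#define ARM64_MINSIGSTKSZ 5120

/*
 * Hypothetical helper: raw bytes of SVE register state for a vector
 * length given in bytes.  The "+ 1" accounts for the FFR, which has
 * the same size as a P register.
 */
size_t sve_state_bytes(size_t vl)
{
	return N_ZREGS * vl + (N_PREGS + 1) * (vl / 8);
}

/* Nonzero if the raw SVE state alone is larger than MINSIGSTKSZ. */
int sve_state_exceeds_minsigstksz(size_t vl)
{
	return sve_state_bytes(vl) > ARM64_MINSIGSTKSZ;
}
```

For a 512-bit vector length (64 bytes) this gives 2184 bytes; for the
architectural maximum of 2048 bits (256 bytes) it gives 8736 bytes, which
is already more than MINSIGSTKSZ before anything else in the frame is
counted.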

As a temporary workaround, I added a Kconfig option in [1] to clamp the
vector length to a safe maximum that hides this effect, but this was
only intended as a short-term kludge.

This series removes the Kconfig kludge and introduces a new runtime
control:

# echo <vector length in bytes> >/proc/cpu/sve_default_vector_length

will now set the default vector length for newly-exec'd processes.  This
is initialised to the ABI-safe value 512 at boot (or the maximum value
supported by the hardware, if smaller).  Administrators / distro
maintainers / developers can set this to something different in boot
scripts if they are comfortable doing so, or to see what happens.
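
A program that wants to know the configured default could read the same
file back.  The sketch below assumes the file holds a single decimal
integer, which this cover letter does not state explicitly; the path is
the one given above and the helper name is hypothetical.

```c
#include <stdio.h>

/* Path as given in this cover letter. */
#define SVE_DEFAULT_VL_PATH "/proc/cpu/sve_default_vector_length"

/*
 * Hypothetical helper: read one decimal integer from the given file.
 * Returns the value, or -1 if the file cannot be opened or parsed
 * (for example on a kernel without this series applied).
 */
long read_default_vl(const char *path)
{
	FILE *f = fopen(path, "r");
	long vl = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &vl) != 1)
		vl = -1;
	fclose(f);
	return vl;
}
```

A caller would pass SVE_DEFAULT_VL_PATH and treat -1 as "no default
available" rather than as a vector length.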

We _could_ increase the kernel default in the future when and if we are
satisfied that the change is sufficiently low-impact.

User tasks can always override the default via prctl(): the logic is
that non-SVE-aware code doesn't know how to change the vector length,
and so won't do that anyway.  SVE-aware code is presumed to understand
the consequences.

The vector length can be made inheritable (allowing implementation of
taskset-like tools, or running a testsuite with a particular vector
length) or not (for general-purpose processes; in which case the vector
length is reset to the default across exec).
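
The taskset-like use case might look roughly like the sketch below: set
an inheritable vector length, then exec the target program.  The
PR_SVE_SET_VL and PR_SVE_VL_INHERIT names and values are assumptions
taken from the interface as later merged into mainline Linux and may not
match this RFC.  The helper names are hypothetical, and the range check
uses the architectural limits (multiples of 16 bytes, up to 256 bytes)
rather than what any particular CPU supports.

```c
#include <sys/prctl.h>
#include <unistd.h>

/* Mainline Linux values; an assumption with respect to this RFC. */
#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL 50
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT (1 << 17)
#endif

/* Architectural limits: a multiple of 16 bytes, from 16 to 256. */
int vl_is_plausible(unsigned long vl)
{
	return vl >= 16 && vl <= 256 && vl % 16 == 0;
}

/* Combine a length in bytes with the inherit flag. */
unsigned long vl_inherit_arg(unsigned long vl)
{
	return vl | PR_SVE_VL_INHERIT;
}

/*
 * Set an inheritable vector length, then exec argv[0].
 * Does not return on success; returns -1 if the length is rejected,
 * the prctl() fails, or the exec fails.
 */
int run_with_vl(unsigned long vl, char *const argv[])
{
	if (!vl_is_plausible(vl))
		return -1;
	if (prctl(PR_SVE_SET_VL, vl_inherit_arg(vl)) < 0)
		return -1;
	execvp(argv[0], argv);
	return -1;
}
```

Without the inherit flag, the same call would only affect the calling
process up to the exec, after which the default would apply again.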

[1] arm64: Scalable Vector Extension core support

[2] Technology Update: The Scalable Vector Extension (SVE) for the ARMv8-A architecture

Note: The size of an SVE vector register (the "vector length") is
choosable per hardware implementation, and the ISA allows software to be
coded independently of the actual vector length in use.  The vector
length can also be selected explicitly by software within the limits
supported by the hardware -- this is expected to be useful in some
situations.  This series exposes this control to userspace.

Dave Martin (10):
  prctl: Add skeleton for PR_SVE_{SET,GET}_VL controls
  arm64/sve: Track vector length for each task
  arm64/sve: Set CPU vector length to match current task
  arm64/sve: Factor out clearing of tasks' SVE regs
  arm64/sve: Wire up vector length control prctl() calls
  arm64/sve: Disallow VL setting for individual threads by default
  arm64/sve: Add vector length inheritance control
  arm64/sve: ptrace: Wire up vector length control and reporting
  arm64/sve: Enable default vector length control via procfs
  Revert "arm64/sve: Limit vector length to 512 bits by default"

 arch/arm64/Kconfig                    |  35 ----
 arch/arm64/include/asm/fpsimd.h       |  25 ++-
 arch/arm64/include/asm/fpsimdmacros.h |   7 +-
 arch/arm64/include/asm/processor.h    |  12 ++
 arch/arm64/include/uapi/asm/ptrace.h  |   5 +
 arch/arm64/kernel/entry-fpsimd.S      |   2 +-
 arch/arm64/kernel/fpsimd.c            | 334 +++++++++++++++++++++++++++++++---
 arch/arm64/kernel/ptrace.c            |  27 +--
 arch/arm64/kernel/signal.c            |  15 +-
 arch/arm64/mm/proc.S                  |   5 -
 include/uapi/linux/prctl.h            |  10 +
 kernel/sys.c                          |  12 ++
 12 files changed, 407 insertions(+), 82 deletions(-)