x86 CPU features detection for applications (and AMX)

Peter Zijlstra peterz@infradead.org
Mon Jun 28 17:43:47 GMT 2021


On Mon, Jun 28, 2021 at 09:13:29AM -0700, Thiago Macieira wrote:

> Anyway, what's the current thinking on what the arch_prctl() should be? Is 
> that a per-thread state or will it affect the entire process group? And is it 
> a sticky functionality, or are we talking about ref/deref?

I didn't follow the initial discussion too closely, so I might be
getting this wrong; if so, I'm hoping Thomas and/or Andy will correct
me.

But I think the proposal was per process. Having this per thread would
be really unfortunate IMO.

> Maybe in order to answer that, we need to understand what the worst case 
> scenario we need to support is. What are they?
> 
> 1) alt-stack signal handlers, usually for crashing signals (to catch a stack 
> overflow)
> 
> 2) cooperative user-space task schedulers, e.g. coroutines
> 
> 3) preemptive user-space task schedulers (I don't know if such software exists 
> or even if it is possible)

I think it's been done; use sigsetmask()/pthread_sigmask() as 'IRQ'
disable, and run a preemption tick off of SIGALRM or something.

> 4) combination of 1 and 3

None of those, I think. The worst case is old executables using
MINSIGSTKSZ and not using the magic signal context at all, just regular
old signals. If you run them on an AVX512-enabled machine, the signal
frame no longer fits in a MINSIGSTKSZ-sized stack, so they overflow it
and cause memory corruption.

AFAICT the only feasible way forward with that is some sysctl which
disables AVX512 by default, plus the prctl() to opt in, plus some unsafe
wrapper that enables AVX512 for select 'legacy' programs for as long as
they exist :/

That is, binaries/libraries compiled against a static (and small)
MINSIGSTKSZ are the enemy. Which brings us to:

> 5) #4, in which each part comes from a separate library with no knowledge 
> of each other, and initialised concurrently in different threads

That's terrible... library init should *NEVER* spawn threads (I know,
don't start).

Anything that does this is basically unfixable, because we can't
guarantee the AMX prctl() gets done before the first thread.

So yes, worst case I suppose...


More information about the Libc-alpha mailing list