PING: V7 [PATCH] sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305]

Rich Felker <dalias@libc.org>
Thu Nov 19 17:29:17 GMT 2020


On Thu, Nov 19, 2020 at 04:37:45PM +0000, Dave Martin wrote:
> On Thu, Nov 19, 2020 at 02:59:34PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> > The 11/18/2020 18:09, Dave Martin via Libc-alpha wrote:
> > > On Wed, Nov 18, 2020 at 06:35:00PM +0100, Florian Weimer wrote:
> > > > * Dave Martin:
> > > > > I have some thoughts on what a better interface might look like --
> > > > > basically separating the signal ucontext_t type from the setcontext()/
> > > > > getcontext() etc. type, and providing accessors for the architectural
> > > > > register state rather than just having a fixed struct definition for
> > > > > mcontext_t.
> > > > >
> > > > > But, there also may not be a lot of appetite for such a change, and
> > > > > I can't see how it could be backwards compatible.
> > > > >
> > > > > I can elaborate if people think it's worth discussing.
> > > > 
> > > > I think Rich Felker wants to copy signal contexts around to implement
> > > > critical sections that can't be interrupted by a signal handler, I
> > > > think that would need this fully fixed.
> > > 
> > > Is rseq a more suitable way to do that sort of thing on new-ish Linux?
> > > I guess a fallback may be needed for older / other kernels though.
> > 
> > rseq does not help with libc critical sections:
> > 
> > the point is not to restart the critical section
> > (which would require no side effect or mechanisms
> > to roll side effects back and that the section is
> > entirely written in asm between begin/end labels
> > so the kernel knows when the section is left),
> > 
> > but to let the critical section with all its side
> > effects complete and delay the signal handler
> > until then. (the slow and easy way to do this is
> > masking signals using syscalls around critical
> > sections, a fast solution needs signal wrapping
> > and saving the sigcontext.)
> > 
> > for example the entire malloc call can be a critical
> > section and an incoming signal delayed until malloc
> > completes. such a solution allows hiding all libc
> > internal inconsistent state from user code so async
> > signal handlers can call all libc apis.
> 
> Isn't this a bit backwards?  This "makes" trivial signal handlers easier
> to write, but this is a bit of a Trojan horse: precisely because signal
> handlers can interrupt things, subtleties abound.  So, while there are
> plenty of naive signal handlers out there, there are far fewer that are
> genuinely trivial -- i.e., free from subtleties.
> 
> In any case, the problem of async signal safety remains: even if libc
> uses internal locks to hide it, library functions in general may not.
> 
> A better approach..

You're missing a lot of the context needed to understand the
motivations here. While the malloc example is interesting and making it
AS-safe as an extension is perhaps nice, the real motivation is places
where we're implementing an interface that is already required or at
least expected to be AS-safe. If such a function needs to access a
shared resource under lock, then presently that lock can never be
taken except with all application signals blocked. Otherwise, it's
possible that the AS-safe function needing the lock runs from a signal
handler that interrupted code that already held the lock. (Note:
recursive locks can't solve this, because a more complex version of
the same deadlock can be recreated with a pair of threads each
interrupted by signal handlers each needing the lock the other holds
in the interrupted code.)

One particularly nasty example is munmap (and mmap or mremap with
MAP_FIXED) which needs to synchronize with removal of robust mutexes
from the pending slot of the unlocking thread's robust_list. (Without
doing this, there are fundamental race conditions whereby async
termination of a process with robust mutexes can corrupt memory-mapped
files.) Presently, this makes it so our munmap, etc. are not AS-safe.
Note that they're not required to be AS-safe by POSIX, but most
programmers assume they are. In order to make them AS-safe, every
pthread_mutex_unlock on a robust mutex would need to mask and unmask
signals, so that the lock that inhibits changes to mappings couldn't
be held arbitrarily long and in deadlocking ways when interrupted by a
signal. But this masking/unmasking would introduce 2 syscalls to a
very hot path that's not supposed to have any.

However, the original motivation for signal wrapping (which is what
enables deferral) is fixing an otherwise unfixable race between execve
and abort. In order to exit via SIGABRT when the signal starts out
with SIG_IGN disposition, abort has to change the disposition, and use
locks to prevent it from being changed back and prevent the changed
disposition from being observable. Morally, execve should take this
lock to prevent passing the changed disposition to a new process image
if it races with abort. However, the lock has to be an AS-safe one
(since abort has to be AS-safe), and execve can't mask signals or the
new mask would be passed to the new process image.

With the proposed wrapping/deferral, we can safely take the abort lock
without masking any signals. There's some additional machinery needed
here in the signal wrapper to make everything come out right, which
I'm glossing over because it's not the point here, but it all works
out and gives you an execve that can't leak internal state of abort.

> would be to have function attributes that identify
> code that may run in signal context and async-signal-safe functions, so
> that the compiler can actually enforce that only reentrant functions are
> called from signal context.
> 
> Finally, if a fault signal is delivered while blocked or ignored it
> kills the process.  So handlers for fault signals raised by the kernel
> still wouldn't be able to call random libc functions: to prevent sudden
> death while in the middle of malloc etc., libc must not mask these
> signals, and wouldn't be safely reentrant while handling them -- thus we
> still have the problem we intended to solve.

This has been raised plenty of times already. The desired outcome is
that the process terminate. This is a feature not a bug. If you really
don't want that, you *could* process synchronous signals (SIGSEGV
raised by invalid access) synchronously rather than deferring them,
possibly after releasing the lock for the critical section, but all
this does is de-harden the code. The *only* way a fault can happen in
this code is for the internal state to be corrupted in a manner in
which no further execution in the process context is safe.

> This is ironic, since these signals are the _only_ signals that must be
> handled using signal handlers.  Other signals can all be accepted by
> other means, such as signalfd.

I think you mean the only signals for which installing a handler is
meaningless because they're only delivered when the program already
has undefined behavior. (Technically you could use them in ways that
can have a meaningfully defined model *from your own code* or even from
library code like memcpy or something if you've created mappings that
are intended to fault, but there is no defined or even reasonably
definable condition under which that access would end up taking place
in one of the critical sections under consideration here.)

> The other reason to use signal handlers is to minimise response latency
> for asynchronous signals.  Masking signals in order to bulletproof
> code that the signal handler probably isn't going to use anyway would
> interfere with this goal.

Yes, deferring signals is a matter of trading very small bounded
amounts of latency to get rid of unbounded latency and possible
deadlock from code that holds a lock being interrupted by a signal
handler. Note that the equivalent trade (with much, much larger
latency) already happens anywhere you have to mask signals temporarily
via a sigprocmask.

Rich

