Bug 22396 - x86-64: siglongjmp does not restore PKRU register (pkey signal handler incompatibility)
Summary: x86-64: siglongjmp does not restore PKRU register (pkey signal handler incomp...
Status: NEW
Alias: None
Product: glibc
Classification: Unclassified
Component: libc (show other bugs)
Version: 2.26
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-04 14:58 UTC by Florian Weimer
Modified: 2023-07-02 16:11 UTC (History)
6 users (show)

See Also:
Host:
Target: x86-64
Build:
Last reconfirmed:
fweimer: security-


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Florian Weimer 2017-11-04 14:58:43 UTC
When a signal handler is left using siglongjmp, the PKRU value is not restored.  This is problematic because the signal handler has access to all memory protection keys disabled.

The application may not be able to save and restore the protection bits for all keys because the kernel API does not actually specify that the set of keys is a small, fixed set.
Comment 1 Carlos O'Donell 2017-11-05 06:07:34 UTC
(In reply to Florian Weimer from comment #0)
> When a signal handler is left using siglongjmp, the PKRU value is not
> restored.  This is problematic because the signal handler has access to all
> memory protection keys disabled.

The worst case is that the application fails because it can no longer access memory that it previously could?
 
> The application may not be able to save and restore the protection bits for
> all keys because the kernel API does not actually specify that the set of
> keys is a small, fixed set.

Correct, it could be an arbitrary number of keys.

I think this is simply a case where we could just specify that siglongjmp out of a signal handler leaves you in the most restrictive scenario possible, without access to any memory protected by the keys.

I don't see any easy technical solution to this problem.
Comment 2 Florian Weimer 2017-11-05 07:37:27 UTC
(In reply to Carlos O'Donell from comment #1)
> (In reply to Florian Weimer from comment #0)
> > When a signal handler is left using siglongjmp, the PKRU value is not
> > restored.  This is problematic because the signal handler has access to all
> > memory protection keys disabled.
> 
> The worst case is that the application fails because it can no longer access
> memory that it previously could?

Correct, and the signal handler could be provided by a completely different library which knows nothing about memory protection keys.
Comment 3 Florian Weimer 2017-11-08 06:10:27 UTC
I think HJ is freeing unused parts of jmpbuf, which we could use to store these bits.  Of course, it won't help if x86-64 gets many more access right bits.

So I think we need to change the kernel not to reset PKRU to the initial value (which could either be all-access or no-access, depending on kernel configuration).
Comment 4 Dave Hansen 2017-11-09 22:43:19 UTC
It should also be noted that this issue also applies to MPX which has a lot more state than protection keys.  (BNDSTATUS (8) + BNDCFGU (8) + BNDREGS0-3(4*16)).  The ABI for BNDREGS probably saves us here at least partly because it's pretty arguable that the registers are clobber at siglongjmp.  But BNDCFGU which turns the feature on/off is a problem.

Back to PK... The manpage (http://man7.org/linux/man-pages/man7/pkeys.7.html) could use some additional clarity here, but it does mention the signal behavior, and also says:

    The rights of any interrupted context are
    restored when the signal handler returns.

We could clarify that we truly mean sys_sigreturn() and that other mechanisms don't count as "returning".

The kernel _could_ detect when someone unblocks signals from inside a signal handler, but without doing a sys_sigreturn().  We could theoretically take some action there.  Does that do us any good?
Comment 5 Thiago Jung Bauermann 2017-11-10 22:16:41 UTC
Also, the signal(7) man page mentions that "a  signal  handler  function  must be very careful", and provides a white list of functions which it can safely call. Calling an unsafe function causes undefined behavior. Neither longjmp() nor siglongjmp() are in the white list.

Given the above, is it reasonable for a program to call {sig,}longjmp() from a signal handler and expect it to work?
Comment 6 Florian Weimer 2017-11-11 12:02:01 UTC
(In reply to Thiago Jung Bauermann from comment #5)
> Also, the signal(7) man page mentions that "a  signal  handler  function 
> must be very careful", and provides a white list of functions which it can
> safely call. Calling an unsafe function causes undefined behavior. Neither
> longjmp() nor siglongjmp() are in the white list.
> 
> Given the above, is it reasonable for a program to call {sig,}longjmp() from
> a signal handler and expect it to work?

Yes, the most stringent restrictions only apply to asynchronous signals.  Synchronous signals, such as SIGFPE due to integer division by zero, SIGBUS due to access to the truncated part of a file mapping, or SIGSEGV due to a null pointer dereference are delivered synchronously to the thread in question.  The same can be true for signals sent explicitly.  In these cases, we must support calling siglongjmp from a signal handler because it is an idiom which is often employed by non-C language runtimes.
Comment 7 Florian Weimer 2017-11-11 12:09:56 UTC
(In reply to Dave Hansen from comment #4)
> Back to PK... The manpage
> (http://man7.org/linux/man-pages/man7/pkeys.7.html) could use some
> additional clarity here, but it does mention the signal behavior, and also
> says:
> 
>     The rights of any interrupted context are
>     restored when the signal handler returns.
> 
> We could clarify that we truly mean sys_sigreturn() and that other
> mechanisms don't count as "returning".
> 
> The kernel _could_ detect when someone unblocks signals from inside a signal
> handler, but without doing a sys_sigreturn().  We could theoretically take
> some action there.  Does that do us any good?

Maybe we are approaching this from the wrong angle.  The main problem I see are pages which are read in many places, but are read-only most of the time.  Here, the delivery of a signal and a return through siglongjmp would revoke read access as well (with the kernel defaults).  If we could tell the kernel at pkey_alloc time that the access rights specified are the initial access rights for signal handlers (overriding the system default), then the issue would quite benign because the signal handle restores the access rights to their current values, which is essentially a NOP.  Only for regions which enhance their privileges using a temporary PKRU update, this would make a difference, but they would also have to have a sigsetjmp landing site, so they are obviously aware of the complexities.

For the larger problem, maybe we should add interface which merge the jmpbuf context with the signal handler context before returning to the sigsetjmp site.  This will at least help new programs.

But I think for MPK, the pkey_alloc flag would be sufficient for glibc's own needs.
Comment 8 Andreas Schwab 2017-11-11 12:19:46 UTC
It's not a matter of async vs sync signals, only the property of the interrupted function matters:

"Although longjmp() is an async-signal-safe function, if it is invoked from a signal handler which interrupted a non-async-signal-safe function or equivalent (such as the processing equivalent to exit() performed after a return from the initial call to main()), the behavior of any subsequent call to a non-async-signal-safe function or equivalent is undefined."

Thus a handler for a synchronously-generated signal that interrupted a non-async-signal-safe function has the same restrictions.