This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Linux: Implement interfaces for memory protection keys
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Tue, 5 Dec 2017 12:06:37 -0200
- Subject: Re: [PATCH] Linux: Implement interfaces for memory protection keys
- Authentication-results: sourceware.org; auth=none
- References: <20171104174948.A79F6400EA2DF@oldenburg.str.redhat.com> <ddcb66a6-b0d8-a820-a12d-970fa2021b37@linaro.org> <5f7602dd-daf1-36ec-3040-19cfd17f8fc2@redhat.com> <521ff046-f859-e8f4-3e75-f2a6e0d527b0@linaro.org> <878tfjm5ds.fsf@mid.deneb.enyo.de> <8c486d33-25ea-ee3e-113e-e41a777727e0@linaro.org> <74c45fda-bc2d-5bb8-4c39-175fbae236c6@redhat.com> <0a6d1b29-d490-4d77-de57-09273ba362db@linaro.org> <913fead4-1aad-334d-c231-30a431c582ee@redhat.com>
On 01/12/2017 14:30, Florian Weimer wrote:
> On 11/28/2017 08:01 PM, Adhemerval Zanella wrote:
>
>>>> My understanding from documentation it key 0 is default one and it
>>>> is assigned to any memory regions (through mmap) which has not been
>>>> explicitly assigned through pkey_mprotect. My questioning whether we
>>>> should filter out key 0 is to:
>>>>
>>>> 1. Issue an error when application tries to use a key which has
>>>> not been previously allocated by pkey_alloc (since key 0 is
>>>> pre-allocated by the kernel).
>>>>
>>>> 2. Avoid changing default masking bits for pages which has not
>>>> been explicitly assigned through a pkey_mprotect.
>>>>
>>>> However this does not prevent uses from issuing a RDPKRU/WRPKRU
>>>> directly.
>>>
>>> There is no way to probe allocated keys, so I don't think we can implement more extensive checking without further kernel interfaces.
>>
>> That was my understanding as well, I was just checking if I missed something
>> reading pkey documentation (from both architecture and kernel level). I
>> noticed there is some idea of extending the sysfs to provide more information
>> about the system supported keys [1], but even through it also does not
>> really provide the information of the allocated keys.
>>
>> [1] https://marc.info/?l=linux-kernel&m=150995950219669&w=2
>
> I've since learned that the PROT_EXEC key is dynamically allocated, too. It's not always 1. So this means that we can't consistently reject keys used by the kernel.
Right, I noted it also checking on kernel tools/testing/selftests/x86/protection_keys.c
test.
>
>>> +* Support for memory protection keys was added. The <sys/mman.h> header now
>>> + declares the functions pkey_alloc, pkey_free, pkey_memprotect, pkey_set,
>>> + pkey_get.
>>> +
>> > Typo here (s/memprotect/mprotect).
>
> Thanks, fixed.
>
>>> +New threads and subprocesses inherit the access rights of the current
>>> +thread. If a protection key is allocated subsequently, existing threads
>>> +(except the current) will use an unspecified system default for the
>>> +access rights associated with newly allocated keys.
>>
>> For subprocesses isn't the case that the default key 0 is used instead,
>> instead of the 'unspecified' one?
>
> Only after execve, I hope. It would be weird if PROT_EXEC memory turned readable in the subprocess after fork. It would also totally break my idea to use this for write-protecting the GOT most of the time. 8-P
>
> I just verified this on actual hardware. Should I add another test for that, maybe tst-pkey-fork? Maybe as a separate patch?
I think it could be a follow up patch.
>
>>> +@deftypefun int pkey_free (int @var{key})
>>> +@standards{Linux, sys/mman.h}
>>> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>>> +Deallocate the protection key, so that it can be reused by
>>> +@code{pkey_alloc}.
>>> +
>>> +Calling this function does not change the access rights of the freed
>>> +protection key. The calling thread and other threads may retain access
>>> +to it, even if it is subsequently allocated again. For this reason, it
>>> +is not recommended to call the @code{pkey_free} function.
>>
>> I presume you are referring to your 'MPK: pkey_free and key reuse' discussion,
>> but should we add a note that 'pkey_free' should be as long as the application
>> make sure to munmap the pages associated with the key meant to free first?
>
> No, it's also relevant that threads retain access when the key is reused.
My understanding is if we pkey_alloc -> mprotect -> munmap -> pkey_free,
it should not create any other side effects even for key reuse. Or am
I missing something here?
>
>>> +Some systems use memory protection keys to emulate certain combinations
>>> +of @var{protection} flags. Under such circumstances, specifying an
>>> +explicit protection key may behave as if additional flags have been
>>> +specified in @var{protection}, even though this does not happen with the
>>> +default protection key. For example, some systems can supported
>>
>> Shouldn't it be 'can support'?
>
> Yes, fixed.
>
>>> +int
>>> +pkey_mprotect (void *addr, size_t len, int prot, int pkey)
>>> +{
>>> + if (pkey == -1)
>>> + /* If the key is -1, the system call is precisely equivalent to
>>> + mprotect. */
>>> + return __mprotect (addr, len, prot);
>>> +#ifdef __NR_pkey_mprotect
>>> + return INLINE_SYSCALL (pkey_mprotect, 4, addr, len, prot, pkey);
>>> +#else
>>> + __set_errno (ENOSYS);
>>> + return -1;
>>> +#endif
>>> +}
>>
>> I would suggest to use INLINE_SYSCALL_CALL here to omit the argument number.
>
> Right, I made the change.
>
>
>>> + keys[0] = pkey_alloc (0, 0);
>>> + if (keys[0] < 0)
>>> + {
>>> + if (errno == ENOSYS)
>>> + {
>>> + puts ("warning: kernel does not support memory protection keys");
>>> + return EXIT_UNSUPPORTED;
>>> + }
>>
>> We have FAIL_UNSUPPORTED for these cases.
>
> Okay, I switched to that.
>
>>> + /* The manual page says that ENOSPC is used to report lack of
>>> + CPU support, but the kernel returns EINVAL. */
>>> + if (errno == ENOSPC || errno == EINVAL)
>>> + {
>>> + printf ("warning: CPU does not support memory protection keys: %m");
>>> + return EXIT_UNSUPPORTED;
>>> + }
>>> + FAIL_EXIT1 ("pkey_alloc: %m");
>>
>> I do not think we should comment the manpages description because it could
>> changed.
>
> Fair enough, I dropped the commend and the check for ENOSPC.
>
> The attached patch moves the implementation to the x86 directory because it turned out the syscalls work on i386 as well (but presumably only with a 64-bit kernel).
LGTM and we may extend the pkey_free documentation later if it were the case.
As a side note what are you ideas regarding using pkey on glibc (and could you
extend the idea of the GOT protection as well)?
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>