This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] Bug 20116: Fix use after free in pthread_create().
- From: Torvald Riegel <triegel at redhat dot com>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>, Alexey Makhalov <amakhalov at vmware dot com>, Florian Weimer <fweimer at redhat dot com>, Siddhesh Poyarekar <siddhesh at gotplt dot org>
- Date: Thu, 26 Jan 2017 14:52:59 +0100
- Subject: Re: [PATCH] Bug 20116: Fix use after free in pthread_create().
- Authentication-results: sourceware.org; auth=none
- References: <b7974946-1706-6d90-7288-7df7833a970b@redhat.com>
rOn Wed, 2017-01-25 at 22:38 -0500, Carlos O'Donell wrote:
> In bug 20116 several users report a use after free in pthread_create
> when using a detached thread.
>
> The problem has been confirmed by Florian Weimer and Alexey Makhalov.
> Alexey has provided a patch to fix the issue and Florian and I have
> produced a regression test for this along with a detailed model of
> struct pthread ownership.
>
> The patch provided by Alexey has only 6 lines of legally significant
> material and so we don't yet need a copyright assignment from VMWare.
> The limit is 15 lines, so VMWare has 9 lines remaining before needing
> to seek copyright assignment.
>
> In summary the failure looks like this:
>
> nptl/pthread_create.c
>
> __pthread_create_2_1:
>
> 538 struct pthread *pd = NULL;
> 539 int err = ALLOCATE_STACK (iattr, &pd);
>
> * Stack is allocated for the detached thread.
>
> START_THREAD_DEFN:
>
> 450 /* If the thread is detached free the TCB. */
> 451 if (IS_DETACHED (pd))
> 452 /* Free the TCB. */
> 453 __free_tcb (pd);
>
> * Detached thread exits quickly, freeing the stack.
> * The stack is at the maximum cache size so it unmaps the stack.
>
> __pthread_create_2_1:
>
> 713 if (pd->stopped_start)
> 714 /* The thread blocked on this lock either because we're doing TD_CREATE
> 715 event reporting, or for some other reason that create_thread chose.
> 716 Now let it run free. */
> 717 lll_unlock (pd->lock, LLL_PRIVATE);
> 718
> 719 /* We now have for sure more than one thread. The main thread might
> 720 not yet have the flag set. No need to set the global variable
> 721 again if this is what we use. */
> 722 THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
>
> * Creating thread loads pd->stopped_start which has been freed and crashes.
>
> Alexey's solution is to make a copy of stopped_start and avoid
> referencing the value. This solution is as good as we are going to
> get without any serious redesigns. Note that this still does not fix
> the similar and related bug 19951, and bug 19511, but the ownership model
> helps us talk about the potential solutions.
>
> What I have done in addition to this is written up the 'struct pthread'
> ownership model as it is currently implemented in glibc. This ownership
> model will allow us to talk about which operations are or are not allowed
> at any given point in time. If the comments are not self-explanatory then
> I've done something wrong. Thanks to Torvald Riegel for suggesting the
> approach to take i.e. ownership model as a means to clarify operations
> on the thread descriptor.
>
> No regressions on x86_64.
>
> New regression test fails before patch with sigsegv, and passes after patch.
>
> OK to commit for 2.25?
This is OK, thanks. I have a few minor comments on the wording of the
concurrency notes, but this shouldn't hold up committing this patch.
Overall, the notes are good and really helpful, I think. Good
description of the high-level synchronization scheme, which the
implementation can then be verified against.
The fact that the high-level description points out an existing bug
(19511) is also a good thing. IMO, this shows that we should have
proper concurrency notes for more of our concurrent code.
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index 867e347..9882882 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -54,25 +54,136 @@ unsigned int __nptl_nthreads = 1;
> /* Code to allocate and deallocate a stack. */
> #include "allocatestack.c"
>
> -/* createthread.c defines this function, and two macros:
> +/* CONCURRENCY NOTES:
> +
> + Understanding who is the owner of the 'struct pthread' or 'PD'
> + (refers to the value of the 'struct pthread *pd' function argument)
> + is critically important in determining exactly which operations are
> + allowed and which are not and when, particularly when it comes to the
> + implementation of pthread_create, pthread_join, pthread_detach, and
> + other functions which all operate on PD.
> +
> + The owner of PD is responsible for freeing the final resources
> + associated with PD, and may examine the memory underlying PD at any
> + point in time until it frees it back to the OS or to reuse by the
> + runtime.
> +
> + The thread which calls pthread_create is called the creating thread.
> + The creating thread begins as the owner of PD.
> +
> + During startup the new thread may examine PD in coordination with the
> + owner thread (which may be itself).
> +
> + The four cases of ownership transfer are:
> +
> + (1) Ownership of PD is released to the process (all threads may use it)
> + after the new thread starts in a joinable state
> + i.e. pthread_create returns a usable pthread_t.
> +
> + (2) Ownership of PD is released to the new thread starting in a detached
> + state.
> +
> + (3) Ownership of PD is dynamically released to a running thread via
> + pthread_detach.
> +
> + (4) Ownership of PD is acquired by the thread which calls pthread_join.
> +
> + Implementation notes:
> +
> + The PD->stopped_start and thread_ran variables are used to determine
> + exactly which of the four ownership states we are in and therefore
> + what actions can be taken. For example after (2) we cannot read or
> + write from PD anymore since the thread may no longer exist and the
> + memory may be unmapped. The most complicated cases happen during
> + thread startup:
> +
> + (a) If the created thread is in a detached (PTHREAD_CREATE_DETACHED),
> + or joinable (default PTHREAD_CREATE_JOINABLE) state and
> + STOPPED_START is true, then the creating thread has ownership of
> + PD until the PD->lock is released by pthread_create.
> +
> + (b) If the created thread is in a detached state
> + (PTHREAD_CREATED_DETACHED), and STOPPED_START is false, then the
> + creating thread has ownership of PD until the OS thread creation
> + routine returns without error.
Maybe this is better: "the creating thread has ownership of PD until it
invokes the OS kernel's thread creation routine. If this routine
returns without error, then the created thread owns PD; otherwise, see
(c) and (e) below."
I think that's a little clearer because the ownership of PD is unclear
during the invokation of the kernel routine -- this is fine though
because the calling thread is suspended and doesn't have to know.
I've also s/OS/OS kernel's/ so that it's clear that we mean the kernel.
glibc is part of the OS, I'd say.
> +
> + (c) If the detached thread setup failed and THREAD_RAN is true, then
> + the creating thread releases ownership to the new thread by
> + sending a cancellation signal.
Maybe add a note under which condition THREAD_RAN is set to true, so
that the relation to (b) becomes clearer?
> + (d) If the joinable thread setup failed and THREAD_RAN is true, then
> + then the creating thread retains ownership of PD and must cleanup
> + state. Ownership cannot be released to the process via the
> + return of pthread_create since a non-zero result entails PD is
> + undefined and therefore cannot be joined to free the resources.
> + We privately call pthread_join on the thread to finish handling
> + the resource shutdown (Or at least we should, see bug 19511).
> +
> + (e) If the thread creation failed and THREAD_RAN is false, then the
> + creating thread retains ownership of PD and must cleanup state.
> + No waiting for the new thread is required because it never
> + started.
> +