[PATCH 2/2] manual: Document __libc_single_threaded

Adhemerval Zanella adhemerval.zanella@linaro.org
Fri May 22 16:36:22 GMT 2020



On 22/05/2020 13:14, Rich Felker wrote:
> On Fri, May 22, 2020 at 11:07:20AM -0400, Rich Felker wrote:
>> On Fri, May 22, 2020 at 11:54:58AM +0100, Szabolcs Nagy wrote:
>>> The 05/22/2020 12:05, Florian Weimer wrote:
>>>> * Szabolcs Nagy:
>>>>
>>>>> The 05/21/2020 15:44, Florian Weimer wrote:
>>>>>> * Szabolcs Nagy:
>>>>>>> what's wrong with pthread_join updating it?
>>>>>>
>>>>>> It's tricky to do it correctly if there are two remaining threads, one of
>>>>>> them the one being joined, the other one a detached thread.  A
>>>>>> straightforward implementation merely looking at __nptl_nthreads before
>>>>>> returning from pthread_join would not perform the required
>>>>>> synchronization on the detached thread exit.
>>>>>
>>>>> i'm trying to understand this, but don't see
>>>>> what's wrong if the last thread is detached.
>>>>
>>>> Sorry, I meant three remaining threads in total, i.e., two more threads
>>>> in addition to the one thread that keeps going after the other two
>>>> exited, and may use __libc_single_threaded in the future.
>>>>
>>>> Clearer now?
>>>
>>> hm so a detached thread is concurrently exiting with
>>> a pthread_join which sees a decremented __nptl_nthreads
>>> but the detached thread has not actually exited yet.
>>
>> In principle this is no big deal as long as the exiting thread cannot
>> make any further actions where its existence causes an observable
>> effect on users of __libc_single_threaded. (For this purpose, I think
>> you actually need to define what uses are valid, though; see setxid
>> remarks below.) If it causes a problem for pthread_join that's an
>> implementation detail that should be fixable. The bigger issue is
>> memory synchronization.
>>
>>> i think glibc can issue a memory barrier syscall before
>>> decrementing __nptl_nthreads in a detached thread, this
>>> means if pthread_join observes __nptl_nthreads==1
>>> then user memory accesses in the detached thread are
>>> synchronized with non-atomic memory accesses after
>>> pthread_join returns. (i.e. __nptl_nthreads==1 should
>>> mean at all times that as far as user code is concerned
>>> the process is single threaded even if some detached
>>> thread is still hanging around)
>>
>> This still has consequences for setxid safety which is why musl now
>> fully synchronizes the existing threads list. But if you're not using
>> the thread count for that, it's not an issue. Indeed I think
>> SYS_membarrier is a solution here, but if it's not supported or
>> blocked by seccomp then __libc_single_threaded must not be made true
>> again at this time.
> 
> Ugh, SYS_membarrier is *not* a solution here. The problem is far
> worse, because the user of __libc_single_threaded potentially lacks
> *compiler barriers* too.
> 
> Consider something like:
> 
> 	if (!__libc_single_threaded) { lock(); need_unlock=1; }
> 	x = *p;
> 	if (need_unlock) unlock();
> 	/* ... */
> 	if (!__libc_single_threaded) { lock(); need_unlock=1; }
> 	x = *p;
> 	if (need_unlock) unlock();
> 
> Here, in the case where __libc_single_threaded is true the second time
> around, there is no (memory or compiler) acquire barrier between the
> first access to *p and the second. Thus the compiler can (and actually
> does! I don't have a minimal PoC but musl actually just hit a bug very
> close to this) omit the second load from memory and use the cached
> value, which may be incorrect because the exiting thread modified it.

Would it help to require relaxed atomic memory ordering on every access
to __libc_single_threaded in this example?
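
To make the question concrete, here is a minimal sketch of the first
pattern with the flag read through a relaxed atomic load. The names
single_threaded, lock(), unlock() and read_shared() are hypothetical
stand-ins for illustration only; the real __libc_single_threaded is a
plain char, not an _Atomic object.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for __libc_single_threaded (a plain char in glibc). */
static _Atomic char single_threaded = 1;

/* Hypothetical counting "lock", only used to observe which path was taken. */
static int lock_depth, lock_calls;
static void lock(void) { lock_depth++; lock_calls++; }
static void unlock(void) { assert(lock_depth > 0); lock_depth--; }

static int read_shared(const int *p)
{
	int need_unlock = 0;

	/* Relaxed load: atomic with respect to the flag itself, but it
	   imposes no ordering on the plain load of *p below. */
	if (!atomic_load_explicit(&single_threaded, memory_order_relaxed)) {
		lock();
		need_unlock = 1;
	}
	int x = *p;
	if (need_unlock)
		unlock();
	return x;
}
```

As far as I understand the C11 model, a relaxed load by itself does not
order the surrounding non-atomic accesses to *p, so I am not sure this
alone would prevent the compiler from reusing a cached value.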

> 
> This could potentially be avoided with complex contracts about
> barriers needed to use __libc_single_threaded, but it seems highly
> error-prone.
> 
> Note that the issue becomes much easier to hit with a sort of "pretest
> not under lock, re-check with lock" idiom of the form:
> 
> 	x = *p;
> 	if (predicate(x)) {
> 		if (!__libc_single_threaded) { lock(); need_unlock=1; }
> 		x = *p;
> 		/* ... */
> 		if (need_unlock) unlock();
> 	}
> 
> Unlike the above, this one does not depend on the release barrier in
> unlock() not also being a compiler acquire barrier.
> 
> Rich
> 

