This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH]: Fix blocking pthread_join.
- From: Carlos O'Donell <carlos at redhat dot com>
- To: Rich Felker <dalias at libc dot org>, Stefan Liebler <stli at linux dot vnet dot ibm dot com>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Wed, 2 May 2018 13:54:37 -0400
- Subject: Re: [PATCH]: Fix blocking pthread_join.
- Openpgp: preference=signencrypt
- References: <8aeb7b8b-6b1e-9a60-e961-75cde1aa463b@linux.vnet.ibm.com> <20180502162945.GX1392@brightrain.aerifal.cx>
On 05/02/2018 12:29 PM, Rich Felker wrote:
> On Wed, Apr 25, 2018 at 01:27:07PM +0200, Stefan Liebler wrote:
>> Hi,
>>
>> On s390 (31-bit), if glibc is built with -Os, pthread_join sometimes
>> blocks indefinitely. This is e.g. observable with
>> testcase intl/tst-gettext6.
>>
>> pthread_join calls lll_wait_tid(tid), which performs the futex-wait
>> syscall in a loop as long as tid != 0 (thread is alive).
>>
>> On s390 (when built with -Os), tid is loaded from memory before
>> comparing against zero and then the tid is loaded a second time
>> in order to pass it to the futex-wait-syscall.
>> If the thread exits in between, then the futex-wait-syscall is
>> called with the value zero and it waits until a futex-wake occurs.
>> As the thread has already exited, there won't be a futex-wake.
>>
>> In lll_wait_tid, the tid is stored to the local variable __tid,
>> which is then used as argument for the futex-wait-syscall.
>> But unfortunately the compiler is allowed to reload the value
>> from memory.
>>
>> With this patch, the tid is loaded by dereferencing a volatile pointer.
>> Then the compiler is not allowed to reload the value for __tid from memory.
>>
>> Okay to commit?
>
> There should probably be a bugzilla issue for this, no?
Yes. Publicly visible bugs need one.
--
Cheers,
Carlos.
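
For readers following the quoted report, the double-load race can be sketched in plain C. This is only an illustration under stated assumptions, not glibc's lll_wait_tid: the names fake_futex_would_block, wait_tid_reloading, wait_tid_single_load and thread_exits are hypothetical, and the futex-wait syscall is replaced by a stand-in that models just one property of FUTEX_WAIT, namely that it blocks only if the word still holds the expected value. The "between" hook simulates the thread exiting after the first load.

```c
#include <assert.h>

/* Hypothetical hook type: runs between the check and the wait, standing in
   for "the joined thread exits right here". */
typedef void (*hook_fn)(int *tid);

/* Simulates thread exit: the tid word is cleared to 0. */
static void thread_exits(int *tid)
{
    *tid = 0;
}

/* Hypothetical stand-in for futex-wait. Models only this: the wait blocks
   if and only if *addr still equals expected. Returns 1 if it would block. */
static int fake_futex_would_block(const int *addr, int expected)
{
    assert(addr != 0);
    return *addr == expected;
}

/* Shape of the problematic code: the value is read once for the comparison
   and read again for the futex argument. */
static int wait_tid_reloading(int *tid, hook_fn between)
{
    if (*tid != 0) {              /* first load: thread looks alive */
        between(tid);             /* thread exits, tid becomes 0 */
        int expected = *tid;      /* second load: now 0 */
        return fake_futex_would_block(tid, expected);
    }
    return 0;
}

/* Shape of the proposed fix: one load through a volatile pointer, and the
   same local value is used for both the comparison and the wait. */
static int wait_tid_single_load(int *tid, hook_fn between)
{
    int seen = *(volatile int *) tid;   /* exactly one load */
    if (seen != 0) {
        between(tid);                   /* thread exits, tid becomes 0 */
        return fake_futex_would_block(tid, seen);
    }
    return 0;
}
```

Calling wait_tid_reloading(&tid, thread_exits) with tid initially nonzero reports that the wait would block (expected value 0 matches the cleared word, and no wake will follow), while wait_tid_single_load reports that it would not block, because the stale nonzero value no longer matches.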