[PATCH 08/11] Cygwin: testsuite: Busy-wait in cancel3 and cancel5
Jon Turney
jon.turney@dronecode.org.uk
Tue Jul 18 11:20:18 GMT 2023
On 17/07/2023 16:41, Corinna Vinschen wrote:
> On Jul 17 16:21, Corinna Vinschen wrote:
>> On Jul 17 12:51, Jon Turney wrote:
>>> On 17/07/2023 12:05, Corinna Vinschen wrote:
>>>> diff --git a/winsup/cygwin/thread.cc b/winsup/cygwin/thread.cc
>>>> index f614e01c42f6..fceb9bda1806 100644
>>>> --- a/winsup/cygwin/thread.cc
>>>> +++ b/winsup/cygwin/thread.cc
>>>> @@ -546,6 +546,13 @@ pthread::exit (void *value_ptr)
>>>> class pthread *thread = this;
>>>> _cygtls *tls = cygtls; /* Save cygtls before deleting this. */
>>>> + /* Deferred cancellation still pending? */
>>>> + if (canceled)
>>>> + {
>>>> + WaitForSingleObject (cancel_event, INFINITE);
>>>> + value_ptr = PTHREAD_CANCELED;
>>>> + }
>>>> +
>>>> // run cleanup handlers
>>>> pop_all_cleanup_handlers ();
>>>> What do you think?
>>>
>>> I mean, by your own interpretation of the standard, this isn't required,
>>> because we're allowed to take arbitrarily long to deliver the async
>>> cancellation, and in this case, we took so long that the thread exited
>>> before it happened, too bad...
>>
>> True enough!
>>
>>> It doesn't seem a bad addition,
>>
> Actually, it seems we actually *have* to do this. I just searched
> for more info on that problem and, to my surprise, I found this in the
> most obvious piece of documentation:
>
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_exit.html
>
> Quote:
>
> As the meaning of the status is determined by the application (except
> when the thread has been canceled, in which case it is
> PTHREAD_CANCELED), [...]
>
>> On second thought...
>>
>> One thing bugging me is this:
>
> This is still a bit fuzzy, though. I'd appreciate any input.
>
>> Looking into pthread::cancel we have this order of things:
>>
>> // cancel deferred
>> mutex.unlock ();
>> canceled = true;
>> SetEvent (cancel_event);
>> return 0;
>>
>> The canceled var is set before the SetEvent call.
>> What if the thread is terminated after canceled is set to true but
>> before SetEvent is called?
>>
>> pthread::testcancel claims:
>>
>> We check for the canceled flag first. [...]
>> Only if the thread is marked as canceled, we wait for cancel_event
>> being really set, on the off-chance that pthread_cancel gets
>> interrupted before calling SetEvent.
>>
>> Neat idea to speed up the code, but doesn't that mean we have a
>> potential deadlock, especially given that pthread::testcancel calls WFSO
>> with an INFINITE timeout?
I'm not sure I follow: another thread sets cancelled = true, just before
we hit pthread::testcancel(), so we go into the WFSO, but then the other
thread continues, signals cancel_event and everything's fine.
What meaning are you assigning to "interrupted" here?
Are we worried about the thread calling pthread_cancel being cancelled
itself?
>> And if so, how do we fix this? Theoretically, the most simple
>> solution might be to call SetEvent before setting the canceled
>> variable, but in fact we would have to make setting canceld
>> and cancel_event an atomic operation.
Well, yeah, that is required for them to be coherent. But we have a
mutex on the thread object for that purpose, and I don't quite see why
it's released so early here.
More information about the Cygwin-patches
mailing list