This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] nptl: Remove cancellation checks from sem_{timed}wait (BZ #23006)



On 25/06/2019 09:35, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>>> My impression is that this doesn't mean that the behavior is random
>>> (although the wording allows for that).  It's dependent on how things
>>> have been configured, and not on which thread wins a race during the call.
>>
>> My understanding is that pthread cancellation is a mechanism to abort
>> a thread that is blocked for an unbounded time. Randomness is not
>> meaningful here (that is, whether sem_{timed}wait would block or not).
> 
> Not just unbounded blocks, inconveniently long ones as well.  (Otherwise
> no timed wait function would be a cancellation point.)

Right.

> 
> What I meant is that if such functions only check for pending
> cancellation on slow paths, then you might end up with cases where
> cancellation checks disappear from executions due to future
> optimizations.  As a result, an expected cancellation point may vanish
> from some inner loop, causing application breakage.

I see such a modification as a semantic change rather than just an
optimization, and it would most likely be a regression.  That's why I think
BZ#23006 itself requires an Austin Group clarification.

> 
>>> Any cancellation model which makes functions cancellation points only
>>> if they block has this problem.  It also means that future optimizations
>>> (in glibc, the kernel, or the silicon) could effectively remove
>>> cancellation points.  I don't think this is desirable.
>>
>> That's why I think making sem_{timed}wait *not* a cancellation
>> entrypoint would require opening an Austin Group defect: it is explicit
>> that the only blocking routines that do not act as cancellation points
>> are pthread_mutex_lock, pthread_barrier_wait, and pthread_spin_lock.
> 
> We discussed this before with getentropy/getrandom.  I argued in the
> same direction (if something can block for a long time, it should be a
> cancellation point).  Torvald, as our concurrency expert, argued against
> this.  Existing practice also shows that most of our file system calls
> are *not* cancellation points, and these can also wait forever when
> hitting network file systems.  So all this is a bit inconsistent,
> unfortunately.

Another point is that I see no reason for the glibc implementation to
deviate from other libcs regarding sem_{timed}wait (where it is usually a
cancellation point).  It would most likely cause application breakage as well.

But I understand Torvald's view, especially given the current racy status
of glibc thread cancellation (BZ#12683).  Maybe when we finally make thread
cancellation safer we can also focus on making the Linux file system calls
more consistent regarding cancellation.

In any case, do you think we should make sem_{timed}wait not a cancellation
entrypoint for 2.30? I can rework the patch if that is the case.

