This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v2 02/21] nptl: Fix testcases for new pthread cancellation mechanism


On Mon, May 7, 2018 at 1:13 PM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
> On 06/05/2018 23:49, Zack Weinberg wrote:
>> On 26 Feb 2018, Adhemerval Zanella wrote:
>>> For the tst-cancel{2,3} case it removes the pipe close, because the
>>> close might cause the write syscall to return with side effects if
>>> it is executed before the pthread cancel.
>>
>> ... however, this change appears to be wrong.  If cancellation is
>> broken, these tests will now deadlock rather than failing cleanly.
>
> With the current cancellation implementation the thread will finish
> regardless, and sigcancel_handler will act whether or not there are
> side effects (the pipe close). The issue is that cancellation should
> not happen if the syscall returns after some side effects have already
> taken place, in this case the pipe close.

I think maybe I didn't explain clearly enough what I'm worried about
here.  What the test case does _when cancellation works_ is sensible.
But this is a test case; it also needs to behave sensibly when
cancellation _doesn't_ work.  Imagine a new port where, for some
reason, the cancellation mechanism is so broken that read/write aren't
acting as cancellation points at all.  Without the close,
tst-cancel{2,3} will block forever in read/write.  We have the
test-driver timeout as a backstop, but we shouldn't rely on it.
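
To make the concern concrete, here is a minimal sketch of the
cancel-then-close pattern.  It is not the actual tst-cancel2/3 code;
the names reader and cancel_then_close are hypothetical, and it assumes
(as on Linux in practice) that the cancellation request is seen before
the close takes effect.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical illustration only, not the glibc test source.  */
static int fds[2];

/* Blocks in read (), which POSIX lists as a cancellation point.  */
static void *
reader (void *arg)
{
  char c;
  ssize_t n = read (fds[0], &c, 1);
  (void) arg;
  (void) n;
  /* Reached only if read () returned, i.e. the thread was not
     cancelled inside the call.  */
  return NULL;
}

/* Returns 0 if the thread was cancelled, 1 if it exited normally
   (cancellation did not take effect), -1 on setup failure.  */
int
cancel_then_close (void)
{
  pthread_t th;
  void *result;

  if (pipe (fds) != 0)
    return -1;
  if (pthread_create (&th, NULL, reader, NULL) != 0)
    return -1;

  pthread_cancel (th);

  /* Fallback: if read () is not acting as a cancellation point, closing
     the write end makes it return 0 (EOF), so the join below cannot
     hang and the failure is reported cleanly.  */
  close (fds[1]);

  if (pthread_join (th, &result) != 0)
    return -1;
  close (fds[0]);

  return result == PTHREAD_CANCELED ? 0 : 1;
}
```

With working cancellation this should return 0; with the close removed
and cancellation broken, pthread_join would block until the test-driver
timeout fires.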

> Yes, although for this specific case I am not sure if this could happen
> in practice.  I assume if a thread issues a 'signal' followed by a 'close',
> the signal target thread will receive the events in an ordered manner, i.e.,
> the signal handler will be activated before the syscall sees any
> side effects (the close).  It seems to be Linux behaviour, but I am not
> sure if a different system might act differently.

I don't think POSIX makes any requirements, but yes, in practice the
signal should always arrive first.

> And I try to avoid timing checks, such as pthread_timedjoin_np,
> because they tend to be quite fragile in practice for such cases (due
> to system load when testing glibc, machine performance, etc.).

This is reasonable.

For the new cancellation mechanism in general, we don't have a good
way of arranging for SIGCANCEL to arrive at exactly the critical
points within the syscall sequence, do we?  I am tempted to try to
write a test case that scripts gdb and single-steps through a call to
open() and fires SIGCANCEL at each instruction.

>> won't it?  I think teaching the backtrace logic about this would be
>> better than needing to use a raw syscall() and then mess with the
>> PowerPC implementation of syscall().  I might feel differently about
>> this change if __read_nocancel were a public API, but it isn't...
>
> With your current suggestion for the powerpc syscall bits, there is no
> need to actually change the powerpc syscall implementation besides an
> additional CFI mechanism.  But I do not mind changing the testcase in
> the bz12683 fix itself; the only advantage I see is that by using an
> indirect syscall there is no need to actually change it again.

I don't feel especially strongly about this now that we have a way
that doesn't add actual instructions to powerpc syscall().

zw

