This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v2 02/21] nptl: Fix testcases for new pthread cancellation mechanism
On Mon, May 7, 2018 at 1:13 PM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
> On 06/05/2018 23:49, Zack Weinberg wrote:
>> On 26 Feb 2018, Adhemerval Zanella wrote:
>>> For the tst-cancel{2,3} case it removes the pipe close because it might
>>> cause the write syscall to return with side effects if the close is
>>> executed before the pthread cancel.
>>
>> ... however, this change appears to be wrong. If cancellation is
>> broken, these tests will now deadlock rather than failing cleanly.
>
> With the current cancellation implementation the thread will finish
> regardless, and sigcancel_handler will act whether there are side effects
> or not (the pipe close). The issue is that cancellation should not happen
> if the syscall returns but some side effects have already taken place, in
> this case the pipe close.
I think maybe I didn't explain clearly enough what I'm worried about
here. What the test case does _when cancellation works_ is sensible.
But this is a test case; it also needs to behave sensibly when
cancellation _doesn't_ work. Imagine a new port where, for some
reason, the cancellation mechanism is so broken that read/write aren't
acting as cancellation points at all. Without the close,
tst-cancel{2,3} will block forever in read/write. We have the
test-driver timeout as a backstop, but we shouldn't rely on it.
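
To make the concern concrete, here is a minimal sketch of the pattern, not
the actual tst-cancel3 source. The names pipe_fds, blocked_reader and
run_cancel_check are made up for illustration. The close of the write end
after pthread_cancel is the escape hatch: if read never acts as a
cancellation point, it sees EOF and the thread exits normally, so the check
fails cleanly instead of hanging.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical names throughout; this is an illustration only.  */
static int pipe_fds[2];

/* Block in read() on an empty pipe.  read() is a cancellation point,
   so with working cancellation the return below is never reached.  */
static void *
blocked_reader (void *arg)
{
  char c;
  (void) arg;
  ssize_t n = read (pipe_fds[0], &c, 1);
  (void) n;
  /* Only reached if cancellation did not act on the read.  */
  return (void *) 42L;
}

/* Returns 0 if the thread was cancelled, 1 if it exited normally
   (cancellation is broken), and -1 on a setup error.  */
int
run_cancel_check (void)
{
  pthread_t th;
  void *result;

  if (pipe (pipe_fds) != 0)
    return -1;
  if (pthread_create (&th, NULL, blocked_reader, NULL) != 0)
    return -1;
  if (pthread_cancel (th) != 0)
    return -1;

  /* Escape hatch: closing the write end makes a blocked read return
     EOF, so a thread that was not cancelled still terminates.  */
  close (pipe_fds[1]);

  if (pthread_join (th, &result) != 0)
    return -1;
  close (pipe_fds[0]);

  return result == PTHREAD_CANCELED ? 0 : 1;
}
```

A test's main would call run_cancel_check and treat any nonzero result as a
failure. Without the close, a broken read would leave pthread_join waiting
until the test-driver timeout fires.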
> Yes, although for this specific case I am not sure if this could happen
> in practice. I assume if a thread issues a 'signal' followed by a 'close',
> the signal target thread will receive the events in an ordered manner, i.e.,
> the signal handler will be activated before the syscall sees any
> side-effects (the close). It seems to be Linux behaviour, but I am not
> sure if a different system might act differently.
I don't think POSIX makes any requirements, but yes, in practice the
signal should always arrive first.
> And I try to avoid the timing check, such as pthread_timedjoin_np,
> because they tend to be quite fragile in practice for such cases (due to
> system load when testing glibc, machine performance, etc.).
This is reasonable.
For the new cancellation mechanism in general, we don't have a good
way of arranging for SIGCANCEL to arrive at exactly the critical
points within the syscall sequence, do we? I am tempted to try to
write a test case that scripts gdb and single-steps through a call to
open() and fires SIGCANCEL at each instruction.
>> won't it? I think teaching the backtrace logic about this would be
>> better than needing to use a raw syscall() and then mess with the
>> PowerPC implementation of syscall(). I might feel differently about
>> this change if __read_nocancel were a public API, but it isn't...
>
> With your current suggestion for the powerpc syscall bits, there is no need
> to actually change the powerpc syscall implementation besides an
> additional CFI mechanism. But I do not mind changing the testcase in
> the bz12683 fix itself; the only advantage I see in using the indirect
> syscall is that there is no need to change it again.
I don't feel especially strongly about this now that we have a way that
doesn't add actual instructions to powerpc syscall().
zw