This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v2 02/21] nptl: Fix testcases for new pthread cancellation mechanism



On 06/05/2018 23:49, Zack Weinberg wrote:
> On 26 Feb 2018, Adhemerval Zanella wrote:
>> With upcoming fix for BZ#12683, pthread cancellation does not act for:
>>
>>   1. If syscall is blocked but with some side effects already having
>>      taken place (e.g. a partial read or write).
>>   2. After the syscall has returned.
>>
>> The main change is due the fact programs need to act in syscalls with
>> side-effects (for instance, to avoid leak of allocated resources or
>> handle partial read/write).
>>
>> This patch changes the NPTL testcase that assumes the old behavior and
>> also remove the tst-cancel-wrappers.sh test (which checks for symbols
>> that will not exist anymore).
> 
> It's OK and expected that we have to adjust test cases that were
> expecting the old behavior ...
> 
>> For tst-cancel{2,3} case it remove the pipe close because it might
>> cause the write syscall to return with side effects if the close is
>> executed before the pthread cancel.
> 
> ... however, this change appears to be wrong.  If cancellation is
> broken, these tests will now deadlock rather than failing cleanly.

On current cancellation implementation the thread will finish regardless
and sigcancel_handler will act whether there is side-effects or not
(the pipe close). The issue is cancellation should not happen if syscall
returns but some side effects already took place, in this case the pipe
close.

> 
> If I understand correctly, the problem you're trying to avoid is that
> the 'tf' thread could conceivably receive the closed-pipe event before
> the cancellation signal, even though the 'do_test' thread triggers the
> cancellation signal first.  I don't know of any way to fix this 100%,
> but I think it would be good enough to use pthread_timedjoin_np to
> sleep for a hundred milliseconds or so in the 'do_test' thread, and
> then close the pipe if the 'tf' thread is still alive.

Yes, although for this specific case I am not sure if this could happen
in practice.  I assume if a thread issues a 'signal' followed by a 'close', 
the signal target thread will receive the events in a ordered manner, i.e, 
the signal handler will be activated before the syscall sees any 
side-effects (the close).  It seems to be Linux behaviour, but I am not
sure if a different system might act differently.

And I try to avoid the timing check, such as pthread_timedjoin_np,
because they tend to quite fragile in practice for such cases (due either 
to system load when testing glibc, machine performance, etc.).

I will remove the close removal changes in next iteration.

> 
> (tst-cancel4.c appears to be trying to ensure that cancellations are
> pending with a pthread_barrier_t, but as far as I know there's no
> guarantee that if a thread does
> 
>    pthread_cancel(th);
>    pthread_barrier_wait(ba);
> 
> where 'th' also waits on 'ba', the SIGCANCEL will actually be
> delivered before the barrier unblocks, either.  Feh.)
> 
>> It also changes how to call the read syscall on tst-backtrace{5,6}
>> to use syscall instead of read cancelable syscall to avoid need to
>> handle the cancelable bridge function calls.  It requires a change
>> on powerpc syscall implementation to create a stackframe, since
>> powerpc backtrace rely on such information.
> 
> It doesn't look technically difficult to me to handle an additional
> stack frame or two in the trace.  They're always going to be there,
> aren't they?  In the new world order, the stack trace will always be
> 
>  0  handle_signal
>  1  <signal trampoline>
>  2  __syscall_cancel_arch
>  3  __syscall_cancel
>  4  read
>  5  fn
>  6  fn
>  7  fn
>  8  do_test
> 
> won't it?  I think teaching the backtrace logic about this would be
> better than needing to use a raw syscall() and then mess with the
> PowerPC implementation of syscall().  I might feel differently about
> this change if __read_nocancel were a public API, but it isn't...

With your current suggestion to powerpc syscall bits, there is no need
to actually change the powerpc syscall implementation besides an
additional CFI mechanism.  But I do not mind to change the testcase on
the bz12683 fix itself, the only advantage I see is by using indirect
syscall there is no need to actually change it again.

> 
>> Checked on i686-linux-gnu, x86_64-linux-gnu, x86_64-linux-gnux32,
>> aarch64-linux-gnu, arm-linux-gnueabihf, powerpc64le-linux-gnu,
>> powerpc-linux-gnu, sparcv9-linux-gnu, and sparc64-linux-gnu.
> 
> When you say checked, do you mean you actually ran the test cases, or
> did you just compile them (perhaps with a cross-compiler)?

It is a full make check on native machines.

> 
>>       * nptl/Makefile [$(run-built-tests) = yes] (tests-special): Remove
>>       tst-cancel-wrappers.sh.
>>       * nptl/tst-cancel-wrappers.sh: Remove file.
> 
> This part is OK.
> 
>>       * nptl/tst-cancel4.c (tf_write): Handle cancelled syscall with
>>       side-effects.
>>       (tf_send): Likewise.
> 
> This part is also OK.  I think the test can still fail spuriously due
> to races, but this one can't deadlock, at least, so we can live with it.
> 
>>       * sysdeps/unix/sysv/linux/powerpc/syscall.S (syscall): Create stack
>>       frame.
> 
> This ChangeLog entry belongs to patch 1.

Ack. I will remove it.

> 
> zw
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]