This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 2/4] nptl: Handle EPIPE on tst-cancel2


* Adhemerval Zanella:

> On 20/08/2019 12:30, Florian Weimer wrote:
>> * Adhemerval Zanella:
>> 
>>> For tst-cancel2.c, if I add a sleep (1) between pthread_create and 
>>> pthread_cancel you can see this issue more clearly (dump with strace):
>>>
>>> [pid  2587] set_robust_list(0x7fffabccf290, 24) = 0
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
>>> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000 <unfinished ...>
>>> [pid  2586] <... nanosleep resumed>0x7ffff0c9e7f0) = 0
>>>
>>> ########### Cancellation starts to act here, by loading libgcc for unwinding
>>> [pid  2586] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 5
>>> [pid  2586] fstat(5, {st_mode=S_IFREG|0644, st_size=63776, ...}) = 0
>>> [pid  2586] mmap(NULL, 63776, PROT_READ, MAP_PRIVATE, 5, 0) = 0x7fffabf00000
>>> [pid  2586] close(5)                    = 0
>>> [pid  2586] open("/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 5
>>> [pid  2586] read(5, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\340+\0\0\0\0\0\0"..., 832) = 832
>>> [pid  2586] fstat(5, {st_mode=S_IFREG|0755, st_size=133696, ...}) = 0
>>> [pid  2586] mmap(NULL, 197688, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 5, 0) = 0x7fffab480000
>>> [pid  2586] mmap(0x7fffab4a0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x10000) = 0x7fffab4a0000
>>> [pid  2586] close(5)                    = 0
>>> [pid  2586] mprotect(0x7fffab4a0000, 65536, PROT_READ) = 0
>>> [pid  2586] munmap(0x7fffabf00000, 63776) = 0
>>> [pid  2586] tgkill(2586, 2587, SIGRTMIN) = 0
>>> [pid  2586] close(3)                    = 0
>>> [pid  2586] futex(0x7fffabccf280, FUTEX_WAIT, 2587, NULL <unfinished ...>
>>>
>>> ########### Write returns with broken PIPE and __pthread_disable_asynccancel is called
>>> [pid  2587] <... write resumed>)        = -1 EPIPE (Broken pipe)
>>> [pid  2587] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=2586, si_uid=61684} ---
>>> [pid  2587] --- SIGRTMIN {si_signo=SIGRTMIN, si_code=SI_TKILL, si_pid=2586, si_uid=61684} ---
>>> [pid  2587] futex(0x7fffab4b0224, FUTEX_WAKE_PRIVATE, 2147483647) = 0
>>>
>>> ########### No side-effects reported back to program
>>> [pid  2587] madvise(0x7fffab4c0000, 8257536, MADV_DONTNEED) = 0
>>> [pid  2587] exit(0)                     = ?
>>>
>>> With the BZ#12683 fix, the cancellation is not acted upon, and the testcase then
>>> fails depending on whether the write is interrupted by the cancellation signal.
>> 
>> Hmm.  Which cancellation implementation is this?  At which point in the
>> trace do we start unwinding?  I'm surprised that strace reports the
>> EPIPE before the SIGPIPE, but maybe that's just a kernel race.  My
>> expectation is that the current code unwinds after the system call
>> returns with the EPIPE error, never returning it to the application.  I
>> think this is the right behavior for the write system call.
>
> This is the current implementation, more specifically glibc 2.17, CentOS 7.6 on
> powerpc64le.  From the trace:
>
>>> [pid  2586] tgkill(2586, 2587, SIGRTMIN) = 0
>
> This is where pthread_cancel sends the SIGCANCEL signal to the thread.
>
>>> [pid  2586] close(3)                    = 0
>
> This is the
>
>     /* This will cause the write in the child to return.  */
>     close (fd[0]);
>
> In tst-cancel2.c.
>
> And finally:
>
>>> [pid  2587] <... write resumed>)        = -1 EPIPE (Broken pipe)
>>> [pid  2587] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=2586, si_uid=61684} ---
>
> SIGPIPE is received, making the write fail with EPIPE, and then
>
>>> [pid  2587] --- SIGRTMIN {si_signo=SIGRTMIN, si_code=SI_TKILL, si_pid=2586, si_uid=61684} ---
>
> sigcancel_handler is invoked.  And the implementation *does* unwind after the
> syscall is done; the problem is that it ignores -1/EPIPE (which is BZ#12683).

I looked at this some more.  It seems that in theory, we could also see
EPIPE in the old implementation or at least observe execution of a
signal handler, and the asynchronous cancel hides that.

In the new implementation, if there is a signal handler and a signal for
it is delivered before SIGCANCEL (even if the signal was sent *after*
pthread_cancel, from the same thread), I do not think there is any
chance whatsoever that we can hide the behavior difference.  After all,
SIGCANCEL may not trigger an asynchronous cancellation from the signal
handler.
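
The scenario under discussion can be sketched as a small standalone
program fragment.  This is loosely modeled on the description of
tst-cancel2.c in this thread, not copied from it.  The names
writer_thread and run_cancel_vs_epipe are hypothetical, and ignoring
SIGPIPE is an assumption of the sketch so that the EPIPE path cannot
kill the process.  Depending on the cancellation implementation and on
timing, the writer is either canceled or returns after write fails, so
the helper reports both outcomes instead of expecting only one.

```c
#include <pthread.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

static int pipe_fd[2];

/* Writer: keep writing to the pipe until write fails or the thread is
   canceled inside write (a cancellation point).  The marker value 42
   is returned only if write reported an error such as EPIPE.  */
static void *writer_thread(void *arg)
{
  static char buf[100000];      /* zero-filled, same size as in the trace */
  (void) arg;
  while (write(pipe_fd[1], buf, sizeof buf) > 0)
    continue;
  return (void *) 42L;
}

/* Returns 1 if the writer was canceled, 0 if it returned on its own
   after a failed write, and -1 on setup failure.  */
int run_cancel_vs_epipe(void)
{
  pthread_t th;
  void *result;
  struct timespec delay = { 0, 200 * 1000 * 1000 };  /* 200 ms */

  /* Assumption of this sketch: ignore SIGPIPE so write returns EPIPE.  */
  signal(SIGPIPE, SIG_IGN);

  if (pipe(pipe_fd) != 0)
    return -1;
  if (pthread_create(&th, NULL, writer_thread, NULL) != 0)
    return -1;

  /* Give the writer time to fill the pipe and block in write.  */
  nanosleep(&delay, NULL);

  if (pthread_cancel(th) != 0)
    return -1;
  /* Closing the read end makes a blocked write fail with EPIPE.  */
  close(pipe_fd[0]);

  if (pthread_join(th, &result) != 0)
    return -1;
  close(pipe_fd[1]);

  if (result == PTHREAD_CANCELED)
    return 1;
  return result == (void *) 42L ? 0 : -1;
}
```

A return value of 0 corresponds to the case where the application
observes the write error instead of the cancellation.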

System calls that are cancellation points appear to fall into these
categories:

(A) Those that do not perform any resource allocation (write, read).

(B) Those that perform resource allocation, but the allocation can be
    easily reverted (openat, accept4).

(C) Those that perform resource allocation, but the allocation is
    difficult to undo (recvmsg with descriptor passing).

(D) close.

For (A), I think POSIX allows sufficient wiggle room so that we can
exhibit a partial side effect and still act on the cancellation (without
reporting a partial write first to the application).

For (B), maybe we should undo the resource allocation and then proceed
to act on the cancellation.

For (C), the complexity may not be worth it.

For (A), (B), (C), we can act on the cancellation in the error case,
after we observe the cancellation flag in the signal handler trampoline.
(I think dropping the EINTR restriction from there achieves that.)  If
we do that, we do not need to change the test case.
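
As a rough illustration, the policy outlined above can be written as a
pure decision function.  This only models the proposal in this message;
it is not glibc code, and all names (syscall_class, cancel_action,
decide_cancel_action) are hypothetical.  Returning the result to the
caller for a successful (C) call is one reading of "the complexity may
not be worth it", and should be taken as an assumption.

```c
#include <stdbool.h>

enum syscall_class {
  CLASS_NO_ALLOC,          /* (A) write, read */
  CLASS_REVERTIBLE_ALLOC,  /* (B) openat, accept4 */
  CLASS_HARD_ALLOC,        /* (C) recvmsg with descriptor passing */
  CLASS_CLOSE              /* (D) close */
};

enum cancel_action {
  ACT_RETURN_RESULT,       /* hand the syscall result to the caller */
  ACT_CANCEL,              /* act on the pending cancellation */
  ACT_UNDO_THEN_CANCEL,    /* release the new resource, then cancel */
  ACT_UNSPECIFIED          /* behavior left undefined */
};

/* Decide what to do once a cancellation-point syscall has returned.
   syscall_failed covers any error, not only EINTR.  */
enum cancel_action decide_cancel_action(enum syscall_class cls,
                                        bool cancel_pending,
                                        bool syscall_failed)
{
  if (!cancel_pending)
    return ACT_RETURN_RESULT;
  if (cls == CLASS_CLOSE)
    return ACT_UNSPECIFIED;
  if (syscall_failed)
    return ACT_CANCEL;     /* (A), (B), (C): error case */

  switch (cls) {
  case CLASS_NO_ALLOC:
    return ACT_CANCEL;     /* a partial side effect is tolerated */
  case CLASS_REVERTIBLE_ALLOC:
    return ACT_UNDO_THEN_CANCEL;
  case CLASS_HARD_ALLOC:
    return ACT_RETURN_RESULT;  /* assumption: do not attempt to undo */
  default:
    return ACT_UNSPECIFIED;
  }
}
```

With this model, the EPIPE case from the trace is
decide_cancel_action (CLASS_NO_ALLOC, true, true), which yields
ACT_CANCEL, so the test case would not need to change.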

(D) is very special.  Ideally, we would specify what happens with the
descriptor if the close call is canceled.  POSIX does not even specify
what the state of the file descriptor is after an EINTR error, so it doesn't
say what happens with cancellation, either.  Maybe we have to leave that
undefined.
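
To show why the EINTR case is awkward for callers, here is a small
sketch of a close wrapper that never retries.  The name close_once is
hypothetical.  Treating EINTR as "descriptor already released" is a
Linux-specific assumption based on the close(2) manual page; POSIX does
not promise it, and the sketch says nothing about cancellation.

```c
#include <errno.h>
#include <unistd.h>

/* Call close exactly once.  Retrying after EINTR is avoided because,
   if the descriptor was in fact released, another thread may already
   have reused the same number.  */
int close_once(int fd)
{
  int ret = close(fd);
  if (ret == -1 && errno == EINTR)
    return 0;   /* Linux-specific assumption: fd is already closed */
  return ret;
}
```

On a system where the descriptor stays open after EINTR, this wrapper
would leak it, which is exactly the ambiguity described above.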

Thanks,
Florian

