[PATCH 2/4] nptl: Handle EPIPE on tst-cancel2
Florian Weimer
fweimer@redhat.com
Tue Aug 20 15:30:00 GMT 2019
* Adhemerval Zanella:
> For tst-cancel2.c, if I add a sleep (1) between pthread_create and
> pthread_cancel you can see this issue more clearly (dump with strace):
>
> [pid 2587] set_robust_list(0x7fffabccf290, 24) = 0
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid 2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000 <unfinished ...>
> [pid 2586] <... nanosleep resumed>0x7ffff0c9e7f0) = 0
>
> ########### Cancellation starts to act here, by loading libgcc for unwinding
> [pid 2586] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 5
> [pid 2586] fstat(5, {st_mode=S_IFREG|0644, st_size=63776, ...}) = 0
> [pid 2586] mmap(NULL, 63776, PROT_READ, MAP_PRIVATE, 5, 0) = 0x7fffabf00000
> [pid 2586] close(5) = 0
> [pid 2586] open("/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 5
> [pid 2586] read(5, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\340+\0\0\0\0\0\0"..., 832) = 832
> [pid 2586] fstat(5, {st_mode=S_IFREG|0755, st_size=133696, ...}) = 0
> [pid 2586] mmap(NULL, 197688, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 5, 0) = 0x7fffab480000
> [pid 2586] mmap(0x7fffab4a0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x10000) = 0x7fffab4a0000
> [pid 2586] close(5) = 0
> [pid 2586] mprotect(0x7fffab4a0000, 65536, PROT_READ) = 0
> [pid 2586] munmap(0x7fffabf00000, 63776) = 0
> [pid 2586] tgkill(2586, 2587, SIGRTMIN) = 0
> [pid 2586] close(3) = 0
> [pid 2586] futex(0x7fffabccf280, FUTEX_WAIT, 2587, NULL <unfinished ...>
>
> ########### write returns with EPIPE (broken pipe) and __pthread_disable_asynccancel is called
> [pid 2587] <... write resumed>) = -1 EPIPE (Broken pipe)
> [pid 2587] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=2586, si_uid=61684} ---
> [pid 2587] --- SIGRTMIN {si_signo=SIGRTMIN, si_code=SI_TKILL, si_pid=2586, si_uid=61684} ---
> [pid 2587] futex(0x7fffab4b0224, FUTEX_WAKE_PRIVATE, 2147483647) = 0
>
> ########### No side-effects reported back to program
> [pid 2587] madvise(0x7fffab4c0000, 8257536, MADV_DONTNEED) = 0
> [pid 2587] exit(0) = ?
>
> With the BZ#12683 fix, the cancellation is not acted upon, and the testcase then
> fails depending on whether or not the write is interrupted by the cancellation signal.
Hmm. Which cancellation implementation is this? At which point in the
trace do we start unwinding? I'm surprised that strace reports the
EPIPE before the SIGPIPE, but maybe that's just a kernel race. My
expectation is that the current code unwinds after the system call
returns with the EPIPE error, never returning it to the application. I
think this is the right behavior for the write system call.
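
For reference, the scenario under discussion can be sketched roughly as
follows. This is a minimal sketch reconstructed from the description and the
trace quoted above, not the actual tst-cancel2.c source; the names
run_cancel_scenario, writer, and pipe_fds are hypothetical, and the sleep (1)
is the delay mentioned in the quoted text.

```c
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical names; this is not the tst-cancel2.c source.  */
static int pipe_fds[2];

/* Fill the pipe until write blocks.  write is a cancellation point,
   so the thread is expected to be blocked here when it is canceled.  */
static void *
writer (void *arg)
{
  static char buf[100000];  /* zero-filled, same size as in the trace */
  (void) arg;
  while (write (pipe_fds[1], buf, sizeof buf) > 0)
    continue;
  /* Reached only if write failed (for example with EPIPE) and the
     cancellation was not acted upon.  */
  return NULL;
}

/* Returns 1 if the thread was canceled, 0 if it returned normally
   after a failed write, -1 on a setup error.  */
int
run_cancel_scenario (void)
{
  pthread_t th;
  void *result;

  /* Ignore SIGPIPE so that a write to the closed pipe yields EPIPE
     instead of terminating the process.  */
  if (signal (SIGPIPE, SIG_IGN) == SIG_ERR)
    return -1;
  if (pipe (pipe_fds) != 0)
    return -1;
  if (pthread_create (&th, NULL, writer, NULL) != 0)
    return -1;

  sleep (1);  /* let the writer block in write first */

  if (pthread_cancel (th) != 0)
    return -1;
  close (pipe_fds[0]);  /* closing the read end provokes EPIPE */

  if (pthread_join (th, &result) != 0)
    return -1;
  close (pipe_fds[1]);
  return result == PTHREAD_CANCELED;
}
```

Under the expectation described above, a call to run_cancel_scenario ()
from main would return 1. The quoted report suggests that 0 is also
possible with the BZ#12683 fix, depending on whether the write is
interrupted by the cancellation signal or fails with EPIPE first.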
Thanks,
Florian