[PATCH 2/4] nptl: Handle EPIPE on tst-cancel2

Florian Weimer fweimer@redhat.com
Tue Aug 20 15:30:00 GMT 2019


* Adhemerval Zanella:

> For tst-cancel2.c, if I add a sleep (1) between pthread_create and 
> pthread_cancel you can see this issue more clearly (dump with strace):
>
> [pid  2587] set_robust_list(0x7fffabccf290, 24) = 0
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000) = 100000
> [pid  2587] write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 100000 <unfinished ...>
> [pid  2586] <... nanosleep resumed>0x7ffff0c9e7f0) = 0
>
> ########### Cancellation starts to act here, by loading libgcc for unwinding
> [pid  2586] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 5
> [pid  2586] fstat(5, {st_mode=S_IFREG|0644, st_size=63776, ...}) = 0
> [pid  2586] mmap(NULL, 63776, PROT_READ, MAP_PRIVATE, 5, 0) = 0x7fffabf00000
> [pid  2586] close(5)                    = 0
> [pid  2586] open("/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 5
> [pid  2586] read(5, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\25\0\1\0\0\0\340+\0\0\0\0\0\0"..., 832) = 832
> [pid  2586] fstat(5, {st_mode=S_IFREG|0755, st_size=133696, ...}) = 0
> [pid  2586] mmap(NULL, 197688, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 5, 0) = 0x7fffab480000
> [pid  2586] mmap(0x7fffab4a0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x10000) = 0x7fffab4a0000
> [pid  2586] close(5)                    = 0
> [pid  2586] mprotect(0x7fffab4a0000, 65536, PROT_READ) = 0
> [pid  2586] munmap(0x7fffabf00000, 63776) = 0
> [pid  2586] tgkill(2586, 2587, SIGRTMIN) = 0
> [pid  2586] close(3)                    = 0
> [pid  2586] futex(0x7fffabccf280, FUTEX_WAIT, 2587, NULL <unfinished ...>
>
> ########### Write returns with broken PIPE and __pthread_disable_asynccancel is called
> [pid  2587] <... write resumed>)        = -1 EPIPE (Broken pipe)
> [pid  2587] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=2586, si_uid=61684} ---
> [pid  2587] --- SIGRTMIN {si_signo=SIGRTMIN, si_code=SI_TKILL, si_pid=2586, si_uid=61684} ---
> [pid  2587] futex(0x7fffab4b0224, FUTEX_WAKE_PRIVATE, 2147483647) = 0
>
> ########### No side-effects reported back to program
> [pid  2587] madvise(0x7fffab4c0000, 8257536, MADV_DONTNEED) = 0
> [pid  2587] exit(0)                     = ?
>
> With the BZ#12683 fix, the cancellation is not acted upon, and the test case
> then fails depending on whether the write is interrupted by the cancellation signal.

Hmm.  Which cancellation implementation is this?  At which point in the
trace do we start unwinding?  I'm surprised that strace reports the
EPIPE before the SIGPIPE, but maybe that's just a kernel race.  My
expectation is that the current code unwinds after the system call
returns with the EPIPE error, never returning it to the application.  I
think this is the right behavior for the write system call.

Thanks,
Florian


