This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH 2/4] nptl: Handle EPIPE on tst-cancel2
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Florian Weimer <fweimer at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Tue, 3 Sep 2019 15:17:37 -0300
- Subject: Re: [PATCH 2/4] nptl: Handle EPIPE on tst-cancel2
- References: <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org> <email@example.com> <firstname.lastname@example.org>
On 02/09/2019 09:51, Florian Weimer wrote:
> * Adhemerval Zanella:
>> I think it is possible, but I see that enforcing it using the kernel
>> facilities to avoid extra code in userspace is cleaner.
>>> Even if we undo the resource allocation before that? That is, check
>>> that a cancellation has occurred on the system call return path, and if
>>> it has, close the descriptor and start unwinding.
>> What if the application requires extra steps for the case the syscall
>> returns a side-effects? What about some specific syscalls, such as
>> passing file descriptor using recvmsg? It would need the libc to map
>> *all* possible side-effects for each syscall, determine which might
>> leak side-effect and acts uppon then. What if the kernel, at some version,
>> added another feature to share resources that might create side-effects?
>> We will need to constantly map such cases, adding even more arch-specific
>> glue code to handle it.
>> I really think we should not go on this path.
> I agree now, given what we've got from the kernel. See my response to
>>>> That's not the interpretation I see, the cancellation description for
>>>> such cases is afaiu:
>>>> "The side-effects of acting upon a cancellation request while suspended
>>>> during a call of a function are the same as the side-effects that may be
>>>> seen in a single-threaded program when a call to a function is interrupted
>>>> by a signal and the given function returns [EINTR]"
>>>> And on signal concepts:
>>>> Interrupted function call. An asynchronous signal was caught by the process
>>>> during the execution of an interruptible function. If the signal handler
>>>> performs a normal return, the interrupted function call may return this
>>>> condition (see the Base Definitions volume of POSIX.1-2017, <signal.h>)."
>>>> And my understanding is, since glibc implemented cancellation using signals
>>>> and it requires to use SA_RESTART we can actually return to user if the
>>>> syscall has side-effects visible to caller.
>>> The open function cannot allocate a new file descriptor and fail with
>>> EINTR. So I don't see how this case applies here. Sorry.
>> But that is *exactly* the point. The open *does not* fail, but
>> cancellation acts *before* it can return to the callee. We can't
>> really make it fail with EINTR and acts on the cancellation. The idea
>> is "the interrupted function call may return this condition".
> My uneasiness stems from the fact that we're reading POSIX backwards,
> i.e., our interpretation (necessarily, given the kernel interfaces)
> behaves in a way that is obviously to complying with a direct reading of
> POSIX (a to-be-canceled thread synchronizing with the canceling thread,
> when the pthread_cancel call happened before the synchronization in
> program order in the canceling thread), and we then proceed to find
> reasons why this not totally broken.
> So, back to the start of the thread: Can we avoid the EPIPE error in an
> other way?
> Can we remove this line?
> /* This will cause the write in the child to return. */
> close (fd);
> Cancellation should still happen if the write is blocked, right? It's
> not really clear to me what the intent of the test was.
I think Rich has answered it better than me that using signals to implement
thread cancellation is an implementation detail and POSIX allows us for this
case to not act on some cancellation signals if it leads to visible
side-effects that might lead to resource leakage.
Said that, it would be possible to just remove the close call to avoid the
EPIPE error on write. The cancellation should still happen because the
write is called on a loop, so since it might return with a visible side-effect
(partial write), it would be cancelled in the next write call due the thread
being marked as cancelled.
I really don't have a strong opinion which change is the better, my point is
the current code is racy and it will start to fail more often once we fix