This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/4] nptl: Handle EPIPE on tst-cancel2



On 30/08/2019 08:51, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> On 26/08/2019 07:06, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>
>>>>> In the new implementation, if there is a signal handler and a signal for
>>>>> it is delivered before SIGCANCEL (even if the signal was sent *after*
>>>>> pthread_cancel, from the same thread), I do not think there is any
>>>>> chance whatsoever that we can hide the behavior difference.  After all,
>>>>> SIGCANCEL may not trigger an asynchronous cancellation from the signal
>>>>> handler.
>>>>
>>>> Yes, although SIGCANCEL would be set as SA_RESTART, it is explicit disabled
>>>> by the signal handle by removing it from ucontext_t signal mask.
>>>
>>> Does this mean a thread would lose its ability to be canceled in
>>> blocking system calls if it setjmps out of a signal handler?  (Where as
>>> sigsetjmp would keep cancellation working.) 
> 
>> The setjmp does not really matter here, once the async-cancellation fails
>> to act due side-effects SIGCANCEL is removed from thread signal mask. The
>> idea of making it sticky is always allow the thread to act on such cases
>> regardless whether it calls another cancellation entrypoint.
> 
> I see.  But why are we doing this?  Can't we suppress the cancel signal
> inside pthread_cancel.  Then we don't have to worry about the location
> of the signal mask in the signal context.

I guess we can, it would require a sigprocmask to obtain the current
mask, and another one to remove the cancellation signal.  I still think
it cleaner to just sets on the signal handler, since the kernel will
restore the signal handler after the userspace returns.

> 
>> The thread will still be cancelled in the next cancellation entrypoint
>> due it is marked as CANCELED_BIT.  The __syscall_cancel_arch will check
>> for the bit and call __do_cancel before issuing the syscall.
> 
> Sure, but we can exit the SIGCANCEL handler if we detect that
> cancellation is already in progress.

I think it is possible, but I see that enforcing it using the kernel facilities
to avoid extra code in userspace is cleaner.

> 
>>>>> System calls that are cancellation points appear to fall into these
>>>>> categories:
>>>>>
>>>>> (A) Those that do not perform any resource allocation (write, read).
>>>>>
>>>>> (B) Those that perform resource allocation, but the allocation can be
>>>>>     easily reverted (openat, accept4).
>>>>>
>>>>> (C) Those that perform resource allocation, but the allocation is
>>>>>     difficult to undo (recvmsg with descriptor passing).
>>>>>
>>>>> (D) close.
>>>
>>>>> For (B), maybe we should undo the resource allocation and then proceed
>>>>> to act on the cancellation.
>>>>
>>>> Besides this requires a lot of more complexity by mapping what kind of
>>>> resources each syscall would require to free and when to free based on
>>>> returned codes, it would hide it from the application. My view we just
>>>> need to return that the syscall has visible side-effects that may result
>>>> in system resources leaks and let the application deal with.
>>>
>>> My concern is that there might be no further cancellation point, and if
>>> we do not cancel in the case of (B), we might fail to act on the
>>> cancellation even though the canceled thread was blocked at a
>>> cancellation point when pthread_cancel was called.
>>
>> I shared your concern, but my understanding of the whole BZ#12683
>> issue is we can't really cancel the thread for the (B) case.
> 
> Even if we undo the resource allocation before that?  That is, check
> that a cancellation has occurred on the system call return path, and if
> it has, close the descriptor and start unwinding.

What if the application requires extra steps for the case the syscall
returns a side-effects? What about some specific syscalls, such as
passing file descriptor using recvmsg? It would need the libc to map
*all* possible side-effects for each syscall, determine which might
leak side-effect and acts uppon then. What if the kernel, at some version,
added another feature to share resources that might create side-effects?
We will need to constantly map such cases, adding even more arch-specific
glue code to handle it.

I really think we should not go on this path.

> 
>>> I totally agree that the descriptor must not leak.  However,
>>> tst-cancel28 is written in such a way that it does not expect an error
>>> from open:
>>>
>>> static void *
>>> leaker (void *arg)
>>> {
>>>   int fd = open (arg, O_RDONLY);
>>>   pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, 0);
>>>   close (fd);
>>>   return NULL;
>>> }
>>>
>>> It would be clearer if it actually used error checking, but we call
>>> close (-1) on error, which is invalid.
>>
>> We can hardness the test to check for open failures (due file descriptor
>> exhaustion or other arcane issue), but the point of the test is fd is
>> always for the *close* call.  If the open syscall is interrupted before the
>> file descriptor is actually opened, the thread will be cancelled before
>> close call.  Otherwise, the kernel will have to return a valid value
>> and indicate on the PC from the ucontext_t that the syscall has side-effect.
> 
> My point is that the test should still be valid without the close call
> because the open is always canceled according to POSIX.
> 
>>> If this test represents how application code is actually written, I
>>> think we really need to check for cancellation before returning the file
>>> descriptor.
>>>
>>>> If glibc starts to add extra semantics for cancellation, we will have
>>>> a standard extensions since now 'open', for instance, would close the
>>>> file descriptor if the cancellation signal is delivered after the OS
>>>> allocates the file descriptor resource.  IMHO this is clearly what the
>>>> program itself should be handling, because it might the case where it
>>>> is prepared to close itself the file descriptor in a different thread.
>>>
>>> open is a cancellation point, so according to POSIX, the thread is
>>> canceled if the open call completes after pthread_cancel returns.  (Due
>>> to integration with the memory model, thread ordering issues are
>>> difficult to reason about in POSIX, though.)  I do not think we should
>>> return a descriptor to the application that can only exist because of
>>> something that happened *after* the pthread_cancel call.
>>
>> That's not the interpretation I see, the cancellation description for
>> such cases is afaiu:
>>
>> "The side-effects of acting upon a cancellation request while suspended
>> during a call of a function are the same as the side-effects that may be
>> seen in a single-threaded program when a call to a function is interrupted
>> by a signal and the given function returns [EINTR]"
>>
>> And on signal concepts:
>>
>> "[EINTR]
>> Interrupted function call. An asynchronous signal was caught by the process 
>> during the execution of an interruptible function. If the signal handler 
>> performs a normal return, the interrupted function call may return this 
>> condition (see the Base Definitions volume of POSIX.1-2017, <signal.h>)."
>>
>> And my understanding is, since glibc implemented cancellation using signals
>> and it requires to use SA_RESTART we can actually return to user if the
>> syscall has side-effects visible to caller.
> 
> The open function cannot allocate a new file descriptor and fail with
> EINTR.  So I don't see how this case applies here.  Sorry.

But that is *exactly* the point. The open *does not* fail, but cancellation acts
*before* it can return to the callee.  We can't really make it fail with EINTR
and acts on the cancellation. The idea is "the interrupted function call may 
return this condition".

> 
>>>>> (D) is very special.  Ideally, we would specify what happens with the
>>>>> descriptor if the close call is canceled.  POSIX does not even specify
>>>>> what the state of file descriptor is after an EINTR error, so it doesn't
>>>>> say what happens with cancellation, either.  Maybe we have to leave that
>>>>> undefined.
>>>>
>>>> Indeed even after the POSIX close language clarification [1], the specification 
>>>> is messy (it really does not help that some system does have interruptible
>>>> close implementation that might return -1).
>>>>
>>>> However my understanding is we can assume POSIX_CLOSE_RESTART 0 for Linux
>>>> since it shouldn't return -1/EINTR in any case (and I don't know if/when
>>>> there are kernels that actually does it) [2].  So assuming the nptl is Linux 
>>>> specific, we can assume that, although close is specified as a cancellable
>>>> entrypoint, it won't happen in practice.
>>>>
>>>> [1] http://austingroupbugs.net/view.php?id=529
>>>> [2] https://lwn.net/Articles/576478/
>>>
>>> I think this is slightly different from the EINTR case.  With
>>> SA_RESTART, perhaps the signal handler can run *before* the descriptor
>>> is closed, so if we never return from the handler, the descriptor leaks?
>>
>> My understanding is the cancellation handler for close won't interrupt the
>> syscall, regardless SA_RESTART.
> 
> What do you mean by “interrupt” here?  That close will not return with
> EINTR?  Or that the signal handler does not run?

My understanding on the discussions is Linux close implementation does not
return EINTR/-1 in any case (and if it did in the past it was a bug).

> 
>> But indeed it might be the case where the
>> signal is acted just before the syscall is issue and after the test for
>> CANCELED_BITMASK is done.  In this case the thread will be cancelled and
>> the descriptor leaked.  The musl handler SYS_close differently by not using
>> the syscall bridge, so cancellation can't be acted upon and so close always
>> succeed.
> 
> That seems reasonable.
> 
> If we check for cancellation after the close system call (successful or
> not), then I think we have a compliant, leak-free close implementation.

In my understanding musl does this to prevent possible problematic cancel
syscall implementation that might return -1/EINTR (since for this case
syscall will be restart due SA_RESTART and cancellation won't acted upon).


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]