This is the mail archive of the
mailing list for the glibc project.
Re: [PATCH v3 2/2] posix: Use posix_spawn on system
On 30/11/2018 16:34, Florian Weimer wrote:
> * Adhemerval Zanella:
>> On 30/11/2018 13:21, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>> On 29/11/2018 15:37, Florian Weimer wrote:
>>>>> * Adhemerval Zanella:
>>>>>> +/* We have to and actually can handle cancelable system(). The big
>>>>>> + problem: we have to kill the child process if necessary. To do
>>>>>> + this a cleanup handler has to be registered and it has to be able
>>>>>> + to find the PID of the child. The main problem is to reliable have
>>>>>> + the PID when needed. It is not necessary for the parent thread to
>>>>>> + return. It might still be in the kernel when the cancellation
>>>>>> + request comes. Therefore we have to use the clone() calls ability
>>>>>> + to have the kernel write the PID into the user-level variable. */
>>>>> This comment does not look relevant to me anymore.
>>>> I think it still worth to mention glibc system aims to be thread-safe,
>>>> which requires restore the signal dispositions for SIGINT and SIGQUIT
>>>> correctly and to deal with cancellation by terminating the child process.
>>>> +/* This system implementation aims to be thread-safe, which requires restore
>>>> + the signal dispositions for SIGINT and SIGQUIT correctly and to deal with
>>>> + cancellation by terminating the child process. */
>>> I don't think you restore SIGINT and SIGQUIT correctly for concurrent
>>> system calls. This is what the ADD_REF code in the old version
>>> attempted to do.
>> It is not strictly incorrect, although Linux sigaction is not really
>> thread-safe (due the copy in/out kernel sigaction structure). And it
>> is indeed not optional, and I agree that relying on this benign data race
>> behaviour is not correct. Below it is an updated patch with ref counter
> It's not about the data race, it is about the higher-level race
> condition. The problem is that the first thread to enter system and
> capture the original signal state may not be the last to leave system
> and restore things.
Even with current implementation the program would need to coordinate
sigaction with system, so I am not sure which would be best option
(try to get some sanity on concurrent system or let the program
>> +static void
>> +cancel_handler (void *arg)
>> + struct cancel_handler_args *args = (struct cancel_handler_args *) (arg);
>> + __kill_noerrno (args->pid, SIGKILL);
>> + TEMP_FAILURE_RETRY (__waitpid (args->pid, NULL, 0));
> One last question (I promise): Should this be the nocancel variant?
The cancel_handler will be called from within the sigcancel_handler
(through unwind mechanism) and SIGCANCEL is not installed SA_NODEFER,
so I can't see how another cancellation can act. However for consistency
using __waitpid_nocancel does make sense. I changed it locally.