Problem with zombie processes
Tue Feb 14 11:14:00 GMT 2017
I encountered a strange bug concerning zombie child processes during
my testing. Specifically, it came up in the context of trying to read
the /proc/<pid> entries of processes that should be zombified but not
reaped yet (i.e. they have received and processed a SIGKILL or SIGTERM
but have not been wait()ed on yet).
In some cases it's still possible to read, for example,
/proc/<pid>/stat of the zombie process, while in other cases it fails
with errno 22 (EINVAL). In the latter case, this is coming from the
OpenProcess() call in format_process_stat() (or similarly in other
format_process_* methods) indicating that Windows has already removed
the process object from its process table. Or equivalently, there are
no more open handles to the child process.
What I don't understand is whether this is intentional. I feel like
Cygwin should try to keep the Windows process object alive as long as
it's a zombie process. But in some cases it does and in some cases it
doesn't.
The attached Python script shows two such cases, and I don't
understand quite where they differ. In one case, stdout from the
child process is being sent to the parent over a pipe. In the second
case stdout from the child is sent to /dev/null. In the first case
the process object is kept alive and I can read its /proc entries. In
the latter case it dies even before wait(). I'm not sure what the
difference is in terms of keeping the process object alive.
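For anyone reading without the attachment, the two cases can be approximated roughly as below. This is only a minimal sketch and not the attached script itself: the helper name probe_zombie_stat, the sleeping child command, the 0.5 second settle delay, and the use of subprocess.DEVNULL for the /dev/null case are all my assumptions for illustration. It kills a child, deliberately does not wait() on it, and then tries to read /proc/<pid>/stat, reporting either the state field or the errno of the failed read.

```python
import os
import signal
import subprocess
import sys
import time


def probe_zombie_stat(stdout_target, settle=0.5):
    """Kill a child, then try to read /proc/<pid>/stat before reaping it.

    Returns (state, err). state is the process state field (for example
    "Z") if the read succeeded, otherwise None. err is the errno of a
    failed read, otherwise None.
    """
    # Hypothetical child: any long-running command would do here.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        stdout=stdout_target,
    )
    try:
        os.kill(child.pid, signal.SIGKILL)
        # Give the child time to die. No wait() here, so it stays unreaped.
        time.sleep(settle)
        try:
            with open("/proc/%d/stat" % child.pid) as f:
                fields = f.read()
            # The state is the first field after the parenthesised command name.
            state = fields.rsplit(")", 1)[1].split()[0]
            return state, None
        except OSError as e:
            return None, e.errno
    finally:
        child.wait()  # reap the zombie now that the probe is done
        if child.stdout is not None:
            child.stdout.close()


if __name__ == "__main__":
    cases = (("pipe", subprocess.PIPE), ("/dev/null", subprocess.DEVNULL))
    for label, target in cases:
        state, err = probe_zombie_stat(target)
        print("stdout to %s: state=%r errno=%r" % (label, state, err))
```

If the behaviour described above reproduces, I would expect the pipe case to report a state and the /dev/null case to report errno 22, but I have not verified that this exact sketch triggers it.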
If Cygwin can't guarantee that the Windows process object is kept
around while the process is in zombie state, it would be nice, as an
alternative, to change the error handling in the format_process_*
methods to return as much info as they can, with other fields zeroed
out, rather than an error.