This is the mail archive of the
mailing list for the Cygwin project.
Re: Losing track of processes?
- From: Brian Dessent <brian at dessent dot net>
- To: cygwin at cygwin dot com
- Date: Wed, 13 Apr 2005 20:48:01 -0700
- Subject: Re: Losing track of processes?
- Organization: My own little world...
- References: <Pine.CYG.4.58.0504132311170.1604@Crunch.bcgssbd.sciatl.com>
- Reply-to: cygwin at cygwin dot com
"Shaffer, Kenneth" wrote:
> I have a suite of scripts which process logs but get hung after two
> hours. My initial looking into it shows that cygwin ps command thinks the
> processes are present, but windows task manager doesn't see them at all.
> It's as if the parent wasn't informed that its child died. Perhaps a wait
> system call isn't working or memory corruption of data structures
> containing this information or race conditions on accessing data
> structures, etc.
> I have run into this off and on since cygwin1.dll 1.5.12 always hoping
> that new versions would make the problem go away. I seem to recall similar
> posts by others. I'm now running the 1.5.15 4/12 snapshot.
> Anyway hoping there might be suggestions to help track this down. I'll try
> strace one more time (when run with it before, the problem did not occur).
If you're not using the -17 (test) version of bash, try that. Bash has
a problem where if a PID is reused, it will get confused and continue to
return the exit value of the original process with that PID and not
retrieve the exit status of the second process that was spawned with a
duplicated PID. Or something like that. (The details are in the
archive.) Anyway, the -17 version includes a fix, but is marked 'test'
so you won't get it unless you explicitly select that version.
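
To make the PID-reuse confusion concrete, here is a toy Python model of
the behaviour described above. It is not bash's actual code: the class
and method names are made up for illustration, and the only assumption
is that exit statuses are remembered per PID.

```python
class StaleStatusTable:
    """Toy model: remembers exit statuses by PID and never replaces them."""

    def __init__(self):
        self._status_by_pid = {}

    def record_exit(self, pid, status):
        # Behaviour being modelled: if the PID already has an entry,
        # the old status is kept and the new one is silently dropped.
        self._status_by_pid.setdefault(pid, status)

    def exit_status(self, pid):
        return self._status_by_pid[pid]


class FreshStatusTable(StaleStatusTable):
    """Same table, but a reused PID overwrites the old entry."""

    def record_exit(self, pid, status):
        self._status_by_pid[pid] = status


if __name__ == "__main__":
    for table in (StaleStatusTable(), FreshStatusTable()):
        table.record_exit(1234, 0)   # first process with PID 1234 succeeds
        table.record_exit(1234, 1)   # second process reuses PID 1234 and fails
        print(type(table).__name__, table.exit_status(1234))
```

With the stale table the caller keeps seeing the first process's exit
status, which is the symptom described above.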
As to why you are seeing processes in ps that don't exist in task
manager, I have no idea. It could be that Cygwin is still retaining
information about those processes that have terminated because nothing
has yet called wait() on them to retrieve their exit status. I don't
think the notion of zombie processes exists in Windows, so Cygwin has to
emulate it. But, that's just wild speculation.
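
For reference, this is how the zombie/wait() relationship looks on a
POSIX-style system. A minimal Python sketch, assuming os.fork() is
available (it is not in native Windows Python); the helper names are
hypothetical. Whether the stale entries in ps actually come from this
mechanism is, again, speculation.

```python
import os


def spawn_and_reap(exit_code):
    """Fork a child that exits at once, then collect its status."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping Python cleanup handlers.
        os._exit(exit_code)
    # Parent: until waitpid() is called, the system has to keep the
    # dead child's exit status around (the child is a zombie).
    reaped_pid, status = os.waitpid(pid, 0)
    return pid, reaped_pid, os.WEXITSTATUS(status)


def already_reaped(pid):
    """Return True if there is no longer a child with this PID to wait for."""
    try:
        os.waitpid(pid, 0)
    except ChildProcessError:
        # ECHILD: the status was already collected, the entry is gone.
        return True
    return False


if __name__ == "__main__":
    pid, reaped_pid, code = spawn_and_reap(7)
    print(pid == reaped_pid, code, already_reaped(pid))
```

Once the parent has called waitpid(), the entry is released; a parent
that never waits leaves it hanging around.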
What I do know is that if there is a real bug hiding in here somewhere,
it will never get fixed until someone can narrow it down to something
that is reproducible. If you can manage to whittle down your script
into a generic testcase that exhibits the problem, then at least someone
could look into it. But until then, or until someone who can reproduce
the problem and is familiar enough with Cygwin can debug what's going
on, I don't think anything is going to happen.
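
As a starting point, a reduced testcase might look something like the
sketch below: spawn many short-lived children with known exit codes and
report any whose exit status comes back wrong. This is only an
illustration in Python with a hypothetical helper name; the real
testcase would have to be cut down from the actual scripts, and may well
need to be a shell script in order to exercise bash itself.

```python
import subprocess
import sys


def run_children(count):
    """Spawn `count` short-lived children; return any status mismatches."""
    mismatches = []
    for i in range(count):
        expected = i % 256  # exit codes only carry 8 bits
        proc = subprocess.run(
            [sys.executable, "-c", f"import sys; sys.exit({expected})"]
        )
        if proc.returncode != expected:
            mismatches.append((i, expected, proc.returncode))
    return mismatches


if __name__ == "__main__":
    # The iteration count is arbitrary; raise it until the problem shows up.
    for iteration, expected, actual in run_children(1000):
        print(f"child {iteration}: expected {expected}, got {actual}")
```

A large count makes PID reuse more likely, though how many iterations
that takes is something you would have to find out by experiment.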