Deadlock of the process tree when running make
Takashi Yano
takashi.yano@nifty.ne.jp
Wed Apr 27 11:22:16 GMT 2022
Hi Alexey,
On Sat, 16 Apr 2022 16:21:34 +0300
Alexey Izbyshev wrote:
> On 2022-04-16 12:39, Takashi Yano wrote:
> > I am not sure yet what is essential, but the current code closes
> > the pseudo console only if no other process is attached to it.
> > I wonder why javac.exe remains as a zombie. The parent bash.exe
> > calls ClosePseudoConsole() when the child non-cygwin app has
> > terminated, i.e., after WaitForSingleObject() on the child process
> > handle returns.
> > https://www.cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=7ac0767053e278f0ce9811bf6f77278bd2f49c20#l1009
> >
> > What does "zombie" mean here? Is it listed in the process list of
> > ProcessHacker? I still suspect that the zombie javac.exe holds
> > the hWritePipe handle leaked from the parent bash.exe.
> >
> By "zombie" I meant the same thing as in the Linux kernel: a data
> structure that remains after a process terminated, but hasn't been
> waited for yet (I don't know how this is implemented in Cygwin). So
> there is no javac.exe process in ProcessHacker, but "ps" and similar
> tools in Cygwin still list "javac".
>
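For reference, the Linux meaning of "zombie" described above can be illustrated with a minimal sketch. It assumes a Linux host with /proc and Python 3; child_state and demo are hypothetical helper names, and nothing here reflects how Cygwin tracks processes internally.

```python
import os
import time


def child_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field is the first token after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child terminates immediately

    # The parent has not waited yet, so the terminated child stays
    # in the process table as a zombie (state "Z").
    deadline = time.time() + 5
    state = child_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = child_state(pid)

    os.waitpid(pid, 0)  # reaping the child removes the zombie entry
    reaped = not os.path.exists(f"/proc/{pid}")
    return state, reaped


if __name__ == "__main__":
    print(demo())
```

Until waitpid() is called, the entry is visible to tools that read the process table even though the process itself no longer runs, which matches the "ps" versus ProcessHacker observation above.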
> I'm now trying to create a small reproducer that I can share, and I've
> had a first small success this night: I could get a very similar hang
> with a simple Makefile and a script with Cygwin 3.3.4. Here is the tree:
>
> make(14479)-+-bash(14484)---bash(14611)
>             |-bash(14515)---bash(14618)
>             |-bash(14491)---bash(14500)---bash(14612)
>             |-bash(14501)---bash(14510)---bash(14605)
>             |-bash(14505)---bash(14607)
>             |-bash(14494)---bash(14617)
>             |-bash(14506)---bash(14513)---bash(14610)
>             |-bash(14512)---bash(14518)---bash(14615)
>             |-bash(14486)---bash(14495)---bash(14606)
>             |-bash(14483)---bash(14490)---bash(14609)
>             |-bash(14509)---bash(14614)
>             |-bash(14489)---bash(14608)
>             |-bash(14499)---bash(14613)
>             |-bash(14481)---bash(14485)---python(14588)
>             |-bash(14496)---bash(14504)---bash(14616)
>             `-bash(14482)---bash(14604)
>
>
> "python" is a zombie, just as "javac" is in the original case. There is
> also a single "conhost.exe" again, and all of its 5 threads are doing
> the same things as in the original case (including the signal pipe
> thread trying to EnterCriticalSection()). The only difference is that
> the leaf bash.exe processes are trying to acquire the pcon mutex at a
> different point [1], but I guess this difference is not important.
>
> I'll try this reproducer with your patched DLL as well as on another
> machine and share it in case of success.
>
> Thanks,
> Alexey
>
> [1]
> https://www.cygwin.com/git?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=cygwin-3_3_4-release#l697
Is there any progress on this?
--
Takashi Yano <takashi.yano@nifty.ne.jp>