Deadlock of the process tree when running make

Takashi Yano takashi.yano@nifty.ne.jp
Wed Apr 27 11:22:16 GMT 2022


Hi Alexey,

On Sat, 16 Apr 2022 16:21:34 +0300
Alexey Izbyshev wrote:
> On 2022-04-16 12:39, Takashi Yano wrote:
> > I am not sure yet what is essential, but the current code closes
> > the pseudo console only if no other process is attached to it.
> > I wonder why javac.exe remains as a zombie. The parent bash.exe
> > calls ClosePseudoConsole() when the child non-cygwin app has
> > terminated, i.e., after WaitForSingleObject() on the child
> > process handle returns.
> > https://www.cygwin.com/git/?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=7ac0767053e278f0ce9811bf6f77278bd2f49c20#l1009
> > 
> > What does "zombie" mean here? Is it listed in the process list of
> > ProcessHacker? I still suspect that the zombie javac.exe holds
> > the hWritePipe handle leaked from the parent bash.exe.
> > 
> By "zombie" I meant the same thing as in the Linux kernel: a data
> structure that remains after a process has terminated but has not
> been waited for yet (I don't know how this is implemented in Cygwin).
> So there is no javac.exe process in ProcessHacker, but "ps" and
> similar tools in Cygwin still list "javac".
> 
> I'm now trying to create a small reproducer that I can share, and I've 
> had a first small success this night: I could get a very similar hang 
> with a simple Makefile and a script with Cygwin 3.3.4. Here is the tree:
> 
> make(14479)-+-bash(14484)---bash(14611)
>              |-bash(14515)---bash(14618)
>              |-bash(14491)---bash(14500)---bash(14612)
>              |-bash(14501)---bash(14510)---bash(14605)
>              |-bash(14505)---bash(14607)
>              |-bash(14494)---bash(14617)
>              |-bash(14506)---bash(14513)---bash(14610)
>              |-bash(14512)---bash(14518)---bash(14615)
>              |-bash(14486)---bash(14495)---bash(14606)
>              |-bash(14483)---bash(14490)---bash(14609)
>              |-bash(14509)---bash(14614)
>              |-bash(14489)---bash(14608)
>              |-bash(14499)---bash(14613)
>              |-bash(14481)---bash(14485)---python(14588)
>              |-bash(14496)---bash(14504)---bash(14616)
>              `-bash(14482)---bash(14604)
> 
> 
> "python" is a zombie, just as "javac" is in the original case. There is
> also a single "conhost.exe" again, and all 5 of its threads are doing
> the same things as in the original case (including the signal pipe
> thread trying to EnterCriticalSection()). The only difference is that
> the leaf bash.exe processes are trying to acquire the pcon mutex at a
> different point [1], but I guess this difference is not important.
> 
> I'll try this reproducer with your patched DLL as well as on another 
> machine and share it in case of success.
> 
> Thanks,
> Alexey
> 
> [1] 
> https://www.cygwin.com/git?p=newlib-cygwin.git;a=blob;f=winsup/cygwin/spawn.cc;h=81dba5a941e919ea2514013069aef22c6fad8004;hb=cygwin-3_3_4-release#l697
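
For reference, the POSIX meaning of "zombie" quoted above can be
illustrated with a minimal sketch. It shows generic fork/wait behaviour
only and says nothing about how Cygwin tracks processes internally; the
function name make_zombie_then_reap is hypothetical, and the pipe is just
one assumed way to learn that the child has exited without waiting for it.

```python
import os


def make_zombie_then_reap():
    """Return (entry_before_wait, entry_after_wait, exit_code) for a child."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: exit at once. All of its descriptors are closed on exit.
        os._exit(7)

    # Parent: drop our write end, then block until the child's copy is
    # closed as well, which happens when the child terminates.
    os.close(write_fd)
    os.read(read_fd, 1)  # returns b"" at end-of-file
    os.close(read_fd)

    def entry_exists(p):
        try:
            os.kill(p, 0)  # signal 0 only checks that the PID still exists
            return True
        except ProcessLookupError:
            return False

    # The child is gone, but its process-table entry stays until it is
    # waited for. This leftover entry is the "zombie".
    before = entry_exists(pid)
    _, status = os.waitpid(pid, 0)  # reap the child
    after = entry_exists(pid)
    return before, after, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(make_zombie_then_reap())
```

In the hang described above, the equivalent of the waitpid() step
apparently never completes for "python", so the entry stays visible in
"ps".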

Is there any progress on this?

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

