Trying to pin down a Cygwin crash bug; need a bit of help

David Dindorp ddi@dubex.dk
Fri Jan 14 13:38:00 GMT 2005


Hi

I need some information on how to debug Cygwin processes.

We have a partially Cygwin-based project that is having some problems.
The Cygwin part of the project has some scripts running in a bash
process.

The scripts seem to run fine for a random amount of time.
Sometimes it's mere minutes or hours, but most often it's a day or two.
Then, suddenly, the script just halts.

We had a similar problem once, where one of the Cygwin processes ate
100% CPU. That was due to missing Cygwin registry keys.

This time, however, none of the Cygwin-based processes are gobbling up
CPU, so this looks more like a race condition than an endless loop.

The location inside the scripts where the halt happens seems not to
matter much; however, the problem seems to occur only when child 'bash'
processes are in use.

I'm guessing that the parent processes are just waiting for their
child process, and so it's the inner-most child process that has hung.
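
A minimal sketch of how the inner-most child could be picked out of a
process listing. It assumes the input is one "PID PPID" pair per line;
that format and the function name leaf_pids are made up for
illustration, and extracting the two columns from 'ps' output is left
out because it depends on the column layout.

```shell
# Print every PID that never appears as a parent, i.e. the leaf
# (inner-most) processes. Input on stdin: "PID PPID" per line
# (hypothetical format, see above).
leaf_pids() {
    awk '
        NF >= 2 { pids[$1] = 1; parents[$2] = 1 }
        END {
            for (p in pids)
                if (!(p in parents))
                    print p
        }
    ' | sort -n
}

# Example: 100 is the parent of 200, which is the parent of 300.
printf '100 1\n200 100\n300 200\n' | leaf_pids
```

A PID printed by this is a process that nothing else is waiting on,
which makes it the first candidate to attach gdb to or to dump.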

I've made 4 core dumps of the inner-most child process on 4 different
occasions; see them at the bottom of this mail. cygwin_split_path(),
getppid() and others seem to appear in all of them.

How can I find out what the problem is from here?

Regards,
David



Here's the gdb information from the core files.
On my local PC, gdb says "Previous frame identical to this frame
(corrupt stack?)".
It doesn't do this when gdb is executed on the machine where the script
ran.
Cygwin DLL version is 1.5.10. OS is Windows 2000 Service Pack 3
(5.00.2195).

=== 1st core dump ===
(gdb) info threads
  3 process 2632  0x77e88785 in KERNEL32!GetModuleFileNameA ()
  2 process 3776  0x77f839eb in ntdll!ZwReadFile ()
* 1 process 3648  0x77f8376e in ntdll!ZwClose ()
(gdb) bt
#0  0x77f8376e in ntdll!ZwClose ()
#1  0x77e87738 in KERNEL32!CloseHandle ()
#2  0x61073e06 in cygwin_split_path ()
#3  0x6109afe4 in getppid ()
#4  0x00000005 in ?? ()
#5  0x00000001 in ?? ()
#6  0x00000001 in ?? ()
=====================

=== 2nd core dump ===
(gdb) info threads
  3 process 3720  0x77e88785 in KERNEL32!GetModuleFileNameA ()
  2 process 2984  0x77f839eb in ntdll!ZwReadFile ()
* 1 process 3580  0x77f8376e in ntdll!ZwClose ()
(gdb) bt
#0  0x77f8376e in ntdll!ZwClose ()
#1  0x77e87738 in KERNEL32!CloseHandle ()
#2  0x61073e06 in cygwin_split_path ()
#3  0x6109afe4 in getppid ()
#4  0x00000005 in ?? ()
#5  0x00000001 in ?? ()
#6  0x00000001 in ?? ()
=====================

=== 3rd core dump ===
(gdb) info threads
  3 process 3888  0x77e88785 in KERNEL32!GetModuleFileNameA ()
  2 process 3656  0x77f839eb in ntdll!ZwReadFile ()
* 1 process 3596  0x77f8376e in ntdll!ZwClose ()
(gdb) bt
#0  0x77f8376e in ntdll!ZwClose ()
#1  0x77e87738 in KERNEL32!CloseHandle ()
#2  0x61073e06 in cygwin_split_path ()
#3  0x6109afe4 in getppid ()
#4  0x00000005 in ?? ()
#5  0x00000001 in ?? ()
#6  0x00000001 in ?? ()
=====================

=== 4th core dump ===
(gdb) info threads
  3 process 4252  0x77e88785 in KERNEL32!GetModuleFileNameA ()
  2 process 3300  0x77f839eb in ntdll!ZwReadFile ()
* 1 process 4148  0x77f8376e in ntdll!ZwClose ()
(gdb) bt
#0  0x77f8376e in ntdll!ZwClose ()
#1  0x77e87738 in KERNEL32!CloseHandle ()
#2  0x61073e06 in cygwin_split_path ()
#3  0x6109afe4 in getppid ()
#4  0x00000005 in ?? ()
#5  0x00000001 in ?? ()
#6  0x00000001 in ?? ()
=====================

