Re: Race condition that leads to random crashes in cygwin-based builds.

On Jul 24 17:25, Andrey Khalyavin wrote:
> Hi, we have build bots that crash randomly on Windows XP and rarely on
> Windows 7. These bots use our compiler, which runs under cygwin. Although
> crashes are rare, we have ~20 bots, which makes green builds almost
> impossible. I tried to reproduce these crashes on my local Windows XP
> computer, and after several days (on the bots, crashes are much more
> frequent, maybe because they use virtual machines) I got a crash dump.
> Investigation of this crash dump showed that wincapc::init in winsup\cygwin\
> called api_fatal ("Cygwin requires at least Windows 2000."). This function
> is called during cygwin1.dll initialization, before any code in our
> compiler (cc1.exe) has been executed. Further investigation showed that
> the wincapc variable is in a shared section:
> wincapc wincap __attribute__((section (".cygwin_dll_common"), shared));
> but the wincapc::init() function has no synchronization of its own and is
> called from dll_crt0_0 without any synchronization. Using shared variables
> without synchronization is a sure way to get random failures. Here is one
> scenario that can lead to api_fatal being called:
> 1. No cygwin processes exist in the system.
> 2. Two cygwin processes are started simultaneously.
> 3. The first process enters wincapc::init, clears the version field with
>    memset and executes
>    version.dwOSVersionInfoSize = sizeof (OSVERSIONINFOEX)
> 4. A task switch happens and the second process enters wincapc::init. It
>    sees that the caps field is not initialized yet and clears the version
>    field with memset.
> 5. A task switch happens and the first process proceeds to call
>    GetVersionEx with the version field cleared by memset, and so without
>    its size set.
> 6. GetVersionEx returns an error and the first process fails to start.
> If there is no easy way to add synchronization to wincapc::init, I
> suggest making wincap a regular (not shared) variable.

There's another way, afaics.  The idea here was that wincap is only
ever set once, and even *if* the information is written twice, the
content will be identical.

So, afaics, the above problem is a result of using memset at all.  At
startup, wincap is all 0 anyway, so the memset is not required and
apparently it even hurts.  Weird that nobody saw this problem before.
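
To make the interleaving concrete, here is a rough single-threaded replay
of steps 3-6 from the report.  This is only a sketch, not the Cygwin code:
FakeVersionInfo, fake_get_version and first_process_starts are made-up
names, and fake_get_version merely imitates the one relevant property of
GetVersionEx, namely that it fails when the caller has not filled in the
size field.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the shared version record (not the real
// OSVERSIONINFOEX layout).
struct FakeVersionInfo {
    unsigned size;   // plays the role of dwOSVersionInfoSize
    unsigned major;  // plays the role of dwMajorVersion
};

// Hypothetical stand-in for GetVersionEx: it fails unless the caller
// set the size field beforehand.
static bool fake_get_version(FakeVersionInfo *v) {
    if (v->size != sizeof(FakeVersionInfo))
        return false;
    v->major = 5;
    return true;
}

// Replays steps 3-6 of the report in a fixed order on one thread.
// Returns true if the first process would get past the version query.
static bool first_process_starts(bool use_memset) {
    FakeVersionInfo shared = {};  // the shared section starts out zero-filled

    // Step 3: first process clears the record (optionally), sets the size.
    if (use_memset)
        std::memset(&shared, 0, sizeof shared);
    shared.size = sizeof shared;

    // Step 4: second process is scheduled and starts its own init.
    if (use_memset)
        std::memset(&shared, 0, sizeof shared);

    // Steps 5-6: first process resumes and queries the version.
    return fake_get_version(&shared);
}

// Prints the outcome for both variants.
static int run_demo() {
    std::printf("with memset:    %s\n",
                first_process_starts(true) ? "starts" : "fails");
    std::printf("without memset: %s\n",
                first_process_starts(false) ? "starts" : "fails");
    return 0;
}
```

With the memset, the second process wipes the size the first one just
stored, so the query fails.  Without it, every store any process performs
writes the same value, so the order in which they happen no longer
matters.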

I applied a patch which should fix this problem.  Please give the
next developer snapshot a try, or build it yourself from CVS.


Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

