Race condition that leads to random crashes in cygwin-based builds.

Andrey Khalyavin halyavin@google.com
Tue Jul 24 13:25:00 GMT 2012

Hi, we have build bots that crash randomly on Windows XP and, rarely, on
Windows 7. These bots use our compiler, which runs under cygwin. Although
each crash is rare, we have ~20 bots, which makes green builds almost
impossible. I tried to reproduce these crashes on my local Windows XP
computer, and after several days I got a crash dump. (On the bots, crashes
are much more frequent, maybe because they use virtual machines.)

Investigation of this crash dump showed that wincapc::init called
api_fatal ("Cygwin requires at least Windows 2000."). wincapc::init is
called during cygwin1.dll initialization, before any code in our compiler
(cc1.exe) has been executed. Further investigation showed that the wincap
variable is in a shared section:
wincapc wincap __attribute__((section (".cygwin_dll_common"), shared));
but wincapc::init() has no synchronization of its own and is called from
dll_crt0_0 without any synchronization either. Using shared variables
without synchronization is a sure way to get random failures. Here is one
scenario that can lead to api_fatal being called:

1. No cygwin processes exist in the system.
2. Two cygwin processes are started simultaneously.
3. The first process enters wincapc::init, clears the version field with
memset and executes
version.dwOSVersionInfoSize = sizeof (OSVERSIONINFOEX)
4. A task switch happens and the second process enters wincapc::init. It
sees that the caps field is not initialized yet and clears the version
field with memset.
5. A task switch happens and the first process proceeds to call
GetVersionEx with version cleared by memset, so its size is no longer set.
6. GetVersionEx returns an error and the first process fails to start.
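
The interleaving in steps 3-6 can be replayed deterministically in a few
lines of portable C++. This is only a sketch: FakeVersionInfo and
fake_get_version_ex are hypothetical stand-ins for OSVERSIONINFOEX and
GetVersionEx, not Cygwin or Windows code, and the two "processes" are just
statements executed in the racing order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for OSVERSIONINFOEX; only the size field matters here.
struct FakeVersionInfo {
    std::size_t dwOSVersionInfoSize;
    unsigned dwMajorVersion;
};

// Hypothetical stand-in for GetVersionEx: it fails unless the caller
// filled in the structure size before the call.
bool fake_get_version_ex(FakeVersionInfo &v) {
    if (v.dwOSVersionInfoSize != sizeof(FakeVersionInfo))
        return false;
    v.dwMajorVersion = 5;  // arbitrary value, for illustration only
    return true;
}

// First half of the init sequence: clear the structure, then set its size.
void clear_and_set_size(FakeVersionInfo &v) {
    std::memset(&v, 0, sizeof v);
    v.dwOSVersionInfoSize = sizeof(FakeVersionInfo);
}

// Replays steps 3-5 in a fixed order and returns whether the first
// process's version query succeeds.
bool first_process_survives_race() {
    FakeVersionInfo shared_version;  // models the field in the shared section
    clear_and_set_size(shared_version);                      // step 3: process 1
    std::memset(&shared_version, 0, sizeof shared_version);  // step 4: process 2
    assert(shared_version.dwOSVersionInfoSize == 0);         // size was wiped
    return fake_get_version_ex(shared_version);              // step 5: process 1
}
```

Here first_process_survives_race() returns false, which corresponds to
step 6. Without the step 4 memset, the same query would succeed.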

If there is no easy way to add synchronization to wincapc::init, I
suggest making wincap a regular (not shared) variable.
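
For comparison, here is a rough sketch of what the synchronization
alternative could look like: a three-state flag claimed with an atomic
compare-and-swap, so that exactly one caller runs the initialization and
everyone else waits for it to finish. All names (SharedCaps,
ensure_initialized, the k* constants) are hypothetical, and std::atomic
here only models the idea within one program; this is not how Cygwin is
implemented.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr int kUninitialized = 0;
constexpr int kInitializing = 1;
constexpr int kReady = 2;

// Hypothetical model of the shared data plus an init-state flag.
struct SharedCaps {
    std::atomic<int> state{kUninitialized};
    std::size_t version_size = 0;  // stands in for the real version data
    int init_runs = 0;             // counts how often the init body ran
};

// Only the caller that wins the compare-and-swap initializes the data;
// every other caller waits until the winner publishes kReady.
void ensure_initialized(SharedCaps &caps) {
    int expected = kUninitialized;
    if (caps.state.compare_exchange_strong(expected, kInitializing)) {
        caps.version_size = sizeof(SharedCaps);  // placeholder for the real init
        ++caps.init_runs;
        caps.state.store(kReady, std::memory_order_release);
    } else {
        while (caps.state.load(std::memory_order_acquire) != kReady) {
            // spin until the initializing caller is done
        }
    }
    assert(caps.state.load() == kReady);
}
```

One weakness of this scheme is that if the initializing process dies
between kInitializing and kReady, every later caller spins forever. A
per-process (not shared) variable avoids that problem entirely.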

Andrey Khalyavin

