This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: 1.5.15: multiple simultaneous processes can fail to start


"Christopher Faylor" <cgf-no-personal-reply-please@cygwin.com> wrote in
message news:20050422000114.GE6431@trixie.casa.cgf.cx...
> On Thu, Apr 21, 2005 at 01:50:44PM -0700, Usman Muzaffar wrote:
> >Hello. We seem to be having difficulty when two Cygwin processes are
> >started simultaneously (or in close succession) on multiprocessor
> >Windows 2003 systems. The problem manifests as this error:
> >
> >    c:\cygwin\bin\which.exe (3824): *** system shared memory version
> >    mismatch detected - 0x0/0x75BE007F.
> >    This problem is probably due to using incompatible versions of the
> >    cygwin DLL.  Search for cygwin1.dll using the Windows
> >    Start->Find/Search facility and delete all but the most recent
> >    version.  The most recent version *should* reside in
> >    x:\cygwin\bin, where 'x' is the drive on which you have installed
> >    the cygwin distribution.  Rebooting is also suggested if you are
> >    unable to find another cygwin DLL.
>
> Looking at the code I can't see any way that this error could ever occur
> but, then, the code that produced this error was actually wrong.
>
> I just checked in a minor rewrite around this error message.  I don't know
> if it will fix this problem or not.  I couldn't duplicate your problem
> here, FWIW.
>
> This will be in the next cygwin snapshot: http://cygwin.com/snapshots/ .
>
> cgf
>

Thanks very much for your prompt response.
I downloaded the 4/21/05 snapshot (uname reports 1.5.16s).

My test case still fails, but far less frequently. It now takes
several hours of runs (tens of thousands of iterations) and then
fails with a 'user shared memory size' mismatch instead of a 'system
shared memory version' mismatch:

c:\cygwin\bin\which.exe (3712):
  *** user shared memory size mismatch detected 0x0/0xA65C.

Looking at shared.cc:user_shared_initialize(), it looks like there may
be a window of vulnerability, because it doesn't appear to have the
same InterlockedExchange protecting its 'version' member that the
SH_CYGWIN_SHARE location does:

  /* Initialize the Cygwin per-user shared, if necessary */
  if (!user_shared->version)
    {
      user_shared->version = USER_VERSION_MAGIC;
      debug_printf ("initializing user shared");
      user_shared->cb =  sizeof (*user_shared);
      ...

In particular, if a second process is unlucky enough to open
user_shared right after 'version' is set but before 'cb' is, couldn't
it read 'cb' as 0 and fail with the error I saw above?
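
To make the ordering concern concrete, here is a minimal sketch of one
way such a window could be closed: claim the region atomically, fill in
every field, and only then publish the final 'version' value. This is
not Cygwin's code. The struct, the kInitializing and kMagic values, and
attach() are all hypothetical, and std::atomic stands in for the Win32
Interlocked calls so that the sketch stays portable.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the per-user shared region (not Cygwin's layout).
struct user_shared_sketch {
  std::atomic<uint32_t> version{0};  // 0 = untouched, then kInitializing, then kMagic
  uint32_t cb = 0;                   // size field that later openers validate
};

constexpr uint32_t kInitializing = 1;   // hypothetical "init in progress" marker
constexpr uint32_t kMagic = 0xC0FFEE;   // hypothetical "fully initialized" value

// Returns the cb value the caller observes once the region is usable.
uint32_t attach(user_shared_sketch &s) {
  uint32_t expected = 0;
  if (s.version.compare_exchange_strong(expected, kInitializing,
                                        std::memory_order_acq_rel)) {
    // We won the claim: fill in every field before publishing.
    s.cb = static_cast<uint32_t>(sizeof(s));
    // Publish last, so a reader that sees kMagic also sees cb.
    s.version.store(kMagic, std::memory_order_release);
  } else {
    // Someone else is (or was) initializing: wait for the published value.
    while (s.version.load(std::memory_order_acquire) != kMagic)
      std::this_thread::yield();
  }
  return s.cb;
}
```

With this ordering, an opener never sees a nonzero "ready" version
together with a zero 'cb'. It either wins the claim and initializes, or
it waits until the initializer has published.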

I'm also wondering if some of the other shared memory locations
without locks are potentially open to startup races:

SH_SHARED_CONSOLE -
  In fhandler_console::get_tty_stuff(), without an
  InterlockedExchange, could two processes each race past the check on
  the 'tty_min_state.ntty' member and then step on each other trying
  to setsid()?

SH_MYSELF -
  In pinfo::init(), is it possible that 'process_state' and 'pid'
  could get corrupted without an interlock?
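
The check-then-act pattern I am worried about in these two cases can be
sketched as a one-shot gate: instead of testing a field and then
writing it, swap in a marker and inspect the previous value, which is
roughly what InterlockedExchange offers by returning the old contents
of its target. The names init_gate and setup_once are hypothetical and
only illustrate the idea; they are not taken from the Cygwin sources.

```cpp
#include <atomic>

// Hypothetical one-shot gate guarding a piece of shared setup.
struct init_gate {
  std::atomic<int> taken{0};

  // Atomically store 1 and report whether the previous value was 0.
  // Only one caller can ever observe the 0 -> 1 transition.
  bool try_claim() {
    return taken.exchange(1, std::memory_order_acq_rel) == 0;
  }
};

// Runs the setup step (modelled here as a counter increment) only for
// the caller that wins the gate. Returns true if this caller did the setup.
bool setup_once(init_gate &gate, int &setup_runs) {
  if (!gate.try_claim())
    return false;  // somebody else already claimed the setup
  ++setup_runs;    // stands in for the real one-time initialization
  return true;
}
```

Note that this only decides who performs the setup. Callers that lose
the gate would still need some way to wait until the winner has
finished before using the shared state.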

Thanks again for your help.
-Usman





