spinlock.h timeout causing *** fatal error - add_item abort

Corinna Vinschen corinna-cygwin@cygwin.com
Tue May 24 10:23:00 GMT 2016


On Apr 29 16:03, Kevin Nomura wrote:
> Hi,
> 
> We occasionally see an api_fatal abort during process startup like:
> 
> *** fatal error - add_item ("somepath", "/", ...) failed, errno 1
> 
> This happens if mountinfo.init(false) in user_info::initialize is
> called twice.  The error occurs on a second call when trying to add
> the root mount point when it already exists.
> 
> mountinfo.init is guarded by a "spinlock" object that should only
> allow one process to call it.  But the spinlock has a timeout.  After
> 15 seconds, it stops waiting and returns a value of 0.  The fatal
> error can occur if two processes are starting around the same time
> and the first process takes a long time in internal_getpwsid().  We've
> seen this happen in our environment due to LDAP queries taking a long
> time.  (Incidentally we are using msys, but code in spinlock.h and
> shared.cc looks the same in cygwin).
> 
> To solve the aborts it is tempting to make a local fix to remove the
> spinlock timeout.  I assume there was a rationale for it, and would
> like to understand what tradeoff is incurred if we remove it.

Actually I can't tell you.  You might just want to try it and see if
it has some weird side-effect.  The problem this *probably* was
supposed to handle is that some process hangs in the initialization
indefinitely, thus blocking out other processes.  15 secs is just some
arbitrary big value.

The original reason to use a spinlock here was based on the idea that
reading fstabs and passwd/group files is quick, thus a spinlock is the
least intrusive locking mechanism.  This is obviously not true anymore,
now that fetching the user info may require network access.

I put an item on my TODO list to revisit this code and introduce
a different, more reliable locking mechanism here.

For the time being you might want to try a spinlock with a longer
timeout (2 mins, perhaps).
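
As a rough illustration of the tradeoff, here is a minimal sketch of a
one-time-init guard with a timeout.  This is NOT the code in spinlock.h;
it uses standard C++ atomics, and every name in it (init_state,
wait_result, acquire_init, finish_init) is hypothetical.  The point is
only that a waiter which times out has to be told apart from the one
that is supposed to run the initialization, and that the timeout is a
plain parameter which could be raised from 15 secs to something larger.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical states of a shared "has init run yet?" flag.
enum init_state : long { UNINIT = 0, IN_PROGRESS = 1, DONE = 2 };

// What the caller learns from waiting on the flag.
enum class wait_result { must_init, already_done, timed_out };

// Either claim the right to initialize, or wait until the claimant is
// done, or give up once `timeout` has elapsed.
wait_result
acquire_init (std::atomic<long>& state, std::chrono::milliseconds timeout)
{
  using clock = std::chrono::steady_clock;
  const auto deadline = clock::now () + timeout;
  for (;;)
    {
      long expected = UNINIT;
      // Only one caller can move the flag from UNINIT to IN_PROGRESS.
      if (state.compare_exchange_strong (expected, IN_PROGRESS))
        return wait_result::must_init;
      // On failure, `expected` holds the value actually observed.
      if (expected == DONE)
        return wait_result::already_done;
      // Somebody else is still initializing; stop waiting at the deadline.
      if (clock::now () >= deadline)
        return wait_result::timed_out;
      std::this_thread::sleep_for (std::chrono::milliseconds (1));
    }
}

// Called by the initializer once the shared data is fully set up.
void
finish_init (std::atomic<long>& state)
{
  assert (state.load () == IN_PROGRESS);
  state.store (DONE);
}
```

In this sketch a caller would run the initialization only on must_init
and call finish_init afterwards.  On timed_out it has to decide what to
do (retry, fail, or carry on without initializing); treating timed_out
like must_init is what would lead to a second, conflicting
initialization.  A longer timeout makes that case rarer, but it also
means that a genuinely hung initializer blocks other starters for
longer.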


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat