This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Multithreaded accept/connect locks


On Jun 27 19:08, Corinna Vinschen wrote:
> Yes, that's the same problem.  From stracing the 
> problem it appears that the underlying problem is
> in WinSock.  connect returns with an error 10036,
> WSAEINPROGRESS, because the other thread is at 
> that time executing a blocking socket call 
> (accept).  The problem is aggravated by the fact
> that the connect call in Cygwin is always using 
> non-blocking mode and WSAEINPROGRESS is also 
> referring to a connect which is still in progress
> and needs further handling.

With the 's' option, my previous program blocked on select() instead of accept(), so connect() must also collide with select().

> In CVS, the Cygwin socket code has been reworked a 
> lot.  One important change is that besides 
> recv/send/connect also accept is now implemented
> in non-blocking mode under the hood.  This has the 
> effect that the connect call shouldn't be able to 
> fail due to a blocking accept in the same process.
> If you try your application against a recent 
> snapshot from http://cygwin.com/snapshots/, your 
> test application will run as expected.

I'd like to keep this in a standard Cygwin distribution if possible.

> However, if you want the application to work with
> 1.5.24 as well, you should consider to implement 
> accept in non-blocking mode manually in your 
> application.

I updated my test program to do non-blocking accepts, setting non-blocking mode in two ways: ioctl() and fcntl().  My results are attached in TESTS.  I've also attached my new test program, as well as all relevant strace output.
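
For reference, here is a minimal sketch of the two ways to put a listening socket into non-blocking mode before calling accept().  This is not the attached check_cs3.c; the helper names (set_nonblocking_fcntl, set_nonblocking_ioctl, try_accept) and the -2 return convention are made up for illustration, and it assumes a POSIX-style sockets API as provided by Cygwin.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Method 1: fcntl().  Read the current file status flags and add
 * O_NONBLOCK so that other flags are preserved. */
int set_nonblocking_fcntl(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Method 2: ioctl() with FIONBIO.  A non-zero argument turns
 * non-blocking mode on. */
int set_nonblocking_ioctl(int fd)
{
    int on = 1;
    return ioctl(fd, FIONBIO, &on);
}

/* Hypothetical wrapper around a non-blocking accept().
 * Returns the new descriptor on success, -2 if no connection is
 * pending yet (caller should retry later), or -1 on a real error
 * with errno set by accept(). */
int try_accept(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);
    if (fd >= 0)
        return fd;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -2;
    return -1;
}
```

Either helper should leave the socket in the same state; the point of trying both is only to see whether Cygwin treats them differently.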

Summary of the results (under strace):
 ccs3   failed on every run.
 ccs3 i failed on alternate runs.  Very strange.

When I ran
 ccs3 i
by itself (not under strace), it failed every time.

Looking at the strace output:
 1) The socket non-blocking set completes successfully every time.
 2) The call to connect() still returns Winsock error 10036 (WSAEINPROGRESS), but I suppose we expect it to, since accept() hasn't happened yet.
 3) The difference between successful and failed runs seems to be whether some
 fhandler_socket...
 wndproc...
activity happens before (bad) or after (good) some
 select_stuff::...
 socket_cleanup...
activity.  That said, there was no wndproc activity after the fhandler_socket call near where
 ccs2 s
locked up...

Tomorrow I will try polling nonblocking connect()s and select()s (yuck :p), and if that fails, try out some newer snapshots.

Thanks for your help.

Trevor

Attachments:
 check_cs3.c
 tests
 strace_ccs2_s.out
 strace_ccs3.out
 strace_ccs3_i.out
 strace_ccs3_i2.out
 strace_ccs3_i3.out
 strace_ccs3_i4.out
 strace_ccs3_i5.out
 strace_ccs3_6.out
 strace_ccs3_7.out
