Problems with non-blocking I/O

Tim Allen tim@proximity.com.au
Thu Mar 27 02:30:00 GMT 2003


I guess Cygwin doesn't get a lot of testing with non-blocking I/O. We're
having lots of problems. With version 1.3.14 we find it barely usable: it
works, but is problematic and unreliable. With versions 1.3.20 and 1.3.21
it's quite unusable. The specific problems, for 1.3.14, are:

1. Selecting for writing on a non-blocking TCP socket will _always_ report
the socket as writable, even when a write to that socket would block.

2. Sockets get closed for no apparent reason. This seems particularly likely
after any process has been away from its select loop for a tenth of a second
or two (either it's busy elsewhere, or the scheduler doesn't give it the
processor because other processes are busy). Symptoms are "Connection reset
by peer" errors (when the peer process appears to be perfectly happy to keep
talking) and sometimes an EBADF error (not preceded by any other error, i.e.
the socket has simply ceased to exist without warning).

For 1.3.20 and 1.3.21, we find that non-blocking reads also fail: select
always reports the socket as readable, even when a read would block, and,
much worse than the case for 1.3.14, the read does not report that it would
block but instead manufactures random data (presumably copying from some
buffer or other).

I'm working on making simple test cases for this. I have one that demonstrates 
the first problem, which I shall attach here. I'll persist with making test 
cases for the other problems (I need to strip out irrelevant stuff from the 
app) and shall post them when I can reproduce the problems easily.

The attached source files are for a pair of programs, a client and a server. 
The server accepts connections on port 8888 and copies any data it receives 
back to the same socket. The client connects to that port on INADDR_LOOPBACK. 
It takes two file names as command line args, reads the first file, sends it 
to the socket, then writes whatever comes back from the socket to the file 
given as the second arg. Doing a diff on the two files after both programs 
complete is a test that everything worked. The bigger the file, the more 
stringent the test; I've been testing with files in the tens to hundreds of 
megabytes range.

Both programs produce copious output on stdout to tell you what they are
doing. When run on Linux, the programs run very quickly, with no observed
problems at all. On Cygwin 1.3.14, on Windows 2000, you can see that the
server side in particular spins through select, reporting EWOULDBLOCK all the
time when selected for write. If you pause the client (e.g. with Ctrl-S), you
can see it even more clearly. The server should (and on Linux does) itself
pause in that situation, waiting to be able to write to the socket. On Cygwin
it instead keeps going, constantly raising select conditions and constantly
finding that it would block on the write, doing a busy-wait. A
single-processor box illustrates the problem best, as with two processors the
busy-wait doesn't look as bad.

I'll endeavour to provide more details and examples; I thought this much was 
worth contributing so far, as it does demonstrate one of the problems quite 
clearly. May I suggest it'd be worth adding a test based on this to the 
regression test suite? Or, forgive my ignorance, making a regression test 
suite if one doesn't exist, and basing one of the tests on this. In either 
case, you are welcome to use the supplied code to do so.

Tim

-- 
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/
  http://www4.tpg.com.au/users/rita_tim/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plainTCPEchoClient.C
Type: text/x-c++src
Size: 4808 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20030327/18334e76/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plainTCPEchoServer.C
Type: text/x-c++src
Size: 3726 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20030327/18334e76/attachment-0001.bin>