Problems with non-blocking I/O
Tim Allen
tim@proximity.com.au
Thu Mar 27 02:30:00 GMT 2003
I guess Cygwin doesn't get a lot of testing with non-blocking I/O; we're
having lots of problems with it. With version 1.3.14 we find it barely
usable: problematic and unreliable. With versions 1.3.20 and 1.3.21 it's
quite unusable. The specific problems for 1.3.14 are:
1. Selecting for writing on a non-blocking TCP socket will _always_ report
the socket as writable, even when a write to that socket would block.
2. Sockets get closed for no apparent reason. This seems particularly likely
after any process has been away from its select loop for a tenth of a second
or two (either it's busy elsewhere, or the scheduler doesn't give it the
processor because other processes are busy). Symptoms are "Connection reset
by peer" errors (when the peer process appears to be perfectly happy to keep
talking) and sometimes an EBADF error (not preceded by any other error, i.e.
the socket has simply ceased to exist without warning).
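For anyone who wants to check problem 1 in isolation, here is a minimal
sketch. It is not the attached test case. It assumes a Unix-domain
socketpair() is an acceptable stand-in for a TCP connection, and the function
names are hypothetical. It fills a non-blocking socket until write() reports
EWOULDBLOCK, then asks select() whether that same socket is writable.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Put a descriptor into non-blocking mode; returns false on failure.
static bool setNonBlocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Ask select(), with a zero timeout, whether fd is ready for writing.
static bool selectSaysWritable(int fd)
{
    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    timeval tv = {0, 0};
    return select(fd + 1, nullptr, &wfds, nullptr, &tv) == 1
        && FD_ISSET(fd, &wfds);
}

// Hypothetical check: fill a non-blocking socket until a write would block,
// then see whether select() still claims the socket is writable.
// Returns 1 if select() disagrees with write() (the behaviour in problem 1),
// 0 if they agree, and -1 if the check itself could not be carried out.
int selectWritableAfterWouldBlock()
{
    int sv[2];
    // Assumption: an AF_UNIX pair stands in for the TCP socket of the report.
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    int result = -1;
    if (setNonBlocking(sv[0])) {
        char chunk[4096] = {0};
        ssize_t n;
        // Nobody reads from sv[1], so the send buffer fills up eventually.
        while ((n = write(sv[0], chunk, sizeof chunk)) > 0)
            ;
        if (n == -1 && (errno == EWOULDBLOCK || errno == EAGAIN))
            result = selectSaysWritable(sv[0]) ? 1 : 0;
    }
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Calling selectWritableAfterWouldBlock() from main() and printing the result
should be enough to compare platforms. A result of 0 is what I would expect
wherever select() and write() agree with each other.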
For 1.3.20 and 1.3.21, we find that non-blocking reads also fail: select
always reports the socket as readable, even when a read would block, and,
much worse than the case for 1.3.14, the read does not block but instead
manufactures random data (presumably copied from some buffer or other).
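The read-side expectation can be sketched the same way. This is only an
illustration under stated assumptions: a Unix-domain socketpair() replaces the
TCP connection, and the function names and the "ping" payload are made up for
the example. An empty non-blocking socket should not select for read, and a
read on it should fail with EWOULDBLOCK. Once the peer writes, the socket
should select for read and return exactly what was written.

```cpp
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Put a descriptor into non-blocking mode; returns false on failure.
static bool setNonBlocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Ask select(), with a zero timeout, whether fd is ready for reading.
static bool selectSaysReadable(int fd)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    timeval tv = {0, 0};
    return select(fd + 1, &rfds, nullptr, nullptr, &tv) == 1
        && FD_ISSET(fd, &rfds);
}

// Hypothetical check of non-blocking read behaviour.
// Returns 0 if every step behaves as expected, -1 if setup failed, or the
// number of the first step that surprised us:
//   1: select() reports an empty socket as readable
//   2: read() on an empty socket did not fail with EWOULDBLOCK/EAGAIN
//   3: after the peer wrote data, select() does not report readable
//   4: read() did not return exactly the bytes the peer wrote
int checkNonBlockingRead()
{
    int sv[2];
    // Assumption: an AF_UNIX pair stands in for the TCP socket of the report.
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    int result = 0;
    char buf[16];
    if (!setNonBlocking(sv[0])) {
        result = -1;
    } else if (selectSaysReadable(sv[0])) {
        result = 1;
    } else if (read(sv[0], buf, sizeof buf) != -1
               || (errno != EWOULDBLOCK && errno != EAGAIN)) {
        result = 2;
    } else if (write(sv[1], "ping", 4) != 4 || !selectSaysReadable(sv[0])) {
        result = 3;
    } else if (read(sv[0], buf, sizeof buf) != 4
               || std::memcmp(buf, "ping", 4) != 0) {
        result = 4;
    }
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

The step numbers are there so that a failing platform tells you which
expectation broke first. From the description above I would guess 1.3.20 and
1.3.21 trip on step 1 or 2, but I have not run this sketch there.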
I'm working on making simple test cases for this. I have one that demonstrates
the first problem, which I shall attach here. I'll persist with making test
cases for the other problems (I need to strip out irrelevant stuff from the
app) and shall post them when I can reproduce the problems easily.
The attached source files are for a pair of programs, a client and a server.
The server accepts connections on port 8888 and copies any data it receives
back to the same socket. The client connects to that port on INADDR_LOOPBACK.
It takes two file names as command line args, reads the first file, sends it
to the socket, then writes whatever comes back from the socket to the file
given as the second arg. Doing a diff on the two files after both programs
complete is a test that everything worked. The bigger the file, the more
stringent the test; I've been testing with files in the tens to hundreds of
megabytes range.
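For readers without the attachments to hand, the core of a select-driven echo
loop of the kind described might look roughly like the sketch below. It is not
the attached server. The names echoStep and kMaxPending, the buffer limit and
the millisecond timeout are all assumptions made for illustration. The point
is that the socket is only selected for write while there is data waiting to
go back, so a correct select should let the process sleep when the peer stops
reading.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical cap on data held for echoing; stops reading when reached.
static const std::size_t kMaxPending = 64 * 1024;

// True for errno values that just mean "try again after the next select()".
static bool isTransient(int err)
{
    return err == EWOULDBLOCK || err == EAGAIN || err == EINTR;
}

// One iteration of a select()-driven echo loop for a single connection.
// 'pending' holds bytes received but not yet written back.
// Returns false once the peer has closed or a non-transient error occurs.
// Note: a real server would also ignore SIGPIPE; that is omitted here.
bool echoStep(int fd, std::string &pending, int timeoutMs)
{
    fd_set rfds, wfds;
    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    if (pending.size() < kMaxPending)
        FD_SET(fd, &rfds);      // room left: interested in incoming data
    if (!pending.empty())
        FD_SET(fd, &wfds);      // only ask for write when there is data

    timeval tv;
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;

    int ready = select(fd + 1, &rfds, &wfds, nullptr, &tv);
    if (ready < 0)
        return errno == EINTR;  // interrupted: caller simply retries
    if (ready == 0)
        return true;            // timeout: nothing to do this round

    if (FD_ISSET(fd, &wfds)) {
        ssize_t w = send(fd, pending.data(), pending.size(), 0);
        if (w > 0)
            pending.erase(0, static_cast<std::size_t>(w));
        else if (w < 0 && !isTransient(errno))
            return false;
    }
    if (FD_ISSET(fd, &rfds)) {
        char buf[4096];
        ssize_t r = recv(fd, buf, sizeof buf, 0);
        if (r == 0)
            return false;       // orderly shutdown by the peer
        if (r > 0)
            pending.append(buf, static_cast<std::size_t>(r));
        else if (!isTransient(errno))
            return false;
    }
    return true;
}
```

A server would call echoStep() in a loop on each accepted, non-blocking
socket until it returns false. With the behaviour reported for 1.3.14, the
send() branch would be entered on every pass and fail with EWOULDBLOCK, which
is the busy-wait.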
Both programs produce copious output on stdout to tell you what they are
doing. When run on Linux, the programs run very quickly, with no observed
problems at all. On Cygwin 1.3.14, on Windows 2000, you can see that the
server side in particular spins through select, reporting EWOULDBLOCK all the
time when selected for write. If you pause the client (e.g. with ctrl-S), you
can see it even more clearly. The server should (and on Linux does) itself
pause in that situation, waiting to be able to write to the socket. On Cygwin
it instead keeps going, constantly raising select conditions and constantly
finding that the write would block, which amounts to a busy-wait. A
single-processor box illustrates the problem best, as with two processors the
busy-wait doesn't look as bad.
I'll endeavour to provide more details and examples; I thought this much was
worth contributing so far, as it does demonstrate one of the problems quite
clearly. May I suggest it'd be worth adding a test based on this to the
regression test suite or, forgive my ignorance, creating a regression test
suite if one doesn't exist and basing one of its tests on this? In either
case, you are welcome to use the supplied code to do so.
Tim
--
-----------------------------------------------
Tim Allen tim@proximity.com.au
Proximity Pty Ltd http://www.proximity.com.au/
http://www4.tpg.com.au/users/rita_tim/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plainTCPEchoClient.C
Type: text/x-c++src
Size: 4808 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20030327/18334e76/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plainTCPEchoServer.C
Type: text/x-c++src
Size: 3726 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20030327/18334e76/attachment-0001.bin>