[ECOS] accept() FreeBSD hangs when out of resources
Mon Jun 11 21:52:00 GMT 2007
accept() never returns and never times out (waited more than 12 hours)
when the listen socket signals a new connection, if the system is out
of sockets/file descriptors and all TCP connections are in the
ESTABLISHED state.
This affects the athttpd server in particular and probably other TCP
servers. Making the socket non-blocking with FIONBIO doesn't help.
accept() does return once TCP connections in the TIME_WAIT state time
out and free up sockets.
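
For reference, a minimal sketch of the FIONBIO attempt mentioned above,
using plain BSD socket calls. The helper names set_nonblocking() and
try_accept() are made up for illustration and are not from athttpd. On a
healthy stack try_accept() comes back immediately whether or not a
connection is pending; the report here is that under socket exhaustion
the accept() call hangs regardless.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Put a socket into non-blocking mode with FIONBIO.
 * Returns 0 on success, -1 on failure (errno set by ioctl). */
int set_nonblocking(int fd)
{
    int on = 1;

    assert(fd >= 0);
    return ioctl(fd, FIONBIO, &on);
}

/* Try one accept() on a non-blocking listen socket.
 * Returns the new descriptor, -2 if nothing is pending (the normal
 * non-blocking result), or -1 on any other error. */
int try_accept(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);

    if (fd >= 0)
        return fd;
    if (errno == EWOULDBLOCK || errno == EAGAIN)
        return -2;
    return -1;
}
```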
This is easy to confirm by sending ~16 connection requests (with POSTs,
for example) to an athttpd server.
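
A rough sketch of a client that could be used for this, run from a host
on the same network. open_connections() and close_connections() are
hypothetical helpers, and the target address and port are whatever the
athttpd server under test uses. The sketch only opens the connections
and holds them; sending a POST on each one is left out, and whether idle
connections alone are enough to trigger the hang is an assumption to
check.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open up to n TCP connections to ip:port and keep them open.
 * Descriptors are stored in fds[0..count-1]; returns count,
 * or -1 if ip is not a valid dotted-quad IPv4 address. */
int open_connections(const char *ip, unsigned short port, int n, int *fds)
{
    struct sockaddr_in sa;
    int count = 0;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return -1;

    while (count < n) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            break;                  /* the client itself ran out of fds */
        if (connect(fd, (struct sockaddr *)&sa, sizeof sa) != 0) {
            close(fd);
            break;                  /* target refused or timed out */
        }
        fds[count++] = fd;          /* keep it open in ESTABLISHED */
    }
    return count;
}

/* Close everything opened by open_connections(). */
void close_connections(int *fds, int count)
{
    while (count > 0)
        close(fds[--count]);
}
```

Calling open_connections() with n = 16 and then leaving the descriptors
open should match the scenario described above; close_connections()
releases them afterwards.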
This is a big deal, because:
1) accept() blocking when out of resources locks up applications, which
then can't shut down and release resources.
2) The fix is messy. Is there a better way?
a) Attempt to count how many sockets are open (or how many remain).
b) Never call accept() when within a couple of sockets of the max. But
since we're not thread-locked, the sockets could be used up between
checking the count and calling accept().
c) Disable the listen socket in subsequent select() calls until sockets
free up. But does the listen socket get other messages that would be
missed?
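
A minimal sketch of how (a) and (c) might be combined in a
single-threaded select() loop. Everything here is hypothetical:
MAX_CLIENTS, struct conn_table and the helper names are invented for
illustration and are not part of athttpd or the eCos stack. It only
counts sockets the server itself owns, so other threads can still
exhaust the pool, which is why MAX_CLIENTS should sit a few below the
real limit. On a standard BSD stack, connections that arrive while the
listen socket is left out of select() wait in the listen backlog;
whether the eCos port behaves the same way under exhaustion is an
assumption that needs testing.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical cap, kept a little below the stack's real socket
 * maximum so accept() is never called with nothing left. */
#define MAX_CLIENTS 12

struct conn_table {
    int fds[MAX_CLIENTS];   /* client sockets owned by this server */
    int count;              /* how many entries of fds[] are in use */
};

/* Build the read set for select(). The listen socket is included only
 * while a slot is free, so select() cannot report a connection that
 * accept() would have no descriptor for.
 * Returns the highest fd in the set, or -1 if the set is empty. */
int build_read_set(const struct conn_table *t, int listen_fd, fd_set *rd)
{
    int i, maxfd = -1;

    assert(t->count >= 0 && t->count <= MAX_CLIENTS);
    FD_ZERO(rd);
    if (t->count < MAX_CLIENTS) {
        FD_SET(listen_fd, rd);
        maxfd = listen_fd;
    }
    for (i = 0; i < t->count; i++) {
        FD_SET(t->fds[i], rd);
        if (t->fds[i] > maxfd)
            maxfd = t->fds[i];
    }
    return maxfd;
}

/* Record a newly accepted socket. Returns 0, or -1 if the table is full. */
int conn_add(struct conn_table *t, int fd)
{
    if (t->count >= MAX_CLIENTS)
        return -1;
    t->fds[t->count++] = fd;
    return 0;
}

/* Forget a closed socket so the listen socket gets polled again. */
void conn_remove(struct conn_table *t, int fd)
{
    int i;

    for (i = 0; i < t->count; i++) {
        if (t->fds[i] == fd) {
            t->fds[i] = t->fds[--t->count];  /* move last entry into the gap */
            return;
        }
    }
}
```

The server loop would call build_read_set() before each select(), call
conn_add() after a successful accept(), and call conn_remove() after
closing a client socket.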
It would just be nice if accept() returned when out of resources. The
current timeout doesn't work, and is set to something like 10 x 2
minutes in certain cases anyhow.
My other FreeBSD bug posts to bugzilla seem to be ignored, so I won't
bother sending this one there.