[ECOS] accept() FreeBSD hangs when out of resources

Tad ecos_removethispart@ds3switch.com
Mon Jun 11 21:52:00 GMT 2007


accept() won't return and won't time out (>12 hrs) when the listening 
socket indicates a new connection, if the system is out of 
sockets/file descriptors and all TCP connections are in ESTABLISHED state.

This affects the athttpd server in particular and probably other TCP 
network apps.

Setting the socket non-blocking with FIONBIO doesn't help.

accept() does return once TCP connections in TIME_WAIT state time out and 
free up resources.

This is easy to confirm by sending ~16 connection requests (with POSTs, 
for example) to an athttpd server.
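
A rough host-side sketch of such a test follows. It only opens the TCP 
connections and holds them open; it does not send a POST. The function name 
open_connections() and its parameters are assumptions for illustration, and 
it is written against the ordinary POSIX socket API on the test machine, not 
against eCos.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open up to n TCP connections to ip:port and leave them open.
 * Descriptors are stored in fds[]; returns the number connected,
 * or -1 if ip is not a valid dotted-quad IPv4 address. */
int open_connections(const char *ip, unsigned short port, int n, int *fds)
{
    struct sockaddr_in addr;
    int count = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    for (int i = 0; i < n; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            break;
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            close(fd);      /* stop at the first refused/failed connect */
            break;
        }
        fds[count++] = fd;  /* keep the connection ESTABLISHED */
    }
    return count;
}
```

Calling it with the target's address, port 80 and n = 16 should be in the 
right range to exhaust the sockets; the exact number depends on how the 
target is configured.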

This is a big deal, because:
1) accept() blocking when out of resources locks up applications, which 
then can't shut down and release resources.

2) The fix is messy.  Is there a better way?
a) attempt to count how many sockets are open (or how many remain)
b) never call accept() when within a couple of sockets of the max...but 
since we're not thread-locked, the sockets could be used up between 
checking the count and calling accept().
c) leave the listen socket out of subsequent select() calls until sockets 
free up.  But does the listen socket get other messages that would be 
missed?
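
Options (a) and (c) combined might look something like the sketch below. 
MAX_CLIENTS and build_read_set() are invented names, and the limit of 14 is 
only a placeholder for "a couple below the real maximum". It assumes a single 
thread owns the listen socket and keeps its own list of client sockets, so it 
does not close the race described in (b) if other threads open sockets too.

```c
#include <sys/select.h>

/* Hypothetical cap on accepted clients, chosen to stay a couple of
 * sockets below whatever the real system limit is. */
#define MAX_CLIENTS 14

/* Build the read set for select().  The listen socket is included only
 * while there is room for another client, so accept() is never reached
 * when the application's own count says the sockets are used up.
 * Returns the highest descriptor in the set, or -1 if the set is empty. */
int build_read_set(fd_set *set, int listen_fd,
                   const int *client_fds, int nclients)
{
    int maxfd = -1;

    FD_ZERO(set);
    if (nclients < MAX_CLIENTS) {
        FD_SET(listen_fd, set);
        maxfd = listen_fd;
    }
    for (int i = 0; i < nclients; i++) {
        FD_SET(client_fds[i], set);
        if (client_fds[i] > maxfd)
            maxfd = client_fds[i];
    }
    return maxfd;
}
```

As far as I know, on a standard BSD stack a listening socket only becomes 
readable for pending connections, and those wait in the listen backlog while 
the socket is left out of the set, so nothing else should be missed. I have 
not verified that on the eCos FreeBSD stack.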

It would just be nice if accept() returned when out of resources.  The 
current timeout doesn't work and is set to something like 10 * 2 minutes 
in certain cases anyhow.

My other FreeBSD bug posts to bugzilla seem to be ignored, so I won't 
bother sending this one there.

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss


