[ECOS] Re: accept() FreeBSD hangs when out of resources

Lars Povlsen lpovlsen@vitesse.com
Tue Jun 12 15:49:00 GMT 2007


This looks a lot like the problem I saw and reported on 17/4-07.
I have occasionally been able to reproduce it manually with a browser
(MSIE), but enabling TCP debug logging makes the problem go away.

AFAICS, it is a race condition in the TCP stack that causes socket
buffers to be leaked permanently. Calling cyg_kmem_print_stats() shows
the problem (but you need a reset to recover :-( ):

Network stack mbuf stats:
   mbufs 97, clusters 60, free clusters 1
   Failed to get 0 times
   Waited to get 0 times
   Drained queues to get 0 times
VM zone 'ripcb':
  Total: 64, Free: 64, Allocs: 0, Frees: 0, Fails: 0
VM zone 'tcpcb':
  Total: 64, Free: 61, Allocs: 353, Frees: 350, Fails: 0
VM zone 'udpcb':
  Total: 64, Free: 63, Allocs: 4, Frees: 3, Fails: 0
VM zone 'socket':
  Total: 64, *Free: 0*, Allocs: 365, Frees: 293, Fails: 8
Misc mpool: total   98304, free    4192, max free block 3748
Mbufs pool: total   81792, free   69248, blocksize  128
Clust pool: total  163840, free   38912, blocksize 2048

FWIW, I have not had time to dig into this (as my attempts to produce a
test bench have failed...)

---Lars

-----Original Message-----
From: ecos-discuss-owner@ecos.sourceware.org
[mailto:ecos-discuss-owner@ecos.sourceware.org] On Behalf Of Tad
Sent: 12. juni 2007 14:05
To: eCos Discuss
Subject: Re: [ECOS] Re: accept() FreeBSD hangs when out of resources



Andrew Lunn wrote:
> On Mon, Jun 11, 2007 at 04:05:57PM -0800, Tad wrote:
>   
>> Andrew Lunn wrote:
>>     
>>> On Mon, Jun 11, 2007 at 03:42:07PM -0800, Tad wrote:
>>>  
>>>       
>>>>>> accept() won't return and won't timeout (>12hrs) when listen() 
>>>>>> indicates a new connection, if out of sockets/file-descriptors
>>>>>> and all TCP connections are in ESTABLISHED state.
>>>>>>        
>>>>>>             
>>>>> Where exactly is it blocked. Please could you provide a call
>>>>> stack.
>>>>>           
More info:
The hang seems to depend on CYGNUM_FILEIO_NFILE rather than 
CYGPKG_NET_MAXSOCKETS.  Reducing NFILE below MAXSOCKETS causes accept() 
to hang with fewer established connections than before the reduction.


-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss




