[ECOS] FreeBSD Netstack EPIPE error

Bell, Andrew [Allen & Heath UK] Andrew.Bell@allen-heath.com
Wed Oct 11 14:46:00 GMT 2006


Hi All,

Many thanks for all the replies. 

I've already upped CYGNUM_FILEIO_NFILE.

Alas, we're not using the eCos flash drivers; we're using a nocache
write-through:

CYGARC_MEMDESC_NOCACHE( 0xfe000000, 0x01000000 ), // ROM region

with an API which is called from a single thread, so nowhere do we
disable interrupts. To remove the lengthy flash write from the equation
I've replaced it with a busy wait (while (1)) in my worker thread. It
still falls over. Hmm.

I'm assuming that, as I'm seeing full RX packet dumps during the xmit
starvation, the delivery thread is still running.

There seems to be a correlation between my user thread context being
busy shortly after a high volume of eth TX and the netstack then failing
to ack subsequent requests from the front end. If I remove the busy
wait, the netstack survives.

Is there any way I can confirm the DSR call to wake the delivery thread
gets performed during this period? Or any reason why the TX side of the
netstack would stall whilst the RX is fine?
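
One possible way to check this, as a minimal sketch: keep one counter
that the driver's DSR bumps and another that the delivery thread bumps,
then compare snapshots taken before and after the busy period. All the
names below (dbg_note_dsr, dbg_note_deliver and so on) are hypothetical,
and where the two hooks would actually be placed in the driver and the
stack is an assumption, not something taken from the eCos sources.

```c
/* Hypothetical debug counters; none of these names exist in eCos. */
typedef struct {
    unsigned long dsr;      /* times the DSR requested delivery */
    unsigned long deliver;  /* times the delivery thread actually ran */
} dbg_counts;

static volatile unsigned long dbg_dsr_count;
static volatile unsigned long dbg_deliver_count;

/* Assumed hook: call this from the ethernet driver's DSR. */
void dbg_note_dsr(void)
{
    dbg_dsr_count++;
}

/* Assumed hook: call this from the delivery thread each time it wakes. */
void dbg_note_deliver(void)
{
    dbg_deliver_count++;
}

/* Copy both counters. On target the two reads are not atomic as a pair;
   wrapping them in cyg_scheduler_lock()/cyg_scheduler_unlock() is one
   option if that matters. */
dbg_counts dbg_snapshot(void)
{
    dbg_counts c;
    c.dsr = dbg_dsr_count;
    c.deliver = dbg_deliver_count;
    return c;
}

/* Returns 1 if at least one DSR fired between the two snapshots but the
   delivery thread never ran in that window, 0 otherwise. */
int dbg_delivery_stalled(dbg_counts before, dbg_counts after)
{
    return (after.dsr != before.dsr) && (after.deliver == before.deliver);
}
```

Taking one snapshot just before the busy wait and one just after it, a
result of 1 would suggest the wake-up was requested but never serviced;
a result of 0 with an unchanged dsr count would suggest the DSR itself
never ran.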

I'm currently writing a test harness to attempt a replication of the
issue which is hardware independent. At present it works perfectly.
Grrr.

Anyone with any clues, you'd be saving my sanity.

Cheers

Andrew.



> From what I understand the protocol stack is interrupt driven. Here's
> the final twist. My main worker thread in eCos is very busy during the
> time I observe the netstack xmit starvation (writing to flash) for a
> period of around 16 seconds.
> 
> I've made sure my main thread priority is lower (higher in integer
> terms) than the internal netstack delivery thread's priority.
> 
> Is there any way a user thread can cause netstack starvation? BTW I'm
> not locking out interrupts during this time.

Does the device ping afterwards? 

It could be the device driver is locking up because of a bug in
handling full receive buffers. Transferring packets from the device
driver into the stack happens in a thread. If your threads are not
running for 16 seconds, it could be the ethernet device has been
signalling received packets but the stack has not had a chance to get
them from the device. The device then overflows its receiver
buffer. Once threads start running again maybe the correct action is
not being taken to recover from running out of receive buffers. Some
ethernet devices need telling to start again.
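
As a rough illustration of that last point only: the recovery step
usually amounts to noticing an overrun flag, acknowledging it and
re-enabling the receiver. The register image, bit names and function
below are all hypothetical; the real layout and restart sequence depend
entirely on the ethernet device in question.

```c
#include <stdint.h>

/* Hypothetical bit definitions; real names and values are device specific. */
#define HYP_STAT_RX_OVERRUN  0x01u  /* receiver ran out of buffers */
#define HYP_CTRL_RX_ENABLE   0x01u  /* receiver is running */

/* Hypothetical stand-in for the device's registers plus a debug counter. */
typedef struct {
    uint32_t status;
    uint32_t control;
    unsigned long overruns;  /* how many times recovery was needed */
} hyp_eth_dev;

/* Intended to run once the stack has drained the pending packets.
   Returns 1 if the receiver had to be restarted, 0 if nothing was wrong. */
int hyp_eth_recover_rx(hyp_eth_dev *dev)
{
    if ((dev->status & HYP_STAT_RX_OVERRUN) == 0)
        return 0;

    dev->status &= ~HYP_STAT_RX_OVERRUN;  /* acknowledge the overrun */
    dev->control |= HYP_CTRL_RX_ENABLE;   /* tell the receiver to start again */
    dev->overruns++;
    return 1;
}
```

Printing the overruns count after the 16 second period would show
whether the device hit this condition at all.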

         Andrew



--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
