[ECOS] FreeBSD Netstack EPIPE error

Gary Thomas gary@mlbassoc.com
Tue Oct 10 15:15:00 GMT 2006


Bell, Andrew [Allen & Heath UK] wrote:
> Hi Gary,
> 
> Thanks for the reply. 
> 
> One end of the connection is eCos; the other is Java running on a Linux
> JVM.
> 
> The connection is dropped by the Java side. After repeated packets are
> not acknowledged by the eCos host, and the eCos netstack then transmits
> an out-of-order segment, the Java protocol stack appears to wind back to
> the last packet with a sequence number the two hosts agreed on and
> resends it. After a couple more attempts the Java netstack gives up and
> drops the connection.
> 
> Looking at the packet capture from eCos with cyg_io_eth_net_debug set,
> there is a complete lack of xmit activity; I see the retransmits from
> the Java host, but the eCos netstack fails to ack them. It is almost as
> if the protocol stack has stalled.
> 
> From what I understand the protocol stack is interrupt driven. Here's
> the final twist. My main worker thread in eCos is very busy during the
> time I observe the netstack xmit starvation (writing to flash) for a
> period of around 16 seconds.
> 
> I've made sure my main thread's priority is lower (higher in integer
> terms) than the priority of the internal netstack delivery threads.
> 
> Is there any way a user thread can cause netstack starvation? BTW I'm
> not locking out interrupts during this time.
> 

Wow!  16 seconds writing to FLASH.  This is quite possibly your problem.
The V1 FLASH drivers will lock interrupts during write & erase operations
(this happens in the drivers, regardless of what your code may do).

Is there any way to do shorter FLASH operations?
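
As a rough illustration of what "shorter" could mean: split one long write
into small pieces and give the rest of the system a chance to run between
them. The sketch below is plain C that builds on a host. The callback types,
chunked_flash_write() and the fake_* stand-ins are hypothetical names made up
for this example, not eCos APIs. On a real target, write_fn would presumably
wrap the driver's program call and pause_fn something like a short
cyg_thread_delay(); that wiring is an assumption and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback types (not eCos APIs): write_fn programs one piece
 * of flash and returns 0 on success; pause_fn lets other work run. */
typedef int (*flash_write_fn)(unsigned char *dst, const unsigned char *src,
                              size_t len);
typedef void (*pause_fn)(void);

/* Write len bytes in pieces of at most chunk_len bytes, pausing between
 * pieces so that interrupts are never held off for longer than one piece.
 * Returns 0 on success, or the first non-zero error from write_fn. */
int chunked_flash_write(unsigned char *dst, const unsigned char *src,
                        size_t len, size_t chunk_len,
                        flash_write_fn write_fn, pause_fn pause)
{
    size_t done = 0;

    assert(chunk_len > 0);
    while (done < len) {
        size_t n = len - done;
        int err;

        if (n > chunk_len)
            n = chunk_len;
        err = write_fn(dst + done, src + done, n);
        if (err != 0)
            return err;
        done += n;
        if (done < len)
            pause();            /* no pause needed after the last piece */
    }
    return 0;
}

/* Host-side stand-ins so the sketch can be compiled and run off-target. */
static unsigned char fake_flash[1024];
static int write_calls, pause_calls;

static int fake_write(unsigned char *dst, const unsigned char *src, size_t len)
{
    memcpy(dst, src, len);      /* a real driver would program flash here */
    write_calls++;
    return 0;
}

static void fake_pause(void)
{
    pause_calls++;              /* a real target would yield or delay here */
}
```

Note that this only bounds the time spent in each write call. An erase
usually cannot be made shorter than one erase block, so the worst-case
interrupt lockout may still be set by the erase time of the part.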

> Bell, Andrew [Allen & Heath UK] wrote:
>> Hello All,
>>
>> I'm having FreeBSD netstack issues with an eCos port for a Motorola
>> 852T board based on an A&M Adder.
>>
>> Our eCos application keeps dropping socket connections with an EPIPE
>> (broken pipe) after a period of high tx activity. The ethereal capture
>> of the stream shows that, shortly after the burst of tx activity, the
>> eCos netstack stops sending acks to the front end, ignores retransmits
>> from the front end, then eventually emits an out-of-order segment for
>> which ethereal calculates an RTT of 1158229289 seconds!
>>
>> I've run the bsd tests, enabled stack checking and enabled assertions.
>> I've turned on MBUF warnings and enabled cyg_io_eth_net_debug and
>> increased CYGPKG_NET_USAGE to (1008 *1024) + (MAXSOCK * 1024), all of
>> which show no clues.
>>
>> If anyone can point me in the right direction I'd be grateful.
> 
> AFAIK, EPIPE is only returned if the receiving end of a TCP connection
> breaks off and the Tx end is still trying to send.
> 
> Are both "ends" of your connections eCos applications?  On the same
> or different machines?
> 
> Is this failure something that can be tested/demonstrated separately?
> In other words, can you send a test case that duplicates the problem?
> 
> Finally, do you have any idea if it's hardware/platform specific?
> 


-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss


