[ECOS] FreeBSD Netstack EPIPE error

Bell, Andrew [Allen & Heath UK] Andrew.Bell@allen-heath.com
Tue Oct 10 15:05:00 GMT 2006


Hi Gary,

Thanks for the reply. 

One end of the connection is eCos; the other is Java running on a Linux
JVM.

The connection is dropped by the Java side. After repeated packets are
not acknowledged by the eCos host, and the eCos netstack then transmits
an out-of-order segment, the Java protocol stack appears to wind back to
the last packet with a sequence number the two hosts agreed on and
resends it. After a couple more attempts the Java netstack gives up and
drops the connection.

Looking at the packet capture from eCos with cyg_io_eth_net_debug set,
there is a complete lack of xmit activity; I see the retransmits from
the Java host, but the eCos netstack fails to ack them. It is almost as
if the protocol stack has stalled.

From what I understand, the protocol stack is interrupt driven. Here's
the final twist: my main worker thread in eCos is very busy (writing to
flash) for a period of around 16 seconds, and that is when I observe the
netstack xmit starvation.

I've made sure my main thread's priority is lower (higher in integer
terms) than the internal netstack delivery threads' priority.

Is there any way a user thread can cause netstack starvation? BTW I'm
not locking out interrupts during this time.
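
One experiment that could help separate the two cases is to break the long flash write into small pieces and give other threads a chance to run between them. The sketch below is only an illustration under stated assumptions: write_chunk_fn, yield_fn, write_in_chunks and the counting hooks are hypothetical names, not eCos APIs. On eCos the yield hook could wrap a call such as cyg_thread_delay(1), and the write hook would wrap whatever flash routine the application already uses. It also assumes that the flash routine might itself mask interrupts while it programs or erases, which nothing in the capture confirms.

```c
#include <stddef.h>

/* Hypothetical hooks supplied by the application (not part of eCos). */
typedef int  (*write_chunk_fn)(size_t offset, const unsigned char *data,
                               size_t len, void *ctx);
typedef void (*yield_fn)(void *ctx);

/* Write `total` bytes in pieces of at most `chunk` bytes, calling `yield`
 * after every piece so other threads (and pending interrupts) get a turn.
 * Returns 0 on success, -1 for a zero chunk size, or the first non-zero
 * error code returned by `write_chunk`. */
static int write_in_chunks(const unsigned char *data, size_t total,
                           size_t chunk, write_chunk_fn write_chunk,
                           yield_fn yield, void *ctx)
{
    size_t offset = 0;

    if (chunk == 0)
        return -1;

    while (offset < total) {
        size_t len = total - offset;
        int rc;

        if (len > chunk)
            len = chunk;

        rc = write_chunk(offset, data + offset, len, ctx);
        if (rc != 0)
            return rc;

        offset += len;
        if (yield != NULL)
            yield(ctx);
    }
    return 0;
}

/* Stand-in hooks that only count calls, so the loop can be exercised
 * on a host machine without any flash hardware. */
struct chunk_stats { size_t writes; size_t yields; size_t bytes; };

static int counting_write(size_t offset, const unsigned char *data,
                          size_t len, void *ctx)
{
    struct chunk_stats *s = (struct chunk_stats *)ctx;
    (void)offset;
    (void)data;
    s->writes++;
    s->bytes += len;
    return 0;
}

static void counting_yield(void *ctx)
{
    ((struct chunk_stats *)ctx)->yields++;
}
```

If the acks come back once the write is chunked this way, that would point at the flash path rather than at thread priorities; if the stall persists, the chunking at least rules out one explanation.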

Sorry for the length of the post, and again TIA

Andrew.


Bell, Andrew [Allen & Heath UK] wrote:
> Hello All,
> 
> I'm having FreeBSD netstack issues with an eCos port for a Motorola
> 852T board based on an A&M Adder.
> 
> Our eCos application keeps dropping socket connections with an EPIPE
> (broken pipe) after a period of high tx activity. The ethereal capture
> of the stream shows that, shortly after the burst of tx activity, the
> eCos netstack stops sending acks to the front end, ignores retransmits
> from the front end, then eventually emits an out-of-order segment for
> which ethereal calculates an RTT of 1158229289 seconds!
> 
> I've run the bsd tests, enabled stack checking and enabled assertions.
> I've turned on MBUF warnings, enabled cyg_io_eth_net_debug and
> increased CYGPKG_NET_USAGE to (1008 * 1024) + (MAXSOCK * 1024), none
> of which has turned up any clues.
> 
> If anyone can point me in the right direction I'd be grateful.
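
The RTT figure quoted above is suspiciously close to a wall-clock time. Assuming it is read as seconds since the Unix epoch (an assumption; nothing in the capture confirms it), it decodes to a date in September 2006. That would suggest the figure is an artifact of how the RTT was computed rather than a real delay. A minimal C sketch of the check, where format_epoch_utc is a hypothetical helper name:

```c
#include <stddef.h>
#include <time.h>

/* Format `secs` (seconds since the Unix epoch) as "YYYY-MM-DD HH:MM:SS"
 * in UTC. Returns the number of characters written, or 0 on failure.
 * Uses gmtime(), so it is not safe to call from several threads at once. */
static size_t format_epoch_utc(long long secs, char *out, size_t out_len)
{
    time_t t = (time_t)secs;
    struct tm *utc = gmtime(&t);

    if (utc == NULL)
        return 0;

    return strftime(out, out_len, "%Y-%m-%d %H:%M:%S", utc);
}
```

Calling format_epoch_utc(1158229289LL, buf, sizeof buf) with a 32-byte buffer yields 2006-09-14 10:21:29.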

AFAIK, EPIPE is only returned if the receiving end of a TCP connection
breaks off and the Tx end is still trying to send.

Are both "ends" of your connections eCos applications?  On the same
or different machines?

Is this failure something that can be tested/demonstrated separately?
In other words, can you send a test case that duplicates the problem?

Finally, do you have any idea if it's hardware/platform specific?

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------


