[ECOS] BSD socket stall

Lambrecht Jürgen J.Lambrecht@TELEVIC.com
Thu Sep 13 21:02:00 GMT 2012


Hello Bernd,

Nice to read that you worked on the same problems as I did a while ago.

On 09/13/2012 03:14 PM, Bernd Edlinger wrote:
> Hello Henry,
>
>> I am working with an Atmel-based ARM board, and we have the BSD stack
>> configured. The board seems to have an issue with sending data on a UDP
>> socket. We are sending a lot of data through the socket and we have rare
>> instances where the socket seems to “stall”.
>> Attaching gdb to the target shows the sending thread paused in “sosend”
>> at the “sblock”. I can see why the socket could stall when it hits a
>> high-water mark.
>> But the socket does not seem to recover from this condition.
>> My confusion is that I do not understand or see how the socket handles this
>> condition.
>> Could someone point me in the direction of how this condition is handled, or
>> in the direction of what our implementation is missing?
> The BSD stack uses a simple spin lock to prevent multiple threads from
> entering the send path at the same time. That means, if you are using
> more than one thread to send data to the UDP socket, one thread can get
> interrupted while it is inside the sosend function. This spin lock is really,
> really simple. For instance, it does not use priority inheritance at all.
> And if your spin lock is occupied for 99.9% of the time, you're out of luck too.
>
> Therefore, I'd suggest you place an eCos Kernel Mutex object with priority
> inheritance around the sendto function call(s).
>
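A minimal sketch of that suggestion, written against POSIX threads so it can be tried on a host machine. This is an assumption for illustration, not code from the thread: `send_lock`, `send_lock_init()` and `locked_sendto()` are hypothetical names. On eCos the same shape would, as far as I recall, use the kernel calls `cyg_mutex_init()`, `cyg_mutex_lock()` and `cyg_mutex_unlock()`, with priority inheritance selected in the kernel configuration (or via `cyg_mutex_set_protocol()` with `CYG_MUTEX_INHERIT` where dynamic protocol selection is enabled); please check the kernel documentation for your version.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical: one lock shared by every thread that sends on the socket. */
static pthread_mutex_t send_lock;

/* Call once, before any sender thread starts. Returns 0 on success,
 * otherwise the pthread error code. */
int send_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Priority inheritance: a low-priority holder is boosted while a
     * higher-priority thread waits for the lock. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(&send_lock, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Drop-in replacement for sendto(): only one thread at a time reaches the
 * stack's send path, so the stack's own simple lock is never contended. */
ssize_t locked_sendto(int fd, const void *buf, size_t len, int flags,
                      const struct sockaddr *to, socklen_t tolen)
{
    ssize_t n;

    assert(buf != NULL);

    pthread_mutex_lock(&send_lock);
    n = sendto(fd, buf, len, flags, to, tolen);
    pthread_mutex_unlock(&send_lock);

    return n;   /* same return value and errno as sendto() */
}
```

Every sender thread would then call `locked_sendto()` instead of `sendto()`, after a single `send_lock_init()` at start-up.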
> By the way, there might be another issue when you use unicast UDP sends.
>
> That's as follows: If your ARP entry expires while your application sends
> many UDP datagrams in a very short time, you will lose some packets while
> the BSD stack is waiting for the ARP response. The ARP timeout is 20 minutes
> by default, so you can expect some data loss every 20 minutes.
>
> I worked around this by sending a unicast ARP request if any message gets
> sent once 90-99% of the ARP timeout has elapsed.
I used a static ARP entry to work around this, and filed it as a bug in 
our bug-tracking system.
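
The refresh window Bernd describes can be sketched as a simple age check. This is only an illustration of the rule, not code taken from his patch: `arp_needs_refresh()` and `ARP_TIMEOUT_S` are hypothetical names, and the entry age is assumed to be available in seconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default ARP timeout quoted above: 20 minutes, in seconds. */
#define ARP_TIMEOUT_S (20u * 60u)

/* Returns true when the entry's age lies in the last 90-99% of its
 * lifetime, i.e. when a unicast ARP request should be sent alongside the
 * outgoing packet so the entry is renewed before it expires.
 * Integer arithmetic only, so it also suits targets without an FPU. */
bool arp_needs_refresh(uint32_t age_s, uint32_t timeout_s)
{
    uint64_t scaled_age = (uint64_t)age_s * 100u;

    return scaled_age >= (uint64_t)timeout_s * 90u &&
           scaled_age <= (uint64_t)timeout_s * 99u;
}
```

With the 20-minute default, the window runs from 1080 s to 1188 s after the entry was last confirmed; outside that window the send path is left unchanged.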
>
> Recently I fixed that and a lot of other issues in the BSD stack:
> You might like to try this: http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001656
I will try to find time to integrate and use your patch.
>
> And maybe the improved AT91 Ethernet driver too: http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001649
I mailed my improvements long ago 
(http://old.nabble.com/bugs-in-AT91-Ethernet-driver-td17569021.html), 
but never found time for a proper patch :-(.

Kind regards,
Jürgen
>
> I had spurious interrupts with the original driver on the AT91SAM9G45 when I started
> two or more flood pings at the same time, but this must also happen with other AT91 devices.
>
> Which one are you using?
>
> Regards
> Bernd Edlinger
>


-- 
Jürgen Lambrecht
R&D Associate
Tel: +32 (0)51 303045    Fax: +32 (0)51 310670
http://www.televic-rail.com
Televic Rail NV - Leo Bekaertlaan 1 - 8870 Izegem - Belgium
Company number 0825.539.581 - RPR Kortrijk
