[ECOS] bug in ARP FSM with fragmented packets?

Jürgen Lambrecht J.Lambrecht@televic.com
Tue Jan 22 15:42:00 GMT 2008


Andrew Lunn wrote:

>On Mon, Jan 21, 2008 at 12:18:16PM +0100, Jürgen Lambrecht wrote:
>  
>
>>Hello,
>>
>>I'm investigating a problem where UDP packets get lost when there is an  
>>ARP table timeout.
>>- It only happens with UDP, not with TCP; IPv4.
>>- It only happens with fragmented IP packets, so with UDP packets bigger  
>>than the Ethernet MTU of 1500 bytes (1518-byte maximum frame).
>>- I use the FreeBSD stack, version from CVS end 2007
>>This is what I see with Ethereal: The application receives a UDP request  
>>packet, and must send back a UDP reply packet. Instead an ARP  
>>request/reply happens, and the packet gets lost.
>>
>>Using static ARP table entries solves the problem of course, but that is  
>>not acceptable.
>>
>>Has anybody seen such a problem before?
>>Or has anybody a clue where to start looking?
>>    
>>
>
>First off, is this really a bug? UDP is unreliable. It is allowed to
>drop packets. If the application requires reliable packet transfer, it
>must perform retries at the application layer.
>  
>
I agree that UDP is unreliable, but that should only be caused by an 
unreliable network, not by unreliable SW.
I know what you mean, but I don't agree.
In a way you are saying that UDP-related SW may contain bugs or behave 
strangely, because it does not need to be reliable.
In our case, we "own" the complete network path from sender to receiver, 
and the tests are done point-to-point with our dedicated HW & SW. So 
UDP should be completely reliable.

>It is a while since i looked at the ARP code. However, i think it will
>hold onto one IP packet when it needs to make an ARP request. Once the
>ARP reply comes back it will send the packet it held. If more transmit
>requests are made before it has the ARP reply it discards
>packets. This fits your description. Your big UDP packet is being
>fragmented, causing two or more packets to be sent, of which all but
>one gets discarded.
>  
>
ok
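
To check that I understand the behaviour described above, here is a toy 
model of it. This is only a sketch, not the actual stack code: all names 
(toy_pkt, toy_arp_entry, toy_output, toy_arp_reply) are hypothetical, 
and I assume that a new packet replaces the one already held.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the real eCos/FreeBSD structures. */
struct toy_pkt {
    int frag_id;               /* which IP fragment this is */
};

struct toy_arp_entry {
    int resolved;              /* 0 while the ARP reply is still outstanding */
    struct toy_pkt *hold;      /* the single packet held during resolution */
    int sent;                  /* packets handed to the driver */
    int dropped;               /* packets discarded while unresolved */
};

/* Transmit request. If the entry is unresolved, hold the packet; a packet
 * that was already held is discarded (assumption: newest packet wins). */
void toy_output(struct toy_arp_entry *e, struct toy_pkt *p)
{
    assert(e != NULL && p != NULL);
    if (e->resolved) {
        e->sent++;
        return;
    }
    if (e->hold != NULL)
        e->dropped++;
    e->hold = p;
}

/* ARP reply arrived: mark the entry resolved and send the held packet. */
void toy_arp_reply(struct toy_arp_entry *e)
{
    assert(e != NULL);
    e->resolved = 1;
    if (e->hold != NULL) {
        e->sent++;
        e->hold = NULL;
    }
}
```

With a UDP datagram split into three fragments and an expired ARP entry, 
this model sends one fragment and drops two, so the receiver can never 
reassemble the datagram.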

>If you really must change this, you need to implement a list of
>packets, not a single packet.
>  
>
I agree that fragmentation is not a good idea; we have already discussed 
this on the list. But we have to do it.

>However, in my view, your application is broken, not ARP.
>
>         Andrew
>
>  
>
So next week I will look at the ARP code, after running detailed tests 
to be sure that ARP is the problem.
If ARP indeed stores only 1 packet, I propose to modify the code so that 
ARP stores as many packets as fit in the networking buffer. The size of 
the static network buffers is a configuration option. I know there are 
several networking buffers; I will have to find out which buffers are 
used for what.
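
A minimal sketch of what I have in mind, assuming a small fixed-size 
FIFO per ARP entry instead of a single held packet. The names (holdq, 
holdq_put, holdq_get) and the limit HOLDQ_MAX are hypothetical; in a real 
change the limit would follow from the buffer configuration, and the 
queue would hold the stack's own packet buffers.

```c
#include <assert.h>
#include <stddef.h>

#define HOLDQ_MAX 4            /* hypothetical limit, not an existing option */

struct holdq_pkt {
    int frag_id;               /* stands in for a real packet buffer */
};

/* Bounded FIFO of packets waiting for one ARP resolution. */
struct holdq {
    struct holdq_pkt *slot[HOLDQ_MAX];
    size_t head;               /* index of the oldest queued packet */
    size_t count;              /* number of queued packets */
    unsigned dropped;          /* packets refused because the queue was full */
};

/* Queue a packet while ARP is unresolved.
 * Returns 0 on success, -1 if the queue is full (packet is dropped). */
int holdq_put(struct holdq *q, struct holdq_pkt *p)
{
    assert(q != NULL && p != NULL);
    if (q->count == HOLDQ_MAX) {
        q->dropped++;
        return -1;
    }
    q->slot[(q->head + q->count) % HOLDQ_MAX] = p;
    q->count++;
    return 0;
}

/* Take the oldest packet, or NULL if the queue is empty. Called in a loop
 * when the ARP reply arrives, so fragments leave in their original order. */
struct holdq_pkt *holdq_get(struct holdq *q)
{
    assert(q != NULL);
    if (q->count == 0)
        return NULL;
    struct holdq_pkt *p = q->slot[q->head];
    q->head = (q->head + 1) % HOLDQ_MAX;
    q->count--;
    return p;
}
```

The queue is bounded on purpose: an unanswered ARP request must not be 
able to pin an unlimited number of network buffers.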

kind regards,
Jürgen

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss


