[ECOS] Re: RedBoot DHCP failure due to race condition.

Tarmo Kuuse tarmo.kuuse@mail.ee
Fri Mar 18 08:40:00 GMT 2011


On 17.03.2011 16:23, Grant Edwards wrote:
> On 2011-03-16, Grant Edwards<grant.b.edwards@gmail.com>  wrote:
>
>> We've been having intermittent problems with RedBoot's DHCP client
>> failing to acquire an address.  I've tracked it down to what looks
>> like a race condition in RedBoot's DHCP code.
>
> Apart from the race condition that's overwriting the saved bp_info
> struct, I'm a little baffled by the "retry" counter in
> __bootp_find_local_ip.  It doesn't appear to count attempts to get an
> IP address.  It appears to count passes through the foreground state
> machine loop.  It takes 3 passes through that loop to obtain an IP
> address via DHCP.
>
> In our build, somebody had set MAX_RETRIES to 4.  With 4 retries, I
> presume they assumed that up to 5 attempts would be made, but only 1
> is made.
>
> I changed the code so that it's only decremented once for each
> attempted DHCP transaction, and then I ended up with cases where it
> looped indefinitely because the retry counter is re-initialized by the
> state machine when an OFFER packet is received.
>
> Can somebody explain how the retry counter is supposed to work?
>

From what I have understood, there are two levels of retries in the DHCP
client.

1. The higher-level retry uses a common timeout that doubles with each
failure: 2, 4, 8, ... seconds. Some pseudo-randomness (+/- 3 seconds
IIRC) is added to the timeout.

2. The lower-level retry is applied within each high-level attempt. It
sends the DHCPDISCOVER three times in quick succession (with a
pseudo-random timeout of approx. 100-200 ms between sends).

The second (lower-level) retry is nonsense, so I disabled it. Every DHCP
server I have met checks whether the IP address is available using an
ARP query. That check has a timeout of 1 second, so there is no point in
spamming DHCPDISCOVER messages several times per second. All it produces
is three pairs of DHCPDISCOVER and DHCPOFFER packets, which add network
traffic and can muck up the client state machine.
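
To make the two levels concrete, here is a rough sketch of the timing as
I understand it. This is not the RedBoot source: every name and constant
below (BASE_TIMEOUT_MS, MAX_TIMEOUT_MS, MIN_TIMEOUT_MS, DISCOVER_BURST,
the helper functions and the toy PRNG) is made up for illustration, and
the cap and floor on the timeout are my own assumptions.

```c
#include <stdint.h>

/* Hypothetical parameters, not RedBoot's actual names or values. */
#define BASE_TIMEOUT_MS   2000u  /* first high-level timeout: 2 s */
#define MAX_TIMEOUT_MS   64000u  /* assumed cap on the doubling */
#define MIN_TIMEOUT_MS    1000u  /* assumed floor after jitter */
#define JITTER_MS         3000u  /* +/- 3 s, as recalled above */
#define DISCOVER_BURST       1u  /* 3 = original burst, 1 = burst disabled */

/* Tiny LCG so the sketch has no dependencies; any PRNG would do. */
static uint32_t prng_state = 1u;

static uint32_t prng_next(void)
{
    prng_state = prng_state * 1103515245u + 12345u;
    return (prng_state >> 16) & 0x7fffu;
}

/* Nominal timeout for high-level attempt n (0-based): 2, 4, 8, ... s. */
static uint32_t nominal_timeout_ms(unsigned attempt)
{
    uint32_t t = BASE_TIMEOUT_MS;

    while (attempt-- > 0 && t < MAX_TIMEOUT_MS)
        t *= 2u;
    return t < MAX_TIMEOUT_MS ? t : MAX_TIMEOUT_MS;
}

/* Nominal timeout plus a pseudo-random offset in [-JITTER_MS, +JITTER_MS],
 * clamped so it never drops below MIN_TIMEOUT_MS. */
static uint32_t jittered_timeout_ms(unsigned attempt)
{
    uint32_t t = nominal_timeout_ms(attempt);
    uint32_t j = prng_next() % (2u * JITTER_MS + 1u);   /* 0..6000 */
    uint32_t shifted = t + j;                            /* t - 3000 + j, offset by 3000 */

    if (shifted < JITTER_MS + MIN_TIMEOUT_MS)
        return MIN_TIMEOUT_MS;
    return shifted - JITTER_MS;
}

/* DHCPDISCOVER packets sent after a number of high-level attempts,
 * given how many are sent per attempt (the low-level burst). */
static unsigned discovers_sent(unsigned attempts, unsigned burst)
{
    return attempts * burst;
}
```

With a burst of 3, four high-level attempts put 12 DHCPDISCOVERs on the
wire instead of 4, and each of them can draw its own DHCPOFFER. Setting
the burst to 1 leaves only the doubling timeout, which is what I ended
up with.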

--
Kind regards,
Tarmo Kuuse


-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss


