This is the mail archive of the ecos-discuss@sources.redhat.com mailing list for the eCos project.



Re: Strange timeouts and multiple dhcp?



Colin Ford <colin.ford@pipinghotnetworks.com> writes:
> I need a bit of help with my DHCP on my PowerPC board.
> Everything seemed to be going fine until my eCos system
> started sending out multiple DHCP requests. It sends out
> requests until the tx buffers are full and then the next time
> it tries it waits for 4 seconds and then it starts receiving
> the DHCP responses???

It deliberately spits out 5 retries quite fast if it gets no response in
its initial listens of about 0.1 - 0.3 S each, before going into a longer
exponential backoff sequence of up to 100 seconds.  After each longer
timeout it does 3 rapid retries if it gets no response.  This was found to
be the best way to get it to work reliably without clogging the net.  The
exponential backoffs are mandated by the RFC; I superimposed the small
number of quickies to get my world to work best.

Waiting 4 seconds before receiving: that could be the initial "long"
timeout, after which it sends *again*.  Perhaps the ethernet driver has a
bug where it doesn't report a successful receive until it receives a 2nd
packet, or until it is asked to transmit again, or it cannot report a good
rx until all tx buffers have drained?

> Then, when it finally does get the DHCP, it tries to
> do the ping test and gets:
> 
> recvfrom: Operation timed out
> recvfrom: Operation timed out
> recvfrom: Operation timed out
> recvfrom: Operation timed out

Don't know why that would fail.  Probably related if there is a problem in
the ether driver.

> When I use ethereal to look at the DHCP requests
> coming out the time between them looks very fast:
> 
> Request      Time
> 1                4.181951
> 2                4.181971
> 3                4.181997
> 4                4.182022
> 5                4.182047

If those are decimal seconds, I would agree; the delay should be more like
0.1-0.3 S as I said above, not 25uS - the initial rapid hits are governed
by 
    ptv->tv_sec = 0;
    ptv->tv_usec = 65536 * (2 + (timeout_random & 3)); // 0.1 - 0.3S, about
which should not give so short a timeout under any circumstances!
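
To make the expected range concrete, the arithmetic in those two lines can be
sketched as a standalone C function.  The name quick_timeout_usec() is made
up for illustration and is not part of dhcp_prot.c; only the expression
itself comes from the lines quoted above.

```c
/* Hypothetical helper, not from dhcp_prot.c: reproduces the quick-retry
 * timeout arithmetic so the possible values can be inspected.
 * Only the low two bits of timeout_random are used, so the multiplier
 * is always 2, 3, 4 or 5. */
static long quick_timeout_usec(unsigned int timeout_random)
{
    return 65536L * (2 + (long)(timeout_random & 3u));
}
```

Calling it with 0, 1, 2 and 3 gives 131072, 196608, 262144 and 327680
microseconds, so even the shortest quick timeout is well over 0.1 S.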

Take a look at the routines reset_timeout() and next_timeout() in
dhcp_prot.c to check that these are running correctly.

> It's like the timers are not waiting long enough and
> timing out really quickly? Anyone know where I
> should be looking for a solution to the problem?

Are system timers per se running at wallclock speed?  It looks like timers
could be a factor of 10^4 too fast - I'd expect you to have noticed that!
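
As a rough check of that factor, here is a small C sketch that works from the
capture timestamps quoted earlier, converted to whole microseconds.  The
helper names are invented for illustration, and the expected timeouts are
the smallest and largest values of the quick-retry expression above.

```c
/* Timestamps from the quoted capture, in whole microseconds. */
static const long capture_us[5] = {
    4181951L, 4181971L, 4181997L, 4182022L, 4182047L
};

/* Mean gap between consecutive requests (integer division). */
static long mean_gap_us(const long *t, int n)
{
    return (t[n - 1] - t[0]) / (n - 1);
}

/* How many times shorter the observed gap is than the expected timeout. */
static long speedup_factor(long expected_us, long observed_us)
{
    return expected_us / observed_us;
}
```

The mean gap comes out at 24 uS.  Against the shortest expected timeout
(131072 uS) that is a factor of about 5500, and against the longest
(327680 uS) about 13600, so "roughly 10^4" is the right order of magnitude.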

But if you have mis-defined the constants CYGNUM_HAL_RTC_NUMERATOR and
CYGNUM_HAL_RTC_DENOMINATOR, or mis-initialized your hardware timer so that
it interrupts more rapidly than those symbols describe, then the timeouts
would be screwy.  There are a couple of tests in the wallclock package that
might help, and more recently a kernel test called clocktruth.cxx; they
simply print their opinion of seconds and you can compare with your
wristwatch.
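
For a quick sanity check of the two constants, the sketch below assumes the
usual eCos convention that NUMERATOR/DENOMINATOR is the length of one clock
tick in nanoseconds.  The EXAMPLE_ values are placeholders for a 100 Hz tick,
not taken from any particular HAL, and the function names are hypothetical;
substitute the values from your own configuration.

```c
/* Placeholder values (assumption): 10^9 / 100 = 10 ms per tick. */
#define EXAMPLE_RTC_NUMERATOR   1000000000ULL
#define EXAMPLE_RTC_DENOMINATOR 100ULL

/* Nanoseconds per tick implied by the two constants. */
static unsigned long long tick_ns(unsigned long long numerator,
                                  unsigned long long denominator)
{
    return numerator / denominator;
}

/* Whole ticks that fit in a timeout given in microseconds. */
static unsigned long long usec_to_ticks(unsigned long long usec,
                                        unsigned long long numerator,
                                        unsigned long long denominator)
{
    return (usec * 1000ULL * denominator) / numerator;
}
```

With these example values a tick is 10000000 nS, and the shortest quick
timeout of 131072 uS is 13 ticks.  If the hardware timer actually interrupts
far more often than the constants claim, those 13 ticks elapse in
correspondingly less wallclock time.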

HTH,
	- Huge

