RFC: Optimization to write_register_bytes()
Fri Jun 9 20:49:00 GMT 2000
Fernando Nasser wrote:
> Andrew, if you look carefully at the code of write_register_gen() you will
> see that we will be using almost nothing of it except the test for no changes.
> On the contrary, we would e adding overhead. So, consider my patch with the
> following added to write_register_bytes() on the segment above:
> /* If new value == old value, then don't bother doing the
> actual store. */
> if (memcmp (registers + overlapstart,
> myaddr + (overlapstart - myregstart),
> overlapend - overlapstart) == 0)
> Now I am doing what you want but without adding any unnecessary tests.
Per other e-mail - Please NO!
Its more important to tighten up access to the ``registers'' buffer then
it is to try to save a few extra CPU cycles. The crippler for register
transfers is what happens when target_store_registers(regno) is called,
those other tests wouldn't even show up in a proper system level
I'd be interested in knowing how much just this basic change improves
I'd also note that even this change modifies the functions semantics - I
think in this case it is safe. Studying a very old version of the
function it was written as:
o pull in all the registers that the blat affects
o blat registers using memcpy()
o write out all the registers that the blat effects
If two regisers overlapped then *both* were written
The move to write_register_gen() for complete registers ment that the
above behavour was changed - if several registers overlapped it would
only write the first of them - the second would miss as the
write_register_gen() call found no change.
In moving to write_register_gen() for the partial register case, extends
that - at some risk. However, I think, given the basic change has been
there since ~95 it will be safe.
> > The second tweek I think may be safe is to keep track of how many bytes
> > are still to be written. If we just worry about the case where
> > registers are contigious and assending then the required change should
> > be easy.
> > if (we've just written out all the low bytes)
> > bump our lower bound and number of bytes to write;
> > if there are no more bytes to write
> > break;
> > it does assume that the actual target doesn't need separate writes when
> > two registers overlap. I think that is safe as the original purpose of
> > the function was to spray values such as structures across several
> > registers.
> This solution is cool as well, but the implementation in my patch is less
> restringent. I was wrong when I said that I required the registers to
> be contiguous. The only condition I add is that the last part comes last.
> They can be non-contiguous and the initial pieces may even be out of order
> (not that anyone should use that).
> I would stick to the "last piece written" test and fix the comments.
It isn't a question of cool.
The thing we can't risk is introducing some change only to discover
that, oops, we've broken target XYZ deployed in the field. I've been
through that and it ain't pleasant. The key difference between what I
suggested and you've tabled is that my proposal preserves the existing
behavour when the registers turn out to not be in assending order. Your
change doesn't do that.
More information about the Gdb-patches