This is the mail archive of the
mailing list for the SID project.
Re: Profiling: --insn-count=1
- From: Scott Dattalo <scott at dattalo dot com>
- To: sid at sources dot redhat dot com
- Date: Thu, 1 Aug 2002 05:03:55 -0700 (PDT)
- Subject: Re: Profiling: --insn-count=1
On Thu, 1 Aug 2002, Frank Ch. Eigler wrote:
> Hi, Scott -
> On Wed, Jul 31, 2002 at 12:53:37PM -0700, Scott Dattalo wrote:
> > [...]
> > Now the question I have is there a way to count cpu cycles instead of cpu
> > instructions? If there was a one-to-one relationship between the two, then
> > it's not an issue. However, some instructions on the ARM are not
> > single-cycled. I suppose the real question is, "is there a way to
> > concisely measure the amount of 'simulated' time it takes for a simulation
> > to run?"
> The current batch of CPU models in sid do not attempt to track the number
> of cycles taken by any given instruction. To do so exactly is a crazy
> amount of work to do just casually. (Think of having to model all the
> pipeline interlock/bypass features, functional units.)
Yeah, I suspected as much... If I had time to look into it, I'd try to add
that feature. The way I'd approach it is I'd partition the time it takes
an instruction to execute into two parts: the fixed amount of time the CPU
requires and the (possibly) variable amount that the memory accesses
require. The fixed portion may be ascertained when the target program is
first loaded. The variable portion may too, depending on the address
accessed. If not, the address can be examined to see which region it
accesses (and memory can be partitioned into regions that describe how
much time it takes to access - e.g. single-cycle SRAM versus 7-wait state
I'm sure this is already obvious to you because, as you say, it would
require a fair amount of work to implement.
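
To make the split concrete, here is a minimal sketch of the idea in Python. It is not sid code: the memory map, the per-opcode fixed costs, and the names REGIONS, FIXED_CYCLES, wait_states and estimate_cycles are all made up for illustration, and it assumes instruction fetches come from zero-wait-state memory so only data accesses add variable time.

```python
# Hypothetical sketch; none of these names or numbers come from sid.

# Assumed memory map: (start, end exclusive, wait states per access).
REGIONS = [
    (0x00000000, 0x00020000, 0),  # single-cycle SRAM
    (0x10000000, 0x10800000, 7),  # slower memory, 7 wait states
]

# Assumed fixed CPU cost per opcode, in cycles (illustrative values only).
FIXED_CYCLES = {"add": 1, "ldr": 3, "str": 2}


def wait_states(addr):
    """Return the extra cycles an access to addr costs, based on its region."""
    for start, end, waits in REGIONS:
        if start <= addr < end:
            return waits
    raise ValueError(f"address {addr:#x} is not in any known region")


def estimate_cycles(trace):
    """Sum the fixed and memory-dependent cycles over a trace.

    trace is an iterable of (opcode, data_addresses) pairs.
    """
    total = 0
    for opcode, addrs in trace:
        total += FIXED_CYCLES[opcode]                # fixed portion
        total += sum(wait_states(a) for a in addrs)  # variable portion
    return total


trace = [
    ("add", []),
    ("ldr", [0x00000100]),  # load from SRAM
    ("ldr", [0x10000010]),  # load from the slow region
]
print(estimate_cycles(trace))
```

The fixed table could be filled in once when the program is loaded, while the region lookup has to happen per access whenever the address is only known at run time.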
> SID can on the other hand model memory latency, so if that's the bulk of
> your interest, we can make the profile data collector sensitive to that.
Well, actually, that is half the problem.
In my particular application, I have about an 8 Meg program of which only
70k is code and the rest are constants. For all intents and purposes, you
can think of the ~7.93 Meg being a file system. The code will be shadowed
in single-cycle SRAM. The application is an extremely low-power one.
Furthermore, it has time constraints (i.e. it has to complete its task in
under a second or so). The code was originally written targeting a
desktop PC, i.e. little or no attention was given to code efficiency since
it was not an issue. So I've been re-writing the code to optimize it for
an ARM application.
The development scenario has thus been:
1) make optimizations to the code and test on a Linux box
2) debug and go back to step 1 about a hundred times or so.
3) Once convinced that an optimization has been correctly made,
re-target the makefile for an ARM processor
4) simulate the code (using sid as the simulator engine, of course)
5) analyze the simulation results
So far, I've been satisfied knowing the total number of executed
instructions. Objective results are easily quantified. However, I'm now
rapidly approaching the point where the optimizations have been completed.
While I know the approximate number of instructions, I still do not know
the total number of CPU cycles (and hence the total time).
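
For reference, the step from cycles to time is just a division by the clock rate. A minimal sketch, assuming a hypothetical 40 MHz clock (the actual clock rate is not stated here) and made-up function names:

```python
# Hypothetical sketch; the clock rate and cycle counts are assumed values.

def simulated_seconds(cycles, clock_hz):
    """Convert a cycle count into simulated run time in seconds."""
    return cycles / clock_hz


def meets_deadline(cycles, clock_hz, deadline_s=1.0):
    """Check whether the run fits within the time budget."""
    return simulated_seconds(cycles, clock_hz) <= deadline_s


CLOCK_HZ = 40_000_000  # assumed 40 MHz core clock
print(simulated_seconds(20_000_000, CLOCK_HZ))
print(meets_deadline(20_000_000, CLOCK_HZ))
```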
> > FWIW, I'm using a ~6 week old copy of SID.
> This hasn't changed recently.
Good to know - I've seen the automated messages regarding snap-shots and
was unsure offhand if there had been any changes.