This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: gprof
Ken Raeburn wrote:
On Sep 20, 2005, at 06:31, Michael Trimarchi wrote:
>But also, if you care about the accuracy of the results, you may
>need to modify the C runtime support code for profiling, which
>typically updates the per-function data in a manner that is not
>thread-safe.
Could you explain this point more precisely?
Regards
Michael
With basic profiling, the runtime support code keeps track of how
often the CPU program counter is in a given range of values, with
fairly fine granularity. Later this table is dumped out, and (g)prof
interprets it in combination with symbol information from the
executable. For example, addresses XXX through YYY correspond to
function A, and so many ticks at a certain frequency were counted
with the PC in that range, so here's the amount of time the program
spent in that function. But that may be inaccurate in multithreaded
programs if the counter update is implemented as separate steps: read
counter value N, increment it, then store N+1. If another thread runs
for a bit between the read and the store and changes the counter, the
store of N+1 overwrites that change; you've just lost the update made
by the other thread.
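
The lost update can be replayed deterministically in a single thread.
The sketch below is only an illustration: simulate_lost_update and its
variables are hypothetical names, not part of any real profiling
runtime, and the "other thread" is simulated by writing out its steps
at the point where the preemption would happen.

```c
#include <assert.h>

/* Hypothetical illustration: replay the bad interleaving by hand.
   'bucket' stands for one histogram counter shared by two threads. */
static unsigned simulate_lost_update(unsigned bucket)
{
    unsigned a_tmp = bucket;   /* thread A: read counter value N       */
    a_tmp = a_tmp + 1;         /* thread A: increment its private copy */

    /* Thread A is preempted here; thread B performs a full update. */
    unsigned b_tmp = bucket;   /* thread B: read N    */
    b_tmp = b_tmp + 1;         /* thread B: increment */
    bucket = b_tmp;            /* thread B: store N+1 */

    bucket = a_tmp;            /* thread A: store N+1, overwriting B's store */
    return bucket;             /* N+1, although two ticks were recorded */
}

/* Small self-check: two ticks were recorded, but the counter moved by one. */
static void check_lost_update(void)
{
    assert(simulate_lost_update(5) == 6);
}
```

Two updates were attempted, but the counter only advances by one, which
is exactly the undercount a real race would produce.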
For graph profiling, you also need data recorded on entry to a
function indicating where the function was called from, and how many
times it was called from each call site. Since you could have
arbitrarily many such call sites, this is likely to involve dynamic
memory allocation, walking through some data structures, etc. If
it's not done just right, it might even result in crashes in
multiprocessor, multithreaded situations, if you're really unlucky
with the timing.
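
To make the call-graph bookkeeping concrete, here is a minimal sketch
of per-call-site recording. It is an assumption-laden illustration, not
the data layout of any real runtime: struct arc, arc_list and
record_arc are hypothetical names, and a plain linked list stands in
for whatever structure the support code actually uses.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical arc record: one per (call site, callee) pair. */
struct arc {
    uintptr_t from_pc;     /* address of the call site              */
    uintptr_t self_pc;     /* entry address of the called function  */
    unsigned long count;   /* how many times this arc was taken     */
    struct arc *next;
};

static struct arc *arc_list;   /* shared and unprotected: NOT thread-safe */

/* Record one call. Returns the arc's updated count, or 0 if the
   allocation of a new arc record fails. */
static unsigned long record_arc(uintptr_t from_pc, uintptr_t self_pc)
{
    struct arc *a;

    /* Walk the list looking for an existing arc. */
    for (a = arc_list; a != NULL; a = a->next)
        if (a->from_pc == from_pc && a->self_pc == self_pc)
            return ++a->count;   /* unsynchronized read-modify-write */

    /* First call from this site: allocate and link a new record. */
    a = malloc(sizeof *a);
    if (a == NULL)
        return 0;
    a->from_pc = from_pc;
    a->self_pc = self_pc;
    a->count = 1;
    a->next = arc_list;
    arc_list = a;   /* two threads linking at once can lose a node */
    return 1;
}
```

Both the count update and the list insertion are unprotected here, so
concurrent callers can lose counts or lose whole records. In structures
that also unlink or reuse nodes, the same kind of race can leave
dangling pointers, which is where crashes become possible.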
I've never gotten around to modifying the support code to try to make
it thread-safe. Using mutex locks would be the obvious approach, but
probably kind of expensive compared to some of the atomic operations
a few processors have available, or "store if another cpu or thread
hasn't stored here, and set condition codes to tell me", etc.
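
As a rough sketch of the atomic alternative, assuming a C11 compiler
with <stdatomic.h> (which postdates this thread), a counter can be
bumped either with a compare-and-swap retry loop, which mirrors the
"store if nobody else has stored here, and tell me" idea, or with a
single atomic add. The names tick_bucket, bump_cas and bump_fetch_add
are hypothetical and not taken from any profiling runtime.

```c
#include <stdatomic.h>

static _Atomic unsigned long tick_bucket;   /* hypothetical shared counter */

/* Increment with a compare-and-swap retry loop; returns the new value. */
static unsigned long bump_cas(_Atomic unsigned long *bucket)
{
    unsigned long seen = atomic_load(bucket);

    /* If another thread stored in the meantime, the exchange fails and
       reloads 'seen' with the current value, so the loop retries with
       a fresh 'seen + 1'. */
    while (!atomic_compare_exchange_weak(bucket, &seen, seen + 1))
        ;
    return seen + 1;
}

/* Same effect with one atomic read-modify-write; returns the new value. */
static unsigned long bump_fetch_add(_Atomic unsigned long *bucket)
{
    return atomic_fetch_add(bucket, 1) + 1;
}
```

Either function could replace a plain counter increment. Whether this
is actually cheaper than a mutex depends on the processor and on how
contended the counter is.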
You might just get kind of lucky with the existing support code,
though; your program might not crash, and the numbers might even be
vaguely accurate...
Ken
For example, to compute the time spent inside a function foo()
(starting at address X and ending at address Y), the system samples the
program counter at each tick and checks whether it is between X and Y.
That may be inaccurate in multithreaded programs if the update of a
counter C is implemented as the following sequence of instructions:
1. read counter value C
2. C=C+1
3. store C
If another thread interrupts this thread between steps 2 and 3 and
changes the counter, then the value stored by the interrupted thread
is N+1, where N is the value it originally read, and the other
thread's increment is lost. Is that right?
Regards Michael