9.2 Profiling Data File Format

The old BSD-derived file format used for profile data does not contain a magic cookie that allows one to check whether a data file really is a gprof file. Furthermore, it does not provide a version number, thus rendering changes to the file format almost impossible. GNU gprof uses a new file format that provides these features. For backward compatibility, GNU gprof continues to support the old BSD-derived format, but not all features are supported with it. For example, basic-block execution counts cannot be accommodated by the old file format.

The new file format is defined in the header file gmon_out.h. It consists of a header containing the magic cookie and a version number, as well as some spare bytes available for future extensions. All data in a profile data file is in the native format of the target for which the profile was collected. GNU gprof adapts automatically to the byte order in use.
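As a rough illustration of checking the cookie and version, the sketch below reads a file header in Python. The field sizes (a 4-byte cookie spelling "gmon", a 4-byte version number, 12 spare bytes) are assumptions not stated in this section and should be checked against gmon_out.h. The byte-order guess is a hypothetical heuristic for this sketch and is not necessarily how GNU gprof detects byte order.

```python
import struct

# Assumed layout (verify against gmon_out.h): 4-byte cookie,
# 4-byte version number, 12 spare bytes.
ASSUMED_COOKIE = b"gmon"
ASSUMED_HEADER_SIZE = 20


def parse_header(data):
    """Return (version, byte_order) for a new-format file header.

    byte_order is "<" (little-endian) or ">" (big-endian), in the
    notation of Python's struct module.
    """
    if len(data) < ASSUMED_HEADER_SIZE:
        raise ValueError("file too short to hold a header")
    if data[:4] != ASSUMED_COOKIE:
        raise ValueError("magic cookie missing: not a new-format file")
    # Hypothetical heuristic: version numbers are small, so the byte
    # order that yields the smaller value is taken as the file's order.
    little = struct.unpack("<I", data[4:8])[0]
    big = struct.unpack(">I", data[4:8])[0]
    if little <= big:
        return little, "<"
    return big, ">"
```

A file in the old BSD-derived format has no cookie, so a reader built this way would reject it at the cookie check and could then fall back to old-format handling.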

In the new file format, the header is followed by a sequence of records. Currently, there are three different record types: histogram records, call-graph arc records, and basic-block execution count records. Each file can contain any number of each record type. When reading a file, GNU gprof will ensure records of the same type are compatible with each other and compute the union of all records. For example, for basic-block execution counts, the union is simply the sum of all execution counts for each basic-block.
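The union rule for basic-block execution counts can be sketched as follows. The function name and the in-memory representation (each record as a list of address/count pairs) are hypothetical choices for this example, not part of the file format.

```python
from collections import Counter


def merge_bb_counts(records):
    """Sum execution counts per basic-block address across records.

    records: iterable of records, each a list of (address, count) pairs.
    """
    total = Counter()
    for record in records:
        for address, count in record:
            total[address] += count
    return dict(total)
```

For instance, if one record reports 3 executions of the block at 0x1000 and another reports 2, the merged result holds 5 for that address.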

9.2.1 Histogram Records

Histogram records consist of a header that is followed by an array of bins. The header contains the text-segment range that the histogram spans, the size of the histogram in bytes (unlike in the old BSD format, this does not include the size of the header), the rate of the profiling clock, and the physical dimension that the bin counts represent after being scaled by the profiling clock rate. The physical dimension is specified in two parts: a long name of up to 15 characters and a single-character abbreviation. For example, a histogram representing real time would specify the long name as “seconds” and the abbreviation as “s”. This feature is useful for architectures that support performance monitor hardware (which, fortunately, is becoming increasingly common). For example, under DEC OSF/1, the “uprofile” command can be used to produce a histogram of, say, instruction cache misses. In this case, the dimension in the histogram header could be set to “i-cache misses” and the abbreviation could be set to “1” (because it is simply a count, not a physical dimension). Also, the profiling rate would have to be set to 1 in this case.
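The header fields and the scaling step can be modeled in a few lines of Python. This is only a sketch: the class and field names are hypothetical, and it assumes that "scaled by the profiling clock rate" means dividing a bin count by the rate.

```python
from dataclasses import dataclass


@dataclass
class HistHeader:
    # Hypothetical field names for the items the header is said to contain.
    low_pc: int        # start of the text-segment range
    high_pc: int       # end of the text-segment range
    hist_size: int     # size of the bins in bytes, header excluded
    prof_rate: int     # rate of the profiling clock
    dimen: str         # long name of the dimension, up to 15 characters
    dimen_abbrev: str  # single-character abbreviation

    def __post_init__(self):
        if len(self.dimen) > 15:
            raise ValueError("dimension name longer than 15 characters")
        if len(self.dimen_abbrev) != 1:
            raise ValueError("abbreviation must be a single character")

    def num_bins(self):
        # Bins are 16-bit numbers, so each one occupies two bytes.
        return self.hist_size // 2

    def scale(self, count):
        # Assumption: scaling divides the raw count by the clock rate.
        return count / self.prof_rate
```

With a rate of 100 and the dimension "seconds", a raw count of 250 would correspond to 2.5 s under this assumption. With the rate set to 1, as in the cache-miss example, the scaled value equals the raw count.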

Histogram bins are 16-bit numbers and each bin represents an equal amount of text-space. For example, if the text-segment is one thousand bytes long and if there are ten bins in the histogram, each bin represents one hundred bytes.
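The bin arithmetic can be written out directly. The function below is an illustrative sketch with hypothetical names. It treats the text-segment range as half-open, which is an assumption of this example.

```python
def bin_index(address, low_pc, high_pc, num_bins):
    """Return the index of the histogram bin that covers address.

    The range [low_pc, high_pc) is divided into num_bins equal parts.
    """
    if not low_pc <= address < high_pc:
        raise ValueError("address outside the histogram's text range")
    return (address - low_pc) * num_bins // (high_pc - low_pc)
```

For the thousand-byte, ten-bin example, an address 250 bytes past the start of the range falls in bin 2, which covers offsets 200 through 299.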

9.2.2 Call-Graph Records

Call-graph records have a format that is identical to the one used in the BSD-derived file format. Each record consists of an arc in the call graph and a count indicating the number of times the arc was traversed during program execution. Arcs are specified by a pair of addresses: the first must be within the caller’s function and the second must be within the callee’s function. When performing profiling at the function level, these addresses can point anywhere within the respective function. However, when profiling at the line level, it is better if the addresses are as close to the call-site/entry-point as possible. This ensures that the line-level call-graph is able to identify exactly which line of source code performed calls to a function.
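To illustrate why any address inside a function is sufficient at the function level, the sketch below resolves an arc to a caller/callee pair with a sorted symbol table. The function names and the symbol-table representation are hypothetical, and the sketch assumes each function extends up to the start of the next one.

```python
import bisect


def containing_function(address, symbols):
    """Return the name of the function whose range contains address.

    symbols: list of (start_address, name) pairs sorted by start address.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        raise ValueError("address precedes the first known function")
    return symbols[index][1]


def resolve_arc(caller_pc, callee_pc, count, symbols):
    """Map an arc's two addresses to (caller name, callee name, count)."""
    return (
        containing_function(caller_pc, symbols),
        containing_function(callee_pc, symbols),
        count,
    )
```

A line-level profiler would need a finer table that maps addresses to source lines, which is why addresses close to the call site are preferable in that case.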

9.2.3 Basic-Block Execution Count Records

Basic-block execution count records consist of a header followed by a sequence of address/count pairs. The header simply specifies the length of the sequence. In an address/count pair, the address identifies a basic-block and the count specifies the number of times that basic-block was executed. Any address within the basic-block can be used.
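A reader for such a record might look like the following sketch. The field widths are assumptions, since this section does not state them: a 4-byte length in the header, and address and count widths that are passed in as parameters because they depend on the target. Consult gmon_out.h for the actual layout.

```python
import struct


def parse_bb_record(payload, byte_order="<", addr_fmt="I", count_fmt="I"):
    """Decode one basic-block record into a list of (address, count).

    payload starts at the record header. byte_order, addr_fmt and
    count_fmt use struct notation; their defaults are assumptions.
    """
    # Assumed header: a single 4-byte number of address/count pairs.
    (num_pairs,) = struct.unpack_from(byte_order + "I", payload, 0)
    pair = struct.Struct(byte_order + addr_fmt + count_fmt)
    offset = 4
    pairs = []
    for _ in range(num_pairs):
        pairs.append(pair.unpack_from(payload, offset))
        offset += pair.size
    return pairs
```

On a target with 64-bit addresses, the same function could be called with addr_fmt="Q", again assuming the pairs are stored without padding.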