[PATCH] gprofng: a new GNU profiler

Vladimir Mezentsev vladimir.mezentsev@oracle.com
Wed Aug 11 21:10:35 GMT 2021


Hi people!

In this submission we are contributing a new profiler to the GNU binary
utilities, called gprofng (for GNU profiler, next generation).

Why a new profiler?
===================

The GNU profiler, gprof, works well enough in many cases. However, it
hasn't aged well and it is not very well suited for profiling
modern-world applications. Examples of its limitations are the lack of
support for profiling multithreaded programs and shared objects. Both are
ubiquitous nowadays.

Main characteristics of gprofng
===============================

gprofng supports profiling C, C++ and Java programs. Unlike the old
gprof, it doesn't require building annotated versions of the programs.
Profiling "production" binaries should work just fine.

Another distinguishing feature of gprofng is the support for various filters
that allow the user to easily drill deeper into an area of interest.

The profiler is commanded through a driver program called `gprofng'.
This driver supports the following sub-commands:

gprofng collect app EXECUTABLE

This runs EXECUTABLE and collects application performance data.

gprofng display text EXPERIMENT

This runs a client command-line interface that provides access to the
collected performance data stored in the experiment directory.

gprofng display html EXPERIMENT

This generates an HTML report from the collected performance data
stored in the experiment directory.

gprofng display src OBJECT-FILE

This displays source (if available) or disassembly interleaved
with the source code.

gprofng archive EXPERIMENT

This archives the associated application binaries, load objects and source
files in an existing experiment directory to make it self-contained.
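
As a rough illustration of how these sub-commands chain together, here is a
minimal shell sketch. It uses only the sub-commands listed above. The function
name profile_and_report, the GPROFNG override variable, and the example
arguments ./a.out and test.1.er are hypothetical and chosen for illustration
only; in particular, the experiment directory name is an assumption and should
be replaced with whatever name the collect step actually reports.

```shell
#!/bin/sh
# Sketch only: chains the collect, display text and archive sub-commands.
# GPROFNG is a hypothetical override, handy for a dry run (GPROFNG=echo).
GPROFNG=${GPROFNG:-gprofng}

profile_and_report() {
    app=$1          # path to the executable to profile (hypothetical)
    experiment=$2   # experiment directory name (assumption, see above)

    # Run the application and collect performance data.
    "$GPROFNG" collect app "$app" || return 1

    # Open the command-line client on the collected data.
    "$GPROFNG" display text "$experiment" || return 1

    # Make the experiment directory self-contained.
    "$GPROFNG" archive "$experiment"
}

# Example invocation (not executed here):
#   profile_and_report ./a.out test.1.er
```

Setting GPROFNG=echo before calling the function prints the three command
lines instead of running them, which is a cheap way to check the sequence.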

There is also an extensive graphical user interface (written in Java)
that displays and analyzes gprofng collected data in a very sophisticated
way. We plan to release this GUI as a separate project.

While WIP, we would like to share some screenshots of the current
development version. These show the following:

pic1.png - a flame graph:
https://jemarch.net/gprofng-pics/pic1.png

pic2.png - color coded call stacks as a function of time ("the timeline"):
https://jemarch.net/gprofng-pics/pic2.png

pic3.png - zoom in on the timeline and adapt colors to identify details:
https://jemarch.net/gprofng-pics/pic3.png

pic4.png - compare two multithreaded profiles:
https://jemarch.net/gprofng-pics/pic4.png

Some notes on the implementation
================================

- The gp-display-html tool is written in Perl. All other components are
written in C/C++.

- gprofng sources are mostly contained in a new top-level directory
gprofng/ that in turn contains:

+ src/ contains the source code of the gp-* programs and libgprofng.

+ libcollector/ contains the sources of libcollector.

+ common/ contains a few source files that are used by both the gp-*
utilities and libcollector.

+ doc/ contains the Texinfo sources for the gprofng manual.

+ testsuite/ contains the gprofng testsuite.

Three installed header files are distributed in the top-level include/
directory. These are libcollector.h, libfcollector.h, and
collectorAPI.h.

- Currently gprofng supports profiling programs on GNU/Linux systems
running on x86_64 and aarch64 hardware. It is possible to add support
for additional architectures.

- The tools come with a set of man pages. They are generated upon
installation and can be found in the installation directory under
share/man/man1.

Platform support
================

The basic profiling features are supported on most processors from
Intel. We have not yet tested on AMD's recent EPYC processors, but we
do not expect serious issues. We also support the Arm processors used
in systems from Ampere.

Hardware event counters, which are optional and used by gprofng in
advanced profiling, are supported for many modern Intel and AMD
processors. If a particular processor is not supported, a warning
message will be issued when trying to run an event counter experiment.

This code has been developed and tested on Oracle Linux 8 with the
latest GNU toolchain from the sourceware git repos.

Structure of the patch series
=============================

The first patch is preparatory and makes the x86 disassembler in opcodes
thread-safe, so that it can be used by gprofng.

The second patch is the implementation of gprofng proper. This includes
source code for the libraries (libcollector, libgprofng) and the
utilities (gp-collect, etc).

In this patch there are also updates to the corresponding build machinery
(e.g. configure.ac, Makefile.def, plus binutils/MAINTAINERS to cover
gprofng).

The third patch adds a testsuite in gprofng/testsuite.

The fourth patch adds a Texinfo manual for gprofng. The manual is still
WIP but already provides a tutorial-like introduction to the tools.

Where to find the patch series
==============================

Due to the size of the contribution, we thought it would be better to
submit it in the form of a git branch instead of a regular emailed patch
series.

Repository: https://www.github.com/oracle/binutils-gdb
Branch: oracle/gprofng-v1

We hope this will make it easier for the maintainers to review the
tools. We suggest having feedback and discussion in this mail thread.

Limitations
===========

The gp-display-html tool is present, and can be executed, but it is not
functional yet. Full support for this tool is expected to be delivered
in a future patch.

Requirements
============

In order to successfully build gprofng, the following versions of
external components are required:

- Bison 3.7.5, or higher
- Texinfo 6.7, or higher
- Java include files (--with-jdk=PATH) if Java profiling should be enabled

Maintenance
===========

We are of course volunteering to maintain gprofng once it is
incorporated into the main binutils distribution.


