This is the mail archive of the
mailing list for the GDB project.
[RFC] GDB performance testing infrastructure
- From: Yao Qi <yao at codesourcery dot com>
- To: <gdb-patches at sourceware dot org>
- Date: Wed, 14 Aug 2013 21:00:32 +0800
- Subject: [RFC] GDB performance testing infrastructure
Here is a proposal for a GDB performance testing infrastructure.
We'd like to know what people think about this, especially on,
1) What performance issues this infrastructure can test or
2) What does this infrastructure look like? What can it do
and what can't it do?
I've written some micro-benchmarks, and run them in this
infrastructure prototype. The results look reasonable and
Table of Contents
1 Motivation and Goals
.. 1.1 Goals
2 Known works
3 Design
.. 3.1 Requirements
.. 3.2 Design
4 Example
.. 4.1 single step
.. 4.2 shared library
1 Motivation and Goals
The GDB development process has no standard mechanism to show whether
the performance of a GDB snapshot or release has improved or worsened.
We run regression tests, which address only questions of functionality.
Performance regressions do show up periodically.
We really need performance testing in GDB development, especially in
the following areas, to make sure that no performance regression is
introduced during development.
* Remote debugging. It is slower to read from the remote target, and
worse, GDB reads the same memory regions multiple times, or reads
consecutive memory with multiple packets.
* Symbols. Some of the performance problems in GDB are related to
symbols. When GDB is used to debug large real-life programs with a
huge number of symbols, such as LibreOffice, it is a challenge for
GDB to organize them efficiently. Some such bugs are reported in
bugzilla, for example [PR15412] and [PR14125]. Issues are documented
on [wiki].
* Shared library. When a program needs a large number of shared
libraries, GDB will be slow. Gary improved the performance in this
area, but there is still an open bug on scalability ([PR15590]).
* Tracepoint. Tracepoints are designed to collect data in the inferior
efficiently, so we need performance tests to guarantee that they
remain efficient enough. Note that we have a test,
`gdb.trace/tspeed.exp', but there is still room for improvement.
1.1 Goals
The goals in this project are:
1. Collect performance data of GDB in various areas under different
supported configurations. These areas or aspects include
performing single step, thread-specific breakpoint, stack
backtrace, symbol lookup, shared library load/unload, etc.
Configurations include native debugging and remote debugging with
GDBserver. This framework includes some micro-benchmarks and
utilities to record performance data, such as the execution time
and memory usage of the micro-benchmarks.
2. Detect performance regressions. Once we have collected the
performance data of each micro-benchmark, we need to detect or
identify performance regressions by comparing it with the previous
run. This is more powerful when associated with continuous testing.
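As a rough illustration of goal 1, the sketch below records CPU time and peak memory around an arbitrary operation. The helper name `measure` and the fields it returns are assumptions made for this example, not part of the prototype, and the `resource` module is only available on Unix hosts.

```python
import resource
import time


def measure(operation, repeat=1):
    """Run operation() `repeat` times; return CPU seconds and peak RSS.

    Hypothetical helper: inside GDB, `operation` could be something
    like lambda: gdb.execute("stepi").
    """
    start = time.process_time()
    for _ in range(repeat):
        operation()
    cpu_seconds = time.process_time() - start
    # ru_maxrss is the peak resident set size of the process so far
    # (kilobytes on Linux, bytes on macOS), not the memory used by
    # this call alone.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return {"cpu_seconds": cpu_seconds, "peak_rss": peak_rss}


if __name__ == "__main__":
    print(measure(lambda: sum(range(100000)), repeat=10))
```

Because the memory figure is a process-wide high-water mark, it is most useful when each micro-benchmark runs in a fresh GDB.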
2 Known works
* [LNT] It was written for LLVM, but is *designed* to be usable for
the performance testing of any software. It is written in python,
well documented and easy to set up. LNT spawns the compiler first
and then the target program, and records the time usage of both in
json format. No interaction is involved. The performance data
collection in LNT is relatively simple, because it is targeted at
compilers. Once the performance testing part is done, the next step
is to show the data and detect performance regressions. LNT does a
lot of work here. The performance data in json format can be
imported into a database and shown through [web]. Performance
regressions are highlighted in red.
* [lldb] LLDB has a [] to measure the speed and memory
usage of LLDB. It captures the internal events, feeds some events
and records the time usage. It handles interactions by consuming
debugging events and taking actions accordingly. It only collects
performance data and doesn't detect performance regressions.
* libstdc++-v3 There is a directory `performance' in
libstdc++-v3/testsuite/ and a header testsuite_performance.h in
testsuite/util/. Test cases are compiled with the header and run
with some large data sets to calculate the time usage. It is
suitable for performance testing of a library.
3 Design
3.1 Requirements
+ Drive GDB to do some operations and record the performance data.
Especially to drive GDB for these cases:
* Libraries are loaded or unloaded in a program which has a large
number of shared libraries, 4096 libraries, for example,
* Look up a symbol in a program which has a large number of symbols,
1 million, for example,
* Do single step, disassembly or other operations in remote
+ Both native debugging and remote debugging are supported.
+ Display the performance data in some format, plain text or html.
+ Detect the performance regressions. In functional regression
testing, we can simply diff the two `gdb.sum' files and get to know
the regressions or progressions. In performance testing, we need to
analyze the performance data in two runs to find the regression
instead of simply comparing them by diff.
+ Highlight regressions. It makes sense to show the regression or
progression greater than a certain threshold, 5%, for example.
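The last two requirements could be approached along the following lines. This is only a sketch: the function name `find_changes`, the dictionary layout and the sample numbers are made up for illustration, and a real comparison would probably also need to account for noise between runs.

```python
def find_changes(previous, current, threshold=0.05):
    """Return benchmarks whose time changed by more than `threshold`.

    `previous` and `current` map a benchmark name to its time in
    seconds. The result maps each flagged name to its relative change:
    positive means slower (regression), negative means faster.
    """
    changes = {}
    for name, old_time in previous.items():
        if name not in current or old_time <= 0:
            continue  # nothing to compare against
        relative = (current[name] - old_time) / old_time
        if abs(relative) > threshold:
            changes[name] = relative
    return changes


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    previous = {"single-step 300": 0.20, "solib 128": 0.50}
    current = {"single-step 300": 0.23, "solib 128": 0.51}
    for name, relative in sorted(find_changes(previous, current).items()):
        print("%s: %+.1f%%" % (name, relative * 100))
```

With the 5% threshold only the first benchmark is reported; the 2% change in the second one is treated as below the reporting limit.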
The first three requirements are the minimum set, and can be met in
the short term. Our ultimate goal is to keep track of the performance of
GDB, and improve its performance in some areas, instead of developing
a fully functional performance testing framework. In the long term, we
can improve the framework gradually and meet the last two requirements.
3.2 Design
+ Use `dejagnu' to invoke the compiler to compile the test case and
start GDB (and/or GDBserver). This is the same as the functional
regression testing we do nowadays. We choose `dejagnu' here because
it handles GDB testing very well, especially when GDBserver is used.
We don't have to re-invent the wheel in python.
+ GDB loads a python script, in which some operations are performed and
performance data (time and memory usage) is collected into a file.
The performance test is driven by python, because GDB has a good
python binding now. We can also use python to collect performance
data, process it and draw graphs, which is very convenient.
+ Emulate the effect of a large program, instead of using a real large
program. Performance problems show up when the program is *large*
enough, in terms of a large number of symbols or shared libraries.
Using a real large program can trigger the problem, but it is hard
for other people to reproduce. A test like this can be run regularly.
1. When we test the performance of GDB handling shared libraries, we
can use an .exp script to generate a large number of C files,
compile them to shared libraries, and let the main executable load
these libraries in order to measure the performance.
2. When we test the performance of GDB reading symbols in and
looking for symbols, we can either fake a lot of debug
information in the executable or fake a lot of `objfile',
`symtab' and `symbol' in GDB. We may extend `jit.c' to add
symbols on the fly. `jit.c' is able to add `objfile' and
`symtab' to GDB from an external reader. We can factor this part
out to add `objfile', `symtab', and `symbol' to GDB for
performance testing purposes. However, I may be wrong.
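To make item 1 more concrete, here is a minimal sketch of the source-generation step. The proposal does this from the .exp script; Python is used here only for illustration, and the file and function names (`solib-lib<N>.c', `shr<N>') are made up.

```python
import os
import tempfile


def generate_solib_sources(directory, count):
    """Write `count` tiny C files, one per future shared library.

    File and function names (solib-lib<N>.c, shr<N>) are made up for
    this sketch. Returns the list of paths written.
    """
    paths = []
    for index in range(count):
        path = os.path.join(directory, "solib-lib%d.c" % index)
        with open(path, "w") as source:
            source.write("int shr%d (void)\n{\n  return %d;\n}\n"
                         % (index, index))
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        sources = generate_solib_sources(directory, 4)
        print(len(sources), "files, e.g.", os.path.basename(sources[0]))
```

Each generated file would then be compiled into its own shared library (for example with `gcc -shared -fPIC'), a step this sketch leaves to the test harness.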
4 Example
4.1 single step
For micro-benchmark `single-step', there are three source files,
`single-step.c', `' and `single-step.exp'.
`single-step.exp' is similar to our regression tests in `gdb.python'
| if ![runto_main] {
|     return -1
| }
| set remote_python_file [remote_download host ${srcdir}/${subdir}/${testfile}.py]
| gdb_test_no_output "python exec (open ('${remote_python_file}').read ())"
| send_gdb "call \$perftest()\n"
| set timeout 300
| gdb_expect {
|     -re "\"Done\".*${gdb_prompt} $" {
|     }
|     timeout {}
| }
| remote_file host delete ${remote_python_file}
`' is to drive GDB to do command `stepi' repeatedly and
record the time usage. Note that class `SingleStep' can be abstracted
in a better way, for example, moving common code to class `TestCase',
and extending it in class `SingleStep'.
| import gdb
| import time
| class SingleStep (gdb.Function):
|     def __init__(self):
|         # Each test has to register a convenience function 'perftest'.
|         super (SingleStep, self).__init__ ("perftest")
|     def execute_test(self):
|         # Execute command 'stepi' a number of times, and record the
|         # time usage.
|         with open ("perftest.log", 'a') as test_log:
|             for i in range(1, 5):
|                 # CPU time used by GDB (time.clock() in older pythons).
|                 start_time = time.process_time()
|                 for j in range(0, i * 300):
|                     gdb.execute ("stepi")
|                 elapsed_time = time.process_time() - start_time
|                 test_log.write ('single step %d in %s\n' % (i * 300, elapsed_time))
|     def invoke(self):
|         self.execute_test()
|         return "Done"
| SingleStep ()
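One way the abstraction mentioned above might look is sketched below. The split between `TestCase' and `SingleStep', and the `step' callable that stands in for `gdb.execute ("stepi")', are assumptions made so that the example also runs outside GDB; registering the `perftest' convenience function is left out for the same reason.

```python
import time


class TestCase(object):
    """Hypothetical base class holding the code common to all tests."""

    def __init__(self, name, log_path="perftest.log"):
        self.name = name
        self.log_path = log_path

    def execute_test(self):
        """Yield (size, elapsed_seconds) pairs; subclasses override."""
        raise NotImplementedError

    def run(self):
        # Append one line per measurement, "<name> <size> in <seconds>".
        with open(self.log_path, "a") as test_log:
            for size, elapsed in self.execute_test():
                test_log.write("%s %d in %s\n" % (self.name, size, elapsed))
        return "Done"


class SingleStep(TestCase):
    def __init__(self, step, log_path="perftest.log"):
        super(SingleStep, self).__init__("single step", log_path)
        # Inside GDB this would be: lambda: gdb.execute("stepi")
        self.step = step

    def execute_test(self):
        for i in range(1, 5):
            start_time = time.process_time()
            for _ in range(i * 300):
                self.step()
            yield i * 300, time.process_time() - start_time
```

A new micro-benchmark would then only need to provide its own `execute_test', while logging stays in one place.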
* Run `single-step' with GDBserver
| $ make check RUNTESTFLAGS='--target_board=native-gdbserver single-step.exp'
and the result `perftest.log' looks like this; each row is the time
usage for doing a certain number of `stepi'
| single step 300 in 0.19
| single step 600 in 0.35
| single step 900 in 0.57
| single step 1200 in 0.75
* Run `single-step' without GDBserver
| $ make check RUNTESTFLAGS='--target_board=unix single-step.exp'
and the result `perftest.log' looks like,
| single step 300 in 0.06
| single step 600 in 0.08
| single step 900 in 0.14
| single step 1200 in 0.18
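The two logs above can be compared mechanically. The following sketch parses the `perftest.log' line format and prints the remote/native ratio per row; the function name `parse_log' and the regular expression are assumptions based only on the sample lines shown here.

```python
import re

# "<name> <count> in <seconds>", e.g. "single step 300 in 0.19"
LINE = re.compile(r"^(?P<name>.+) (?P<count>\d+) in (?P<seconds>[0-9.]+)$")


def parse_log(text):
    """Return a list of (name, count, seconds) from perftest.log text."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            rows.append((match.group("name"), int(match.group("count")),
                         float(match.group("seconds"))))
    return rows


REMOTE = """single step 300 in 0.19
single step 600 in 0.35
single step 900 in 0.57
single step 1200 in 0.75"""

NATIVE = """single step 300 in 0.06
single step 600 in 0.08
single step 900 in 0.14
single step 1200 in 0.18"""

if __name__ == "__main__":
    for (_, count, remote), (_, _, native) in zip(parse_log(REMOTE),
                                                  parse_log(NATIVE)):
        print("%4d steps: remote/native = %.1f" % (count, remote / native))
```

On these numbers, stepping through GDBserver takes between roughly 3 and 4.5 times as long as native stepping.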
4.2 shared library
For micro-benchmark `solib', which tests the performance of GDB
handling shared library load and unload, there are three source
files, `solib.c', `' and `solib.exp'.
`solib.exp' generates many C files, and compiles them into shared
libraries. `solib.c' is the main program, which loads these libraries
dynamically. `' is a python script that calls some inferior
functions to load libraries and measures the time usage.
Here is the performance data, and each row is about the time usage of
handling loading and unloading a certain number of shared libraries.
We can use this data to track the performance of GDB on handling
shared libraries.
| solib 128 in 0.53
| solib 256 in 1.94
| solib 512 in 8.31
| solib 1024 in 47.34
| solib 2048 in 384.75
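One way to read this table is to look at how the time grows each time the number of libraries doubles. The sketch below computes the ratio between neighbouring rows and the corresponding local exponent k in time ~ count**k; the helper name `growth' is made up for this example.

```python
import math

# (number of shared libraries, seconds) from the table above.
SOLIB = [(128, 0.53), (256, 1.94), (512, 8.31), (1024, 47.34),
         (2048, 384.75)]


def growth(rows):
    """For consecutive rows, return (count, time ratio, local exponent).

    The exponent k is the value for which time ~ count**k between two
    neighbouring rows: k = log(t2 / t1) / log(n2 / n1).
    """
    result = []
    for (n1, t1), (n2, t2) in zip(rows, rows[1:]):
        ratio = t2 / t1
        exponent = math.log(ratio) / math.log(float(n2) / n1)
        result.append((n2, ratio, exponent))
    return result


if __name__ == "__main__":
    for count, ratio, exponent in growth(SOLIB):
        print("%5d libs: %.2fx slower, exponent %.2f"
              % (count, ratio, exponent))
```

For this data the local exponent rises from about 1.9 to about 3.0, so each doubling costs more than the previous one and the growth is clearly worse than linear.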
Yao (齐尧)