This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: Benchmarking (was Re: [patch 2/2] Assert leftover cleanups in TRY_CATCH)


On Tue, May 14, 2013 at 5:13 PM, Stan Shebs <stanshebs@earthlink.net> wrote:
> On 5/7/13 7:00 AM, Jan Kratochvil wrote:
>
>>
>> target-side condition evaluation is a good idea:
>>
>> time gdb ./loop -ex 'b 4 if i==360000' -ex r -q -ex 'set confirm no' -ex q
>> real  1m11.586s
>>
>> gdbserver :1234 ./loop
>> time gdb ./loop -ex 'target remote localhost:1234' -ex 'b 4 if i==360000' -ex c -q -ex 'set confirm no' -ex q
>> real  0m21.862s
>>
>> "set breakpoint condition-evaluation target" really helps a lot.
>
> This reminds me of something that has been on my mind recently -
> detecting performance regression with the testsuite.

'Tis on my todo list.
Got a time machine?

IIRC Red Hat had the seeds of something, but it needed more work.

> I added a test for fast tracepoints a while back (tspeed.exp) that also
> went to some trouble to get numbers for fast tracepoint performance,
> although it just reports them; they are not used to pass/fail.
>
> However, if target-side conditionals get worse due to some random
> change, or GDB startup time gets excessive, these are things that we
> know real users care about.  On the other hand, this is hard to test
> automatically, and no one wants to hack dejagnu that much.  Maybe an excuse
> to dabble in a more-modern testing framework?  Are there good options?

re: dejagnu hacking: depends on what's needed.

Sometimes regressions aren't really noticed unless one is debugging a
really large app.
A one-second slowdown might be attributable to many things, but in a
bigger app it could become minutes, and now we're talking real money.

It's trivial enough to write a program to generate apps (of whatever
size, and along whatever axis is useful) for the task at hand.
And it's trivial enough to come up with a set of benchmarks (I have an
incomplete set I use for my standard large-app benchmark), and a
harness to run them.
IWBN if running the tests didn't take a lot of time, but alas some
things only show up at scale.
Plus one needs to run them a sufficient number of times to make the data usable.
Running a benchmark with different-sized tests and comparing the
relative times can help.
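
E.g., a sketch of such a generator (entirely hypothetical, just to
show the shape): emit a synthetic app with N functions, so symbol
reading, "break", completion, etc. can be timed at several sizes:

#include <stdio.h>
#include <stdlib.h>

/* Emit, on stdout, a C file containing NFUNCS trivial functions plus
   a main, for benchmarking gdb against apps of different sizes.  */
int main (int argc, char **argv)
{
  int nfuncs = argc > 1 ? atoi (argv[1]) : 1000;
  int i;

  for (i = 0; i < nfuncs; i++)
    printf ("int func_%d (int x) { return x + %d; }\n", i, i);
  printf ("int main (void)\n{\n  int sum = 0, i;\n");
  printf ("  for (i = 0; i < %d; i++)\n    sum += i;\n", nfuncs);
  printf ("  return sum & 1;\n}\n");
  return 0;
}

Then time something like "gdb -nx -batch -ex 'b main' ./app" at, say,
10k and 100k functions, a handful of runs each, and compare the ratios
rather than the absolute numbers.  (One axis among many, obviously.)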

[One thing one could do is, e.g., run gdb under valgrind and use
instruction counts as a proxy for performance.
(One also needs to measure memory usage.)
It has the property of being deterministic, and with a set of
testcases for each benchmark it could reveal problems.
One can't rely on just this, though, because it doesn't measure, e.g.,
disk/network latency, which can be critical; I'm sure one could write
a tool to approximate the times.
Going this route is slower, of course.]
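
(Concretely, I have in mind something along the lines of

valgrind --tool=callgrind --callgrind-out-file=cg.out \
  gdb -nx -batch -ex 'b 4 if i==360000' -ex r ./loop
callgrind_annotate cg.out | head

taking the total Ir count as the number, and for memory either massif
or just GNU time's max-RSS figure:

/usr/bin/time -v gdb -nx -batch -ex 'b main' ./app

cg.out and ./app being placeholders, and yes, callgrind makes the run
several times slower, per the above.)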

Ultimately I'd expect this to be separate from "make check" (or at
least something one has to ask for explicitly).
But we *do* need something and the sooner the better.

