This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.
Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues
- From: Don Breazeal <donb at codesourcery dot com>
- To: Pedro Alves <palves at redhat dot com>, "gdb-patches at sourceware dot org" <gdb-patches at sourceware dot org>
- Date: Wed, 5 Aug 2015 15:19:44 -0700
- Subject: Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues
- References: <1438362229-27653-1-git-send-email-palves at redhat dot com> <1438362229-27653-3-git-send-email-palves at redhat dot com> <55BBB89B dot 8020101 at codesourcery dot com> <55BBC636 dot 40705 at redhat dot com>
On 7/31/2015 12:02 PM, Pedro Alves wrote:
> On 07/31/2015 07:04 PM, Don Breazeal wrote:
>> On 7/31/2015 10:03 AM, Pedro Alves wrote:
>>> Ref: https://sourceware.org/ml/gdb-patches/2015-07/msg00868.html
>>>
>>> This adds a test that has a multithreaded program have several threads
>>> continuously fork, while another thread continuously steps over a
>>> breakpoint.
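(A minimal sketch of a program with the shape described above; the
names and counts here are made up for illustration, and this is not
the actual test source from the patch.)

/* Several threads fork in a loop while another thread repeatedly
   hits a breakpoint that the debugger has to step over.  */
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_FORKERS 10   /* hypothetical number of forking threads */
#define NUM_FORKS   10   /* forks per forking thread */

/* The test would set a breakpoint on this function; one thread calls
   it continuously so the breakpoint is stepped over again and again.  */
static void
breakpoint_here (void)
{
}

static void *
breakpoint_thread (void *arg)
{
  for (;;)
    breakpoint_here ();
  return NULL;
}

/* Each forking thread forks and waits for the child to exit before
   forking again, so the number of live children stays constant.  */
static void *
forker_thread (void *arg)
{
  int i;

  for (i = 0; i < NUM_FORKS; i++)
    {
      pid_t pid = fork ();

      if (pid == 0)
        _exit (0);              /* Child exits immediately.  */
      waitpid (pid, NULL, 0);   /* Parent waits, then forks again.  */
    }
  return NULL;
}

int
main (void)
{
  pthread_t forkers[NUM_FORKERS], bp_thread;
  int i;

  pthread_create (&bp_thread, NULL, breakpoint_thread, NULL);
  for (i = 0; i < NUM_FORKERS; i++)
    pthread_create (&forkers[i], NULL, forker_thread, NULL);
  for (i = 0; i < NUM_FORKERS; i++)
    pthread_join (forkers[i], NULL);

  exit (0);
}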
>>
>> Wow.
>>
>
> If gdb survives these stress tests, it can hold up to anything. :-)
>
>>> - The test runs with both "set detach-on-fork" on and off. When off,
>>> it exercises the case of GDB detaching the fork child explicitly.
>>> When on, it exercises the case of gdb resuming the child
>>> explicitly. In the "off" case, gdb seems to become exponentially
>>> slower as new inferiors are created. This is _very_ noticeable, as
>>> with only 100 inferiors gdb is already crawling, which makes the
>>> test take quite a while to run. For that reason, I've disabled the
>>> "off" variant for now.
>>
>> Bummer. I was going to ask whether this use-case justifies disabling
>> the feature completely,
>
> Note that this, being a stress test, may not be representative of a
> real workload. I'm assuming most real use cases won't be
> so demanding.
>
>> but since the whole follow-fork mechanism is of
>> limited usefulness without exec events, the question is likely moot
>> anyway.
>
> Yeah. There are use cases with fork alone, but combined with exec it's
> much more useful. I'll take a look at your exec patches soon; I'm very
> much looking forward to having that in.
>
>>
>> Do you have any thoughts about whether this slowdown is caused by the
>> fork event machinery or by some more general gdbserver multiple
>> inferior problem?
>
> Not sure.
>
> The number of forks live at a given time in the test is constant
> -- each thread forks and waits for the child to exit until it forks
> again. But if you run the test, you see that the first
> few inferiors are created quickly, and then as the inferior number
> grows, new inferiors are added at a slower and slower pace.
> I'd suspect the problem to be on the gdb side. But the test
> fails on native, so it's not easy to get gdbserver out of
> the picture for a quick check.
>
> It feels like some data structures are leaking, but
> still reachable, and then a bunch of linear walks end up costing
> more and more. I once added the prune_inferiors call at the end
> of normal_stop to handle a slowdown like this. It feels like
> something similar to that.
>
> With detach "on" alone, it takes under 2 seconds against gdbserver
> for me.
>
> If I remove the breakpoint from the test, and reenable both detach on/off,
> it ends in around 10-20 seconds. That's still a lot slower
> than "detach on" alone, but gdb has to insert/remove breakpoints in the
> child and load its symbols (well, it could avoid that, given the
> child is a clone of the parent, but we're not there yet), so
> not entirely unexpected.
>
> But pristine, with both detach on/off, it takes almost 2 minutes
> here. (And each thread only spawns 10 forks; my first attempt
> was shooting for 100 :-) )
>
> I also suspected all the thread stopping/restarting gdbserver does,
> both to step over breakpoints and to insert/remove breakpoints.
> But then again, with detach on there are 12 threads, and with detach
> off at most 22. So that'd be odd. Unless the data structure
> leaks are on gdbserver's side. But then I'd think that tests
> like attach-many-short-lived-threads.exp or non-stop-fair-events.exp
> would have already exposed something like that.
>
>>
>> Are you planning to look at the slowdown?
>
> Nope, at least not in the immediate future.
>
>> Can I help out? I have an
>> interest in having detach-on-fork 'off' enabled. :-S
>
> That'd be much appreciated. :-) At least identifying the
> culprit would be very nice. I too would love for our
> multi-process support to be rock solid.
>
Hi Pedro,
I spent some time looking at this, and I found at least one of the
culprits affecting performance. Without going through the details of
how I arrived at this conclusion, if I insert
gdb_test_no_output "set sysroot /"
just before the call to runto_main, it cuts the wall clock time by at
least half. Running with just the 'detach-on-fork=off' case, it went
from 41 secs to 20 secs on one system, and 1:21 to 0:27 and 1:50 to 0:41
on another. Successive runs without set sysroot resulted in
successively decreasing run times, presumably due to filesystem caching.
I ran strace -cw to collect wall clock time (strace 4.9 and above
support '-w' for wall time), and saw this:
Without set sysroot /:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.90   14.620339           4   3666141       202 ptrace
 25.21   14.229421          81    175135        57 select
 14.42    8.139715          13    641874         7 write
 10.65    6.012699           4   1397576    670469 read
  7.52    4.245209           4   1205014       104 wait4
  4.90    2.765111           3    847985           rt_sigprocmask

With set sysroot /:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.91    6.885008         148     46665        43 select
 21.59    4.516311           4   1158530       202 ptrace
 11.15    2.332491          13    184229         2 write
  9.07    1.897401           4    422122    203552 read
  6.77    1.415918          42     34076        53 open
  6.27    1.312490           3    378702       103 wait4
  4.00    0.835731           3    262195           rt_sigprocmask
The number of calls and the times for each case varied from run to run,
but the relative proportions stayed reasonably similar. I'm not sure why
the unmodified case makes so many more calls to ptrace, but it was not
an anomaly; I saw this in multiple runs.
Note that I used the original version of the test that you posted, not
the update on your branch. Also, I didn't make the set sysroot command
conditional on running with a remote or gdbserver target, since it was
just an experiment.
Do you think there is more to the slowdown than this? As you said
above, detach-on-fork 'off' is going to take longer than 'on'. It may
be a little while before I can get back to this, so I thought I'd share
what I found. Let me know if you think this change will be sufficient.
thanks
--Don