This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [PATCH] stap/staprun do not terminate properly
- From: David Smith <dsmith at redhat dot com>
- To: Josh Stone <jistone at redhat dot com>, Torsten Polle <Torsten dot Polle at gmx dot de>, systemtap at sourceware dot org, Jonathan Lebon <jlebon at redhat dot com>
- Date: Wed, 12 Mar 2014 09:11:41 -0500
- Subject: Re: [PATCH] stap/staprun do not terminate properly
- Authentication-results: sourceware.org; auth=none
- References: <m2k3c7uiaa dot fsf at gmx dot de> <531A3581 dot 7050009 at redhat dot com> <531E3E77 dot 30701 at redhat dot com>
On 03/10/2014 05:36 PM, Josh Stone wrote:
> On 03/07/2014 01:09 PM, David Smith wrote:
>> So, instead I did this:
>>
>> ====
>> stap_stap_task_finder()
>> {
>> // ...
>>
>> __stp_tf_cancel_task_work();
>>
>> // Note that utrace_exit() calls stp_task_work_exit()
>> utrace_exit();
>> }
>> ====
>>
>> This moves canceling all outstanding task_work items before shutting
>> down utrace (and calling stp_task_work_exit()). I think the end result
>> is the same as your patch, and I think this makes a little more sense.
>> This way we've canceled all the task_work items before shutting down
>> utrace (and freeing all the memory allocated for utrace).
>>
>> If this doesn't work for you or you see a hole in this logic please let
>> me know.
>
> I notice that utrace_exit() calls utrace_shutdown(), but so does
> __stp_task_finder_cleanup() earlier in stap_stop_task_finder(). So in
> fact the shutdown is still happening before canceling this work, and the
> only thing left for utrace_exit() is the kmem_cache_destroy and
> stp_task_work_exit(). Do you think this could be a problem?
>
> In particular, I worry about this comment:
>
> /* After calling tracepoint_synchronize_unregister(), we're
> * sure there are no outstanding tracepoint probes being
> * called. So, now would be a great time to free everything. */
>
> There won't be any outstanding tracepoint handlers, but couldn't there
> still be outstanding task_work scheduled from those handlers?
>
I went over this code yesterday, and just committed a patch (commit
6628352) that tightens things up here. This patch fixes a crash that
Jonathan was seeing. The patch make sure that all the tracepoint probes
and task work items are finished running before the kmem caches are freed.
--
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)