


Re: exercising current aarch64 kprobe support with systemtap


Hi Will,

On 10/06/2016:05:28:36 PM, William Cohen wrote:
> On 06/09/2016 12:17 PM, William Cohen wrote:
> > I have been exercising the current kprobes and uprobe patches for
> > arm64 that are in the test_upstream_arm64_devel branch of
> > https://github.com/pratyushanand/linux with systemtap.  There are
> > two issues that I have seen on this kernel with systemtap.  In some
> > cases kprobes fail to register at places that appear to be
> > reasonable places for a kprobe.  The other issue is that the kernel
> > starts having soft lockups when the hw_watch_addr.stp test runs.  To
> > get systemtap working with the newer kernels, the attached hack is
> > needed because of changes in the aarch64 macro args.
> ...
> > Soft lockup for the hw_watch_addr.stp test
> > 
> > When running the hw_watch_addr.stp test, the machine gets a number of
> > processes using a lot of sys time, and eventually the kernel reports a
> > soft lockup:
> > 
> > http://paste.stg.fedoraproject.org/5323/
> > 
> > The systemtap.base/overload.exp tests all pass, but maybe there is too
> > much work being done to generate the backtraces for hw_watch_addr.stp
> > and that is triggering the problem.
> 
> I can reliably reproduce the soft lockup running a single test with:
> 
> /root/systemtap_write/install/bin/stap --all-modules \
> /root/systemtap_write/systemtap/testsuite/systemtap.examples/memory/hw_watch_addr.stp \
> 0x`grep "vm_dirty_ratio" /proc/kallsyms | awk '{print $1}'` -T 5 > /dev/null
> 
> paste of output and soft lockup at: http://paste.stg.fedoraproject.org/5324/
> 
> One of the things that Jeremy Linton pointed to was:
> 
> https://lkml.org/lkml/2016/3/21/198

Now we have the following check in arch_within_kprobe_blacklist(), so the above
issue should not bite us:

+           !!search_exception_tables(addr))
+               return true;
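
For context, here is roughly how that check sits in arch_within_kprobe_blacklist().
This is only a sketch based on the mainline arm64 code of this era; the section
bounds other than the quoted search_exception_tables() test are my assumption,
not a quote of the patch:

/* Sketch: reject probes placed in kprobes/entry/idmap text, or in code
 * covered by the exception tables (the case discussed above). */
bool arch_within_kprobe_blacklist(unsigned long addr)
{
	if ((addr >= (unsigned long)__kprobes_text_start &&
	     addr < (unsigned long)__kprobes_text_end) ||
	    (addr >= (unsigned long)__entry_text_start &&
	     addr < (unsigned long)__entry_text_end) ||
	    (addr >= (unsigned long)__idmap_text_start &&
	     addr < (unsigned long)__idmap_text_end) ||
	    !!search_exception_tables(addr))
		return true;

	return false;
}

With that in place, register_kprobe() on an address inside an exception-table
region should fail (the generic kprobes code returns -EINVAL for blacklisted
addresses) instead of installing a probe there.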

> 
> Could the aarch64 hardware watchpoint handler have an issue that is causing this problem with the soft lockup?
> Or is it spending too much time doing the stack backtrace?

Not sure; it could be that the locked-up CPU is waiting for a spinlock which is
not being released. I just noticed that taking a backtrace of all active CPUs
(`echo l > /proc/sysrq-trigger`) does not work on arm64, probably because we do
not have arch_trigger_all_cpu_backtrace() defined for aarch64. Maybe we can add
one, like the arm implementation. A backtrace of the CPUs in this state might
give us some input.
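
For reference, a minimal sketch of what an aarch64 arch_trigger_all_cpu_backtrace()
could look like if it mirrors the arm approach built on lib/nmi_backtrace.c. The
IPI_CPU_BACKTRACE number and the smp_cross_call() hook are assumptions carried
over from arch/arm, not existing arm64 code:

/* Hypothetical aarch64 wiring, modeled on arch/arm/kernel/smp.c. */
static void raise_backtrace_ipi(cpumask_t *mask)
{
	/* Deliver an IPI to the target CPUs; each CPU's handler is
	 * expected to call nmi_cpu_backtrace(regs) to dump its stack. */
	smp_cross_call(mask, IPI_CPU_BACKTRACE);
}

void arch_trigger_all_cpu_backtrace(bool include_self)
{
	/* lib/nmi_backtrace.c collects and serializes the per-CPU
	 * traces; the raise callback only delivers the interrupt. */
	nmi_trigger_all_cpu_backtrace(include_self, raise_backtrace_ipi);
}

Even with a regular (maskable) IPI this would at least show the CPUs that are
spinning with interrupts enabled, which may be enough to see who is holding the
lock here.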

~Pratyush

