Bug 3911 - Compilation of systemtap causes the system to crash on a p570 system.
Summary: Compilation of systemtap causes the system to crash on a p570 system.
Status: RESOLVED FIXED
Alias: None
Product: systemtap
Classification: Unclassified
Component: runtime
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Martin Hunt
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-01-24 09:49 UTC by Srinivasa DS
Modified: 2007-02-01 04:16 UTC
CC List: 1 user

See Also:
Host:
Target:
Build:
Last reconfirmed:


Description Srinivasa DS 2007-01-24 09:49:41 UTC
My environment is systemtap (systemtap-20070120.tar.bz2), elfutils
(elfutils-0.124.tar.gz), kernel 2.6.18-1.2961.el5, on a p570 LPARed system.

I was compiling the latest systemtap source on the RHEL5 (2.6.18-1.2961.el5)
kernel and the system dropped to xmon.

The screen looks like this
==================================
FAIL: probefunc:kernel.statement(0xc00000000005be34) compilation
FAIL: probefunc:kernel.function("scheduler_tick") compilation
FAIL: probefunc:kernel.inline("context_switch") compilation
Running
/home/systemtap/tmp/stap_testing_200701240903/src/testsuite/systemtap.base/simple.exp
...
Running
/home/systemtap/tmp/stap_testing_200701240903/src/testsuite/systemtap.base/timeofday.exp
...
FAIL: timeofday test compilation
Running
/home/systemtap/tmp/stap_testing_200701240903/src/testsuite/systemtap.base/timers.exp
...
Running
/home/systemtap/tmp/stap_testing_200701240903/src/testsuite/systemtap.base/tri.exp
...
Running
/home/systemtap/tmp/stap_testing_200701240903/src/testsuite/systemtap.maps/absentstats.exp
...


/////System crashed/////
=========================================

What xmon shows 
=============================
Unable to handle kernel paging request for data at address 0x420000000007f


5:mon> e
cpu 0x5: Vector: 300 (Data Access) at [c000000028a6b310]
    pc: c000000000349f40: ._spin_lock+0x20/0x88
    lr: c0000000000d6640: .__cache_alloc_node+0x4c/0x174
    sp: c000000028a6b590
   msr: 8000000000001032
   dar: 420000000007f
 dsisr: 40000000
  current = 0xc00000002ad74430
  paca    = 0xc000000000464d00
    pid   = 15804, comm = staprun

5:mon> t
[c000000028a6b610] c0000000000d6640 .__cache_alloc_node+0x4c/0x174
[c000000028a6b6b0] c0000000000d6cc8 .kmem_cache_alloc_node+0x104/0x12c
[c000000028a6b750] d000000000aa6f18 ._stp_map_init+0xa0/0x150
[stap_a0def33066a20a930648d1bcbf25f718_896]
[c000000028a6b800] d000000000aa70d4 ._stp_pmap_new+0x10c/0x1f0
[stap_a0def33066a20a930648d1bcbf25f718_896]
[c000000028a6b8c0] d000000000aa7328 ._stp_pmap_new_ix+0x170/0x28c
[stap_a0def33066a20a930648d1bcbf25f718_896]
[c000000028a6b970] d000000000aa757c .systemtap_module_init+0x138/0x254
[stap_a0def33066a20a930648d1bcbf25f718_896]
[c000000028a6ba20] d000000000aa76a8 .probe_start+0x10/0x2c
[stap_a0def33066a20a930648d1bcbf25f718_896]
[c000000028a6baa0] d000000000aa7734 ._stp_handle_start+0x70/0x10c
[stap_a0def33066a20a930648d1bcbf25f718_896]
[c000000028a6bbb0] d000000000aa79b8 ._stp_proc_write_cmd+0x1e8/0x9b0
[stap_a0def33066a20a930648d1bcbf25f718_896]
[c000000028a6bcf0] c0000000000e026c .vfs_write+0x118/0x200
[c000000028a6bd90] c0000000000e09dc .sys_write+0x4c/0x8c
[c000000028a6be30] c00000000000869c syscall_exit+0x0/0x40
--- Exception: c00 (System Call) at 0000008072e9461c
SP (ffff86eee90) is in userspace


5:mon> di c000000000349f40   (PC value)
c000000000349f40  7d20f828      lwarx   r9,r0,r31
c000000000349f44  2c090000      cmpwi   r9,0
c000000000349f48  40820010      bne     c000000000349f58        #
._spin_lock+0x38/0x88
c000000000349f4c  7c00f92d      stwcx.  r0,r0,r31
c000000000349f50  40a2fff0      bne     c000000000349f40        #
._spin_lock+0x20/0x88



5:mon> r
R00 = 0000000080000005   R16 = 0000000010005b18
R01 = c000000028a6b590   R17 = 0000000010005b10
R02 = c000000000578f80   R18 = 00000ffff86ef390
R03 = 000420000000007f   R19 = 0000000010005b38
R04 = 00000000000012d0   R20 = 0000008072d50698
R05 = 0000000000000010   R21 = 0000000010005c28
R06 = 0000000000000030   R22 = 00000000100170c8
R07 = 0000000000000220   R23 = 0000000000000002
R08 = 0000000000000010   R24 = 0000000000000010
R09 = c00000002dfe2980   R25 = 0000000000000800
R10 = c000000000605448   R26 = 0000000000000010
R11 = 0000000000000000   R27 = c00000002dfe2900
R12 = d000000000aaa3f0   R28 = 8000000000009032
R13 = c000000000464d00   R29 = 00000000000012d0
R14 = 00000000100170c0   R30 = c0000000004a59b8
R15 = 0000000010005638   R31 = 000420000000007f
pc  = c000000000349f40 ._spin_lock+0x20/0x88
lr  = c0000000000d6640 .__cache_alloc_node+0x4c/0x174
msr = 8000000000001032   cr  = 24000444
ctr = c0000000000d6cf0   xer = 0000000020000000   trap =  300
dar = 000420000000007f   dsisr = 40000000



======================================

Thanks
 Srinivasa DS
Comment 1 Mike Mason 2007-01-24 15:13:25 UTC
I doubt this was caused by the compilation.  The tests run in parallel, so it's
difficult to associate what you see on the screen with the test that caused the
problem.  Is this crash repeatable?

Also, you're using an old elfutils.  Try using elfutils-0.125 and see if you can
repeat the crash.
Comment 2 Frank Ch. Eigler 2007-01-25 20:27:32 UTC
(In reply to comment #0)
> I was compiling latest systemtap source on  rhel5(2.6.18-1.2961.el5) kernel and
> system dropped to xmon.

Not really just compiling: you were running test cases.

> What xmon shows 
> =============================
> Unable to handle kernel paging request for data at address 0x420000000007f
> 5:mon> e
> cpu 0x5: Vector: 300 (Data Access) at [c000000028a6b310]
>     pc: c000000000349f40: ._spin_lock+0x20/0x88

This resembles memory corruption that may even predate this systemtap module.
It certainly happens *before* any probes are even registered, let alone run.  The
runtime is the only part more or less running by this time.
Comment 3 Srinivasa DS 2007-01-29 08:30:53 UTC
I am able to reproduce this bug on the latest upstream kernel, 2.6.20-rc6, as well.

Comment 4 Srinivasa DS 2007-01-30 07:17:37 UTC
Here is my analysis of this bug
======================================
 1) Looking at the backtrace, _stp_map_init() calls kmalloc_node() with the cpu
number as an argument. kmalloc_node() is the same as kmalloc() if NUMA is not
configured, and calls kmem_cache_alloc_node() if NUMA is configured.

_stp_map_init() is called by _stp_pmap_new() inside a for_each_cpu() loop.

 
=============================================
static int _stp_map_init(MAP m, unsigned max_entries, int type, int key_size,
                         int data_size, int cpu)
{
        int size;

.....................................
.....................................
        for (i = 0; i < max_entries; i++) {
                if (cpu < 0)
                        tmp = kmalloc(size, STP_ALLOC_FLAGS);
                else
                        tmp = kmalloc_node(size, STP_ALLOC_FLAGS, cpu);

                if (!tmp)
                        return -1;

                dbug ("allocated %lx\n", (long)tmp);
=========================================================================
static PMAP _stp_pmap_new(unsigned max_entries, int type, int key_size,
                          int data_size)
{
        int i;
        MAP map, m;

        PMAP pmap = (PMAP) kmalloc(sizeof(struct pmap), STP_ALLOC_FLAGS);
        if (pmap == NULL)
                return NULL;
.........................
.........................
        for_each_cpu(i) {
                m = per_cpu_ptr (map, i);
                if (_stp_map_init(m, max_entries, type, key_size, data_size, i)) {
                        goto err1;
                }
        }
============================================

2) Since NUMA is configured on my system, kmalloc_node() calls
kmem_cache_alloc_node() with "cpu" as the nodeid.
====================================================
#ifdef CONFIG_NUMA
extern void *__kmalloc_node(size_t size, gfp_t flags, int node);

static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
{
        if (__builtin_constant_p(size)) {
                int i = 0;
#define CACHE(x) \
.......................
..............................
                return kmem_cache_alloc_node((flags & GFP_DMA) ?
                                malloc_sizes[i].cs_dmacachep :
                                malloc_sizes[i].cs_cachep, flags, node);
        }
========================================================
3) This means the systemtap code expects the number of NUMA nodes to be the same
as the number of CPUs.
  kmem_cache_alloc_node() in turn calls ____cache_alloc_node(), where
cachep->nodelists[nodeid] yields a wrong address, because on my system the
number of nodes is smaller than the number of CPUs.
================================
Mount-cache hash table entries: 4096
Processor 1 found.
Processor 2 found.
Processor 3 found.
Processor 4 found.
Processor 5 found.
Processor 6 found.
Processor 7 found.
Brought up 8 CPUs
Node 0 CPUs: 0-7
Node 1 CPUs:
Node 2 CPUs:
Node 3 CPUs:
sizeof(vma)=176 bytes
sizeof(page)=56 bytes
sizeof(inode)=560 bytes
=================================

For example, I have 8 CPUs in my system and only 4 NUMA nodes; passing a CPU
number as the nodeid makes cachep->nodelists[nodeid] return a bogus address,
and that is what causes the oops.
============================================================
static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
                                int nodeid)
{
        struct list_head *entry;
        struct slab *slabp;
        struct kmem_list3 *l3;
        void *obj;
        int x;

        l3 = cachep->nodelists[nodeid];  <<<========PC is here
        BUG_ON(!l3);
============================================

Hence, the systemtap module's assumption that the number of NUMA nodes equals
the number of CPUs is what causes this bug.
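
For illustration only (this is not the actual committed fix, and
stp_alloc_on_cpu() is a hypothetical name), one way to avoid indexing
nodelists[] with a CPU number would be to translate the CPU into its NUMA node
with cpu_to_node() first:
============================================================
#include <linux/slab.h>
#include <linux/topology.h>     /* cpu_to_node() */

static void *stp_alloc_on_cpu(size_t size, gfp_t flags, int cpu)
{
        if (cpu < 0)
                return kmalloc(size, flags);

        /* the nodeid argument must be a node number, never a cpu number */
        return kmalloc_node(size, flags, cpu_to_node(cpu));
}
============================================================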


Martin 
 Any ideas??

Thanks
 Srinivasa DS
Comment 5 Frank Ch. Eigler 2007-01-30 12:15:54 UTC
Very nice analysis.
Comment 6 Martin Hunt 2007-01-30 14:36:46 UTC
Looks like there is a missing cpu-to-node mapping there.  It's probably time to
just give up on NUMA for older kernels and use alloc_percpu or percpu_alloc
as it is in the kernel.
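
For context only (not Martin's actual patch; the struct and function names are
illustrative, assuming a 2.6-era per-cpu API and the modern
for_each_possible_cpu() spelling of the loop), a rough sketch of that
alloc_percpu() approach could look like this:
============================================================
#include <linux/percpu.h>
#include <linux/errno.h>

struct stp_map_hdr { int num; };        /* stand-in for the real MAP data */

static struct stp_map_hdr *stp_maps;    /* one copy per possible cpu */

static int stp_pmap_alloc(void)
{
        int i;

        /* alloc_percpu() zeroes the memory and chooses the placement itself,
         * so no cpu or node number ever reaches the slab allocator here. */
        stp_maps = alloc_percpu(struct stp_map_hdr);
        if (stp_maps == NULL)
                return -ENOMEM;

        for_each_possible_cpu(i) {
                struct stp_map_hdr *m = per_cpu_ptr(stp_maps, i);
                m->num = 0;             /* per-cpu initialization goes here */
        }
        return 0;
}

static void stp_pmap_free(void)
{
        free_percpu(stp_maps);
}
============================================================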
Comment 7 Martin Hunt 2007-01-30 17:15:26 UTC
I've checked in a fix for this to CVS. I tested it on SMP systems running
kernels 2.6.9 and 2.6.19. Please let me know if this does not fix your problems.
Comment 8 Srinivasa DS 2007-02-01 04:16:12 UTC
Martin
 It fixes my problem.

Thanks
 Srinivasa DS