This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug kprobes/2387] system crash on ppc64/2.6.15.4


------- Additional Comments From guanglei at cn dot ibm dot com  2006-02-23 15:23 -------
The following is the disassembly given by objdump:

Disassembly inside __find_get_block:
c0000000000b1934:    mr      r31,r6
c0000000000b1938:    bne-    cr7,c0000000000b1a68 <.__find_get_block+0x238>
c0000000000b193c:    bl      c0000000000b0b94 <.__find_get_block_slow>
c0000000000b1940:    mr.     r31,r3
c0000000000b1944:    beq-    c0000000000b1a68 <.__find_get_block+0x238>
c0000000000b1948:    li      r27,0
c0000000000b194c:    mfmsr   r0


Disassembly around __find_get_block_slow:
c0000000000b0b8c <.sys_fdatasync>:
c0000000000b0b8c:    li      r4,1
c0000000000b0b90:    b       c0000000000b0a10 <.do_fsync>

c0000000000b0b94 <.__find_get_block_slow>:
c0000000000b0b94:    mflr    r0
c0000000000b0b98:    std     r24,-64(r1)
c0000000000b0b9c:    std     r25,-56(r1)
c0000000000b0ba0:    std     r28,-32(r1)
c0000000000b0ba4:    std     r29,-24(r1)
c0000000000b0ba8:    mr      r24,r4

But I wonder whether the information given by xmon is useful. I tried several 
times; it crashes every time and shows a different exception and backtrace. I 
noticed that all of these errors include:

Unable to handle kernel paging request for data at address ...


--------------- Test One ---------------------------------

Unable to handle kernel paging request for data at address 0x00000010
Faulting instruction address: 0xd000000000270ee4
cpu 0x1: Vector: 300 (Data Access) at [c000000040dab3f0]
    pc: d000000000270ee4: ._stp_print_flush+0xb8/0x164 [stap_7259]
    lr: d000000000273cb4: .probe_4+0x374/0x400 [stap_7259]
    sp: c000000040dab670
   msr: 8000000000001032
   dar: 10
 dsisr: 40000000
  current = 0xc00000002a351040
  paca    = 0xc000000000538400
    pid   = 9179, comm = dbench
enter ? for help

1:mon> t
[c000000040dab720] d000000000273cb4 .probe_4+0x374/0x400 [stap_7259]
[c000000040dab7c0] d000000000273e6c .dwarf_kprobe_4_enter+0x12c/0x1c8 [stap_7259]
[c000000040dab840] c000000000419164 .trampoline_probe_handler+0xb0/0x150
[c000000040dab8e0] c00000000041959c .kprobe_exceptions_notify+0x334/0x5e8
[c000000040dab9a0] c00000000041a134 .notifier_call_chain+0x68/0x98
[c000000040daba30] c000000000418834 .program_check_exception+0x114/0x5d0
[c000000040dabad0] c000000000004348 program_check_common+0xc8/0x100
--- Exception: 700 (Program Check) at c00000000002a3bc kretprobe_trampoline+0x0/0x8
[c000000040dabe30] c00000000002a3bc kretprobe_trampoline+0x0/0x8
--- Exception: c01 (System Call) at 000000000ff201b8
SP (ff9000b0) is in userspace
1:mon> 

----------- Test Two -----------------------------------

localhost.localdomain login: Unable to handle kernel paging request for data at address 0x00000010
Faulting instruction address: 0xd000000000270ee4
cpu 0x1: Vector: 300 (Data Access) at [c000000066eeb500]
    pc: d000000000270ee4: ._stp_print_flush+0xb8/0x164 [stap_3949]
    lr: d0000000002736dc: .probe_3+0x374/0x400 [stap_3949]
    sp: c000000066eeb780
   msr: 8000000000001032
   dar: 10
 dsisr: 40000000
  current = 0xc000000002423040
  paca    = 0xc000000000538400
    pid   = 17224, comm = env
enter ? for help
1:mon> t
[c000000066eeb830] d0000000002736dc .probe_3+0x374/0x400 [stap_3949]
[c000000066eeb8d0] d0000000002738a4 .dwarf_kprobe_3_enter+0x13c/0x1d8 [stap_3949]
[c000000066eeb950] c00000000041959c .kprobe_exceptions_notify+0x334/0x5e8
[c000000066eeba10] c00000000041a134 .notifier_call_chain+0x68/0x98
[c000000066eebaa0] c000000000418834 .program_check_exception+0x114/0x5d0
[c000000066eebb40] c000000000004348 program_check_common+0xc8/0x100
--- Exception: 700 (Program Check) at c00000000000ae38 .ppc_newuname+0x14/0x120
[link register   ] c00000000002a3bc kretprobe_trampoline+0x0/0x8
[c000000066eebe30] c000000000004760 .handle_page_fault+0x20/0x54 (unreliable)
--- Exception: c01 (System Call) at 000000000ffe2958
SP (fff6a970) is in userspace
1:mon> 

----------------------------------------------------------


kprobe_exceptions_notify can be triggered by either a breakpoint or a 
single-step trap. It checks the cause, and if it was triggered by a breakpoint, 
it invokes kprobe_handler, which in turn invokes kprobe->pre_handler, i.e. the 
probe handlers. The output of stap -p3 shows:
 dwarf_kprobe_1[i].pre_handler = &dwarf_kprobe_1_enter;

So I think the exception notification path *could* result in launching into a 
kprobe. Am I missing something?



-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=2387

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

