I was testing systemtap snapshot on x86_64 with the 2.6.22 kernel. I usually start with the existing config file for the distro's kernel config on the machine, e.g. /boot/config-2.6.9-55.EL for RHEL4 machine. The 2.6.22-rc2 kernel on x86_64 machine that was built starting with rhel4 config file work. The 2.6.22-rc2 kernel on x86_64 machine starting with the fc7 config oopsed. The oops showed it dying in the kprobes arc_arm_kprobe, the instruction writing the actual break point when attempting place a probe for the following test: Running /home/wcohen/stap_snap_200705221118/src/testsuite/systemtap.base/equal.exp ... The point it is dying is arch_arm_kprobe+0x4/0x8 in the function below: ffffffff804400cf <arch_arm_kprobe>: ffffffff804400cf: 48 8b 47 30 mov 0x30(%rdi),%rax ffffffff804400d3: c6 00 cc movb $0xcc,(%rax) ffffffff804400d6: c3 retq Unfortunately, this laptop machine does not have a serial port, so it isn't easy to get complete back trace. -Will
Created attachment 1853 [details] oops output from config_x86_64_fc7 This is the output from the oops that occurred for the problem configuration.
Created attachment 1854 [details] problem kernel config file This is the configuration used for the kernel that oopsed.
Created attachment 1855 [details] kernel config that works. The kernel built with the config_x86_64_rhel4 works on the same machine.
The following option in the .config looks like a possible cause: CONFIG_DEBUG_RODATA=y Shouldn't CONFIG_KPROBES disable this?
CONFIG_DEBUG_RODATA=y has been there in Fedora kernels for some time. AFAICT, because CONFIG_HOTPLUG_CPU is set it's only getting applied to _etext..__end_rodata, which does not include the code sections. It still seems odd, since I would expect num_cpus_possible to return 1 on some machines anyway. Also, the i386 code is similar and I would expect it to be write-protecting the text for num_possible_cpus()==1 too. So there is something else going on here.
I realized after submitting that comment CONFIG_DEBUG_RODATA is set on the working f7 kernels. Is there anything other configuration that would prevent the actual writing of the breakpoint into the kernel text section?
From your systemtap .ko cache, you should be able to find the C code, which in turn identifies the address at which the probe was attempted to be placed.
I added a diagnostic print to the C code, manually compiled, then run the module. The machine remained up to get get oops from the /var/log/messages. Looking at the disassembled kernel image it appears to be trying to put the probe in the correct place for 'probe kernel.function("schedule")': ffffffff8043ccc0 <schedule>: ffffffff8043ccc0: 55 push %rbp ffffffff8043ccc1: 48 89 e5 mov %rsp,%rbp ffffffff8043ccc4: 41 57 push %r15 The output in /var/log/messages: May 22 15:27:43 dhcp231-176 kernel: orig address = 0000000000235cc0, relocated address = ffffffff8043ccc0 May 22 15:27:43 dhcp231-176 kernel: Unable to handle kernel paging request at ffffffff8043ccc0 RIP: May 22 15:27:43 dhcp231-176 kernel: [<ffffffff804400d3>] arch_arm_kprobe+0x4/0x8 May 22 15:27:43 dhcp231-176 kernel: PGD 203067 PUD 205063 PMD 37ce5163 PTE 43c161 May 22 15:27:43 dhcp231-176 kernel: Oops: 0003 [1] SMP May 22 15:27:43 dhcp231-176 kernel: CPU 0 May 22 15:27:43 dhcp231-176 kernel: Modules linked in: stap_2727 autofs4 hidp rfcomm l2cap bluetooth ipv6 nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables cpufreq_ondemand video sbs button dock battery ac lp loop snd_intel8x0 snd_seq_dummy snd_intel8x0m snd_ac97_codec ac97_bus snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss 8139cp fw_ohci bcm43xx ieee80211softmac ieee80211 fw_core 8139too mii ieee80211_crypt snd_pcm snd_timer parport_pc crc_itu_t parport snd serio_raw soundcore snd_page_alloc sr_mod k8temp cdrom shpchp i2c_nforce2 i2c_core hwmon joydev sg dm_snapshot dm_zero dm_mirror dm_mod sata_nv ata_generic pata_amd libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd May 22 15:27:43 dhcp231-176 kernel: Pid: 2781, comm: staprun Not tainted 2.6.22-rc2 #1 May 22 15:27:43 dhcp231-176 kernel: RIP: 0010:[<ffffffff804400d3>] [<ffffffff804400d3>] arch_arm_kprobe+0x4/0x8 May 22 15:27:43 dhcp231-176 kernel: RSP: 0018:ffff810030b59dd0 EFLAGS: 00010212 May 22 15:27:43 dhcp231-176 kernel: RAX: ffffffff8043ccc0 RBX: 0000000000000000 RCX: 0000000000000282 May 22 15:27:43 dhcp231-176 kernel: RDX: 0000000000000000 RSI: 0000000000000212 RDI: ffffffff88408e60 May 22 15:27:43 dhcp231-176 kernel: RBP: ffffffff88408e60 R08: ffffffff8055fa18 R09: 0000000000000010 May 22 15:27:43 dhcp231-176 kernel: R10: ffff810002031d00 R11: ffffffff80323fd7 R12: 0000000000000000 May 22 15:27:43 dhcp231-176 kernel: R13: ffff810030b59f50 R14: 0000000000000000 R15: 0000000000000000 May 22 15:27:43 dhcp231-176 kernel: FS: 00002b470d28c8f0(0000) GS:ffffffff805a4000(0000) knlGS:0000000000000000 May 22 15:27:43 dhcp231-176 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b May 22 15:27:43 dhcp231-176 kernel: CR2: ffffffff8043ccc0 CR3: 0000000030b1d000 CR4: 00000000000006e0 May 22 15:27:43 dhcp231-176 kernel: Process staprun (pid: 2781, threadinfo ffff810030b58000, task ffff8100322250c0) May 22 15:27:43 dhcp231-176 kernel: Stack: ffffffff804417f3 ffffffff8043ccc0 0000000000000000 ffffffff8840478c May 22 15:27:43 dhcp231-176 kernel: ffffffff88402f6b 0000000000000292 0000000000000292 ffff810030b59e58 May 22 15:27:43 dhcp231-176 kernel: 0000000000000008 00007fff9d833c74 ffffffff88402ff2 00007fff9d833c74 May 22 15:27:43 dhcp231-176 kernel: Call Trace: May 22 15:27:43 dhcp231-176 kernel: [<ffffffff804417f3>] __register_kprobe+0x28a/0x2ed May 22 15:27:43 dhcp231-176 kernel: [<ffffffff8043ccc0>] __sched_text_start+0x0/0x8d7 May 22 15:27:43 dhcp231-176 kernel: [<ffffffff88402f6b>] :stap_2727:systemtap_module_init+0x260/0x2de May 22 15:27:43 dhcp231-176 kernel: [<ffffffff88402ff2>] :stap_2727:probe_start+0x9/0x13 May 22 15:27:43 dhcp231-176 kernel: [<ffffffff8840301d>] :stap_2727:_stp_handle_start+0x21/0x7c May 22 15:27:43 dhcp231-176 kernel: [<ffffffff884031f0>] :stap_2727:_stp_ctl_write_cmd+0x178/0x6ee May 22 15:27:43 dhcp231-176 kernel: [<ffffffff802e7016>] file_has_perm+0x94/0xa3 May 22 15:27:43 dhcp231-176 kernel: [<ffffffff80231798>] do_fork+0x11f/0x1a8 May 22 15:27:43 dhcp231-176 kernel: [<ffffffff8028e3ad>] vfs_write+0xce/0x177 May 22 15:27:43 dhcp231-176 kernel: [<ffffffff8028e970>] sys_write+0x45/0x6e May 22 15:27:43 dhcp231-176 kernel: [<ffffffff80209b9e>] system_call+0x7e/0x83 May 22 15:27:43 dhcp231-176 kernel: [<ffffffff80209b9e>] system_call+0x7e/0x83 May 22 15:27:43 dhcp231-176 kernel: May 22 15:27:43 dhcp231-176 kernel: May 22 15:27:43 dhcp231-176 kernel: Code: c6 00 cc c3 48 8b 57 30 8a 47 68 88 02 c3 48 8b 97 08 01 00 May 22 15:27:43 dhcp231-176 kernel: RIP [<ffffffff804400d3>] arch_arm_kprobe+0x4/0x8 May 22 15:27:43 dhcp231-176 kernel: RSP <ffff810030b59dd0> May 22 15:27:43 dhcp231-176 kernel: CR2: ffffffff8043ccc0
Put im some diagnostic printk in mark_rodata_ro() for the 2.6.22-rc2. Below is the output: May 22 17:31:36 dhcp231-176 kernel: num_possible_cpus()= 1 May 22 17:31:36 dhcp231-176 kernel: start= ffffffff80207000 end= ffffffff80556000 May 22 17:31:36 dhcp231-176 kernel: Write protecting the kernel read-only data: 3388k Looking up the address from the nm: ffffffff80207000 T _stext ffffffff80556090 R __end_rodata Looks like all the kprobe address falls between those two. ffffffff8043cd00 T schedule This is uniprocessor machine so the process is marking the text as read only too. Thus, the following start point is not used: ffffffff804449a8 A _etext
Looks like some changes at the beginning of May expanded the range of data that CONFIG_DEBUG_RODATA protected: http://www.linux-mips.org/git?p=linux.git;a=commitdiff;h=6fb14755a676282a4e6caa05a08c92db8e45cfff;hp=d01ad8dd56527be72947b4b9997bb2c05783c3ed#patch2 This isn't going to work well with things like kprobes that expect to be able to put software break points in code.
(In reply to comment #10) > This isn't going to work well with things like kprobes that expect to be able to > put software break points in code. Perhaps kprobes can add code that overrides the page protection when adding a probe? It may be able to add write access the page long enough to write the breakpoint, and then turn the write back off...
It is certainly possible to change the page protections on the fly as needed, but it may make probe insertion and removal too costly. kprobes maintainers should experiment and measure that overhead. The other option is to change the CONFIG_DEBUG_RODATA code to start protecting only from _etext and not _stext when #ifdef CONFIG_KPROBES.
Some entertaining links: http://www.archivum.info/linux.kernel/2005-11/msg04473.html ... according to which, RODATA should not be on by default, and is expected to hurt kprobes http://lkml.org/lkml/2007/5/25/155 ... according to which, kprobes should work around RODATA
Fix for i386 and x86_64 architectures has been posted to lkml. http://lkml.org/lkml/2007/6/7/2 http://lkml.org/lkml/2007/6/7/3 Thanks Prasanna
The fix (both patches) has been pulled into the -mm tree, but the bug is in Linus's 2.6.22-rc* kernels. Thanks, Prasanna. Please update this bugzilla when the patches make it into Linus's kernel.