This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [BUG] syscall.unlink no longer works after upgrading kernel to 3.7.3
- From: Zheng Da <zhengda1936 at gmail dot com>
- To: Josh Stone <jistone at redhat dot com>
- Cc: Mark Wielaard <mjw at redhat dot com>, agentzh <agentzh at gmail dot com>, "systemtap at sourceware dot org" <systemtap at sourceware dot org>
- Date: Tue, 28 May 2013 19:52:24 -0400
- Subject: Re: [BUG] syscall.unlink no longer works after upgrading kernel to 3.7.3
- References: <CAB4Tn6PdW3GOa09z_tfjQs=F+7XLOqMr5+c5GourX5e0v8FMeQ at mail dot gmail dot com> <1360054656 dot 3837 dot 13 dot camel at bordewijk dot wildebeest dot org> <51114188 dot 60400 at redhat dot com> <CAFLer83DQhCQg7Y3NKR0EUYePzp+fETDTeYEthUXKarAySM0_g at mail dot gmail dot com> <20130528191449 dot GA31042 at toonder dot wildebeest dot org> <CAFLer81NE1bocCbufPTtLLZ-pZz2kVA5r3rKoCgjmc_6w+fwng at mail dot gmail dot com> <51A511FD dot 8010006 at redhat dot com> <20130528203223 dot GA768 at toonder dot wildebeest dot org> <CAFLer80-zMFK9-qqGFdsiacrYPmzD=-qxdMO_a+TP71XbpGxXA at mail dot gmail dot com> <51A534E8 dot 8000209 at redhat dot com>
Hello,
On Tue, May 28, 2013 at 6:51 PM, Josh Stone <jistone@redhat.com> wrote:
> On 05/28/2013 02:35 PM, Zheng Da wrote:
>> Yes, it's my own script. Here is the code:
>> probe kernel.function("scsi_device_unbusy") {
>> if ($sdev->host->host_no == 9 && $sdev->id == 1) {
>> printf("sdev on node %d, host on node %d\n",
>> addr_to_node($sdev), addr_to_node($sdev->host));
>> exit();
>> }
>> }
>> The script works in Linux 3.2.12.
>
> Ok, this also works on 3.9.2-200.fc18.x86_64. I don't hit that
> particular host_no+id on my machine, but it is hitting the probe.
>
>> systemtap actually can find the right location of scsi_device_unbusy,
>> but it doesn't show its parameters.
>> $ stap -L 'kernel.function("scsi_device_unbusy")'
>> kernel.function("scsi_device_unbusy@drivers/scsi/scsi_lib.c:318")
>>
>> I run eu-readelf -N --debug-dump=info
>> /usr/lib/debug/lib/modules/3.8.12/vmlinux and the info of
>> scsi_device_unbusy is shown below:
>> [43d72e9] subprogram
>> external (flag) Yes
>> name (strp) "scsi_device_unbusy"
>> decl_file (data1) 1
>> decl_line (data2) 318
>> prototyped (flag) Yes
>> low_pc (addr) 0xffffffff81480e80
>> high_pc (addr) 0xffffffff81480f44
>> frame_base (data4) location list [e061d3]
>> sibling (ref4) [43d7492]
>> [43d730b] formal_parameter
>> name (strp) "sdev"
>> decl_file (data1) 1
>> decl_line (data2) 318
>> type (ref4) [43d22ff]
>> location (data4) location list [e06233]
> ...
>> Josh, when you say "DWARF dump", do you mean the output of eu-readelf
>> as I did above?
>
> Yep, that's great. Next, can you try --debug-dump=loc and see the list
> at [e06233] for sdev? This will hopefully reveal why it's not
> available. On my Fedora 18 kernel, I get:
>
>> [3b28227] subprogram
>> external (flag_present) Yes
>> name (strp) "scsi_device_unbusy"
>> decl_file (data1) 1
>> decl_line (data2) 323
>> prototyped (flag_present) Yes
>> low_pc (addr) 0xffffffff81420310
>> high_pc (addr) 0xffffffff814203d4
>> frame_base (exprloc)
>> [ 0] call_frame_cfa
>> GNU_all_call_sites (flag_present) Yes
>> sibling (ref4) [3b28405]
>> [3b28245] formal_parameter
>> name (strp) "sdev"
>> decl_file (data1) 1
>> decl_line (data2) 323
>> type (ref4) [3b22271]
>> location (sec_offset) location list [d6aa75]
> ...
>> [d6aa75] 0xffffffff81420315..0xffffffff8142033f [ 0] reg5
>> 0xffffffff8142033f..0xffffffff814203a6 [ 0] reg3
>> 0xffffffff814203a6..0xffffffff814203b4 [ 0] GNU_entry_value:
>> [ 0] reg5
>> [ 3] stack_value
>> 0xffffffff814203b4..0xffffffff814203d4 [ 0] reg3
>
> You can see that my function starts at 420310, yet sdev is first
> specified at 420315. That's the 5-byte fentry call still padding it
> away from the start, also seen in objdump -d:
>
>> ffffffff81420310 <scsi_device_unbusy>:
>> ffffffff81420310: e8 eb 92 24 00 callq ffffffff81669600 <__fentry__>
>> ffffffff81420315: 55 push %rbp
>> ffffffff81420316: 48 89 e5 mov %rsp,%rbp
>> ffffffff81420319: 48 83 ec 20 sub $0x20,%rsp
I think I can also see the 5-byte difference here.
[e061d3] 0xffffffff81480e80..0xffffffff81480e86 [ 0] breg7 8
0xffffffff81480e86..0xffffffff81480e89 [ 0] breg7 16
0xffffffff81480e89..0xffffffff81480f43 [ 0] breg6 16
0xffffffff81480f43..0xffffffff81480f44 [ 0] breg7 8
[e06233] 0xffffffff81480e85..0xffffffff81480eae [ 0] reg5
0xffffffff81480edf..0xffffffff81480f36 [ 0] reg3
[e06269] 0xffffffff81480ea3..0xffffffff81480eae [ 0] breg5 0
0xffffffff81480eae..0xffffffff81480eb2 [ 0] breg3 0
0xffffffff81480eb2..0xffffffff81480f3e [ 0] reg13
Output of objdump -d:
ffffffff81480e80 <scsi_device_unbusy>:
ffffffff81480e80: e8 7b 6c 22 00 callq ffffffff816a7b00 <__fentry__>
ffffffff81480e85: 55 push %rbp
ffffffff81480e86: 48 89 e5 mov %rsp,%rbp
ffffffff81480e89: 48 83 ec 20 sub $0x20,%rsp
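To make the 5-byte gap concrete, here is a minimal Python sketch (not systemtap code) that looks up sdev in the [e06233] location list at the function entry and again just past the __fentry__ call. The addresses are copied from the dumps above; the helper name location_at is only illustrative, and the ranges are treated as half-open, as in DWARF location lists.

```python
# Location list [e06233] for "sdev", copied from the eu-readelf output above.
# Each entry is (start, end, location); the end address is exclusive.
SDEV_RANGES = [
    (0xffffffff81480e85, 0xffffffff81480eae, "reg5"),
    (0xffffffff81480edf, 0xffffffff81480f36, "reg3"),
]

FUNC_ENTRY = 0xffffffff81480e80   # low_pc of scsi_device_unbusy
FENTRY_CALL_LEN = 5               # e8 opcode + 4-byte displacement


def location_at(pc, ranges):
    """Return the location covering pc, or None if no range covers it."""
    for start, end, loc in ranges:
        if start <= pc < end:
            return loc
    return None


# At the function entry no range applies, so sdev has no known location.
print(location_at(FUNC_ENTRY, SDEV_RANGES))
# Just past the 5-byte call to __fentry__, sdev is in reg5.
print(location_at(FUNC_ENTRY + FENTRY_CALL_LEN, SDEV_RANGES))
```

This only illustrates why a probe placed at low_pc sees no parameters while one placed 5 bytes later would; it does not reproduce what dwflpp::pr15123_retry_addr actually does.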
>
> But in my case, the heuristic of stap 45b02a36 appears to be working. I
> can also set environment PR15123_DISABLE=1, and it will fail the same as
> for you. Perhaps you could step through dwflpp::pr15123_retry_addr, and
> see what's happening?
>
> My best guess at this point is the check for "-mfentry" in
> DW_AT_producer. I found a Yocto commit where they forced gcc to have
> -grecord-gcc-switches, exactly for SystemTap's benefit, but then I'm not
> sure why Fedora is able to manage without that option.
>
> http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-3.8/commit/?h=standard/base&id=d9a45e3325030f7bd6f37947a7a0b12da7f602c3
>
I added lines to the systemtap source code to print a message when
dwflpp::pr15123_retry_addr returns 0.
The problem is that the producer string returned by
dwarf_formstring(&cudie_producer) is "GNU C 4.6.3", which does not
contain "-mfentry". I guess that is what you meant.
Do you want me to rebuild the kernel with -grecord-gcc-switches?
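As a rough Python sketch of the producer check being discussed (an assumption about its general shape, not the actual systemtap logic): the retry would only be attempted when the DW_AT_producer string lists -mfentry. The function name and the second producer string are hypothetical, made up to show what a producer recorded with -grecord-gcc-switches might look like.

```python
def producer_mentions_mfentry(producer):
    """Return True if the producer string lists -mfentry as an option.

    Hypothetical helper: compares whole whitespace-separated tokens.
    """
    return "-mfentry" in producer.split()


# Producer string actually seen on this 3.8.12 kernel build.
seen = "GNU C 4.6.3"

# Hypothetical producer string with recorded switches, for illustration only.
with_switches = "GNU C 4.6.3 -mfentry -g -O2"

print(producer_mentions_mfentry(seen))           # False
print(producer_mentions_mfentry(with_switches))  # True
```

With the producer string seen here the check fails, which would be consistent with pr15123_retry_addr returning 0.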
Thanks,
Da