Bug 24804 - bpf code generation unable to determine type of target variable used to index global variable
Status: NEW
Product: systemtap
Classification: Unclassified
Component: bpf
Version: unspecified
Importance: P2 normal
Target Milestone: ---
Assignee: Unassigned
Reported: 2019-07-11 17:43 UTC by William Cohen
Modified: 2019-07-11 19:12 UTC
Description William Cohen 2019-07-11 17:43:25 UTC
While experimenting with the bpf tracepoints on the most recently checked-out version of systemtap, I came across the following problem: the target variable appears to be properly typed, but when it is used as an index for a global array, systemtap reports that it cannot determine the type:

[wcohen@localhost systemtap]$ rpm -q systemtap
systemtap-4.2-1.201907111323.fc31.x86_64
[wcohen@localhost systemtap]$ stap  --bpf -L 'kernel.trace("sys_enter")' 
kernel.trace("raw_syscalls:sys_enter") $id:long int $args:long unsigned int[]
[wcohen@localhost systemtap]$ stap --bpf -e 'global ids; probe kernel.trace("sys_enter"){ids[$id]++}' -T 10
semantic error: unresolved type : identifier '$id' at <input>:1:49
        source: global ids; probe kernel.trace("sys_enter"){ids[$id]++}
                                                                ^

Pass 2: analysis failed.  [man error::pass2]
Number of similar error messages suppressed: 1.
Rerun with -v to see them.

However, it looks like in some cases systemtap can determine the target variable type, as the following works:

[wcohen@localhost systemtap]$ stap  -m good -e 'probe kernel.trace("sys_enter"){printf("syscall %d\n", $id); exit()}'
syscall 72


The traditional Linux kernel module (lkm) version works fine:

[wcohen@localhost systemtap]$ stap   -L 'kernel.trace("sys_enter")' 
kernel.trace("raw_syscalls:sys_enter") $regs:struct pt_regs* $id:long int
[wcohen@localhost systemtap]$ stap  -e 'global ids; probe kernel.trace("sys_enter"){ids[$id]++}' -T 10
ids[39]=719
ids[72]=595
ids[0]=577
ids[5]=332
ids[3]=307
ids[257]=208
ids[228]=172
ids[7]=115
ids[213]=106
...
Comment 1 William Cohen 2019-07-11 17:55:55 UTC
Correction. Here is an example where the bpf backend properly determines the type of the $id target variable and the script runs:

[wcohen@localhost systemtap]$ sudo stap  --bpf  -e 'probe kernel.trace("sys_enter"){printf("%d ", $id)}' -T 1
13 228 228 228 0 257 228 228 228 228 228 228 232 228 228 228 0 257 228 13 9 10 56 16 321 34 273 9 11 11 10 7 5 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 14 WARNING: lost 624 perf_events on cpu 0
7 1 14 14 228 0 39 228 23 14 14 228 1 228 23 1 1 1 1 1 1 1 1 1 1 1 14 14 228 0 39 WARNING: lost 2253 perf_events on cpu 0
bpfinterp.cxx:387: printf already started
WARNING: /usr/bin/stapbpf exited with signal: 6 (Aborted)
Pass 5: run failed.  [man error::pass5]
Comment 2 William Cohen 2019-07-11 18:10:42 UTC
systemtap seems okay with a target variable used as the index in a function probe:

[wcohen@localhost systemtap]$ stap --bpf -L 'kernel.function("__do_sys_fcntl")'
kernel.function("__do_sys_fcntl@fs/fcntl.c:448") $arg:long unsigned int $cmd:unsigned int $fd:unsigned int
[wcohen@localhost systemtap]$ sudo stap -v --bpf -e 'global ids; probe kernel.function("__do_sys_fcntl"){ids[$arg]++}' -T 10
Pass 1: parsed user script and 70 library scripts using 336176virt/109212res/9160shr/100200data kb, in 190usr/30sys/220real ms.
Pass 2: analyzed script: 3 probes, 4 functions, 0 embeds, 1 global using 397828virt/172324res/10292shr/161852data kb, in 750usr/10sys/766real ms.
Pass 4: compiled BPF into "stap_6498.bo" in 0usr/0sys/3real ms.
Pass 5: starting run.
Pass 5: run completed in 0usr/0sys/10003real ms.
Comment 3 William Cohen 2019-07-11 19:12:29 UTC
Took a look at the verbose output of the following commands for the failing bpf and working lkm versions:

$ stap  -p4 -vvvvv --bpf -e 'global ids; probe kernel.function("__do_sys_fcntl"){ids[$fd]++}'  >& bpf_function_typing.log

$ stap  -vvvvv -p4  -e 'global ids; probe kernel.trace("sys_enter"){ids[$id]++}' >& lkm_typing.log


In the output of the failing bpf version, see:

focused on module '/tmp/stapvvSB1G/tracequery_kmod_1/tracequery_kmod_1_116.o'
pattern 'stapprobe_sys_enter' matches function 'stapprobe_sys_enter'
replaced $id with __tracepoint_arg_id
Rerunning the code filters.
tracepoint-based probe_20795 tracepoint='sys_enter'

Later in the bpf output, see:


symbol resolution for derived-probe kernel.trace("raw_syscalls:sys_enter") /* <- kernel.trace("sys_enter") */
      global ids is defined in chosen-tapset-file <input>
number of probes with global-variable conditions: 0
Eliding side-effect-free singleton block operator '{' at <input>:1:44
replaced {
(__global_ids[__tracepoint_arg_id])++;
} with (__global_ids[__tracepoint_arg_id])++
resolved type long to identifier 'ids' at <input>:1:45
resolved type long to operator '++' at <input>:1:53
resolved type long to identifier 'ids' at <input>:1:45
semantic error: unresolved type : identifier '$id' at <input>:1:49
   thrown from: elaborate.cxx:7391
        source: global ids; probe kernel.trace("sys_enter"){ids[$id]++}
                  

It seems that the bpf backend is losing the type information for __tracepoint_arg_id. For the lkm version, see:

found parameter for tracepoint 'sys_enter': type:'long int' name:'id' decl:'long int __tracepoint_arg_id' ok
replaced $id with __tracepoint_arg_id
Rerunning the code filters.

Then later in the lkm output:

symbol resolution for derived-probe kernel.trace("raw_syscalls:sys_enter") /* <- kernel.trace("sys_enter") */
      local __tracepoint_arg_id is already defined
      global ids is defined in chosen-tapset-file <input>
number of probes with global-variable conditions: 0
Eliding side-effect-free singleton block operator '{' at <input>:1:44
replaced {
(__global_ids[__tracepoint_arg_id])++;
} with (__global_ids[__tracepoint_arg_id])++
resolved type long to identifier 'ids' at <input>:1:45
resolved type long to identifier '$id' at <input>:1:49
resolved type long to identifier '$id' at <input>:1:49
resolved type long to operator '++' at <input>:1:53
resolved type long to identifier 'ids' at <input>:1:45
derive-probes (location #0): end of keyword at <input>:1:1
      global ids is already defined
      local __idx0 is already defined
      local __val is already defined
resolved type long to identifier '__idx0' at <input>:2:19
resolved type long to identifier 'ids' at <input>:2:30
resolved type long to identifier '__val' at <input>:2:10
resolved type long to identifier '__idx0' at <input>:3:25
resolved type long to identifier '__val' at <input>:3:32
resolved type long to identifier 'printf' at <input>:3:1
resolved type long to identifier '__idx0' at <input>:2:19
resolved type long to identifier '__val' at <input>:2:10
deleting module_cache
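
One way to compare the two logs is to pull out only the lines that mention the type of the tracepoint argument. The sketch below is a hypothetical helper (typing_lines is not part of systemtap). It assumes the -vvvvv message wording shown in the excerpts above, which may differ in other systemtap versions.

```shell
# Hypothetical helper, not part of systemtap: print the lines of a
# `stap -vvvvv` log that mention the typing of one tracepoint argument.
#   $1 = log file
#   $2 = argument name without the leading '$' (for example: id)
typing_lines() {
    log=$1
    arg=$2
    # First pattern: the "found parameter for tracepoint" line naming the argument.
    # Second pattern: "resolved type" also matches "unresolved type" as a
    # substring; [$] matches a literal dollar sign before the argument name.
    grep -e "found parameter for tracepoint .*name:'${arg}'" \
         -e "resolved type .*identifier '[\$]${arg}'" "$log"
}
```

For example, `typing_lines lkm_typing.log id` should list the "found parameter" and "resolved type" lines quoted above. Run against a bpf log of the same sys_enter script, it would be expected to show only the "unresolved type" error, if the excerpts above are representative.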