This is an annoying issue related to ulimit. After a long period of running different stapbpf programs, previously working scripts begin to fail because they can no longer allocate maps.
(On older Fedora kernels I saw an issue where instead the BPF verifier would reject a program -- that seems unrelated and difficult to reproduce.)
The issue can be worked around by lifting the RLIMIT_MEMLOCK limit entirely (setting it to RLIM_INFINITY) in stapbpf.cxx instantiate_maps():
+ curr_rlimit.rlim_cur = RLIM_INFINITY;
+ curr_rlimit.rlim_max = RLIM_INFINITY;
rc = setrlimit(RLIMIT_MEMLOCK, &curr_rlimit);
This is also what bcc does. However, the fact that the rlimit is not exceeded immediately, but is instead exhausted gradually over multiple runs and across a reboot, makes me suspicious about the safety of this fix.
As suspected, applying this quick fix on Fedora can lead to subtler resource exhaustion, so committing it to master is a no-go for now.
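For reference, here is a minimal standalone sketch of the workaround discussed above. The helper name raise_memlock_limit() is hypothetical and does not exist in stapbpf.cxx, and the getrlimit() call is an assumption about how curr_rlimit gets populated. Raising the hard limit normally requires root (or CAP_SYS_RESOURCE), so the call can fail with EPERM for an unprivileged user.

```cpp
#include <sys/resource.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not part of stapbpf.cxx: try to lift the
// RLIMIT_MEMLOCK limit entirely. Returns 0 on success, -1 on failure.
int raise_memlock_limit()
{
  struct rlimit curr_rlimit;

  // Assumption: the current limits are fetched first, as the name
  // curr_rlimit in the original snippet suggests.
  if (getrlimit(RLIMIT_MEMLOCK, &curr_rlimit) != 0)
    {
      std::fprintf(stderr, "getrlimit: %s\n", std::strerror(errno));
      return -1;
    }

  // Same two assignments as in the quick fix above.
  curr_rlimit.rlim_cur = RLIM_INFINITY;
  curr_rlimit.rlim_max = RLIM_INFINITY;

  int rc = setrlimit(RLIMIT_MEMLOCK, &curr_rlimit);
  if (rc != 0)
    std::fprintf(stderr, "setrlimit: %s\n", std::strerror(errno));
  return rc;
}
```

Note that this only removes the per-process ceiling. It does nothing about locked memory still held by earlier processes that never exited, which is the concern raised above.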
I do notice that the testsuite can leave behind orphaned stapbpf processes (all in uninterruptible sleep, state D, in the ps output below), which may be holding the pinned memory or some other resource that runs out more slowly than the finite rlimit I was using.
root 482 0.0 0.0 80144 3848 pts/6 D 16:18 0:00 /opt/systemtap/bin/stapbpf /tmp/stapfEnTr4/stap_481.bo
root 494 0.0 0.0 6412 1760 pts/6 D 16:18 0:00 /opt/systemtap/bin/stapbpf /tmp/stapm3M5Nd/stap_493.bo
root 508 0.0 0.0 6376 1676 pts/6 D 16:19 0:00 /opt/systemtap/bin/stapbpf /tmp/stap7RZrMT/stap_507.bo
root 643 0.0 0.0 6384 1720 pts/6 D 16:21 0:00 /opt/systemtap/bin/stapbpf /tmp/stapMhTLZX/stap_642.bo
root 666 0.0 0.0 6384 1644 pts/6 D 16:21 0:00 /opt/systemtap/bin/stapbpf /tmp/stap1ME9en/stap_663.bo
root 688 0.0 0.0 6384 1648 pts/6 D 16:22 0:00 /opt/systemtap/bin/stapbpf /tmp/stapd3AOZs/stap_687.bo
serhei 746 0.0 0.0 6004 760 pts/6 S+ 16:22 0:00 grep --color=auto stapbpf
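As a rough sketch of how a test harness could check for processes stuck like the ones listed above, the state letter can be read from /proc/<pid>/stat. Both helper names here are hypothetical and nothing like this exists in the systemtap testsuite as far as this thread shows. The only thing relied on is the documented "pid (comm) state ..." layout of that file.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: extract the state letter from the contents of
// /proc/<pid>/stat. The comm field is parenthesized and may itself
// contain spaces or ')' characters, so search for the LAST ')'.
// Returns '?' if the line does not have the expected shape.
char state_from_stat(const std::string &stat)
{
  std::string::size_type close = stat.rfind(')');
  if (close == std::string::npos || close + 2 >= stat.size())
    return '?';
  return stat[close + 2]; // skip the space that follows ')'
}

// Hypothetical helper: true if the given pid is currently in
// uninterruptible sleep (state D), false otherwise or if the
// process cannot be inspected.
bool in_uninterruptible_sleep(int pid)
{
  std::ifstream in("/proc/" + std::to_string(pid) + "/stat");
  std::string line;
  if (!std::getline(in, line))
    return false;
  return state_from_stat(line) == 'D';
}
```

For example, in_uninterruptible_sleep(482) would have returned true for the first process in the listing above at the time the ps output was captured.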
When shutting down BPF instrumentation, what happens to the map elements that were created?
Are they automatically removed when the eBPF program that created them goes away?
Or do the maps need to be explicitly cleaned up?
IIRC, by default BPF maps, programs, and so forth are garbage collected when the stapbpf process that owns them exits.
However, as documented in my previous comment, getting the stapbpf processes to always stop cleanly is the real problem.