This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: reducing cost of user-space probes


Hi,

An 8-10% performance hit when handling 0.5M-1M events/s is in line with
what I have seen. Some ways to reduce the overhead:

* Find (or add) a different probe point that is hit less frequently.
* Remove the built-in checks when running stap, for example with
--suppress-time-limits.
* Examine the source code generated by stap (keep it with the command
line switch -k). Some constructs are more expensive than others: nesting
in the stap script, strings and associative arrays all come at some cost.
I found that using inline C with a plain array makes sense in some cases.
You can then read the array through /proc and process the data offline.
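
As a rough sanity check on the numbers in this thread, here is a small Python sketch. It recomputes the packet rate and the probe hit rate from the figures quoted below, and then estimates what a drop to 90% throughput would imply per probe hit. The 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap) is standard Ethernet and is what gets you to 14.8M packets/s. The per-hit cost is only an estimate under my own assumptions: that the packet thread is CPU-bound, that the probe fires once per batch, and that the whole slowdown is due to the probe. The function and constant names are just illustrative.

```python
# Back-of-the-envelope check of the figures quoted in this thread.
# Assumptions (not from the original mail): 20 bytes of Ethernet wire
# overhead per frame, one probe hit per batch, a CPU-bound packet thread,
# and the entire slowdown attributed to the probe.

LINK_BPS = 10e9            # 10Gb link
FRAME_BYTES = 64           # minimum-size packets
WIRE_OVERHEAD_BYTES = 20   # preamble + SFD (8) + inter-frame gap (12)
BATCH_SIZE = 20            # packets per batch (the traced value)
THROUGHPUT_RATIO = 0.90    # probed throughput / non-probed throughput


def packets_per_second(link_bps, frame_bytes, overhead_bytes):
    """Line-rate packet count for fixed-size frames."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def probe_hits_per_second(pps, batch_size):
    """One probe hit per batch of packets."""
    return pps / batch_size


def implied_cost_per_hit_ns(hits_per_s, throughput_ratio):
    """Extra time per probe hit if each batch slows from T to T / ratio."""
    baseline_ns = 1e9 / hits_per_s
    return baseline_ns * (1.0 / throughput_ratio - 1.0)


pps = packets_per_second(LINK_BPS, FRAME_BYTES, WIRE_OVERHEAD_BYTES)
hits = probe_hits_per_second(pps, BATCH_SIZE)
cost_ns = implied_cost_per_hit_ns(hits, THROUGHPUT_RATIO)

print(f"packets/s:      {pps:,.0f}")
print(f"probe hits/s:   {hits:,.0f}")
print(f"implied ns/hit: {cost_ns:.0f}")
```

With these inputs it comes out at roughly 14.88M packets/s, about 744K probe hits/s, and on the order of 150 ns per hit. Treat the last number as an upper-bound style estimate, not a measurement.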


On Mon, Apr 24, 2017 at 2:58 PM, O Mahony, Billy
<billy.o.mahony@intel.com> wrote:
> Hi,
>
> I'm new to systemtap and I am using it to add some probes into a user space application.
>
> The probe is pretty simple - it collects one integer argument and presents a histogram every 3 seconds.
>
> The probe is working fine and I'm getting results that are sensible. The application is a packet processing application that uses a user space I/O library (DPDK) to read batches of network packets directly into user space. The probe is called about 750K times per second: I have a 10Gb link with 64B packets, which generates 14.8M packets per second, but the batch size (that's the stat I'm tracing) is about 20, so there are about 750K probe hits per second.
>
> When the probe is in use I see less performance from the packet processing application - it starts losing packets at about 90% of its non-probed throughput.
>
> However, when I run stap I see:
>
>> Pass 4: compiled C into "stap_13723.ko" in 9020usr/980sys/10638real ms
>
> Does this mean that each time the probe is hit, a system call is made to this new .ko module? That would surely mean quite a lot of overhead. If this is correct, can this overhead be avoided for user space probes?
>
> Alternatively, is there a way to execute the script only every n-th time the probe is hit?
>
> Maybe there is a compile time macro that does this or some .stap command that does an early return from the script X% of the time. I searched for 'sample/sampling' in the lang ref but I didn't see anything.
>
> Thanks for any help you can give.
>
> Billy
>

