Consider the following session on 2.6.31.1-56.fc12.x86_64:

$ MOD=$(stap -e 'probe kernel.function("*@fs/*").return { next }' -p4)
$ free
             total       used       free     shared    buffers     cached
Mem:       1992712     308924    1683788          0       7628     133084
-/+ buffers/cache:     168212    1824500
Swap:       522104          0     522104
$ staprun $MOD -D -o /dev/null
1564
# (give it a moment to load all the probes)
$ free
             total       used       free     shared    buffers     cached
Mem:       1992712    1459972     532740          0       7628     133096
-/+ buffers/cache:    1319248     673464
Swap:       522104          0     522104

That's a 1124MB increase in "used" when the module is loaded!  I bumped my
VM up to 2GB for this test, but normally I run with only 512MB, so scripts
were causing me to get OOM.

For comparison, I get 43MB on 2.6.31.1-56.fc12.i686, and I get 72MB on
2.6.30.8-64.fc11.x86_64.  These numbers still seem high, but they're not
so outrageous...
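For the record, the 1124MB figure falls straight out of the two "used" columns above (free reports KB):

```shell
# delta of the "used" column between the two free runs, converted KB -> MB
echo $(( (1459972 - 308924) / 1024 ))MB
```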
Might this be accounted for by a change to NR_CPUS leading to an excessive
default kretprobe .maxactive?  Try setting a low -DKRETACTIVE=nnn parameter.
(In reply to comment #1)
> Might this be accounted for by a change to NR_CPUS leading
> to an excessive default kretprobe .maxactive?  Try setting
> a low -DKRETACTIVE=nnn parameter.

Ah -- indeed, on x86_64 NR_CPUS = 512, whereas on F11 it was only 64.  We
default to 6*NR_CPUS, but that still doesn't account for it:

  (2802 probes) * (6*512 maxactive) * (40b kretprobe_instance) = ~330MB

A different measurement:

# stap -e 'probe kernel.function("*@fs/*").return { next }' -DKRETACTIVE=100 -c free
             total       used       free     shared    buffers     cached
Mem:        458900     398628      60272          0       7488     148036
-/+ buffers/cache:     243104     215796
Swap:       522104       1260     520844

# stap -e 'probe kernel.function("*@fs/*").return { next }' -DKRETACTIVE=200 -c free
             total       used       free     shared    buffers     cached
Mem:        458900     435944      22956          0       7488     148012
-/+ buffers/cache:     280444     178456
Swap:       522104       1260     520844

That's 37MB for a change of 100 maxactive, which comes to about 136 bytes
each.
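The back-of-the-envelope estimate can be checked directly (it comes to 328MB, i.e. the ~330MB quoted):

```shell
# 2802 return probes * (6 * NR_CPUS=512) maxactive instances each
# * ~40 bytes per struct kretprobe_instance, converted to MB
echo $(( 2802 * 6 * 512 * 40 / 1024 / 1024 ))MB
```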
(In reply to comment #2)
> That's 37MB for a change of 100 maxactive, which comes to about 136 bytes
> each.

This waste is because kretprobes allocates each instance with a separate
kmalloc, which usually allocates more than actually requested.

global waste
probe kernel.trace("kmalloc") {
  waste <<< ($bytes_alloc - $bytes_req)
}
probe timer.s(1) {
  if (@count(waste))
    printdln(" ", @count(waste), @sum(waste), @avg(waste))
  delete waste
}

During KRETACTIVE=100, I get:
  271 21949 80
  281281 27015907 96
  141 13280 94

During KRETACTIVE=200, I get:
  25 2664 106
  561746 53935824 96
  127 12264 96
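The "about 136 bytes each" figure comes from spreading the 37MB "used" delta over the extra instances (2802 probes, each with 100 more maxactive slots):

```shell
# "used" delta between the KRETACTIVE=200 and =100 runs (KB -> bytes),
# divided by the number of additional kretprobe instances
echo $(( (435944 - 398628) * 1024 / (2802 * 100) )) bytes
```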
I've submitted bug #10869 to track the kretprobe_instance waste.  Within
systemtap itself, we should find a saner default for KRETACTIVE, probably
based on the number of CPUs that are actually online rather than NR_CPUS.
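A sketch of what such a default could look like (illustrative only, not the actual systemtap change: the 6x factor matches the current default, but the floor of 15 and the use of getconf here are assumptions; in-kernel this would be num_online_cpus()):

```shell
# Derive a maxactive value from the CPUs actually online, not NR_CPUS.
online=$(getconf _NPROCESSORS_ONLN)
kretactive=$(( 6 * online < 15 ? 15 : 6 * online ))   # illustrative floor
echo "would pass -DKRETACTIVE=$kretactive"
# e.g. 24 on a 4-CPU guest, instead of 3072 from 6*NR_CPUS=512
```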
commit 1ee6b5f
(In reply to comment #5)
> commit 1ee6b5f

That gives me 96 on F12-x86_64, which is manageable.  Thanks!