-.\" t -*- nroff -*-
+.\" te -*- nroff -*-
.TH STAP 1
.SH NAME
stap \- systemtap script translator/driver
.PP
Variance uses Welford's online algorithm. The calculations are based
-on integer arithmetics, and so may suffer from low precision. To improve
-this, @variance(v[, b]) accepts an optional parameter b, the
+on integer arithmetic, and so may suffer from low precision and overflow.
+To improve this, @variance(v[, b]) accepts an optional parameter b, the
bit-shift, ranging from 0 (default) to 62, for internal scaling. Only one
-value of bit-shift may be used with given global variable. The bit-shift may
-affect the result significantly, here is an example:
+value of bit-shift may be used with a given global variable. A larger bit-shift
+value increases precision, but also increases the likelihood of overflow.
.SAMPLE
$ stap -e \\
> 'global x probe oneshot { for(i=1;i<=5;i++) x<<<i println(@variance(x)) }'
12
-$ stap --poison-cache -e \\
+$ stap -e \\
> 'global x probe oneshot { for(i=1;i<=5;i++) x<<<i println(@variance(x,1)) }'
2
$ python3 -c 'import statistics; print(statistics.variance([1, 2, 3, 4, 5]))'
2.5
$
.ESAMPLE
-Negative variance signals overflow. In some cases you might need to
-normalize your input data. Following rule applies:
+Overflow (from internal multiplication of large numbers) may occur, and
+may show up as a negative variance result. If so, consider normalizing your
+input data. Adding or subtracting a fixed value from all variance inputs
+preserves the original variance. Dividing the variance inputs by a fixed
+value divides the original variance by that value squared.
-.SAMPLE
-if
-@variance(v1, v2, ..., vN) = V
-then
-@variance(Xv1, Xv2, ..., XvN) = 2XV
-.ESAMPLE
+.\" The following is a more mathy rendering, but GNU nroff can't show it properly :-(
+.ig
+
+If
+.EQ
+variance( v sub 1 , v sub 2 , ... , v sub N ) = V
+.EN
+
+Then
+.EQ
+variance ( {v sub 1} over X , {v sub 2} over X, ... , {v sub N} over X ) = V over {X sup 2}
+.EN
+
+and
+.EQ
+variance ( v sub 1 - Y, v sub 2 - Y, ... , v sub N - Y ) = V
+.EN
+
+..
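.PP
The two normalization rules (subtracting a fixed value keeps the variance, dividing by a fixed value divides the variance by that value squared) can be checked outside of systemtap. The following is only an illustrative sketch using Python's statistics module with exact fractions; the function name variance_identities and the values chosen for X and Y are hypothetical and are not part of stap.

```python
from fractions import Fraction
from statistics import variance


def variance_identities(values, x, y):
    """Return (V, variance of v - y, variance of v / x) as exact fractions.

    Hypothetical helper for illustration only; it is not part of systemtap.
    """
    exact = [Fraction(v) for v in values]
    base = variance(exact)                        # V
    shifted = variance([v - y for v in exact])    # inputs shifted by y
    scaled = variance([v / x for v in exact])     # inputs divided by x
    return base, shifted, scaled


if __name__ == "__main__":
    V, shifted, scaled = variance_identities([1, 2, 3, 4, 5], x=4, y=100)
    print(V, shifted, scaled)
    assert shifted == V            # subtracting Y preserves the variance
    assert scaled == V / 4 ** 2    # dividing by X divides the variance by X squared
```

Exact fractions are used here so that the comparison is not disturbed by floating-point rounding; @variance itself works on integers, so its results for the same inputs may differ as described above.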
Histograms are also available, but are more complicated because they
have a vector rather than scalar value.