This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.



Re: Collecting statistics for time dependent data?


On Thu, 9 Dec 2004, John Lamb wrote:

] Raimondo Giammanco wrote:
] > Hello,
] > 
] >  I was wondering if there is a way to compute "running" statistics with
] > gsl.
] > 
] Yes you can do it, but there's nothing in GSL that does it and it's easy 
] enough that you don't need GSL. Something like (untested)
] 
] /* Call with *n already counting x, i.e. increment *n before calling. */
] void update_mean( double* mean, int* n, double x ){
]    if( *n == 1 )
]      *mean = x;
]    else
]      *mean = (1 - (double)1 / *n ) * *mean + x / *n;
] }
] 
] will work and you can derive a similar method for updating the variance 
] using the usual textbook formula.
] 
] var[x] = (1/n) sum x^2_i - mean(x)^2
] 
] I don't know if there is a method that avoids the rounding errors. I 
] don't know why so many textbooks repeat this formula without the 
] slightest warning that it can go so badly wrong.
] 
]
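
To make the rounding problem concrete, here is a small sketch in C that
compares the textbook one-pass formula quoted above with a plain two-pass
computation. The function names are made up for illustration and are not
part of GSL; both return the population variance (divide by n), to match
the formula as quoted.

```c
#include <stddef.h>

/* Textbook one-pass formula: (1/n) sum x_i^2 - mean(x)^2.
   Illustrative name, not a GSL function. */
static double naive_variance(const double *x, size_t n)
{
    double sum = 0.0, sumsq = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += x[i];
        sumsq += x[i] * x[i];
    }
    double mean = sum / (double)n;
    /* Two nearly equal large numbers are subtracted here. */
    return sumsq / (double)n - mean * mean;
}

/* Two-pass reference: find the mean first, then average the
   squared deviations from it. Illustrative name, not a GSL function. */
static double two_pass_variance(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    double mean = sum / (double)n;

    double ss = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = x[i] - mean;
        ss += d * d;
    }
    return ss / (double)n;
}
```

For data such as 1e9+4, 1e9+7, 1e9+13, 1e9+16 the population variance is
22.5. The two squared terms in the one-pass version are both near 1e18,
where adjacent doubles are 128 apart, so in ordinary IEEE double arithmetic
their difference typically cannot come out as 22.5. The two-pass version
only ever squares small deviations and does not have this problem.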

Stably updating mean and variance is remarkably nontrivial.  There was
a series of papers in Comm ACM that discussed the issue; the final one
(that I know of) refers back to the earlier ones, and it can be found
in D.H.D. West, Updating mean and variance estimates: an improved
method, Comm ACM 22:9, 532 (1979)  [* I see Luke Stras just sent this
reference! *].  I'll just copy out the pseudocode since the paper is
old enough that it might not be easy to find.  This, by the way, is
generalized for weighted data, so it assumes that you get a weight and
a data value (W_i and X_i) that you use to update the estimates XBAR
and S2:

    SUMW = W_1
    M = X_1
    T = 0
    For i=2,3,...,n 
    {
       Q = X_i - M
       TEMP = SUMW + W_i    // the paper prints SUM here; a typo for SUMW
       R = Q*W_i/TEMP
       M = M + R
       T = T + R*SUMW*Q
       SUMW = TEMP
    }
    XBAR = M
    S2 = T*n/((n-1)*SUMW)
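
A direct transcription of that pseudocode into C might look like the
sketch below. The struct and function names (west_stats, west_update and
so on) are invented here for illustration and are not GSL API; weights are
assumed to be positive, and the variance is only meaningful once at least
two samples have been seen.

```c
#include <stddef.h>

/* Running state for the weighted update; field comments give the
   corresponding pseudocode variable. Hypothetical type, not part of GSL. */
typedef struct {
    double sumw;  /* SUMW: sum of the weights so far */
    double m;     /* M: running weighted mean */
    double t;     /* T: running weighted sum of squared deviations */
    size_t n;     /* number of samples seen so far */
} west_stats;

static void west_init(west_stats *s)
{
    s->sumw = 0.0;
    s->m = 0.0;
    s->t = 0.0;
    s->n = 0;
}

/* Fold in one sample x with weight w (assumed w > 0). */
static void west_update(west_stats *s, double x, double w)
{
    if (s->n == 0) {            /* first sample: SUMW = W_1, M = X_1, T = 0 */
        s->sumw = w;
        s->m = x;
        s->t = 0.0;
        s->n = 1;
        return;
    }
    double q = x - s->m;        /* Q = X_i - M */
    double temp = s->sumw + w;  /* TEMP = SUMW + W_i */
    double r = q * w / temp;    /* R = Q*W_i/TEMP */
    s->m += r;                  /* M = M + R */
    s->t += r * s->sumw * q;    /* T = T + R*SUMW*Q, using the old SUMW */
    s->sumw = temp;             /* SUMW = TEMP */
    s->n += 1;
}

/* XBAR = M */
static double west_mean(const west_stats *s)
{
    return s->m;
}

/* S2 = T*n/((n-1)*SUMW); returns 0 when fewer than two samples were seen. */
static double west_variance(const west_stats *s)
{
    if (s->n < 2)
        return 0.0;
    return s->t * (double)s->n / ((double)(s->n - 1) * s->sumw);
}
```

Call west_init once, then west_update for each (X_i, W_i) pair as it
arrives; west_mean and west_variance can be read at any point in between.
With all weights equal to 1 the variance reduces to the usual sample
variance with the n-1 denominator.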



jt 

-- 
James Theiler            Space and Remote Sensing Sciences
MS-B244, ISR-2, LANL     Los Alamos National Laboratory
Los Alamos, NM 87545     http://nis-www.lanl.gov/~jt



