This is the mail archive of the
gsl-discuss@sourceware.org
mailing list for the GSL project.
Re: correlation coefficient
- From: James Theiler <jt at lanl dot gov>
- To: Brian Gough <bjg at network-theory dot co dot uk>
- Cc: Patrick Alken <patrick dot alken at colorado dot edu>, <gsl-discuss at sourceware dot org>
- Date: Fri, 16 Mar 2007 10:05:24 -0600 (MDT)
- Subject: Re: correlation coefficient
On Fri, 16 Mar 2007, Brian Gough wrote:
] At Thu, 15 Mar 2007 17:06:43 -0600,
] Patrick Alken wrote:
] > Is there any interest in putting a new function in the
] > statistics area for calculating the Pearson correlation coefficient?
] > I think this can be done safely in gsl by just using
] >
] > r = gsl_stats_covariance(x,y) / (gsl_stats_sd(x) * gsl_stats_sd(y))
] >
] > but it would be more efficient to calculate everything in 1 pass
] > through the data and I believe there is a stable algorithm to do
] > this (similar to how the mean/variance is calculated).
]
] Yes, sounds like a good idea to me. Go ahead and add it in
] covariance_source.c if you have the 1-pass algorithm.
]
]
Be sure to include "Pearson" in the name of the function, since there
are also Spearman's and Kendall's correlation statistics. (On second
thought, contradicting myself: those two are specialized nonparametric
measures, so maybe it's reasonable to have Pearson's be the default.)
jt
--
James Theiler
MS-B244, ISR-2, LANL; Los Alamos, NM 87544
Space and Remote Sensing Sciences; Los Alamos National Laboratory
http://public.lanl.gov/jt