Why does Maple use N as the denominator and not N-1 in calculating a covariance matrix?

It appears that Maple uses N as the denominator and not N-1 in calculating a covariance matrix in the Statistics package. Why?

Harry Garst

probably this reason (although I am not an expert)

I am not a statistician, but I suspect the answer is as follows - taking the more basic issue of variance for simplicity.

The variance is *defined* as Sum( x_j-mu)^2/N where mu is the mean of the distribution you are sampling from.

The formulas with N-1 arise because, if you have only samples and do not know the mean mu of the underlying distribution, you have to use the sample mean rather than the true underlying mean in the formula. Deviations measured from the sample mean are on average smaller than deviations measured from mu, and replacing N with N-1 corrects for this, so that the expected value of the estimate equals the true variance.
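The effect described above can be checked numerically. The following is a minimal sketch in Python (not Maple), assuming normally distributed data; the function name `simulate` and the values of `n`, `mu`, `sigma`, `trials` and `seed` are arbitrary choices made for illustration only.

```python
import random


def simulate(n=5, mu=10.0, sigma=2.0, trials=20000, seed=0):
    """Average three variance estimates over many samples of size n."""
    rng = random.Random(seed)
    known_mean_n = 0.0    # true mean mu, denominator N
    sample_mean_n = 0.0   # sample mean, denominator N
    sample_mean_n1 = 0.0  # sample mean, denominator N-1
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        ss_true = sum((x - mu) ** 2 for x in xs)      # deviations from mu
        ss_sample = sum((x - xbar) ** 2 for x in xs)  # deviations from xbar
        known_mean_n += ss_true / n
        sample_mean_n += ss_sample / n
        sample_mean_n1 += ss_sample / (n - 1)
    return (known_mean_n / trials,
            sample_mean_n / trials,
            sample_mean_n1 / trials)


known, biased, unbiased = simulate()
print("true variance:            ", 2.0 ** 2)
print("mu known, divide by N:    ", known)
print("sample mean, divide by N: ", biased)
print("sample mean, divide by N-1:", unbiased)
```

With these settings the first and third averages should come out close to the true variance of 4, while the second should sit near (N-1)/N times that, roughly 3.2.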

another way to say it

At least for the variance answer (or standard deviation, as you usually see the question)...

Do you want the standard deviation of this particular list of numbers, or do you want to estimate the standard deviation of some population for which this sequence of numbers is a random sample? For the first, use denominator N. For the second, denominator N-1 is preferred, since it provides an unbiased estimator.
---
G A Edgar
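As an aside, some libraries expose the two choices as separate functions. A small sketch using Python's standard `statistics` module (not Maple), where `pvariance`/`pstdev` divide by N and `variance`/`stdev` divide by N-1; the list `data` is just an example:

```python
import statistics

data = [1, 2, 3]

# "This particular list of numbers": denominator N
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# "Random sample from a population": denominator N-1
samp_var = statistics.variance(data)
samp_sd = statistics.stdev(data)

print("denominator N:  ", pop_var, pop_sd)
print("denominator N-1:", samp_var, samp_sd)
```

For this list the sum of squared deviations is 2, so the two variances are 2/3 and 1 respectively.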

Yes

That's what I was trying to say, but after a couple of glasses of Burgundy.....

and more detailed

While searching for a Burgundy (in vain), I found something I once filed away (from an economics lecturer):

Suppose I have n independent observations from a Normal distribution with unknown
mean mu and standard deviation sigma (written m and s below).

Let SUM be the sum of the observations. Consider one observation, X1:

E[(X1 - SUM/n)^2] = E([X1 - X1/n - (SUM - X1)/n]^2)
= E([(n - 1)*X1/n - (SUM - X1)/n]^2)
= (n - 1)^2*E(X1^2)/n^2 - 2*(n - 1)*E(X1)*E(SUM - X1)/n^2 + E[(SUM - X1)^2]/n^2
= [(n - 1)^2*(s^2 + m^2) - 2*(n - 1)^2*m^2 + (n - 1)*s^2 +(n - 1)^2*m^2]/n^2
= (n - 1)*s^2/n

When I add n of these expectations together, the expectation of the sum is 
(n - 1)*s^2. Divide by n - 1 and you have an unbiased estimate of the variance
(that is an estimate whose expected value equals the variance). The math is a 
bit tedious, but each step should be perfectly straightforward.

If you insist on a word explanation, note that the sample mean is the value that
minimizes the sample variance (another straightforward math problem to prove). 

Therefore the sum of squared deviations about the sample mean can never exceed
the sum of squared deviations about the true mean, so dividing it by n
underestimates the variance on average (of course, if you know the
distribution mean, this logic doesn't apply and you should divide by n instead
of n-1).

Aaron Brown 

Hope that explanation is of interest.
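The result E[(X1 - SUM/n)^2] = (n - 1)*s^2/n can also be checked exactly on a small case. The sketch below is in Python (not Maple) and makes an assumption that differs from the quoted text: instead of a Normal distribution it uses a fair six-sided die, chosen only because every possible sample can then be enumerated. The derivation relies only on independence and a finite variance, so the same identity should hold.

```python
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4, 5, 6]  # hypothetical distribution: a fair die
n = 3                        # sample size, chosen small for enumeration

# Mean m and variance s^2 of the underlying distribution (exact fractions)
m = Fraction(sum(values), len(values))
s2 = sum((Fraction(v) - m) ** 2 for v in values) / len(values)

# E[(X1 - SUM/n)^2], averaged over all equally likely samples of size n
total = Fraction(0)
count = 0
for sample in product(values, repeat=n):
    xbar = Fraction(sum(sample), n)
    total += (sample[0] - xbar) ** 2
    count += 1
expected = total / count

print("E[(X1 - SUM/n)^2] =", expected)
print("(n - 1)*s^2/n     =", (n - 1) * s2 / n)
```

Both printed values should be the same fraction, which is the per-observation expectation that gets summed n times and then divided by n - 1 in the argument above.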

Consistency would be a virtue

There is an unfortunate ambiguity in terminology here,
and it is not always clear which version of covariance,
variance, or standard deviation is meant. But it would be
a good idea for Covariance, Variance and StandardDeviation
to be consistent here. In particular, Covariance(A,A) should
be the same as Variance(A). But this is not the case in Maple
for data. As far as I can tell, for a data set (say a list), Variance and StandardDeviation use the definition with N-1, while Covariance uses N. For example:

> with(Statistics): A:= [1,2,3]:
  StandardDeviation(A),Variance(A),Covariance(A,A);

1., 1., .6666666667
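The two numbers in that output can be reproduced by hand. Here is a minimal sketch in Python (not Maple); the function `covariance` and its `ddof` parameter are hypothetical names introduced for this illustration and say nothing about how Maple implements `Covariance` internally.

```python
def covariance(xs, ys, ddof=0):
    """Covariance of two equal-length lists.

    ddof=0 divides by N, ddof=1 divides by N-1.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n - ddof <= 0:
        raise ValueError("not enough data points for this ddof")
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - ddof)


A = [1, 2, 3]
cov_n = covariance(A, A, ddof=0)   # denominator N
cov_n1 = covariance(A, A, ddof=1)  # denominator N-1

print("Cov(A,A) with N:  ", cov_n)
print("Cov(A,A) with N-1:", cov_n1)
```

With denominator N the self-covariance is 2/3, matching the Covariance(A,A) value shown; with denominator N-1 it is 1, matching Variance(A). Using the same denominator in both routines would make Covariance(A,A) equal Variance(A).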

Covariance

Covariance is one of those terms that doesn't seem to follow the same conventions as variance and standard deviation.

The definition given in Collins Dictionary of Mathematics is:
"A measure of the association between two random variables, X and Y, equal to the expected value of the product of their deviations from the mean. It may be estimated by the sum of products of deviations from the sample mean of the associated values of the two variables, divided by the number of sample points."

This is expressed as:

Cov(x,y) = (1/N)*Sum( (x - mean_of_x)*(y - mean_of_y) )

where N is the number of sample points.

I hope this adds a little clarity to a confusing subject.

J. Tarr
