Student's t distribution is named after William Sealy Gosset's pseudonym, Student. He published using this pseudonym because his employer, the Guinness brewery in Ireland, did not allow its employees to publish scientific papers after an incident where trade secrets were disclosed. In this blog post we will have a look at the Student t distribution with one degree of freedom; this is also a special case of the Cauchy distribution (with location parameter a = 0 and scale parameter b = 1).
This distribution has probability density 1 / (Pi * (1 + x^2)) at value x. You can obtain a StudentT(1)-distributed random variable, for example, as the ratio of two independent standard normally distributed random variables. Clearly, the distribution is symmetric around zero: the probability densities at x and -x are the same. This implies that the median of the distribution is zero.
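That construction is easy to try numerically; here is a quick sketch in Python with NumPy (rather than Maple; the variable names are mine):

```python
import numpy as np

# The ratio of two independent standard normals is StudentT(1) = Cauchy(0, 1).
rng = np.random.default_rng(0)
z1 = rng.standard_normal(100_000)
z2 = rng.standard_normal(100_000)
cauchy_sample = z1 / z2

# The distribution is symmetric around zero, so the sample median
# should be close to 0 (the median is well-defined, unlike the mean).
sample_median = np.median(cauchy_sample)
```

With 100,000 draws the sample median lands very close to 0, as the symmetry argument predicts.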
The probability density function of Student's t distribution with one degree of freedom.
However, the mean of the distribution is undefined. This may seem strange for a distribution that is symmetric: surely, the mean should be 0? The definition is clear enough, though: the mean is equal to

int(x / (Pi * (1 + x^2)), x = -infinity .. infinity),

which separates into an infinite area on the positive side and an infinite area on the negative side of the horizontal axis. Since infinity minus infinity is undefined, there is no mean.
This may sound like a rather formal argument, but it truly works like this in practice as well: there are simply too many sample values far from zero for the sample mean to stabilize. This is nicely illustrated if you generate a large sample and plot the running averages; that is, the averages of the first n sample values for increasing values of n. If there were a well-defined mean, then by the law of large numbers the sample average would converge to that mean. However, if I perform the procedure described above for the StudentT(1) distribution, this is what I get:
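If you want to reproduce the experiment without Maple, a sketch along these lines (Python with NumPy; the helper name running_average is mine) generates the data behind both plots:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

def running_average(sample):
    """Mean of the first k values, for k = 1 .. len(sample)."""
    return np.cumsum(sample) / np.arange(1, len(sample) + 1)

cauchy_running = running_average(rng.standard_cauchy(n))
normal_running = running_average(rng.standard_normal(n))

# For Normal(0, 1) the law of large numbers applies and the running
# average converges to the mean, 0.  For StudentT(1) it does not: the
# average of n StudentT(1) values is itself StudentT(1)-distributed,
# so the running average never settles down.
final_normal = normal_running[-1]
```

Plotting cauchy_running against normal_running reproduces the contrast between the two graphs above.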
Running average of a StudentT(1)-distributed random sample
As we see, the value truly does not converge; it jumps about very capriciously and even after about 550 points, there is still a substantial jump. These jumps will keep occurring. Compare this to the following graph, which is much smoother:
Running average of a Normal(0, 1)-distributed random sample
I was thinking about these things a while ago and decided to experiment with this using order statistics. The jth order statistic of a random sample is its jth smallest value, and Maple can work with these quantities using the Statistics[OrderStatistic] command. You supply it a random variable or a distribution, the number j, and a sample size n, and it returns a random variable that is distributed as the jth order statistic of a sample of size n. So OrderStatistic(X, 1, 2) is the minimum of two values distributed according to X, and OrderStatistic(X, 2, 3) is the middle of three values. OrderStatistic(X, 1, 1) is distributed as X itself.
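Outside Maple, the same random variables are easy to simulate by brute force: draw a sample of size n and sort it. A Python/NumPy sketch (the function name order_statistic_sample is my own, mirroring Maple's OrderStatistic):

```python
import numpy as np

def order_statistic_sample(rng, j, n, size):
    """Draw `size` values distributed as the jth smallest of n
    independent StudentT(1) = Cauchy(0, 1) values (1-based j)."""
    samples = rng.standard_cauchy((size, n))
    return np.sort(samples, axis=1)[:, j - 1]

rng = np.random.default_rng(2)
minima = order_statistic_sample(rng, 1, 2, 10_000)  # like OrderStatistic(X, 1, 2)
maxima = order_statistic_sample(rng, 2, 2, 10_000)  # like OrderStatistic(X, 2, 2)
```

The minimum of two values is pushed below the median and the maximum above it, which the sample medians of minima and maxima show clearly.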
Let us examine the mean of a few of these order statistics. First consider Mean(OrderStatistic(StudentT(1), 1, 2)). It evaluates to minus infinity. That is, if you repeatedly take two sample values and keep their minimum, the running average of these minima should eventually reach arbitrarily low values. I created a running average plot again, which looked like this:
Running average of the first order statistics of samples of size two
Indeed, this seems to diverge to arbitrarily negative values.
The second thing you might want to try is Mean(OrderStatistic(StudentT(1), 2, 2)). This, we can predict, will be positive infinity, because it is the mirror image of what we tried before, and StudentT(1) is symmetric around 0. The answer is indeed positive infinity, and in fact Mean(OrderStatistic(StudentT(1), j, n)) = -Mean(OrderStatistic(StudentT(1), n + 1 - j, n)) holds in general (where both are defined).
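The identity rests on the fact that, for a symmetric distribution, the density of the jth order statistic at x equals the density of the (n + 1 - j)th order statistic at -x. A small numerical check in Python (my own sketch; the density n * binomial(n - 1, j - 1) * F^(j - 1) * (1 - F)^(n - j) * f is the standard order-statistic density):

```python
import numpy as np
from math import comb, pi

def order_statistic_pdf(j, n, x):
    """Density of the jth order statistic of n StudentT(1) values:
    n * C(n-1, j-1) * F(x)^(j-1) * (1 - F(x))^(n-j) * f(x)."""
    f = 1.0 / (pi * (1.0 + x**2))
    F = 0.5 + np.arctan(x) / pi
    return n * comb(n - 1, j - 1) * F**(j - 1) * (1.0 - F)**(n - j) * f

x = np.linspace(-10.0, 10.0, 2001)
# Symmetry of StudentT(1) around 0 gives
#   pdf of (j, n) at x  ==  pdf of (n+1-j, n) at -x,
# which is what makes Mean(j, n) = -Mean(n+1-j, n) where both exist.
lhs = order_statistic_pdf(2, 5, x)
rhs = order_statistic_pdf(4, 5, -x)
```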
The minimum or maximum of a sample of three numbers seems like it could only be more extreme than the minimum or maximum of a sample of two, and indeed Mean(OrderStatistic(StudentT(1), 1, 3)) and Mean(OrderStatistic(StudentT(1), 3, 3)) are minus and plus infinity, respectively. We find that in general, for n > 1, we have that
Mean(OrderStatistic(StudentT(1), 1, n)) = -infinity,
Mean(OrderStatistic(StudentT(1), n, n)) = infinity.
So the next interesting value is Mean(OrderStatistic(StudentT(1), 2, 3)). This is the middle of a random sample of size three. The running average looks like this:
Running average of the second order statistics of samples of size three
This looks a lot smoother, and indeed Mean(OrderStatistic(StudentT(1), 2, 3)) turns out to be equal to 0. So in order to tame Student's t distribution with one degree of freedom, we just need to take samples of size three and take the middle one each time!
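A quick simulation backs this up (a Python/NumPy sketch, not part of the Maple session):

```python
import numpy as np

rng = np.random.default_rng(3)

# Draw 200,000 samples of size 3 and keep the middle value of each.
triples = rng.standard_cauchy((200_000, 3))
middles = np.median(triples, axis=1)

# OrderStatistic(StudentT(1), 2, 3) has mean 0, so the law of large
# numbers applies and the sample mean of the middles is near zero,
# unlike the raw StudentT(1) sample mean.
middle_mean = middles.mean()
```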
I examined the mean of OrderStatistic(StudentT(1), j, n) for various other values of j and n and empirically found that this evaluates to:
- undefined if j = n = 1 (this is the same situation as the mean of StudentT(1));
- positive infinity if j = n > 1 (this is the maximum of two or more sample values);
- negative infinity if j = 1 < n (this is the minimum of two or more sample values);
- a finite quantity in the remaining cases (so 1 < j < n).
This is quite astonishing: taking the maximum of two random numbers gives you something with an expected value of plus infinity, but taking the second-largest out of a sample of 1000 has a finite expected value (approximately 318).
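That 318 can be cross-checked outside Maple. Substituting u = F(x), where F is the StudentT(1) cumulative distribution function, the quantile is x = tan(Pi*(u - 1/2)) = -cot(Pi*u), and the mean of the jth order statistic becomes n * binomial(n - 1, j - 1) * int(-cot(Pi*u) * u^(j - 1) * (1 - u)^(n - j), u = 0 .. 1), with a bounded integrand whenever 1 < j < n. A Python sketch (the function name is mine):

```python
import numpy as np
from math import comb, pi

def order_statistic_mean(j, n, grid_size=2_000_001):
    """Numerically integrate the mean of the jth order statistic of n
    StudentT(1) values, via the substitution u = F(x), x = -cot(pi*u).
    Valid for 1 < j < n, where the integrand stays bounded."""
    u = np.linspace(0.0, 1.0, grid_size)[1:-1]   # open interval (0, 1)
    x = -np.cos(pi * u) / np.sin(pi * u)         # quantile function -cot(pi*u)
    integrand = n * comb(n - 1, j - 1) * x * u**(j - 1) * (1.0 - u)**(n - j)
    du = u[1] - u[0]
    # trapezoidal rule on the uniform grid
    return float(np.sum((integrand[1:] + integrand[:-1]) * du / 2.0))

m_999_1000 = order_statistic_mean(999, 1000)  # second largest of 1000
```

For j = 999, n = 1000 this comes out at approximately 318, in line with the Maple value.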
So now it is time for a proof. We define m := Mean(OrderStatistic(StudentT(1), j, n)) and obtain a lengthy expression involving hypergeometric functions.
I knew from earlier experiments that those hypergeometric functions should really be arctangents, so I set m := convert(m, arctan), and obtained the much more palatable

m = n * binomial(n - 1, j - 1) * int(_t1 * (1/2 + arctan(_t1)/Pi)^(j - 1) * (1/2 - arctan(_t1)/Pi)^(n - j) / (Pi * (1 + _t1^2)), _t1 = -infinity .. infinity).
This suggested that we might be able to do a tangent substitution, so I set m := IntegrationTools[Change](m, _t1 = tan(phi)), which led to

m = n * binomial(n - 1, j - 1) / Pi * int(tan(phi) * (1/2 + phi/Pi)^(j - 1) * (1/2 - phi/Pi)^(n - j), phi = -Pi/2 .. Pi/2),
and we see that for j = n = 1 we have int(tan(phi), phi = -Pi/2 .. Pi/2) as part of the formula, which shows that indeed the result is undefined for that case. Now let us concentrate on the case j = 1 < n. To isolate the integral, we define integral := indets(subs(int = Int, m), specfunc(anything, Int)) to get just the inert integral, and then set integral := eval(integral, j = 1) to obtain

Int(tan(phi) * (1/2 - phi/Pi)^(n - 1), phi = -Pi/2 .. Pi/2).
Then we split the integral into two parts at phi = 0 and rewrite the negative part in terms of -phi:
negativePart, positivePart := op(IntegrationTools:-Split(integral, 0));
negativePart := IntegrationTools:-Change(negativePart, theta = -phi);
integral := IntegrationTools:-Combine(positivePart + eval(negativePart, theta = phi));
The combined integral is Int(tan(phi) * ((1/2 - phi/Pi)^(n - 1) - (1/2 + phi/Pi)^(n - 1)), phi = 0 .. Pi/2). If we pull out a minus sign, the remaining factor (1/2 + phi/Pi)^(n - 1) - (1/2 - phi/Pi)^(n - 1) of the integrand is nonnegative and strictly increasing, so the integral is at least its value at phi = Pi/4 times

int(tan(phi), phi = Pi/4 .. Pi/2),

which is infinite. With the minus sign that we omitted above, we get minus infinity, as expected.
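The divergence can also be watched happening numerically. For n = 2 the factor of the integrand is (1/2 + phi/Pi) - (1/2 - phi/Pi) = 2*phi/Pi, and truncating the integral at phi = Pi/2 - epsilon gives values that grow like log(1/epsilon). A Python sketch (the function name is mine):

```python
import numpy as np

def truncated_integral(eps, points=200_000):
    """int(tan(phi) * (2*phi/Pi), phi = 0 .. Pi/2 - eps): the n = 2
    case of the combined integral, truncated short of the pole of tan.
    Uses the substitution t = Pi/2 - phi and a log-spaced grid so the
    near-pole region is well resolved."""
    t = np.geomspace(eps, np.pi / 2, points)   # t = Pi/2 - phi
    phi = np.pi / 2 - t
    integrand = np.tan(phi) * (2.0 * phi / np.pi)
    # trapezoidal rule on the (nonuniform) grid in t
    return float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(t) / 2.0))

values = [truncated_integral(e) for e in (1e-2, 1e-4, 1e-6)]
```

Each factor-of-100 reduction in epsilon adds roughly log(100), about 4.6, to the truncated integral, so the values keep growing without bound.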
The computation for j = n > 1 is similar. The integral for the general case, with 1 < j < n, can be rewritten in the same manner as above, to obtain

Int(tan(phi) * ((1/2 + phi/Pi)^(j - 1) * (1/2 - phi/Pi)^(n - j) - (1/2 - phi/Pi)^(j - 1) * (1/2 + phi/Pi)^(n - j)), phi = 0 .. Pi/2),

where j - 1 and n - j are positive integers. Now the polynomial factor of the integrand has a root of multiplicity min(j - 1, n - j) at phi = Pi/2, which is at least 1; and since

int((Pi/2 - phi) * tan(phi), phi = 0 .. Pi/2)

is finite, so is the mean.
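As a final numerical confirmation of the finite cases, here is a Python sketch (names mine): substituting u = F(x), with quantile x = -cot(Pi*u), turns the mean into n * binomial(n - 1, j - 1) * int(-cot(Pi*u) * u^(j - 1) * (1 - u)^(n - j), u = 0 .. 1), whose integrand is bounded for 1 < j < n.

```python
import numpy as np
from math import comb, pi

def order_statistic_mean(j, n, grid_size=1_000_001):
    """Mean of the jth order statistic of n StudentT(1) values,
    computed via u = F(x), x = -cot(pi*u); valid for 1 < j < n."""
    u = np.linspace(0.0, 1.0, grid_size)[1:-1]   # open interval (0, 1)
    x = -np.cos(pi * u) / np.sin(pi * u)
    integrand = n * comb(n - 1, j - 1) * x * u**(j - 1) * (1.0 - u)**(n - j)
    du = u[1] - u[0]
    # trapezoidal rule on the uniform grid
    return float(np.sum((integrand[1:] + integrand[:-1]) * du / 2.0))

m24 = order_statistic_mean(2, 4)   # second smallest of four: finite, negative
m34 = order_statistic_mean(3, 4)   # second largest of four: finite, positive
```

Both values come out finite, with m24 = -m34 as the mirror-image identity requires.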