In this series of blog posts, I have picked on Baseball win-loss records already. Looking for other sources of things that might or might not be random, I decided to look at lottery draws. Since I live in Canada, the obvious lottery to look at is the national Lotto 6/49.

A lotto 6/49 draw consists of drawing 6 numbered balls from a pool of 49 balls. So, each draw can be described as set of 6 unique numbers between 1 and 49. The October 6th draw was: {10, 20, 24, 32, 37, 47} or, if we include the 7th bonus ball (drawn from the same pool) {10, 20, 24, 32, 37, 47, 49}.

The main problem here is that lottery draws are sequences of combinations, and the tools we have looked at are all designed to analyze sequences of bits. Fortunately this is not an insurmountable problem and we will handle it in two steps:

- convert combinations into a dice rolls
- convert dice rolls into bit sequences

To accomplish step 1, we can use a tool called a combinatorial number system which is basically, a 1 to 1 conversion between k-combinations of n items and the numbers between 1 and _{n}C_{k}. So, we will convert the 7 numbers of a 649 pick into a single number between 1 and 85,900,584 (i.e. 1 roll of an 85,900,584 sided die) using the formula

where c is the sorted list of numbers in the pick, or the procedure:

pick2roll := proc(pick)
local c;
c := sort([op(pick)]);
1 + binomial(c[1]-1, 1) + binomial(c[2]-1, 2) + binomial(c[3]-1, 3)
+ binomial(c[4]-1, 4) + binomial(c[5]-1, 5) + binomial(c[6]-1, 6)
+ binomial(c[7]-1, 7);
end proc;

As you might expect: {1,2,3,4,5,6,7} transforms to 1, and {43,44,45,46,47,48,49} transforms to 85,900,584.

(**) pick2roll({1,2,3,4,5,6,7});
1
(**) pick2roll({43,44,45,46,47,48,49});
85900584
(**) pick2roll({10, 20, 24, 32, 37, 47, 49});
83406300

For step 2, we would be in luck if 85,900,584 were a power of 2 since then we could just convert the numbers from the die rolls into base-2 numbers. We are not that lucky, however: 2^{26}< 85,900,584 < 2^{27}. Since it is not a power of 2, the bits from the binary representation of the rolls is not going to be uniformly random.

In order to get a uniform sequence, we can use a simple algorithm (callled Q_{2) }in the nice paper "Turning Loaded Dice into Fair Coins" from Juels et al at RSA Labs. This is the algorithm that turns fair dice rolls into variable numbers of fair coin flips:

roll2bits := proc(size, roll)
local S, R, k, i;
S := Bits:-Split(size);
k := nops(S);
R := Bits:-Split(roll-1, bits=k );
for i from k to 2 by -1 do
if S[i] = 1 and R[i] = 0 then
return op(1..(i-1), R);
end if;
end do;
return NULL;
end proc;

It will return up to 26 bits, will return fewer in many cases:

(**) roll2bits(85900584, 83406300);
1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1
(**) roll2bits(85900584, 1);
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
(**) roll2bits(85900584, 85900584);
1, 1, 1

Now we can apply some randomness tests to these bits. Attached to this post is a list of the 2768 Lotto 6/49 draws up to the end of July 2010. Compose the two procedures above to create a sequence of bits from the data:

(**) numdraws := combinat:-numbcomb(49,7);
85900584
(**) L := [ seq(roll2bits(numdraws, pick2roll( Lotto649[i][2] union {Lotto649[i][3]} ) ), i=1..nops(Lotto649) ) ]:

Then run the various tests:

(**) WaldWolfowitz(L);
WaldWolfowitz: sequence contains 35036 zeros and 35380 ones

WaldWolfowitz: sequence has 35331 runs

WaldWolfowitz: expected number of runs is 35208

WaldWolfowitz: the variance is 17602.91

WaldWolfowitz: the statistic for this sequence is is 0.925867

true, 0.3545152098

(**) seq( print([i,BinaryRank(L, dimension=i)]), i=4..floor( abs( solve(nops(L)/N^2 = 100)[1] ) ) );

[4, true, 0.5324021830]

[5, true, 0.3726492307]

[6, true, 0.5890016206]

[7, true, 0.1364141942]

[8, true, 0.4584857288]

[9, true, 0.3621306040]

[10, true, 0.2110288005]

[11, true, 0.6746907632]

[12, true, 0.7071276847]

[13, true, 0.0632322524]

[14, true, 0.4183305750]

[15, true, 0.4562036804]

[16, true, 0.7009441987]

[17, true, 0.3078522810]

[18, false, 0.0366504341]

[19, true, 0.7711336501]

[20, true, 0.6557501007]

[21, true, 0.7494236194]

[22, true, 0.6888584700]

[23, true, 0.1233406278]

[24, true, 0.2614664627]

[25, true, 0.8872760280]

[26, true, 0.1929215151]

(**) Compressibility(L);

1.000227221

(**) for i from 1 to floor( solve(1/2^N*nops(L)=10,N) ) do
print( nprintf("-------------------- %2d ---------------------", i) );
SequenceFrequency(L, size=i);
end do;
-------------------- 1 ---------------------

entropy = 0.9999827846

true, 0.1948544208

-------------------- 2 ---------------------

entropy = 1.999956514

true, 0.2373671944

-------------------- 3 ---------------------

entropy = 2.999912072

true, 0.2843759056

-------------------- 4 ---------------------

entropy = 3.999864364

true, 0.5845824329

-------------------- 5 ---------------------

entropy = 4.999710009

true, 0.6046551079

-------------------- 6 ---------------------

entropy = 5.999359751

true, 0.4970121096

-------------------- 7 ---------------------

entropy = 6.998736751

true, 0.5825065003

-------------------- 8 ---------------------

entropy = 7.997404833

true, 0.5339163223

-------------------- 9 ---------------------

entropy = 8.994717036

true, 0.4612282161

-------------------- 10 ---------------------

entropy = 9.989533632

true, 0.5826251531

-------------------- 11 ---------------------

entropy = 10.97825459

true, 0.2157023699

-------------------- 12 ---------------------

entropy = 11.95585547

true, 0.0564462352

(**) MonteCarloPi(L);

(**) Correlogram(L);

(**) VisualizeBits(L);

Except for the 18x18 binary rank test, all tests suggest that the lottery draws are random. This shouldn't be particularly surprising, given the money involved a great deal of effort goes into assuring the soundness of the drawing process.

If we hadn't used the roll2bits procedure and instead had just used Bits:-Split to get 26 bits from each Lottery draw, the results actually look pretty similar, except for the results of SequenceFrequency which are clearly show the non-uniformity. As always, here is a worksheet with the computations: LottoRandomnessTest.mw. The code for the various randomness tests are in the start up code.

And here is the data: Data649.txt