More on the Lottery: Numerology and Randomness

October 22 2010 John May


The Canadian Lotto649 draws are randomized the old-fashioned way: they are held using a Ryo-Catteau Tulipe ball machine made by a well-respected French company. The draws are video recorded in a secure studio and broadcast live. There is no reason to suspect that these draws are not random, but let us look at some ways we might detect it if they were not.

You could look at the Lottery draws as a generator for a binary sequence as I did in my previous post, but as Robert Israel pointed out in the comments, that encoding can hide some non-random behavior (e.g. if the number 25 appeared in every draw, that encoding would not appear less random).

In this post, we will look at the numbers in the draws directly, and see if we can detect any evidence of non-random behavior.   We will not be able to prove that the numbers are random or non-random, but we will be able to build up a body of evidence one way or another.

Our study of randomness here will be driven by the types of numerological analyses that commonly appear in "How to Beat the Lottery" literature.

Many people advocate trying to beat the lottery by looking at the least and most common numbers in past draws and choosing a ticket based on those. This is, of course, ridiculous, but you can see why it might appeal, since numbers do not appear with equal frequency (light blue indicates frequency as bonus numbers). The solid red line is the expected mean frequency, and the dashed lines mark two standard deviations on either side of it. You can see that a couple of numbers fall outside the two-standard-deviation band.

Compare that to graphs of the same number of random draws created by Maple. Try regenerating several times in the attached worksheet, and you will see that it is not uncommon to randomly generate graphs with four numbers outside the two-standard-deviation range:

So the number frequency data does not provide good evidence against randomness.
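
For readers without Maple, here is a rough Python sketch of the same experiment (it is not the worksheet code). It assumes each number's frequency over n draws is binomial with p = 6/49, ignores the bonus ball, and uses a made-up draw count of 2500 rather than the real history length. The function names are my own.

```python
import math
import random

NUMBERS = range(1, 50)   # balls 1..49
PICKS = 6                # main numbers per draw (bonus ball ignored here)

def expected_band(n_draws):
    """Return (mean, low, high) for one number's frequency over n_draws.

    A given number is in a draw with probability 6/49, so its count
    is binomial; low and high sit two standard deviations from the mean.
    """
    p = PICKS / 49
    mean = n_draws * p
    sd = math.sqrt(n_draws * p * (1 - p))
    return mean, mean - 2 * sd, mean + 2 * sd

def count_outliers(n_draws, rng):
    """Simulate n_draws and count the numbers whose frequency leaves the band."""
    freq = {k: 0 for k in NUMBERS}
    for _ in range(n_draws):
        for k in rng.sample(NUMBERS, PICKS):
            freq[k] += 1
    _, low, high = expected_band(n_draws)
    return sum(1 for f in freq.values() if f < low or f > high)

if __name__ == "__main__":
    rng = random.Random(649)   # fixed seed so reruns are repeatable
    n = 2500                   # hypothetical draw count, an assumption
    print("mean, low, high:", expected_band(n))
    print("outliers in 10 simulated histories:",
          [count_outliers(n, rng) for _ in range(10)])
```

Running the simulation a few times with different seeds gives a feel for how many numbers stray outside the band purely by chance.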

There is a strategy of looking for patterns in the balance of even and odd numbers in the winners. If you are choosing random numbers from 1 to 49, you expect the number of even numbers in a draw to be slightly less than half, on average, since only 24 of the 49 numbers are even, and that is in fact what we see.

The black line indicates the expected value in each case. The graph on the left is from the Lotto649 data, the one on the right from PRNG draws in Maple (generate more in the attached worksheet).  So, clearly this does not give us any evidence against randomness.

On the other hand, a numerologist might use this information to reject lotto tickets with 1 even number or 5 even numbers as being unlikely to win. But in fact, if the draws are random, any given ticket is equally likely to be drawn; there are just far fewer tickets with 1 even number than there are tickets with 3 even numbers. Try it out in the modified Historical Simulator in the attached worksheet.
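
The ticket counts behind that argument are easy to check. The sketch below is Python, not the worksheet code, and the names are my own. It counts how many of the C(49,6) possible tickets contain exactly k even numbers, using the fact that 1 to 49 holds 24 even and 25 odd numbers.

```python
from math import comb

EVENS, ODDS, PICKS = 24, 25, 6    # 1..49 holds 24 even and 25 odd numbers
TOTAL = comb(49, PICKS)           # number of possible tickets

def tickets_with_evens(k):
    """Number of tickets with exactly k even numbers (and 6 - k odd ones)."""
    return comb(EVENS, k) * comb(ODDS, PICKS - k)

def expected_evens():
    """Average number of even numbers on a uniformly random ticket."""
    return sum(k * tickets_with_evens(k) for k in range(PICKS + 1)) / TOTAL

if __name__ == "__main__":
    for k in range(PICKS + 1):
        n = tickets_with_evens(k)
        print(k, n, round(n / TOTAL, 4))
    print("expected evens per draw:", expected_evens())
```

Every ticket has the same 1-in-C(49,6) chance of being drawn; the table only shows how many tickets fall in each class.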

Another place numerologists look for patterns is in the distribution of sums of the numbers in the winning draws. Again, as with even-number frequency, the lotto numbers give the distribution of sums we would expect from a random sampling of draws (actual Lotto data on the left, simulated PRNG lotto on the right):

Numerologists use this information to reject lotto tickets with high or low ticket sums. But assuming randomness, such filtering will not improve your odds of winning. The winning draw is more likely to have a sum close to 150 (the expected sum, 6 × 25), but that's because most tickets have sums close to 150. Any individual ticket with sum 150 is no more likely to be drawn than a given ticket with sum 100.
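
To see how lopsided the ticket counts are, here is a small Python sketch (my own, not from the worksheet) that counts the tickets for every possible sum with a standard subset-counting dynamic program. It then compares the count for a sum of 100 with the count at the peak of the distribution.

```python
from math import comb

def tickets_by_sum(n_max=49, picks=6):
    """Return a list c where c[s] is the number of tickets whose numbers sum to s."""
    max_sum = sum(range(n_max - picks + 1, n_max + 1))   # largest possible sum
    # ways[j][s] = number of j-element subsets of the balls seen so far with sum s
    ways = [[0] * (max_sum + 1) for _ in range(picks + 1)]
    ways[0][0] = 1
    for ball in range(1, n_max + 1):
        # Go from larger j to smaller so each ball is used at most once.
        for j in range(picks, 0, -1):
            for s in range(ball, max_sum + 1):
                ways[j][s] += ways[j - 1][s - ball]
    return ways[picks]

if __name__ == "__main__":
    counts = tickets_by_sum()
    peak = max(range(len(counts)), key=counts.__getitem__)
    print("most common sum:", peak, "tickets:", counts[peak])
    print("tickets with sum 100:", counts[100])
    print("total tickets:", sum(counts), "=", comb(49, 6))
```

The winning sum lands near the peak more often only because far more tickets live there.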

Looking for patterns in the highest and lowest numbers drawn isn't any more productive than looking at sums or evens and odds. There are many more possible draws with 49 as the largest number than there are draws with 25 as the largest number. Thus it is no surprise that draws of the former type appear more often than the latter:

The curves indicate the expected number of draws in each case.  The graph again suggests the data is random.  To be more thorough, you could make a binary sequence from above/below expected values and perform a Wald-Wolfowitz runs test on it.
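
A minimal Python sketch of a Wald-Wolfowitz runs test follows, using the usual normal approximation for the number of runs. How to turn the draws into a binary sequence is left open above, so the demo uses one arbitrary choice (whether the sum of a simulated draw exceeds 150). That choice, the seed, and the function name are all my own assumptions.

```python
import math
import random

def runs_test(bits):
    """Wald-Wolfowitz runs test with a normal approximation.

    Returns (runs, z, p) where p is the two-sided p-value. Too few runs
    (z < 0) suggests clumping; too many (z > 0) suggests alternation.
    """
    bits = [bool(b) for b in bits]
    n1 = sum(bits)
    n2 = len(bits) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("sequence must contain both symbols")
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n * n * (n - 1))
    z = (runs - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    return runs, z, p

if __name__ == "__main__":
    rng = random.Random(649)
    # Illustrative encoding only: 1 if a simulated draw's sum is above 150.
    bits = [sum(rng.sample(range(1, 50), 6)) > 150 for _ in range(1000)]
    print(runs_test(bits))
```

A large p-value here means the run count is unremarkable; like everything else in this post, it is evidence rather than proof.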

Numerologists use the information in this graph to reject lotto tickets with low highest numbers or high lowest numbers. But again, such filtering will not improve your odds of winning. The winning draw is more likely to have a highest number in the 40s, but that's because most tickets have their highest number in the 40s. Any individual ticket with a highest number in the 40s is no more likely to be drawn than a given ticket with a highest number in the 20s.
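
The counts behind the highest-number and lowest-number curves have a simple closed form: a ticket with highest number m picks its other five numbers from the m - 1 smaller ones, and a ticket with lowest number m picks them from the 49 - m larger ones. A short Python sketch (names are mine; the draw count passed to `expected_draws` is whatever history length you have):

```python
from math import comb

TOTAL = comb(49, 6)   # number of possible tickets

def tickets_with_highest(m):
    """Tickets whose largest number is m: the other five come from 1..m-1."""
    return comb(m - 1, 5)

def tickets_with_lowest(m):
    """Tickets whose smallest number is m: the other five come from m+1..49."""
    return comb(49 - m, 5)

def expected_draws(ticket_count, n_draws):
    """Expected number of draws, out of n_draws, that fall in a class of tickets."""
    return n_draws * ticket_count / TOTAL

if __name__ == "__main__":
    for m in (25, 40, 49):
        print(m, tickets_with_highest(m), round(tickets_with_highest(m) / TOTAL, 5))
```

Comparing the 25 and 49 rows shows how many more tickets top out at 49, which is all the graph is telling us.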

Yet another numerological obsession is looking for patterns in the number of sequential (adjacent) numbers in the draws. As you might guess from the above, most draws will have few sequential numbers, but so will most possible tickets. So it doesn't really improve your odds to avoid sequentials. To break up the bar graph monotony, here is a pie chart of the frequency of sequentials in winning lotto draws versus randomly generated draws. As you can see, the Lottery draws are consistent with random draws (actual Lotto data on the left, simulated PRNG lotto on the right).
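
As a quick check on the sequentials claim, the Python sketch below counts adjacent pairs in a ticket and uses the standard formula C(n - k + 1, k) for the number of k-number tickets from 1..n with no two adjacent numbers. Counting adjacent pairs is my own assumed definition of "sequentials"; the worksheet may count them differently.

```python
from itertools import combinations
from math import comb

def adjacent_pairs(ticket):
    """Number of pairs (a, a + 1) that both appear on the ticket."""
    s = sorted(ticket)
    return sum(1 for a, b in zip(s, s[1:]) if b - a == 1)

def tickets_without_adjacent(n_max=49, picks=6):
    """Count tickets with no two adjacent numbers, via C(n - k + 1, k)."""
    return comb(n_max - picks + 1, picks)

def brute_force_without_adjacent(n_max, picks):
    """Same count by enumeration; only sensible for small n_max."""
    return sum(1 for t in combinations(range(1, n_max + 1), picks)
               if adjacent_pairs(t) == 0)

if __name__ == "__main__":
    none = tickets_without_adjacent()
    total = comb(49, 6)
    print("tickets with no adjacent numbers:", none, round(none / total, 4))
    print("tickets with at least one adjacent pair:", total - none)
```

Roughly half of all tickets have no adjacent numbers at all, so avoiding sequentials rules out a lot of tickets without making any remaining one more likely to win.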

Now, to emphasize, none of this is proof that the Lottery is random; rather, it is all evidence for randomness. Taken collectively and in context, I think it is pretty convincing.

Here is this post in the form of a fancy worksheet with a new version of the Lottery Historical Simulator that lets you filter your random ticket for the numerological properties mentioned here:
