Testing Randomness with Compression

Ask a Question

Create a Post

Testing Randomness with Compression

Posted: John May 3081 Product: Maple

August 03 2010

Continuing on in this series of posts, here is a way to test the randomness of a sequence of bits from a PRNG that is the appropriate to the first morning back after the August long weekend. It is a very fast, and not very formal test done by checking how well a sequence compresses. This is really easy in Maple 14, with the new commands ?StringTools:-Compress and StringTools:-Uncompress which use zlib. Here is a simple procedure to take a list of ones and zeroes and return (approximately) how compressible it is:

Compressibility := proc(L::list, $)
local byteArray, compressedArray, compression;
    byteArray := Array(1..floor(nops(L)/8),
        i->Bits:-Join([seq(L[j],j=8*(i-1)+1..8*(i-1)+8)])-128,
        datatype=integer[1]);
    compressedArray := StringTools:-Compress(byteArray);
    compression := (rtable_num_elems(compressedArray)-9) / rtable_num_elems(byteArray);
    return  evalf(compression);
end proc: #Compressibility

It simply packs the bits into one byte signed integers (Maple type integer[1]) and then compresses. A fudge factor of 9 is subtracted from the length of the compresses array to take into account the zlib header and then a ratio is computed. An answer greater than 1 means the input was uncompressible. An answer less than one likely means the input is likely not random - the smaller the number, the less likely.

(**) Compressibility([0$10000]);
#                         0.007200000000
(**) Compressibility([seq(op([1,0]),i=1..5000)]);
#                         0.008000000000

(**) r1 := rand(0..1):
(**) L1 := [seq(r1(), i=1..10000)]:
(**) Compressibility(L1);
#                          1.001600000

Now for the bad PRNGs from the previous post:

(**) r2a := rand(0..2): r2 := ()-> r2a() mod 2:
(**) L2 := [seq(r2(), i=1..10000)]:
(**) Compressibility(L2);
#                          0.9800000000

(**) LCState := 1: r3 := proc() global LCState;
(**)    LCState := irem(89853*LCState, 50027);
(**)    irem(LCState, 2); end proc:
(**) L3 := [seq(r3(), i=1..10000)]:
(**) Compressibility(L3);
#                          0.3632000000

L2 compresses only a little bit since it is random, just not uniform (0 appears twice as often as 1). L3 compresses quite a lot since even though it has greater entropy than L2 ( 7.485 vs. 7.197 as byte sequences ) it is actually periodic. So, zlib compressibility ends up encoding not just entropy / byte frequency but it can also detect longer periodic behavior. Entropy is often described intuitively as the compressibility of a sequence/string, but since it is computed from a character by character frequency, it really only describes compressibility by Huffman style encoding. The DEFLATE algorithm of zlib includes both Lempel-Ziv and Huffman encoding, so it can acheive better compression than the entropy might suggect if there are large repeated subsequences. If you take the sequence that is just two copies of L2, it will have exactly the same entropy, but it will compress twice as well:

(**) Compressibility([op(L2),op(L2)]);
#                          0.4948000000

Finally, here is a worksheet that computes compressibility as well as entropy and byte frequency counts Compressibility.mw

10098 views
More...

Contact Author

Subscribe

Generate PDF

Comment

Share

Branch

Create a new Post in reply

Ask a related Question

Flag

Add a Comment

Share via:

E-Mail Address:
Password:
Remember Me:	Automatically sign in on future visits

E-Mail Address:
Password:
Remember Me:	Automatically sign in on future visits

Ask a Question

Create a Post

Testing Randomness with Compression

10098 views
More...

Contact Author

Subscribe

Generate PDF

Comment

Share

Branch

Create a new Post in reply

Ask a related Question

Flag

Preview

Preview

Save this setting as your default sorting preference?

Ask a Question

Create a Post

Generating PDF…

Save this setting as your default sorting preference?
Note: You can change your preference any time in your account settings
Don't show this again

From:
To:

Custom Message (optional):

Ask a Question

Create a Post

Testing Randomness with Compression

10098 views More... Contact Author Subscribe Generate PDF Comment Share Branch Create a new Post in reply Ask a related Question Flag

Preview

Preview

Save this setting as your default sorting preference?

Ask a Question

Create a Post

Generating PDF…

10098 views
More...

Contact Author

Subscribe

Generate PDF

Comment

Share

Branch

Create a new Post in reply

Ask a related Question

Flag