Overview
Some Random Number Generators
are not very good at producing a stream of numbers which appear to come in
random order.
Even ones that look random to the eye can have subtle
patterns in long sequences. These tests have been designed to ferret
out such failures.
A generator with a short period is a bad generator.
Also, a function could have localized loops into which it could fall,
yielding a small number of outputs it produces over and over again.
This also would be a bad generator.
Frequency
The Frequency test checks that the numbers produced are evenly
distributed, that there are no pileups or troughs at certain numbers.
The test just counts how many times a particular number appears and
compares this to the appearance rate of all the other numbers.
Each word produced by the generator is broken up into bytes,
and each byte is the test unit for the actual test.
Serial
The Serial test checks that two successive numbers in the series do
not have correlations.
The test counts how many times each pair of numbers appears and
compares this to the appearance rate of all the other pairs of numbers.
Each word produced by the generator is broken up into
short integer half-words, which are taken as "pairs of bytes".
Gap
The Gap test counts how many numbers come before one which lies within
a particular range, [A,B).
Each word produced by the generator is broken up into bytes,
and each byte is the test unit for the actual test.
The range chosen for these tests was [100,200).
Poker
The Poker test accumulates the occurrence rate of one-pair, two-pairs,
three-of-a-kind, and four-of-a-kind in a four-unit "hand".
Each half-word is taken as a "hand" in this test.
Each half-word is broken up into four 4-bit "cards",
and these are compared to each other to determine the relations.
The probabilities are a little different than a normal 5-card poker
hand, but this fits better into the structure of computer data.
Coupon
This test counts how many numbers must be seen to complete a whole
set.
This particular implementation breaks up each word into bytes,
and the bottom three bits of each byte form the unit of the
cataloging.
Permutation
This Permutation test divides the input data into a stream of bytes,
and it considers 5 bytes at a time.
It catalogs the ordering of the 5 numbers.
The ordering is
whether one number is bigger or smaller than the previous.
There are 5! possible
arrangements of ordering for these five bytes.
In the end, each ordering should be equally probable.
Runs
The Run test catalogs the numbers of times successive numbers are
increasing, or "running up".
This particular implementation breaks up the words into half-words and
counts the run-ups.
Maximum-t
The Maximum-t test catalogs the maximum value seen in groupings of numbers.
This particular implementation breaks up the words into bytes,
and it catalogs the maximum byte seen in groupings of 8 bytes.
These occurrence rates are known for a purely random distribution,
and the measurements are compared to these "pure" values.
M-Tuple
The M-Tuple test catalogs correlations in neighboring words.
This particular implementation breaks up the words into bytes,
and it uses the bottom 3-bits of each byte as the unit of comparison.
It accumulates statistics on neighboring bytes and also on neighboring
three bytes.
Birthday Spacings
The "Birthday Spacings" test catalogs the distances between neighboring
words. It counts how many times, in a group of numbers, there are any
equal spacings. In a large enough group of numbers, there should be
one or two.
We take specifics from Knuth, page-71.
If we take 512 "random" numbers between 0 and 2^25,
there should be equal spacings with the following probabilities:
0 equal, probability 0.368801577
1 equal, probability 0.379035243
2 equal, probability 0.183471182
>2 equal, probability 0.078691997
|