There are certain applications that require computing a random variable with p=0.5. That is, the simulation of a coin toss. Another common use case is inserting elements into a Skip List data structure. When a new element is inserted into a Skip List, a tower height must be calculated for that element. In particular, the height is equivalent to calculating the number of consecutive coin flips that land heads. In mathematical language, the height follows the negative binomial distribution with parameters 1 and 0.5, that is, H ~ NB(1, 0.5).
Utilizing Google Code Search, I’ve found a few Skip List implementations that calculate the height of a new element as follows,
while nextRandomInteger() % 1 == 1 and elementHeight < maxTowerHeight
The problem with the above implementation is that for each iteration of the while loop, we call nextRandomInteger() which we would expect to call some rand(2) function. Computing a random integer is actually a pretty expensive operation, consuming both CPU power and entropy. However, these random functions usually return an integer that is much larger than a single bit and it is possible to take advantage of these extra bits in calculating the element height. The goal is to eliminate the number of calls to rand(2), thus, speeding up the executing of the Skip List insertion.
Let's assume that rand(2) returns an n-bit integer. An n-bit integer has the form, X1, X2, ..., Xn, where an any X can be either a 0 or a 1 and they are all independent. It follows that since each bit is a Bernoulli Trial, a random n-bit integer follows the binomial(n, 0.5) distribution. A binomial distribution and a negative binomial distribution are closely related. Recall that we are interested in modelling a negative binomial distribution to compute the height of our Skip List element. We can accomplish this task by examining each bit of our n-bit integer and counting the number of consecutive "success" bits. Concretely, define a success bit to be a 1 and rand(2) returns an 8-bit integer, say, 01011011, then reading from the right we see, "011", that is, two successes followed by a failure. This is precisely the negative binomial distribution that we are looking to model and the height for this element would be 2.
It's nice to utilize the entire entropy of the pseudo-random number generator! Also worth investigating that by taking various patterns of successive bit configurations we can actually model other binamiol(n, p) distributions.