I need a function to generate random integers (assume the Java long type for now, but this will be extended to BigInteger or BitSet later).
Here's another variant of Michael Anderson's answer.
To avoid recursion, we process the bits of P iteratively from right-to-left instead of recursively from left-to-right. This would be tricky in floating-point representation so we extract the exponent/mantissa fields from the binary representation instead.
import java.util.Random;

class BitsWithProbabilityHelper {
    public BitsWithProbabilityHelper(float prob, Random rnd) {
        if (Float.isNaN(prob)) throw new IllegalArgumentException();
        this.rnd = rnd;
        if (prob <= 0f) {
            zero = true;
            return;
        }
        // Decode IEEE float (prob is positive here, so the sign bit is 0)
        int probBits = Float.floatToIntBits(prob);
        mantissa = probBits & 0x7FFFFF;
        exponent = probBits >>> 23;
        // Restore the implicit leading 1; denormals have none and share
        // the scale of exponent field 1
        if (exponent > 0) mantissa |= 0x800000;
        else exponent = 1;
        exponent -= 150;
        // Force mantissa to be odd
        int ntz = Integer.numberOfTrailingZeros(mantissa);
        mantissa >>= ntz;
        exponent += ntz;
    }

    /** Determine how many random words we need from the system RNG to
     * generate one output word with probability P.
     **/
    public int iterationCount() {
        return -exponent;
    }

    /** Generate a random number with the desired probability */
    public long nextLong() {
        if (zero) return 0L;
        long acc = -1L;
        // Bits of (mantissa - 1), read right-to-left: 0 means AND, 1 means OR
        int shiftReg = mantissa - 1;
        for (int bit = exponent; bit < 0; ++bit) {
            if ((shiftReg & 1) == 0) {
                acc &= rnd.nextLong();
            } else {
                acc |= rnd.nextLong();
            }
            shiftReg >>= 1;
        }
        return acc;
    }

    /** Value of <code>prob</code>, represented as m * 2**e where m is always odd. */
    private int exponent;
    private int mantissa;
    /** Random data source */
    private final Random rnd;
    /** Zero flag (special case) */
    private boolean zero;
}
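To make the float-decoding step easier to follow, here is a minimal standalone sketch of it. The class name FloatDecodeSketch and the method decode are made up for illustration and are not part of the class above. It assumes a positive, finite float and returns the odd mantissa m and the exponent e such that prob = m * 2**e; the value -e is then the number of random words one output word would consume.

```java
class FloatDecodeSketch {
    /** Returns {m, e} with prob == m * 2^e and m odd; assumes prob > 0 and finite. */
    static int[] decode(float prob) {
        int bits = Float.floatToIntBits(prob);
        int mantissa = bits & 0x7FFFFF;      // 23 stored fraction bits
        int exponent = bits >>> 23;          // sign bit is 0 for positive input
        if (exponent > 0) {
            mantissa |= 0x800000;            // normal number: restore implicit leading 1
        } else {
            exponent = 1;                    // subnormal: no implicit 1, scale of field value 1
        }
        exponent -= 150;                     // remove bias (127) and the 23 fraction bits
        int ntz = Integer.numberOfTrailingZeros(mantissa);
        return new int[] { mantissa >> ntz, exponent + ntz };
    }

    public static void main(String[] args) {
        for (float p : new float[] { 0.5f, 0.375f, 0.1f }) {
            int[] me = decode(p);
            System.out.println(p + " = " + me[0] + " * 2^" + me[1]);
        }
    }
}
```

For example, decode(0.375f) gives m = 3 and e = -3, so three random words are combined for that probability.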
If you want a distribution where each bit is 1 with probability P and 0 with probability 1-P, your best bet is simply to generate each bit independently with probability P of being a 1 (that sounds like a recursive definition, I know).
Here's a solution; I'll walk through it below:
import java.util.Random;

public class MyRandomBitGenerator
{
    Random pgen = new Random();

    // assumed p is well conditioned (0 < p < 1)
    public boolean nextBitIsOne(double p){
        return pgen.nextDouble() < p;
    }

    // assumed p is well conditioned (0 < p < 1)
    public long nextLong(double p){
        long nxt = 0;
        for(int i = 0; i < 64; i++){
            if(nextBitIsOne(p)){
                // 1L, not 1: an int shift would wrap around for i >= 32
                nxt |= 1L << i;
            }
        }
        return nxt;
    }
}
Basically, we first determine how to generate a value of 1 with probability P: pgen.nextDouble() generates a number in [0, 1) with uniform probability, so by asking whether it's less than p we sample that distribution in such a way that, over many calls, we expect a fraction p of the results to be 1.
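As a rough sanity check of that claim, the following sketch counts how many bits come out as 1 over many generated words. The class and method names (BitFrequencySketch, observedFraction) and the fixed seed are assumptions chosen for the illustration; the per-bit rule itself is the one described above.

```java
import java.util.Random;

class BitFrequencySketch {
    /** Builds a long whose bits are each 1 with probability p, one draw per bit. */
    static long nextLong(Random rnd, double p) {
        long result = 0L;
        for (int i = 0; i < 64; i++) {
            if (rnd.nextDouble() < p) {
                result |= 1L << i;   // long shift, so bits 32..63 are reachable
            }
        }
        return result;
    }

    /** Fraction of 1 bits observed over the given number of 64-bit samples. */
    static double observedFraction(double p, int samples, long seed) {
        Random rnd = new Random(seed);
        long ones = 0;
        for (int s = 0; s < samples; s++) {
            ones += Long.bitCount(nextLong(rnd, p));
        }
        return ones / (64.0 * samples);
    }

    public static void main(String[] args) {
        // Should print a value close to 0.3
        System.out.println(observedFraction(0.3, 10_000, 42L));
    }
}
```

The cost is the point to notice: every output word takes 64 calls to nextDouble().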
Here's how I solved it in the end.
This was partly inspired by Ondra Žižka's answer.
The benefit is that it reduces the number of calls to Random.nextLong() to 8 calls per 64 bits of output.
For comparison, rolling for each individual bit would require 64 calls. Bitwise AND/OR uses between 2 and 32 calls depending on the value of P.
Of course calculating binomial probabilities is just as expensive, so those go in another lookup table.
It's a lot of code, but it's paying off in terms of performance.
Update - merged this with the bitwise AND/OR solution. It now uses that method if it guesses it will be more efficient (in terms of calls to Random.next()).
Suppose the size of the bit array is L. If L=1, the chance that the first bit is 1 is P, and the chance that it is 0 is 1-P. For L=2, the probability of getting 00 is (1-P)^2, 01 or 10 is P(1-P) each, and 11 is P^2. Extending this logic, we can first determine the first bit by comparing a random number with P, then rescale the random number so that it again lies anywhere between 0 and 1. Sample JavaScript code:
function getRandomBitArray(maxBits, probabilityOf1) {
    var randomSeed = Math.random();
    var bitArray = [];
    for (var currentBit = 0; currentBit < maxBits; currentBit++) {
        if (randomSeed < probabilityOf1) {
            // first part of the interval: fill 1 at current bit
            bitArray.push(1);
            // rescale the random number from [0, probabilityOf1)
            // back to [0, 1)
            randomSeed = randomSeed / probabilityOf1;
        }
        else {
            // second part of the interval: fill 0 at current bit
            bitArray.push(0);
            // rescale from [probabilityOf1, 1) back to [0, 1)
            randomSeed = (randomSeed - probabilityOf1) / (1 - probabilityOf1);
        }
    }
    return bitArray;
}
EDIT: This code does generate completely random bits. I will try to explain the algorithm better.
Each bit string has a certain probability of occurring. Suppose a string has a probability of occurrence p; we want to choose that string if our random number falls in some interval of length p. The starting point of the interval must be fixed, but its value will not make much difference. Suppose we have chosen up to k bits correctly. Then, for the next bit, we divide the interval corresponding to this k-length bit string into two parts with sizes in the ratio P:1-P (here P is the probability of getting a 1). We say that the next bit will be 1 if the random number is in the first part, 0 if it is in the second part. This ensures that the probabilities of strings of length k+1 also remain correct.
Java code:
public ArrayList<Boolean> getRandomBitArray(int maxBits, double probabilityOf1) {
    double randomSeed = Math.random();
    ArrayList<Boolean> bitArray = new ArrayList<Boolean>();
    for (int currentBit = 0; currentBit < maxBits; currentBit++) {
        if (randomSeed < probabilityOf1) {
            // first part of the interval: fill 1 at current bit
            bitArray.add(true);
            // rescale the random number from [0, probabilityOf1)
            // back to [0, 1)
            randomSeed = randomSeed / probabilityOf1;
        }
        else {
            // second part of the interval: fill 0 at current bit
            bitArray.add(false);
            // rescale from [probabilityOf1, 1) back to [0, 1)
            randomSeed = (randomSeed - probabilityOf1) / (1 - probabilityOf1);
        }
    }
    return bitArray;
}
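To trace the interval-splitting rule by hand, it helps to pass the random number in as a parameter instead of drawing it. The sketch below does that, following the convention from the explanation (first part of the interval means 1). The names IntervalSplitSketch and bitsFrom are hypothetical, and it assumes 0 < p1 < 1.

```java
import java.util.ArrayList;
import java.util.List;

class IntervalSplitSketch {
    /** Derives maxBits bits from one value r in [0, 1); true means 1. Assumes 0 < p1 < 1. */
    static List<Boolean> bitsFrom(double r, int maxBits, double p1) {
        List<Boolean> bits = new ArrayList<>();
        for (int i = 0; i < maxBits; i++) {
            if (r < p1) {
                bits.add(true);
                r = r / p1;                  // stretch [0, p1) back to [0, 1)
            } else {
                bits.add(false);
                r = (r - p1) / (1 - p1);     // stretch [p1, 1) back to [0, 1)
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // r: 0.125 -> 0.5 -> 0.333... -> 0.111...
        System.out.println(bitsFrom(0.125, 4, 0.25)); // [true, false, false, true]
    }
}
```

Every bit here is derived from a single double, which carries 53 significant bits, so one random number cannot drive an arbitrarily long output; longer arrays would need fresh random input along the way.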
First, a little ugly math that you're already using in your code.
Let x and y be independent bits whose probabilities of being 1 are X = p(x=1) and Y = p(y=1) respectively. Then we have that
p( x & y = 1) = X Y
p( x | y = 1) = 1 - (1-X) (1-Y)
p( x ^ y = 1) = X (1 - Y) + Y (1 - X)
Now if we let Y = 1/2 we get
p( x & y = 1) = X/2
p( x | y = 1) = (X+1)/2
Now set the right-hand side to the probability p we want, and we have two cases that we can solve for X
X = 2 p // if we use &
X = 2 p - 1 // if we use |
Next we assume we can use this again to obtain X in terms of another variable Z... And then we keep iterating until we've done "enough".
That's a bit unclear, but consider p = 0.375
0.375 * 2 = 0.75 < 1.0 so our first operation is &
0.75 * 2 = 1.5 > 1.0 so our second operation is |
1.5 - 1 = 0.5 is something we know so we stop.
Thus we can get a variable with p=0.375 by X1 & (X2 | X3 )
The problem is that for most values of p this will not terminate, e.g.
0.333 * 2 = 0.666 < 1.0 so our first operation is &
0.666 * 2 = 1.333 > 1.0 so our second operation is | (leaving 1.333 - 1 = 0.333)
0.333 * 2 = 0.666 < 1.0 so our third operation is &
etc...
so p=0.333 can be generated by
X1 & ( X2 | (X3 & (X4 | ( ... ) ) ) )
Now I suspect that taking enough terms in the series will give you enough accuracy, and this can be written as a recursive function. However, there might be a better way than that too... I think the order of the operations is related to the binary representation of p, I'm just not sure exactly how... and don't have time to think about it deeper.
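One way to look at the doubling rule: multiplying p by 2 shifts its binary expansion one place to the left, so each step reads off one binary digit of p, with digit 0 giving & and digit 1 giving |. For 0.375 = 0.011 in binary, the digits 0 and 1 give & and |, and the final 1 is the plain random word. The sketch below derives the operator sequence and then recomputes the probability that the sequence actually produces. The names AndOrPlanSketch, operators and achieved are invented for this illustration.

```java
class AndOrPlanSketch {
    /** Operator sequence, outermost first, stopping early once p reaches exactly 1/2. */
    static String operators(double p, int maxDepth) {
        StringBuilder ops = new StringBuilder();
        for (int depth = 0; depth < maxDepth && p != 0.5; depth++) {
            if (p < 0.5) {
                ops.append('&');
                p = 2 * p;        // solve X/2 = p for X
            } else {
                ops.append('|');
                p = 2 * p - 1;    // solve (X+1)/2 = p for X
            }
        }
        return ops.toString();
    }

    /** Probability of a 1 bit when the innermost operand is a fair random word. */
    static double achieved(String ops) {
        double x = 0.5;
        for (int i = ops.length() - 1; i >= 0; i--) {
            x = (ops.charAt(i) == '&') ? x / 2 : (x + 1) / 2;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(operators(0.375, 10));            // &|
        System.out.println(achieved(operators(0.375, 10)));  // 0.375
        System.out.println(achieved(operators(1.0 / 3, 10)));
    }
}
```

With a depth limit of 10 the result for p = 1/3 is within about 2^-11 of the target, which gives a feel for how the truncation depth trades accuracy against the number of random words used.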
Anyway, here's some untested C++ code that does this. You should be able to javaify it easily.
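#include <cstdint>
#include <random>

// Assumed stand-in for the random source: a uniform 32-bit word,
// i.e. each bit is 1 with probability 1/2
uint32_t randbits()
{
    static std::mt19937 gen{ std::random_device{}() };
    return static_cast<uint32_t>( gen() );
}

uint32_t bitsWithProbabilityHelper( float p, float tol, int cur_depth, int max_depth )
{
    uint32_t X = randbits();
    if( cur_depth >= max_depth ) return X;
    if( p < 0.5f - tol )
    {
        // p(x & y = 1) = X/2, so recurse with 2p
        return X & bitsWithProbabilityHelper( 2*p, tol, cur_depth+1, max_depth );
    }
    if( p > 0.5f + tol )
    {
        // p(x | y = 1) = (X+1)/2, so recurse with 2p - 1
        return X | bitsWithProbabilityHelper( 2*p-1, tol, cur_depth+1, max_depth );
    }
    return X;
}

uint32_t bitsWithProbability( float p )
{
    return bitsWithProbabilityHelper( p, 0.001f, 0, 10 );
}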
Distribute a proportional number of bits throughout the number. Pseudocode:
long generateNumber( double probability ){
    int bitCount = (int)( 64 * probability ); // truncates toward zero
    byte[] data = new byte[64]; // 0-filled
    long indexes = getRandomLong();
    for( int n = 0; n < bitCount; n++ ){
        int index;
        do {
            // take the low 6 bits as a candidate position (0..63);
            // retry while that position already holds a 1
            index = (int)( indexes & 63 );
            indexes >>>= 6;
            if( indexes == 0 ) indexes = getRandomLong();
        } while ( data[index] != 0 );
        data[index] = 1;
    }
    return bytesToLong( data );
}
I hope you get what I mean. Perhaps the byte[] could be replaced with a long and bit operations to make it faster.
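A minimal sketch of that long-based variant might look like the following. It is only one possible reading of the suggestion: the class name FixedCountBitsSketch, the Random parameter and the chunk counter are assumptions, and it expects 0 <= probability <= 1.

```java
import java.util.Random;

class FixedCountBitsSketch {
    /** Sets (int)(64 * probability) bits at distinct random positions of a long. */
    static long generateNumber(double probability, Random rnd) {
        int bitCount = (int) (64 * probability);   // truncates toward zero
        long result = 0L;
        long indexes = rnd.nextLong();
        int chunksLeft = 10;                       // one long holds ten 6-bit indexes
        for (int placed = 0; placed < bitCount; ) {
            if (chunksLeft == 0) {                 // refill once the chunks are used up
                indexes = rnd.nextLong();
                chunksLeft = 10;
            }
            int index = (int) (indexes & 63);      // candidate position 0..63
            indexes >>>= 6;
            chunksLeft--;
            long mask = 1L << index;
            if ((result & mask) == 0) {            // skip positions that are already 1
                result |= mask;
                placed++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long x = generateNumber(0.25, new Random());
        System.out.println(Long.toBinaryString(x) + " has " + Long.bitCount(x) + " bits set");
    }
}
```

Note the design consequence: every output has exactly the same number of 1 bits, so the bits are not independent of each other the way they are in the per-bit and AND/OR approaches.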