I am wondering what would be the best way (e.g. in Java) to generate random numbers within a particular range where each number has a given probability of occurring?
I also answered this here: "Find random country but probability of picking higher population country should be higher". Using a TreeMap:
TreeMap<Integer, Integer> map = new TreeMap<>();
map.put(percent1, 1);
map.put(percent1 + percent2, 2);
// ...
int random = (new Random()).nextInt(100);
// higherEntry (strictly greater key) gives each value exactly its share of 0..99
int result = map.higherEntry(random).getValue();
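A self-contained sketch of the same idea, assuming three values with weights 20, 30 and 50. The class name `TreeMapPicker`, its method names and the weights are illustrative assumptions, not part of the answer above. Each key is the exclusive upper bound of a value's slot, so the lookup takes the first key strictly greater than the drawn number:

```java
import java.util.Random;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
class TreeMapPicker {
    private final TreeMap<Integer, Integer> cumulative = new TreeMap<>();
    private int total = 0;

    // Registers a value with a positive integer weight.
    void add(int value, int weight) {
        if (weight <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        total += weight;
        cumulative.put(total, value); // key = exclusive upper bound of this value's slot
    }

    // Maps a number r in [0, total) to the value whose slot contains r.
    int lookup(int r) {
        return cumulative.higherEntry(r).getValue();
    }

    // Draws one value with probability proportional to its weight.
    int next(Random random) {
        return lookup(random.nextInt(total));
    }

    public static void main(String[] args) {
        TreeMapPicker picker = new TreeMapPicker();
        picker.add(1, 20);
        picker.add(2, 30);
        picker.add(3, 50);
        System.out.println(picker.next(new Random()));
    }
}
```

Keeping `lookup` separate from the random draw makes the slot boundaries easy to check without involving randomness.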
This may be useful to someone: a simple one I did in Python. You just have to change how p and r are defined. This one, for instance, maps random values between 0 and 0.1 onto the range 1e-20 to 1e-12, values between 0.1 and 0.3 onto 1e-12 to 1e-10, and so on.
import random

def generate_distributed_random():
    p = [1e-20, 1e-12, 1e-10, 1e-08, 1e-04, 1e-02, 1]
    r = [0, 0.1, 0.3, 0.5, 0.7, 0.9, 1]
    val = random.random()
    for i in range(1, len(r)):
        if val <= r[i] and val >= r[i - 1]:
            slope = (p[i] - p[i - 1])/(r[i] - r[i - 1])
            return p[i - 1] + (val - r[i - 1])*slope

print(generate_distributed_random())
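If you have a performance issue, then instead of scanning all n values, which is O(n), you can perform a binary search, which costs O(log n):
Random r = new Random();
// cumulative sums of the weights 0.1, 0.2 and 0.5
double[] weights = new double[]{0.1, 0.1 + 0.2, 0.1 + 0.2 + 0.5};
// end of init
// scale to the total weight, since the sums above do not add up to 1
double random = r.nextDouble() * weights[weights.length - 1];
// next, perform the binary search in the weights array
You only need about log2(weights.length) accesses on average, which pays off if you have a lot of weights.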
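The binary-search step can be sketched with `Arrays.binarySearch`, which returns the index of an exact match or `-(insertionPoint) - 1` otherwise. The class and method names below are assumptions for illustration, and the sketch assumes all weights are strictly positive:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, for illustration only.
class CumulativeSearch {
    // Builds running totals, e.g. {1, 2, 5} -> {1, 3, 8}.
    static double[] cumulate(double[] weights) {
        double[] sums = new double[weights.length];
        double running = 0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            running += weights[i];
            sums[i] = running;
        }
        return sums;
    }

    // Returns the first index whose cumulative sum is strictly greater than x.
    static int indexFor(double[] sums, double x) {
        int pos = Arrays.binarySearch(sums, x);
        // Exact hit: x sits on a boundary and belongs to the next slot.
        // Miss: binarySearch encodes the insertion point as -(point) - 1.
        int index = (pos >= 0) ? pos + 1 : -pos - 1;
        return Math.min(index, sums.length - 1); // guard against x == total
    }

    // Draws an index with probability proportional to its weight.
    static int sample(double[] sums, Random random) {
        double x = random.nextDouble() * sums[sums.length - 1];
        return indexFor(sums, x);
    }

    public static void main(String[] args) {
        double[] sums = cumulate(new double[]{0.1, 0.2, 0.5});
        System.out.println(sample(sums, new Random()));
    }
}
```

Scaling the draw by the last cumulative sum means the weights do not have to add up to 1.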
You already wrote the implementation in your question. ;)
final int ran = myRandom.nextInt(100); // 0..99
if (ran >= 50) { return 3; }           // 50..99: 50%
else if (ran >= 20) { return 2; }      // 20..49: 30%
else { return 1; }                     // 0..19: 20%
You can speed this up for more complex implementations by pre-calculating the result in a lookup table like this:
t[0] = 1; t[1] = 1; // ... one for each possible result
return t[ran];
But this should only be used if this is a performance bottleneck and called several hundred times per second.
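A minimal sketch of how such a table could be filled, assuming integer percentages that sum to 100. The class name `LookupTablePicker` and the 20/30/50 split are illustrative assumptions:

```java
import java.util.Random;

// Hypothetical helper, for illustration only.
class LookupTablePicker {
    // Expands (value, percent) pairs into a 100-slot table, one slot per percentage point.
    static int[] buildTable(int[] values, int[] percents) {
        int sum = 0;
        for (int p : percents) {
            sum += p;
        }
        if (values.length != percents.length || sum != 100) {
            throw new IllegalArgumentException("need one percent per value, summing to 100");
        }
        int[] table = new int[100];
        int slot = 0;
        for (int v = 0; v < values.length; v++) {
            for (int n = 0; n < percents[v]; n++) {
                table[slot++] = values[v];
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] t = buildTable(new int[]{1, 2, 3}, new int[]{20, 30, 50});
        Random random = new Random();
        // A draw is now a single array access.
        System.out.println(t[random.nextInt(100)]);
    }
}
```

The table costs memory proportional to the resolution (100 slots here), which is the trade-off against the if/else chain.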
Try this. In this example I use an array of chars, but you can substitute your integer array.
The weight list holds the weight associated with each char. Together the weights describe the probability distribution of the charset (they are relative weights and do not need to sum to 100).
In the weightsum list I store, for each char, its own weight plus the sum of all preceding weights.
For example, the third element of weightsum, corresponding to 'C', is 65:
P('A') + P('B') + P('C') = P(X <= 'C')
10 + 30 + 25 = 65
So weightsum represents the cumulative distribution of the charset. weightsum contains the following values:
10, 40, 65, 125, 145, 215, 225, 305, 325, 355
It's easy to see that the 8th element, corresponding to 'H', has the largest gap to its predecessor (80, its weight), so 'H' is the most likely to be drawn.
List<Character> charset = Arrays.asList('A','B','C','D','E','F','G','H','I','J');
List<Integer> weight = Arrays.asList(10,30,25,60,20,70,10,80,20,30);
List<Integer> weightsum = new ArrayList<>();
int i=0,j=0,k=0;
Random Rnd = new Random();
// Build the running totals: weightsum[i] = weight[0] + ... + weight[i]
weightsum.add(weight.get(0));
for (i = 1; i < weight.size(); i++)
    weightsum.add(weightsum.get(i-1) + weight.get(i));
Then I use a loop to draw 30 random chars from charset, each one drawn according to the cumulative distribution.
In k I store a random number from 0 up to (but not including) the last value in weightsum. Then I look in weightsum for the first number greater than k; the position of that number in weightsum is the position of the drawn char in charset.
for (j = 0; j < 30; j++)
{
    // Draw k uniformly from [0, total weight).
    k = Rnd.nextInt(weightsum.get(weightsum.size()-1));
    // Find the first cumulative sum strictly greater than k.
    for (i = 0; k >= weightsum.get(i); i++) ;
    System.out.print(charset.get(i));
}
The code printed this sequence of chars:
HHFAIIDFBDDDHFICJHACCDFJBGBHHB
Let's do the math!
A = 2
B = 4
C = 3
D = 5
E = 0
F = 4
G = 1
H = 6
I = 3
J = 2
Total: 30
As expected, D and H, which have high weights (60 and 80), occur most often.
E (weight 20), on the other hand, did not come out at all.
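For comparison, the expected count of each char in 30 draws is 30 * weight / 355, where 355 is the sum of the weights listed above. That gives about 6.8 for H and about 1.7 for E, so a run without any E is not surprising for such a small sample. A short sketch of that arithmetic (the class name `ExpectedCounts` is an assumption for illustration):

```java
// Hypothetical helper, for illustration only.
class ExpectedCounts {
    // Expected occurrences of each item in `draws` independent weighted draws.
    static double[] expected(int[] weights, int draws) {
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        double[] result = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            result[i] = (double) draws * weights[i] / total;
        }
        return result;
    }

    public static void main(String[] args) {
        String charset = "ABCDEFGHIJ";
        int[] weight = {10, 30, 25, 60, 20, 70, 10, 80, 20, 30};
        double[] e = expected(weight, 30);
        for (int i = 0; i < e.length; i++) {
            System.out.printf("%c: %.2f%n", charset.charAt(i), e[i]);
        }
    }
}
```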
Some time ago I wrote a helper class to solve this issue. The source code should show the concept clearly enough:
import java.util.HashMap;
import java.util.Map;

public class DistributedRandomNumberGenerator {

    private Map<Integer, Double> distribution;
    private double distSum;

    public DistributedRandomNumberGenerator() {
        distribution = new HashMap<>();
    }

    // Registers value with the given weight; re-adding a value replaces its old weight.
    public void addNumber(int value, double distribution) {
        if (this.distribution.get(value) != null) {
            distSum -= this.distribution.get(value);
        }
        this.distribution.put(value, distribution);
        distSum += distribution;
    }

    public int getDistributedRandomNumber() {
        double rand = Math.random();
        double ratio = 1.0f / distSum;
        double tempDist = 0;
        // Walk the cumulative weights until they reach the scaled random number.
        for (Integer i : distribution.keySet()) {
            tempDist += distribution.get(i);
            if (rand / ratio <= tempDist) {
                return i;
            }
        }
        return 0; // fallback: no numbers added (or a floating-point rounding edge case)
    }
}
The usage of the class is as follows:
DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
drng.addNumber(1, 0.3d); // Adds the numerical value 1 with a probability of 0.3 (30%)
// [...] Add more values
int random = drng.getDistributedRandomNumber(); // Generate a random number
Test driver to verify functionality:
public static void main(String[] args) {
    DistributedRandomNumberGenerator drng = new DistributedRandomNumberGenerator();
    drng.addNumber(1, 0.2d);
    drng.addNumber(2, 0.3d);
    drng.addNumber(3, 0.5d);

    int testCount = 1000000;

    HashMap<Integer, Double> test = new HashMap<>();

    for (int i = 0; i < testCount; i++) {
        int random = drng.getDistributedRandomNumber();
        test.put(random, (test.get(random) == null) ? (1d / testCount) : test.get(random) + 1d / testCount);
    }

    System.out.println(test.toString());
}
Sample output for this test driver:
{1=0.20019100000017953, 2=0.2999349999988933, 3=0.4998739999935438}