How can I create a random integer n in Java, between 1 and k with a "linear descending distribution", i.e. 1 is
Let me try another answer too, inspired by rlibby. This particular distribution is also the distribution of the smaller of two values chosen uniformly at random from the same range.
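A minimal sketch of that idea, assuming the target range is the integers 1 to k inclusive. The class and method names are made up for illustration. With two independent uniform draws, the smaller one equals i with probability (2(k - i) + 1) / k^2, which decreases linearly in i.

```java
import java.util.Random;

// Hypothetical helper, not from the original answer.
final class MinOfTwoSampler {
    // Returns an integer in [1, k]; smaller values are linearly more likely.
    static int sample(Random rng, int k) {
        int first = rng.nextInt(k);   // uniform in [0, k)
        int second = rng.nextInt(k);  // independent draw from the same range
        return Math.min(first, second) + 1; // shift from [0, k) to [1, k]
    }
}
```

Calling `MinOfTwoSampler.sample(new Random(), k)` repeatedly should give 1 most often and k least often.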
There is no need to simulate this with arrays and such, if your distribution is such that you can compute its cumulative distribution function (cdf). Above you have a probability density function (pdf). h is actually determined, since the area under the curve must be 1. For simplicity of math, let me also assume you're picking a number in [0,k).
The pdf here is f(x) = (2/k) * (1 - x/k), if I read you right. The cdf is just the integral of the pdf. Here, that's F(x) = (2/k) * (x - x^2 / (2k)). (You can repeat this logic for any pdf if it's integrable.)
Then you need to compute the inverse of the cdf function, F^-1(x) and if I weren't lazy, I'd do it for you.
But the good news is this: once you have F^-1(x), all you do is apply it to a random value distributed uniformly in [0,1]. java.util.Random can provide that with some care. The result is your randomly sampled value from your distribution.
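For the cdf above, the inverse works out by hand: setting u = F(x) = 2(x/k) - (x/k)^2 and solving for x gives F^-1(u) = k * (1 - sqrt(1 - u)). Here is a sketch of inverse-cdf sampling under that derivation. The class and method names are invented for illustration, and mapping the continuous value in [0,k) to an integer in [1,k] by truncation is an assumption about what the question wants.

```java
import java.util.Random;

// Hypothetical helper illustrating inverse-cdf (inverse transform) sampling.
final class InverseCdfSampler {
    // Continuous sample in [0, k) with pdf f(x) = (2/k) * (1 - x/k).
    static double sampleContinuous(Random rng, double k) {
        double u = rng.nextDouble();            // uniform in [0, 1)
        return k * (1.0 - Math.sqrt(1.0 - u));  // F^-1(u)
    }

    // Integer sample in [1, k], obtained by truncating the continuous sample.
    static int sampleInt(Random rng, int k) {
        int bucket = (int) sampleContinuous(rng, k); // 0 .. k-1
        return Math.min(bucket, k - 1) + 1;          // guard against rounding at the top edge
    }
}
```

Since `nextDouble()` never returns exactly 1.0, the continuous value stays below k, and the `Math.min` guard is only a precaution.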
The first solution that comes to mind is to use a blocked array. Each index would specify a range of values whose width depends on how "probable" you want that index to be. In this case, you would use a wide range for 1, a narrower one for 2, and so on until you reach a small width (let's say 1) for k.
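import java.util.Random;

// prob(i) is a placeholder for the integer weight of bucket i (wider = more likely).
int[] indexBound = new int[k];
int prevBound = 0;
for (int i = 0; i < k; i++) {
    indexBound[i] = prevBound + prob(i); // exclusive upper bound of bucket i
    prevBound = indexBound[i];
}
int r = new Random().nextInt(prevBound); // uniform in [0, total weight)
for (int i = 0; i < k; i++) {
    // The first bucket whose upper bound exceeds r is the one containing r.
    if (r < indexBound[i]) {
        return i; // zero-based bucket index; add 1 for a value in [1, k]
    }
}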
Now the problem is just finding a random number and then mapping that number to its bucket. You can do this for any distribution, provided you can discretize the width of each interval. Let me know if I am missing something in either the explanation of the algorithm or its correctness. Needless to say, this needs to be optimized.
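One possible optimization, sketched here as an assumption rather than as part of the original answer: build the cumulative bounds once, then locate the bucket with a binary search instead of a linear scan. The class name is hypothetical, and the weight k - i for bucket i (so value 1 gets weight k and value k gets weight 1) is just one choice of linearly descending weights.

```java
import java.util.Random;

// Hypothetical helper: cumulative-weight buckets with binary search lookup.
final class CumulativeBucketSampler {
    private final int[] upperBound; // upperBound[i] = exclusive upper bound of bucket i
    private final int total;        // sum of all weights

    CumulativeBucketSampler(int k) {
        upperBound = new int[k];
        int running = 0;
        for (int i = 0; i < k; i++) {
            running += k - i;        // assumed weight; the sum overflows int for very large k
            upperBound[i] = running;
        }
        total = running;
    }

    // Returns an integer in [1, k] with probability proportional to its weight.
    int sample(Random rng) {
        int r = rng.nextInt(total);  // uniform in [0, total)
        int lo = 0;
        int hi = upperBound.length - 1;
        while (lo < hi) {            // find the first bucket whose bound exceeds r
            int mid = (lo + hi) >>> 1;
            if (r < upperBound[mid]) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo + 1;
    }
}
```

Setup costs O(k) time and memory, and each sample then costs O(log k).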
There are many ways to generate a random integer with a custom distribution (also known as a discrete distribution). The choice depends on many things, including the number of integers to choose from, the shape of the distribution, and whether the distribution will change over time.
One of the simplest ways to choose an integer with a custom weight function f(x) is the rejection sampling method. The following assumes that the highest possible value of f is max. The time complexity for rejection sampling is constant on average, but depends greatly on the shape of the distribution and has a worst case of running forever. To choose an integer in [1, k] using rejection sampling:
1. Choose a uniform random integer i in [1, k].
2. With probability f(i)/max, return i. Otherwise, go to step 1.

Other algorithms have an average sampling time that doesn't depend so greatly on the distribution (usually either constant or logarithmic), but often require you to precalculate the weights in a setup step and store them in a data structure. Some of them are also economical in terms of the number of random bits they use on average. These algorithms include the alias method, the Fast Loaded Dice Roller, the Knuth–Yao algorithm, the MVN data structure, and more. See my section "A Note on Weighted Choice Algorithms" for a survey.
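The rejection sampling procedure described above can be sketched in Java roughly as follows. The class name and the use of `IntToDoubleFunction` for the weight function f are illustrative choices, not part of the original answer. The caller is assumed to pass a `max` that is at least as large as every f(i).

```java
import java.util.Random;
import java.util.function.IntToDoubleFunction;

// Hypothetical helper illustrating rejection sampling over the integers 1..k.
final class RejectionSampler {
    // f gives the (non-negative) weight of each integer; max must satisfy max >= f(i) for all i.
    static int sample(Random rng, int k, IntToDoubleFunction f, double max) {
        while (true) {
            int i = rng.nextInt(k) + 1;  // candidate, uniform in [1, k]
            // Accept with probability f(i) / max; otherwise draw a new candidate.
            if (rng.nextDouble() * max < f.applyAsDouble(i)) {
                return i;
            }
        }
    }
}
```

For a linearly descending weight, one could call `RejectionSampler.sample(rng, k, i -> k - i + 1, k)`, which accepts a candidate after about two tries on average.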
The cumulative distribution function is x^2 for a triangular distribution on [0,1] with mode (highest weighted probability) of 1, as shown here.
Therefore, all we need to do to transform a uniform distribution (such as Java's Random::nextDouble) into a convenient triangular distribution weighted towards 1 is take the square root, Math.sqrt(rand.nextDouble()), which can then be multiplied by any desired range.
For your example:
Random rand = new Random(); // java.util.Random
int a = 1; // lower bound, inclusive
int b = k; // upper bound, exclusive (use k + 1 if k itself should be a possible result)
double weightedRand = Math.sqrt(rand.nextDouble()); // triangular distribution weighted towards 1
weightedRand = 1.0 - weightedRand; // invert the distribution (greater density at bottom)
int result = (int) Math.floor((b - a) * weightedRand);
result += a; // offset by lower bound
if (result >= b) result = a; // edge case: nextDouble() returned 0, so weightedRand is exactly 1