Java: random integer with non-uniform distribution

终归单人心 2020-12-13 12:55

How can I create a random integer n in Java, between 1 and k with a "linear descending distribution", i.e. 1 is

11 Answers
  • 2020-12-13 13:12

    The simplest thing to do is to generate a list or array of all the possible values, each repeated according to its weight.

    int k = /* possible values */;
    int[] results = new int[k*(k+1)/2];
    for(int i=1,r=0;i<=k;i++)
       for(int j=0;j<=k-i;j++)
           results[r++] = i;
    // k=4 => { 1,1,1,1,2,2,2,3,3,4 }
    
    // to get a value with a given distribution.
    int n = results[random.nextInt(results.length)];
    

    This works best for relatively small values of k, i.e. k < 1000. ;)

    For larger values of k you can use a bucket approach:

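    int k = /* possible values */;
    // buckets[i] = total weight of the values 1..i (value i has weight k - i + 1)
    int[] buckets = new int[k+1];
    for(int i=1;i<=k;i++)
       buckets[i] = buckets[i-1] + k - i + 1;
    
    int r = random.nextInt(buckets[buckets.length-1]);
    int n = Arrays.binarySearch(buckets, r);
    // a miss returns -(insertion point) - 1; an exact hit is the lower edge of the next bucket
    n = n < 0 ? -n - 1 : n + 1;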
    

    The cost of the binary search is fairly small, but it is not as efficient as a direct lookup (for a small array).


    For an arbitrary distribution you can use a double[] for the cumulative distribution and use a binary search to find the value.
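
    A minimal sketch of that cumulative-array idea, assuming strictly positive weights. The class name CumulativeSampler and its methods are hypothetical, not part of the answer above:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper: samples a value in 1..weights.length with
// probability proportional to weights[value - 1].
final class CumulativeSampler {
    // cumulative[i] = weights[0] + ... + weights[i]
    private final double[] cumulative;

    CumulativeSampler(double[] weights) {
        cumulative = new double[weights.length];
        double sum = 0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i]; // assumption: every weight is strictly positive
            cumulative[i] = sum;
        }
    }

    // Maps u in [0, 1) to a value in 1..weights.length.
    int valueFor(double u) {
        double target = u * cumulative[cumulative.length - 1];
        int idx = Arrays.binarySearch(cumulative, target);
        // Exact hit at idx: target is the lower edge of the next bucket.
        // Miss: binarySearch returns -(insertion point) - 1.
        int bucket = idx >= 0 ? idx + 1 : -idx - 1;
        // Clamp in case rounding pushes target up to the total weight.
        return Math.min(bucket, cumulative.length - 1) + 1;
    }

    int sample(Random random) {
        return valueFor(random.nextDouble());
    }
}
```

    With the weights {4, 3, 2, 1} this reproduces the k = 4 case from the lookup table above, without storing one array slot per unit of weight.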

  • 2020-12-13 13:15

    This should give you what you need:

    public static int getLinnearRandomNumber(int maxSize){
        //Get a linearly multiplied random number
        int randomMultiplier = maxSize * (maxSize + 1) / 2;
        Random r=new Random();
        int randomInt = r.nextInt(randomMultiplier);
    
        //Linearly iterate through the possible values to find the correct one
        int linearRandomNumber = 0;
        for(int i=maxSize; randomInt >= 0; i--){
            randomInt -= i;
            linearRandomNumber++;
        }
    
        return linearRandomNumber;
    }
    

    Also, here is a general solution for POSITIVE functions (negative functions don't really make sense) over the range from startIndex to stopIndex:

    public static int getYourPositiveFunctionRandomNumber(int startIndex, int stopIndex) {
        //Generate a random number whose value ranges from 0.0 to the sum of the values of yourFunction for all the possible integer return values from startIndex to stopIndex.
        double randomMultiplier = 0;
        for (int i = startIndex; i <= stopIndex; i++) {
            randomMultiplier += yourFunction(i);//yourFunction(startIndex) + yourFunction(startIndex + 1) + .. yourFunction(stopIndex -1) + yourFunction(stopIndex)
        }
        Random r = new Random();
        double randomDouble = r.nextDouble() * randomMultiplier;
    
        //For each possible integer return value, subtract yourFunction value for that possible return value till you get below 0.  Once you get below 0, return the current value.  
        int yourFunctionRandomNumber = startIndex;
        randomDouble = randomDouble - yourFunction(yourFunctionRandomNumber);
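        // The stopIndex bound guards against floating-point rounding pushing the loop past the last value
        while (randomDouble >= 0 && yourFunctionRandomNumber < stopIndex) {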
            yourFunctionRandomNumber++;
            randomDouble = randomDouble - yourFunction(yourFunctionRandomNumber);
        }
    
        return yourFunctionRandomNumber;
    }
    

    Note: For functions that may return negative values, one method could be to take the absolute value of that function and apply it to the above solution for each yourFunction call.

  • 2020-12-13 13:16

    There are lots of ways to do this, but probably the easiest is just to generate two random integers, one between 0 and k, call it x, and one between 0 and h, call it y. If y > mx + b (m and b chosen appropriately...) then return k - x, otherwise return x.

    Edit: responding to comments up here so I can have a little more space.

    Basically my solution exploits symmetry in your original distribution, where p(x) is a linear function of x. I responded before your edit about generalization, and this solution doesn't work in the general case (because there is no such symmetry in the general case).

    I imagined the problem like this:

    1. You have two right triangles, each k x h, with a common hypotenuse. The composite shape is a k x h rectangle.
    2. Generate a random point that falls on each point within the rectangle with equal probability.
    3. Half the time it will fall in one triangle, half the time in the other.
    4. Suppose the point falls in the lower triangle.
      • The triangle basically describes the P.M.F., and the "height" of the triangle over each x-value describes the probability that the point will have such an x-value. (Remember that we're only dealing with points in the lower triangle.) So simply yield the x-value.
    5. Suppose the point falls in the upper triangle.
      • Invert the coordinates and handle it as above with the lower triangle.

    You'll have to take care of the edge cases also (I didn't bother). E.g. I see now that your distribution starts at 1, not 0, so there's an off-by-one in there, but it's easily fixed.
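
    The steps above can be sketched as follows for values 1..k. This is only one way to settle the edge cases: it assumes a rectangle of k columns by k + 1 rows (so h is tied to k), and the names ReflectionSampler, pick and sample are hypothetical:

```java
import java.util.Random;

// Hypothetical sketch of the rectangle/reflection idea for values 1..k,
// where value n should end up with weight k + 1 - n.
final class ReflectionSampler {
    // Decides the result for a grid point with x in 1..k and y in 0..k.
    static int pick(int x, int y, int k) {
        // Lower triangle: keep x. Upper triangle: reflect x to k + 1 - x.
        return y < k + 1 - x ? x : k + 1 - x;
    }

    static int sample(Random random, int k) {
        int x = 1 + random.nextInt(k); // uniform in 1..k
        int y = random.nextInt(k + 1); // uniform in 0..k
        return pick(x, y, k);
    }
}
```

    Each column x keeps k + 1 - x of its k + 1 points and hands the remaining x points to its mirror column, so value n collects 2 * (k + 1 - n) of the k * (k + 1) equally likely grid points. No lookup table or search is needed, at the cost of two random draws per sample.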

  • 2020-12-13 13:20

    This is called a triangular distribution, although yours is a degenerate case with the mode equal to the minimum value. Wikipedia has equations for how to create one given a uniformly distributed (0,1) variable.
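
    As a rough illustration of the inverse-CDF approach for the continuous case: with the mode equal to the minimum a and the maximum b, the CDF is F(x) = 1 - ((b - x) / (b - a))^2, which inverts to x = b - (b - a) * sqrt(1 - u). The class and method names below are hypothetical:

```java
import java.util.Random;

// Hypothetical sketch: continuous triangular distribution on [a, b]
// whose mode equals the minimum a (density falls linearly to 0 at b).
final class DescendingTriangular {
    // Inverse of F(x) = 1 - ((b - x) / (b - a))^2 for u in [0, 1).
    static double inverseCdf(double u, double a, double b) {
        return b - (b - a) * Math.sqrt(1.0 - u);
    }

    static double sample(Random random, double a, double b) {
        return inverseCdf(random.nextDouble(), a, b);
    }
}
```

    Note that this samples a continuous variable. Simply flooring a sample on [0, k] into the integers 1..k gives weights proportional to 2(k - n) + 1 rather than k - n + 1, so it only approximates the discrete distribution asked for.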

  • 2020-12-13 13:24

    So we need the following distribution, from least likely to most likely:

    *
    **
    ***
    ****
    *****
    

    etc.

    Let's try mapping a uniformly distributed integer random variable to that distribution:

    1
    2  3
    4  5  6
    7  8  9  10
    11 12 13 14 15
    

    etc.

    This way, if we generate a uniformly distributed random integer from 1 to, say, 15 in this case for K = 5, we just need to figure out which bucket it fits in. The tricky part is how to do this.

    Note that the numbers on the right are the triangular numbers! This means that for a randomly generated X from 1 to T_k, we just need to find n such that T_(n-1) < X <= T_n. Fortunately there is a well-defined formula to find the 'triangular root' of a given number, which we can use as the core of our mapping from uniform distribution to bucket:

    // Assume k is given, via parameter or otherwise (5 here, matching the diagrams above)
    int k = 5;
    
    // Assume also that r has already been initialized as a valid Random instance
    Random r = new Random();
    
    // First, generate a number from 1 to T_k
    int triangularK = k * (k + 1) / 2;
    
    int x = r.nextInt(triangularK) + 1;
    
    // Next, figure out which bucket x fits into, bounded by
    // triangular numbers by taking the triangular root    
    // We're dealing strictly with positive integers, so we can
    // safely ignore the - part of the +/- in the triangular root equation
    // 8.0 forces double arithmetic, so 8 * x cannot overflow int for large x
    double triangularRoot = (Math.sqrt(8.0 * x + 1) - 1) / 2;
    
    int bucket = (int) Math.ceil(triangularRoot);
    
    // Buckets start at 1 as the least likely; we want k to be the least likely
    int n = k - bucket + 1;
    

    n should now have the specified distribution.
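
    One way to sanity-check the mapping without sampling is to run every x from 1 to T_k through it and count how often each n comes out. The class TriangularRootCheck and its methods are hypothetical helpers, written only for this check:

```java
// Hypothetical helper that enumerates the triangular-root mapping.
final class TriangularRootCheck {
    // Applies the triangular-root mapping to a given x in 1..T_k.
    static int valueFor(int x, int k) {
        double triangularRoot = (Math.sqrt(8.0 * x + 1) - 1) / 2;
        int bucket = (int) Math.ceil(triangularRoot);
        return k - bucket + 1;
    }

    // counts[n] = number of x in 1..T_k that map to n (index 0 is unused).
    static int[] countsFor(int k) {
        int triangularK = k * (k + 1) / 2;
        int[] counts = new int[k + 1];
        for (int x = 1; x <= triangularK; x++) {
            counts[valueFor(x, k)]++;
        }
        return counts;
    }
}
```

    For k = 5, countsFor(5) gives 5, 4, 3, 2, 1 for n = 1..5, which is the linear descending shape the question asks for.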

  • 2020-12-13 13:27

    Something like this....

    import java.util.function.Function;
    
    class DiscreteDistribution
    {
        // cumulative distribution
        final private double[] cdf;
        final private int k;
    
        public DiscreteDistribution(Function<Integer, Double> pdf, int k)
        {
            this.k = k;
            this.cdf = new double[k];
            double S = 0;
            for (int i = 0; i < k; ++i)
            {
                double p = pdf.apply(i+1);         
                S += p;
                this.cdf[i] = S;
            }
            for (int i = 0; i < k; ++i)
            {
                this.cdf[i] /= S;
            }
        }
        /**
         * transform a cumulative distribution between 0 (inclusive) and 1 (exclusive)
         * to an integer between 1 and k.
         */
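        public int transform(double q)
        {
            // If q >= 1, return k.
            if (q >= 1) return k;
            // binary search on cdf for the lowest index i where q < cdf[i]
            int lo = 0, hi = k - 1;
            while (lo < hi)
            {
                int mid = (lo + hi) >>> 1;
                if (q < cdf[mid]) hi = mid;
                else lo = mid + 1;
            }
            // return this index + 1 (to get into a 1-based index)
            return lo + 1;
        }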
    }
    