问题

I'm trying to create a function with the following signature:

function weightedRandom (target, probability) {
  // calculate weighted random number that is more likely to be near target value
  return value; // number from 0 to 1
}

It should do the following:

Generate random number from 0 to 1, but not including 1
The probability of any given number being picked in that range is NOT evenly distributed
There is a greater chance that the number picked is near the target value (target is also a value from 0 to 1)
The probability curve would look like a bell curve, where the target value has the highest probability and the values around it taper down, but all values in the 0 to 1 range would still have a chance of getting picked.
The weight of this chance can be adjusted with the probability value, where a value of 0 means there is no weighting applied to the randomness and 1 means that almost all numbers picked will be clustered right around the target value.

For example, weightedRandom(0.8, 0.2) would result in a random value that is 20% likely to be clustered around the value 0.8, but could be any number from 0 up to 1. If the probability was 0.5, then even more of the resulting random values returned would be near 0.8. I think maybe there needs to be another parameter to define the width of the cluster (standard deviation?).

I am not a mathematician, but I have been told to look at Beta Distributions as a possible tool to help:

enter image description here

I've found some NPM modules that have beta functions, but I'm not sure how to use them to solve this problem:

https://github.com/AndreasMadsen/mathfn
https://github.com/jstat/jstat

回答1:

TLDR: randomly choose between two easier distributions, also Inverse transform sampling

Combine two distributions

If you had a flat distribution, you could choose any value in your range equally. If you had a Gaussian distribution, you could choose a value near your Gaussian's mean. So consider randomly choosing to do one or other of these.

If you want the random value to be near the target t 80% of the time, and elsewhere the other 20%. Suppose that 'near' means within 2 standard deviations, and we'll take the variance to be v. So the range (t-2*v) to (t+2*v) needs to cover P(0.8).

Suppose we will randomly use either a flat distribution or a Gaussian distribution; the probability of a random value falling in a given range is then the sum of the two distributions, weighted by the bias of the distribution choice. If we choose the Gaussian, we will end up with a value within 2 std.dev. 95.45% of the time. If we take the Gaussian X% of the time, then the near probability Pn = P(t-2v to t+2v) = 0.9545*X + (1-X)(4v/r), where r is the full range, and (4v/r) is then the proportion of the flat distribution inside the range.

To get this Pn to 80%:

0.8 = 0.9545*X + (1-X)(4v/r).

We have 2 unknowns, so if we also require a very near probability that the value is within 1 std.dev of the target 60% of the time, then

0.6 = 0.6827*X + (1-X)(2v/r).

Rearranging for (2v/r):

(0.8 - 0.9545*X)/(1-X)*2 = (2v/r)
(0.6 - 0.6826*x)/(1-X) = (2v/r)

Equating and simplifying

X = 0.81546

Thus:

var range = [0, 10];
var target = 7.0;
var stddev = 1.0;
var takeGauss = (Math.random() < 0.81546);
if(takeGauss) {
  // perform gaussian sampling (normRand has mean 0), resample if outside range
  while(1) {
    var sample = ((normRand()*stddev) + target);
    if(sample >= range[0] && sample <= range[1]) {
      return sample;
    }
  }
} else {
  // perform flat sampling
  return range[0]+(Math.random()*(range[1]-range[0]));
}

I think this gives you the required shape, lets you choose two probabilities for near and very near probabilities, but avoids too much complexity.

As I'm being asked to provide more implementation, I've found a normal variate generator (thanks Prof Ian Neath):

function normRand() {
    var x1, x2, rad;

    do {
        x1 = 2 * Math.random() - 1;
        x2 = 2 * Math.random() - 1;
        rad = x1 * x1 + x2 * x2;
    } while(rad >= 1 || rad == 0);

    var c = Math.sqrt(-2 * Math.log(rad) / rad);

    return x1 * c;
};

Inverse transform sampling

The first way I considered was to use Inverse transform sampling, which I'll attempt to explain here.

Suppose we have a distribution where values from 0 to 4 are equally likely, but only half as likely as values from 4 to 10. The total probability is 4a + 6(2*a) = 1, so a=1/16:

step probability function doubling at 4 from a sixteenth to an eighth

Suppose you had a function which, when given a value between 0 and 1 in, produced a value between 0 and 10; it is still monotonic (no minima/maxima), but if you fed it every 0.01 increment from 0 to 1 you'd get a ratio of 4:6*2 = 1:3, so 3x as many values over 4 as under it. That function would look like this:

enter image description here

We have linear segment from z=0 to z=1/3, where x(1/3) = 4, and then a linear segment from z=1/3 to z=1 continuing on to x(1)=10. If we choose a random number z from the flat probability distribution between 0 and 1, then x(z) will be distributed with the first 1/3 of the range giving values up to 4, as required, and the remainder above.

z(x), then, is the inverse transform that takes a flat distribution and makes yields values from the desired distribution. If you want to plot it, it is x<(1/3) ? 9*x : 12*x -1.

The game is then to construct a distribution that you are happy with, and invert it to get the inverse transform, either by using pieces as above or by analytically inverting it, or some approximation (Gaussian inverse cannot be written down analytically). With this you can transform any flat-distributed sample into the desired distribution.

Sampling from the above step distribution would be done like this:

// transform 0-1 flat to 0-10 stepped
function stepInvTransform(z) {
    return (3*z < 1 ? 9*z : (12*z - 1));
}

// sample via inv transform
var sample = stepInvTransform(Math.random());

回答2:

Well. the probability is a quite easy thing to do:

//get the probability factor:
var prob = +(probability.charAt(2));//0.[2] <-- coerces 2 to number

This is the max number of tries you're going to give the RNG. But then, to see how close you are getting to the target must be taken into account, which we might do with a while loop.

var targetDigit = (''+ target).charAt(2),
    prob = +((''+probability).charAt(2)),
    rn,
    closest = 0;//keep track of closest hit
if (probability == 0)
    return Math.random();
do
{
    rn = Math.random();
    //reduce chances for too great of distances, regenerate, but don't count!
    if (Math.abs(rn.toFixed(1).charAt(2)-targetDigit) >= target/2)
        rn = Math.random();
    if (rn.toFixed(1).charAt(2) === targetDigit)
        return rn;//on the money
    if (Math.abs(target - rn) < Math.abs(target - closest))
        closest = rn;
} while (prob--);//while probability left
return closest;//return the closest generated value

When I tried this code, with the parameters .8 as target, and .4 as probability, I got the following return values:

0.8257349738919855 // hit!
0.7696360136541552 // -0.1, close
0.7542727420001457 // -0.1, close
0.9837384401271013 // +0.1, close
0.3078854698178465 // -0.5, far
0.8178311114230021 // hit!
0.6079441227457084 // -0.2, fairly close

Which, I take it, is close to what you'd expect.

回答3:

I would make an expansion of the range (Math.round()*2) and then use some other function f: [0,2] -> [0,1] that is not 1-1 mapping back to the 0-1 interval but maps more values to the area of 0.8. E.g., f(x) = c*x^2 - d*x^3 and tuning c and d such that the distribution suits you.

See this. You can just play with the values or derive the function and see how c and d interact. Of course you can use other functions for more precision like polynomials with more degrees or mix polynomials with exponential functions to get all sorts of behavior, but this is a question of what you really want, and how you would like it to behave.

来源：https://stackoverflow.com/questions/22656126/javascript-random-number-with-weighted-probability

标签

javascript