Say I have 100 records, and I want to mock out the created_at date so it fits on some curve. Is there a library to do that, or what formula could I use? I think this is along the same track:
Generate Random Numbers with Probabilistic Distribution
I don't know much about how they are classified in mathematics, but I'm looking at things like:
- bell curve
- logarithmic (typical biology/evolution) curve? ...
Just looking for some formulas in code so I can say this:
- Given 100 records, a timespan of 1.week, and an interval of 12.hours
- set created_at for each record such that it fits, roughly, to curve
Thanks so much!
Update
I found this forum post about ruby algorithms, which led me to rsruby, an R/Ruby bridge, but that seems like too much.
Update 2
I wrote this little snippet trying out the gsl library, getting there...
Generate test data in Rails where created_at falls along a Statistical Distribution
You can generate UNIX timestamps which are really just integers. First figure out when you want to start, for example now:
start = DateTime::now().to_time.to_i
Find out when the end of your interval should be (say 1 week later):
finish = (DateTime::now()+1.week).to_time.to_i
Ruby's Random class uses the Mersenne Twister algorithm to generate random numbers, so the output is very close to uniform. Generate a random number between the two:
r = Random.new.rand(start..finish)
Then convert that back to a date:
d = Time.at(r)
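The steps above give uniformly spread timestamps. For a bell curve, one option is to swap the uniform draw for a normal one. Here is a rough sketch using the Box-Muller transform in plain Ruby; bell_curve_times and its parameters are names I made up for illustration, and it uses raw second counts in place of 1.week and 12.hours so it runs outside Rails. It assumes the curve is centred on the middle of the window with about three standard deviations on each side.

```ruby
# Sketch: spread `count` timestamps over [start, finish] along a bell curve.
# bell_curve_times is an illustrative helper, not part of any library.
def bell_curve_times(count, start, finish, step, rng: Random.new)
  mean  = (start + finish) / 2.0
  sigma = (finish - start) / 6.0   # assumption: +/- 3 sigma spans the window
  Array.new(count) do
    # Box-Muller transform: two uniform draws -> one standard normal draw.
    u1 = 1.0 - rng.rand            # in (0, 1], so Math.log never sees 0
    u2 = rng.rand
    z  = Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
    t  = (mean + z * sigma).clamp(start, finish)   # pull rare outliers back in
    # Snap to the nearest multiple of `step` seconds after `start`.
    offset = ((t - start) / step).round * step
    Time.at(start + offset)
  end
end

week  = 7 * 24 * 60 * 60    # 1.week in seconds
half  = 12 * 60 * 60        # 12.hours in seconds
start = Time.now.to_i
times = bell_curve_times(100, start, start + week, half)
```

In a Rails app you could pass 1.week.to_i and 12.hours.to_i instead of the raw second counts and assign each time to a record's created_at.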
This looks promising as well: http://rb-gsl.rubyforge.org/files/rdoc/randist_rdoc.html
And this too: http://rb-gsl.rubyforge.org/files/rdoc/rng_rdoc.html
I recently came across croupier, a Ruby gem that aims to generate numbers according to a variety of statistical distributions.
I have yet to try it but it sounds quite promising.
From wiki:
There are a couple of methods to generate a random number based on a probability density function. These methods involve transforming a uniform random number in some way. Because of this, these methods work equally well in generating both pseudo-random and true random numbers.
One method, called the inversion method, involves integrating up to an area greater than or equal to the random number (which should be generated between 0 and 1 for proper distributions).
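As a small illustration of the inversion method, here is a sketch for the exponential distribution, where the integral can be inverted by hand. The helper name exponential_sample and the rate of 2.0 are my own choices for the example, not something from the quoted text.

```ruby
# Inverse transform sampling for an exponential distribution.
# CDF: F(x) = 1 - exp(-rate * x), so F^-1(u) = -ln(1 - u) / rate.
# exponential_sample is an illustrative helper, not a library method.
def exponential_sample(rate, rng: Random.new)
  u = rng.rand                 # uniform in [0, 1)
  -Math.log(1.0 - u) / rate    # 1 - u is in (0, 1], so the log is finite
end

rng = Random.new(42)           # fixed seed so the run is repeatable
samples = Array.new(10_000) { exponential_sample(2.0, rng: rng) }
mean = samples.sum / samples.size   # should land near 1 / rate = 0.5
```

For distributions without a closed-form inverse, the same idea applies but the integral has to be evaluated numerically.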
A second method, called the acceptance-rejection method, involves choosing an x and y value and testing whether the function of x is greater than the y value. If it is, the x value is accepted. Otherwise, the x value is rejected and the algorithm tries again.
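A minimal sketch of acceptance-rejection, assuming the density is defined on a finite interval and you know an upper bound for it. The name rejection_sample and the example density f(x) = 2x on [0, 1] are hypothetical, picked only to keep the example short.

```ruby
# Acceptance-rejection sampling on [lo, hi) for a density bounded above by `max`.
# rejection_sample is an illustrative helper, not a library method.
# The density is passed in as a block.
def rejection_sample(lo, hi, max, rng: Random.new)
  loop do
    x = lo + (hi - lo) * rng.rand   # candidate x, uniform on [lo, hi)
    y = max * rng.rand              # candidate y, uniform on [0, max)
    return x if yield(x) > y        # accept when f(x) is above y, else retry
  end
end

rng = Random.new(7)                 # fixed seed so the run is repeatable
# Example density f(x) = 2x on [0, 1]: its maximum is 2 and its mean is 2/3.
samples = Array.new(10_000) do
  rejection_sample(0.0, 1.0, 2.0, rng: rng) { |x| 2.0 * x }
end
mean = samples.sum / samples.size
```

The tighter the bound `max` is to the real peak of the density, the fewer candidates get rejected.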
The first method is the one used in the accepted answer to the SO question you linked: Generate Random Numbers with Probabilistic Distribution
Another option is the Distribution gem under SciRuby. You can generate normally distributed numbers with:
require 'distribution'
rng = Distribution::Normal.rng
random_numbers = Array.new(100) { rng.call }
There are RNGs for various other distributions as well.
Source: https://stackoverflow.com/questions/4550758/generate-array-of-numbers-that-fit-to-a-probability-distribution-in-ruby