I have a problem involving a collection of continuous probability distribution functions, most of which are determined empirically (e.g. departure times, transit times). What I
Autonomous mobile robotics deals with similar issue in localization and navigation, in particular the Markov localization and Kalman filter (sensor fusion). See An experimental comparison of localization methods continued for example.
Another approach you could borrow from mobile robots is path planning using potential fields.
A couple of responses:
1) If you have empirically determined PDFs they either you have histograms or you have an approximation to a parametric PDF. A PDF is a continuous function and you don't have infinite data...
2) Let's assume that the variables are independent. Then if you make the PDF discrete then P(f(x,y)) = f(x,y)p(x,y) = f(x,y)p(x)p(y) summed over all the combinations of x and y such that f(x,y) meets your target.
If you are going to fit the empirical PDFs to standard PDFs, e.g. the normal distribution, then you can use already-determined functions to figure out the sum, etc.
If the variables are not independent, then you have more trouble on your hands and I think you have to use copulas.
I think that defining your own mini-language, etc., is overkill. you can do this with arrays...
Probability distributions form a monad; see eg the work of Claire Jones and also the LICS 1989 paper, but the ideas go back to a 1982 paper by Giry (DOI 10.1007/BFb0092872) and to a 1962 note by Lawvere that I cannot track down (http://permalink.gmane.org/gmane.science.mathematics.categories/6541).
But I don't see the comonad: there's no way to get an "a" out of an "(a->Double)->Double". Perhaps if you make it polymorphic - (a->r)->r for all r? (That's the continuation monad.)
I worked on similar problems for my dissertation.
One way to compute approximate convolutions is to take the Fourier transform of the density functions (histograms in this case), multiply them, then take the inverse Fourier transform to get the convolution.
Look at Appendix C of my dissertation for formulas for various special cases of operations on probability distributions. You can find the dissertation at: http://riso.sourceforge.net
I wrote Java code to carry out those operations. You can find the code at: https://sourceforge.net/projects/riso
I think the histograms or the list of 1/N area regions is a good idea. For the sake of argument, I'll assume that you'll have a fixed N for all distributions.
Use the paper you linked edit 4 to generate the new distribution. Then, approximate it with a new N-element distribution.
If you don't want N to be fixed, it's even easier. Take each convex polygon (trapezoid or triangle) in the new generated distribution and approximate it with a uniform distribution.
No need for histograms or symbolic computation: everything can be done at the language level in closed form, if the right point of view is taken.
[I shall use the term "measure" and "distribution" interchangeably. Also, my Haskell is rusty and I ask you to forgive me for being imprecise in this area.]
Probability distributions are really codata.
Let mu be a probability measure. The only thing you can do with a measure is integrate it against a test function (this is one possible mathematical definition of "measure"). Note that this is what you will eventually do: for instance integrating against identity is taking the mean:
mean :: Measure -> Double
mean mu = mu id
another example:
variance :: Measure -> Double
variance mu = (mu $ \x -> x ^ 2) - (mean mu) ^ 2
another example, which computes P(mu < x):
cdf :: Measure -> Double -> Double
cdf mu x = mu $ \z -> if z < x then 1 else 0
This suggests an approach by duality.
The type Measure
shall therefore denote the type (Double -> Double) -> Double
. This allows you to model results of MC simulation, numerical/symbolic quadrature against a PDF, etc. For instance, the function
empirical :: [Double] -> Measure
empirical h:t f = (f h) + empirical t f
returns the integral of f against an empirical measure obtained by eg. MC sampling. Also
from_pdf :: (Double -> Double) -> Measure
from_pdf rho f = my_favorite_quadrature_method rho f
construct measures from (regular) densities.
Now, the good news. If mu and nu are two measures, the convolution mu ** nu
is given by:
(mu ** nu) f = nu $ \y -> (mu $ \x -> f $ x + y)
So, given two measures, you can integrate against their convolution.
Also, given a random variable X of law mu
, the law of a * X is given by:
rescale :: Double -> Measure -> Measure
rescale a mu f = mu $ \x -> f(a * x)
Also, the distribution of phi(X) is given by the image measure phi_* X, in our framework:
apply :: (Double -> Double) -> Measure -> Measure
apply phi mu f = mu $ f . phi
So now you can easily work out an embedded language for measures. There are much more things to do here, particularly with respect to sample spaces other than the real line, dependencies between random variables, conditionning, but I hope you get the point.
In particular, the pushforward is functorial:
newtype Measure a = (a -> Double) -> Double
instance Functor Measure a where
fmap f mu = apply f mu
It is a monad too (exercise -- hint: this very much looks like the continuation monad. What is return
? What is the analog of call/cc
?).
Also, combined with a differential geometry framework, this can probably be turned into something which compute Bayesian posterior distributions automatically.
At the end of the day, you can write stuff like
m = mean $ apply cos ((from_pdf gauss) ** (empirical data))
to compute the mean of cos(X + Y) where X has pdf gauss
and Y has been sampled by a MC method whose results are in data
.