For a few days I've been working on this problem and I'm stuck ...
I have performed a number of Monte Carlo simulations in R which give an output y for each input
Without an idea of the underlying process you may as well just fit a polynomial with as many terms as you like. You don't seem to be testing a hypothesis (e.g., gravitational strength is inverse-square related with distance), so you can fish all you like for functional forms; the data is unlikely to tell you which one is 'right'.
So if I read your data into a data frame with x and y components I can do:
data$lx=log(data$x)
plot(data$lx,data$y) # needs at least a cubic polynomial
m1 = lm(y~poly(lx,3),data=data) # fit a cubic
points(data$lx,fitted(m1),pch=19)
and the fitted points are pretty close. Change the polynomial degree from 3 to 7 and the fitted points sit right on top of the data. Does that mean that your Y values are really coming from a degree-7 polynomial of your X values? No. But you've got a curve that goes through the points.
At this scale, you may as well just join adjacent points up with straight lines, since your plot is so smooth. But without an underlying theory of why Y depends on X (like an inverse square law, or exponential growth, or something) all you are doing is joining the dots, and there are infinitely many ways of doing that.
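To make the "joining the dots" point concrete, here is a minimal sketch on made-up data (the curve, noise level and object names such as `sim` are my own assumptions, not the asker's data). It fits polynomials of several degrees in log(x) and shows that they all track a smooth curve closely, so closeness of fit alone does not identify the functional form.

```r
# Synthetic stand-in data (NOT the asker's data): a smooth curve plus small noise
set.seed(1)
x  <- seq(1, 100, length.out = 50)
lx <- log(x)
y  <- x / (2 + 0.5 * x) + rnorm(length(x), sd = 0.01)
sim <- data.frame(lx = lx, y = y)

# Fit polynomials of increasing degree in log(x)
degrees <- c(3, 5, 7)
fits <- lapply(degrees, function(d) lm(y ~ poly(lx, d), data = sim))

# Residual standard error for each degree
rse <- sapply(fits, function(m) summary(m)$sigma)
names(rse) <- paste0("degree_", degrees)
print(rse)

# Largest disagreement between the cubic and the degree-7 fitted values
max_gap <- max(abs(fitted(fits[[1]]) - fitted(fits[[3]])))
print(max_gap)
```

The data here were not generated by any polynomial at all, yet every degree gives a usable curve through the points.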
Regressing x/y vs. x
Plotting y vs. x for the low data and playing around a bit, it seems that x/y is approximately linear in x, so try regressing x/y against x, which gives us a relationship based on only two parameters:
y = x / (a + b * x)
where a and b are the regression coefficients.
> lm(x / y ~ x, lo.data)
Call:
lm(formula = x/y ~ x, data = lo.data)
Coefficients:
(Intercept)            x
    -0.1877      -0.3216
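As a sanity check on the linearization, here is a small sketch on simulated data. The values `a_true` and `b_true`, the noise model and the data frame `sim` are hypothetical and chosen only for illustration. If y = x / (a + b * x), then x/y = a + b * x, so an ordinary regression of x/y on x should recover a and b.

```r
# Hypothetical parameters and synthetic data (not the asker's lo.data)
set.seed(42)
a_true <- 0.2
b_true <- 0.3
x <- seq(0.5, 20, length.out = 40)
y <- x / (a_true + b_true * x) * exp(rnorm(length(x), sd = 0.005))
sim <- data.frame(x = x, y = y)

# Since y = x / (a + b * x), the ratio x/y = a + b * x is linear in x
fit <- lm(x / y ~ x, data = sim)
est <- unname(coef(fit))   # est[1] estimates a, est[2] estimates b
print(est)

# Back-transform to the original scale and check the worst-case error
y_hat <- sim$x / (est[1] + est[2] * sim$x)
print(max(abs(y_hat - sim$y)))
```

Note that regressing x/y changes the error structure relative to fitting y directly, so the estimates are best treated as a quick look or as starting values for a nonlinear fit.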
MM.2
The above can be transformed into the MM.2 model in the drc R package. As seen below, this model has a high R^2. We also calculate the AIC, which we can use to compare it with other models (lower is better):
> library(drc)
> fm.mm2 <- drm(y ~ x, data = lo.data, fct = MM.2())
> cor(fitted(fm.mm2), lo.data$y)^2
[1] 0.9986303
> AIC(fm.mm2)
[1] -535.7969
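For readers without drc installed, the same two-parameter curve can be sketched in base R with `nls`. This is a stand-in, not the `drm` fit above, and it assumes that MM.2 is parameterized as y = d * x / (e + x), which is my reading of the drc documentation. Under that assumption the linear-regression coefficients map to d = 1/b and e = a/b, which makes them convenient starting values. The data are synthetic and the names `sim`, `lin` and `fm` are hypothetical.

```r
# Synthetic data from an assumed Michaelis-Menten-type curve with d = 3, e = 0.7
set.seed(7)
x <- seq(0.5, 20, length.out = 40)
y <- 3 * x / (0.7 + x) + rnorm(length(x), sd = 0.02)
sim <- data.frame(x = x, y = y)

# Start values from the linearized regression x/y = a + b * x:
# y = x / (a + b * x) = (1/b) * x / (a/b + x), so d = 1/b and e = a/b
lin <- coef(lm(x / y ~ x, data = sim))
start <- list(d = unname(1 / lin[2]), e = unname(lin[1] / lin[2]))

# Nonlinear least squares on the original scale
fm <- nls(y ~ d * x / (e + x), data = sim, start = start)
print(coef(fm))

# Same summary statistics as in the drc example
r2 <- cor(fitted(fm), sim$y)^2
print(r2)
print(AIC(fm))
```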
CRS.6
This suggests trying a few other drc models. Of the ones we tried, CRS.6 has a particularly low AIC and seems to fit well visually:
> fm.crs6 <- drm(y ~ x, data = lo.data, fct = CRS.6())
> AIC(fm.crs6)
[1] -942.7866
> plot(fm.crs6) # see output below
This gives us a range of models to choose from: the 2-parameter MM.2 model, which is not as good a fit (according to AIC) as CRS.6 but still fits quite well and has the advantage of only two parameters, or the 6-parameter CRS.6 model with its superior AIC. Note that AIC already penalizes models for having more parameters, so a better AIC is not simply a consequence of having more parameters.
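To show where the parameter penalty enters, here is a short sketch that recomputes AIC by hand as 2k - 2 log L and compares it with R's `AIC()`. The data are simulated and the helper `aic_by_hand` is a hypothetical name introduced only for this illustration; the models are plain `lm` fits, not the drc models above.

```r
# Synthetic data from a truly linear relationship
set.seed(3)
x <- seq(1, 10, length.out = 60)
y <- 1 + 0.5 * x + rnorm(length(x), sd = 0.3)
sim <- data.frame(x = x, y = y)

small <- lm(y ~ x, data = sim)           # 2 coefficients + error variance
big   <- lm(y ~ poly(x, 6), data = sim)  # 7 coefficients + error variance

# AIC = 2k - 2 * logLik, where k counts all estimated parameters
aic_by_hand <- function(m) {
  ll <- logLik(m)
  k  <- attr(ll, "df")
  2 * k - 2 * as.numeric(ll)
}

print(c(small = aic_by_hand(small), big = aic_by_hand(big)))
print(c(small = AIC(small), big = AIC(big)))
```

The larger model always has at least as high a log-likelihood, but it also pays 2 per extra parameter, so it only wins on AIC if the improvement in fit outweighs that charge.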
Other
If it's believed that both low and high should have the same model form, then finding a single model form that fits both low and high well could serve as another criterion for picking a model form. In addition to the drc models, there are also some yield-density models in equations (2.1), (2.2), (2.3) and (2.4) of Akbar et al., IRJFE, 2010, which look similar to the MM.2 model and could be tried.
UPDATED: reworked this around the drc package.