问题
I don't understand why I can't have a nls function for these data. I have tried with a lot of different start values and I have always the same error.
Here is what I have been doing:
expFct2 = function (x, a, b,c)
{
a*(1-exp(-x/b)) + c
}
vec_x <- c(77.87,87.76,68.6,66.29)
vec_y <- c(1,1,0.8,0.6)
dt <- data.frame(vec_x=vec_x,vec_y=vec_y)
ggplot(data = dt,aes(x = vec_x, y = vec_y)) + geom_point() +
geom_smooth(data=dt, method="nls", formula=y~expFct2(x, a, b, c),
se=F, start=list(a=1, b=75, c=-5)
I have always this error:
Error in method(formula, data = data, weights = weight, ...) :
singular gradient
回答1:
This can be written with two linear parameters (.lin1
and .lin2
) and one nonlinear parameter (b
) like this:
a*(1-exp(-x/b)) + c
= (a+c) - a * exp(-x/b)
= .lin1 + .lin2 * exp(-x/b)
where .lin1 = a+c
and .lin2 = -a
(so a = - .lin2
and c = .lin1 + .lin2
) This lets us use "plinear"
which only requires specification of a starting value for the single nonlinear parameter (eliminating the problem of how to set the starting values for the other parameters) and which converges despite the starting value of b=75
being far from that of the solution:
nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), alg = "plinear")
Here is the result of a run from which we can see from the size of .lin2
that the problem is badly scaled:
> x <- c(77.87,87.76,68.6,66.29)
> y <- c(1,1,0.8,0.6)
> nls(y ~ cbind(1, exp(-x/b)), start = list(b = 75), alg = "plinear")
Nonlinear regression model
model: y ~ cbind(1, exp(-x/b))
data: parent.frame()
b .lin1 .lin2
3.351e+00 1.006e+00 -1.589e+08
residual sum-of-squares: 7.909e-05
Number of iterations to convergence: 9
Achieved convergence tolerance: 9.887e-07
> R.version.string
[1] "R version 2.14.2 Patched (2012-02-29 r58660)"
> win.version()
[1] "Windows Vista (build 6002) Service Pack 2"
EDIT: added sample run and comment on scaling.
回答2:
Fitting a three-parameter nonlinear model to four data points is going to be moderately challenging in any case, although in this case the data are well behaved. Point #1 is that your starting value for your c
parameter (-5) was way off. Drawing a picture of the curve corresponding to your starting parameters (see below) would help you understand this (so would recognizing that the curve you get will range from c
at its minimum to c+a
at its maximum, and the range of your data is from 0.6 to 1 ...)
However, even with a better starting guess I found myself fussing with control parameters (i.e. control=nls.control(maxiter=200)
), followed by more warnings -- nls
is not known for its robustness. So I tried the SSasympOff
model, which implements a self-starting version of the curve you want to fit.
start1 <- list(a=1, b=75, c=-5)
start2 <- list(a=0.5, b=75, c=0.5) ## a better guess
pfun <- function(params) {
data.frame(vec_x=60:90,
vec_y=do.call(expFct2,c(list(x=60:90),params)))
}
library(ggplot2)
ggplot(data = dt,aes(x = vec_x, y = vec_y)) + geom_point() +
geom_line(data=pfun(start1))+
geom_line(data=pfun(start2),colour="red")+
geom_smooth(data=dt, method="nls", formula=y~SSasympOff(x, a, b, c),
se=FALSE)
My advice in general is that it's easier to figure out what's going on and fix problems if you fit nls
outside of geom_smooth
and construct the curve you want to add using predict.nls
...
More generally, the way to get good starting parameters is to understand the geometry of the function you are fitting, and which parameters control which aspects of the curve. As I mentioned above, c
is the minimum value of the shifted saturating-exponential curve, a
is the range, and b
is a scale parameter (you can see that when x=b
, the curve is 1-exp(-1)
or approximately 2/3 of the way from the minimum to the maximum). Either a bit of algebra and calculus (i.e. taking limits), or playing around with the curve()
function, are good ways to gather this information.
回答3:
I struggle to find an interpretation to your parameters: a is a slope, b the speed of convergence, and a+c the limit, but c by itself does not seem to mean much. After reparametrizing your function, the problem disappears.
f <- function (x, a,b,c) a + c * exp(-x/abs(b))
nls(y~f(x, a, b, c), data=dt, start=list(a=1, b=75, c=-5), trace=TRUE)
However, the value of c
looks very, very high:
that is probably why the model initially failed to converge.
Nonlinear regression model
model: y ~ f(x, a, b, c)
data: dt
a b c
1.006e+00 3.351e+00 -1.589e+08
residual sum-of-squares: 7.909e-05
Number of iterations to convergence: 9
Achieved convergence tolerance: 2.232e-06
Here is another, more reasonable parametrization of the same function.
g <- function (x, a,b,c) a * (1-exp(-(x-c)/abs(b)))
nls(y~g(x, a, b, c), data=dt, start=list(a=1, b=75, c=-5), trace=TRUE)
Nonlinear regression model
model: y ~ g(x, a, b, c)
data: dt
a b c
1.006 3.351 63.257
residual sum-of-squares: 7.909e-05
Number of iterations to convergence: 10
Achieved convergence tolerance: 1.782e-06
来源:https://stackoverflow.com/questions/9692677/how-to-find-good-start-values-for-nls-function