Question on nls fit in R - why is this such a strange fit?

谁都会走 提交于 2021-01-05 08:57:18

问题


I'm trying to perform a non linear fit to some simple data (corn yield by year). It's straight forward enough to do it with lm in R, but some of the data would fit better if there was a curve allowed, something on the order of year^1.5 or so.

x <- c(1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 
1988L, 1989L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 
1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 
2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2015L, 
2016L, 2017L, 2018L, 2019L)

y <- c(47.3, 25.4, 39, 56.4, 41.4, 56.1, 60.3, 58, 64, 35, 56, 54, 
37, 80, 59, 88, 55, 87, 90, 99, 93, 90.4, 80.7, 35, 80.2, 104.9, 
59.9, 43.5, 97.9, 106, 132, 121.7, 120.1, 63.9, 142.5, 129.9, 
114.8, 122.1, 164.3, 133.9)

yield_model <- nls(y ~ x^a,start=list(a = 1))

plot(x,y)
lines(x,predict(yield_model),lty=2,col="red",lwd=3)

> yield_model2
Nonlinear regression model
 model: y ~ x^a
 data: parent.frame()
 a 
0.5778 
residual sum-of-squares: 46984

Number of iterations to convergence: 8 
Achieved convergence tolerance: 7.566e-09

Why does the nls fit so poorly (visible if you plot it)? Did I do something wrong? You can imagine that a slight curve in a fit to the data would be better, along with a trend. It's like nls removed the trend or something. Any help would be great.


回答1:


Two options. As mentioned by @RuiBarradas the issue is the specification of the model. You can set your starting values using a lm() like this:

#Define initial values
mod <- lm(y~x)
#nls model
yield_model <- nls(y ~ a+x^b,
                   start=list(a = mod$coefficients[1],b=mod$coefficients[2]))
#Plot
plot(x,y)
lines(x,predict(yield_model),lty=2,col="red",lwd=3)

Output:

Or trying another approach like loess:

library(ggplot2)
#Data
df <- data.frame(x=x,y=y)
#Plot
ggplot(df,aes(x=x,y=y))+
  geom_point()+
  stat_smooth(se=F)

Output:




回答2:


The fit is forgetting the constant term, the y intercept. Unlike other modeling functions, nls needs an explicit intercept.
Below I also fit a linear model with lm, for comparison.

df1 <- data.frame(x, y)

yield_model <- nls(y ~ k + x^a, data = df1, start=list(k = 0, a = 1))
yield_model2 <- lm(y ~ x, df1)
summary(yield_model)
summary(yield_model2)

plot(x, y)
lines(x, predict(yield_model), lty = "dashed", col = "red", lwd = 3)
lines(x, predict(yield_model2), lty = "dotted", col = "blue", lwd = 3)

As you can see, the fits are very close to one another. But they are not equal, to see it run:

predict(yield_model) - predict(yield_model2)


来源:https://stackoverflow.com/questions/64977367/question-on-nls-fit-in-r-why-is-this-such-a-strange-fit

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!