I am trying to fit a smooth curve to my dataset. Is there a better smoothing curve than the one I produced using the following code:
x <- seq(1, 10, 0.5)
y <-
You've got 19 points, so a polynomial up to X^18 will bullseye each of your points:
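> xl <- seq(1, 10, length.out = 100)  # prediction grid within the range of x
> p <- lm(y ~ poly(x, 18))            # degree 18: one coefficient per data point
> plot(x, y)
> lines(xl, predict(p, newdata = data.frame(x = xl)))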
BUT that's ignoring what statistics is all about. It's about acknowledging that curves won't fit through points. It's about finding a model with a small number of parameters that explains as much as it can about the data, and leaves only noise. It's not about spearing your points with a curve: a curve so drawn has very little meaning between the data points.
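To make the "small number of parameters" idea concrete, here is a rough sketch that fits two low-order polynomials to the data from the question and compares what each leaves unexplained. The degrees 2 and 3 are assumptions chosen purely for illustration, not a recommendation for this dataset.

```r
# Data from the question
x <- seq(1, 10, 0.5)
y <- c(1, 1.5, 1.6, 1.7, 2.1,
       2.2, 2.2, 2.4, 3.1, 3.3,
       3.7, 3.4, 3.2, 3.1, 2.4,
       1.8, 1.7, 1.6, 1.4)

# Two low-order candidates (degrees 2 and 3 are illustrative assumptions)
fit2 <- lm(y ~ poly(x, 2))
fit3 <- lm(y ~ poly(x, 3))

# Residual sum of squares: what each model leaves unexplained
rss2 <- sum(resid(fit2)^2)
rss3 <- sum(resid(fit3)^2)
print(c(rss2 = rss2, rss3 = rss3))

# Residual degrees of freedom: 19 points minus the number of coefficients
print(c(df2 = df.residual(fit2), df3 = df.residual(fit3)))

# F-test: does the extra cubic term explain more than noise would?
print(anova(fit2, fit3))

# Draw both candidates over the data
xl <- seq(min(x), max(x), length.out = 100)
plot(x, y)
lines(xl, predict(fit2, newdata = data.frame(x = xl)), col = "blue")
lines(xl, predict(fit3, newdata = data.frame(x = xl)), col = "red")
```

Unlike the degree-18 fit, these models keep 15 or 16 residual degrees of freedom, so the residuals still carry information about the noise.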
As posed, the question is almost meaningless. There is no such thing as a "best" line of fit, since "best" depends on the objectives of your study. It is fairly trivial to generate a smoothed line to fit through every single point of data (e.g. an 18th-order polynomial will fit your data perfectly, but will most likely be quite meaningless).
That said, you can specify the amount of smoothness of a loess model by changing the span argument. The larger the value of span, the smoother the curve; the smaller the value of span, the more closely it will fit each point.
Here is a plot with the value span=0.25:
x <- seq(1, 10, 0.5)
y <- c(1, 1.5, 1.6, 1.7, 2.1,
2.2, 2.2, 2.4, 3.1, 3.3,
3.7, 3.4, 3.2, 3.1, 2.4,
1.8, 1.7, 1.6, 1.4)
xl <- seq(1, 10, 0.125)
plot(x, y)
lines(xl, predict(loess(y~x, span=0.25), newdata=xl))
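To see the effect of span side by side, a sketch along the following lines overlays several loess fits and reports the residual sum of squares for each. The span values 0.5, 0.75 and 1 are assumptions picked for illustration; any values in loess's accepted range could be substituted.

```r
# Data from the question
x <- seq(1, 10, 0.5)
y <- c(1, 1.5, 1.6, 1.7, 2.1,
       2.2, 2.2, 2.4, 3.1, 3.3,
       3.7, 3.4, 3.2, 3.1, 2.4,
       1.8, 1.7, 1.6, 1.4)

spans <- c(0.5, 0.75, 1)      # illustrative span values (assumption)
xl <- seq(1, 10, 0.125)       # prediction grid

plot(x, y)
rss <- numeric(length(spans))
for (i in seq_along(spans)) {
  fit <- loess(y ~ x, span = spans[i])
  # Residual sum of squares at the observed points
  rss[i] <- sum(fit$residuals^2)
  # One colour per span
  lines(xl, predict(fit, newdata = data.frame(x = xl)), col = i + 1)
}
legend("topleft", legend = paste("span =", spans),
       col = seq_along(spans) + 1, lty = 1)

print(data.frame(span = spans, rss = rss))
```

A larger residual sum of squares for a larger span is the price of the smoother curve; which trade-off is appropriate depends on the purpose of the fit.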
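An alternative approach is to fit splines to your data. An interpolating spline is constrained to pass through each point, whereas a smoother such as lowess may not. Note that smooth.spline, used here, fits a penalized smoothing spline, so its curve will not necessarily pass through every point either: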
spl <- smooth.spline(x, y)
plot(x, y)
lines(predict(spl, xl))
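As a quick check of how closely smooth.spline follows the data, the sketch below measures the largest vertical gap between the fitted curve and the observations, once with the default smoothing parameter and once with a fixed equivalent degrees of freedom. The value df = 4 is an assumption chosen to give a deliberately stiff curve.

```r
# Data from the question
x <- seq(1, 10, 0.5)
y <- c(1, 1.5, 1.6, 1.7, 2.1,
       2.2, 2.2, 2.4, 3.1, 3.3,
       3.7, 3.4, 3.2, 3.1, 2.4,
       1.8, 1.7, 1.6, 1.4)

# Default: smoothing parameter chosen by generalized cross-validation
spl_gcv <- smooth.spline(x, y)
# Fixed equivalent degrees of freedom (df = 4 is an illustrative assumption)
spl_df4 <- smooth.spline(x, y, df = 4)

# Largest vertical gap between each fitted curve and the data
gap_gcv <- max(abs(predict(spl_gcv, x)$y - y))
gap_df4 <- max(abs(predict(spl_df4, x)$y - y))
print(c(gcv = gap_gcv, df4 = gap_df4))

xl <- seq(1, 10, 0.125)
plot(x, y)
lines(predict(spl_gcv, xl))
lines(predict(spl_df4, xl), col = "blue")
```

A nonzero gap means the curve is smoothing rather than interpolating; the df (or spar) argument plays a role similar to span in loess.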
I think perhaps you're looking for an interpolated smooth line, which in the case of R is probably most easily accomplished by fitting an interpolation spline? As the other answers discuss, that's not what statistical fitting is about, but there are many contexts where you want a smooth interpolated curve -- I think your terminology may have thrown people off.
Splines are more numerically stable than high-order polynomials.
x <- seq(1, 10, 0.5)
y <- c(1, 1.5, 1.6, 1.7, 2.1,
2.2, 2.2, 2.4, 3.1, 3.3,
3.7, 3.4, 3.2, 3.1, 2.4,
1.8, 1.7, 1.6, 1.4)
library(splines)
isp <- interpSpline(x,y)
xvec <- seq(min(x),max(x),length=200) ## x values for prediction
png("isp.png")
plot(x,y)
## predict() produces a list with x and y components
lines(predict(isp,xvec),col="red")
dev.off()
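If you want to confirm that the curve really interpolates, a small check like the one below evaluates the spline at the original x values and compares the result with y. It also shows splinefun from base R as a possible alternative; using method = "natural" there is my assumption, chosen to mirror the natural spline that interpSpline produces by default as far as I know.

```r
library(splines)

# Data from the question
x <- seq(1, 10, 0.5)
y <- c(1, 1.5, 1.6, 1.7, 2.1,
       2.2, 2.2, 2.4, 3.1, 3.3,
       3.7, 3.4, 3.2, 3.1, 2.4,
       1.8, 1.7, 1.6, 1.4)

# Interpolation spline, as in the answer
isp <- interpSpline(x, y)
# Largest deviation from the data at the original x values
dev_isp <- max(abs(predict(isp, x)$y - y))

# Base-R alternative: splinefun() returns a function that evaluates the spline
f <- splinefun(x, y, method = "natural")
dev_f <- max(abs(f(x) - y))

print(c(interpSpline = dev_isp, splinefun = dev_f))
```

Both deviations should be zero up to floating-point error, which is what distinguishes interpolation from the smoothing approaches discussed in the other answers.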