How do I select the smoothing parameter for smooth.spline()?

陌清茗 2021-02-08 12:28

I know that the smoothing parameter (lambda) is quite important for fitting a smoothing spline, but I did not see any post here regarding how to select a reasonable lambda (spar=

3 answers
  • 2021-02-08 12:55

    From the help of smooth.spline you have the following:

    The computational λ used (as a function of spar) is λ = r * 256^(3*spar - 1)
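
    As a quick check of the formula above, here is a minimal sketch (the toy data are the ones used elsewhere in this thread; the names fit_a, fit_b, ratio and expected are only illustrative). Since r depends only on the data, the ratio of the two fitted lambda values should depend only on the difference in spar:

    ```r
    # Toy data from this thread
    x <- 1:18
    y <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))

    # Two fits that differ only in spar
    fit_a <- smooth.spline(x, y, spar = 0.4)
    fit_b <- smooth.spline(x, y, spar = 0.8)

    # The lambda actually used is stored in the fitted object
    c(fit_a$lambda, fit_b$lambda)

    # r cancels in the ratio, leaving 256^(3 * (0.8 - 0.4))
    ratio    <- fit_b$lambda / fit_a$lambda
    expected <- 256^(3 * (0.8 - 0.4))
    all.equal(ratio, expected)
    ```

    So a small change in spar corresponds to a large multiplicative change in lambda, which is why spar is usually the more convenient scale to search over.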

    spar can be greater than 1 (but I guess not by too much). I think you can vary this parameter and choose it graphically by plotting the fitted values for different values of spar. For example:

    library(lattice)                           ## xyplot() and the panel.* functions come from lattice
    spars <- seq(0.2, 2, length.out = 10)      ## I will choose between 10 values
    dat <- data.frame(
      spar = as.factor(rep(spars, each = 18)), ## spar to group data (to get different colors)
      x = 1:18,                                ## recycling here to repeat x and y
      y = c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4)))
    xyplot(y ~ x | spar, data = dat, type = 'p', pch = 19, groups = spar,
           panel = function(x, y, groups, ...)
           {
              ## fit this panel's points with this panel's spar
              s2 <- smooth.spline(x, y, spar = spars[panel.number()])
              panel.lines(s2$x, s2$y)
              panel.xyplot(x, y, groups = groups, ...)
           })
    

    Here, for example, I get the best results with spar = 0.4.
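
    If lattice is not available, the same visual comparison can be sketched in base graphics by overlaying a few fits on one plot. This is only an illustration: the four spar values and the names grid, fits, curves and dfs are my own choices, not part of the answer above.

    ```r
    # Toy data from this thread
    x <- 1:18
    y <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))

    spars <- c(0.2, 0.4, 0.8, 1.2)                 # a few candidate values
    grid  <- seq(min(x), max(x), length.out = 200) # fine grid for smooth curves

    # One fit per spar, then predictions on the grid (200 x 4 matrix)
    fits   <- lapply(spars, function(s) smooth.spline(x, y, spar = s))
    curves <- sapply(fits, function(f) predict(f, grid)$y)

    # Effective degrees of freedom of each fit (smaller means smoother)
    dfs <- sapply(fits, function(f) f$df)

    plot(x, y, pch = 19)
    matlines(grid, curves, lty = 1, col = seq_along(spars))
    legend("topleft", legend = paste("spar =", spars),
           lty = 1, col = seq_along(spars))
    ```

    Small spar values follow the points closely, while large values approach a straight line.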

  • 2021-02-08 13:04

    agstudy provides a visual way to choose spar. What I remember from my linear models class (though not exactly) is to use cross-validation to pick the "best" spar. Here's a toy example borrowed from agstudy:

    x <- 1:18
    y <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))
    ## leave-one-out cross-validation residual sum of squares for a given spar
    splineres <- function(spar){
      res <- rep(0, length(x))
      for (i in seq_along(x)){
        mod <- smooth.spline(x[-i], y[-i], spar = spar)  ## refit without point i
        res[i] <- predict(mod, x[i])$y - y[i]            ## prediction error at point i
      }
      return(sum(res^2))
    }

    ## evaluate the criterion on a fine grid of spar values
    spars <- seq(0, 1.5, by = 0.001)
    ss <- rep(0, length(spars))
    for (i in seq_along(spars)){
      ss[i] <- splineres(spars[i])
    }
    plot(spars, ss, type = 'l', xlab = 'spar', ylab = 'Cross Validation Residual Sum of Squares', main = 'CV RSS vs Spar')
    R > spars[which.min(ss)]
    [1] 0.381
    

    The code is not the neatest, but it should be easy to follow. Alternatively, smooth.spline can run leave-one-out cross-validation itself if you specify cv = TRUE:

    R > xyspline <- smooth.spline(x, y, cv = TRUE)
    R > xyspline$spar
    [1] 0.3881
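
    Once a fit has been selected this way, the chosen values can be read off the object and the spline evaluated at new points. A minimal sketch with the same toy data (the names chosen, grid and pred are only illustrative):

    ```r
    # Toy data from this thread
    x <- 1:18
    y <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))

    # Let smooth.spline pick the smoothing parameter by leave-one-out CV
    xyspline <- smooth.spline(x, y, cv = TRUE)

    # What was selected: spar, the corresponding lambda, and the
    # effective degrees of freedom of the resulting fit
    chosen <- c(spar = xyspline$spar, lambda = xyspline$lambda, df = xyspline$df)
    chosen

    # Evaluate the fitted spline on a fine grid and draw it over the data
    grid <- seq(min(x), max(x), length.out = 200)
    pred <- predict(xyspline, grid)   # list with components x and y
    plot(x, y, pch = 19)
    lines(pred$x, pred$y)
    ```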
    
  • 2021-02-08 13:11

    If you don't have duplicated points at the same x value, try Generalized Cross Validation (GCV), which is a clever way of picking a pretty good value for lambda (spar). Note that smooth.spline() has no GCV argument: GCV is what it uses by default (cv = FALSE) when neither spar nor df is given, while cv = TRUE switches to ordinary leave-one-out cross-validation. One neat detail about GCV is that it doesn't actually have to go to the trouble of refitting for every single leave-one-out set of points, as highlighted in Simon Wood's book. For lots of detail on this, have a look at the notes on Simon Wood's web page on mgcv.
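
    To illustrate the "no refitting" point, here is a rough sketch that computes both criteria from a single fit using the leverages smooth.spline returns (lev) and compares them with the cv.crit value it reports. It assumes unique x values and default weights, as in the toy data from this thread; the fixed spar = 0.4 and the names res, loo_shortcut and gcv_shortcut are only illustrative.

    ```r
    # Toy data from this thread (x is sorted and has no duplicates)
    x <- 1:18
    y <- c(1:3, 5, 4, 7:3, 2 * (2:5), rep(10, 4))
    n <- length(x)

    # One fit at a fixed spar; cv only controls which score ends up in cv.crit
    fit_loo <- smooth.spline(x, y, spar = 0.4, cv = TRUE)   # leave-one-out score
    fit_gcv <- smooth.spline(x, y, spar = 0.4, cv = FALSE)  # GCV score

    # Residuals of the single full-data fit ($y holds the fitted values)
    res <- y - fit_loo$y

    # Leave-one-out score from the leverages, without refitting n times
    loo_shortcut <- mean((res / (1 - fit_loo$lev))^2)

    # GCV replaces each leverage by the average leverage df / n
    gcv_shortcut <- mean(res^2) / (1 - fit_loo$df / n)^2

    all.equal(loo_shortcut, fit_loo$cv.crit)
    all.equal(gcv_shortcut, fit_gcv$cv.crit)
    ```

    The leverage shortcut holds lambda fixed, so it need not reproduce a brute-force loop that refits at a fixed spar, because r in the spar-to-lambda formula changes whenever a point is dropped.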

    Adrian Bowman's sm R package has a function h.select() which is intended specifically for doing the grunt work of choosing a value of lambda (though I'm not 100% sure that it is compatible with the smooth.spline() function in the base package).
