interpretation of the output of R function bs() (B-spline basis matrix)

旧巷少年郎 2021-02-03 10:52

I often use B-splines for regression. Up to now I've never needed to understand the output of bs in detail: I would just choose the model I was interested in, and

2 answers
  •  温柔的废话
    2021-02-03 11:18

    The matrix b

    #              1         2         3
    # [1,] 0.0000000 0.0000000 0.0000000    
    # [2,] 0.8270677 0.0000000 0.0000000    
    # [3,] 0.8198433 0.1801567 0.0000000    
    # [4,] 0.0000000 0.7286085 0.2713915    
    # [5,] 0.0000000 0.0000000 1.0000000  
    

    is just the matrix of the values of the three basis functions at each point of x, which should have been obvious to me, since it's exactly the same interpretation as for a polynomial linear model. As a matter of fact, since the boundary knots are

    bknots <- attr(b,"Boundary.knots")
    # [1]  0.0 77.4
    

    and the internal knots are

    iknots <- attr(b,"knots")
    # 33.33333% 66.66667% 
    #  13.30000  38.83333 
    

    then the three basis functions, as shown here, are:

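    # knots in increasing order: left boundary, two internal knots, right boundary
    knots <- c(bknots[1],iknots,bknots[2])
    # each basis function is 1 at one knot, 0 at the others, and linear in between
    y1 <- c(0,1,0,0)
    y2 <- c(0,0,1,0)
    y3 <- c(0,0,0,1)
    par(mfrow = c(1, 3))
    plot(knots, y1, type = "l", main = "basis 1: b1")
    plot(knots, y2, type = "l", main = "basis 2: b2")
    plot(knots, y3, type = "l", main = "basis 3: b3")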
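
The knots and the 0/1 patterns used in the plotting code can be checked directly in R. The following is a minimal sketch, assuming that b was created with bs(x, df = 3, degree = 1) on the x quoted later in this answer. The call itself is not shown in the post, so it is a guess, chosen because it is consistent with the printed knots.

```r
library(splines)

# Assumption: the bs() call is not shown in the post; df = 3, degree = 1 is a guess
x <- c(0.0, 11.0, 17.9, 49.3, 77.4)
b <- bs(x, df = 3, degree = 1)

bknots <- attr(b, "Boundary.knots")
iknots <- attr(b, "knots")
knots  <- c(bknots[1], iknots, bknots[2])

# Under this call bs() places the interior knots at the 1/3 and 2/3 quantiles of x
q <- quantile(x, probs = c(1, 2) / 3)
print(rbind(from_bs = unname(iknots), quantiles = unname(q)))

# Evaluate the basis at the knots: each column is 1 at one knot and 0 at the others
at_knots <- predict(b, newx = knots)
print(round(matrix(as.numeric(at_knots), nrow = length(knots)), 6))
```

To draw all three functions in a single panel, one option is to evaluate the basis on a fine grid, for example matplot(grid, predict(b, newx = grid), type = "l") with grid <- seq(min(x), max(x), length.out = 200).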
    

    Now, consider b[,1]

    #              1
    # [1,] 0.0000000
    # [2,] 0.8270677
    # [3,] 0.8198433
    # [4,] 0.0000000
    # [5,] 0.0000000
    

    These must be the values of b1 at x <- c(0.0, 11.0, 17.9, 49.3, 77.4). Indeed, b1 is 0 at knots[1] = 0 and 1 at knots[2] = 13.3, and it is linear in between, so at x[2] (11.0) the value must be 11/13.3 = 0.8270677, as expected. Similarly, b1 falls linearly from 1 at knots[2] = 13.3 to 0 at knots[3] = 38.83333, so the value at x[3] (17.9) must be (38.83333-17.9)/(38.83333-13.3) = 0.8198433. Since x[4], x[5] > knots[3] = 38.83333, b1 is 0 there. A similar interpretation can be given for the other two columns.
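
The same arithmetic can be carried out for all three columns at once. The sketch below writes the piecewise-linear basis functions by hand and compares the result with the matrix printed at the top of this answer. The helper name hat() is hypothetical, and the knots are recomputed as the 1/3 and 2/3 quantiles of x, which is an assumption consistent with the printed values 13.3 and 38.83333.

```r
x <- c(0.0, 11.0, 17.9, 49.3, 77.4)

# Assumption: interior knots are the 1/3 and 2/3 quantiles of x (13.3 and 38.83333)
knots <- c(min(x), quantile(x, probs = c(1, 2) / 3, names = FALSE), max(x))

# Hypothetical helper: rises linearly from 0 at `left` to 1 at `peak`,
# falls linearly back to 0 at `right`, and is 0 outside [left, right]
hat <- function(x, left, peak, right) {
  up   <- (x - left) / (peak - left)
  down <- (right - x) / (right - peak)
  pmax(0, pmin(up, down))
}

b1 <- hat(x, knots[1], knots[2], knots[3])
b2 <- hat(x, knots[2], knots[3], knots[4])
# b3 only rises: it reaches 1 at the right boundary knot
b3 <- pmax(0, (x - knots[3]) / (knots[4] - knots[3]))

manual <- cbind(b1, b2, b3)
print(round(manual, 7))

# Values printed in the answer, typed in by hand for comparison
printed <- rbind(c(0,         0,         0),
                 c(0.8270677, 0,         0),
                 c(0.8198433, 0.1801567, 0),
                 c(0,         0.7286085, 0.2713915),
                 c(0,         0,         1))
print(max(abs(manual - printed)))
```

If the splines package is loaded, comparing manual against bs(x, df = 3, degree = 1) should also give a difference near zero, assuming that was the original call.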
