I often use B-splines for regression. Up to now I've never needed to understand the output of bs in detail: I would just choose the model I was interested in, and
The matrix b
# 1 2 3
# [1,] 0.0000000 0.0000000 0.0000000
# [2,] 0.8270677 0.0000000 0.0000000
# [3,] 0.8198433 0.1801567 0.0000000
# [4,] 0.0000000 0.7286085 0.2713915
# [5,] 0.0000000 0.0000000 1.0000000
is actually just the matrix of the values of the three basis functions at each point of x, which should have been obvious to me since it's exactly the same interpretation as for a polynomial linear model. As a matter of fact, since the boundary knots are
bknots <- attr(b,"Boundary.knots")
# [1] 0.0 77.4
and the internal knots are
iknots <- attr(b,"knots")
# 33.33333% 66.66667%
# 13.30000 38.83333
then the three basis functions, as shown here, are:
# all knots in increasing order: lower boundary, internal knots, upper boundary
knots <- c(bknots[1],iknots,bknots[2])
# value of each basis function at the four knots
y1 <- c(0,1,0,0)
y2 <- c(0,0,1,0)
y3 <- c(0,0,0,1)
par(mfrow = c(1, 3))
plot(knots, y1, type = "l", main = "basis 1: b1")
plot(knots, y2, type = "l", main = "basis 2: b2")
plot(knots, y3, type = "l", main = "basis 3: b3")
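To see the same three shapes without drawing them by hand, the basis can be evaluated on a fine grid. This is only a sketch: the call that created b is not shown above, so bs(x, df = 3, degree = 1) is an assumption (it places the internal knots at the 1/3 and 2/3 quantiles of x, which matches the knots reported), and grid and B are illustrative names.

```r
library(splines)

x <- c(0.0, 11.0, 17.9, 49.3, 77.4)
b <- bs(x, df = 3, degree = 1)   # assumed call, not shown in the original post

# Evaluate the same basis (same knots, same degree) on a fine grid
grid <- seq(0, 77.4, length.out = 200)
B <- bs(grid, knots = attr(b, "knots"), degree = 1,
        Boundary.knots = attr(b, "Boundary.knots"))

# One curve per basis function; dotted vertical lines mark the internal knots
matplot(grid, B, type = "l", lty = 1, xlab = "x", ylab = "basis value")
abline(v = attr(b, "knots"), lty = 3)
```

Each curve should be piecewise linear with kinks only at the knots, as in the hand-drawn version.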
Now, consider b[,1]
# 1
# [1,] 0.0000000
# [2,] 0.8270677
# [3,] 0.8198433
# [4,] 0.0000000
# [5,] 0.0000000
These must be the values of b1 at x <- c(0.0, 11.0, 17.9, 49.3, 77.4). As a matter of fact, b1 is 0 at knots[1] = 0 and 1 at knots[2] = 13.3000, meaning that at x[2] (11.0) the value must be 11/13.3 = 0.8270677, as expected. Similarly, since b1 is 0 at knots[3] = 38.83333, the value at x[3] (17.9) must be (38.83333-13.3)/17.9 = 0.8198433. Since x[4], x[5] > knots[3] = 38.83333, b1 is 0 there. A similar interpretation can be given for the other two columns.
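The interpolation argument can be checked numerically for all three columns at once. A minimal sketch, again assuming the matrix came from bs(x, df = 3, degree = 1) (the original call is not shown); heights, by_hand and max_diff are names made up for this example.

```r
library(splines)

x <- c(0.0, 11.0, 17.9, 49.3, 77.4)
b <- bs(x, df = 3, degree = 1)   # assumed call, not shown in the original post

# Knots in increasing order: lower boundary, internal knots, upper boundary
bknots <- attr(b, "Boundary.knots")
knots  <- c(bknots[1], attr(b, "knots"), bknots[2])

# Height of each basis function at the four knots (same as y1, y2, y3)
heights <- list(c(0, 1, 0, 0), c(0, 0, 1, 0), c(0, 0, 0, 1))

# Linear interpolation between the knots, evaluated at the data points
by_hand <- sapply(heights, function(h) approx(knots, h, xout = x)$y)

# Largest absolute difference from the matrix returned by bs
max_diff <- max(abs(as.numeric(by_hand) - as.numeric(b)))
print(round(by_hand, 7))
print(max_diff)
```

If the assumption about the call holds, max_diff should be zero up to floating-point error.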
Just a small correction to the excellent answer by @DeltaIV above (it looks like I cannot comment).
So for b1, when he calculated b1(x[3]), it should be (38.83333-17.9)/(38.83333-13.3) = 0.8198433 by linear interpolation. Everything else is perfect.
Note that b1 should look like this:
$$b_1(t) = \frac{t}{13.3}\, I(0 \le t < 13.3) + \frac{38.83333 - t}{38.83333 - 13.3}\, I(13.3 \le t < 38.83333)$$
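The piecewise formula can be written directly as an R function and evaluated at the data points. This is a sketch: the function and its arguments k1 and k2 are illustrative, and the knot values are the rounded ones quoted in the text.

```r
# Piecewise-linear "hat" function b1; logical indicators act as 0/1 multipliers
b1 <- function(t, k1 = 13.3, k2 = 38.83333) {
  (t / k1) * (0 <= t & t < k1) +
    ((k2 - t) / (k2 - k1)) * (k1 <= t & t < k2)
}

x <- c(0.0, 11.0, 17.9, 49.3, 77.4)
vals <- b1(x)
print(round(vals, 6))
```

The result should agree with the first column of b up to the rounding of the knots: zero at x[1], x[4] and x[5], about 0.827 at x[2] and about 0.820 at x[3].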