问题
I want to plot the estimated hazard ratio as a function of time in the case of a coxph
model with a time-dependent coefficient that is based on a spline term. I created the time-dependent coefficient using function tt
, analogous to this example that comes straight from ?coxph
:
# Fit a time transform model using current age
cox = coxph(Surv(time, status) ~ ph.ecog + tt(age), data=lung,
tt=function(x,t,...) pspline(x + t/365.25))
Calling survfit(cox)
results in an error that survfit
does not understand models with a tt
term (as described in 2011 by Terry Therneau).
You can extract the linear predictor using cox$linear.predictors
, but I would need to somehow extract ages and less trivially, times to go with each. Because tt
splits the dataset on event times, I can't just match up the columns of the input dataframe with the coxph
output. Additionally, I really would like to plot the estimated function itself, not just the predictions for the observed data points.
There is a related question involving splines here, but it does not involve tt
.
Edit (7/7)
I'm still stuck on this. I've been looking in depth at this object:
spline.obj = pspline(lung$age)
str(spline.obj)
# something that looks very useful, but I am not sure what it is
# cbase appears to be the cardinal knots
attr(spline.obj, "printfun")
function (coef, var, var2, df, history, cbase = c(43.3, 47.6,
51.9, 56.2, 60.5, 64.8, 69.1, 73.4, 77.7, 82, 86.3, 90.6))
{
test1 <- coxph.wtest(var, coef)$test
xmat <- cbind(1, cbase)
xsig <- coxph.wtest(var, xmat)$solve
cmat <- coxph.wtest(t(xmat) %*% xsig, t(xsig))$solve[2, ]
linear <- sum(cmat * coef)
lvar1 <- c(cmat %*% var %*% cmat)
lvar2 <- c(cmat %*% var2 %*% cmat)
test2 <- linear^2/lvar1
cmat <- rbind(c(linear, sqrt(lvar1), sqrt(lvar2), test2,
1, 1 - pchisq(test2, 1)), c(NA, NA, NA, test1 - test2,
df - 1, 1 - pchisq(test1 - test2, max(0.5, df - 1))))
dimnames(cmat) <- list(c("linear", "nonlin"), NULL)
nn <- nrow(history$thetas)
if (length(nn))
theta <- history$thetas[nn, 1]
else theta <- history$theta
list(coef = cmat, history = paste("Theta=", format(theta)))
}
So, I have the knots, but I am still not sure how to combine the coxph
coefficients with the knots in order to actually plot the function. Any leads much appreciated.
回答1:
I think what you need can be generated by generating an input matrix using pspline
and matrix-multiplying this by the relevant coefficients from the coxph
output. To get the HR, you then need to take the exponent.
i.e.
output <- data.frame(Age = seq(min(lung$age) + min(lung$time) / 365.25,
max(lung$age + lung$time / 365.25),
0.01))
output$HR <- exp(pspline(output$Age) %*% cox$coefficients[-1] -
sum(cox$means[-1] * cox$coefficients[-1]))
library("ggplot2")
ggplot(output, aes(x = Age, y = HR)) + geom_line()
Note the age here is the age at the time of interest (i.e. the sum of the baseline age and the elapsed time since study entry). It has to use the range specified to match with the parameters in the original model. It could also be calculated using the x
output from using x = TRUE
as shown:
cox <- coxph(Surv(time, status) ~ ph.ecog + tt(age), data=lung,
tt=function(x,t,...) pspline(x + t/365.25), x = TRUE)
index <- as.numeric(unlist(lapply(strsplit(rownames(cox$x), "\\."), "[", 1)))
ages <- lung$age[index]
output2 <- data.frame(Age = ages + cox$y[, 1] / 365.25,
HR = exp(cox$x[, -1] %*% cox$coefficients[-1] -
sum(cox$means[-1] * cox$coefficients[-1])))
来源:https://stackoverflow.com/questions/31105216/plotting-estimated-hr-from-coxph-object-with-time-dependent-coefficient-and-spli