How can I calculate survival function in gbm package analysis?

Question

I would like to analyze my data with a gradient boosted model.

However, because my data are cohort data, I am having trouble interpreting the output of this model.

Here's my code. The analysis was performed on an example dataset.

install.packages("randomForestSRC")
install.packages("gbm")
install.packages("survival")

library(randomForestSRC)
library(gbm)
library(survival)

data(pbc, package="randomForestSRC")
data <- na.omit(pbc)

set.seed(9512)
train <- sample(1:nrow(data), round(nrow(data)*0.7))
data.train <- data[train, ]
data.test <- data[-train, ]

set.seed(9741)
gbm <- gbm(Surv(days, status)~.,
           data.train,
           interaction.depth=2,
           shrinkage=0.01,
           n.trees=500,
           distribution="coxph")

summary(gbm)


set.seed(9741)
gbm.pred <- predict.gbm(gbm, 
                        n.trees=500,
                        newdata=data.test, 
                        type="response")

As I understand the package documentation, "gbm.pred" is the result of Cox's partial likelihood.

set.seed(9741)
lambda0 = basehaz.gbm(t=data.test$days, 
                      delta=data.test$status,  
                      t.eval=sort(data.test$days), 
                      cumulative = FALSE, 
                      f.x=gbm.pred, 
                      smooth=T)

hazard=lambda0*exp(gbm.pred)

In this code, lambda0 is the baseline hazard function.

So, according to the formula h(t|x) = lambda0(t) * exp(f(x)),

"hazard" is the hazard function.

However, what I want to calculate is the survival function,

because I would like to compare the outcome in the original data (data$status) with the predicted result (the survival function).

Please let me know how to calculate the survival function.

Thank you.

Answer 1:

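Actually, basehaz.gbm returns the cumulative baseline hazard function (the integral \int_0^t \lambda_0(z) dz) when it is called with cumulative = TRUE, which is its default. Because the cumulative hazard for a subject is e^{f(X)} times this integral, and the survival function is exp of minus the cumulative hazard, the survival function can be computed as:

S(t|X) = \exp\{ -e^{f(X)} \int_0^t \lambda_0(z) dz \}

Here f(X) is the gbm prediction, which equals the log hazard ratio relative to the baseline.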
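The formula can be sketched in base R as follows. This is only a minimal illustration, not code from the gbm package: `surv_from_cumhaz` is a hypothetical helper name, and the numbers in `H0` and `f_x` are made-up stand-ins for a real cumulative baseline hazard and real gbm predictions.

```r
# Sketch: S(t|X) = exp(-exp(f(X)) * H0(t)), where H0 is the cumulative
# baseline hazard. `surv_from_cumhaz` is a hypothetical helper, not part of gbm.
surv_from_cumhaz <- function(H0, f_x) {
  stopifnot(is.numeric(H0), is.numeric(f_x))
  # outer() builds a matrix with one row per subject and one column per time:
  # S[i, j] = exp(-exp(f_x[i]) * H0[j])
  exp(-outer(exp(f_x), H0))
}

# Toy inputs (made-up numbers standing in for real model output)
H0  <- c(0.05, 0.10, 0.20)  # cumulative baseline hazard at three time points
f_x <- c(-0.5, 0, 0.5)      # log hazard ratios for three subjects

S <- surv_from_cumhaz(H0, f_x)
print(round(S, 3))
```

With the objects from the question, `f_x` would be `gbm.pred`, and `H0` would presumably come from `basehaz.gbm(..., cumulative = TRUE)` evaluated at the time points of interest via `t.eval`. Each row of `S` is then one subject's predicted survival curve over those time points.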

I think this tutorial on gbm-based survival analysis will help you:

https://github.com/liupei101/Tutorial-Machine-Learning-Based-Survival-Analysis/blob/master/Tutorial_Survival_GBM.ipynb

Source: https://stackoverflow.com/questions/52222714/how-can-i-calculate-survival-function-in-gbm-package-analysis

Tags

boosting