I am using the ctree function from the party R package. I would like to identify all predictors that are used within the tree in order to reduce the dimension of the data.frame used for furth
You can try using this:
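getUsefulPredictors <- function(x) {
  # Flatten the nested tree stored in the @tree slot into a named vector
  flatTree <- unlist(x@tree)
  # Keep the entries whose name contains "variableName", without duplicates
  pred <- unique(flatTree[grepl("variableName", names(flatTree), fixed = TRUE)])
  return(pred)
}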
It flattens the tree and looks for the elements that have variableName in their name.
Run on your model, it returns:
getUsefulPredictors(myModel)
#[1] "Temp" "Wind"
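To see why the flattening trick works, here is a minimal sketch on a hand-made nested list. Note that toyTree and its layout are made up for illustration and are not the real structure of a party tree; the only point is that unlist() builds dotted names from the nesting, so a match on the name finds the entries at any depth.

```r
# Hypothetical stand-in for a tree: a nested list, NOT the real party layout
toyTree <- list(
  psplit = list(variableName = "Temp"),
  left   = list(psplit = list(variableName = "Wind")),
  right  = list(psplit = list(variableName = "Temp"))
)

# unlist() flattens the nesting and joins the element names with "."
flat <- unlist(toyTree)
flatNames <- names(flat)

# Keep the entries whose name contains "variableName" and drop duplicates
toyPred <- unique(flat[grepl("variableName", flatNames, fixed = TRUE)])
print(flatNames)
print(toyPred)
```

Here flatNames contains names such as left.psplit.variableName, which is why matching on the name is enough to pick up splits at every level.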
Just for completeness: The answer by NicE pertains to the ctree() implementation in the party package. If someone wants to do the same thing based on the new (and recommended) implementation in the partykit package, then a different function is necessary because the internal representation has completely changed.
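getUsefulPredictors <- function(x) {
  # Variable ID of the split in each node (NULL for terminal nodes)
  varid <- nodeapply(x, ids = nodeids(x),
    FUN = function(n) split_node(n)$varid)
  # unlist() drops the NULLs; unique() removes repeated variables
  varid <- unique(unlist(varid))
  names(data_party(x))[varid]
}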
This first obtains the variable ID varid from each split in each node of the tree. Then the names of the model frame are obtained, and those pertaining to the unique variable IDs are returned. In your example:
library("partykit")
myModel <- ctree(Ozone ~ ., data = na.omit(airquality))
getUsefulPredictors(myModel)
## [1] "Temp" "Wind"
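Since the goal in the question was to shrink the data.frame, a possible next step could look like the sketch below. It uses base R only. The names preds, dat and reduced are mine, the predictor names are copied from the output above rather than computed, and I am assuming you want to keep the response Ozone alongside the predictors.

```r
# Predictor names as returned by getUsefulPredictors(myModel) above
# (hard-coded here so the snippet runs without partykit)
preds <- c("Temp", "Wind")

# Same data as used to fit the tree
dat <- na.omit(airquality)

# Keep the response plus the predictors the tree actually splits on
reduced <- dat[, c("Ozone", preds), drop = FALSE]
str(reduced)
```

In practice you would replace the hard-coded vector with preds <- getUsefulPredictors(myModel).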