CHAID regression tree to table conversion in R

粉色の甜心 2021-01-16 12:58

I used the CHAID package from this link. It gives me a chaid object, which can be plotted. I want a decision table with each decision rule in a column instead of a decision tree.

1 answer
  • 2021-01-16 13:07

    The CHAID package builds its trees on partykit (recursive partitioning) structures. You can walk the tree through its party nodes: each node is either terminal or holds a list of child nodes, along with information about its decision rule (split) and fitted data.
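
    To make the node structure concrete, here is a minimal sketch of such a walk using partykit's accessor functions on a small hand-built tree. The toy data, the tree itself, and the helper name collect_paths are made up for illustration; the root node of a real CHAID tree could be passed in the same way, assuming all of its splits are on factors.

    ```r
    library(partykit)

    # Toy data with one factor predictor (hypothetical, for illustration only);
    # factor levels sort alphabetically: blue, green, red
    toy <- data.frame(colour = factor(c("red", "green", "blue", "red")))

    # Hand-built tree: the root splits on column 1 (colour);
    # index sends blue and green to kid 1, red to kid 2
    root <- partynode(
      id    = 1L,
      split = partysplit(varid = 1L, index = c(1L, 1L, 2L)),
      kids  = list(partynode(id = 2L), partynode(id = 3L))
    )
    toy_tree <- party(root, data = toy)

    # Hypothetical helper: depth-first walk that returns, for each terminal node,
    # the conditions met on the way down from the root
    collect_paths <- function(node, data, path = character(0)) {
      if (is.terminal(node)) {
        return(setNames(list(path), as.character(id_node(node))))
      }
      sp   <- split_node(node)
      var  <- names(data)[varid_split(sp)]
      lev  <- levels(data[[var]])
      kids <- kids_node(node)
      out  <- list()
      for (i in seq_along(kids)) {
        # which() ignores NA entries in the split index
        picked <- lev[which(index_split(sp) == i)]
        cond   <- paste0(var, " in {", paste(picked, collapse = ", "), "}")
        out    <- c(out, collect_paths(kids[[i]], data, c(path, cond)))
      }
      out
    }

    paths <- collect_paths(node_party(toy_tree), toy)
    str(paths)
    ```

    The result is a list named by terminal node id, with one character vector of conditions per leaf.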

    The code below walks the tree and creates the decision table. It is written for demonstration purposes and tested only on one sample tree.

    library(partykit)  # provides is.terminal() and predict_party()
    
    tree2table <- function(party_tree) {
    
      df_list <- list()
      var_names <-  attr( party_tree$terms, "term.labels")
      var_levels <- lapply( party_tree$data, levels)
    
      walk_the_tree <- function(node, rule_branch = NULL) {
        # depth-first walk on partynode structure (recursive function)
        # decision rules are extracted for every branch
        if(missing(rule_branch)) {
          rule_branch <- setNames(data.frame(t(replicate(length(var_names), NA))), var_names)
          rule_branch <- cbind(rule_branch, nodeId = NA)
          rule_branch <- cbind(rule_branch, predict = NA)
        }
        if(is.terminal(node)) {
          rule_branch[["nodeId"]] <- node$id
          rule_branch[["predict"]] <- predict_party(party_tree, node$id) 
          df_list[[as.character(node$id)]] <<- rule_branch
        } else {
          for(i in seq_len(length(node))) {
            rule_branch1 <- rule_branch
            val1 <- decision_rule(node,i)
            rule_branch1[[names(val1)[1]]] <- val1
            walk_the_tree(node[i], rule_branch1)
          }
        }
      }
    
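      decision_rule <- function(node, i) {
        # returns the split decision rule as a named string:
        # the name is the split variable, the value lists the levels sent to kid i
        # (assumes a factor split whose varid follows the order of var_names)
        var_name <- var_names[node$split$varid[[1]]]
        # which() drops NA entries in the split index
        values_vec <- var_levels[[var_name]][ which(node$split$index == i)]
        values_txt <- paste(values_vec, collapse = ", ")
        return( setNames(values_txt, var_name))
      }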
      # compile data frame list
      walk_the_tree(party_tree$node)
      # merge all dataframes
      res_table <- Reduce(rbind, df_list)
      return(res_table)
    }
    

    Call the function with the CHAID tree object:

    table1 <- tree2table(chaidUS)
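
    The call assumes a fitted tree named chaidUS already exists. The column names in the result suggest the USvote data that ships with CHAID, so a tree of this kind might be fitted roughly as sketched here. The seed, the subsample size, and the control settings are assumptions, and different values will give a different tree and therefore a different table.

    ```r
    library(CHAID)   # assumes the CHAID package is installed

    # Assumed setup: a random subsample of the USvote data shipped with CHAID
    set.seed(290875)
    USvoteS <- USvote[sample(seq_len(nrow(USvote)), 1000), ]

    # Assumed control settings (minimum node size to split, minimum leaf share)
    ctrl    <- chaid_control(minsplit = 200, minprob = 0.1)
    chaidUS <- chaid(vote3 ~ ., data = USvoteS, control = ctrl)

    # Text view of the tree, handy for checking the table against it
    print(chaidUS)
    ```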
    

    The result should look something like this:

    gender   ager                       empstat   educr              marstat                          nodeId   predict  
    -------- -------------------------- --------- ------------------ -------------------------------- -------- ---------
    NA       NA                         NA        <HS, HS, >HS       married                          3        Gore     
    NA       NA                         NA        College, Post Coll married                          4        Bush     
    male     NA                         NA        NA                 widowed, divorced, never married 6        Gore     
    female   18-24, 25-34, 35-44, 45-54 NA        NA                 widowed, divorced, never married 8        Gore     
    female   55-64, 65+                 NA        NA                 widowed, divorced, never married 9        Gore
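
    An NA cell means the variable is not used on the path to that terminal node. As a small illustration of how the table can be read back as rules, the sketch below turns each row into an IF ... THEN string. The helper name rule_text is hypothetical, and the data frame is typed in by hand from the first three rows above, with the unused ager and empstat columns left out.

    ```r
    # First three rows of the table above, entered by hand
    tbl <- data.frame(
      gender  = c(NA, NA, "male"),
      educr   = c("<HS, HS, >HS", "College, Post Coll", NA),
      marstat = c("married", "married", "widowed, divorced, never married"),
      nodeId  = c(3, 4, 6),
      predict = c("Gore", "Bush", "Gore"),
      stringsAsFactors = FALSE
    )

    # Hypothetical helper: one rule string per row, skipping NA cells
    rule_text <- function(tbl, meta = c("nodeId", "predict")) {
      cond_cols <- setdiff(names(tbl), meta)
      vapply(seq_len(nrow(tbl)), function(r) {
        # keep only the variables that carry a condition in this row
        used  <- cond_cols[!is.na(unlist(tbl[r, cond_cols]))]
        conds <- paste0(used, " in {", unlist(tbl[r, used]), "}")
        paste("IF", paste(conds, collapse = " AND "), "THEN", tbl$predict[r])
      }, character(1))
    }

    rule_text(tbl)
    ```

    Conditions appear in column order rather than in the order the tree applies them, which does not change the meaning of a rule.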
    