Transform a dataframe into a tree structure list of lists

别说谁变了你拦得住时间么, submitted on 2019-11-30 04:44:02

Question


I have a data.frame with two columns representing a hierarchical tree, with parents and nodes.

I want to transform it into a structure that I can use as input for the d3Tree function from the d3Network package.

Here's my data frame:

df <- data.frame(c("Canada","Canada","Quebec","Quebec","Ontario","Ontario"),c("Quebec","Ontario","Montreal","Quebec City","Toronto","Ottawa"))
names(df) <- c("parent","child")

And I want to transform it into this structure:

Canada_tree <- list(name = "Canada", children = list(
  list(name = "Quebec",
       children = list(list(name = "Montreal"), list(name = "Quebec City"))),
  list(name = "Ontario",
       children = list(list(name = "Toronto"), list(name = "Ottawa")))))

I have successfully transformed this particular case using the code below:

fill_list <- function(df, node) {
  node <- as.character(node)
  # is.leaf() is my own helper (not shown here)
  if (is.leaf(df, node) == TRUE) {
    return(list(name = node))
  }
  else {
    new_node <- df[df[, 1] == node, 2]

    return(list(name = node, children = list(fill_list(df, new_node[1]), fill_list(df, new_node[2]))))
  }
}

The problem is that it only works for trees in which every parent node has exactly two children. You can see that I hard-coded the two children (new_node[1] and new_node[2]) as inputs to my recursive function.

I'm trying to figure out a way to call the recursive function once for each of the parent node's children. Example:

fill_list(df,new_node[1]),...,fill_list(df,new_node[length(new_node)])

I tried these three possibilities, but none of them worked:

First: Creating a string with all the function calls and parameters and then evaluating it. It returns this error: could not find function fill_functional(df,new_node[1]). That's because my function wasn't created by the time I called it after all.

fill_functional <- function(df,node) {
  node <- as.character(node)
  if (is.leaf(df,node)==TRUE){
    return (list(name = node))
  }
  else {
    new_node = df[df[,1] == node,2]
    level <- length(new_node)
    xxx <- paste0("(df,new_node[",seq(level),"])")
    lapply(xxx,function(x) eval(call(paste("fill_functional",x,sep=""))))

  }
}
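
A note on that error message: it is consistent with how call() reads its first argument, independent of when the function was defined. call() takes the whole string as the name of a function, so R ends up looking for a function literally named "fill_functional(df,new_node[1])". A minimal sketch, using a hypothetical one-argument function f as a stand-in for the recursive function:

```r
# f is a hypothetical stand-in for the recursive function.
f <- function(x) x + 1

# call() treats its first argument as a function *name*, so this builds
# a call to a function literally named "f(1)", which does not exist.
bad <- call("f(1)")
bad_msg <- tryCatch(eval(bad), error = function(e) conditionMessage(e))

# Passing the function name and its arguments separately works.
good <- eval(call("f", 1))

# do.call() does the same with the arguments supplied as a list.
also_good <- do.call("f", list(1))
```

Here bad_msg holds the "could not find function" error text, while good and also_good both evaluate f(1).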

Second: Using a for loop. But I only got the children of my root node.

L <- list()
fill_list <- function(df,node) {
  node <- as.character(node)
  if (is.leaf(df,node)==TRUE){
    return (list(name = node))
  }
  else {
    new_node = df[df[,1] == node,2]

    for (i in 1:length(new_node)){
      L[i] <- (fill_list(df,new_node[i]))
    }

    return (list(name = node, children = L))
  }
}
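
For comparison, here is a minimal sketch of a loop that does recurse into every child. It rests on two changes that are assumptions about what goes wrong above: the list is created inside the function so that each call gets its own, and elements are assigned with [[i]] so that a whole sub-list is stored (L[i] <- keeps only the first element of the returned list). Because is.leaf is not shown, the sketch tests for children directly; fill_list_loop and kids are hypothetical names.

```r
df <- data.frame(
  parent = c("Canada", "Canada", "Quebec", "Quebec", "Ontario", "Ontario"),
  child  = c("Quebec", "Ontario", "Montreal", "Quebec City", "Toronto", "Ottawa"),
  stringsAsFactors = FALSE
)

fill_list_loop <- function(df, node) {
  node <- as.character(node)
  # All rows whose parent is this node give its children.
  children <- as.character(df[df[, 1] == node, 2])

  # A node without children is a leaf.
  if (length(children) == 0) {
    return(list(name = node))
  }

  # Local list: every recursive call fills its own copy.
  kids <- vector("list", length(children))
  for (i in seq_along(children)) {
    # [[i]] stores the complete sub-list returned by the recursive call.
    kids[[i]] <- fill_list_loop(df, children[i])
  }

  list(name = node, children = kids)
}

tree <- fill_list_loop(df, "Canada")
```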

Third: Creating a function that populates a list with elements that are functions, and just changing the arguments. But I wasn't able to accomplish anything interesting, and I'm afraid I'll have the same problem as I did on my first try described above.


Answer 1:


Here is a recursive definition:

maketreelist <- function(df, root = df[1, 1]) {
  if (is.factor(root)) root <- as.character(root)
  r <- list(name = root)
  children <- df[df[, 1] == root, 2]
  if (is.factor(children)) children <- as.character(children)
  if (length(children) > 0) {
    r$children <- lapply(children, maketreelist, df = df)
  }
  r
}

canadalist <- maketreelist(df)

That produces what you desire. This function assumes that the first column of the data.frame (or matrix) you pass in contains the parent and the second column contains the child. It also takes a root parameter, which allows you to specify a starting point. It defaults to the first parent in the list.
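
As a quick check of those claims, the sketch below rebuilds the question's data, compares the result with the structure the question asks for, and then tries the root parameter and a parent with three children. The second data frame (df3 with nodes A to D) and the variable names full, sub and wide are hypothetical and only there for illustration.

```r
maketreelist <- function(df, root = df[1, 1]) {
  if (is.factor(root)) root <- as.character(root)
  r <- list(name = root)
  children <- df[df[, 1] == root, 2]
  if (is.factor(children)) children <- as.character(children)
  if (length(children) > 0) {
    r$children <- lapply(children, maketreelist, df = df)
  }
  r
}

# Data from the question.
df <- data.frame(
  parent = c("Canada", "Canada", "Quebec", "Quebec", "Ontario", "Ontario"),
  child  = c("Quebec", "Ontario", "Montreal", "Quebec City", "Toronto", "Ottawa"),
  stringsAsFactors = FALSE
)

# Target structure from the question.
Canada_tree <- list(name = "Canada", children = list(
  list(name = "Quebec",
       children = list(list(name = "Montreal"), list(name = "Quebec City"))),
  list(name = "Ontario",
       children = list(list(name = "Toronto"), list(name = "Ottawa")))))

# Default root: the first parent in the data frame.
full <- maketreelist(df)
same_as_target <- identical(full, Canada_tree)

# Explicit root: build only the subtree below "Quebec".
sub <- maketreelist(df, root = "Quebec")

# Hypothetical data with three children under one parent.
df3 <- data.frame(
  parent = c("A", "A", "A"),
  child  = c("B", "C", "D"),
  stringsAsFactors = FALSE
)
wide <- maketreelist(df3)
```

same_as_target reports whether the output matches the hand-written list exactly, and length(wide$children) shows how many children were picked up for the three-child parent.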

But if you are really interested in playing around with trees, the igraph package might be of interest:

library(igraph)
g <- graph.data.frame(df)
plot(g)



Source: https://stackoverflow.com/questions/23839142/transform-a-dataframe-into-a-tree-structure-list-of-lists
