Color branches of dendrogram using an existing column

春和景丽 2021-01-14 04:46

I have a data frame that I am trying to cluster, currently with hclust. The data frame has a FLAG column that I would like to use to color the branches of the dendrogram.

2 answers
  • 2021-01-14 05:31

    I think Arhopala's answer is good. I took the liberty of going a step further and added the function assign_values_to_leaves_edgePar to the dendextend package (starting from version 0.17.2, which is now on GitHub). This version of the function is a bit more robust and flexible than Arhopala's answer because:

    1. It is a general function which can work in different problems/settings
    2. The function can deal with other edgePar parameters (col, lwd, lty)
    3. The function recycles partial vectors and issues various warning messages when needed.

    To install the dendextend package you can use install.packages('dendextend'), but for the latest version, use the following code:

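    # Install a package if it is missing, then attach it.
    require2 <- function(package, ...) {
        # character.only = TRUE is needed because `package` holds the name as a string
        if (!require(package, character.only = TRUE)) install.packages(package, ...)
        library(package, character.only = TRUE)
    }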
    
    ## require2('installr')
    ## install.Rtools() # run this if you are using Windows and don't have Rtools installed (you must have it for devtools)
    
    # Load devtools:
    require2("devtools")
    devtools::install_github('talgalili/dendextend')
    

    Now that we have dendextend installed, here is a second take on Arhopala's answer:

    library(dendextend)

    x <- 1:100
    dim(x) <- c(10, 10)
    set.seed(1)
    groups <- sample(c("red", "blue"), 10, replace = TRUE)
    x.clust <- as.dendrogram(hclust(dist(x)))
    
    x.clust.dend <- x.clust
    x.clust.dend <- assign_values_to_leaves_edgePar(x.clust.dend, value = groups, edgePar = "col") # add the colors
    x.clust.dend <- assign_values_to_leaves_edgePar(x.clust.dend, value = 3, edgePar = "lwd") # make the lines thick
    plot(x.clust.dend)
    

    Here is the result:

    [Image: the resulting dendrogram]

    P.S.: I personally prefer pipes for this kind of code. The following gives the same result as above but is easier to read:

    x.clust <- x %>% dist %>% hclust %>% as.dendrogram
    x.clust.dend <- x.clust %>% 
       assign_values_to_leaves_edgePar(value = groups, edgePar = "col") %>% # add the colors.
       assign_values_to_leaves_edgePar(value = 3, edgePar = "lwd") # make the lines thick
    plot(x.clust.dend)
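
    Point 2 in the list above mentions lty as well as col and lwd. As a minimal sketch, assuming dendextend is installed and reusing the toy data from the example above, the same function can set a dashed line type on the leaf edges. A single value is recycled across all leaves, just as value = 3 is for lwd above:

    ```r
    library(dendextend)  # assumed to be installed already

    # Same toy data as in the example above
    x <- matrix(1:100, nrow = 10)
    set.seed(1)
    groups <- sample(c("red", "blue"), 10, replace = TRUE)

    dend <- as.dendrogram(hclust(dist(x)))
    dend <- assign_values_to_leaves_edgePar(dend, value = groups, edgePar = "col")  # leaf edge colors
    dend <- assign_values_to_leaves_edgePar(dend, value = 2, edgePar = "lty")       # 2 = dashed lines
    plot(dend)
    ```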
    
  • 2021-01-14 05:43

    If you want to color the branches of a dendrogram based on a certain variable then the following code (largely taken from the help for the dendrapply function) should give the desired result:

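    x <- 1:100
    dim(x) <- c(10, 10)
    groups <- sample(c("red", "blue"), 10, replace = TRUE)
    
    x.clust <- as.dendrogram(hclust(dist(x)))
    
    # local() keeps the counter i and the color vector mycols out of the global
    # environment; only colLab itself is exported via <<-.
    local({
      colLab <<- function(n) {
        if (is.leaf(n)) {
          a <- attributes(n)
          i <<- i + 1  # leaves are visited from left to right
          # keep any existing edgePar settings and add the color for this leaf
          attr(n, "edgePar") <- c(a$edgePar, list(col = mycols[i]))
        }
        n
      }
      mycols <- groups
      i <- 0
    })
    
    x.clust.dend <- dendrapply(x.clust, colLab)
    plot(x.clust.dend)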
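
    One caveat: dendrapply visits the leaves in left-to-right dendrogram order, which generally differs from the row order of the data. With random colors as above this does not matter, but with a real column it does. The sketch below is one way to line the two up using only base R. The data frame df, its columns a, b and FLAG, and the helper color_leaves_by are made up for illustration, and FLAG is assumed to already contain color names:

    ```r
    # Hypothetical data: two numeric columns plus a FLAG column holding color names.
    set.seed(1)
    df <- data.frame(a = rnorm(10), b = rnorm(10),
                     FLAG = sample(c("red", "blue"), 10, replace = TRUE),
                     stringsAsFactors = FALSE)

    dend <- as.dendrogram(hclust(dist(df[, c("a", "b")])))

    # Hypothetical helper: color each leaf edge by the value belonging to its row.
    color_leaves_by <- function(dend, values) {
      # order.dendrogram() gives the original row index of each leaf, left to right
      leaf_cols <- values[order.dendrogram(dend)]
      i <- 0
      dendrapply(dend, function(n) {
        if (is.leaf(n)) {
          i <<- i + 1  # counter lives in color_leaves_by(), not the global environment
          attr(n, "edgePar") <- c(attr(n, "edgePar"), list(col = leaf_cols[i]))
        }
        n
      })
    }

    dend.col <- color_leaves_by(dend, df$FLAG)
    plot(dend.col)
    ```

    If FLAG holds something other than color names (for example TRUE/FALSE), it would need to be mapped to colors first, e.g. with ifelse(). The same reordered vector should also work as the value argument of assign_values_to_leaves_edgePar from the other answer, on the assumption that it too assigns values in leaf order.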
    