Understanding heatmap dendrogram clustering in R

你的背包 2021-01-07 03:45

I would appreciate any information or reference material on the dendrograms (Colv, Rowv) of R's heatmap function, such as how the clustering works (is it Euclidean distance?). You don't have t

3 answers
  •  清酒与你
    2021-01-07 04:02

    Rowv and Colv control whether the rows and columns of your data set should be reordered and if so how.

    The possible values for them are TRUE, NULL, FALSE, a vector of integers, or a dendrogram object.

    • In the default mode, TRUE, heatmap.2 performs clustering using the hclustfun and distfun parameters. By default this is complete-linkage clustering on Euclidean distances. The dendrogram is then reordered using the row/column means. You can change this by passing different functions to hclustfun or distfun. For example, to use the Manhattan distance rather than the Euclidean distance, you would do:

      heatmap.2(x, ..., distfun = function(y) dist(y, method = "manhattan"))
      

      See ?dist and ?hclust for the available options. If you want to learn more about clustering, you could start by looking up "distance measures" and "agglomeration methods".

    • If Rowv/Colv is NULL or FALSE, then no reordering or clustering is done and the matrix is plotted as-is.

    • If Rowv/Colv is a numeric vector, then the clustering is computed as in the TRUE case, but the dendrogram is reordered using the vector supplied to Rowv/Colv instead of the row/column means.

    • If Rowv/Colv is a dendrogram object, then this dendrogram will be used to reorder the matrix. Dendrogram objects can be generated, for example, by:

      rowDistance = dist(x, method = "manhattan")
      rowCluster = hclust(rowDistance, method = "complete")
      rowDend = as.dendrogram(rowCluster)
      rowDend = reorder(rowDend, rowMeans(x))
      

      This produces a complete-linkage clustering on Manhattan distances, with the leaves ordered by row means. You can now pass rowDend to Rowv:

      heatmap.2(x, ..., Rowv = rowDend)
      

      This is useful if, for example, you want to cluster the rows and columns in different ways, use a clustering that someone else has given you, or do something that cannot be expressed just by specifying hclustfun and distfun. This is what is meant by "the dendrogram is honoured": it is used instead of whatever hclustfun and distfun would produce.
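
    The default behaviour described above can also be rebuilt by hand, which makes each step visible. The following is a minimal sketch, assuming a made-up toy matrix x and using only base R functions (dist, hclust, as.dendrogram, reorder, order.dendrogram). The names rowOrder, colOrder and rowDendAlt are illustrative and not part of any heatmap API:

```r
# Toy data, invented for illustration: 6 rows x 4 columns
set.seed(1)
x <- matrix(rnorm(24), nrow = 6,
            dimnames = list(paste0("r", 1:6), paste0("c", 1:4)))

# Rows: Euclidean distance -> complete linkage -> reorder leaves by row means
rowClust <- hclust(dist(x, method = "euclidean"), method = "complete")
rowDend  <- reorder(as.dendrogram(rowClust), rowMeans(x))

# Columns: same steps on the transposed matrix, reordered by column means
colClust <- hclust(dist(t(x), method = "euclidean"), method = "complete")
colDend  <- reorder(as.dendrogram(colClust), colMeans(x))

# Leaf order of each dendrogram = arrangement of rows/columns along its axis
rowOrder <- order.dendrogram(rowDend)
colOrder <- order.dendrogram(colDend)

# Reordering with a different weight vector keeps the same tree
# (same merges) but may flip branches, so only the leaf order can change
rowDendAlt <- reorder(as.dendrogram(rowClust), 6:1)

print(rownames(x)[rowOrder])
print(colnames(x)[colOrder])
```

    Passing rowDend and colDend as Rowv and Colv should give the same arrangement as the default clustering described above. One caveat if you use base R's heatmap rather than heatmap.2: there, NULL is the default and triggers clustering, and NA is the value that suppresses the dendrogram.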
