ggpairs plot with heatmap of correlation values

花落未央 2020-12-06 11:33

My question is twofold:

I have a ggpairs plot with the default upper = list(continuous = "cor"), and I would like to colour the tiles by their correlation values.

2 Answers
  • 2020-12-06 12:11

    A possible solution is to get the list of colors from the ggcorr correlation matrix plot and to set these colors as background in the upper tiles of the ggpairs matrix of plots.

    library(GGally)   
    library(mvtnorm)
    # Generate data
    set.seed(1)
    n <- 100
    p <- 7
    A <- matrix(runif(p^2)*2-1, ncol=p) 
    Sigma <- cov2cor(t(A) %*% A)
    sample_df <- data.frame(rmvnorm(n, mean=rep(0,p), sigma=Sigma))
    colnames(sample_df) <- c("KUM", "MHP", "WEB", "OSH", "JAC", "WSW", "gaugings")
    
    # Matrix of plots
    p1 <- ggpairs(sample_df, lower = list(continuous = "smooth"))  
    # Correlation matrix plot
    p2 <- ggcorr(sample_df, label = TRUE, label_round = 2)
    

    [Figure: the correlation matrix plot produced by ggcorr, stored in p2]

    # Get list of colors from the correlation matrix plot
    library(ggplot2)
    g2 <- ggplotGrob(p2)
    # Note: these grob indices are positional and may differ between ggplot2 versions
    colors <- g2$grobs[[6]]$children[[3]]$gp$fill
    
    # Change background color of the tiles in the upper triangular matrix of plots
    idx <- 1
    for (k1 in 1:(p-1)) {
      for (k2 in (k1+1):p) {
        plt <- getPlot(p1, k1, k2) +
          theme(panel.background = element_rect(fill = colors[idx], color = "white"),
                panel.grid.major = element_line(color = colors[idx]))
        p1 <- putPlot(p1, plt, k1, k2)
        idx <- idx + 1
      }
    }
    print(p1)
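
The grob lookup above depends on the internal layout of the ggcorr plot. As a hedged alternative, the colours can be computed directly from the correlation matrix in base R. This is only a sketch: `upper_tri_fills` is a hypothetical helper (not part of GGally), the three palette colours are assumed to match ggcorr's default low/mid/high colours, and the linear RGB interpolation used here will not reproduce ggcorr's gradient exactly.

```r
# Hypothetical helper: one fill colour per upper-triangle cell, returned in the
# same row-by-row order as the k1/k2 loop (pairs (1,2), (1,3), ..., (p-1,p)).
# The palette is an assumption, chosen to resemble ggcorr's default colours.
upper_tri_fills <- function(df, palette = c("#3B9AB2", "#EEEEEE", "#F21A00")) {
  cm <- cor(df, use = "pairwise.complete.obs")
  # Transposing first makes lower.tri() walk the upper triangle row by row
  r <- t(cm)[lower.tri(cm)]
  # Guard against values a hair outside [-1, 1] from floating-point error
  r <- pmin(pmax(r, -1), 1)
  # colorRamp() expects inputs in [0, 1], so rescale the correlations
  rgb_mat <- colorRamp(palette)((r + 1) / 2)
  rgb(rgb_mat[, 1], rgb_mat[, 2], rgb_mat[, 3], maxColorValue = 255)
}

# Small demo data: b rises with a, c falls with a
demo_df <- data.frame(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8), c = c(4, 3, 2, 1))
fills <- upper_tri_fills(demo_df)
```

With the data from this answer, `colors <- upper_tri_fills(sample_df)` could stand in for the `g2$grobs[[6]]...` line, and the loop would stay unchanged.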
    

  • 2020-12-06 12:16

    You can map a background colour to each cell by writing a short custom function that is passed directly to ggpairs. The function calculates the correlation between each pair of variables and then matches it to a user-specified colour range.

    my_fn <- function(data, mapping, method = "p", use = "pairwise", ...) {
    
      # grab data
      x <- eval_data_col(data, mapping$x)
      y <- eval_data_col(data, mapping$y)
    
      # calculate correlation
      corr <- cor(x, y, method = method, use = use)
    
      # calculate colour based on correlation value
      # Here I have set a correlation of minus one to blue,
      # zero to white, and one to red
      # Change this to suit: possibly extend to add as an argument of `my_fn`
      colFn <- colorRampPalette(c("blue", "white", "red"), interpolate = "spline")
      fill <- colFn(100)[findInterval(corr, seq(-1, 1, length.out = 100))]
    
      ggally_cor(data = data, mapping = mapping, ...) +
        theme_void() +
        theme(panel.background = element_rect(fill = fill))
    }
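
To make the colour lookup easier to follow, here is a minimal base-R sketch of the same idea on its own. `cor_to_fill` is a hypothetical name used only for illustration, and the sketch uses the default linear interpolation rather than the spline option used in `my_fn`.

```r
# Hypothetical helper: map correlations in [-1, 1] onto a palette of n colours.
cor_to_fill <- function(corr, colours = c("blue", "white", "red"), n = 100) {
  palette <- colorRampPalette(colours)(n)   # n colours from first to last
  breaks <- seq(-1, 1, length.out = n)      # n evenly spaced grid points
  # findInterval() returns the index of the last break <= corr, i.e. 1..n
  palette[findInterval(corr, breaks)]
}

example_fills <- cor_to_fill(c(-1, -0.5, 0, 0.5, 1))
```

A correlation of -1 lands on the first palette entry (pure blue) and +1 on the last (pure red). Values near zero land near the middle of the palette, which is close to white.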
    

    Using the data in Marco's answer:

    library(GGally)    # version: '1.4.0'
    
    p1 <- ggpairs(sample_df,
                  upper = list(continuous = my_fn),
                  lower = list(continuous = "smooth"))
    p1
    

    [Figure: the resulting ggpairs plot, with the upper tiles coloured by correlation]


    A follow-up question, "Change axis labels of a modified ggpairs plot (heatmap of correlation)", noted that updating the theme after plotting removed the panel.background colours. This can be fixed by dropping theme_void() and removing the grid lines within the theme instead, i.e. by changing the relevant part to the following (note that this fix is not required for ggplot2 v3.3.0):

    ggally_cor(data = data, mapping = mapping, ...) +
      theme(panel.background = element_rect(fill = fill, colour = NA),
            panel.grid.major = element_blank())
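
For reference, the function with this change applied might look like the sketch below. It assumes GGally (1.4.0 or later, for `eval_data_col`) and ggplot2 are installed. The name `my_fn_fixed` and the `colours` argument are illustrative additions, following the suggestion in the code comments to expose the palette as an argument.

```r
# Sketch only: my_fn with theme_void() removed and the grid lines blanked.
# `my_fn_fixed` and the `colours` argument are hypothetical additions.
my_fn_fixed <- function(data, mapping, method = "pearson",
                        use = "pairwise.complete.obs",
                        colours = c("blue", "white", "red"), ...) {
  # Pull the two columns for this panel out of the aesthetic mapping
  x <- GGally::eval_data_col(data, mapping$x)
  y <- GGally::eval_data_col(data, mapping$y)
  corr <- cor(x, y, method = method, use = use)

  # Look up the fill colour on a 100-step palette spanning [-1, 1]
  palette <- colorRampPalette(colours)(100)
  fill <- palette[findInterval(corr, seq(-1, 1, length.out = 100))]

  GGally::ggally_cor(data = data, mapping = mapping, ...) +
    ggplot2::theme(
      panel.background = ggplot2::element_rect(fill = fill, colour = NA),
      panel.grid.major = ggplot2::element_blank()
    )
}
```

It would be passed to ggpairs in the same way as before, for example `upper = list(continuous = my_fn_fixed)`.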
    