Plot one numeric variable against n numeric variables in n plots

难免孤独 2020-11-28 13:44

I have a huge data frame and I would like to make some plots to get an idea of the associations among different variables. I cannot use

pairs(data)
        
4 answers
  • 2020-11-28 14:20

    If your goal is only to get an idea of the associations among different variables, you can also use:

    plot(y~., data = foo)
    

    It is not as nice as using ggplot, and it doesn't automatically put all the graphs in one window (although you can change that using par(mfrow = c(a, b))), but it is a quick way to get what you want.
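
    The par(mfrow = ...) idea mentioned above could look roughly like this. It is only a sketch: the data frame foo and its columns y, x1, x2, x3 are made up for illustration, and a single row of panels is just one possible layout.

    ```r
    # Made-up example data: y is the response, x1..x3 are predictors
    set.seed(1)
    foo <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = runif(20), x3 = 1:20)

    n_pred <- ncol(foo) - 1                 # number of predictors plotted against y
    old_par <- par(mfrow = c(1, n_pred))    # one row, one panel per predictor
    plot(y ~ ., data = foo)                 # one scatter plot of y per predictor
    par(old_par)                            # restore the previous layout
    ```

    With several hundred predictors a single page will not be readable, so the layout would need to be split over several pages or devices.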

  • 2020-11-28 14:24

    You could combine the reshape2, ggplot2 and gridExtra packages. This way you don't need to specify the number of plots: the code works for any number of explanatory variables without modification.

    library(reshape2)
    library(ggplot2)
    library(gridExtra)

    foo <- data.frame(x1=1:10,x2=seq(0.1,1,0.1),x3=-7:2,x4=runif(10,0,1))
    # long format: x3 is kept as id, all other columns are stacked
    foo2 <- melt(foo, "x3")
    # x3 against every other variable, one facet per variable
    p1 <- ggplot(foo2, aes(value, x3)) +  geom_point() + facet_grid(.~variable)
    # distribution of x3 itself
    p2 <- ggplot(foo, aes(x = x3)) + geom_histogram()
    grid.arrange(p1, p2, ncol=2)
    

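
    To see why the number of plots never has to be stated, it may help to look at what melt() returns. A small sketch, assuming reshape2 is installed; the data and the name long are made up for illustration:

    ```r
    library(reshape2)

    # Made-up data: x3 plays the role of the response
    foo <- data.frame(x1 = 1:3, x2 = c(0.1, 0.2, 0.3), x3 = c(-1, 0, 1))

    # "x3" is the id variable; every other column is stacked into a
    # `variable` (original column name) / `value` (cell content) pair
    long <- melt(foo, id.vars = "x3")

    str(long)    # columns: x3, variable, value
    nrow(long)   # 3 rows x 2 melted columns = 6
    ```

    Because each original column becomes a level of variable, faceting on variable produces one panel per column, however many columns there are.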

  • 2020-11-28 14:35

    The tidyr package helps do this efficiently. Please refer here for more options.

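    library(tidyr)
    library(ggplot2)

    # y_value is the response column; every other column is stacked into key/value pairs
    data %>%
      gather(-y_value, key = "some_var_name", value = "some_value_name") %>%
      ggplot(aes(x = some_value_name, y = y_value)) +
        geom_point() +
        facet_wrap(~ some_var_name, scales = "free")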
    

    This gives one scatter-plot panel per variable, each with its own axis scales.
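
    In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The same reshaping could be sketched as follows; the data frame and its columns a, b, c are made up, and y_value, some_var_name and some_value_name are just the placeholder names used in this answer:

    ```r
    library(tidyr)
    library(ggplot2)

    # Made-up data: y_value is the response, a/b/c are predictors
    set.seed(42)
    data <- data.frame(y_value = rnorm(30), a = rnorm(30), b = runif(30), c = rexp(30))

    # Stack every column except y_value into name/value pairs
    long <- pivot_longer(data, cols = -y_value,
                         names_to = "some_var_name",
                         values_to = "some_value_name")

    # One panel per predictor, each with its own axis scales
    p <- ggplot(long, aes(x = some_value_name, y = y_value)) +
      geom_point() +
      facet_wrap(~ some_var_name, scales = "free")
    print(p)
    ```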

  • 2020-11-28 14:35

    I faced the same problem, and I don't have any experience with ggplot2, so I wrote a function based on plot that takes the data frame and the variables to be plotted as arguments and generates the graphs.

    dfplot <- function(data.frame, xvar, yvars=NULL)
    {
        df <- data.frame
        if (is.null(yvars)) {
            # default: every column except xvar
            yvars <- setdiff(names(df), xvar)
        }

        if (length(yvars) > 25) {
            warning("number of variables to be plotted exceeds 25, only first 25 will be plotted")
            yvars <- yvars[1:25]
        }

        # choose a grid layout to display the charts
        ncharts <- length(yvars)
        if (ncharts == 0) return(invisible(NULL))  # nothing to plot
        nrows <- ceiling(sqrt(ncharts))
        ncols <- ceiling(ncharts/nrows)
        old_par <- par(mfrow = c(nrows,ncols))
        on.exit(par(old_par))  # restore the previous layout when done

        # seq_len() avoids the 1:0 trap of 1:ncharts
        for (i in seq_len(ncharts)) {
            plot(df[[xvar]], df[[yvars[i]]], main=yvars[i], xlab = xvar, ylab = "")
        }
    }
    

    Notes:

    1. You can pass the variables to be plotted as yvars; otherwise the function plots every other variable in the data frame (or the first 25, whichever is fewer) against xvar.
    2. Margins went out of bounds when the number of plots exceeded 25, so I capped the function at 25 charts. Any suggestions for handling this more gracefully are welcome.
    3. The y-axis labels are removed because the plot titles already name the variable. The x-axis label is set to xvar.
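
    The layout arithmetic used in dfplot can be checked on its own. The helper below is a hypothetical name introduced only for this sketch; it repeats the two ceiling() lines from the function:

    ```r
    # Hypothetical helper: same row/column rule as in dfplot
    grid_dims <- function(ncharts) {
        nrows <- ceiling(sqrt(ncharts))
        ncols <- ceiling(ncharts / nrows)
        c(nrows = nrows, ncols = ncols)
    }

    grid_dims(1)    # 1 x 1
    grid_dims(5)    # 3 x 2, one cell left empty
    grid_dims(10)   # 4 x 3, two cells left empty
    grid_dims(25)   # 5 x 5, the cap used in dfplot
    ```

    The grid always has at least ncharts cells, so every chart gets a panel.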