Merging more than 2 dataframes in R by rownames

醉梦人生 2020-12-04 19:28

I gather data from 4 data frames and would like to merge them by row names. I am looking for an efficient way to do this. This is a simplified version of the data I have.

4 Answers
  • 2020-12-04 20:01

    Three lines of code will give you the exact same result:

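    # Bind the columns side by side (rows are paired by position)
    dat2 <- cbind(df1, df2, df3, df4)
    # Rename every column after the first 7 to V1.x ... V100.x, V1.y ... V100.y
    colnames(dat2)[-(1:7)] <- paste(paste0('V', rep(1:100, 2)),
                                    rep(c('x', 'y'), each = 100), sep = '.')
    # Compare with dat, the merged result from the question
    all.equal(dat, dat2)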
    

    Ah, I see. Now I understand why this is causing you so much pain. A plain old for loop does the trick, though there may be cleverer solutions:

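    l <- list(df1, df2, df3, df4)
    dat <- l[[1]]
    for (i in seq_along(l)[-1]) {
      # Inner join on row names; merge() returns them in a "Row.names" column
      dat <- merge(dat, l[[i]], by = "row.names", all = FALSE)
      # Restore row names from that column, since merge() sorts the rows by key
      rownames(dat) <- dat$Row.names
      dat$Row.names <- NULL
    }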
    
  • 2020-12-04 20:06

    Building on your function, I came up with one that merges several data frames by a specific key column (given by its name). The resulting data frame includes all variables of the merged data frames. If you want to keep only the rows whose key appears in every data frame (so no rows are padded with NA), use all.x = FALSE, all.y = FALSE.

    MyMerge <- function(x, y){
      df <- merge(x, y, by= "name of the common column", all.x= TRUE, all.y= TRUE)
      return(df)
    }
    new.df <- Reduce(MyMerge, list(df1, df2, df3, df4))
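
    Since the question asks about row names rather than a key column, the same Reduce pattern can be sketched with by = "row.names". This is only a minimal illustration: the toy data frames and the helper name merge_by_rownames are made up for the example and are not part of the original answer.

    ```r
    # Hypothetical toy data: same row names, stored in different orders
    df1 <- data.frame(v1 = 1:3, row.names = c("r1", "r2", "r3"))
    df2 <- data.frame(v2 = 4:6, row.names = c("r3", "r1", "r2"))
    df3 <- data.frame(v3 = 7:9, row.names = c("r2", "r3", "r1"))

    # Hypothetical helper: merge two data frames on their row names and
    # move the "Row.names" column created by merge() back into the row names
    merge_by_rownames <- function(x, y) {
      m <- merge(x, y, by = "row.names", all = TRUE)
      rownames(m) <- m$Row.names
      m$Row.names <- NULL
      m
    }

    # Fold the helper over the list, two data frames at a time
    merged <- Reduce(merge_by_rownames, list(df1, df2, df3))
    print(merged)
    ```

    Restoring the row names inside the helper matters here, because each step of Reduce merges the previous result with the next data frame by row names again.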
    
  • 2020-12-04 20:07

    I was looking for the same thing. After trying a couple of the options here and elsewhere, the easiest for me was:

    cbind.data.frame( df1,df2,df3,df4....)
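
    One caveat worth checking: cbind pairs rows by position, not by row name, so it is only equivalent to a merge when every data frame lists its rows in the same order. A rough sketch of how to align by row name first, using made-up toy data (df1 and df2 here are hypothetical, not the data from the question):

    ```r
    # Hypothetical toy data: same row names, different row order
    df1 <- data.frame(v1 = 1:3, row.names = c("r1", "r2", "r3"))
    df2 <- data.frame(v2 = 4:6, row.names = c("r3", "r1", "r2"))

    # Plain cbind() pairs rows by position, so r1 ends up next to df2's r3 value
    naive <- cbind(df1, df2)

    # Make sure both frames cover the same row names, then reorder df2
    # by df1's row names before binding
    stopifnot(setequal(rownames(df1), rownames(df2)))
    aligned <- cbind(df1, df2[rownames(df1), , drop = FALSE])
    print(aligned)
    ```

    If the row names do not all match, a merge-based approach is the safer choice.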
    
  • 2020-12-04 20:13

    join_all from plyr will probably do what you want, but all inputs must be data frames and the row names have to be added as a column:

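    library(plyr)
    
    # Convert to data frames in case df3 and df4 are not data frames already
    df3 <- data.frame(df3)
    df4 <- data.frame(df4)
    
    # Copy the row names into a regular column to use as the join key
    df1$rn <- rownames(df1)
    df2$rn <- rownames(df2)
    df3$rn <- rownames(df3)
    df4$rn <- rownames(df4)
    
    # Full join keeps every row name that appears in any of the data frames
    df <- join_all(list(df1, df2, df3, df4), by = 'rn', type = 'full')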
    

    The type argument should help even if the row names differ between the data frames and do not all match. If you do not want to keep the row-name column:

    df$rn <- NULL
    