I gather data from 4 data frames and would like to merge them by row names. I am looking for an efficient way to do this. This is a simplified version of the data I have.
Three lines of code will give you the exact same result:
dat2 <- cbind(df1, df2, df3, df4)
colnames(dat2)[-(1:7)] <- paste(paste('V', rep(1:100, 2),sep = ''),
rep(c('x', 'y'), each = 100), sep = c('.'))
all.equal(dat,dat2)
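One caveat worth checking before relying on cbind: it pairs rows by position, not by row name, so it only matches a row-name merge when every data frame has its rows in the same order. A minimal sketch with made-up toy data (the names a, b, by_position and by_name are hypothetical, not from the question):

```r
# Hypothetical toy data: same row names, different row order
a <- data.frame(x = 1:3, row.names = c("r1", "r2", "r3"))
b <- data.frame(y = c(30, 10, 20), row.names = c("r3", "r1", "r2"))

# cbind() pairs rows by position, so row r1 of a is paired with y = 30
by_position <- cbind(a, b)

# Reordering b by a's row names first aligns the rows by name
by_name <- cbind(a, b[rownames(a), , drop = FALSE])
```

If the row order is not guaranteed to agree, reordering each data frame by the first one's row names, as in the last line, keeps the cbind approach safe.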
Ah, I see. Now I understand why this is causing you so much pain. A plain old for
loop does the trick, though there may be more clever solutions:
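l <- list(df1, df2, df3, df4)
dat <- l[[1]]
for (i in 2:length(l)) {
  # Inner join on row names; merge() returns the keys in a "Row.names" column
  dat <- merge(dat, l[[i]], by = "row.names", all.x = FALSE, all.y = FALSE)
  # merge() sorts by key, so restore row names from the key column itself
  rownames(dat) <- as.character(dat$Row.names)
  dat$Row.names <- NULL
}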
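The same idea can also be written as a fold with Reduce instead of an explicit loop. This is only a sketch under assumptions: the helper merge_by_rownames and the toy data frames a, b and c3 are hypothetical, and the join is an inner join on row names.

```r
# Hypothetical helper: inner join of two data frames on their row names
merge_by_rownames <- function(x, y) {
  m <- merge(x, y, by = "row.names")            # keys come back in "Row.names"
  rownames(m) <- as.character(m$Row.names)      # restore keys as row names
  m$Row.names <- NULL                           # drop the helper column
  m
}

# Hypothetical toy data with the same row names in different orders
a  <- data.frame(x = 1:3, row.names = c("r1", "r2", "r3"))
b  <- data.frame(y = 4:6, row.names = c("r3", "r1", "r2"))
c3 <- data.frame(z = 7:9, row.names = c("r2", "r3", "r1"))

# Reduce() applies the helper pairwise, left to right
res <- Reduce(merge_by_rownames, list(a, b, c3))
```

Taking the row names from the Row.names column, rather than from a vector saved beforehand, keeps each label attached to its own row even though merge reorders the result.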
Editing your function, I came up with one that merges several data frames by a specific key column (the name of the column they share). The resulting data frame includes all the variables of the merged data frames. If you want to keep only the rows whose key appears in every data frame (so that no NAs are introduced), use all.x = FALSE, all.y = FALSE.
MyMerge <- function(x, y){
df <- merge(x, y, by= "name of the common column", all.x= TRUE, all.y= TRUE)
return(df)
}
new.df <- Reduce(MyMerge, list(df1, df2, df3, df4))
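To make the effect of the all.x / all.y settings concrete, here is a small sketch with made-up data. The key column name "id" and the data frames d1, d2 and d3 are assumptions for illustration only.

```r
# Hypothetical toy data frames sharing a key column named "id"
d1 <- data.frame(id = c("a", "b", "c"), v1 = 1:3)
d2 <- data.frame(id = c("b", "c", "d"), v2 = 4:6)
d3 <- data.frame(id = c("a", "c", "d"), v3 = 7:9)

# Full outer join: every key is kept, missing values become NA
merge_full <- function(x, y) merge(x, y, by = "id", all.x = TRUE, all.y = TRUE)
full <- Reduce(merge_full, list(d1, d2, d3))

# Inner join: only keys present in every data frame survive
merge_inner <- function(x, y) merge(x, y, by = "id", all.x = FALSE, all.y = FALSE)
inner <- Reduce(merge_inner, list(d1, d2, d3))
```

With this data, full has one row per distinct id (four rows, with NAs where a data frame lacks that id), while inner keeps only the id "c", which is the only key shared by all three.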
I was looking for the same function. After trying a couple of the options here and elsewhere, the easiest for me was:
cbind.data.frame( df1,df2,df3,df4....)
join_all from plyr will probably do what you want. However, all inputs must be data frames, and the row names have to be added as a column first:
require(plyr)
df3 <- data.frame(df3)
df4 <- data.frame(df4)
df1$rn <- rownames(df1)
df2$rn <- rownames(df2)
df3$rn <- rownames(df3)
df4$rn <- rownames(df4)
df <- join_all(list(df1,df2,df3,df4), by = 'rn', type = 'full')
The type = 'full' argument keeps every row, so this works even if the row names vary and do not match across the data frames.
If you do not want the rownames:
df$rn <- NULL