Merge Multiple Data Frames by Row Names

Submitted by 只谈情不闲聊 on 2019-12-18 12:08:43

Question


I'm trying to merge multiple data frames by row names.

I know how to do it with two:

x = data.frame(a = c(1,2,3), row.names = letters[1:3])
y = data.frame(b = c(1,2,3), row.names = letters[1:3])
merge(x,y, by = "row.names")

But when I try using the reshape package's merge_all() I'm getting an error.

z = data.frame(c = c(1,2,3), row.names = letters[1:3])
l = list(x,y,z)
merge_all(l, by = "row.names")

Error in -ncol(df) : invalid argument to unary operator

What's the best way to do this?


Answer 1:


Merging by row.names has an awkward side effect: the result gets a new column called Row.names and its own row names are reset to integers, which makes subsequent merges hard.
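
A small sketch of that side effect, reusing the x and y frames from the question (the variable name m is only for illustration): after one merge the keys sit in a Row.names column and the result's own row names are automatic integers.

```r
x <- data.frame(a = c(1, 2, 3), row.names = letters[1:3])
y <- data.frame(b = c(1, 2, 3), row.names = letters[1:3])

m <- merge(x, y, by = "row.names")

names(m)     # the keys are now an ordinary column named "Row.names"
rownames(m)  # the letters a, b, c no longer appear here
```

A second merge(..., by = "row.names") on m would therefore match on m's automatic row names rather than on the letters.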

To avoid that issue, create an ordinary column that holds the row names instead (generally a better idea anyway, since row names are very limited and hard to manipulate). One way of doing that with the data from the question is shown here; it is not the most efficient approach, and for faster and easier handling of rectangular data I recommend getting to know data.table:

Reduce(merge, lapply(l, function(x) data.frame(x, rn = row.names(x))))
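
The one-liner above can be spelled out as a self-contained sketch. The data are the frames from the question; the names with_rn and merged, the explicit by = "rn", and the final step that moves the keys back into row names are additions for illustration, not part of the original answer.

```r
x <- data.frame(a = c(1, 2, 3), row.names = letters[1:3])
y <- data.frame(b = c(1, 2, 3), row.names = letters[1:3])
z <- data.frame(c = c(1, 2, 3), row.names = letters[1:3])
l <- list(x, y, z)

# Copy each frame's row names into a regular key column called rn.
with_rn <- lapply(l, function(df) {
  data.frame(rn = row.names(df), df, stringsAsFactors = FALSE)
})

# Fold merge() over the list, always joining on the rn column.
merged <- Reduce(function(u, v) merge(u, v, by = "rn"), with_rn)

# Optional: restore the keys as row names and drop the helper column.
rownames(merged) <- merged$rn
merged$rn <- NULL
merged
```

Naming the key column explicitly in by keeps the join from silently picking up any other column name the frames happen to share.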



Answer 2:


There may be a faster version using do.call or *apply, but this works in your case:

x = data.frame(X = c(1,2,3), row.names = letters[1:3])
y = data.frame(Y = c(1,2,3), row.names = letters[1:3])
z = data.frame(Z = c(1,2,3), row.names = letters[1:3])

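merge.all <- function(x, ..., by = "row.names") {
  # every argument passed through ... is treated as a data frame to merge
  L <- list(...)
  for (i in seq_along(L)) {
    x <- merge(x, L[[i]], by = by)
    # merge() moves the keys into a Row.names column; put them back as
    # row names so the next iteration can merge by row names again
    rownames(x) <- x$Row.names
    x$Row.names <- NULL
  }
  return(x)
}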

merge.all(x,y,z)

Note that every argument you want to forward to merge (such as by) has to be defined explicitly as a named parameter of merge.all, because everything passed through ... is treated as a data frame to merge.
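
As a sketch of that point, here is a variant that takes the frames as a list and exposes merge's all argument as an explicit parameter. The function name merge_by_rownames, the frame w, and the use of Reduce in place of the for loop are assumptions made for illustration, not part of the answer above.

```r
# Hypothetical helper: merge a list of data frames by row names.
# `all` is declared explicitly so it can be forwarded to merge().
merge_by_rownames <- function(dfs, all = FALSE) {
  Reduce(function(u, v) {
    m <- merge(u, v, by = "row.names", all = all)
    rownames(m) <- m$Row.names  # restore the keys as row names
    m$Row.names <- NULL         # drop the helper column merge() added
    m
  }, dfs)
}

x <- data.frame(X = c(1, 2, 3), row.names = letters[1:3])
w <- data.frame(W = c(10, 20, 30), row.names = letters[2:4])

inner <- merge_by_rownames(list(x, w))              # only shared row names
outer <- merge_by_rownames(list(x, w), all = TRUE)  # all row names, NA where missing
```

With all = TRUE, rows that exist in only one frame are kept and the other frame's columns are filled with NA.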




Answer 3:


As an alternative to Reduce and merge:

If you put all the data frames into a list, you can then use grep and cbind to get the data frames with the desired row names.

## set up the data
> x <- data.frame(x1 = c(2,4,6), row.names = letters[1:3])
> y <- data.frame(x2 = c(3,6,9), row.names = letters[1:3])
> z <- data.frame(x3 = c(1,2,3), row.names = letters[1:3])
> a <- data.frame(x4 = c(4,6,8), row.names = letters[4:6])
> lst <- list(a, x, y, z)

## combine all the data frames with row names = letters[1:3]
> gg <- grep(paste(letters[1:3], collapse = ""), 
             sapply(lapply(lst, rownames), paste, collapse = ""))
> do.call(cbind, lst[gg])
##   x1 x2 x3
## a  2  3  1
## b  4  6  2
## c  6  9  3
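
One caveat with the grep approach is that it matches substrings, so a frame whose row names were a, b, c, d would also match the pattern "abc". A minimal sketch of a stricter selection, using identical() in place of grep (a swapped-in technique; the names target, keep, and combined are hypothetical):

```r
x <- data.frame(x1 = c(2, 4, 6), row.names = letters[1:3])
y <- data.frame(x2 = c(3, 6, 9), row.names = letters[1:3])
z <- data.frame(x3 = c(1, 2, 3), row.names = letters[1:3])
a <- data.frame(x4 = c(4, 6, 8), row.names = letters[4:6])
lst <- list(a, x, y, z)

# Keep only frames whose row names are exactly the target, in the same order.
target <- letters[1:3]
keep <- vapply(lst, function(df) identical(rownames(df), target), logical(1))

combined <- do.call(cbind, lst[keep])
combined
```

Like the cbind call above, this relies on the selected frames having their rows in the same order; it does not reorder or match rows the way merge does.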


Source: https://stackoverflow.com/questions/22617593/merge-multiple-data-frames-by-row-names
