Let's say I want to merge two data.frames but some of the columns are redundant (the same). How would I merge those data.frames but drop the redundant columns?
You could include the column same in your by argument. The default is by = intersect(names(x), names(y)). Try merge(X1, X2) (it is the same as merge(X1, X2, by = c("id", "same"))):
merge(X1, X2)
# id same different1 different2
#1 a 1 4 9
#2 b 2 5 7
#3 c 3 6 8
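For a self-contained version of the above, here is a minimal sketch. The contents of X1 and X2 are not given in the question, so the two data.frames below are assumptions reconstructed from the printed result; only the column names id, same, different1 and different2 come from the output.

```r
# Hypothetical inputs, reconstructed from the merged output shown above.
X1 <- data.frame(id = c("a", "b", "c"),
                 same = 1:3,
                 different1 = 4:6)
X2 <- data.frame(id = c("b", "c", "a"),
                 same = c(2L, 3L, 1L),
                 different2 = c(7, 8, 9))

# With no by argument, merge() joins on every column name the two share,
# here "id" and "same", so "same" appears only once in the result.
default_merge <- merge(X1, X2)

# Spelling out the key columns gives the same result.
explicit_merge <- merge(X1, X2, by = c("id", "same"))

print(default_merge)
```

Because same is part of the join key, it is matched rather than duplicated, which is why no same.x / same.y columns show up.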
Just subset via indexing in the merge call. There are many ways to subset, e.g. by name or by position. There is even a subset() function, but the [] notation works well for almost all cases:
merge(X1[,c("id","same","different1")], X2[,c("id","different2")], by="id", all = TRUE, sort = FALSE)
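If you would rather not type the column names by hand, the redundant columns can be worked out from the names. This is only a sketch: X1 and X2 are made-up example data, and key, redundant and X2_trimmed are hypothetical helper names, not anything from the question.

```r
# Hypothetical example data; only the column names mirror the question.
X1 <- data.frame(id = c("a", "b", "c"),
                 same = 1:3,
                 different1 = 4:6)
X2 <- data.frame(id = c("b", "c", "a"),
                 same = c(2L, 3L, 1L),
                 different2 = c(7, 8, 9))

key <- "id"

# Columns present in both data.frames, other than the join key.
redundant <- setdiff(intersect(names(X1), names(X2)), key)

# Drop those columns from X2 before merging; drop = FALSE keeps a
# data.frame even if only one column is left.
X2_trimmed <- X2[, setdiff(names(X2), redundant), drop = FALSE]

merged <- merge(X1, X2_trimmed, by = key, all = TRUE, sort = FALSE)
```

This keeps X1's copy of each shared column and assumes X2's copy carries no extra information.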
As shown in the other answer, you could instead put the column into the by argument, but this becomes a problem once you leave the realm of one-to-one merges and enter one-to-many or many-to-many merges.
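One way this can go wrong, as I read it, is when the shared column does not agree perfectly between the two tables. The sketch below uses made-up data in which X2 has two rows for id "a" and a different value of same for id "c"; both of those details are assumptions for illustration.

```r
# Hypothetical one-to-many data: X2 repeats id "a", and its "same"
# value for id "c" disagrees with X1.
X1 <- data.frame(id = c("a", "b", "c"),
                 same = 1:3,
                 different1 = 4:6)
X2 <- data.frame(id = c("a", "a", "b", "c"),
                 same = c(1L, 1L, 2L, 99L),
                 different2 = c(9, 10, 7, 8))

# Joining on both columns: the "c" row is silently dropped because
# (c, 3) in X1 has no partner (c, 99) in X2.
by_both <- merge(X1, X2)

# Joining on id only, after dropping the redundant column from X2,
# keeps every id.
by_id <- merge(X1, X2[, c("id", "different2")], by = "id")

nrow(by_both)  # 3
nrow(by_id)    # 4
```

With all = TRUE the mismatched row would not be dropped, but it would come back as two half-empty rows instead of one, so subsetting before the merge is the more predictable route here.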