suffixes in merge works only on common column names. Is there any way to extend this to the rest of the columns as well without manually updating co
Try the following:
colnames(
mergeWithSuffix(df1,df2, by = 'a', suffixes = c("1","2"))
)
[1] "a" "b.1" "d.1" "d.2"
Notice that the original data.frames are unharmed.
colnames(df1)
[1] "a" "b" "d"
colnames(df2)
[1] "a" "d"
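The definitions of df1 and df2 are not shown here. To reproduce the output above, any pair of data.frames with the same column names should do; the values below are made up purely for illustration:

```r
# Hypothetical sample data; only the column names come from the example above
df1 <- data.frame(a = 1:3, b = c("x", "y", "z"), d = c(10, 20, 30))
df2 <- data.frame(a = c(1L, 3L), d = c(100, 300))

colnames(df1)  # "a" "b" "d"
colnames(df2)  # "a" "d"
```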
The functions are as follows:
library(data.table)  # provides setnames(), which renames columns by reference
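mergeWithSuffix <- function(x, y, by, suffixes = NULL, ...) {
  # One suffix per input is required
  stopifnot(length(suffixes) == 2L)
  # Add the suffixes to every non-merge column (by reference) and register
  # the undo straight away, so x and y are restored even if merge() fails
  mkSuffix(x, suffixes[[1]], merge.col = by)
  on.exit(undoSuffix(x, suffixes[[1]], merge.col = by), add = TRUE)
  mkSuffix(y, suffixes[[2]], merge.col = by)
  on.exit(undoSuffix(y, suffixes[[2]], merge.col = by), add = TRUE)
  # Merge; the non-merge names no longer clash, so merge() adds no suffixes itself
  merge(x, y, by = by, suffixes = NULL, ...)
}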
mkSuffix <- function(x, sfx, sep = ".", merge.col = NULL) {
  # Rename every column except the merge column(s), in place
  nms <- setdiff(names(x), merge.col)
  setnames(x, nms, paste(nms, sfx, sep = sep))
}
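undoSuffix <- function(x, sfx, sep = ".", merge.col = NULL) {
  nms <- setdiff(names(x), merge.col)
  # \Q...\E quotes sep and sfx, so "." is matched literally rather than as a regex wildcard
  setnames(x, nms, sub(paste0("\\Q", sep, sfx, "\\E$"), "", nms, perl = TRUE))
}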
Notice that setnames works by reference (the columns are renamed in place, without copying the data), so the overhead is almost negligible. Also, as discussed elsewhere, this works equally well on data.frames and data.tables.
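If you would rather not depend on data.table, or not touch the inputs even temporarily, a copy-based variant in base R is one possible alternative. This is only a sketch: suffixAll and mergeCopyWithSuffix are made-up names, the data is invented, and because it renames copies of the inputs it gives up the by-reference saving mentioned above.

```r
# Hypothetical helper: returns a renamed copy, leaving the merge column(s) alone
suffixAll <- function(df, sfx, by, sep = ".") {
  rename <- !(names(df) %in% by)
  names(df)[rename] <- paste(names(df)[rename], sfx, sep = sep)
  df
}

# Hypothetical wrapper: suffix both inputs, then merge on the shared key
mergeCopyWithSuffix <- function(x, y, by, suffixes, ...) {
  merge(suffixAll(x, suffixes[[1]], by),
        suffixAll(y, suffixes[[2]], by),
        by = by, ...)
}

# Made-up data with the same column layout as the example
left  <- data.frame(a = 1:3, b = c("x", "y", "z"), d = c(10, 20, 30))
right <- data.frame(a = c(1L, 3L), d = c(100, 300))

out <- mergeCopyWithSuffix(left, right, by = "a", suffixes = c("1", "2"))
colnames(out)   # "a" "b.1" "d.1" "d.2"
colnames(left)  # still "a" "b" "d"
```

Since the inputs are never modified, there is nothing to undo afterwards; the price is one extra copy of each data.frame.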