Extending Suffixes in Merge to All Non-by Columns

后悔当初 2021-01-04 12:06

The suffixes argument of merge works only on common column names. Is there any way to extend this to the rest of the columns as well without manually updating co
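
To make the behaviour concrete, here is a minimal base R sketch. The data frames df1 and df2 and their values are made up for illustration; they only mirror the column names used in the answer below (a, b, d and a, d).

```r
# Hypothetical example data: 'a' is the key, 'd' appears in both tables
df1 <- data.frame(a = 1:3, b = 4:6, d = 7:9)
df2 <- data.frame(a = 1:3, d = 10:12)

merged <- merge(df1, df2, by = "a", suffixes = c(".1", ".2"))

# 'b' exists only in df1, so merge leaves it without a suffix
print(names(merged))
# [1] "a"   "b"   "d.1" "d.2"
```

Only the clashing column d is disambiguated; b keeps its original name.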

3 Answers
  •  一生所求
    2021-01-04 12:32

    Try the following:

    colnames(
      mergeWithSuffix(df1,df2, by = 'a', suffixes = c("1","2"))
    )
    [1] "a"   "b.1" "d.1" "d.2"
    

    Notice that the original data.frames are unharmed.

    colnames(df1)
    [1] "a" "b" "d"
    
    colnames(df2)
    [1] "a" "d"
    

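    The functions are as follows:

    library(data.table)
    
    # Merge x and y on `by`, suffixing every non-key column of x and y
    mergeWithSuffix <- function(x, y, by, suffixes=NULL, ...) {
    
      # Add Suffixes (setnames renames in place, so x and y change temporarily)
      mkSuffix(x, suffixes[[1]], merge.col=by)
      mkSuffix(y, suffixes[[2]], merge.col=by)
    
      # Merge; with distinct suffixes already applied, merge's defaults suffice
      ret <- merge(x, y, by = by, ...)
    
      # Remove Suffixes
      undoSuffix(x, suffixes[[1]], merge.col=by)
      undoSuffix(y, suffixes[[2]], merge.col=by)
      return(ret)
    }
    
    # Append sep + sfx to every column name except the merge columns
    mkSuffix <- function(x, sfx, sep=".", merge.col=NULL)  {
      nms <- setdiff(names(x), merge.col)
      setnames(x, nms, paste(nms, sfx, sep=sep))
    }
    
    # Strip a trailing sep + sfx from every column name except the merge columns
    undoSuffix <- function(x, sfx, sep=".", merge.col=NULL) {
      nms <- setdiff(names(x), merge.col)
      # Match the suffix literally; in a regex, "." would match any character
      suffix <- paste0(sep, sfx)
      new <- ifelse(endsWith(nms, suffix),
                    substr(nms, 1, nchar(nms) - nchar(suffix)), nms)
      setnames(x, nms, new)
    }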
    

    Notice that setnames works by reference, so the overhead of renaming the columns and restoring them is almost negligible. Also, as discussed elsewhere, this works equally well on data.frames and data.tables.
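
As a rough illustration of the by-reference point, the sketch below renames a column inside a function and then inspects the caller's objects. The helper addSuffix and the sample data are hypothetical and not part of the answer above; only setnames comes from data.table.

```r
library(data.table)

df <- data.frame(a = 1:2, b = 3:4)   # plain data.frame
dt <- data.table(a = 1:2, b = 3:4)   # data.table

# Hypothetical helper: renames a column and deliberately returns nothing useful
addSuffix <- function(x) {
  setnames(x, "b", "b.1")
  invisible(NULL)
}

addSuffix(df)
addSuffix(dt)

# Neither object was reassigned, yet both should now show the new column name
print(names(df))
print(names(dt))
```

This is also why mergeWithSuffix has to undo the renaming before it returns: without that step the caller's inputs would keep the suffixed names.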
