Check for duplicates, sum them, and delete one row after summing

生来不讨喜 2021-02-10 03:44

I have a data frame that contains some duplicate rows. Where rows are duplicated, I want to sum the values of two columns and then delete the redundant row.

Here is an example of t

1 answer
  •  心在旅途
    2021-02-10 04:04

    Solution using data.table:

    library(data.table)
    # Example data: two rows that are identical except for N and n
    df <- structure(list(year = c(2015, 2015), ID = c(200, 200), Lats = c(30.5417, 
                30.5417), Longs = c(-20.5254, -20.5254), N = c(150, 90), n = c(30, 
                50), c_id = c(4142, 4142)), .Names = c("year", "ID", "Lats", 
                "Longs", "N", "n", "c_id"), row.names = c(NA, -2L), 
                class = "data.frame")
    dt <- data.table(df)
    # Sum every remaining column (N and n) within each group of identical keys
    dt[, lapply(.SD, sum), by = .(c_id, year, ID, Lats, Longs)]
    
       c_id year  ID    Lats    Longs   N  n
    1: 4142 2015 200 30.5417 -20.5254 240 80
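
If the table has other columns that should not be summed, the columns to sum can be named explicitly with `.SDcols`. A minimal sketch on the same example; the `note` column is hypothetical and is added only to show that it stays out of the result:

```r
library(data.table)

# Same example rows, plus a hypothetical non-numeric column `note`
dt <- data.table(
  year = c(2015, 2015), ID = c(200, 200),
  Lats = c(30.5417, 30.5417), Longs = c(-20.5254, -20.5254),
  N = c(150, 90), n = c(30, 50), c_id = c(4142, 4142),
  note = c("a", "b")
)

# Sum only N and n within each group; `note` is neither grouped nor summed
res <- dt[, lapply(.SD, sum),
          by = .(c_id, year, ID, Lats, Longs),
          .SDcols = c("N", "n")]
print(res)
```

Without `.SDcols`, `lapply(.SD, sum)` would try to sum `note` as well and fail on the character column.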
    

    Solution using plyr:

    library(plyr)
    # Split by the key columns, then return the sums of N and n for each group
    ddply(df, .(c_id, year, ID, Lats, Longs), function(x) c(N=sum(x$N), n=sum(x$n)))
    
      c_id year  ID    Lats    Longs   N  n
    1 4142 2015 200 30.5417 -20.5254 240 80
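
The same grouping can also be sketched in base R with `aggregate()`, for the case where no extra package is available. This is an illustration on the same example data; note that the formula interface of `aggregate()` drops rows containing `NA` by default.

```r
# Same two-row example as above
df <- data.frame(
  year = c(2015, 2015), ID = c(200, 200),
  Lats = c(30.5417, 30.5417), Longs = c(-20.5254, -20.5254),
  N = c(150, 90), n = c(30, 50), c_id = c(4142, 4142)
)

# Sum N and n within each combination of the key columns
res <- aggregate(cbind(N, n) ~ c_id + year + ID + Lats + Longs,
                 data = df, FUN = sum)
print(res)
```

The result has one row per unique key combination, so the duplicate row disappears as a side effect of the aggregation.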
    
