Duplicated rows: select rows based on criteria and store duplicated values

有刺的猬 2021-01-23 19:17

I am working on a raw dataset that looks something like this:

df <- data.frame("ID" = c("Alpha", "Alpha", "Alpha", "Alpha", 
                                


        
2 Answers
  •  梦毁少年i
    2021-01-23 19:50

    Using data.table, a dcast based on rowid(ID, Year) after ordering by Val2 descending gets you there with the exception of column names. The "_1" columns are the "keep" columns, and the "_2" columns are the "del" columns.

    library(data.table)
    setDT(df)
    
    setorder(df, ID, Year, -Val2)
    
    out <- 
      dcast(df, ID + Year ~ rowid(ID, Year), value.var = c('treatment', 'Val', 'Val2'))
    out
    #       ID Year treatment_1 treatment_2 Val_1 Val_2 Val2_1 Val2_2
    # 1: Alpha 1970           B           A     0     0   2.34   0.00
    # 2: Alpha 1980           C        <NA>     0    NA   1.30     NA
    # 3: Alpha 1990           D        <NA>     1    NA   0.00     NA
    # 4:  Beta 1970           E        <NA>     0    NA   0.00     NA
    # 5:  Beta 1980           G           F     0     1   3.20   2.34
    # 6:  Beta 1990           H        <NA>     1    NA   1.30     NA
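
To see what `rowid(ID, Year)` contributes before the reshape, here is a minimal sketch. The question's `data.frame` is truncated in the post, so the data below is a hypothetical stand-in rebuilt from the printed output above, and `rank_in_group`, `keep`, and `del` are illustrative names, not part of the original answer.

```r
library(data.table)

# Hypothetical stand-in data, rebuilt from the printed output above;
# the asker's original data.frame is truncated in the post.
df <- data.table(
  ID        = rep(c("Alpha", "Beta"), each = 4),
  Year      = c(1970, 1970, 1980, 1990, 1970, 1980, 1980, 1990),
  treatment = c("A", "B", "C", "D", "E", "F", "G", "H"),
  Val       = c(0, 0, 0, 1, 0, 1, 0, 1),
  Val2      = c(0, 2.34, 1.3, 0, 0, 2.34, 3.2, 1.3)
)

# Sort so the largest Val2 comes first within each ID/Year group.
setorder(df, ID, Year, -Val2)

# rowid() numbers the rows within each ID/Year combination: 1, 2, ...
df[, rank_in_group := rowid(ID, Year)]

keep <- df[rank_in_group == 1]  # one row per group: the highest Val2
del  <- df[rank_in_group > 1]   # the remaining duplicates
```

Because of the sort, row number 1 in each group is the one to keep, which is why the `_1` columns in the `dcast` result play the "keep" role.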
    

    We can change the names to match yours; the only difference is that the del columns keep a number at the end. That number is useful if there is a possibility of more than 2 rows per group.

    setnames(out, function(x) gsub('(.*)_1', '\\1', x))
    setnames(out, function(x) gsub('(.*_\\d+)', 'del_\\1', x))
    out
    #       ID Year treatment del_treatment_2 Val del_Val_2 Val2 del_Val2_2
    # 1: Alpha 1970         B               A   0         0 2.34       0.00
    # 2: Alpha 1980         C            <NA>   0        NA 1.30         NA
    # 3: Alpha 1990         D            <NA>   1        NA 0.00         NA
    # 4:  Beta 1970         E            <NA>   0        NA 0.00         NA
    # 5:  Beta 1980         G               F   0         1 3.20       2.34
    # 6:  Beta 1990         H            <NA>   1        NA 1.30         NA
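
As a rough illustration of the case with more than 2 rows per group, the sketch below runs the same steps on a small hypothetical table (`dt` and `wide` are made-up names, and the data is invented for the example). The first rename pattern is anchored with `$` here, an addition to the answer's code, so that a suffix such as `_10` would not be partially matched if a group ever had ten or more rows.

```r
library(data.table)

# Hypothetical data: the Alpha/1970 group has three rows.
dt <- data.table(
  ID        = c("Alpha", "Alpha", "Alpha", "Beta"),
  Year      = c(1970, 1970, 1970, 1970),
  treatment = c("A", "B", "C", "D"),
  Val2      = c(0, 2.34, 1.1, 0.5)
)

# Highest Val2 first within each group, then spread rows into columns.
setorder(dt, ID, Year, -Val2)
wide <- dcast(dt, ID + Year ~ rowid(ID, Year),
              value.var = c("treatment", "Val2"))

# Drop the "_1" suffix from the keep columns (anchored at end of name).
setnames(wide, function(x) gsub("(.*)_1$", "\\1", x))
# Prefix every remaining numbered column with "del_".
setnames(wide, function(x) gsub("(.*_\\d+)", "del_\\1", x))

names(wide)
```

With three rows in a group, the extra duplicates should land in `del_treatment_2`, `del_treatment_3`, `del_Val2_2`, and `del_Val2_3`, while groups with a single row get `NA` in those columns.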
    
