Aggregate data frame while keeping original order, in a simple manner

囚心锁ツ 2021-02-15 12:38

I'm having some trouble aggregating a data frame while keeping the groups in their original order (order based on first appearance in data frame). I've managed to get it right

4 answers
  •  广开言路
    2021-02-15 12:52

    Looking for solutions to the same problem, I found another one that uses aggregate(), after first converting the selection variables to factors whose levels are in the order you want.

    # Columns to aggregate: everything except the selection variables
    all.add <- setdiff(names(orig.df), c("sel.1", "sel.2"))
    
    # Selection variables as factors with levels in the order you want
    orig.df$sel.1 <- factor(orig.df$sel.1, levels = unique(orig.df$sel.1))
    orig.df$sel.2 <- factor(orig.df$sel.2, levels = unique(orig.df$sel.2))
    
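    # Rows follow the factor levels; aggregate() makes the first grouping
    # variable vary fastest, so they are sorted by sel.2 and then by sel.1
    aggr.df.ordered <- aggregate(orig.df[all.add],
                                 by = list(sel.1 = orig.df$sel.1, sel.2 = orig.df$sel.2),
                                 FUN = sum)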
    

    The output is:

       newvar add.1 add.2
    1     1 1   100    91
    2     1 4   170   183
    3     1 5   384   366
    4     2 2   175   176
    5     2 3    90    96
    6     2 4    82    87
    7     2 5    95    89
    8     3 2   189   178
    9     3 3    81    82
    10    4 1   174   192
    11    5 3    91    98
    12    5 4    96    84
    13    5 5    83    88
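
    How factor levels drive the row order can be checked on a small made-up data frame. The sketch below is only an illustration: `toy`, `g`, `h` and `v` are hypothetical names, not the data from the question. It shows that aggregate() returns the groups in level order with the first grouping variable varying fastest, which is not necessarily the order in which the combinations first appear.

    ```r
    # Hypothetical toy data, not the data from the question
    toy <- data.frame(g = c("b", "b", "a", "a"),
                      h = c("y", "x", "y", "y"),
                      v = 1:4)

    # Levels in order of first appearance instead of alphabetical order
    toy$g <- factor(toy$g, levels = unique(toy$g))   # levels: b, a
    toy$h <- factor(toy$h, levels = unique(toy$h))   # levels: y, x

    res <- aggregate(toy["v"], by = list(g = toy$g, h = toy$h), FUN = sum)
    print(res)
    # Groups come out as (b, y), (a, y), (b, x): sorted by h, then by g.
    # The combinations first appear in the order (b, y), (b, x), (a, y).
    ```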
    

    To order the result by the first appearance of each combination of the two variables, you need a new variable:

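    # Combined key whose levels follow the first appearance of each combination
    orig.df$newvar <- paste(orig.df$sel.1, orig.df$sel.2)
    orig.df$newvar <- factor(orig.df$newvar, levels = unique(orig.df$newvar))
    
    aggr.df.ordered2 <- aggregate(orig.df[all.add],
                                  by = list(newvar = orig.df$newvar,
                                            sel.2 = orig.df$sel.2,
                                            sel.1 = orig.df$sel.1),
                                  FUN = sum)
    
    # aggregate() sorts by the last grouping variable first, so reorder by newvar
    aggr.df.ordered2 <- aggr.df.ordered2[order(aggr.df.ordered2$newvar), ]
    rownames(aggr.df.ordered2) <- NULL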
    

    which gives the output:

       newvar sel.2 sel.1 add.1 add.2
    1     5 4     4     5    96    84
    2     5 5     5     5    83    88
    3     5 3     3     5    91    98
    4     2 4     4     2    82    87
    5     2 2     2     2   175   176
    6     2 5     5     2    95    89
    7     2 3     3     2    90    96
    8     1 4     4     1   170   183
    9     1 5     5     1   384   366
    10    1 1     1     1   100    91
    11    4 1     1     4   174   192
    12    3 2     2     3   189   178
    13    3 3     3     3    81    82
    

    With this solution you do not need to install any new package.
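
    The combined-key idea can also be written so that only the key is passed to aggregate() and the grouping columns are looked up afterwards. This is a minimal sketch on made-up data, assuming base R only; `toy`, `g`, `h`, `v`, `key` and `first.row` are hypothetical names, not part of the original question.

    ```r
    # Hypothetical toy data, not the data from the question
    toy <- data.frame(g = c("b", "b", "a", "a"),
                      h = c("y", "x", "y", "y"),
                      v = 1:4)

    # One key per row; its levels record the order in which combinations first appear
    key <- paste(toy$g, toy$h)
    key <- factor(key, levels = unique(key))

    # Aggregating by the single key returns the rows in level order
    res <- aggregate(toy["v"], by = list(key = key), FUN = sum)

    # Recover the grouping columns from the first row of each group
    first.row <- match(res$key, key)
    res <- cbind(toy[first.row, c("g", "h")], res["v"])
    rownames(res) <- NULL
    print(res)
    # Groups come out as (b, y), (b, x), (a, y), the order of first appearance
    ```

    Because the key is the only grouping variable, no reordering step is needed after aggregate().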
