Aggregate data frame while keeping original order, in a simple manner


囚心锁ツ 2021-02-15 12:38

I'm having some trouble aggregating a data frame while keeping the groups in their original order (order based on first appearance in data frame). I've managed to get it right

4 answers

广开言路 (original poster)

2021-02-15 12:52

Looking for solutions to the same problem, I found a new one that uses aggregate(), but first converts the selection variables to factors whose levels are in the order you want.

# Names of the columns to sum: everything except the selection variables
all.add <- names(orig.df)[!(names(orig.df)) %in% c("sel.1", "sel.2")]

# Selection variables as factors with levels in the order you want
orig.df$sel.1 <- factor(orig.df$sel.1, levels = unique(orig.df$sel.1))
orig.df$sel.2 <- factor(orig.df$sel.2, levels = unique(orig.df$sel.2))

# Rows follow the factor levels; the first grouping variable (sel.1) varies fastest, the last (sel.2) slowest
aggr.df.ordered <- aggregate(orig.df[,all.add], 
                             by=list(sel.1 = orig.df$sel.1, sel.2 = orig.df$sel.2), sum)

The output is:

   newvar add.1 add.2
1     1 1   100    91
2     1 4   170   183
3     1 5   384   366
4     2 2   175   176
5     2 3    90    96
6     2 4    82    87
7     2 5    95    89
8     3 2   189   178
9     3 3    81    82
10    4 1   174   192
11    5 3    91    98
12    5 4    96    84
13    5 5    83    88
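
To see the effect of the factor levels in isolation, here is a minimal sketch on a made-up data frame. The object names toy, sorted and kept and all the values are assumptions for illustration; only the column names mirror the answer.

```r
# Hypothetical toy data; values are made up for illustration.
toy <- data.frame(
  sel.1 = c("b", "a", "b", "c", "a"),
  add.1 = c(1, 2, 3, 4, 5)
)

# Default behaviour: character groups come out sorted alphabetically.
sorted <- aggregate(toy["add.1"], by = list(sel.1 = toy$sel.1), FUN = sum)

# With levels set by first appearance, the groups come out as b, a, c.
toy$sel.1 <- factor(toy$sel.1, levels = unique(toy$sel.1))
kept <- aggregate(toy["add.1"], by = list(sel.1 = toy$sel.1), FUN = sum)

print(sorted)
print(kept)
```

The only difference between the two calls is the conversion to a factor, so any change in row order comes from the level order.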

To order the result by the first appearance of each combination of both variables, you need a new variable:

# ordered by first appearance of the two variables (needs a new variable)
orig.df$newvar <- paste(orig.df$sel.1, orig.df$sel.2)
orig.df$newvar <- factor(orig.df$newvar, levels = unique(orig.df$newvar))

aggr.df.ordered2 <- aggregate(orig.df[,all.add], 
                              by=list(newvar = orig.df$newvar,
                                      sel.1 = orig.df$sel.1, 
                                      sel.2 = orig.df$sel.2), sum)

which gives the output:

   newvar sel.2 sel.1 add.1 add.2
1     5 4     4     5    96    84
2     5 5     5     5    83    88
3     5 3     3     5    91    98
4     2 4     4     2    82    87
5     2 2     2     2   175   176
6     2 5     5     2    95    89
7     2 3     3     2    90    96
8     1 4     4     1   170   183
9     1 5     5     1   384   366
10    1 1     1     1   100    91
11    4 1     1     4   174   192
12    3 2     2     3   189   178
13    3 3     3     3    81    82
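
A variant of the same idea, sketched on made-up data: group by the combined key alone, then recover the selection columns from the first row of each pair. This is not the exact call used above; the names toy, key, sums, first and result and all the values are assumptions for illustration.

```r
# Hypothetical toy data; values are made up for illustration.
toy <- data.frame(
  sel.1 = c(5, 2, 5, 1, 2),
  sel.2 = c(4, 4, 4, 1, 2),
  add.1 = c(10, 20, 30, 40, 50)
)

# Combined key whose levels follow the first appearance of each pair.
key <- paste(toy$sel.1, toy$sel.2)
toy$key <- factor(key, levels = unique(key))

# Grouping by the key alone ties the row order to the key levels.
sums <- aggregate(toy["add.1"], by = list(key = toy$key), FUN = sum)

# Recover the selection variables from the first row of each pair;
# these rows are already in first-appearance order.
first <- !duplicated(toy$key)
result <- cbind(toy[first, c("sel.1", "sel.2")], add.1 = sums$add.1)
rownames(result) <- NULL

print(result)
```

Because only one grouping variable is passed to aggregate(), the level order of the other factors cannot affect the row order.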

With this solution you do not need to install any new package.
