I'm having some trouble aggregating a data frame while keeping the groups in their original order (order based on first appearance in the data frame). I've managed to get it right
Looking for solutions to the same problem, I found another one that uses aggregate(): first convert the selection variables to factors whose levels are in the order you want.
# Columns to sum: everything except the two selection variables
all.add <- names(orig.df)[!(names(orig.df) %in% c("sel.1", "sel.2"))]
# Selection variables as factors with levels in the order you want
orig.df$sel.1 <- factor(orig.df$sel.1, levels = unique(orig.df$sel.1))
orig.df$sel.2 <- factor(orig.df$sel.2, levels = unique(orig.df$sel.2))
# This is ordered first by sel.1, then by sel.2
aggr.df.ordered <- aggregate(orig.df[, all.add],
                             by = list(sel.1 = orig.df$sel.1, sel.2 = orig.df$sel.2),
                             FUN = sum)
The output is:
   sel.1 sel.2 add.1 add.2
1      1     1   100    91
2      1     4   170   183
3      1     5   384   366
4      2     2   175   176
5      2     3    90    96
6      2     4    82    87
7      2     5    95    89
8      3     2   189   178
9      3     3    81    82
10     4     1   174   192
11     5     3    91    98
12     5     4    96    84
13     5     5    83    88
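To see the effect of the factor levels in isolation, here is a minimal sketch on made-up data. The data frame `toy` and its columns `grp` and `val` are hypothetical, not taken from the question; the sketch only assumes base R's aggregate() and factor().

```r
# Toy data (hypothetical, not the data from the question)
toy <- data.frame(
  grp = c("b", "a", "b", "c", "a"),
  val = c(1, 2, 3, 4, 5)
)

# Default behaviour: the grouping values come back sorted
sorted <- aggregate(toy["val"], by = list(grp = toy$grp), FUN = sum)

# Factor whose levels follow the order of first appearance
toy$grp <- factor(toy$grp, levels = unique(toy$grp))
by.appearance <- aggregate(toy["val"], by = list(grp = toy$grp), FUN = sum)

print(sorted)
print(by.appearance)
```

Without the factor() call the groups come back sorted (a, b, c); with it they follow the order in which they first appear in the data (b, a, c).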
To have it ordered by the first appearance of each combination of both variables, you need a new variable:
# ordered by first appearance of the two variables (needs a new variable)
orig.df$newvar <- paste(orig.df$sel.1, orig.df$sel.2)
orig.df$newvar <- factor(orig.df$newvar, levels = unique(orig.df$newvar))
aggr.df.ordered2 <- aggregate(orig.df[, all.add],
                              by = list(newvar = orig.df$newvar,
                                        sel.1 = orig.df$sel.1,
                                        sel.2 = orig.df$sel.2),
                              FUN = sum)
which gives the output:
   newvar sel.2 sel.1 add.1 add.2
1     5 4     4     5    96    84
2     5 5     5     5    83    88
3     5 3     3     5    91    98
4     2 4     4     2    82    87
5     2 2     2     2   175   176
6     2 5     5     2    95    89
7     2 3     3     2    90    96
8     1 4     4     1   170   183
9     1 5     5     1   384   366
10    1 1     1     1   100    91
11    4 1     1     4   174   192
12    3 2     2     3   189   178
13    3 3     3     3    81    82
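With several grouping variables, as far as I can tell aggregate() sorts its result with the first element of `by` varying fastest and the last one slowest. Under that assumption, a combined key placed last in `by` is what fixes the row order to first appearance. A minimal sketch on made-up data (`toy`, `s1`, `s2`, `val` and `key` are hypothetical names, not from the question):

```r
# Toy data (hypothetical): combinations first appear in the order
# (2,1), (1,2), (1,1), (2,2)
toy <- data.frame(
  s1  = c(2, 1, 2, 1, 2),
  s2  = c(1, 2, 1, 1, 2),
  val = c(10, 20, 30, 40, 50)
)

# Combined key as a factor, levels in order of first appearance
key <- paste(toy$s1, toy$s2)
key <- factor(key, levels = unique(key))

# Assumption: the last grouping variable is the slowest-varying sort key,
# so the combined key goes last
res <- aggregate(toy["val"],
                 by = list(s1 = toy$s1, s2 = toy$s2, key = key),
                 FUN = sum)

# Drop the helper column once the order is fixed
res$key <- NULL
print(res)
```

Because every value of `key` corresponds to exactly one (s1, s2) pair, the extra grouping variable does not change the sums, only the order of the rows.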
With this solution you do not need to install any new package.