R group by aggregate

后端未结

关注

 3  1091

南笙 2021-01-22 10:37

In R (which I am relatively new to) I have a data frame consists of many column and a numeric column I need to aggregate according to groups determined by another column.

3条回答

长情又很酷 (楼主)

2021-01-22 11:15
Here's my solution using aggregate.

First, load the data:
```
df <- read.table(text = 
"SessionID   Price
'1'       '624.99'
'1'       '697.99'
'1'       '649.00'
'7'       '779.00'
'7'       '710.00'
'7'       '2679.50'", header = TRUE) 
```
Then aggregate and match it back to the original data.frame:
```
tmp <- aggregate(Price ~ SessionID, df, function(x) c(Min = min(x), Max = max(x)))
df <- cbind(df, tmp[match(df$SessionID, tmp$SessionID), 2])
print(df)
#  SessionID   Price    Min     Max
#1         1  624.99 624.99  697.99
#2         1  697.99 624.99  697.99
#3         1  649.00 624.99  697.99
#4         7  779.00 710.00 2679.50
#5         7  710.00 710.00 2679.50
#6         7 2679.50 710.00 2679.50
```
EDIT: As per the comment below, you might wonder why this works. It indeed is somewhat weird. But remember that a data.frame just is a fancy list. Try to call str(tmp), and you'll see that the Price column itself is 2 by 2 numeric matrix. It gets confusing as the print.data.frame knows how to handle this and so print(tmp) looks like there are 3 columns. Anyway, tmp[2] simply access the second column/entry of the data.frame/list and returns that 1 column data.frame while tmp[,2] access the second column and return the data type stored.
0 讨论(0)

查看其它3个回答
发布评论:

提交评论
- 加载中...