Accessing grouped data in dplyr

时光总嘲笑我的痴心妄想 提交于 2020-01-10 19:06:08

问题


How can I access the grouped data after applying group_by function from dplyr and using %.% operator

For example, If I want to have the first row of each grouped data then I can do this using plyr package as

ddply(iris,.(Species),function(df){
  df[1,]
})

#output
#  Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#1          5.1         3.5          1.4         0.2     setosa
#2          7.0         3.2          4.7         1.4 versicolor
#3          6.3         3.3          6.0         2.5  virginica  

回答1:


For your specific case, you can use row_number():

library(dplyr)

iris %.% 
  group_by(Species) %.%
  filter(row_number(Species) == 1)
## Source: local data frame [3 x 5]
## Groups: Species
## 
##   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 1          5.1         3.5          1.4         0.2     setosa
## 2          7.0         3.2          4.7         1.4 versicolor
## 3          6.3         3.3          6.0         2.5  virginica

This will be a little more natural in version 0.2 since you can omit the variable name:

# devtools::install_github("hadley/dplyr")

iris %.% 
  group_by(Species) %.%
  filter(row_number() == 1)
## Source: local data frame [3 x 5]
## Groups: Species
## 
##   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
## 1          5.1         3.5          1.4         0.2     setosa
## 2          7.0         3.2          4.7         1.4 versicolor
## 3          6.3         3.3          6.0         2.5  virginica

For arbitrary operations, do() is much more useful in 0.2. You give it arbitrary expressions, using . as a placeholder for each group:

iris %.% 
  group_by(Species) %.%
  do(.[1, ])
## Source: local data frame [3 x 6]
## Groups: Species
## 
##      Species Sepal.Length Sepal.Width Petal.Length Petal.Width  Species.1
## 1     setosa          5.1         3.5          1.4         0.2     setosa
## 2 versicolor          7.0         3.2          4.7         1.4 versicolor
## 3  virginica          6.3         3.3          6.0         2.5  virginica



回答2:


The only way I found that may help is using the do function.

library(dplyr)

g.iris <- group_by(x=iris, Species)

do(g.iris, function(x){ head(x, n=1)})


来源:https://stackoverflow.com/questions/22709206/accessing-grouped-data-in-dplyr

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!