Question
I am trying to move away from rowwise() for list columns, as I have heard that the tidyverse team is in the process of axing it. However, I am not used to the purrr functions, so I feel there must be a better way of doing the following.
I create a list-column containing a tibble for each species. I then want to go into each tibble and take the mean of certain variables. The first example below uses map; the second is the rowwise solution, which I personally feel is cleaner.
Does anyone know a better way to use map in this situation?
library(tidyverse)

iris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(mean_slength = map_dbl(data, ~mean(.$Sepal.Length, na.rm = TRUE)),
         mean_swidth = map_dbl(data, ~mean(.$Sepal.Width, na.rm = TRUE))
  )
#> # A tibble: 3 x 4
#>   Species    data              mean_slength mean_swidth
#>   <fct>      <list>                   <dbl>       <dbl>
#> 1 setosa     <tibble [50 x 4]>         5.01        3.43
#> 2 versicolor <tibble [50 x 4]>         5.94        2.77
#> 3 virginica  <tibble [50 x 4]>         6.59        2.97

iris %>%
  group_by(Species) %>%
  nest() %>%
  rowwise() %>%
  mutate(mean_slength = mean(data$Sepal.Length, na.rm = TRUE),
         mean_swidth = mean(data$Sepal.Width, na.rm = TRUE))
#> Source: local data frame [3 x 4]
#> Groups: <by row>
#>
#> # A tibble: 3 x 4
#>   Species    data              mean_slength mean_swidth
#>   <fct>      <list>                   <dbl>       <dbl>
#> 1 setosa     <tibble [50 x 4]>         5.01        3.43
#> 2 versicolor <tibble [50 x 4]>         5.94        2.77
#> 3 virginica  <tibble [50 x 4]>         6.59        2.97
Created on 2018-12-26 by the reprex package (v0.2.1)
Answer 1:
Instead of calling map twice, use a single map call with summarise_at:
library(tidyverse)

iris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(out = map(data, ~
    .x %>%
      # funs() was deprecated in dplyr 0.8.0; a named list of lambdas replaces it
      summarise_at(vars(matches('Sepal')),
                   list(mean_s = ~ mean(., na.rm = TRUE))))) %>%
  unnest(out)
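
The single-map approach above can probably also be written with across(), which superseded the summarise_at family in dplyr 1.0.0. This is only a minimal sketch, assuming dplyr >= 1.0.0 and tidyr >= 1.0.0 are installed; the helper name summarise_sepal and the variable result are made up for illustration and are not part of any package.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical helper: takes one nested tibble and returns a one-row tibble
# holding the mean of every column whose name matches "Sepal".
summarise_sepal <- function(df) {
  summarise(
    df,
    across(matches("Sepal"),
           ~ mean(.x, na.rm = TRUE),
           .names = "{.col}_mean_s")
  )
}

result <- iris %>%
  group_by(Species) %>%
  nest() %>%
  # One map() call produces all summary columns for each species
  mutate(out = map(data, summarise_sepal)) %>%
  # Spread the one-row summaries into regular columns
  unnest(out)

result
```

Moving the summary into a named function keeps the mutate() call short and avoids nesting one lambda inside another, which is mostly a readability choice.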
Source: https://stackoverflow.com/questions/53938745/r-cleaner-way-to-use-map-with-list-columns