Question
This is probably a simple question, but I'm having trouble getting the mean function to work using dplyr.
Using the mtcars dataset as an example, if I type:
data(mtcars)
mtcars %>%
  select(mpg) %>%
  mean()
I get the warning message "In mean.default(.) : argument is not numeric or logical: returning NA".
For some reason though if I repeat the same code but just ask for a "summary", or "range" or several other statistical calculations, they work fine:
data(mtcars)
mtcars %>%
  select(mpg) %>%
  summary()
Similarly, if I run the mean function in base R notation, that works fine too:
mean(mtcars$mpg)
Can anyone point out what I've done wrong?
Answer 1:
In dplyr, you can use summarise() whenever you're not changing your original dataframe (reordering it, filtering it, adding to it, etc.), but are instead creating a new dataframe that holds summary statistics for the first one.
mtcars %>%
  summarise(mean_mpg = mean(mpg))
gives the output:
  mean_mpg
1 20.09062
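For context on why the original pipeline returns NA: select() keeps the data frame structure even when only one column is selected, and mean() has no method for data frames, whereas summary() does. A minimal sketch to check this, assuming dplyr is installed (the variable names one_col and mpg_vec are only illustrative):

```r
library(dplyr)

# select() returns a one-column data frame, not a vector
one_col <- mtcars %>% select(mpg)
class(one_col)   # a data frame

# pull() (like mtcars$mpg) extracts the underlying numeric vector
mpg_vec <- mtcars %>% pull(mpg)
class(mpg_vec)   # a numeric vector

# mean() works once it receives a vector
mean(mpg_vec)
```

This is also why mean(mtcars$mpg) works in base R: the $ operator already hands mean() a plain numeric vector.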
PS. If you're learning dplyr, learning these five verbs will take you a long way: select(), filter(), group_by(), summarise(), arrange().
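As a rough illustration of how those five verbs chain together, here is a small sketch on mtcars. The columns chosen and the hp > 100 cutoff are arbitrary assumptions made for the example, not something taken from the question:

```r
library(dplyr)

result <- mtcars %>%
  select(mpg, cyl, hp) %>%              # keep only three columns
  filter(hp > 100) %>%                  # keep cars with more than 100 horsepower
  group_by(cyl) %>%                     # one group per cylinder count
  summarise(mean_mpg = mean(mpg)) %>%   # mean mpg within each group
  arrange(desc(mean_mpg))               # highest mean first

result
```

Each step receives a data frame and returns a data frame, which is why the verbs compose cleanly with %>% while a bare mean() at the end of a select() does not.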
Answer 2:
Use pull() to pull out the vector.
mtcars %>%
  pull(mpg) %>%
  mean()
# [1] 20.09062
Or use pluck() from the purrr package.
mtcars %>%
  purrr::pluck("mpg") %>%
  mean()
# [1] 20.09062
Or summarize first and then pull out the mean.
mtcars %>%
  summarize(mean = mean(mpg)) %>%
  pull(mean)
# [1] 20.09062
Source: https://stackoverflow.com/questions/52718538/how-do-i-get-mean-functions-to-work-when-i-use-piping