Calculate mean of column data based on conditions in another column [duplicate]

问题

I have a dataframe with a transnational data structure that looks something similar to the below:

ID   RANK   GRADE
123  E1     0
123  E1     42
123  E1     NA
123  E2     41
123  E2     42
456  E2     41
456  E2     41
456  E3     NA

I want to calculate the mean of the Grade column for each Rank based on the ID, ignoring the values that are 0 because they are data entry errors, and ignoring the NA's.

For example: For ID 123, I want the mean of Grade when their rank was E1, then for E2, etc.

回答1:

You can use group_by and summarize from the dplyr package:

library(dplyr)

df %>%
  filter(!is.na(GRADE),
         GRADE != 0) %>%
  group_by(ID, RANK) %>%
  summarize(mean_grade = mean(GRADE))

The filter function is to remove any rows where GRADE is NA or 0

来源：https://stackoverflow.com/questions/48753162/calculate-mean-of-column-data-based-on-conditions-in-another-column

标签

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!