Question
I have a dataframe with a transactional data structure that looks similar to the following:
ID   RANK  GRADE
123  E1    0
123  E1    42
123  E1    NA
123  E2    41
123  E2    42
456  E2    41
456  E2    41
456  E3    NA
I want to calculate the mean of the GRADE column for each RANK within each ID, ignoring values of 0 (they are data entry errors) and ignoring NAs.
For example, for ID 123 I want the mean of GRADE while their rank was E1, then while it was E2, and so on.
Answer 1:
You can use group_by and summarize from the dplyr package:
library(dplyr)
df %>%
  filter(!is.na(GRADE),
         GRADE != 0) %>%
  group_by(ID, RANK) %>%
  summarize(mean_grade = mean(GRADE))
The filter call removes any rows where GRADE is NA or 0, so only valid grades go into the mean for each ID and RANK pair.
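To check the approach against the sample from the question, here is a minimal reproducible sketch. The data frame is typed in by hand from the table above. The name result and the .groups = "drop" argument are additions not in the original answer. The argument assumes dplyr 1.0.0 or later and only removes the leftover grouping, which also silences the grouping message.

```r
library(dplyr)

# Sample data copied from the question; NA marks a missing grade.
df <- data.frame(
  ID    = c(123, 123, 123, 123, 123, 456, 456, 456),
  RANK  = c("E1", "E1", "E1", "E2", "E2", "E2", "E2", "E3"),
  GRADE = c(0, 42, NA, 41, 42, 41, 41, NA)
)

result <- df %>%
  filter(!is.na(GRADE), GRADE != 0) %>%  # drop missing grades and erroneous zeros
  group_by(ID, RANK) %>%                 # one group per ID/RANK pair
  summarize(mean_grade = mean(GRADE), .groups = "drop")  # .groups: assumes dplyr >= 1.0.0

print(result)
# Means per group: 123/E1 = 42, 123/E2 = 41.5, 456/E2 = 41
```

Note that ID 456 at rank E3 does not appear in the result: its only grade is NA, so the filter step removes the whole group before the mean is taken.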
Source: https://stackoverflow.com/questions/48753162/calculate-mean-of-column-data-based-on-conditions-in-another-column