Question
I want to exclude participants who are too old (age > 90) from an analysis. Usually I would do it like this:
df <- data.frame(age = c(1, 10, 100), x = 1:3, y = 1:3)
df[df$age > 90, ] <- NA
I can't figure out how to do this with dplyr. If we want to replace one variable we can use
library(dplyr)
df <- data.frame(age = c(1, 10, 100), x = 1:3, y = 1:3)
df %>%
  mutate(age = replace(age, age > 90, NA))
So I thought I could use
df %>%
  mutate_all(function(i) replace(i, age > 90, NA))
I also tried mutate_if and mutate_at, but neither worked. After reading questions on SO, I think the "problem" is that in my situation I need to change the values rowwise with dplyr.
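Answer 1:
You need to arrange the columns so that the test column (age) comes last. mutate_all() processes the columns in order, so if age is replaced first, the test age > 90 for the remaining columns runs against the already modified age and no longer matches.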
library(dplyr)
df %>%
  select(x, y, age) %>%
  mutate_all(~ replace(.x, age > 90, NA))
#    x  y age
# 1  1  1   1
# 2  2  2  10
# 3 NA NA  NA
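Why the column order matters can be sketched in base R. The snippet below is a minimal sketch that imitates, by hand, what mutate_all() appears to do when age comes first: age is overwritten, and the test for the next column is then evaluated against the overwritten values. The names `mask` and `x_new` are made up for illustration.

```r
# Hand-rolled imitation of processing age before x (assumed to mirror
# the sequential column updates of mutate_all())
age <- c(1, 10, 100)
x <- 1:3

# Step 1: age itself is replaced first
age <- replace(age, age > 90, NA)
print(age)    # values: 1, 10, NA

# Step 2: the test for x now sees the modified age
mask <- age > 90
print(mask)   # values: FALSE, FALSE, NA

# An NA in a logical index selects nothing on assignment,
# so x is left unchanged
x_new <- replace(x, mask, NA)
print(x_new)  # values: 1, 2, 3
```

With age moved to the end, every other column is tested against the original age before age itself is touched.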
Answer 2:
library(dplyr)
df <- data.frame(age = c(1, 10, 100), x = 1:3, y = 1:3)
# .$age is the age column of the data frame piped in, so the test
# does not depend on the column order
df %>%
  mutate_all(function(i) replace(i, .$age > 90, NA))
#   age  x  y
# 1   1  1  1
# 2  10  2  2
# 3  NA NA NA
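A variation that sidesteps the ordering question altogether is to compute the test once, before any column is modified. This is a sketch only; `too_old` and `res` are hypothetical names that do not appear in the original answers.

```r
library(dplyr)

df <- data.frame(age = c(1, 10, 100), x = 1:3, y = 1:3)

# Evaluate the condition on the untouched data
too_old <- df$age > 90

# too_old is an ordinary vector, not a column, so it is not affected
# when age is overwritten
res <- df %>%
  mutate_all(~ replace(.x, too_old, NA))

print(res)
```

Because the condition lives outside the data frame, the columns can stay in their original order.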
Just to be sure: you say you want to exclude them. Is this perhaps what you actually want?
df %>%
  filter(age <= 90)
#   age x y
# 1   1 1 1
# 2  10 2 2
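The two approaches give differently shaped results, which may matter for the analysis. A short sketch of the contrast, with `masked` and `kept` as illustrative names: replacing keeps the row as a placeholder filled with NA, while filter() drops it.

```r
library(dplyr)

df <- data.frame(age = c(1, 10, 100), x = 1:3, y = 1:3)

# Replacing: the row stays, all of its values become NA
masked <- df
masked[masked$age > 90, ] <- NA

# Filtering: the row is removed
kept <- df %>% filter(age <= 90)

print(nrow(masked))  # 3
print(nrow(kept))    # 2
```

Note that filter() also drops rows where the condition evaluates to NA, so a participant with a missing age would be removed by filter(age <= 90) as well.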
Source: https://stackoverflow.com/questions/59928746/dplyr-replace-values-rowwise-based-on-value-in-one-variable