dplyr: Replace values rowwise based on value in one variable

Posted by 我是研究僧i on 2021-02-17 06:40:14

Question


I want to exclude participants who are too old (age > 90) from an analysis. Usually I would do it like this:

df <- data.frame(age=c(1,10, 100), x= 1:3, y= 1:3)
df[df$age > 90, ] <- NA

I can't figure out how to do this with dplyr. If we want to replace values in a single variable, we can use

library(dplyr)
df <- data.frame(age=c(1,10, 100), x= 1:3, y= 1:3)
df %>%
  mutate(age= replace(age, age> 90, NA))

So I thought I could use

df %>%
  mutate_all(function(i) replace(i, age> 90, NA))

I also tried mutate_if and mutate_at, but they did not work either. After reading questions on SO, I think the "problem" is that in my situation I need to change the values rowwise with dplyr.


Answer 1:


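You need to arrange the columns so that the test column (age) comes last. mutate_all() updates the columns in order, so with age last the condition age > 90 is still evaluated against the original age values while x and y are being processed.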

library(dplyr)
df %>%
  select(x, y, age) %>%
  mutate_all(~replace(.x, age> 90, NA))

#   x  y age
#1  1  1   1
#2  2  2  10
#3 NA NA  NA
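
An alternative that does not depend on column order is to compute the condition once, before any column is modified. The following is only a minimal sketch, assuming dplyr 1.0.0 or later (for across()); the names too_old and res are made up for illustration and do not come from the answers here.

```r
library(dplyr)

df <- data.frame(age = c(1, 10, 100), x = 1:3, y = 1:3)

# Evaluate the condition on the original data, before anything is overwritten.
# `too_old` is a hypothetical helper name.
too_old <- df$age > 90

# Blank out every column in the flagged rows; column order no longer matters.
res <- df %>%
  mutate(across(everything(), ~ replace(.x, too_old, NA)))

print(res)
```

Because too_old is an ordinary logical vector stored outside the data frame, it keeps its original values even after age itself has been set to NA.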



Answer 2:


library(dplyr)

df <- data.frame(age=c(1,10, 100), x= 1:3, y= 1:3)

# `.` is the data frame passed in by %>%, so .$age is the original age column
df %>%
    mutate_all(function(i) replace(i, .$age> 90, NA))

  age  x  y
1   1  1  1
2  10  2  2
3  NA NA NA

Just to be sure: you say you want to exclude them, so I would guess that what you actually want is this:

df %>%
    filter(age <= 90)

  age x y
1   1 1 1
2  10 2 2

?
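
One thing to keep in mind with the filter() approach is that filter() keeps only rows where the condition is TRUE, so rows with a missing age are dropped as well. The sketch below illustrates this; df2, strict and lenient are hypothetical names, and the data are invented for the example.

```r
library(dplyr)

# Hypothetical data: the second participant has no recorded age.
df2 <- data.frame(age = c(1, NA, 100), x = 1:3, y = 1:3)

# NA <= 90 evaluates to NA, which filter() treats like FALSE:
# both the NA row and the age-100 row are removed.
strict <- df2 %>%
  filter(age <= 90)

# Keep rows with a missing age explicitly, dropping only age > 90.
lenient <- df2 %>%
  filter(is.na(age) | age <= 90)

print(nrow(strict))
print(nrow(lenient))
```

Whether participants with an unknown age should stay in the analysis is a study decision; the second form just makes that choice explicit.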



Source: https://stackoverflow.com/questions/59928746/dplyr-replace-values-rowwise-based-on-value-in-one-variable
