Question
I'm trying to replace values with NA across multiple columns if a condition is met.
Here's a sample dataset:
library(tidyverse)
sample <- tibble(id = 1:6,
                 team_score = 5:10,
                 cent_dept_test_agg = c(1, 2, 3, 4, 5, 6),
                 cent_dept_blue_agg = c(15:20),
                 num_in_dept = c(1, 1, 2, 5, 100, 6))
I want the columns that contain cent_dept_.*_agg to be NA when num_in_dept is 1, so it looks like this:
library(tidyverse)
solution <- tibble(id = 1:6,
                   team_score = 5:10,
                   cent_dept_test_agg = c(NA, NA, 3, 4, 5, 6),
                   cent_dept_blue_agg = c(NA, NA, 17:20),
                   num_in_dept = c(1, 1, 2, 5, 100, 6))
I've tried using replace_with_na_at (from the naniar package) and na_if (from the dplyr package), but I can't figure it out. I know my selection criterion is correct (dplyr::matches("cent_dept_.*_agg")), but I can't work out the rest.
In my actual dataset, I have many columns that start with cent_dept and end with agg, so it's very important that the selection uses that matches component.
Thank you for your help!
Answer 1:
We can use mutate_at to select the columns that match 'cent_dept_.*_agg' and replace the values with NA where 'num_in_dept' is 1:
library(dplyr)
sample %>%
  mutate_at(vars(matches('^cent_dept_.*_agg$')),
            ~ replace(., num_in_dept == 1, NA))
# A tibble: 6 x 5
#     id team_score cent_dept_test_agg cent_dept_blue_agg num_in_dept
#  <int>      <int>              <dbl>              <int>       <dbl>
#1     1          5                 NA                 NA           1
#2     2          6                 NA                 NA           1
#3     3          7                  3                 17           2
#4     4          8                  4                 18           5
#5     5          9                  5                 19         100
#6     6         10                  6                 20           6
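As of dplyr 1.0.0, mutate_at is superseded by across(), so the same replacement can probably be written as below. This is a minimal sketch, assuming dplyr >= 1.0.0 is installed; the name scores is hypothetical and only used here to avoid masking base::sample().

```r
library(dplyr)

# Same data as in the question; `scores` is an illustrative name
scores <- tibble(id = 1:6,
                 team_score = 5:10,
                 cent_dept_test_agg = c(1, 2, 3, 4, 5, 6),
                 cent_dept_blue_agg = 15:20,
                 num_in_dept = c(1, 1, 2, 5, 100, 6))

# across() applies the lambda to every column selected by matches();
# num_in_dept is looked up in the data frame for each selected column
result <- scores %>%
  mutate(across(matches("^cent_dept_.*_agg$"),
                ~ replace(.x, num_in_dept == 1, NA)))
```

Columns that don't match the regex (id, team_score, num_in_dept) are left untouched, and the column types are preserved because replace() keeps the type of its input.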
In base R, we can also do
nm1 <- grep('^cent_dept_.*_agg$', names(sample))
sample[nm1] <- lapply(sample[nm1], function(x)
  replace(x, sample$num_in_dept == 1, NA))
Or, since the matched columns are numeric, it can be done arithmetically with
sample[nm1] <- sample[nm1] * NA^(sample$num_in_dept == 1)
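The NA^(...) trick relies on how R evaluates powers of NA: the logical condition is coerced to 1/0, NA^1 is NA, and NA^0 is 1. A small sketch with made-up vectors (cond, x, mask and masked are illustrative names, not from the question):

```r
# TRUE/FALSE in the exponent is coerced to 1/0
cond <- c(TRUE, FALSE)

# NA^1 is NA, NA^0 is 1, so mask is NA where cond is TRUE and 1 elsewhere
mask <- NA^cond

# Multiplying by the mask blanks out the flagged element and keeps the other
x <- c(10, 20)
masked <- x * mask
```

Note that this only works for numeric columns, and an integer column such as cent_dept_blue_agg comes back as double after the multiplication. The replace() versions keep the original column type.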
Source: https://stackoverflow.com/questions/59699567/replace-values-with-na-across-multiple-columns-if-a-condition-is-met-in-r