Drawing on the discussion on conditional dplyr evaluation, I would like to conditionally execute a step in a pipeline depending on whether the reference column exists in the passed da
On a busy day, one might write something like the following:
library(dplyr)
df <- data.frame(A = 1:3, B = letters[1:3], stringsAsFactors = FALSE)
> df %>% mutate( C = ifelse("D" %in% colnames(.), D, B))
# Notice the values in column "C". No error is thrown, but the logic and the result are wrong
A B C
1 1 a a
2 2 b a
3 3 c a
Why? Because "D" %in% colnames(.) returns only a single TRUE or FALSE, and ifelse returns a result of the same length as its test. It therefore operates only once and yields just the first element of B. That single value is then recycled across the whole column!
A plain if ... else avoids this, because it selects an entire column rather than a single element:
> df %>% mutate( C = if("D" %in% colnames(.)) D else B)
A B C
1 1 a a
2 2 b b
3 3 c c
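If the same check is needed in several places, it could be wrapped in a small function. The sketch below uses base R only; col_or_fallback is a hypothetical helper name, not something dplyr provides, and df2 is an extra data frame made up to show the case where "D" exists.

```r
# Hypothetical helper (not part of dplyr): return column `col` if it exists
# in `data`, otherwise return column `fallback`. Uses a scalar if/else, so a
# whole column is selected rather than a single element.
col_or_fallback <- function(data, col, fallback) {
  if (col %in% colnames(data)) data[[col]] else data[[fallback]]
}

# "D" is absent, so C is a copy of B
df <- data.frame(A = 1:3, B = letters[1:3], stringsAsFactors = FALSE)
df$C <- col_or_fallback(df, "D", "B")

# "D" is present, so C is a copy of D
df2 <- data.frame(A = 1:3, B = letters[1:3], D = c("x", "y", "z"),
                  stringsAsFactors = FALSE)
df2$C <- col_or_fallback(df2, "D", "B")
```

With the magrittr pipe, the same helper should also work inside mutate, for example df %>% mutate(C = col_or_fallback(., "D", "B")), since the dot refers to the piped data frame there.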