dplyr if_else() vs base R ifelse()

野趣味 2020-11-30 10:22

I am fairly proficient within the Tidyverse, but have always used ifelse() instead of dplyr if_else(). I want to switch this behavior and default t

4 Answers
  • 2020-11-30 10:32

    I'd also add that if_else() can assign a value where the condition is NA, through its missing argument, which is a handy way of adding an extra condition.

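    library(dplyr)

    # tibble() replaces the deprecated data_frame()
    df <- tibble(val = c(80, 90, NA, 110))
    # `missing` supplies the value used where the condition is NA
    df %>% mutate(category = if_else(val < 100, 1, 2, missing = 9))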
    
    #     val category
    #   <dbl>    <dbl>
    # 1    80        1
    # 2    90        1
    # 3    NA        9
    # 4   110        2
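
    For contrast, here is a minimal base R sketch of the same data. The variable names plain and patched are made up for illustration. ifelse() has no missing argument, so an NA condition simply yields NA unless it is tested for explicitly.

    ```r
    val <- c(80, 90, NA, 110)

    # Base ifelse() has no `missing` argument: an NA condition yields NA
    plain <- ifelse(val < 100, 1, 2)
    plain
    #> [1]  1  1 NA  2

    # Handling NA therefore needs an extra, explicit test
    patched <- ifelse(is.na(val), 9, ifelse(val < 100, 1, 2))
    patched
    #> [1] 1 1 9 2
    ```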
    
  • 2020-11-30 10:44

    Another reason to choose if_else() over ifelse() is that ifelse() strips attributes, so Date vectors come back as plain numbers:

    Dates <- as.Date(c('2018-10-01', '2018-10-02', '2018-10-03'))
    new_Dates <- ifelse(Dates == '2018-10-02', Dates + 1, Dates)
    str(new_Dates)
    
    #>  num [1:3] 17805 17807 17807
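
    As a sketch of the difference, the same replacement written with if_else() should keep the Date class, since both branches are Dates. The comparison value is wrapped in as.Date() here only to make the types explicit.

    ```r
    library(dplyr)

    Dates <- as.Date(c("2018-10-01", "2018-10-02", "2018-10-03"))

    # Both branches are Date vectors, so the result stays a Date
    new_Dates <- if_else(Dates == as.Date("2018-10-02"), Dates + 1, Dates)
    class(new_Dates)
    #> [1] "Date"
    new_Dates
    #> [1] "2018-10-01" "2018-10-03" "2018-10-03"
    ```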
    

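    The dplyr documentation also describes if_else() as somewhat faster than ifelse(), because the strict type checking makes the output type predictable.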
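
    One rough way to check the speed claim on your own machine is the sketch below. The vector size and the names x, base_res and dplyr_res are arbitrary choices for illustration, and timings will vary with hardware and dplyr version, so no expected numbers are shown.

    ```r
    library(dplyr)

    set.seed(1)
    x <- runif(1e6)  # arbitrary test vector

    # Time the same element-wise choice with both functions
    system.time(base_res <- ifelse(x > 0.5, 1, 0))
    system.time(dplyr_res <- if_else(x > 0.5, 1, 0))

    # Both calls should produce the same values
    all.equal(base_res, dplyr_res)
    ```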

    Note that when testing multiple conditions, the code is more readable and less error-prone with case_when().

    library(dplyr)
    
    case_when(
      Dates == '2018-10-01' ~ Dates - 1,
      Dates == '2018-10-02' ~ Dates + 1,
      Dates == '2018-10-03' ~ Dates + 2,
      TRUE ~ Dates
    )
    
    #> [1] "2018-09-30" "2018-10-03" "2018-10-05"
    

    Created on 2018-06-01 by the reprex package (v0.2.0).

  • 2020-11-30 10:45

    Another important reason for preferring if_else() to ifelse() is checking for consistency in lengths. See this dangerous gotcha:

    > tibble(x = 1:3, y = ifelse(TRUE, x, 4:6))
    # A tibble: 3 x 2
          x     y
      <int> <int>
    1     1     1
    2     2     1
    3     3     1
    

    Compare with

    > tibble(x = 1:3, y = if_else(TRUE, x, 4:6))
        Error: `true` must be length 1 (length of `condition`), not 3.
    

    The intention in both cases is clearly for column y to equal x or to equal 4:6 according to the value of a single (scalar) logical variable; ifelse() silently truncates its output to length 1, which is then silently recycled. if_else() catches what is almost certainly an error at source.
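
    When the condition really is a single logical value, a plain if / else expression is one way to get the intended result. This is only a sketch: use_x is a hypothetical flag standing in for that scalar condition.

    ```r
    library(dplyr)

    use_x <- TRUE  # hypothetical scalar flag

    # if / else returns the whole chosen vector, not just its first element
    res_true <- tibble(x = 1:3, y = if (use_x) x else 4:6)
    res_true$y
    #> [1] 1 2 3

    res_false <- tibble(x = 1:3, y = if (!use_x) x else 4:6)
    res_false$y
    #> [1] 4 5 6
    ```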

  • 2020-11-30 10:56

    if_else() is stricter. It checks that both alternatives have the same type and throws an error otherwise, whereas ifelse() silently promotes types as necessary. The strictness can be a benefit in some circumstances, but it may break scripts unless you check for errors or explicitly force type conversion. For example:

    ifelse(c(TRUE, TRUE, FALSE), "a", 3)
    #> [1] "a" "a" "3"
    if_else(c(TRUE, TRUE, FALSE), "a", 3)
    #> Error: `false` must be type character, not double
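
    A minimal sketch of the explicit conversion mentioned above: make both branches the same type before calling if_else(). The names cond, converted and with_na are illustrative only.

    ```r
    library(dplyr)

    cond <- c(TRUE, TRUE, FALSE)

    # Convert explicitly so both branches are character
    converted <- if_else(cond, "a", as.character(3))
    converted
    #> [1] "a" "a" "3"

    # Use a typed NA when one branch should be missing
    with_na <- if_else(cond, "a", NA_character_)
    with_na
    #> [1] "a" "a" NA
    ```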
    