Is there an alternative to “revalue” function from plyr when using dplyr?

野性不改 2021-02-18 16:05

I'm a fan of the revalue function in plyr for substituting strings. It's simple and easy to remember.

However, I've migrated new code to

4 answers
  • 2021-02-18 16:45

    One alternative that I find handy is the mapvalues function from plyr, used inside a data.table, e.g.

    df[, variable := mapvalues(variable, from = old_names_string_vector, to = new_names_string_vector)]
    
  • 2021-02-18 16:47

    There is a recode function, available starting with dplyr 0.5.0, which looks very similar to revalue from plyr.

    Example built from the recode documentation Examples section:

    set.seed(16)
    x = sample(c("a", "b", "c"), 10, replace = TRUE)
    x
     [1] "a" "b" "a" "b" "b" "a" "c" "c" "c" "a"
    
    recode(x, a = "Apple", b = "Bear", c = "Car")
    
     [1] "Apple" "Bear"  "Apple" "Bear"  "Bear"  "Apple" "Car"   "Car"   "Car"   "Apple"
    

    If you only define some of the values that you want to recode, by default the rest are filled with NA.

    recode(x, a = "Apple", c = "Car")
     [1] "Apple" NA      "Apple" NA      NA      "Apple" "Car"   "Car"   "Car"   "Apple"
    

    This behavior can be changed using the .default argument.

    recode(x, a = "Apple", c = "Car", .default = x)
     [1] "Apple" "b"     "Apple" "b"     "b"     "Apple" "Car"   "Car"   "Car"   "Apple"
    

    There is also a .missing argument if you want to replace missing values with something else.
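
As a rough illustration of the .missing argument mentioned above, here is a minimal sketch. The vector y and the label "Unknown" are made up for this example, and dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical input containing one missing value
y <- c("a", NA, "c")

# Recode every non-missing value explicitly and give NA its own label
recoded <- recode(y, a = "Apple", c = "Car", .missing = "Unknown")
recoded
# [1] "Apple"   "Unknown" "Car"
```

Every non-missing value is recoded explicitly here, so the result does not depend on how unmatched values are handled by default.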

  • 2021-02-18 16:53

    I wanted to comment on the answer by @aosmith, but lack reputation. It seems that nowadays the default of dplyr's recode function is to leave unspecified levels unaffected.

    x = sample(c("a", "b", "c"), 10, replace = TRUE)
    x
    [1] "c" "c" "b" "b" "a" "b" "c" "c" "c" "b"
    
    recode(x , a = "apple", b = "banana" )
    
    [1] "c"      "c"      "banana" "banana" "apple"  "banana" "c"      "c"      "c"      "banana"
    

    To change all nonspecified levels to NA, the argument .default = NA_character_ should be included.

    recode(x, a = "apple", b = "banana", .default = NA_character_)
    
    [1] NA       NA       "banana" "banana" "apple"  "banana" NA       NA       NA       "banana"
    
  • 2021-02-18 16:53

    We can do this with chartr from base R

    chartr("ac", "AC", x)
    

    data

    x <- c("a", "b", "c")
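
One caveat worth sketching: chartr translates character by character, so it only fits cases where each old and new value is a single character, and it touches every occurrence of that character inside longer strings. The string "banana cake" below is a made-up example.

```r
# Base R only; no packages needed
x <- c("a", "b", "c")

# Each character of "ac" is mapped to the character at the same position in "AC"
single <- chartr("ac", "AC", x)
single
# [1] "A" "b" "C"

# The same mapping applied to a longer, hypothetical string
longer <- chartr("ac", "AC", "banana cake")
longer
# [1] "bAnAnA CAke"
```

For whole-string replacements such as "a" to "Apple", recode or mapvalues from the other answers is the closer match to revalue.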
    