str_replace_all replacing named vector elements iteratively not all at once

误落风尘 2021-01-18 09:15

Let's say I have a long character string: pneumonoultramicroscopicsilicovolcanoconiosis. I'd like to use stringr::str_replace_all to replace certain letters w

4 answers

情歌与酒 (original poster)

2021-01-18 09:41

The iterative behavior is intended. That said, we can write our own workaround. I am going to use character subsetting for the replacement.
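
To make the iterative behavior concrete, here is a minimal sketch with a made-up input ("aXa" and the two swap rules are assumptions for illustration, not from the question). With a named vector, str_replace_all applies the patterns one after another, so the second rule also rewrites what the first rule produced.

```r
library(stringr)

# Hypothetical two-rule example: try to swap "a" and "X"
rules <- c(a = "X", X = "a")

# Rule 1 turns "aXa" into "XXX"; rule 2 then turns every "X" into "a"
result <- str_replace_all("aXa", rules)
result
#> [1] "aaa"

# A simultaneous swap would have produced "XaX" instead
```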

In a named vector, we can look up elements by name and get a replacement value for each name. Because every lookup uses the original characters, this is like doing all the replacements simultaneously.

rules <- c(a = "X", b = "Y", X = "a")
chars <- c("a", "a", "b", "X", "X")
rules[chars]
#>   a   a   b   X   X 
#> "X" "X" "Y" "a" "a"

So here, looking up "a" in the rules vector gets us "X", effectively replacing "a" with "X". The same goes for the other characters.

One problem is that names without a match yield NA.

rules <- c(a = "X", b = "Y", X = "a")
chars <- c("a", "Y", "Z")
rules[chars]
#>    a <NA> <NA> 
#>  "X"   NA   NA

To prevent the NAs from appearing, we can expand the rules to include any new characters so that a character is replaced by itself.

rules <- c(a = "X", b = "Y", X = "a")
chars <- c("a", "Y", "Z")
no_rule <- chars[! chars %in% names(rules)]
rules2 <- c(rules, setNames(no_rule, no_rule))
rules2[chars]
#>   a   Y   Z 
#> "X" "Y" "Z"
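
As an alternative sketch, not the approach used in this answer, the NAs could also be patched after the lookup by falling back to the original character wherever no rule matched. The variable names `looked_up` and `missing_rule` are made up for this example.

```r
rules <- c(a = "X", b = "Y", X = "a")
chars <- c("a", "Y", "Z")

# Look up every character; names without a rule come back as NA
looked_up <- unname(rules[chars])

# Fall back to the original character wherever the lookup failed
missing_rule <- is.na(looked_up)
looked_up[missing_rule] <- chars[missing_rule]
looked_up
#> [1] "X" "Y" "Z"
```

Expanding the rules up front, as above, keeps the lookup itself a single step, which is why the function below takes that route.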

And that's the logic behind the following function.

1. Break the strings into characters
2. Create a full list of replacement rules
3. Look up replacement values
4. Glue the strings back together

library(stringr)

str_replace_chars <- function(string, rules) {
  # Expand rules to replace characters with themselves 
  # if those characters do not have a replacement rule
  chars <- unique(unlist(strsplit(string, "")))
  complete_rules <- setNames(chars, chars)
  complete_rules[names(rules)] <- rules

  # Split each string into characters, replace and unsplit
  for (string_i in seq_along(string)) {
    chars_i <- unlist(strsplit(string[string_i], ""))
    string[string_i] <- paste0(complete_rules[chars_i], collapse = "")
  }
  string
}

rules <- c(a = "X", p = "e", e = "p")
string <- c("application", "developer")
str_replace_chars(string, rules)
#> [1] "XeelicXtion" "dpvploepr"
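
For the special case where every rule maps one single character to another single character, base R's chartr() performs the same kind of simultaneous translation. This is a different technique from the function above, shown only as a sketch; it assumes each name and each value in `rules` is exactly one character long, and `old` and `new` are names made up here.

```r
rules <- c(a = "X", p = "e", e = "p")
string <- c("application", "developer")

# chartr(old, new, x) maps the i-th character of `old` to the i-th of `new`
old <- paste(names(rules), collapse = "")
new <- paste(rules, collapse = "")

translated <- chartr(old, new, string)
translated
#> [1] "XeelicXtion" "dpvploepr"
```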
