I have a large data.frame of character data that I want to convert based on what is commonly called a dictionary in other languages.
Currently I am going about it li
One of the most readable ways to replace values in a string, or in a vector of strings, using a dictionary is stringr::str_replace_all, from the stringr package. The pattern argument of str_replace_all can be a dictionary, i.e. a named character vector whose names are the patterns and whose values are the replacements, e.g.,
# 1. Make your dictionary (names = values to find, values = replacements)
dictio_replace <- c("AA" = "0101",
                    "AC" = "0102",
                    "AG" = "0103") # short example of a dictionary
# 2. Replace every pattern with its dictionary value (works on a single string or a single vector of strings)
foo$snp1 <- stringr::str_replace_all(string = foo$snp1,
                                     pattern = dictio_replace) # only 'pattern' is used here: 'replacement' is not needed since the dictionary already supplies it
Repeat step 2 for foo$snp2 and foo$snp3. If you have more vectors to transform, it is a good idea to use another function to replace the values in every column of the data frame without repeating yourself.
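One possible way to avoid the repetition is base R's lapply, which applies the same replacement to each selected column. This is only a sketch: the contents of foo below are made up for illustration (only the column names snp1, snp2 and snp3 come from the answer above), and snp_cols is a hypothetical helper name.

```r
library(stringr)

# Hypothetical example data; only the column names come from the answer above.
foo <- data.frame(
  snp1 = c("AA", "AG", "AC"),
  snp2 = c("AC", "AA", "AG"),
  snp3 = c("AG", "AC", "AA"),
  stringsAsFactors = FALSE
)

# Same dictionary as in step 1.
dictio_replace <- c("AA" = "0101", "AC" = "0102", "AG" = "0103")

# Columns to convert (hypothetical helper name).
snp_cols <- c("snp1", "snp2", "snp3")

# Apply str_replace_all to each selected column and write the results back.
foo[snp_cols] <- lapply(foo[snp_cols], str_replace_all, pattern = dictio_replace)
```

Note that str_replace_all treats the dictionary names as regular expressions, so they match anywhere inside a string. If the columns may contain longer values that merely include "AA" or "AC", the names may need anchoring (for example "^AA$").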