How can I compare two strings to find the number of characters that match in R, using substitution distance?

Posted by 感情迁移 on 2021-02-18 11:37:30

Question


In R, I have two character vectors, a and b.

a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")

I want a function that counts the character mismatches between each element of a and the corresponding element of b. Using the example above, such a function should return c(2,3,1). There is no need to align the strings. I need to compare each pair of strings character-by-character and count matches and/or mismatches in each pair. Does any such function exist in R?

Or, to ask the question in another way, is there a function to give me the edit distance between two strings, where the only allowed operation is substitution (ignore insertions or deletions)?


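Answer 1:


Split each string into single characters with strsplit, then use mapply to count the positions that differ in each pair: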

mapply(function(x,y) sum(x!=y),strsplit(a,""),strsplit(b,""))
#[1] 2 3 1
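
The one-liner above can be wrapped in a reusable function. This is only a sketch: the name count_mismatches and the equal-length check are assumptions added for illustration, not part of the original answer. The check is there because, with substitution as the only allowed operation, the two strings in a pair need the same number of characters.

```r
# Hypothetical helper (not from the original answer): count position-wise
# mismatches between corresponding elements of two character vectors.
count_mismatches <- function(a, b) {
  # Assumption: both vectors have the same length, and each pair of
  # strings has the same number of characters.
  stopifnot(length(a) == length(b), all(nchar(a) == nchar(b)))
  # Split every string into single characters and compare pair by pair.
  mapply(function(x, y) sum(x != y),
         strsplit(a, ""), strsplit(b, ""))
}

a <- c("abcdefg", "hijklmnop", "qrstuvwxyz")
b <- c("abXdeXg", "hiXklXnoX", "Xrstuvwxyz")
count_mismatches(a, b)
#[1] 2 3 1
```

To count matches instead of mismatches, replace x != y with x == y inside the anonymous function.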



Answer 2:


Another option is to use adist, which computes the approximate (generalized Levenshtein) string distance between character vectors:

mapply(adist,a,b)
abcdefg  hijklmnop qrstuvwxyz 
     2          3          1 
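
One caveat: adist also counts insertions and deletions, so it equals the substitution-only count only when substituting is the cheapest way to turn one string into the other, as it is for the example vectors. The sketch below uses two made-up strings, x and y (not from the original question), where the two measures differ.

```r
# Illustrative strings (assumed for this example): y is x shifted by one.
x <- "abcd"
y <- "bcda"

# Position-wise comparison: every position differs.
n_subst <- sum(strsplit(x, "")[[1]] != strsplit(y, "")[[1]])
n_subst
#[1] 4

# adist: delete the leading "a", then insert "a" at the end.
n_edit <- as.vector(adist(x, y))
n_edit
#[1] 2
```

If insertions and deletions must be ignored, as the question asks, the position-wise comparison is the safer choice.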


Source: https://stackoverflow.com/questions/17286019/how-can-i-compare-two-strings-to-find-the-number-of-characters-that-match-in-r
