I have generated a string over the alphabet {A, C, G, T}. The string contains more than 10000 characters, and I am searching for the following patterns in it:
- ATGGA
- TGGAC
- CCGT
I have been asked to use a string-matching algorithm with O(m+n) running time, where:
m = pattern length
n = text length
Both the KMP and Rabin-Karp algorithms have this running time. Which of the two (Rabin-Karp or KMP) is the more suitable algorithm in this situation?
When you want to search for multiple patterns, the usual choice is Aho-Corasick, which is essentially a generalization of KMP to a set of patterns. In your case you are only searching for 3 patterns, so running KMP once per pattern may not be much slower (at most three passes over the text), but Aho-Corasick is the general approach.
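As a rough illustration of the "KMP once per pattern" option, here is a minimal sketch in Python. The function names `build_failure` and `kmp_search` and the short sample text are my own, chosen for illustration; they do not come from the question.

```python
def build_failure(pattern):
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Return the start indices of all occurrences (overlaps included)."""
    if not pattern:
        return []
    failure = build_failure(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches


# Hypothetical sample text; the real input would be the 10000+ character string.
text = "CCGTATGGACCGT"
for p in ("ATGGA", "TGGAC", "CCGT"):
    print(p, kmp_search(text, p))
```

Each call scans the text once, so three patterns cost three passes plus the preprocessing of each pattern.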
Rabin-Karp is easier to implement if we assume that a hash collision will never happen. For a typical string-searching problem, though, KMP is more stable: its O(m+n) bound holds no matter what input you have, whereas Rabin-Karp only achieves it in expectation and can degrade to O(mn) when many hash values collide. However, Rabin-Karp has many other applications where KMP is not an option.
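For comparison, a minimal Rabin-Karp sketch in Python might look like the following. The function name `rabin_karp` and the default `base` and `mod` values are assumptions made for illustration. Unlike the "collisions never happen" shortcut, this version verifies every hash hit with a direct comparison, which is exactly where the worst case comes from.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the start indices of all occurrences of pattern in text,
    using a polynomial rolling hash and verifying each hash hit."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    matches = []
    for i in range(n - m + 1):
        # Equal hashes only suggest a match; the comparison rules out collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Slide the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return matches
```

With a well-chosen modulus the verification step rarely runs on non-matching windows. Passing a tiny modulus such as `mod=1` makes every window collide, so the result stays correct but every position is compared character by character.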
Source: https://stackoverflow.com/questions/23336807/when-to-use-rabin-karp-or-kmp-algorithms