Shortest Common Superstring algorithm?

匿名 (未验证) 提交于 2019-12-03 03:03:02

问题:

I'm trying to code this problem here:

but I'd like to find an algorithm that breaks down the steps for solving the problem. I can't seem to find anything too useful online so I've come here to ask if anyone knows of a resource which I can use to refer to an algorithm that solves this problem.

回答1:

This is called the shortest common supersequence problem. The idea is that in order for the supersequence to be the shortest, we want to find as many shared bits of a and b as possible. We can solve the problem in two steps:

  1. Find the longest common subsequence of a and b.

  2. Insert the remaining bits of a and b while preserving the order of these bits.

We can solve the longest common subsequence problem using dynamic programming.



回答2:

I agree with Terry Li: it is only NP-complete to find the SCS of multiple sequences. For 2 sequences (say s is of length n and t is of length m), my solution (doesn't use LCS but uses something similar) is done in O(nm) time:

1) Run a global alignment, in which you disallow mismatches, don't penalize indels, and give a positive score to matches (I did +1 for matches, -10 for mismatches, and 0 for indels, but these can be adjusted). (This is O(nm))

2) Iterate over the global alignment for both output strings v and w. If v[i] isn't a gap, output it. Otherwise, output w[i]. (This is O(n+m)).



标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!