I'm trying to code this problem here:
but I'd like to find an algorithm that breaks down the steps for solving the problem. I can't seem to find anything too useful online so I've come here to ask if anyone knows of a resource which I can use to refer to an algorithm that solves this problem.
This is called the shortest common supersequence problem. The idea is that in order for the supersequence to be the shortest, we want to find as many shared bits of a and b as possible. We can solve the problem in two steps:
Find the longest common subsequence of a and b.
Insert the remaining bits of a and b while preserving the order of these bits.
We can solve the longest common subsequence problem using dynamic programming.
I agree with Terry Li: it is only NP-complete to find the SCS of multiple sequences. For 2 sequences (say s is of length n and t is of length m), my solution (doesn't use LCS but uses something similar) is done in O(nm) time:
1) Run a global alignment, in which you disallow mismatches, don't penalize indels, and give a positive score to matches (I did +1 for matches, -10 for mismatches, and 0 for indels, but these can be adjusted). (This is O(nm))
2) Iterate over the global alignment for both output strings v and w. If v[i] isn't a gap, output it. Otherwise, output w[i]. (This is O(n+m)).