Finding the Longest Palindrome Subsequence with less memory

ε祈祈猫儿з submitted on 2019-12-02 19:35:49

Here is a very memory-efficient version, although I haven't proven that it always uses O(n) memory. (With a preprocessing step it can do better than O(n^2) CPU, though O(n^2) is still the worst case.)

Start from the left-most position. For each position, keep track of a table of the farthest out points at which you can generate reflected subsequences of length 1, 2, 3, etc. (Meaning that a subsequence to the left of our point is reflected to the right.) For each reflected subsequence we store a pointer to the next part of the subsequence.

As we work our way right, we search from the right-hand end of the string back toward the current position for any occurrences of the current element, and try to use those matches to improve the bounds we previously had. When we finish, we look at the longest mirrored subsequence, from which we can easily construct the best palindrome.

Let's consider this for the string "character" (9 letters, positions numbered from 1).

  1. We start with our best palindrome being the letter 'c', and our empty mirrored subsequence bounded by the pair (0, 10), which are off the ends of the string.
  2. Next consider the 'c' at position 1. Our best mirrored subsequences in the form (length, end, start) are now [(0, 10, 0), (1, 6, 1)]. (I'll leave out the linked list you need to generate to actually find the palindrome.)
  3. Next consider the 'h' at position 2. We do not improve the bounds [(0, 10, 0), (1, 6, 1)].
  4. Next consider the 'a' at position 3. We improve the bounds to [(0, 10, 0), (1, 6, 1), (2, 5, 3)].
  5. Next consider the 'r' at position 4. We improve the bounds to [(0, 10, 0), (1, 9, 4), (2, 5, 3)]. (This is where the linked list would be useful: the length-2 entry still chains back to the old 'c' pair, which is no longer in the table.)

Working through the rest of the list we do not improve that set of bounds.

So we wind up with a longest mirrored list of length 2. Following the linked list (which I didn't record in this description), we find that it is ac. Since the ends of that list are at positions (5, 3), we can flip the list, insert the character at position 4 (r), then append the list to get carac.

In general the maximum memory that it will require is to store all of the lengths of the maximal mirrored subsequences plus the memory to store the linked lists of said subsequences. Typically this will be a very small amount of memory.
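The procedure above can be sketched in Python as follows. This is only one possible reading of the description, not a definitive implementation, and I make no claim about its worst-case memory. The function name `longest_palindromic_subsequence`, the `(end, start, chain)` table entries and the `(char, parent)` tuples standing in for the linked list are all my own assumptions. Positions are 0-based here, whereas the walkthrough counts from 1. The best palindrome is recorded whenever a table entry is written, because a later improvement to an entry can close the gap that an earlier entry left for a middle character.

```python
def longest_palindromic_subsequence(s):
    """Return one longest palindromic subsequence of s (sketch, 0-based)."""
    n = len(s)
    if n == 0:
        return ""

    # table[k] = (end, start, chain) for mirrored subsequences of length k.
    # 'end' is the farthest-right position found so far for the innermost
    # right-hand element; 'start' is its partner on the left (kept only to
    # mirror the triples in the text). 'chain' is a linked list of
    # (char, parent) tuples, innermost pair first.
    table = [(n, -1, None)]  # length 0: both bounds are off the ends

    # Best palindrome seen so far: (length, chain, middle character).
    best = (1, None, s[0])

    for i, ch in enumerate(s):
        # Go from long to short so an entry written for position i is never
        # reused as the base of a longer entry for the same position.
        for k in range(len(table) - 1, -1, -1):
            end, _, chain = table[k]
            # Rightmost occurrence of ch strictly between i and end.
            j = s.rfind(ch, i + 1, end)
            if j == -1:
                continue
            # Only record the match if it pushes the length-(k+1) bound out.
            if k + 1 < len(table) and table[k + 1][0] >= j:
                continue
            node = (ch, chain)
            entry = (j, i, node)
            if k + 1 == len(table):
                table.append(entry)
            else:
                table[k + 1] = entry
            # A gap between i and j leaves room for one middle character.
            has_middle = j - i > 1
            length = 2 * (k + 1) + (1 if has_middle else 0)
            if length > best[0]:
                best = (length, node, s[i + 1] if has_middle else "")

    # Walk the chain: it yields the mirrored half from the inside out.
    _, node, middle = best
    inner_to_outer = []
    while node is not None:
        inner_to_outer.append(node[0])
        node = node[1]
    return "".join(reversed(inner_to_outer)) + middle + "".join(inner_to_outer)
```

For example, `longest_palindromic_subsequence("character")` returns `"carac"`, matching the walkthrough. Old chain nodes stay alive only while some table entry or the best-so-far record still references them.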

As a classic memory/CPU tradeoff, you can preprocess the list once in O(n) time to generate an O(n)-sized hash of arrays recording where each sequence element appears. This lets you scan for "improve mirrored subsequence with this pairing" without having to consider the whole string, which should generally be a major CPU saving for longer strings.
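A minimal sketch of that preprocessing step, assuming Python and 0-based positions. The helper names `build_positions` and `rightmost_between` are hypothetical. The first builds the hash of arrays in one pass; the second answers the query the scan needs (the rightmost occurrence of an element strictly between two positions) with a binary search instead of a walk over the string.

```python
from bisect import bisect_left
from collections import defaultdict


def build_positions(seq):
    """Map each element to the sorted list of positions where it occurs."""
    positions = defaultdict(list)
    for idx, item in enumerate(seq):
        positions[item].append(idx)  # indices arrive in increasing order
    return positions


def rightmost_between(positions, item, lo, hi):
    """Largest j with lo < j < hi at which item occurs, or -1 if none."""
    occ = positions.get(item, [])
    p = bisect_left(occ, hi) - 1  # last occurrence strictly below hi
    if p >= 0 and occ[p] > lo:
        return occ[p]
    return -1
```

With `pos = build_positions("character")`, the call `rightmost_between(pos, "a", 2, 5)` returns `4`, the partner of the 'a' at 0-based position 2.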

The first solution in @Luiz Rodrigo's question is wrong: the Longest Common Subsequence (LCS) of a string and its reverse is not necessarily a palindrome.

Example: for the string CBACB, CAB is an LCS of the string and its reverse, and it is obviously not a palindrome. There is a way, however, to make it work. After an LCS of the string and its reverse is built, take its left half (including the middle character if the LCS has odd length) and complement it on the right with the reversed left half (not including the middle character if the LCS has odd length). The result is obviously a palindrome, and it can be trivially proven that it is a subsequence of the string.

For the LCS above, the palindrome built this way is CAC.
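The construction can be sketched as below. The function names (`lcs`, `palindrome_from_lcs`, `is_subsequence`) are my own, and the LCS routine is the textbook full-table version, so this sketch uses O(n^2) memory and only illustrates the mirroring step. Which LCS the routine returns depends on how ties are broken, so the result for CBACB is not necessarily CAC, but it should still be a palindromic subsequence of length 3.

```python
def lcs(a, b):
    """Return one longest common subsequence of a and b (full DP table)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Backtrack from the bottom-right corner to recover one LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def palindrome_from_lcs(s):
    """Mirror the left half of an LCS of s and its reverse."""
    common = lcs(s, s[::-1])
    half = (len(common) + 1) // 2            # includes the middle if odd
    left = common[:half]
    mirrored = left[: len(common) // 2][::-1]  # drops the middle if odd
    return left + mirrored


def is_subsequence(sub, s):
    """True if sub can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in sub)
```

A quick check such as `is_subsequence(palindrome_from_lcs("CBACB"), "CBACB")` can be used to confirm the result on a given input.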
