Total number of palindromic subsequences in a string

你的背包 2021-01-31 12:47

The question is as follows:

For every string given as input, you need to tell the number of subsequences of it that are palindromes (need not necessarily be distinct). N

3 answers
  •  日久生厌
    2021-01-31 13:26

    [EDIT 19/10/2015: An anonymous reviewer pointed out a problem with the formula, which prompted me to notice another, even bigger mistake... Now fixed.]

    I now see how to drop the solution time down to O(n^2). I'll leave my other answer up in case it's interesting as a stepping-stone to this one. Note: This is (also) only a solution to the first part of the problem; I see no way to efficiently count only distinct palindromic subsequences (PS).

    Instead of counting the number of PS that begin and end at exactly the positions i and j, let's count how many begin at or after i and end at or before j. Call this g(i, j).

    We can try to write g(i, j) = g(i, j-1) + g(i+1, j) + (x[i] == x[j])*g(i+1, j-1) for the case when j > i. But this doesn't quite work, because the first two terms will double-count any PS that begin after i and end before j.

    The key insight is to notice that we can easily calculate the number of PS that begin or end at some exact position by subtracting off other values of g(), and perhaps adding yet more values of g() back on to compensate for double-counting. For example, the number of PS that begin at exactly i and end at exactly j is g(i, j) - g(i+1, j) - g(i, j-1) + g(i+1, j-1): the last term corrects for the fact that both the second and third terms count all g(i+1, j-1) PS that begin after i and end before j.
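
As a sanity check on the inclusion-exclusion identity above, here is a minimal brute-force sketch. It is not part of the original answer: the helper names (`palindromic_index_sets`, `g_brute`, `exact_brute`, `exact_via_g`) are hypothetical, indices are 0-based rather than the 1-based positions used in the text, and the enumeration is exponential, so it is only meant for very short strings.

```python
from itertools import combinations

def palindromic_index_sets(x):
    """Yield every non-empty increasing index tuple whose characters form a palindrome."""
    n = len(x)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            chars = [x[p] for p in idx]
            if chars == chars[::-1]:
                yield idx

def g_brute(x, i, j):
    """Count PS that begin at or after i and end at or before j (0-based, inclusive).

    Returns 0 when j < i, because no index tuple can satisfy both bounds.
    """
    return sum(1 for idx in palindromic_index_sets(x)
               if idx[0] >= i and idx[-1] <= j)

def exact_brute(x, i, j):
    """Count PS that begin at exactly i and end at exactly j."""
    return sum(1 for idx in palindromic_index_sets(x)
               if idx[0] == i and idx[-1] == j)

def exact_via_g(x, i, j):
    """The same count, obtained only from values of g() by inclusion-exclusion."""
    return (g_brute(x, i, j) - g_brute(x, i + 1, j)
            - g_brute(x, i, j - 1) + g_brute(x, i + 1, j - 1))

print(exact_via_g("aba", 0, 2))  # 2: the subsequences "aa" and "aba"
```

For "aba" the four g() values are 5, 2, 2 and 1, so the identity gives 5 - 2 - 2 + 1 = 2, matching the direct count.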

    Every PS that begins at or after i and ends at or before j is in exactly 1 of 4 categories:

    1. It begins after i, and ends before j.
    2. It begins at i, and ends before j.
    3. It begins after i, and ends at j.
    4. It begins at i, and ends at j.

    g(i+1, j) counts all PS in category 1 or 3, and g(i, j-1) counts all PS in category 1 or 2, so their sum g(i+1, j) + g(i, j-1) counts all PS in category 2 or 3 once each, and all PS in category 1 twice. Since g(i+1, j-1) counts all PS in category 1 only, subtracting this off to get g(i+1, j) + g(i, j-1) - g(i+1, j-1) gives the total number of PS in category 1, 2 and 3. The remaining PS are those in category 4. If x[i] != x[j] then there are no PS in this category; otherwise, there are exactly as many as there are PS that begin at or after i+1 and end at or before j-1, namely g(i+1, j-1), plus one extra for the 2-character sequence x[i]x[j]. [EDIT: Thanks to commenter Tuxdude for 2 fixes here!]
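
The four-category argument can be checked on a small input. The sketch below is illustrative only: `palindromic_index_sets`, `category` and `category_counts` are hypothetical helpers, indices are 0-based, and everything is enumerated by brute force.

```python
from collections import Counter
from itertools import combinations

def palindromic_index_sets(x, i, j):
    """Yield every increasing index tuple inside [i, j] whose characters form a palindrome."""
    for k in range(1, j - i + 2):
        for idx in combinations(range(i, j + 1), k):
            chars = [x[p] for p in idx]
            if chars == chars[::-1]:
                yield idx

def category(idx, i, j):
    """Return the category (1 to 4) of a PS, following the list in the text."""
    begins_at_i = idx[0] == i
    ends_at_j = idx[-1] == j
    if not begins_at_i and not ends_at_j:
        return 1
    if begins_at_i and not ends_at_j:
        return 2
    if not begins_at_i and ends_at_j:
        return 3
    return 4

def category_counts(x, i, j):
    """Tally the PS inside [i, j] by category."""
    return Counter(category(idx, i, j) for idx in palindromic_index_sets(x, i, j))

counts = category_counts("abca", 0, 3)
print(sorted(counts.items()))  # [(1, 2), (2, 1), (3, 1), (4, 3)]
```

For "abca" with i = 0 and j = 3, category 1 holds "b" and "c", so g(i+1, j-1) = 2, and category 4 holds "aa", "aba" and "aca", which is g(i+1, j-1) + 1 = 3 as the argument predicts.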

    With this in hand, we can express g() in a way that reduces what was the quadratic-time case of f() to constant time:

    g(i, i) = 1 (i.e. when j = i)
    g(i, i+1) = 2 + (x[i] == x[i+1]) (i.e. 3 iff adjacent chars are identical, otherwise 2)
    g(i, j) = 0 when j < i (this new boundary case is needed)
    g(i, j) = g(i+1, j) + g(i, j-1) - g(i+1, j-1) + (x[i] == x[j])*(g(i+1, j-1)+1) when j >= i+2
    

    The final answer is now simply g(1, n), where n is the length of the string and positions are numbered from 1.
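
One way the recurrence might be turned into code is sketched below. This is an assumed implementation, not the answerer's own: the function name `count_palindromic_subsequences` is hypothetical, indices are 0-based (so the result is g(0, n-1) rather than g(1, n)), and the g(i, i+1) case is not special-cased because the general formula reproduces it once entries with j < i are left at 0.

```python
def count_palindromic_subsequences(x):
    """Count palindromic subsequences of x (not necessarily distinct) in O(n^2) time."""
    n = len(x)
    if n == 0:
        return 0
    # g[i][j] = number of PS that begin at or after i and end at or before j.
    # Entries with j < i are never written and stay 0 (the extra boundary case).
    g = [[0] * n for _ in range(n)]
    # Fill rows from the bottom up so that row i+1 is ready when row i needs it.
    for i in range(n - 1, -1, -1):
        g[i][i] = 1
        for j in range(i + 1, n):
            inner = g[i + 1][j - 1]  # category 1; equals 0 when j == i + 1
            g[i][j] = g[i + 1][j] + g[i][j - 1] - inner  # categories 1, 2 and 3
            if x[i] == x[j]:
                g[i][j] += inner + 1  # category 4, including the pair x[i]x[j]
    return g[0][n - 1]

print(count_palindromic_subsequences("aba"))  # 5: a, b, a, aa, aba
print(count_palindromic_subsequences("aaa"))  # 7: every non-empty subsequence
```

The table uses O(n^2) memory. Since each row depends only on itself and the row below it, two rows would be enough if memory matters.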
