Convert string to palindrome with fewest number of insertions

落花浮王杯 submitted on 2019-12-12 18:20:19

Question


This is a question from https://www.dailycodingproblem.com/:

Given a string, find the palindrome that can be made by inserting the fewest number of characters as possible anywhere in the word. If there is more than one palindrome of minimum length that can be made, return the lexicographically earliest one (the first one alphabetically).

For example, given the string "race", you should return "ecarace", since we can add three letters to it (which is the smallest amount to make a palindrome). There are seven other palindromes that can be made from "race" by adding three letters, but "ecarace" comes first alphabetically.

As another example, given the string "google", you should return "elgoogle".

It is similar to this SO question and this GeeksforGeeks post, but not the same: neither of them explains the recurrence, as if the solution had been plucked out of thin air, and neither reconstructs the palindrome, let alone the lexicographically earliest one.

After some thinking, my understanding is as follows:

Observe that for any string s[i..j], if s[i] == s[j], then the number of insertions required to make it a palindrome is the same as the number of insertions required to make s[i+1..j-1] a palindrome.

If, however, s[i] != s[j], then we may convert s[i..j-1] to a palindrome and then insert s[j] at the beginning, or convert s[i+1..j] to a palindrome and insert s[i] at the end. Since we are looking for the fewest number of insertions, we will choose the minimum of the two options. The number of insertions is one more than the number of insertions required for the chosen subproblem (for adding a character at the beginning or at the end).
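To make the recurrence concrete, here is a minimal sketch in Scala that only counts the insertions and does not build the palindrome. The names `MinInsertions`, `minInsertions` and `dp` are made up for illustration, and the table is filled by increasing substring length so that every cell it reads has already been computed.

```scala
object MinInsertions {
  // Hypothetical helper: returns only the number of insertions, not the palindrome.
  def minInsertions(s: String): Int = {
    val n = s.length
    if (n == 0) 0
    else {
      // dp(i)(j) = fewest insertions that turn s[i..j] into a palindrome.
      // Cells with i >= j stay 0: empty and one-character strings are palindromes.
      val dp = Array.ofDim[Int](n, n)
      for (len <- 2 to n; i <- 0 to n - len) {
        val j = i + len - 1
        dp(i)(j) =
          if (s(i) == s(j)) dp(i + 1)(j - 1)              // ends match: no insertion needed here
          else math.min(dp(i + 1)(j), dp(i)(j - 1)) + 1  // mirror one end, keep the cheaper option
      }
      dp(0)(n - 1)
    }
  }
}
```

For "race" this should give 3 and for "google" 2, which agrees with the lengths of the expected answers "ecarace" and "elgoogle".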

How do I reconstruct the lexicographically earliest solution?


Answer 1:


First, let's answer "how do I reconstruct the solution", then focus on ordering. Assuming you store the number of insertions in a 2D matrix insertions[start][stop], you just need to retrace your steps, "collecting" the inserted characters as you go. We'll need a new array to store our output string, of length equal to our starting string plus the minimal number of insertions. We'll also keep two indices, pointing to the next available spots from the front and from the back of that array.

Start by comparing the first and last letters of the current substring, and if equal assign the output string both of those, in the next available positions from the front and back respectively. For example, if we have FYRF as our current substring, we'll assign our output string F..F, where . are undetermined characters. Our substring then becomes s[i+1..j-1] or YR.

If the two characters do not match, we'll compare our records in insertions[i+1][j] and insertions[i][j-1], to see which is smaller (at least one of them will be exactly one less than insertions[i][j]). If they're equal, just pick one (we'll return to this later). Assign the character in our output string which corresponds to the letter of the substring we duplicated / inserted, at the next available front and back indices into the output string. That is, in the case JLL, if we decide to add a J for JLLJ, we'd take the substring s[i+1..j], so we'd store J and J in our output string J..J. If our output string already contained AR....RA, we'd have stored ARJ..JRA instead. We repeat this entire process until all characters are assigned.

Now, to make it lexicographically ordered. In the case from the previous paragraph where insertions[i+1][j] and insertions[i][j-1] are equal, we shouldn't pick one of them at random. Instead, we should compare s[i] and s[j] lexicographically: if s[i] comes first, insert s[i] into the output string and proceed on insertions[i+1][j]. Otherwise, insert s[j] and proceed on insertions[i][j-1]. This gives us the lexicographically earliest string of all available options.
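One way to cross-check the tie-breaking is a small sketch that sidesteps it: instead of comparing single characters, it builds the whole candidate palindrome for each substring and keeps the shorter one, or the alphabetically earlier one when the lengths are equal. This is not the table-walking reconstruction described above but a swapped-in memoised recursion. The names `EarliestPalindrome`, `earliestPalindrome` and `best` are made up for illustration. It stores one string per substring, so it costs noticeably more memory and is only meant for short inputs.

```scala
import scala.collection.mutable

object EarliestPalindrome {
  // Hypothetical helper: memoised recursion over whole candidate strings.
  def earliestPalindrome(word: String): String = {
    val memo = mutable.Map.empty[(Int, Int), String]

    // best(i, j) = a shortest palindrome obtainable from word[i..j] by insertions,
    // choosing the alphabetically earliest among the shortest ones.
    def best(i: Int, j: Int): String =
      if (i > j) ""
      else if (i == j) word(i).toString
      else memo.get((i, j)) match {
        case Some(p) => p
        case None =>
          val p =
            if (word(i) == word(j)) word(i).toString + best(i + 1, j - 1) + word(i)
            else {
              // Mirror the first character around word[i+1..j] ...
              val viaFirst = word(i).toString + best(i + 1, j) + word(i)
              // ... or mirror the last character around word[i..j-1].
              val viaLast = word(j).toString + best(i, j - 1) + word(j)
              if (viaFirst.length != viaLast.length) {
                if (viaFirst.length < viaLast.length) viaFirst else viaLast
              } else if (viaFirst.compareTo(viaLast) <= 0) viaFirst
              else viaLast
            }
          memo((i, j)) = p
          p
      }

    best(0, word.length - 1)
  }
}
```

On the two examples from the question this should return "ecarace" for "race" and "elgoogle" for "google", so it can serve as a reference when testing a table-based reconstruction.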




Answer 2:


OP here: @dillon-davis' answer is correct (upvoted), although I had figured it out myself by then. I've already explained the basic algorithm in the question, and @dillon-davis has explained the reconstruction, so here's the working code in Scala for completeness' sake.

import scala.annotation.tailrec

def makePalindromeByFewestEdits(word: String): String = {
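    val n = word.length
    // An empty word is already a palindrome; dp(0)(n - 1) below would be out of bounds.
    if (n == 0) return word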
    val dp = Array.ofDim[Int](n, n)

    for (window <- 1 until n)
      (0 until n)
        .map(start => (start, start + window))
        .takeWhile(_._2 < n)
        .foreach {
          case (start, end) if word(start) == word(end) =>
            dp(start)(end) = dp(start + 1)(end - 1)
          case (start, end) =>
            dp(start)(end) = math.min(dp(start + 1)(end), dp(start)(end - 1)) + 1
        }

    val minInsertions = dp(0)(n - 1)
    val palindrome = Array.ofDim[Char](n + minInsertions)

    @tailrec
    def reconstruct(start: Int, end: Int, count: Int, offset: Int): String = {
      if (count == 0) {
        // No insertions remain, so word[start..end] is already a palindrome:
        // copy it into the middle of the output, starting at index 'offset'.
        Array.copy(word.toCharArray, start, palindrome, offset, end - start + 1)
        palindrome.mkString
      } else {
        val (s, e, c, ch) = if (word(start) == word(end))
          (start + 1, end - 1, count, word(start))
        else if (dp(start + 1)(end) < dp(start)(end - 1) ||
          (dp(start + 1)(end) == dp(start)(end - 1) && word(start) < word(end))
        )
          (start + 1, end, count - 1, word(start))
        else
          (start, end - 1, count - 1, word(end))

        palindrome(offset) = ch
        palindrome(palindrome.length - 1 - offset) = ch
        reconstruct(s, e, c, offset + 1)
      }
    }

    reconstruct(0, n - 1, minInsertions, 0)
}


Source: https://stackoverflow.com/questions/55178669/convert-string-to-palindrome-with-fewest-number-of-insertions
