How does Duval's algorithm handle odd-length strings?

南笙 2021-01-28 10:38

Finding the lexicographically minimal string rotation is a well-known problem, for which a linear-time algorithm was proposed by Jean Pierre Duval in 1983. This blog post is pro

2 Answers
  •  逝去的感伤
    2021-01-28 10:56

    OP here: I accepted ruakh's answer as it pertains to my question, but I wanted to provide my own explanation for others that might stumble across this post trying to understand Duval's algorithm.

    Problem:

    Lexicographically least circular substring is the problem of finding the rotation of a string possessing the lowest lexicographical order of all such rotations. For example, the lexicographically minimal rotation of "bbaaccaadd" would be "aaccaaddbb".
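
As a reference point for the problem statement, here is a minimal brute-force sketch that builds every rotation and takes the smallest. It is quadratic and only meant for checking answers on small inputs; the names `BruteForceRotation`, `rotations` and `minRotation` are made up for this illustration.

```scala
object BruteForceRotation {
  // All n rotations of s: rotation k moves the first k characters to the end.
  def rotations(s: String): Seq[String] =
    s.indices.map(k => s.drop(k) + s.take(k))

  // O(n^2) reference answer: the smallest rotation in lexicographic order.
  def minRotation(s: String): String =
    if (s.isEmpty) s else rotations(s).min
}
```

For the example above, `BruteForceRotation.minRotation("bbaaccaadd")` yields "aaccaaddbb".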

    Solution:

    An O(n) time algorithm was proposed by Jean Pierre Duval (1983).

    Given two indices i and j, Duval's algorithm compares string segments of length j - i starting at i and j (called a "duel"). If index + j - i is greater than the length of the string, the segment is formed by wrapping around.

    For example, consider s = "baabbaba", i = 5 and j = 7. Since j - i = 2, the first segment starting at i = 5 is "ab". The second segment starting at j = 7 is constructed by wrapping around, and is also "ab". If the strings are lexicographically equal, like in the above example, we choose the one starting at i as the winner, which is i = 5.
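
A single duel, as described above, could be sketched like this. The helper names `segment` and `duel` are assumptions chosen for the illustration, and the wrap-around is done with a modulo on the index rather than by slicing.

```scala
object Duel {
  // Segment of `len` characters starting at `start`, wrapping past the end of s.
  def segment(s: String, start: Int, len: Int): String =
    (0 until len).map(k => s((start + k) % s.length)).mkString

  // Returns the winning index; on a tie the earlier index i wins.
  def duel(s: String, i: Int, j: Int): Int = {
    val len = j - i
    if (segment(s, i, len) <= segment(s, j, len)) i else j
  }
}
```

With s = "baabbaba", both `Duel.segment(s, 5, 2)` and `Duel.segment(s, 7, 2)` are "ab", so `Duel.duel(s, 5, 7)` returns 5.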

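    The above process is repeated until we have a single winner. If the input string is of odd length, the last character wins without a comparison in the first iteration. The same applies in any later iteration that starts with an odd number of winners: the last one advances without a duel.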
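
To see how an odd-length input plays out, the following sketch records the surviving indices after each round. `Rounds`, `round` and `trace` are hypothetical names for this illustration; the duel rule is the one described above, with ties going to the earlier index.

```scala
object Rounds {
  // Segment of `len` characters starting at `start`, wrapping past the end of s.
  private def segment(s: String, start: Int, len: Int): String =
    (0 until len).map(k => s((start + k) % s.length)).mkString

  // One round: pair up adjacent surviving indices; an unpaired last index gets a bye.
  def round(s: String, winners: Seq[Int]): Seq[Int] =
    winners.grouped(2).map {
      case Seq(i, j) =>
        val len = j - i
        if (segment(s, i, len) <= segment(s, j, len)) i else j
      case unpaired => unpaired.head
    }.toList

  // Surviving indices after each round, until a single index is left.
  def trace(s: String, winners: Seq[Int]): List[Seq[Int]] =
    if (winners.size <= 1) Nil
    else {
      val next = round(s, winners)
      next :: trace(s, next)
    }
}
```

For the odd-length string "abaab", `Rounds.trace("abaab", 0 until 5)` gives the rounds (0, 2, 4), (2, 4), (2): index 4 advances without a duel in the first round, and the final winner 2 corresponds to the rotation "aabab".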

    Time complexity:

    Each iteration performs O(n) character comparisons in total: the first iteration runs n/2 duels on segments of length 1, the second runs up to n/4 duels on segments of length 2, and so on, until the last iteration runs a single duel. In general, the segment lengths j - i of the duels in one iteration sum to at most n. Since the number of winners is halved each time, there are about log2(n) iterations, which makes the procedure described here O(n log(n)) rather than linear.

    Space complexity is O(n), since we have to store n/2 winners after the first iteration, n/4 after the second, and so on. (Wikipedia claims this algorithm uses constant space; I don't understand how.)
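
On the constant-space remark: the sketch below uses a different technique from the duel procedure above, namely the Lyndon-factorization scan that is commonly presented under Duval's name, run over the doubled string s + s. It is offered as an assumption about what the constant-space claim refers to, not as something stated in this answer. The doubled string is never built (position p reads s(p % n)), so apart from the input it keeps only a few integer indices. `LeastRotation`, `leastRotationStart` and `leastRotation` are names made up for this illustration.

```scala
object LeastRotation {
  // Start index of the lexicographically least rotation of s.
  def leastRotationStart(s: String): Int = {
    val n = s.length
    // Character at position p of the virtual doubled string s + s.
    def at(p: Int): Char = s.charAt(p % n)
    var i = 0
    var ans = 0
    while (i < n) {
      ans = i // start of the Lyndon factor currently being scanned
      var j = i + 1
      var k = i
      // Extend the scan while the text stays a repetition of a Lyndon word
      // (possibly followed by a prefix of it).
      while (j < 2 * n && at(k) <= at(j)) {
        if (at(k) < at(j)) k = i else k += 1
        j += 1
      }
      // Skip over the complete copies of the factor just found.
      while (i <= k) i += j - k
    }
    ans
  }

  def leastRotation(s: String): String = {
    val start = leastRotationStart(s)
    s.drop(start) + s.take(start)
  }
}
```

On "bbaaccaadd" this returns start index 2, i.e. the rotation "aaccaaddbb", matching the example in the problem statement.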

    Here's a Scala implementation; feel free to convert to your favorite programming language.

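    import scala.annotation.tailrec

    def lexicographicallyMinRotation(s: String): String = {
      // Each round pairs up the surviving start indices and keeps one winner per pair.
      @tailrec
      def duel(winners: Seq[Int]): String = {
        if (winners.size == 1) s"${s.slice(winners.head, s.length)}${s.take(winners.head)}"
        else {
          val newWinners: Seq[Int] = winners
            .sliding(2, 2)
            .map {
              case Seq(x, y) =>
                val range = y - x
                Seq(x, y)
                  .map { i =>
                    // Take `range` characters starting at i, wrapping around if needed.
                    val segment = if (s.isDefinedAt(i + range - 1)) s.slice(i, i + range)
                    else s"${s.slice(i, s.length)}${s.take(i + range - s.length)}"
                    (i, segment)
                  }
                  // On a tie, the earlier index wins.
                  .reduce((a, b) => if (a._2 <= b._2) a else b)
                  ._1
              case xs => xs.head // an unpaired index advances without a duel
            }
            .toSeq
          duel(newWinners)
        }
      }

      // An empty string has no indices to duel over, so return it unchanged.
      if (s.isEmpty) s else duel(s.indices)
    }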
    
