How to find the minimum number of operation(s) to make the string balanced?

后端 未结 5 1618
夕颜
夕颜 2020-12-29 13:27

From Codechef:

A string is considered balanced if and only if all the characters occur in it equal number of times.

You are given a strin

相关标签:
5条回答
  • 2020-12-29 13:37
    if __name__ == "__main__":
      for j in range(int(input())):
        S = str(input())
        N = len(S)
        A = [0]*27
        for c in S:
          A[ord(c) - 65] = A[ord(c) - 65] + 1
        A = sorted(A,reverse=True)
        minSwap = N + 1
        for i in range(1,27):
          if N%i == 0:
            temp = N//i
            tempSwap = 0
            for f in range(i):
              if temp > A[f]:
                tempSwap = tempSwap + temp - A[f]
            if tempSwap <= minSwap:
               minSwap = tempSwap
        if minSwap == N+1:
            minSwap = 0
        print(minSwap)
    
    0 讨论(0)
  • 2020-12-29 13:40

    Following code implement the solution, in Java, with unit test.

    The algorithm is almost identical as in @ruakh's answer, if not identical.


    Code

    BalanceString.java

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    
    /**
     * Assume string only contains A-Z, the 26 uppercase letter,
     * <p>given a string, you can replace a char with another char from the 26 letter,
     * <p>figure out the minimum replacement required to make the string balance,
     * <p>which means each char in the string occurs the same time,
     *
     * @author eric
     * @date 2/2/19 8:54 PM
     */
    public class BalanceString {
        private final char minChar;
        private final char maxChar;
        private final int distinctChars; // total distinct char count,
    
        public static final BalanceString EN_UPPER_INSTANCE = new BalanceString('A', 'Z');
    
        public BalanceString(char minChar, char maxChar) {
            this.minChar = minChar;
            this.maxChar = maxChar;
            this.distinctChars = maxChar - minChar + 1;
    
            if (distinctChars <= 0)
                throw new IllegalArgumentException("invalid range of chars: [" + minChar + ", " + maxChar + "]");
        }
    
        /**
         * Check minimal moves needed to make string balanced.
         *
         * @param str
         * @return count of moves
         */
        public int balanceCount(String str) {
            // System.out.printf("string to balance:\t%s\n", str);
            int len = str.length(); // get length,
            if (len <= 2) return 0; // corner cases,
    
            Map<Character, Integer> coMap = figureOccurs(str); // figure occurrences,
            Integer[] occurs = sortOccursReversely(coMap); // reversely order occurrences,
    
            int m = coMap.size(); // distinct char count,
            int maxN = (len < distinctChars ? len : distinctChars); // get max possible distinct char count, included,
    
            int smallestMoves = Integer.MAX_VALUE; // smallest moves, among all possible n,
    
            // check each possible n, and get its moves,
            for (int n = 1; n <= maxN; n++) {
                if (len % n == 0) {
                    int moves = figureMoves(len, coMap, occurs, m, n);
                    if (moves < smallestMoves) smallestMoves = moves;
                }
            }
    
            return smallestMoves;
        }
    
        /**
         * Figure occurs for each char.
         *
         * @param str
         * @return
         */
        protected Map<Character, Integer> figureOccurs(String str) {
            Map<Character, Integer> coMap = new HashMap<>();
            for (char c : str.toCharArray()) {
                if (c < minChar || c > maxChar)
                    throw new IllegalArgumentException(c + " is not within range 'A-Z'");
    
                if (!coMap.containsKey(c)) coMap.put(c, 1);
                else coMap.put(c, coMap.get(c) + 1);
            }
    
            return coMap;
        }
    
        /**
         * Get reverse sorted occurrences.
         *
         * @param coMap
         * @return
         */
        protected Integer[] sortOccursReversely(Map<Character, Integer> coMap) {
            Integer[] occurs = new Integer[coMap.size()];
    
            coMap.values().toArray(occurs);
            Arrays.sort(occurs, Collections.reverseOrder());
    
            return occurs;
        }
    
        /**
         * Figure moves needed to balance.
         *
         * @param len   length of string,
         * @param coMap
         * @param m     original distinct elements count,
         * @param n     new distinct elements count,
         * @return count of moves,
         */
        protected int figureMoves(int len, Map<Character, Integer> coMap, Integer[] occurs, int m, int n) {
            int avgOccur = len / n; // average occurrence,
            int moves = 0;
    
            if (n == m) { // distinct chars don't change,
                for (Integer occur : occurs) {
                    if (occur <= avgOccur) break;
                    moves += (occur - avgOccur);
                }
            } else if (n < m) { // distinct chars decrease,
                for (int i = 0; i < n; i++) moves += Math.abs(occurs[i] - avgOccur); // for elements kept,
                for (int i = n; i < m; i++) moves += occurs[i]; // for elements to replace,
                moves >>= 1;
            } else { // distinct chars increase,
                for (int i = 0; i < occurs.length; i++) moves += Math.abs(occurs[i] - avgOccur); // for existing elements,
                moves += ((n - m) * avgOccur); // for new elements,
                moves >>= 1;
            }
    
            return moves;
        }
    
        public char getMinChar() {
            return minChar;
        }
    
        public char getMaxChar() {
            return maxChar;
        }
    
        public int getDistinctChars() {
            return distinctChars;
        }
    }
    

    BalanceStringTest.java
    (Unit test, via TestNG)

    import org.testng.Assert;
    import org.testng.annotations.Test;
    
    /**
     * BalanceString test.
     *
     * @author eric
     * @date 2/2/19 9:36 PM
     */
    public class BalanceStringTest {
        private BalanceString bs = BalanceString.EN_UPPER_INSTANCE;
    
        @Test
        public void test() {
            // n < m case,
            Assert.assertEquals(bs.balanceCount("AAAABBBC"), 1); // e.g 1A -> B,
            Assert.assertEquals(bs.balanceCount("AAAAABBC"), 2); // e.g 1A -> B, 1C -> B,
            Assert.assertEquals(bs.balanceCount("AAAAAABC"), 2); // e.g 1B -> A, 1C -> A,
            Assert.assertEquals(bs.balanceCount("AAAAAAAB"), 1); // e.g 1B -> A,
    
            // n > m case,
            Assert.assertEquals(bs.balanceCount("AAAABBBBCCCCDDDDEEEEAAAA"), 4); // add 1 new char, e.g change 4 A to 4 F,
            Assert.assertEquals(bs.balanceCount(genIncSeq(10)), 15); // A-J, 10 distinct chars, 55 in length; solved by add 1 new char, need 15 steps,
    
            // n == m case,
            Assert.assertEquals(bs.balanceCount(genIncSeq(3)), 1); // A-C, 3 distinct chars, 6 in length; symmetric, solved with same distinct chars, need 1 steps,
            Assert.assertEquals(bs.balanceCount(genIncSeq(11)), 15); // A-K, 11 distinct chars, 66 in length; symmetric, solved with same distinct chars, need 15 steps,
    
            // n < m, or n > m case,
            Assert.assertEquals(bs.balanceCount("ABAC"), 1); // e.g 1A -> B, or 1A -> D,
        }
    
        // corner case,
        @Test
        public void testCorner() {
            // m <= 2,
            Assert.assertEquals(bs.balanceCount(""), 0);
            Assert.assertEquals(bs.balanceCount("A"), 0);
            Assert.assertEquals(bs.balanceCount("AB"), 0);
            Assert.assertEquals(bs.balanceCount("AA"), 0);
    
            /*------ m == n == distinctChars ------*/
            String mndBalanced = genMndBalancedSeq(); // each possible char occurs exactly once, already balanced,
            Assert.assertEquals(mndBalanced.length(), bs.getDistinctChars());
            Assert.assertEquals(bs.balanceCount(mndBalanced), 0); // no need change,
    
            char lastChar = mndBalanced.charAt(mndBalanced.length() - 1);
            String mndOneDup = mndBalanced.replace(lastChar, (char) (lastChar - 1)); // (distinctChars -2) chars occur exactly once, one occurs twice, one is missing, thus it's one step away to balance,
            Assert.assertEquals(mndOneDup.length(), bs.getDistinctChars());
            Assert.assertEquals(bs.balanceCount(mndOneDup), 1); // just replace the duplicate char with missing char,
        }
    
        // invalid input,
        @Test(expectedExceptions = IllegalArgumentException.class)
        public void testInvalidInput() {
            Assert.assertEquals(bs.balanceCount("ABAc"), 1);
        }
    
        // invalid char range, for constructor,
        @Test(expectedExceptions = IllegalArgumentException.class)
        public void testInvalidRange() {
            new BalanceString('z', 'a');
        }
    
        /**
         * Generate a string, with first char occur once, second twice, third three times, and so on.
         * <p>e.g A, ABB, ABBCCC, ABBCCCDDDD,
         *
         * @param m distinct char count,
         * @return
         */
        private String genIncSeq(int m) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < m; i++) {
                for (int j = 0; j <= i; j++) sb.append((char) (bs.getMinChar() + i));
            }
            return sb.toString();
        }
    
        /**
         * Generate a string that contains each possible char exactly once.
         * <p>For [A-Z], it could be: "ABCDEFGHIJKLMNOPQRSTUVWXYZ".
         *
         * @return
         */
        private String genMndBalancedSeq() {
            StringBuilder sb = new StringBuilder();
            char minChar = bs.getMinChar();
            int distinctChars = bs.getDistinctChars();
    
            for (int i = 0; i < distinctChars; i++) {
                sb.append((char) (minChar + i));
            }
            return sb.toString();
        }
    }
    

    All test cases would pass.


    Complexity

    • Time: O(len) + O(m * lg(m)) + O(m * factorCount)
      • Each sequential scan takes O(len), there are several sequential loop.
      • Sorting of array takes O(m*lg(m)), which is at most O(distinctChars * lg(distinctChars)), thus constant, and only sort once.
      • To figure out moves for each n, takes O(m).
      • The count of n that needs to figure moves, depends on count of divisible numbers for len, in the range [minChar, maxChar].
        This count if kind small & constant too.
    • Space: O(len)
      • Input string need O(len).
      • Counter hashmap need O(m).
      • Sorted occurrence array need O(m).

    Where:

    • len is string length.
    • m is distinct char count in original string
    • distinctChars is distinct char count, e.g 26.
    • maxN max possible distinct char count, included,
    • factorCount divisible number count in range [1, n], by len,
    • minChar min char, e.g A
    • maxChar max char, e.g Z

    And:

    • len >= m
    • m <= distinctChars
    0 讨论(0)
  • 2020-12-29 13:42
    #include <iostream>
    #include <string>
    #include <vector>
    
    int countOps(std::vector<int> &map, int requiredFreq) {
        int countOps = 0, greaterFreq = 0, lesserFreq = 0;
        for (auto a : map) {
            if (a > 0 && a < requiredFreq) {
                lesserFreq =  lesserFreq + abs(requiredFreq - a);
            }
            else if (a > 0 && a > requiredFreq) {
                greaterFreq =  greaterFreq + abs(requiredFreq - a);
            }
        }
        
        countOps = greaterFreq > lesserFreq ? (lesserFreq + (greaterFreq - lesserFreq)) : lesserFreq;
        
        return countOps;
    }
    
    int balanceString(std::string &s, long n){
        
        std::vector<int> map(26, 0);
        int distinctChar = 0;
        int requiredFreq = -1;
        int count = INT_MAX;
        
        // Creating map with frequency and counting distinct chars
        for (char x : s) {
            if (map[x - 'a'] == 0) distinctChar++;
            map[x - 'a']++;
        }
        
        std::sort(std::begin(map), std::end(map), std::greater<int>{});
    
        // If size is multiple of distinctChar
        if (n % distinctChar == 0) {
            requiredFreq = int(n / distinctChar);
            count = countOps(map, requiredFreq);
        }
        else {
            for (int i = 1; i < distinctChar;  i++) {
                if (n % i == 0){
                    requiredFreq = int(n / i);
                    
                    std::vector<int> temp(map.begin(), map.begin() + i);
                    int x = countOps(temp, requiredFreq);
                    count = std::min(count, x);
                }
            }
        }
        
        return count;
    }
    
    int main() {
        std::string s = "aaaaccddefghiii";
        long n = s.size();
        
        if(n <= 1) return 0;
        
        int count = balanceString(s, n);
    
        std::cout << count << std::endl;
        
        return 0;
    }
    
    0 讨论(0)
  • 2020-12-29 13:46

    I don't think you really need dynamic programming here.

    One approach in O(length(S)) time:

    • Iterate over S, build a frequency-map (a mapping from distinct letters A–Z to counts). For your ABCB example, that would be A->1 B->2 C->1 D->0 E->0 ... Z->0, which we can represent as the array [1, 2, 1, 0, 0, ..., 0].
      • We can do this because we don't actually care about the positions of letters at all; there's no real difference between ABCB and ABBC, in that each can be balanced by replacing its C with an A.
    • Sort the array. For your example, that gives [0, 0, ..., 0, 1, 1, 2].
      • We can do this because we don't actually care about which letter was which; there's no real difference between ABCB and ABDB, in that each can be balanced by replacing one of its singleton letters with its other one.
    • Assuming the string is nonempty (since if it's empty then the answer is just 0), the final balanced string will contain at least 1 and at most 26 distinct letters. For each integer i between 1 and 26, determine how many operations you'd need to perform in order to produce a balanced string with i distinct letters:
      • First, check that length(S) is a multiple of i; if not, this isn't possible, so skip to the next integer.
      • Next, compute length(S)/i, the count of each distinct letter in the final balanced string.
      • To count how many operations need to be performed, we're going to examine all the counts that need to be increased. (We could, equivalently, examine the counts that need to be decreased: they'll have to match.)
      • We're only interested in the last i elements of the sorted array. Any elements before that point represent letters that won't occur in the balanced string: either the counts are already zero and will remain so, or they are nonzero but will be decreased to zero. Either way, since we're only tracking increases, we can ignore them.
      • For each of the last i counts that are less than length(S)/i, add the difference. This sum is the number of operations. (Note that, since the counts are sorted, you can exit this inner loop as soon as you get to a count that's greater than or equal to the target count.)
      • You can exit this loop after the first i that's greater than or equal to the number of distinct letters in the original S (aside from values of i that we had to skip because they don't evenly divide length(S)). For example, if length(S) = 100, and the original S has 19 distinct letters, then we only need to consider i as high as 20. (Hat-tip to Eric Wang for suggesting something along these lines.)
    • Return the least of these up-to-26 sums. (Note that you don't actually need to store all the sums; you can just keep track of the minimum.)
    0 讨论(0)
  • 2020-12-29 14:01

    Towards solving this problem ,I think it is also useful to find out all over extra presence( sum of number of elements which present more than one times) of different elements in string

    for ex: in aabbc ,number of elements we have to remove to make presence of each element one is equal to 2 (this is called good string)

    `x=input()
    char=26
    total=0
    lis=[0]*char
    #print(lis)
    
    for i in range(len(x)):
        lis[ord(x[i])-ord('a')]+=1
    #print(lis) 
    
    for i in range(26):
        if(lis[i]>1):
            total=total+(lis[i]-1)
    print(total)        
    

    `

    0 讨论(0)
提交回复
热议问题