Checking if two strings are permutations of each other in Python

旧时难觅i 2020-12-08 16:04

I'm checking if two strings a and b are permutations of each other, and I'm wondering what the ideal way to do this is in Python. From the Zen of

22 answers
  • 2020-12-08 16:25

    How about something like this? It is straightforward and readable, and it handles strings, as the OP asked.

    The complexity is O(n log n), dominated by the two sorted() calls.

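    def checkPermutation(a, b):
        # input: strings a and b
        # return: True if a is a permutation of b, otherwise False

        # Strings of different lengths can never be permutations of each other.
        if len(a) != len(b):
            return False
        # The sorted character lists are equal exactly when a and b
        # contain the same characters with the same counts.
        return sorted(a) == sorted(b)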
    
    # test inputs
    a = 'sRF7w0qbGp4fdgEyNlscUFyouETaPHAiQ2WIxzohiafEGJLw03N8ALvqMw6reLN1kHRjDeDausQBEuIWkIBfqUtsaZcPGoqAIkLlugTxjxLhkRvq5d6i55l4oBH1QoaMXHIZC5nA0K5KPBD9uIwa789sP0ZKV4X6'
    b = 'Vq3EeiLGfsAOH2PW6skMN8mEmUAtUKRDIY1kow9t1vIEhe81318wSMICGwf7Rv2qrLrpbeh8bh4hlRLZXDSMyZJYWfejLND4u9EhnNI51DXcQKrceKl9arWqOl7sWIw3EBkeu7Fw4TmyfYwPqCf6oUR0UIdsAVNwbyyiajtQHKh2EKLM1KlY6NdvQTTA7JKn6bLInhFvwZ4yKKbzkgGhF3Oogtnmzl29fW6Q2p0GPuFoueZ74aqlveGTYc0zcXUJkMzltzohoRdMUKP4r5XhbsGBED8ReDbL3ouPhsFchERvvNuaIWLUCY4gl8OW06SMuvceZrCg7EkSFxxprYurHz7VQ2muxzQHj7RG2k3khxbz2ZAhWIlBBtPtg4oXIQ7cbcwgmBXaTXSBgBe3Y8ywYBjinjEjRJjVAiZkWoPrt8JtZv249XiN0MTVYj0ZW6zmcvjZtRn32U3KLMOdjLnRFUP2I3HJtp99uVlM9ghIpae0EfC0v2g78LkZE1YAKsuqCiiy7DVOhyAZUbOrRwXOEDHxUyXwCmo1zfVkPVhwysx8HhH7Iy0yHAMr0Tb97BqcpmmyBsrSgsV1aT3sjY0ctDLibrxbRXBAOexncqB4BBKWJoWkQwUZkFfPXemZkWYmE72w5CFlI6kuwBQp27dCDZ39kRG7Txs1MbsUnRnNHBy1hSOZvTQRYZPX0VmU8SVGUqzwm1ECBHZakQK4RUquk3txKCqbDfbrNmnsEcjFaiMFWkY3Esg6p3Mm41KWysTpzN6287iXjgGSBw6CBv0hH635WiZ0u47IpUD5mY9rkraDDl5sDgd3f586EWJdKAaou3jR7eYU7YuJT3RQVRI0MuS0ec0xYID3WTUI0ckImz2ck7lrtfnkewzRMZSE2ANBkEmg2XAmwrCv0gy4ExW5DayGRXoqUv06ZLGCcBEiaF0fRMlinhElZTVrGPqqhT03WSq4P97JbXA90zUxiHCnsPjuRTthYl7ZaiVZwNt3RtYT4Ff1VQ5KXRwRzdzkRMsubBX7YEhhtl0ZGVlYiP4N4t00Jr7fB4687eabUqK6jcUVpXEpTvKDbj0JLcLYsneM9fsievUz193f6aMQ5o5fm4Ilx3TUZiX4AUsoyd8CD2SK3NkiLuR255BDIA0Zbgnj2XLyQPiJ1T4fjStpjxKOTzsQsZxpThY9Fvjvoxcs3HAiXjLtZ0TSOX6n4ZLjV3TdJMc4PonwqIb3lAndlTMnuzEPof2dXnpexoVm5c37XQ7fBkoMBJ4ydnW25XKYJbkrueRDSwtJGHjY37dob4jPg0axM5uWbqGocXQ4DyiVm5GhvuYX32RQaOtXXXw8cWK6JcSUnlP1gGLMNZEGeDXOuGWiy4AJ7SH93ZQ4iPgoxdfCuW0qbsLKT2HopcY9dtBIRzr91wnES9lDL49tpuW77LSt5dGA0YLSeWAaZt9bDrduE0gDZQ2yX4SDvAOn4PMcbFRfTqzdZXONmO7ruBHHb1tVFlBFNc4xkoetDO2s7mpiVG6YR4EYMFIG1hBPh7Evhttb34AQzqImSQm1gyL3O7n3p98Kqb9qqIPbN1kuhtW5mIbIioWW2n7MHY7E5mt0'
    
    print(checkPermutation(a, b)) #optional
    
  • 2020-12-08 16:26

    This is a PHP function I wrote about a week ago which checks if two words are anagrams. How would this compare (if implemented the same way in Python) to the other methods suggested? Comments?

    public function is_anagram($word1, $word2) {
        $letters1 = str_split($word1);
        $letters2 = str_split($word2);
        if (count($letters1) == count($letters2)) {
            foreach ($letters1 as $letter) {
                $index = array_search($letter, $letters2);
                if ($index !== false) {
                    unset($letters2[$index]);
                }
                else { return false; }
            }
            return true;
        }
        return false;        
    }
    

    Here's a literal translation to Python of the PHP version (by JFS):

    def is_anagram(word1, word2):
        letters2 = list(word2)
        if len(word1) == len(word2):
            for letter in word1:
                try:
                    del letters2[letters2.index(letter)]
                except ValueError:
                    return False
            return True
        return False
    

    Comments:

        1. The algorithm is O(N**2). Compare it to @namin's version (it is O(N)).
        2. The multiple returns in the function look horrible.
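
For comparison with the O(N**2) translation above, here is a minimal sketch of a linear-time counting approach. @namin's code is not shown on this page, so this is not that version: it uses collections.Counter from the standard library as a stand-in, and the name is_anagram_counter is hypothetical. The O(N) figure assumes average-case constant-time dictionary operations.

```python
from collections import Counter


def is_anagram_counter(word1, word2):
    """Return True if word2 is a rearrangement of word1 (hypothetical helper)."""
    # Counter builds a character -> occurrences mapping in one pass per string,
    # so the comparison is linear in the combined length of the inputs.
    return Counter(word1) == Counter(word2)


print(is_anagram_counter("listen", "silent"))  # True
print(is_anagram_counter("aab", "abb"))        # False
```

Unlike the list-deletion version, no step here searches a list, which is where the quadratic cost comes from.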
    
  • 2020-12-08 16:31

    Your second example won't actually work:

    all(a.count(char) == b.count(char) for char in a)
    

    will only work if b does not contain extra characters not in a. It also does duplicate work if the characters in string a repeat.

    If you only want to know whether the two strings are built from the same set of distinct characters (ignoring how often each one occurs), just do:

    set(a) == set(b)
    

    To correct your second example:

    all(a.count(char) == b.count(char) for char in set(a) | set(b))
    

    set objects overload the bitwise OR operator (|) to return the union of both sets, so the loop visits each distinct character of either string exactly once.

    That said, the sorted() method is much simpler and more intuitive, and would be what I would use.
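
A small sketch of the difference between the three checks discussed in this answer. The function names same_counts_one_sided and same_counts are hypothetical and only wrap the expressions quoted above so they can be run side by side.

```python
def same_counts_one_sided(a, b):
    # The check as quoted: only characters that occur in a are examined.
    return all(a.count(char) == b.count(char) for char in a)


def same_counts(a, b):
    # Corrected check: examine each distinct character of either string once.
    return all(a.count(char) == b.count(char) for char in set(a) | set(b))


a, b = "abc", "abcd"
print(same_counts_one_sided(a, b))  # True, although b has an extra "d"
print(same_counts(a, b))            # False

print(set("aab") == set("abb"))     # True: same distinct characters
print(same_counts("aab", "abb"))    # False: the counts differ
```

Note that each count() call scans a whole string, so the corrected check still costs roughly (number of distinct characters) × (string length).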

  • 2020-12-08 16:32

    This version is faster than any example presented so far, except that it is 20% slower than sorted(x) == sorted(y) for short strings. It depends on the use case, but a 20% performance gain is generally not enough to justify complicating the code with different versions for short and long strings (as in @patros's answer).

    It doesn't use len, so it accepts any iterable. It therefore works even for data that does not fit in memory: for example, given two big text files with many repeated lines, it answers whether the files contain the same lines (in any order).

    def isanagram(iterable1, iterable2):
        d = {}
        get = d.get
        for c in iterable1:
            d[c] = get(c, 0) + 1
        try:
            for c in iterable2:
                d[c] -= 1
            # True only if every count is back to zero (Python 3: values(), not itervalues())
            return not any(d.values())
        except KeyError:
            return False
    

    It is unclear why this version is faster than the defaultdict one (@namin's) for large iterable1 (tested on a 25MB thesaurus).

    If we replace get in the loop with try: ... except KeyError, it runs 2 times slower for short strings, i.e. when there are few duplicates.
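
To illustrate the point about arbitrary iterables, here is a minimal sketch that compares two line streams. It repeats the counting logic of the function in this answer, written for Python 3, and uses io.StringIO objects as stand-ins for open text files; the sample contents are made up for the example.

```python
import io


def isanagram(iterable1, iterable2):
    # Same counting logic as the function in this answer, written for Python 3.
    d = {}
    get = d.get
    for c in iterable1:
        d[c] = get(c, 0) + 1
    try:
        for c in iterable2:
            d[c] -= 1  # KeyError if c never appeared in iterable1
        return not any(d.values())
    except KeyError:
        return False


# io.StringIO stands in for two open text files; iterating yields lines.
file1 = io.StringIO("alpha\nbeta\nbeta\ngamma\n")
file2 = io.StringIO("beta\ngamma\nalpha\nbeta\n")
print(isanagram(file1, file2))  # True: same lines, different order

file3 = io.StringIO("alpha\nbeta\n")
file4 = io.StringIO("alpha\nalpha\n")
print(isanagram(file3, file4))  # False: line counts differ
```

Only the dictionary of distinct lines is held in memory, which is why many repeated lines keep the footprint small even when the inputs themselves are large.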
