Creating anagram detector

Question

I'm having trouble getting this anagram function to work. The aim is for the function to take two strings, such as abc and cba, convert them into lists, sort them into alphabetical order, compare the elements of the lists, and print whether they are anagrams or not.

My code is as follows...

def anagram(str1, str2):
    x = str1
    y = str2

    x1 = x.sort()
    y1 = y.sort()

    if (x1) == (y1):
        print("Anagram is True")
    else:
        print("Anagram is False")

str1 = str('abc')
str2 = str('cba')

print(anagram(str1, str2))

Answer 1:

Your problem is that Python strings have no sort() method, so you can't call x.sort() on a str. Try changing:

x1 = x.sort()
y1 = y.sort()

to:

x1 = sorted(x)
y1 = sorted(y)

Answer 2:

You cannot call .sort() on a string, nor should you, because .sort() is a list method that sorts the list in place and returns None. Instead, use sorted(x):

>>> def anagram(str1, str2):
    x1 = sorted(str1)
    y1 = sorted(str2)

    if (x1) == (y1):
        print("Anagram is True")
    else:
        print("Anagram is False")


>>> anagram('abc','bca')
Anagram is True

Answer 3:

The specific issue

x.sort() works in place if x is a list. This means the sort method changes the object's internal state. It also returns None, which is why x1 and y1 do not hold what you expect.

If x is a string, there is no .sort() method as strings are immutable.

I recommend using the sorted() function instead. It accepts any iterable, including a string, and returns a new sorted list of its elements (here, the characters).
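The difference between the two calls can be checked directly in an interpreter. The following is a minimal sketch using only built-ins; the variable names are illustrative and not part of the question's code.

```python
# list.sort() sorts in place and returns None.
letters = list("cba")
result = letters.sort()
print(letters)  # ['a', 'b', 'c']
print(result)   # None

# sorted() accepts any iterable, including a str, and returns a new list.
sorted_letters = sorted("cba")
print(sorted_letters)  # ['a', 'b', 'c']

# str has no sort() method, so calling x.sort() on a string raises AttributeError.
has_sort = hasattr("cba", "sort")
print(has_sort)  # False
```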

The more general issues

There are three more general issues:

Runtime: Because it sorts, this is an O(n log n) solution
Unicode modifiers and compound glyphs
Print: You print the value, but you should return the result instead. Otherwise, how would you test your code?
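For the last point, here is a sketch of what returning the result might look like. It keeps the question's function name; the test function is an illustrative addition, and the same asserts could equally live in a pytest or unittest test.

```python
def anagram(str1: str, str2: str) -> bool:
    """Return True if both strings contain the same characters with the same counts."""
    x1 = sorted(str1)
    y1 = sorted(str2)
    return x1 == y1


def test_anagram() -> None:
    # Because the function returns a bool, it can be checked with plain asserts.
    assert anagram("abc", "cba")
    assert not anagram("abc", "abd")
    assert not anagram("abc", "abcc")


test_anagram()
print(anagram("abc", "cba"))  # True
```

The caller can still print a message based on the returned value if that is what is wanted.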

Unicode modifiers

Let's say you wrote the function more compactly:

def is_anagram(a: str, b: str) -> bool:
    return sorted(a) == sorted(b)

This works fine for ordinary characters, but fails for compound glyphs. For example, the thumbs-up and thumbs-down emoji can be modified to have different skin tones. The change in tone is actually a second Unicode "character" (a modifier code point) that follows the emoji. The modifier and the preceding character belong together, but sorted() only looks at individual code points, which results in this:

>>> is_anagram("👍👎🏿", "👎👍🏿")
True  # <-- This should be False!

[Screenshot: Sublime Text showing the actual code points of the two strings]
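The same code points can be listed from Python itself. This is a minimal sketch using only built-ins; code_points is a hypothetical helper name, and the strings are written with escape sequences so they do not depend on how the editor renders emoji.

```python
def code_points(s: str) -> list[str]:
    """Return the code points of s in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in s]


a = "\U0001F44D\U0001F44E\U0001F3FF"  # thumbs up, then thumbs down + dark skin tone
b = "\U0001F44E\U0001F44D\U0001F3FF"  # thumbs down, then thumbs up + dark skin tone

print(code_points(a))  # ['U+1F44D', 'U+1F44E', 'U+1F3FF']
print(code_points(b))  # ['U+1F44E', 'U+1F44D', 'U+1F3FF']

# Both strings hold the same three code points, so sorting them gives equal lists.
print(sorted(a) == sorted(b))  # True
```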

You can easily fix this by using the grapheme package:

from grapheme import graphemes
def is_anagram(a: str, b: str) -> bool:
    return sorted(graphemes(a)) == sorted(graphemes(b))

Runtime

You can get O(n) runtime if you don't sort, but instead count characters:

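from collections import Counter
from grapheme import graphemes

def is_anagram(a: str, b: str) -> bool:
    # Compare the counts directly: Counter subtraction drops zero and negative
    # counts, so a one-way difference would miss extra graphemes in b.
    return Counter(graphemes(a)) == Counter(graphemes(b))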
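One detail is worth checking when counting: subtracting Counter objects keeps only positive counts, so a one-directional subtraction does not notice extra characters in the second string, whereas comparing the counters with == does. The sketch below uses only the standard library and therefore counts code points rather than graphemes; is_anagram_by_count is a hypothetical name.

```python
from collections import Counter


def is_anagram_by_count(a: str, b: str) -> bool:
    """Compare per-character counts; each string is scanned once."""
    return Counter(a) == Counter(b)


print(is_anagram_by_count("abc", "cba"))  # True
print(is_anagram_by_count("a", "ab"))     # False

# Counter subtraction keeps only positive counts, so this difference is empty
# even though "ab" has an extra character.
difference = Counter("a") - Counter("ab")
print(difference)  # Counter()
```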

Source: https://stackoverflow.com/questions/33571749/creating-anagram-detector

Tags

python

anagram