Trying to come up with python anagram function

别来无恙 posted on 2019-12-31 04:31:06

Question


What I'm trying to do is if I have a list like:

["lime", "mile", "liem", "tag", "gat", "goat", "math"]

I want to write a function that returns the words in the list that have an anagram, which would look like:

["lime", "mile", "liem", "tag", "gat"]

So far I have this code:

def anagramprinter(x):

    output = []     
    for i in x:
        for n in i:
            if n in x[i]:

I can't get past this part and would like some help, along with a thorough explanation.

Can anyone please show me a way that doesn't involve importing?

Thanks.


Answer 1:


An approach that identifies the words by the frozenset of their characters:

from collections import defaultdict

wordlist = ["lime", "mile", "liem", "tag", "gat", "goat", "math"]

worddict = defaultdict(list) 
for word in wordlist:
    worddict[frozenset(word)].append(word)

anagrams = [words for words in worddict.values() if len(words) > 1]
print(anagrams)

# [['lime', 'mile', 'liem'], ['tag', 'gat']]

The output is not quite what you wanted yet, but flattening that list is easy if you prefer that.
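As a minimal sketch of that flattening step, a nested list comprehension is enough. The name `flat` is only illustrative, and the nested list is copied from the output shown above.

```python
# Grouped result as printed above
anagrams = [['lime', 'mile', 'liem'], ['tag', 'gat']]

# Walk every group, then every word in the group
flat = [word for group in anagrams for word in group]
print(flat)
# ['lime', 'mile', 'liem', 'tag', 'gat']
```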


Update after comments:

The solution above will not handle words with repeating characters well, but this will (this time the dictionary key is simply the string of the sorted letters):

for word in wordlist:
    worddict[''.join(sorted(word))].append(word)
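Since the question asked for a way that avoids imports, here is one possible sketch of the sorted-letters idea using a plain dict. The function name `words_with_anagrams` is hypothetical, and the final two lines only illustrate why a frozenset key is too loose for words with repeated letters.

```python
def words_with_anagrams(wordlist):
    # Group the words by their sorted letters, using a plain dict (no imports)
    groups = {}
    for word in wordlist:
        key = ''.join(sorted(word))
        groups.setdefault(key, []).append(word)
    # Keep the original order, and only words whose group has more than one member
    return [word for word in wordlist
            if len(groups[''.join(sorted(word))]) > 1]


wordlist = ["lime", "mile", "liem", "tag", "gat", "goat", "math"]
print(words_with_anagrams(wordlist))
# ['lime', 'mile', 'liem', 'tag', 'gat']

# A frozenset ignores how often a letter occurs, the sorted string does not
print(frozenset("aab") == frozenset("abb"))  # True
print(sorted("aab") == sorted("abb"))        # False
```

Note that a word listed twice would count as its own anagram in this sketch; deduplicate the input first if that matters.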



Answer 2:


The easy way to analyse anagrams is to put the letters of each word in alphabetical order. So you create a second list containing the alphabetically ordered words.

a = ['lime', 'mile', 'liem', 'tag', 'gat']

b = []
for i in a:
    # sorted() returns a list of characters; join them back into a string
    b.append(''.join(sorted(i)))

print(b)
# ['eilm', 'eilm', 'eilm', 'agt', 'agt']

There is probably more Pythonic code than what I have given you, but I think the important thing for you is to sort the letters in each word.

Now you can do something to analyse your anagrams.
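One way that analysis could continue, as a rough sketch: keep each word whose sorted form shows up more than once in the second list. The name `result` is an assumption for illustration, and `list.count` makes this quadratic, which is fine for short lists.

```python
a = ['lime', 'mile', 'liem', 'tag', 'gat', 'goat', 'math']

# Second list: each word with its letters in alphabetical order
b = [''.join(sorted(word)) for word in a]

# A word has an anagram if its sorted form occurs more than once in b
result = [word for word, key in zip(a, b) if b.count(key) > 1]
print(result)
# ['lime', 'mile', 'liem', 'tag', 'gat']
```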




Answer 3:


That's a decent start (though it would be clearer if you named the variables something like 'wordlist', 'word' (or even 'w'), and 'char' or 'c'...). But there are a couple of issues:

1: for each word ('i'), you need to compare it against the other words, hoping to find at least one that is an anagram of i.

2: you need to see if any character fails to be found.

You could start like this:

wordlist = ["lime", "mile", "liem", "tag", "gat", "goat", "math"]

output = []
for w1 in wordlist:
    for w2 in wordlist:
        if w1 == w2:
            continue  # don't compare to self
        match = True  # hope for the best
        for c in w1:
            if c not in w2:
                match = False
                break
        if match:
            output.append(w1)
            break

That's close, but isn't actually enough, because to be a true anagram you have to have the same number of occurrences of each letter, not just the same set of distinct letters (consider 'mail' vs 'milla' or 'mailmailmail').

One way to do that would be to make a copy of w2 and then, as you go through the characters of w1, remove from that copy the letter that matches each letter of w1. That way a letter can't match twice. You'd also need to make sure the copy has become empty when you're done with the 'c' loop.
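The copy-and-remove idea described above could be sketched like this. The function name `is_anagram` and the variable `remaining` are illustrative and not part of the original answer.

```python
def is_anagram(w1, w2):
    # Work on a list copy of w2 so matched letters can be removed
    remaining = list(w2)
    for c in w1:
        if c in remaining:
            remaining.remove(c)  # removes one occurrence, so it can't match twice
        else:
            return False         # w1 has a letter that w2 lacks
    # Anything left over means w2 had extra letters
    return not remaining


print(is_anagram('lime', 'mile'))          # True
print(is_anagram('mail', 'milla'))         # False
print(is_anagram('mail', 'mailmailmail'))  # False
```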

There are many other ways; some clever ones involve "collection" types such as set and multiset. And as Captain Wise suggested, sorting the characters in each word alphabetically lets you just compare them, instead of looping through characters one at a time.
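For illustration, here is how the multiset and sorting variants might look. `collections.Counter` is the standard library's multiset-like type; the helper names `same_letters` and `same_sorted` are made up for this sketch, and the first one does require an import.

```python
from collections import Counter


def same_letters(w1, w2):
    # Counter maps each letter to its number of occurrences (a multiset)
    return Counter(w1) == Counter(w2)


def same_sorted(w1, w2):
    # Sorting both words makes anagrams compare equal, no import needed
    return sorted(w1) == sorted(w2)


print(same_letters('lime', 'mile'), same_sorted('lime', 'mile'))    # True True
print(same_letters('mail', 'milla'), same_sorted('mail', 'milla'))  # False False
```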

Hope that helps.

-s




Answer 4:


You can use itertools to create all permutations of each word, remove the word itself from those permutations, and then check your list one word at a time to see whether it is among the permutations, like so:

from itertools import permutations

l = ["lime", "mile", "liem", "tag", "gat", "goat", "math"]
final = []
perms = []
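for i in l:
    # a set drops the duplicate permutations produced by repeated letters,
    # and subtracting {i} removes the word itself
    perms += set(''.join(p) for p in permutations(i)) - {i}

for i in l:
    if i in perms:
        final.append(i)
print(final)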

This isn't the fastest solution in the world, especially if you have long words like 'resistance' and 'ancestries'.
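To get a rough feel for why long words hurt: `itertools.permutations` yields n! tuples for an n-letter word. The snippet below only counts them and is not part of the original answer.

```python
from itertools import permutations
from math import factorial

# A 4-letter word is cheap: 4! = 24 permutations
print(len(list(permutations("lime"))))  # 24

# A 10-letter word already means 10! permutations to generate and store
word = "resistance"
print(factorial(len(word)))  # 3628800

# Comparing sorted letters answers the same question directly
print(sorted("resistance") == sorted("ancestries"))  # True
```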




Answer 5:


An algorithm to check if two words are anagrams in Python.

1) Take two words: e.g.

("mile", "lime") ("tiles", "miles")

2) Make a string array/list:

(['m', 'i', 'l', 'e'], ['l', 'i', 'm', 'e']) (['t', 'i', 'l','e', 's'], ['m', 'i', 'l', 'e', 's'])

3) Sort arrays

(['e', 'i', 'l', 'm'], ['e', 'i', 'l', 'm']) (['e', 'i', 'l','s', 't'], ['e', 'i', 'l', 'm', 's'])

4) Check that both arrays have the same length and that first_array[i] == second_array[i] for 0 <= i < len(first_array)

5) Conclusion. If 4) holds, return true, else false.

from itertools import combinations

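def anagram(w1, w2):
    # words of different lengths can never be anagrams; this check also
    # keeps the index loop below from running past the shorter list
    if len(w1) != len(w2):
        return False

    list1 = list(w1)
    list2 = list(w2)

    list1.sort()
    list2.sort()

    idx = 0
    is_anagram = True

    # compare the sorted letters position by position
    while idx < len(w1) and is_anagram:
        if list1[idx] == list2[idx]:
            idx += 1
        else:
            is_anagram = False
    return is_anagram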


lst_words = ["lime", "mile", "liem", "tag", "gat", "goat", "math"]
lst_anagrams = set()
for i in combinations(lst_words, 2):
    if anagram(*i):
        lst_anagrams |= set(i) 

print(list(lst_anagrams))



Answer 6:


Checks whether two given strings are anagrams. The strings may include spaces, numbers or special characters; only the letters a to z are compared.

# First define a function that counts the letters in a string. It is used as the final condition when checking for anagrams
def count_alpha(text):
    text = text.lower()
    count = 0
    i = 97    #ASCII code range from 'a' to 'z' is 97 to 122
    while i < 123:
        if chr(i) in text:
            count += text.count(chr(i))
        i += 1
    return count
text1 = input('Enter your First Word: ')
text2 = input('Enter your Second Word: ')
# remove all the spaces and make the strings lower case
text1 = text1.replace(' ','').lower()
text2 = text2.replace(' ','').lower()
i = 97
while i < 123:
    # check whether this letter occurs the same number of times in both strings
    if text1.count(chr(i)) == text2.count(chr(i)):
        # replace every occurrence of this letter with a space
        text1 = text1.replace(chr(i),' ')
        text2 = text2.replace(chr(i),' ')
    i += 1
# if every letter had the same count in both strings, all letters have been replaced by spaces and none are left
if count_alpha(text1) == 0 and count_alpha(text2) == 0:
    print('They are anagrams')
else:
    print('They are not anagrams')

So here's your code. Enjoy!




Answer 7:


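def does_string_contain(big_word, small_word):
    # work on a list copy so matched letters can be removed one by one
    list_string = list(big_word)
    for char in small_word:
        if char in list_string:
            list_string.pop(list_string.index(char))
        else:
            # small_word needs a letter that big_word has no (more) copies of
            return False
    # reject big_word if it still holds extra copies of any letter of small_word
    for char in small_word:
        if char in list_string:
            return False
    return True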


Source: https://stackoverflow.com/questions/33023682/trying-to-come-up-with-python-anagram-function
