Python Anagram Finder from given File

半腔热情 submitted on 2019-12-23 05:37:16

Question


I have tried absolutely everything under the sun to figure this out and have gotten nothing. I'm not even sure how to approach the problem. The instructions are as follows...

Your program will ask the user for the name of the file containing the list of words. The word list is formatted to have one word on each line.

• For each word, find all anagrams (some have more than one) of that word.
• Output: Report how many words have 0, 1, 2, etc., anagrams. Output the list of words that form the most anagrams (if there are multiple sets with the same maximum length, output all of them).
• You are expected to use appropriate functional decomposition.

Please keep in mind that I've been programming for just under a month so dumb everything down as much as possible. Thanks in advance.


Answer 1:


I take it this is homework. You know that anagrams are just permutations of a word's letters. Take things slowly: learn how to calculate the anagrams of one word before you learn how to do it for many words. The following interactive session shows how to calculate the anagrams of a word. You can go on from there.

>>> # Learn how to calculate anagrams of a word
>>> 
>>> import itertools
>>> 
>>> word = 'fun'
>>> 
>>> # First attempt: anagrams are just permutations of all the characters in a word
>>> for permutation in itertools.permutations(word):
...     print permutation
... 
('f', 'u', 'n')
('f', 'n', 'u')
('u', 'f', 'n')
('u', 'n', 'f')
('n', 'f', 'u')
('n', 'u', 'f')
>>> 
>>> # Now, refine the above block to print actual words instead of tuples
>>> for permutation in itertools.permutations(word):
...     print ''.join(permutation)
... 
fun
fnu
ufn
unf
nfu
nuf
>>> # Note that words with repeated characters, such as 'all',
>>> # have fewer distinct anagrams:
>>> word = 'all'
>>> for permutation in itertools.permutations(word):
...     print ''.join(permutation)
... 
all
all
lal
lla
lal
lla
>>> # Note that 'all', 'lal' and 'lla' each appear twice. We need to
>>> # eliminate the duplicates. One way is to use a set:
>>> word = 'all'
>>> anagrams = set()
>>> for permutation in itertools.permutations(word):
...     anagrams.add(''.join(permutation))
... 
>>> anagrams
set(['lal', 'all', 'lla'])
>>> for anagram in anagrams:
...     print anagram
... 
lal
all
lla
>>> # How many anagrams does the word 'all' have?
>>> # Just count using the len() function:
>>> len(anagrams)
3
>>> 

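
If you want to reuse the session above as a function, it could look roughly like this. This is only a sketch written for Python 3 (the session above uses Python 2 print syntax), and the name distinct_permutations is my own invention, not something from itertools.

```python
import itertools


def distinct_permutations(word):
    """Return the sorted list of distinct letter arrangements of `word`."""
    # itertools.permutations yields tuples of characters; join each tuple
    # back into a string. The set drops the duplicates that repeated
    # letters produce (e.g. the two l's in 'all').
    return sorted({''.join(p) for p in itertools.permutations(word)})


print(distinct_permutations('all'))        # ['all', 'lal', 'lla']
print(len(distinct_permutations('fun')))   # 6
```

Keep in mind that the number of permutations grows factorially with the length of the word, so this is fine for learning but gets slow quickly for long words.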

Update

Now with Aaron's clarification. The problem at the lowest level is: how do you determine whether two words are anagrams? The answer is: when they contain exactly the same letters, each the same number of times. The easiest way (for me) is to sort the letters of each word and compare the results.

def normalize(word):
    word = word.strip().lower() # sanitize it
    word = ''.join(sorted(word))
    return word

# normalize('top') ==> 'opt'
# Are 'top' and 'pot' anagrams? They are if their sorted letters are the same:
if normalize('top') == normalize('pot'):
    print 'they are the same'
    # Do something
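
To make the comparison reusable, you could wrap it in a small helper. The sketch below is for Python 3; are_anagrams and are_anagrams_by_count are hypothetical names I made up for illustration. The second version counts letters with collections.Counter instead of sorting, which should give the same answer.

```python
from collections import Counter


def normalize(word):
    """Strip whitespace, lowercase, and sort the letters of `word`."""
    return ''.join(sorted(word.strip().lower()))


def are_anagrams(first, second):
    """Two words are anagrams when their sorted letters are identical."""
    return normalize(first) == normalize(second)


def are_anagrams_by_count(first, second):
    """Alternative check: compare how often each letter occurs."""
    return Counter(first.strip().lower()) == Counter(second.strip().lower())


print(are_anagrams('top', 'pot'))   # True
print(are_anagrams('top', 'toe'))   # False: same length, different letters
```

The 'top' / 'toe' case shows why having the same number of letters is not enough on its own.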

Now that you know how to compare two words, let's work on a list of words:

>>> import collections
>>> anagrams = collections.defaultdict(list)
>>> words = ['top', 'fun', 'dog', 'opt', 'god', 'pot']
>>> for word in words:
...     anagrams[normalize(word)].append(word)
... 
>>> anagrams
defaultdict(<type 'list'>, {'opt': ['top', 'opt', 'pot'], 'fnu': ['fun'], 'dgo': ['dog', 'god']})
>>> for k, v in anagrams.iteritems():
...     print k, '-', v
... 
opt - ['top', 'opt', 'pot']
fnu - ['fun']
dgo - ['dog', 'god']

In the session above, we use anagrams (a defaultdict, which behaves like a dict but creates a default value, here an empty list, for any missing key) to store lists of words. The keys are the sorted letters, so anagrams['opt'] ==> ['top', 'opt', 'pot']. From there, you can tell which group has the most anagrams. The rest should be easy enough.
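
For the reporting part, here is one way the pieces might fit together, split into small functions as the assignment asks. Treat it as a sketch for Python 3, not the expected solution: all the function names are my own, and I am assuming the file contains no duplicate words and that a word in a group of n words has n - 1 anagrams.

```python
import collections


def normalize(word):
    """Strip whitespace, lowercase, and sort the letters of `word`."""
    return ''.join(sorted(word.strip().lower()))


def load_words(path):
    """Read one word per line from `path`, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def group_anagrams(words):
    """Map each sorted-letter key to the list of words sharing it."""
    groups = collections.defaultdict(list)
    for word in words:
        groups[normalize(word)].append(word)
    return groups


def anagram_histogram(groups):
    """Map 'number of anagrams' to 'how many words have that many'."""
    histogram = collections.Counter()
    for members in groups.values():
        # Assumption: every word in a group of n has n - 1 anagrams.
        histogram[len(members) - 1] += len(members)
    return histogram


def largest_groups(groups):
    """Return every group whose size equals the maximum group size."""
    if not groups:
        return []
    longest = max(len(members) for members in groups.values())
    return [members for members in groups.values() if len(members) == longest]


def report(words):
    """Print the histogram and the largest anagram group(s)."""
    groups = group_anagrams(words)
    for count, total in sorted(anagram_histogram(groups).items()):
        print(total, 'word(s) have', count, 'anagram(s)')
    for members in largest_groups(groups):
        print(members)


def main():
    """Ask for the file name, then report on its words."""
    report(load_words(input('Name of the word list file: ')))


report(['top', 'fun', 'dog', 'opt', 'god', 'pot'])
```

The last line runs the report on the sample list from the session above; call main() instead to be prompted for a file name.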



Source: https://stackoverflow.com/questions/15424577/python-anagram-finder-from-given-file
