I am currently working on a small anagram program that takes all possible permutations of a word and compares them with a dictionary. However, I am unable to get the results to
Note that if your word has W letters, and your dictionary has D words, your search is doing W! * D comparisons.
You can reduce this to D comparisons by converting both words to canonical form (i.e. letters in alphabetical order).
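As a rough illustration of the canonical-form idea, the sketch below scans a word list once and keeps every entry whose sorted letters match those of the query. The names `canonical` and `find_anagrams` and the sample word list are made up for this example; they are not from the question.

```python
def canonical(word):
    # Sort the letters so that all anagrams map to the same string,
    # e.g. "binary" and "brainy" both become "abinry".
    return "".join(sorted(word.strip()))


def find_anagrams(word, dictionary):
    # One comparison per dictionary entry: D comparisons instead of W! * D.
    target = canonical(word)
    return [entry for entry in dictionary if canonical(entry) == target]


# Hypothetical sample data, standing in for a real dictionary file.
words = ["binary", "brainy", "street", "tester", "hello"]
print(find_anagrams("binary", words))  # ['binary', 'brainy']
```

No permutations are generated at all here; the cost per query is one sort of the query word plus one sort and one string comparison per dictionary entry.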
If you are going to search for N words you could reduce it further to D / N comparisons per word (amortized) by storing your dictionary as {canonical_form: [list,of,matching,words]}:
from collections import defaultdict

DICT_FILE = "dictionary.txt"

def canonize(word):
    # "hello\n" => "ehllo"
    return "".join(sorted(word.strip()))

def load_dict(fname=DICT_FILE):
    lookup = defaultdict(list)
    with open(fname) as inf:
        for line in inf:
            word = line.strip()
            canon = canonize(word)
            lookup[canon].append(word)
    # lookup["ehllo"] = ["hello"]
    return lookup

def main():
    anagrams = load_dict()
    while True:
        word = input("Enter word to search for (or hit Enter to quit): ").strip()
        if not word:
            break
        else:
            canon = canonize(word)
            if canon in anagrams:
                print("Found: " + ", ".join(anagrams[canon]))
            else:
                print("No anagrams found.")

if __name__ == "__main__":
    main()
which then runs like
Enter word to search for (or hit Enter to quit): tester
Found: retest, setter, street, tester
Enter word to search for (or hit Enter to quit): binary
Found: binary, brainy
Enter word to search for (or hit Enter to quit): ttt
No anagrams found.
Enter word to search for (or hit Enter to quit):
Remove the newlines from each line in the list (readlines() returns a list, which has no replace() method of its own):
compare = [line.replace('\n', '') for line in compare]
compare = txt.readlines()

readlines() doesn't strip the line endings from each line, so each line will have a \n at the end. That causes all your comparisons against compare[j] to fail.

You could remove the \n's with something like:

compare = [line.strip() for line in txt]
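
To see the effect in isolation, here is a small sketch that uses `io.StringIO` as a stand-in for the opened file `txt` (the sample words are made up). It shows the trailing newline that `readlines()` leaves on each entry, and the comparison succeeding once each line is stripped.

```python
import io

# Stand-in for the opened dictionary file; the contents are hypothetical.
txt = io.StringIO("hello\nworld\n")

raw = txt.readlines()
print(raw)                 # ['hello\n', 'world\n']
print("hello" in raw)      # False, because the entry is really 'hello\n'

txt.seek(0)                # rewind so the lines can be read again
compare = [line.strip() for line in txt]
print(compare)             # ['hello', 'world']
print("hello" in compare)  # True
```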