Find anagrams of a given word in a file

Submitted by 删除回忆录丶 on 2019-12-11 05:04:05

Question


For a class assignment, we need to take a word as input and build a list of every anagram of that word found in a given text file (wordlist.txt).

My code so far looks like this:

def find_anagrams1(string):
    """Takes a string and returns a list of anagrams for that string from the wordlist.txt file.

    string -> list"""
    anagrams = []

    file = open("wordlist.txt")
    next = file.readline()
    while next != "":
        isit = is_anagram(string, next)
        if isit is True:
            anagrams.append(next)
        next = file.readline()
    file.close()

    return anagrams

Every time I try to run the program it just returns an empty list, despite the fact that I know there are anagrams present. Any ideas on what's wrong?

P.S. The is_anagram function looks like this:

def is_anagram(string1, string2):
    """Takes two strings and returns True if the strings are anagrams of each other.

    string, string -> bool"""
    a = sorted(string1)
    b = sorted(string2)
    if a == b:
        return True
    else:
        return False

I am using Python 3.4


Answer 1:


The problem is that you are using the readline function. From the documentation:

file.readline = readline(...)
readline([size]) -> next line from the file, as a string.

Retain newline.  A non-negative size argument limits the maximum
number of bytes to return (an incomplete line may be returned then).
Return an empty string at EOF.

The key information here is "Retain newline". That means that if you have a file containing a list of words, one per line, each word is going to be returned with a terminal newline. So when you call:

next = file.readline()

You're not getting "example", you're getting "example\n". The newline counts as a character when the line is sorted, so the sorted line can never equal your sorted input string.
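To see the mismatch concretely, here is a minimal check. The word pair "listen"/"silent" is only an illustrative choice, and is_anagram is a condensed version of the function from the question:

```python
def is_anagram(string1, string2):
    """Return True if the two strings contain exactly the same characters."""
    return sorted(string1) == sorted(string2)

# A line as readline() would return it, trailing newline included.
line = "silent\n"

raw_match = is_anagram("listen", line)               # newline counts as a character
stripped_match = is_anagram("listen", line.strip())  # newline removed first

print(raw_match, stripped_match)
```

The first comparison fails only because of the extra "\n"; once the line is stripped, the two words compare as anagrams.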

A simple solution is to call the strip() method on the lines read from the file:

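next = file.readline()
while next != "":
    # Test the raw line for end of file, then strip it; a blank line
    # would otherwise strip to "" and end the loop early.
    word = next.strip()
    isit = is_anagram(string, word)
    if isit is True:
        anagrams.append(word)
    next = file.readline()
file.close()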

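However, there are several other problems with this code. To start with, file is a poor name for a variable: in Python 2 it shadows the built-in file type. Python 3 no longer has a file built-in, but the variable next does shadow the built-in next() function there.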

Rather than repeatedly calling readline(), you're better off taking advantage of the fact that an open file is an iterator which yields the lines of the file:

words = open('wordlist.txt')
for word in words:
    word = word.strip()
    isit = is_anagram(string, word)
    if isit:
        anagrams.append(word)
words.close()

Note also here that since is_anagram returns True or False, you don't need to compare the result to True or False (e.g., if isit is True). You can simply use the return value on its own.
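Putting these suggestions together, the whole function might look like the sketch below. The path parameter and its default are assumptions added so the function can be pointed at any word list, and the with statement is a swapped-in alternative to calling close() by hand (it closes the file even if an error occurs):

```python
def is_anagram(string1, string2):
    """Return True if the two strings are anagrams of each other."""
    return sorted(string1) == sorted(string2)


def find_anagrams(string, path="wordlist.txt"):
    """Return the words in the file at `path` that are anagrams of `string`.

    `path` is a hypothetical parameter; the question hard-codes wordlist.txt.
    """
    anagrams = []
    with open(path) as words:        # the file is closed automatically
        for word in words:           # an open file yields its lines one by one
            word = word.strip()      # drop the trailing newline
            if is_anagram(string, word):
                anagrams.append(word)
    return anagrams
```

Note that if the input word itself appears in the file, it is included in the result, since every word is trivially an anagram of itself.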




Answer 2:


Yikes, don't use for loops:

import collections

def find_anagrams(x):
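    # x is a list of words; a word's signature is its letters in sorted order.
    signatures = [''.join(sorted(i)) for i in x]
    # Signatures shared by more than one word mark groups of mutual anagrams.
    shared = {item for item, count in collections.Counter(signatures).items() if count > 1}
    return [i for i, sig in zip(x, signatures) if sig in shared]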
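Note that the function above takes a list of words and returns every word that has at least one anagram partner in that list, rather than the anagrams of one given word. A different technique that reuses the same sorted-letter signature for single-word lookups is to group the words in a dictionary first. This is only a sketch, and the names signature, build_anagram_index and anagrams_of are hypothetical:

```python
from collections import defaultdict


def signature(word):
    """Return the letters of `word` in sorted order, joined into a string."""
    return "".join(sorted(word))


def build_anagram_index(words):
    """Group words (for example, the lines of an open file) by signature."""
    index = defaultdict(list)
    for word in words:
        word = word.strip()          # drop newlines and surrounding whitespace
        if word:                     # skip blank lines
            index[signature(word)].append(word)
    return index


def anagrams_of(word, index):
    """Return the indexed words that are anagrams of `word`, excluding `word` itself."""
    return [w for w in index.get(signature(word), []) if w != word]
```

The index is built once, for instance with build_anagram_index(open("wordlist.txt")), after which each lookup is a single dictionary access instead of a pass over the whole file.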


Source: https://stackoverflow.com/questions/28868716/find-anagrams-of-a-given-word-in-a-file
