How do I return all the unique words from a text file using Python? For example:
I am not a robot
I am a human
Should return:
The problem is in this loop:

    for word in word_list:
        if word not in word_list:

Every word is in word_list, by definition from the first line, so the condition is never true and nothing gets written.
Instead of that logic, use a set:

    unique_words = set(word_list)
    for word in unique_words:
        file.write(str(word) + "\n")

A set only holds unique members, which is exactly what you're trying to achieve.
Note that order won't be preserved, but you didn't specify if that's a requirement.
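
If order does turn out to matter, one option is to deduplicate with dict keys instead of a set. This is only a minimal sketch, assuming Python 3.7 or later (where a plain dict keeps insertion order); the function name unique_words_in_order is made up for illustration and is not from the question.

```python
def unique_words_in_order(text):
    """Return each whitespace-separated word once, in first-seen order."""
    # dict keys are unique and, on Python 3.7+, keep insertion order.
    return list(dict.fromkeys(text.split()))


if __name__ == "__main__":
    # The two sample lines from the question.
    sample = "I am not a robot\nI am a human"
    for word in unique_words_in_order(sample):
        print(word)
```

Like the set version, this treats words as case-sensitive, so "I" and "i" would count as different words.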

    def unique_file(input_filename, output_filename):
        # Read the whole input file into memory.
        input_file = open(input_filename, 'r')
        file_contents = input_file.read()
        input_file.close()
        duplicates = []  # words already written, in first-seen order
        word_list = file_contents.split()
        file = open(output_filename, 'w')
        for word in word_list:
            # Write a word only the first time it is seen.
            if word not in duplicates:
                duplicates.append(word)
                file.write(str(word) + "\n")
        file.close()

This code loops over every word, and if the word is not already in the list duplicates, it appends the word to that list and writes it to the output file.
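
One thing to keep in mind with the list approach: the check word not in duplicates scans the whole list each time, so it can get slow on large files. A hedged sketch of the same idea that tracks seen words in a set while still writing them in first-seen order could look like this. The name unique_file_with_set is hypothetical, and it assumes the default text encoding is fine for your file.

```python
def unique_file_with_set(input_filename, output_filename):
    """Write each word of the input file once, in first-seen order."""
    seen = set()  # set membership tests are O(1) on average, unlike a list
    with open(input_filename, "r") as infile, open(output_filename, "w") as outfile:
        for line in infile:
            for word in line.split():
                if word not in seen:
                    seen.add(word)
                    outfile.write(word + "\n")
```

The output order comes from the order of the writes, not from the set, so the set is only used for the membership check.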
The problem with your code is that word_list already has all possible words of the input file. When iterating over the loop, you are basically checking whether a word in word_list is not present in word_list itself, so the condition will always be false. This should work (note that this will also preserve the order):

    def unique_file(input_filename, output_filename):
        z = []  # words seen so far, in order of first appearance
        with open(input_filename, 'r') as fileIn, open(output_filename, 'w') as fileOut:
            for line in fileIn:
                for word in line.split():
                    # Write each word only the first time it appears.
                    if word not in z:
                        z.append(word)
                        fileOut.write(word + '\n')