How to return unique words from the text file using Python

遇见更好的自我 2021-01-04 23:45

How do I return all the unique words from a text file using Python? For example:

I am not a robot

I am a human

Should return:

9 answers
  • 2021-01-05 00:17
    for word in word_list:
        if word not in word_list:
    

    Every word is in word_list by definition (that is what the first line iterates over), so the condition is never true and the body never runs.
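A quick check makes this concrete. It is only a sketch: it assumes word_list was built by splitting the whole file on whitespace, and it uses the sample text from the question.

```python
# Assumption: word_list holds every whitespace-separated word of the input.
word_list = "I am not a robot I am a human".split()

# Collect the words that would pass the original condition.
passed = [word for word in word_list if word not in word_list]

print(passed)  # → []
```
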

    Instead of that logic, use a set:

    unique_words = set(word_list)
    for word in unique_words:
        file.write(str(word) + "\n")
    

    Sets only hold unique members, which is exactly what you're trying to achieve.

    Note that order won't be preserved, but you didn't specify whether that's a requirement.
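If order does turn out to matter, one possible alternative to a set is dict.fromkeys, since dict keys are unique and keep insertion order from Python 3.7 on. The helper name unique_in_order is made up for this sketch and does not come from the question.

```python
def unique_in_order(words):
    """Return the distinct words, keeping the position of each first occurrence."""
    # dict keys are unique and, since Python 3.7, keep insertion order.
    return list(dict.fromkeys(words))


words = "I am not a robot I am a human".split()
print(unique_in_order(words))  # → ['I', 'am', 'not', 'a', 'robot', 'human']
```
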

  • 2021-01-05 00:17
    def unique_file(input_filename, output_filename):
        # Read the whole input file and split it into words on whitespace.
        with open(input_filename, 'r') as input_file:
            word_list = input_file.read().split()
        duplicates = []  # words already written, in first-seen order
        with open(output_filename, 'w') as file:
            for word in word_list:
                # Write each word only the first time it appears.
                if word not in duplicates:
                    duplicates.append(word)
                    file.write(word + "\n")
    

    This code loops over every word. If the word is not yet in the list duplicates, it is appended to that list and written to the output file, so each word is written once, in the order it first appears.
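One caveat with this approach: a membership test against a list rescans the list for every word, which can get slow on large files. A possible variant, sketched below, tracks seen words in a set instead. The function name unique_file_with_set is hypothetical, and reading line by line is a choice made here, not something the answer requires.

```python
def unique_file_with_set(input_filename, output_filename):
    """Write each distinct word of the input file once, in first-seen order."""
    seen = set()  # membership tests on a set are constant-time on average
    with open(input_filename, "r") as infile, open(output_filename, "w") as outfile:
        for line in infile:
            for word in line.split():
                if word not in seen:
                    seen.add(word)
                    outfile.write(word + "\n")
```

The output is the same as with the list version; only the cost of the lookup changes.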

  • 2021-01-05 00:19

    The problem with your code is that word_list already contains every word of the input file. When iterating over it, you are checking whether a word from word_list is absent from word_list itself, so the condition is always false. This should work (note that it also preserves the order):

    def unique_file(input_filename, output_filename):
        z = []
        with open(input_filename, 'r') as fileIn, open(output_filename, 'w') as fileOut:
            for line in fileIn:
                for word in line.split():
                    if word not in z:
                        z.append(word)
                        fileOut.write(word + '\n')
    