Question
I'm writing a report and need to count the unique words in a set of text files.
My texts are in D:\shakeall; there are 42 files in total.
I know a little Python, but I don't know what to do next.
Here are the steps as I understand them:
read files in directory
make up a list of words from texts
count total/unique words
That is all I know, plus a little about for and while loops, lists and indexes, and variables.
What I want to do is build my own function library and use it to get the result.
I would really appreciate any advice.
P.S.
I really know almost nothing about Python. All I can do is simple math or printing the words in a list. The given topic is too hard for me. Sorry.
Answer 1:
# Read the whole file and split on any whitespace to get one flat list of words.
with open('somefile.txt', 'r') as textfile:
    text_list = textfile.read().split()
# A set keeps a single copy of each distinct word.
unique_words = set(text_list)
print(len(unique_words))
That's the general gist of it.
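One thing to watch with a plain whitespace split: "The", "the" and "the," all count as different words. Below is a minimal sketch of one way to normalize words before counting, assuming you want case-insensitive counts with surrounding punctuation ignored. The names normalize and unique_words_in_text are made up for illustration, and the sample sentence is just a placeholder.

```python
import string

def normalize(word):
    # Hypothetical helper: lowercase the word and strip punctuation
    # from both ends (punctuation inside the word is left alone).
    return word.lower().strip(string.punctuation)

def unique_words_in_text(text):
    # Split on whitespace, normalize each token, and drop tokens
    # that were nothing but punctuation (they normalize to "").
    return {w for w in (normalize(t) for t in text.split()) if w}

sample = "To be, or not to be: that is the question."
print(len(unique_words_in_text(sample)))
```

Whether apostrophes and hyphens inside a word should be kept is a judgment call for your report; this sketch leaves them in.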
Answer 2:
import os

uniquewords = set()
# Walk the directory tree and collect every whitespace-separated word.
for root, dirs, files in os.walk("D:\\shakeall"):
    for name in files:
        with open(os.path.join(root, name)) as f:
            uniquewords.update(f.read().split())
print(list(uniquewords))
print(len(uniquewords))
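Since the question asks for a small function library, here is a rough sketch that splits the same idea into one function per step (read files in a directory, build a list of words, count total/unique). It assumes Python 3 and that the files are UTF-8 text; the encoding is a guess, and all function names are invented for this example.

```python
import os

def list_files(directory):
    """Return the full path of every file under directory, subfolders included."""
    paths = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            paths.append(os.path.join(root, name))
    return paths

def read_words(path):
    """Return the whitespace-separated words of one file as a list."""
    # Assumption: UTF-8 text. Undecodable bytes are replaced, not fatal.
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read().split()

def words_in_directory(directory):
    """Return one flat list with the words of every file under directory."""
    words = []
    for path in list_files(directory):
        words.extend(read_words(path))
    return words

def count_words(words):
    """Return (total word count, unique word count)."""
    return len(words), len(set(words))

if __name__ == "__main__":
    # The path comes from the question; change it to wherever your files are.
    total, unique = count_words(words_in_directory(r"D:\shakeall"))
    print("total words:", total)
    print("unique words:", unique)
```

If you save this as a module, you can import the functions into your report script and call count_words on whatever list of words you build.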
Source: https://stackoverflow.com/questions/11842548/how-do-i-count-unique-words-of-text-files-in-specific-directory-with-python