How do I count the unique words in the text files of a specific directory with Python? [closed]

Submitted by 故事扮演 on 2019-12-13 09:28:39

Question


I'm writing a report and I need to count the unique words in a set of text files.

My texts are in D:\shakeall, 42 files in total.

I know a little Python, but I don't know what to do next.

This is how I understand the process:

  1. read files in directory

  2. make up a list of words from texts

  3. count total/unique words

That is all I know, plus a little about for, while, lists and indexes, and variables.

What I want to do is write my own function library and use it to get the result.

I would really appreciate any advice on this.

P.S.

I know almost nothing about Python. All I can do is simple math or printing the words in a list. This topic is too hard for me. Sorry.


Answer 1:


# Read the whole file and split it into whitespace-separated words.
with open('somefile.txt', 'r') as textfile:
    text_list = textfile.read().split()
# A set keeps exactly one copy of each distinct word.
unique_words = set(text_list)
print(len(unique_words))

That's the general gist of it
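
One caveat: splitting on whitespace alone counts "The", "the" and "the," as three different words. A minimal sketch of normalizing the text first, assuming that a word can be defined as a lowercase run of letters and apostrophes (that definition and the helper name tokenize are assumptions for illustration, not part of the question):

```python
import re

def tokenize(text):
    """Return the words of text in lowercase.

    Assumption: a word is a run of ASCII letters or apostrophes.
    """
    return re.findall(r"[a-z']+", text.lower())

words = tokenize("The cat saw the dog. The dog ran!")
unique_words = set(words)
print(len(words), len(unique_words))  # prints: 8 5
```

Whether punctuation and case should be ignored depends on what the report needs, so adjust the pattern accordingly.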




Answer 2:


import os

uniquewords = set()

# Walk the directory tree and add every whitespace-separated word to the set.
for root, dirs, files in os.walk("D:\\shakeall"):
    for name in files:
        with open(os.path.join(root, name)) as f:
            uniquewords.update(f.read().split())

print(list(uniquewords))
print(len(uniquewords))
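
Since the question asks for a small function library that reports both total and unique counts, the same idea could be split into reusable functions. This is only a sketch: the names read_words and count_words are hypothetical, and it assumes the files are UTF-8 text (undecodable bytes are replaced rather than raising an error).

```python
import os
from collections import Counter

def read_words(path):
    """Return the whitespace-separated words of one file (assumes UTF-8)."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read().split()

def count_words(directory):
    """Return (total, unique) word counts over all files under directory."""
    counts = Counter()
    for root, _dirs, files in os.walk(directory):
        for name in files:
            counts.update(read_words(os.path.join(root, name)))
    # Total is the sum of all occurrences; unique is the number of distinct words.
    return sum(counts.values()), len(counts)
```

For the directory in the question, the call would be total, unique = count_words("D:\\shakeall"). Using a Counter instead of a set keeps the per-word frequencies available in case the report needs them later.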


Source: https://stackoverflow.com/questions/11842548/how-do-i-count-unique-words-of-text-files-in-specific-directory-with-python
