How to read a large number of text files from a directory using Python


Question


I'm working on a project using Python (3.6) and Django (2) in which I need to read all the text files from a directory, one by one. I have written the code below, but it reads only 28 of the 30 text files currently in my test folder and then returns an error.

From views.py:

import os

def get_txt_files(base_dir):
    # Recursively yield (path, name) for every .txt file under base_dir.
    for entry in os.scandir(base_dir):
        if entry.is_file() and entry.name.endswith(".txt"):
            yield entry.path, entry.name
        elif entry.is_dir():
            yield from get_txt_files(entry.path)
        else:
            print(f"Neither a file, nor a dir: {entry.path}")

for path, name in get_txt_files(obj.textPath):
    print(path)
    sa_response = nlp_text_manager(path, name)

Here's the function that reads the files:

from pathlib import Path

def nlp_text_manager(text_path, name):
    # Read the whole file into a single string.
    txt = Path(text_path).read_text(encoding='cp1252')
    # ... then process the text below
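For context, Path.read_text loads each file into memory in one go. For very large individual files, a chunked read is a possible alternative (a minimal sketch, not part of my current code):

def read_text_chunks(text_path, chunk_size=1024 * 1024):
    # Yield the file's contents in fixed-size chunks instead of all at once.
    with open(text_path, encoding='cp1252') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk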

It raises this error after reading 28 files:

v = self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

[11/Dec/2018 07:16:20] "POST / HTTP/1.1" 500 16416

The number of files in the provided folder can be very large, and the folder itself may be several GB in size. So what is an efficient, Pythonic way to read a large number of files from a directory?
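The self._sslobj.read line in the traceback is an SSL socket read, so the timeout may come from whatever network call nlp_text_manager makes rather than from the file reading itself. Still, for comparison, here is the kind of lazy iteration I have in mind; this is only a sketch, and the directory path is a placeholder:

from pathlib import Path

def iter_txt_files(base_dir):
    # Lazily yield every .txt file under base_dir, one Path at a time,
    # so only the file currently being processed is held in memory.
    yield from Path(base_dir).rglob('*.txt')

for txt_file in iter_txt_files('/path/to/texts'):  # placeholder directory
    text = txt_file.read_text(encoding='cp1252')
    # ... hand the text to the NLP step here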

Thanks in advance!

Source: https://stackoverflow.com/questions/53719293/how-to-read-the-large-number-of-text-files-from-a-directory-using-python
