Question
I'm working on a project using Python (3.6) and Django (2) in which I need to read all the text files from a directory one by one. I have written the code, but it reads only 28 files from a folder that currently contains 30 text files for testing purposes, and then returns an error.
From views.py:
def get_txt_files(base_dir):
    for entry in os.scandir(base_dir):
        if entry.is_file() and entry.name.endswith(".txt"):
            # print(entry.path)
            yield entry.path, entry.name
        elif entry.is_dir():
            yield from get_txt_files(entry.path)
        else:
            print(f"Neither a file, nor a dir: {entry.path}")

for path, name in get_txt_files(obj.textPath):
    print(path)
    sa_response = nlp_text_manager(path, name)
Here's the function that reads the files:
def nlp_text_manager(text_path, name):
    text = text_path
    txt = Path(text_path).read_text(encoding='cp1252')
    # then use the files below that.....
And it returns this error after reading 28 files:
v = self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
[11/Dec/2018 07:16:20] "POST / HTTP/1.1" 500 16416
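The traceback ends in `_sslobj.read`, which belongs to Python's `ssl` module, so the timeout appears to come from a network read rather than from reading the local file. To narrow down which file triggers it, a minimal sketch along the following lines could record failures per file instead of aborting the whole request. The names `iter_txt_files`, `process_all`, `process`, and `demo` are hypothetical and not part of the original code; `process` stands in for whatever the rest of `nlp_text_manager` does with the text.

```python
import os
import socket
import tempfile
from pathlib import Path


def iter_txt_files(base_dir):
    """Lazily yield (path, name) for every .txt file under base_dir."""
    for entry in os.scandir(base_dir):
        if entry.is_file() and entry.name.endswith(".txt"):
            yield entry.path, entry.name
        elif entry.is_dir():
            yield from iter_txt_files(entry.path)


def process_all(base_dir, process):
    """Call process(text, name) for each file and collect per-file failures.

    `process` is a hypothetical stand-in for the rest of nlp_text_manager.
    """
    results, failures = {}, {}
    for path, name in iter_txt_files(base_dir):
        try:
            text = Path(path).read_text(encoding="cp1252")
            results[path] = process(text, name)
        except (OSError, UnicodeDecodeError) as exc:
            # socket.timeout is a subclass of OSError, so timeouts land here too.
            failures[path] = repr(exc)
    return results, failures


def demo():
    """Build a throwaway directory and simulate one file that times out."""
    def fake_process(text, name):
        if name == "b.txt":
            raise socket.timeout("The read operation timed out")
        return len(text)

    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "sub").mkdir()
        Path(tmp, "a.txt").write_text("hello", encoding="cp1252")
        Path(tmp, "sub", "b.txt").write_text("world", encoding="cp1252")
        Path(tmp, "notes.md").write_text("ignored", encoding="cp1252")
        results, failures = process_all(tmp, fake_process)
        # Key by file name so the result does not depend on the temp path.
        return (
            {os.path.basename(p): r for p, r in results.items()},
            {os.path.basename(p): f for p, f in failures.items()},
        )
```

Because the generator yields one path at a time and each file is read only when its turn comes, the loop itself does not need to hold the whole folder in memory; the `failures` mapping then shows which files, if any, hit the timeout.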
The number of files in the provided folder can be very large, and the folder may be gigabytes in size. So what is an efficient, Pythonic way to read a large number of files from a directory?
Thanks in advance!
Source: https://stackoverflow.com/questions/53719293/how-to-read-the-large-number-of-text-files-from-a-directory-using-python