I'm trying to open a txt file with 4605227 rows (305 MB).
The way I have done this before is:
data = np.loadtxt('file.txt', delimiter='\t', dtype=str)
You can read it in directly as a Pandas DataFrame, e.g.:
import pandas as pd

df = pd.read_csv(path)
If you want to read it in faster, you can use Modin, a drop-in replacement that parallelizes pandas:
import modin.pandas as pd
pd.read_csv(path)
https://github.com/modin-project/modin
Rather than reading it in with NumPy, you could read it directly as a Pandas DataFrame, e.g. using the pandas.read_csv function with something like:
import pandas as pd

df = pd.read_csv('file.txt', delimiter='\t', usecols=["a", "b", "c", "d", "e", "f", "g", "h", "i"])
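If the file is too large to hold in memory comfortably, read_csv can also return it in chunks via its chunksize parameter. A minimal sketch reusing the same file.txt (the 100,000-row chunk size and the row counting are just placeholders for your own per-chunk logic):
import pandas as pd

row_count = 0
# Each chunk is an ordinary DataFrame of up to 100,000 rows;
# process it and let it be garbage-collected before the next one.
for chunk in pd.read_csv('file.txt', delimiter='\t', chunksize=100_000):
    row_count += len(chunk)
print(row_count)  # total rows seen across all chunks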
Method 1:
You can read the file in chunks. readlines() takes an optional size hint, so you can cap roughly how many bytes' worth of lines are read into memory per call:
BUFFER_SIZE = 2 ** 16  # read about 64 KB worth of lines per call

with open('inputTextFile', 'r') as input_file:
    buffer_lines = input_file.readlines(BUFFER_SIZE)
    while buffer_lines:
        # logic goes here
        buffer_lines = input_file.readlines(BUFFER_SIZE)
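If you don't need to tune the buffer yourself, simply iterating over the file object reads it lazily, one line at a time. A minimal sketch assuming the same 'inputTextFile' with tab-separated columns:
# Only one line is held in memory at a time.
with open('inputTextFile', 'r') as input_file:
    for line in input_file:
        fields = line.rstrip('\n').split('\t')  # tab-separated columns
        # logic goes here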
Method 2:
You can also use the mmap module; the link below explains its usage.
import mmap

with open("hello.txt", "r+b") as f:
    # memory-map the file, size 0 means whole file
    mm = mmap.mmap(f.fileno(), 0)
    # read content via standard file methods
    print(mm.readline())  # prints b"Hello Python!\n"
    # read content via slice notation
    print(mm[:5])  # prints b"Hello"
    # update content using slice notation;
    # note that new content must have same size
    mm[6:] = b" world!\n"
    # ... and read again using standard file methods
    mm.seek(0)
    print(mm.readline())  # prints b"Hello  world!\n"
    # close the map
    mm.close()
https://docs.python.org/3/library/mmap.html
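For a large file that you only need to read, a minimal sketch mapping it read-only and counting lines without loading it all into memory (the file name is a placeholder; mmap.ACCESS_READ prevents accidental writes):
import mmap

with open('file.txt', 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        line_count = 0
        while mm.readline():  # readline() returns b'' at the end of the map
            line_count += 1
print(line_count)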