How can I partially read a huge CSV file?

Backend · 2 answers · 1095 views
北恋 · 2020-12-01 04:38

I have a very big CSV file, so I cannot read it all into memory. I only want to read and process a few lines of it. So I am looking for a function in Pandas that could handle this.

2 Answers
  • 2020-12-01 04:44

    In addition to EdChum's answer, I find the nrows argument useful: it simply limits the number of rows to import. Instead of getting an iterator, you import just a slice of the file, nrows rows long. It combines with skiprows too.

    import pandas as pd

    # Skip the first 1000 rows, then read only the next 1000.
    df = pd.read_csv('matrix.txt', sep=',', header=None, skiprows=1000, nrows=1000)
    
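    A variant worth knowing: if the file has a header row you want to keep, pass a range to skiprows so row 0 survives the skip. A minimal self-contained sketch (the file name big.csv and the writer loop are just illustrative, to make it runnable):

    ```python
    import pandas as pd

    # Create a sample CSV with a header and 10,000 data rows (illustrative).
    with open('big.csv', 'w') as f:
        f.write('a,b\n')
        for i in range(10000):
            f.write(f'{i},{i * 2}\n')

    # range(1, 1001) skips file lines 1..1000 but keeps line 0 (the header);
    # nrows=1000 then reads data rows 1000-1999.
    df = pd.read_csv('big.csv', skiprows=range(1, 1001), nrows=1000)

    print(df.columns.tolist())   # ['a', 'b']
    print(df.shape)              # (1000, 2)
    print(int(df['a'].iloc[0]))  # 1000
    ```

    With a plain integer, skiprows would skip the header line as well; the range form is the usual way to keep column names while still jumping past the start of the file.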
  • 2020-12-01 04:59

    Use chunksize:

    import pandas as pd

    # chunksize makes read_csv return an iterator of DataFrames,
    # each holding chunksize rows, so the whole file never sits in memory.
    for df in pd.read_csv('matrix.txt', sep=',', header=None, chunksize=1):
        pass  # do something with each chunk
    

    To answer the second part of your question, do this:

    # Skip the first 1000 rows, then iterate in chunks of 1000.
    df = pd.read_csv('matrix.txt', sep=',', header=None, skiprows=1000, chunksize=1000)
    

    This skips the first 1000 rows and then reads the rest in chunks of 1000, so the first chunk gives you rows 1000-2000. Note that with chunksize set, read_csv returns an iterator rather than a DataFrame, so you take the first chunk (e.g. with next()) or loop over them. It's unclear whether you need the endpoints included, but you can adjust the numbers to get what you want.
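    Putting the two options together, here is a self-contained sketch of streaming everything after the first 1000 rows (the file matrix.txt is written on the fly just so the example runs; the row-counting body stands in for real per-chunk processing):

    ```python
    import pandas as pd

    # Write a sample CSV with 5000 rows (illustrative data).
    with open('matrix.txt', 'w') as f:
        for i in range(5000):
            f.write(f'{i},{i % 10}\n')

    # Skip the first 1000 rows, then stream the remainder in chunks of 1000.
    first_value = None
    total = 0
    for chunk in pd.read_csv('matrix.txt', sep=',', header=None,
                             skiprows=1000, chunksize=1000):
        if first_value is None:
            first_value = int(chunk.iloc[0, 0])  # first surviving row
        total += len(chunk)  # stand-in for real per-chunk processing

    print(first_value)  # 1000 -- the first 1000 rows were skipped
    print(total)        # 4000 rows processed, 1000 at a time
    ```

    Each iteration only holds one chunk in memory, which is the point: memory use is bounded by chunksize, not by the file size.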
