How do I read CSV data into a record array in NumPy?


I wonder if there is a direct way to import the contents of a CSV file into a record array, much in the way that R's read.table(), read.delim(), a

11 Answers

长情又很酷

2020-11-22 03:58
This is the easiest way:
```
import csv
with open('testfile.csv', newline='') as csvfile:
    data = list(csv.reader(csvfile))
```
Now each entry in data is a record, represented as a list of strings, so data is a list of lists (a 2D structure, but not yet a NumPy array). It saved me so much time.
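To make the shape of `data` concrete, here is a minimal sketch that uses an in-memory CSV instead of `testfile.csv`. The sample contents, the column names, and the dtype are made up for illustration. Every field comes back from `csv.reader` as a string, so the numeric columns are converted by hand before building a record array.

```python
import csv
import io

import numpy as np

# Hypothetical sample standing in for testfile.csv
sample = io.StringIO("name,height,age\nalice,1.62,30\nbob,1.80,25\n")

rows = list(csv.reader(sample))
header, body = rows[0], rows[1:]
print(body[0])  # every field is still a string at this point

# Convert each field explicitly; the dtype below is an assumption about the columns
typed = [(name, float(height), int(age)) for name, height, age in body]
dtype = [("name", "U10"), ("height", "f8"), ("age", "i8")]
records = np.array(typed, dtype=dtype).view(np.recarray)

# Fields of a record array are accessible as attributes
print(records.height)
print(records.age)
```

The explicit conversion step is the price of this approach: `csv.reader` does no type inference, so the column types have to be known up front.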
眼角桃花

2020-11-22 03:58
I would suggest using PyTables (pip3 install tables). You can convert your .csv file to .h5 using pandas (pip3 install pandas):
```
import pandas as pd
data = pd.read_csv("dataset.csv")
store = pd.HDFStore('dataset.h5')
store['mydata'] = data
store.close()
```
You can then load your data into a NumPy array easily, and in less time than re-parsing the CSV, even for a huge amount of data.
```
import pandas as pd
store = pd.HDFStore('dataset.h5')
data = store['mydata']
store.close()

# Data in NumPy format
data = data.values
```
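Note that `data.values` gives a plain ndarray without column names. Since the question asks for a record array, one option is `DataFrame.to_records()`, sketched below on an in-memory CSV. The sample contents and column names are made up for illustration, and the HDF5 step is left out to keep the example self-contained.

```python
import io

import pandas as pd

# Hypothetical sample standing in for dataset.csv
csv_text = io.StringIO("name,height,age\nalice,1.62,30\nbob,1.80,25\n")

df = pd.read_csv(csv_text)

# index=False leaves the DataFrame index out of the resulting fields
records = df.to_records(index=False)

print(records.dtype.names)
print(records.age)
```

The same call should work on a DataFrame loaded back from an `HDFStore`, since it only depends on the DataFrame itself.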

挽巷

2020-11-22 03:58

This works like a charm:

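```
import csv
import numpy as np

# Read every row as a list of strings (this file uses ";" as the delimiter)
with open("data.csv", 'r', newline='') as f:
    data = list(csv.reader(f, delimiter=";"))

# np.float was removed in NumPy 1.24; the builtin float is equivalent here
data = np.array(data, dtype=float)
```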


眼角桃花

2020-11-22 04:01
I timed
```
from numpy import genfromtxt
genfromtxt(fname = dest_file, dtype = (<whatever options>))
```
versus
```
import csv
import numpy as np
with open(dest_file, 'r') as dest_f:
    data_iter = csv.reader(dest_f,
                           delimiter=delimiter,
                           quotechar='"')
    # Collect every row (a list of strings) into a list
    data = [row for row in data_iter]
data_array = np.asarray(data, dtype=<whatever options>)
```
on 4.6 million rows with about 70 columns and found that the NumPy path took 2 min 16 secs and the csv-list comprehension method took 13 seconds.

I would recommend the csv-list comprehension method, as it most likely relies on pre-compiled libraries and not on the interpreter as much as NumPy does. I suspect the pandas method would have similar interpreter overhead.
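If you want to repeat the comparison on your own machine, a rough harness along these lines could be a starting point. It is only a sketch: the synthetic 2,000 x 10 table, the function names, and the run count are my own choices, not the original 4.6-million-row file, so the absolute numbers will differ.

```python
import csv
import io
import timeit

import numpy as np

# Synthetic stand-in for the real file: 2,000 rows x 10 numeric columns
rng = np.random.default_rng(0)
table = rng.random((2000, 10))
buffer = io.StringIO()
np.savetxt(buffer, table, delimiter=",")
csv_text = buffer.getvalue()


def load_with_genfromtxt():
    return np.genfromtxt(io.StringIO(csv_text), delimiter=",")


def load_with_csv_reader():
    rows = [row for row in csv.reader(io.StringIO(csv_text))]
    return np.asarray(rows, dtype=float)


# Sanity check: both paths should yield the same array
assert np.allclose(load_with_genfromtxt(), load_with_csv_reader())

for loader in (load_with_genfromtxt, load_with_csv_reader):
    seconds = timeit.timeit(loader, number=5)
    print(f"{loader.__name__}: {seconds:.3f} s for 5 runs")
```

Reading from an in-memory string removes disk I/O from the measurement, so this compares parsing cost only.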
借酒劲吻你

2020-11-22 04:01
You can use this code to load CSV file data into an array:
```
import numpy as np
# Named "data" rather than "csv" to avoid shadowing the standard csv module
data = np.genfromtxt('test.csv', delimiter=",")
print(data)
```
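With only `delimiter` set, `genfromtxt` returns a plain float array and text fields become `nan`. To get something closer to the record array the question asks about, a sketch like the following may help. It assumes the first row holds column names, and the sample contents are made up for illustration.

```python
import io

import numpy as np

# Hypothetical sample standing in for test.csv
sample = io.StringIO("name,height,age\nalice,1.62,30\nbob,1.80,25\n")

# names=True takes field names from the first row;
# dtype=None asks genfromtxt to infer a type for each column
data = np.genfromtxt(sample, delimiter=",", names=True, dtype=None,
                     encoding="utf-8")

print(data.dtype.names)
print(data["age"])

# View the structured array as a record array for attribute access
records = data.view(np.recarray)
print(records.height)
```

The result of `genfromtxt` here is a structured array; the `view(np.recarray)` step only adds attribute-style field access and does not copy the data.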