How to import a CSV data file into scikit-learn?

轮回少年 2021-01-30 03:16

As I understand it, scikit-learn accepts data as a 2D array of shape (n_samples, n_features). Assuming I have data in the form ...

Stock prices    in         


        
4 answers
  •  遇见更好的自我
    2021-01-30 03:52

    This is not a CSV file; it is a space-separated file. Assuming there are no missing values, you can load it into a NumPy array called data with

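    import numpy as np
    
    # Open the file in a with-block so it is closed once loading finishes
    with open("filename.txt") as f:
        f.readline()  # skip the header line
        data = np.loadtxt(f)  # parse the remaining whitespace-separated numeric rows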
    

    If the stock price is what you want to predict (your y value, in scikit-learn terms), then you should split data using

    X = data[:, 1:]  # select columns 1 through end
    y = data[:, 0]   # select column 0, the stock price
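
To make the load-and-split steps above concrete, here is a minimal sketch that runs without a file on disk. The header names and the numbers are made up for illustration, and it assumes the layout described in the answer: one header line, then whitespace-separated numeric rows with the stock price in the first column. It passes `skiprows=1` to `np.loadtxt` instead of calling `readline()`; both skip the header.

```python
import io
import numpy as np

# Hypothetical sample data (invented for this example): a header line,
# then whitespace-separated numeric rows with the stock price first.
sample = io.StringIO(
    "price ind1 ind2\n"
    "10.5 1.0 2.0\n"
    "11.0 1.5 2.5\n"
    "12.25 2.0 3.0\n"
)

# skiprows=1 drops the header line; the rest is parsed as floats
data = np.loadtxt(sample, skiprows=1)

X = data[:, 1:]  # feature matrix, shape (n_samples, n_features)
y = data[:, 0]   # target vector, shape (n_samples,)

print(X.shape, y.shape)
```

With a real file, the `io.StringIO` object would simply be replaced by the file path.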
    

    Alternatively, you might be able to massage the standard Python csv module into handling this type of file.
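
One way the csv-module route might look is sketched below. It assumes the columns are separated by one or more spaces and that every value is numeric; the sample data is invented. Setting `delimiter=" "` together with `skipinitialspace=True` makes the reader treat a run of spaces as a single separator.

```python
import csv
import io

# Hypothetical sample data (invented for this example), with uneven spacing
sample = io.StringIO(
    "price ind1 ind2\n"
    "10.5   1.0  2.0\n"
    "11.0   1.5  2.5\n"
)

reader = csv.reader(sample, delimiter=" ", skipinitialspace=True)
next(reader)  # skip the header line

# Convert each non-empty row to floats; "if v" drops empty fields
# that trailing spaces would otherwise produce
rows = [[float(v) for v in row if v] for row in reader if row]

X = [row[1:] for row in rows]  # features: every column after the first
y = [row[0] for row in rows]   # target: the first column (stock price)

print(X, y)
```

The result is plain nested lists rather than a NumPy array, so for purely numeric files the `np.loadtxt` approach is usually less work.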
