Given is a simple CSV file:
A,B,C
Hello,Hi,0
Hola,Bueno,1
Obviously the real dataset is far more complex than this, but this one reproduces
Label encoding worked for me. Basically, you have to encode your data feature by feature (myData is a 2D array of strings):
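import numpy as np
from sklearn import preprocessing

# Read every column as a fixed-width byte string ("|a20" is the older spelling of "|S20")
myData = np.genfromtxt(filecsv, delimiter=",", dtype="|S20", skip_header=1)
le = preprocessing.LabelEncoder()
n_features = myData.shape[1]  # set this to the number of feature columns
# Encode each column; the codes are stored back as strings since myData has a string dtype
for i in range(n_features):
    myData[:, i] = le.fit_transform(myData[:, i])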
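A variation on the loop above, as a rough sketch rather than the only way to do it: reusing one encoder for every column discards each column's mapping, and writing the codes back into a string array keeps them as strings. The version below keeps one encoder per column and writes into a separate float array. The names raw, encoded, and encoders are made up for illustration, and the sample rows are typed inline instead of being read from the file.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Inline copy of the sample rows (header removed), every value kept as a string.
raw = np.array([["Hello", "Hi", "0"],
                ["Hola", "Bueno", "1"]])

encoded = np.empty(raw.shape, dtype=np.float32)  # numeric output array
encoders = {}                                    # column index -> fitted encoder

for col in range(raw.shape[1]):
    enc = LabelEncoder()
    # fit_transform maps the sorted unique values of the column to 0, 1, 2, ...
    encoded[:, col] = enc.fit_transform(raw[:, col])
    encoders[col] = enc

# The stored encoder can map the integer codes of a column back to its strings.
restored = encoders[0].inverse_transform(encoded[:, 0].astype(int))
```

Keeping the encoders around matters if you later need to decode predictions or encode new rows with the same mapping.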
You can't pass str to your model's fit() method, as mentioned here:
The training input samples. Internally, it will be converted to dtype=np.float32 and if a sparse matrix is provided to a sparse csc_matrix.
Try converting your data to float, and give LabelEncoder a try.
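Here is a minimal sketch of that suggestion. It assumes that columns A and B of the sample file are the features and column C is the target, and it uses DecisionTreeClassifier purely as a stand-in, since the thread does not say which model is being fitted. The variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Assumed split of the sample rows: A and B as features, C as the target.
features = np.array([["Hello", "Hi"],
                     ["Hola", "Bueno"]])
target = np.array([0, 1])

# Encode each string column to integer codes, then cast the result to float32.
columns = [LabelEncoder().fit_transform(features[:, i])
           for i in range(features.shape[1])]
X = np.column_stack(columns).astype(np.float32)

# fit() now receives a numeric array instead of strings.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, target)
predictions = model.predict(X)
```

Note that the integer codes impose an arbitrary ordering on the categories, which tree-based models tend to tolerate better than linear ones.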