Question
I am pretty new to Python. I am using it to read an ARFF file:
import arff
for row in arff.load('cpu.arff'):
    x = row
    print(x)
Part of the output looks like this:
<Row(125.0,256.0,6000.0,256.0,16.0,128.0,198.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,269.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,220.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,172.0)>
<Row(29.0,8000.0,16000.0,32.0,8.0,16.0,132.0)>
<Row(26.0,8000.0,32000.0,64.0,8.0,32.0,318.0)>
<Row(23.0,16000.0,32000.0,64.0,16.0,32.0,367.0)>
Only the last column is the label; the other six columns are the attributes. How can I store them in arrays? I want to assign the last column to y and the first six columns to x, and then run cross-validation on the data from the ARFF file.
Or is there a way to separate the attributes and the label from an ARFF file automatically?
Answer 1:
Row objects from the arff module support typical Python sequence slicing, so you can separate the attributes from the labels easily:
import arff
X = []
y = []
for row in arff.load('cpu.arff'):
    X.append(row[:-1])
    y.append(row[-1])
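The same slicing can be tried without the file. The following is only a minimal sketch: the tuples stand in for the Row objects and are copied from the sample output in the question, and the helper names split_features_labels and k_fold_indices are hypothetical, not part of the arff module. The fold helper is a plain, unshuffled k-fold split written for illustration.

```python
# Stand-in rows copied from the sample output in the question. With the
# real file these would come from arff.load('cpu.arff') instead.
sample_rows = [
    (125.0, 256.0, 6000.0, 256.0, 16.0, 128.0, 198.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 269.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 220.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 172.0),
    (29.0, 8000.0, 16000.0, 32.0, 8.0, 16.0, 132.0),
    (26.0, 8000.0, 32000.0, 64.0, 8.0, 32.0, 318.0),
    (23.0, 16000.0, 32000.0, 64.0, 16.0, 32.0, 367.0),
]


def split_features_labels(rows):
    """Hypothetical helper: return (X, y), where X holds every column
    except the last and y holds the last column of each row."""
    X, y = [], []
    for row in rows:
        X.append(list(row[:-1]))  # attributes: all but the last value
        y.append(row[-1])         # label: the last value
    return X, y


def k_fold_indices(n_samples, k):
    """Hypothetical helper: yield (train_idx, test_idx) pairs for k
    contiguous, unshuffled folds. Earlier folds absorb the remainder."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


X, y = split_features_labels(sample_rows)
folds = list(k_fold_indices(len(X), 3))
```

For real experiments, a library such as scikit-learn provides ready-made cross-validation utilities that accept X and y in this shape, which is usually preferable to a hand-written split.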
Source: https://stackoverflow.com/questions/33600053/how-to-deal-with-data-from-arff-file-with-python