Python pandas: split a DataFrame based on a column value

一生所求 2021-01-06 17:03

I have a CSV file. When I read it into a pandas DataFrame, it looks like this:

import pandas as pd

data = pd.read_csv('test1.csv')
print(data)

The output looks like:

3 Answers
  • 2021-01-06 17:08

    You can create a dictionary of DataFrames with groupby, which is useful if the result column has many distinct values:

    print(data)
       v1  v2  v3  result
    0  12  31  31       0
    1  34  52   4       1
    2  32   4   5       1
    3   7  89   2       0
    
    datas = {}
    for i, g in data.groupby('result'):
        # Store each group under a key such as 'data_0', with a fresh 0-based index
        datas['data_' + str(i)] = g.reset_index(drop=True)
    
    print(datas['data_0'])
       v1  v2  v3  result
    0  12  31  31       0
    1   7  89   2       0
    
    print(datas['data_1'])
       v1  v2  v3  result
    0  34  52   4       1
    1  32   4   5       1
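
The same grouping can also be written as a dict comprehension. This is only a minimal sketch: the DataFrame is built inline from the values printed above rather than read from test1.csv, and the name `frames` is made up for illustration.

```python
import pandas as pd

# Sample frame mirroring the values printed in this answer
# (an assumption standing in for the contents of test1.csv).
data = pd.DataFrame({
    "v1": [12, 34, 32, 7],
    "v2": [31, 52, 4, 89],
    "v3": [31, 4, 5, 2],
    "result": [0, 1, 1, 0],
})

# One sub-frame per distinct value of `result`, keyed by the value itself.
# reset_index(drop=True) gives each sub-frame a fresh 0-based index.
frames = {
    key: group.reset_index(drop=True)
    for key, group in data.groupby("result")
}

print(frames[0])
print(frames[1])
```

Keying by the group value instead of a string such as 'data_0' avoids building key names by hand, at the cost of slightly less descriptive keys.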
    
  • 2021-01-06 17:30
    df1 = data[data.result==0]
    df2 = data[data.result==1]
    

    Have a look at this.

  • 2021-01-06 17:32

    pandas lets you slice and manipulate data in a very straightforward way. You can also do the same as Yakym, accessing the column by key instead of by attribute name.

    data_0 = data[data['result'] == 0]
    data_1 = data[data['result'] == 1]
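
For a two-way split, the comparison can be stored once as a boolean mask and negated for the other half. The sketch below assumes the same four rows shown earlier in the thread; the names `mask` and `same` are illustrative only.

```python
import pandas as pd

# Sample frame (assumed values, standing in for test1.csv).
data = pd.DataFrame({
    "v1": [12, 34, 32, 7],
    "v2": [31, 52, 4, 89],
    "v3": [31, 4, 5, 2],
    "result": [0, 1, 1, 0],
})

mask = data["result"] == 0   # boolean Series, one entry per row
data_0 = data[mask]          # rows where result == 0
data_1 = data[~mask]         # every remaining row

# Attribute access builds the same mask, but it only works when the
# column name is a valid Python identifier and does not clash with an
# existing DataFrame attribute or method.
same = mask.equals(data.result == 0)
print(same)
```

Note that `~mask` selects every row not in `data_0`, so it matches `data['result'] == 1` only when the column holds nothing but 0 and 1.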
    

    You can even add new columns computed from the existing ones, e.g.:

    data['v_sum'] = data['v1'] + data['v2'] + data['v3']
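
When many columns are involved, a row-wise sum over a column selection may be easier to read than chaining `+`. A small sketch, again assuming the sample values used in this thread:

```python
import pandas as pd

# Sample frame (assumed values, standing in for test1.csv).
data = pd.DataFrame({
    "v1": [12, 34, 32, 7],
    "v2": [31, 52, 4, 89],
    "v3": [31, 4, 5, 2],
    "result": [0, 1, 1, 0],
})

# axis=1 sums across the selected columns within each row.
data["v_sum"] = data[["v1", "v2", "v3"]].sum(axis=1)

print(data)
```

The two forms differ when values are missing: `sum(axis=1)` skips NaN by default, whereas `+` propagates NaN into the result.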
    