How to generate a train-test-split based on a group id?

死守一世寂寞 2021-01-31 22:16

I have the following data:

pd.DataFrame({'Group_ID':[1,1,1,2,2,2,3,4,5,5],
          'Item_id':[1,2,3,4,5,6,7,8,9,10],
          'Target': [0,0,1,0,1,1,0,0


        
1 answer
  • 2021-01-31 22:51

    I figured out the answer. This seems to work:

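    from sklearn.model_selection import GroupShuffleSplit

    # df is the DataFrame from the question; its group column is 'Group_ID'
    train_inds, test_inds = next(GroupShuffleSplit(test_size=.20, n_splits=2, random_state=7).split(df, groups=df['Group_ID']))
    
    train = df.iloc[train_inds]
    test = df.iloc[test_inds]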
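
For anyone who wants to check the split, here is a minimal self-contained sketch of the same approach. The Group_ID and Item_id columns follow the question; the Target values are made up for illustration, since the listing in the question is cut off. The name shared_groups is just a helper for this example.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Illustrative data: Group_ID and Item_id follow the question; the Target
# values are invented here because the original listing is truncated.
df = pd.DataFrame({
    "Group_ID": [1, 1, 1, 2, 2, 2, 3, 4, 5, 5],
    "Item_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Target": [0, 0, 1, 0, 1, 1, 0, 0, 0, 1],
})

# A single split is enough here, since only the first one is consumed.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.20, random_state=7)
train_inds, test_inds = next(splitter.split(df, groups=df["Group_ID"]))

train = df.iloc[train_inds]
test = df.iloc[test_inds]

# Groups that appear on both sides of the split; this should be empty.
shared_groups = set(train["Group_ID"]) & set(test["Group_ID"])
print(shared_groups)  # prints set()
```

Note that test_size is a proportion of groups, not of rows, so with five groups and test_size=0.20 one whole group goes to the test set and the number of test rows depends on which group is drawn.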
    