Pandas and scikit-learn: KeyError: […] not in index

轻奢々 2021-02-14 10:23

I do not understand why I get the error KeyError: '[ 1351 1352 1353 ... 13500 13501 13502] not in index' when I run this code:

cv = KFold(n_s         


        
1 answer
  • 2021-02-14 10:33

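    The problem is the way you index X with X[train_index]. X is a pandas DataFrame, so plain brackets with an array of integers look for columns with those labels instead of rows at those positions, which raises the KeyError. KFold.split returns integer row positions, so select rows with .iloc. (.loc selects by label, so it only returns the same rows when the index is the default 0..n-1.)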


    Use this instead:

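    from sklearn.model_selection import KFold

    cv = KFold(n_splits=10)
    
    # split() yields integer row positions, so select rows with .iloc
    for train_index, test_index in cv.split(X):
        f_train_X, f_valid_X = X.iloc[train_index], X.iloc[test_index]
        f_train_y, f_valid_y = y.iloc[train_index], y.iloc[test_index]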
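
To make the difference between positions and labels concrete, here is a minimal, self-contained sketch. The data is a hypothetical stand-in (not the asker's X and y), and the index labels are deliberately set to 100..119 so that positions and labels do not coincide.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical stand-in data: 20 rows labelled 100..119 rather than 0..19.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.integers(0, 100, size=(20, 3)),
                 columns=["a", "b", "c"],
                 index=np.arange(100, 120))
y = pd.Series(rng.integers(0, 2, size=20), index=X.index)

cv = KFold(n_splits=5)
fold_sizes = []
for train_index, test_index in cv.split(X):
    # split() yields integer positions, so select rows by position.
    f_train_X, f_valid_X = X.iloc[train_index], X.iloc[test_index]
    f_train_y, f_valid_y = y.iloc[train_index], y.iloc[test_index]
    fold_sizes.append((len(f_train_X), len(f_valid_X)))

# Plain brackets treat the same positions as column labels and fail.
try:
    X[train_index]
    bracket_indexing_failed = False
except KeyError:
    bracket_indexing_failed = True

print(fold_sizes)
print(bracket_indexing_failed)
```

With an index like this one, X.loc[train_index] would fail as well, because none of the positions 0..19 exist as labels. That is why .iloc is the safer choice inside the loop.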
    

    1st way: Example using iloc

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
    
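    # Plain brackets with a list select columns by label; there are no columns named 1 or 2
    df[[1,2]]
    #KeyError: '[1 2] not in index'
    # (the exact wording of the message depends on the pandas version)
    
    # .iloc selects rows by integer position
    df.iloc[[1,2]]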
    #    A   B   C   D
    #1  25  97  78  74
    #2   6  84  16  21
    

    2nd way: Example by converting pandas to numpy in advance

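    # .values returns the underlying NumPy array (index and column labels are dropped)
    df = df.values
    
    # NumPy indexes rows by position, so this works
    df[[1,2]]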
    #array([[25, 97, 78, 74],
    #      [ 6, 84, 16, 21]])
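
Applied to the cross-validation loop, the second way might look like the sketch below. The data is again a hypothetical stand-in, and .to_numpy() is used here as the newer equivalent of .values; either should work for this purpose.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical stand-in data; replace with your own X and y.
X = pd.DataFrame({"a": range(10), "b": range(10, 20)})
y = pd.Series([0, 1] * 5)

# Convert once, before the loop. Column names and the index are lost.
X_arr = X.to_numpy()
y_arr = y.to_numpy()

cv = KFold(n_splits=5)
valid_rows = []
for train_index, test_index in cv.split(X_arr):
    # NumPy arrays accept an array of row positions in plain brackets.
    f_train_X, f_valid_X = X_arr[train_index], X_arr[test_index]
    f_train_y, f_valid_y = y_arr[train_index], y_arr[test_index]
    valid_rows.append(f_valid_X[:, 0].tolist())

print(valid_rows)
```

The trade-off is that the folds are plain arrays, so anything that relies on column names later has to use .iloc on the DataFrame instead.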
    