Turning a Pandas DataFrame into an array and evaluating a multiple linear regression model

一向 2020-12-18 07:36

I am trying to evaluate a multiple linear regression model. I have a data set like this:

[image of the data set, not preserved]

2 answers
  • 2020-12-18 08:25

    In pandas versions before 1.0 you can turn the DataFrame into a matrix by calling the as_matrix method directly on the DataFrame object. From pandas 1.0 on that method no longer exists, so use the values attribute instead. You might need to select the columns you are interested in first: X = df[['x1','x2','x3']].values, where the different x's are the column names.

    For the y variables you can use y = df['ground_truth'].values to get an array.

    Here is an example with some randomly generated data:

    import numpy as np
    import pandas as pd

    # create a 5x5 DataFrame of random integers in [0, 100]
    df = pd.DataFrame(np.random.randint(0, 101, (5, 5)), columns=['X1', 'X2', 'X3', 'X4', 'y'])
    

    Calling as_matrix() on a column selection of df returns a numpy.ndarray object (this works only in pandas versions before 1.0):

    X = df[['X1','X2','X3','X4']].as_matrix()  # pandas < 1.0 only
    

    Calling values on a pandas Series also returns a numpy.ndarray:

    y = df['y'].values
    

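    Note: on pandas 0.23 and later you might get a warning saying: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead. From pandas 1.0 on the method is gone and the call raises an AttributeError.

    To fix it, use values instead of as_matrix, as shown below: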

    X = df[['X1','X2','X3','X4']].values
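
On current pandas releases, DataFrame.to_numpy() and Series.to_numpy() are the documented way to get the same arrays. A minimal sketch, assuming the same made-up column names as in the example above (the data is random and purely illustrative):

```python
import numpy as np
import pandas as pd

# Illustrative 5x5 frame of random integers in [0, 100]; column names are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 101, size=(5, 5)),
                  columns=['X1', 'X2', 'X3', 'X4', 'y'])

# Feature matrix: one row per sample, one column per predictor
X = df[['X1', 'X2', 'X3', 'X4']].to_numpy()
# Target vector: 1-D array with one value per sample
y = df['y'].to_numpy()

print(type(X), X.shape)
print(type(y), y.shape)
```

The 2-D shape of X (samples by features) and the 1-D shape of y are what scikit-learn estimators expect as input.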
    
  • 2020-12-18 08:28
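    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    # broken_df is assumed to be the DataFrame from the question; 'ground_truth' is the target column
    y = broken_df.ground_truth.values
    X = broken_df.drop('ground_truth', axis=1).values
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    linreg = LinearRegression()
    linreg.fit(X_train, y_train)
    y_pred = linreg.predict(X_test)
    print(linreg.score(X_test, y_test))  # R^2 on the held-out test set
    # classification_report is for classifiers only, so report a regression metric instead
    print(mean_squared_error(y_test, y_pred))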
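
To try the evaluation without the original data, here is a minimal self-contained sketch. The synthetic demo_df, the column names x1 to x3, and the coefficients are assumptions made up for illustration; only the ground_truth column name comes from the answer. It reports R^2, MSE, RMSE and MAE, which are regression metrics (classification metrics do not apply to a continuous target):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: three predictors and a noisy linear target
rng = np.random.default_rng(42)
demo_df = pd.DataFrame(rng.normal(size=(200, 3)), columns=['x1', 'x2', 'x3'])
demo_df['ground_truth'] = (2.0 * demo_df['x1'] - 1.0 * demo_df['x2']
                           + 0.5 * demo_df['x3'] + rng.normal(scale=0.1, size=200))

# Split the frame into a target vector and a feature matrix
y = demo_df['ground_truth'].to_numpy()
X = demo_df.drop('ground_truth', axis=1).to_numpy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit on the training split, predict on the held-out split
linreg = LinearRegression().fit(X_train, y_train)
y_pred = linreg.predict(X_test)

# Regression metrics computed on the test split only
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print(f"R^2: {r2:.3f}  MSE: {mse:.4f}  RMSE: {np.sqrt(mse):.4f}  MAE: {mae:.4f}")
```

r2_score(y_test, y_pred) gives the same number as linreg.score(X_test, y_test); RMSE is often easier to read than MSE because it is in the units of the target.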
    