Ensemble of different kinds of regressors using scikit-learn (or any other Python framework)

猫巷女王i 2021-01-30 05:43

I am trying to solve a regression task. I found that 3 models work well on different subsets of the data: LassoLars, SVR and Gradient Tree Boosting. I noticed that w

4 Answers
  •  余生分开走
    2021-01-30 06:09

    OK, after spending some time googling 'stacking' (as mentioned by @andreas earlier), I found out how to do the weighting in Python, even with scikit-learn alone. Consider the following:

    I train my set of regression models (as mentioned: SVR, LassoLars and GradientBoostingRegressor). Then I run all of them on the training data (the same data that was used to train each of these 3 regressors). I get a prediction from each algorithm for every example and save these 3 results into a pandas DataFrame with columns 'predictedSVR', 'predictedLASSO' and 'predictedGBR'. I then add a final column to this DataFrame, which I call 'predicted' and which holds the true target value.

    Then I just train a linear regression on this new dataframe:

    # df: DataFrame with the predictions of the 3 regressors and the true output
    from sklearn import linear_model
    stacker = linear_model.LinearRegression()
    stacker.fit(df[['predictedSVR', 'predictedLASSO', 'predictedGBR']], df['predicted'])
    

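    So when I want to make a prediction for a new example, I just run each of my 3 regressors separately and then call:

    # new_df (hypothetical name): the 3 regressors' outputs for the new example, with the same columns as above
    stacker.predict(new_df[['predictedSVR', 'predictedLASSO', 'predictedGBR']])
    

    on the outputs of my 3 regressors to get the final result.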
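
The steps above can be put together into one runnable sketch. This is only an illustration under assumptions, not the exact code from the answer: the data comes from `make_regression`, the hyperparameters (`alpha=0.1`, the defaults elsewhere) are arbitrary, and `base_predictions`, `feature_cols` and `new_df` are hypothetical names introduced here.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LassoLars, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic data stands in for the real dataset (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_new, y_train, y_new = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: fit the three base regressors on the training data.
base_models = {
    "predictedSVR": SVR(),
    "predictedLASSO": LassoLars(alpha=0.1),
    "predictedGBR": GradientBoostingRegressor(random_state=0),
}
for model in base_models.values():
    model.fit(X_train, y_train)


def base_predictions(X):
    """Return a DataFrame with one prediction column per base regressor."""
    return pd.DataFrame(
        {name: model.predict(X) for name, model in base_models.items()}
    )


# Step 2: collect the predictions on the training data plus the true
# output, then fit a linear regression on top of them.
feature_cols = list(base_models)
df = base_predictions(X_train)
df["predicted"] = y_train

stacker = LinearRegression()
stacker.fit(df[feature_cols], df["predicted"])

# Step 3: for new examples, run the base regressors first and feed
# their outputs to the stacker.
new_df = base_predictions(X_new)
final_prediction = stacker.predict(new_df[feature_cols])
```

One design caveat: because the stacker is fit on predictions for the same rows the base models were trained on, it may favor whichever base model overfits most. Fitting it on held-out or out-of-fold predictions is a common alternative.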

    The problem here is that I am finding the optimal weights for the regressors only 'on average': the weights will be the same for every example on which I try to make a prediction.
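
To make that limitation concrete, the sketch below fits a linear stacker on made-up base-model outputs and checks that every prediction is the same fixed linear combination, whatever the example. The arrays `y_true` and `base_preds` are hypothetical stand-ins for the real regressor outputs, not data from the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical targets and three noisy "base model" outputs (assumption).
y_true = rng.normal(size=200)
base_preds = np.column_stack([
    y_true + rng.normal(scale=0.3, size=200),
    y_true + rng.normal(scale=0.6, size=200),
    y_true + rng.normal(scale=0.9, size=200),
])

stacker = LinearRegression().fit(base_preds, y_true)

# One global weight per base model, plus one intercept.
weights = stacker.coef_
intercept = stacker.intercept_

# Each prediction is the same weighted sum of the base outputs;
# the weights do not depend on which example is being predicted.
manual = base_preds @ weights + intercept
same_as_predict = np.allclose(manual, stacker.predict(base_preds))
```

Here `weights` has exactly three entries, one per regressor, so a model that is strong only on a particular subset of the data still gets the same weight everywhere.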
