Windows Error using XGBoost with python

Posted by 大兔子大兔子 on 2020-12-12 05:12:18

Question


I'm working on a machine-learning problem (a previous Kaggle competition, for practice: https://www.kaggle.com/c/nyc-taxi-trip-duration) and trying to use XGBoost, but I'm getting an error that I have no clue how to tackle. I searched Google and Stack Overflow but couldn't find anyone with a similar problem.

I'm using Python 2.7 with the Spyder IDE through Anaconda, on Windows 10. I did have some trouble installing the xgboost package, so I can't rule out an installation error. However, I'm also taking a Udemy course on ML, where I was able to use xgboost just fine on a small dataset with the same functions.

Code

The code is pretty simple:

... import libraries

# import dataset 
dataset = pd.read_csv('data/merged.csv')
y = dataset['trip_duration'].values
del dataset['trip_duration'], dataset["id"], dataset['distance']
X = dataset.values

# Split dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25)

# fit XGBoost to training set
classifier = XGBClassifier()
classifier.fit(X_train, y_train)   

Output

However it spits out the following error:

In [1]: classifier.fit(X_train, y_train)
Traceback (most recent call last):

  File "<ipython-input-44-f44724590846>", line 1, in <module>
    classifier.fit(X_train, y_train)

  File "C:\Users\MortZ\Anaconda3\lib\site-packages\xgboost\sklearn.py", line 464, in fit
    verbose_eval=verbose)

  File "C:\Users\MortZ\Anaconda3\lib\site-packages\xgboost\training.py", line 204, in train
    xgb_model=xgb_model, callbacks=callbacks)

  File "C:\Users\MortZ\Anaconda3\lib\site-packages\xgboost\training.py", line 74, in _train_internal
    bst.update(dtrain, i, obj)

  File "C:\Users\MortZ\Anaconda3\lib\site-packages\xgboost\core.py", line 819, in update
    _check_call(_LIB.XGBoosterUpdateOneIter(self.handle, iteration, dtrain.handle))

WindowsError: [Error -529697949] Windows Error 0xE06D7363

I don't really know how to interpret this, so any help would be much appreciated. Thanks in advance.

MortZ


Answer 1:


Well, after struggling for a few days, I managed to find a solution.

A friend of mine told me that xgboost is known to have problems with Python 2.7, so I upgraded to 3.6. This didn't entirely solve my problem, but it gave me a new error:

OSError: [WinError 541541187] Windows Error 0x20474343

After some digging I found a solution to this. The fit function I was trying to use was the source of the problem (although it did work on a different dataset, so I'm not entirely sure why).
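
As a side note on reading the two error numbers: as far as I can tell, neither is specific to XGBoost. Both look like the generic codes Windows reports when a C++ exception escapes from native code. 0xE06D7363 is the code used by the Microsoft C++ runtime (its low three bytes spell "msc"), and 0x20474343 appears to be the GCC counterpart (" GCC"). Here is a minimal sketch for decoding them with the standard library; decode_windows_error is a hypothetical helper name, not part of xgboost or Python:

```python
import struct

def decode_windows_error(code):
    """Return (unsigned 32-bit hex string, printable ASCII tag) for an error number."""
    # Python may report the code as a negative signed integer; mask to 32 bits.
    unsigned = code & 0xFFFFFFFF
    # Pack as four bytes, most significant byte first.
    raw = struct.pack('>I', unsigned)
    # Show printable ASCII bytes as characters and everything else as '.'.
    tag = ''.join(chr(b) if 32 <= b < 127 else '.' for b in bytearray(raw))
    return '0x%08X' % unsigned, tag

print(decode_windows_error(-529697949))  # ('0xE06D7363', '.msc')
print(decode_windows_error(541541187))   # ('0x20474343', ' GCC')
```

So the traceback only says that the native library threw an exception; the actual cause has to be found elsewhere.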

Solution

change

classifier = XGBClassifier()
classifier.fit(X_train, y_train) 

to

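import xgboost as xgb

# Wrap the arrays in DMatrix, XGBoost's native data container
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
watchlist = [(dtrain, 'train'), (dtest, 'test')]
xgb_pars = {'min_child_weight': 1, 'eta': 0.5, 'colsample_bytree': 0.9,
            'max_depth': 6, 'subsample': 0.9, 'lambda': 1., 'nthread': -1,
            'booster': 'gbtree', 'silent': 1, 'eval_metric': 'rmse',
            'objective': 'reg:linear'}
# Train for up to 10 rounds; stop early if the test metric fails to improve for 2 rounds
model = xgb.train(xgb_pars, dtrain, 10, watchlist, early_stopping_rounds=2, maximize=False, verbose_eval=1)
# best_score is the test RMSE (it equals RMSLE only if the labels were log1p-transformed)
print('Modeling RMSLE %.5f' % model.best_score)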
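
A note on the metric: the competition linked in the question is, to my knowledge, scored on RMSLE, while eval_metric 'rmse' reports plain RMSE on whatever labels are passed in. The two coincide only when the model is trained on log1p(trip_duration). The sketch below shows the relationship using only the standard library; the function names and the sample numbers are made up for illustration:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error of two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error: RMSE computed on log1p-transformed values."""
    return rmse([math.log1p(t) for t in y_true],
                [math.log1p(p) for p in y_pred])

# Made-up trip durations in seconds and made-up predictions
durations = [455.0, 663.0, 2124.0, 429.0]
predictions = [500.0, 600.0, 1800.0, 450.0]

print('RMSE  %.5f' % rmse(durations, predictions))
print('RMSLE %.5f' % rmsle(durations, predictions))
```

If the labels passed to xgb.DMatrix are log1p-transformed, predictions need np.expm1 (or math.expm1) applied to get back to seconds.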



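Answer 2:


I guess the error occurs because you are using XGBClassifier instead of XGBRegressor for a regression problem. trip_duration is a continuous target, so a classifier would treat every distinct duration as its own class.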



Source: https://stackoverflow.com/questions/49804208/windows-error-using-xgboost-with-python
