I am encountering the following problem using GridSearchCV: it gives me a parallel error when using n_jobs > 1. At the same time n_jobs > 1 wo
Maybe this is still relevant for some!
I have only tried this using Anaconda on a Windows 10 machine.
I had the same problem in my environment, with the following code section:
from sklearn.model_selection import GridSearchCV

# classifier, X_train and y_train are assumed to be defined earlier in the script
parameters = [{'C': [1, 10, 100, 1000], 'kernel': ['linear']},
              {'C': [1, 10, 100, 1000], 'kernel': ['rbf'], 'gamma': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]}]
grid_search = GridSearchCV(estimator=classifier, param_grid=parameters, scoring='accuracy', cv=10, n_jobs=-1)
grid_search = grid_search.fit(X_train, y_train)
best_accuracy = grid_search.best_score_
best_parameters = grid_search.best_params_
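To get a feel for how much work this grid generates, and why running it in parallel matters, the candidates can be counted by hand. This is only a rough sketch: `count_candidates` is a hypothetical helper I made up for illustration, not part of scikit-learn, and it assumes the grid is a list of dicts as in the snippet above.

```python
from math import prod

# Same grid as in the snippet above: a list of dicts, each expanded separately.
parameters = [
    {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
    {'C': [1, 10, 100, 1000], 'kernel': ['rbf'],
     'gamma': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]},
]


def count_candidates(param_grid):
    """Hypothetical helper: number of parameter combinations in the grid."""
    return sum(prod(len(values) for values in grid.values())
               for grid in param_grid)


cv = 10
n_candidates = count_candidates(parameters)  # 4 linear + 4 * 9 rbf = 40
n_fits = n_candidates * cv                   # 400, not counting the final refit
print(n_candidates, n_fits)
```

So with cv = 10 the search trains 400 models before the final refit, which is why it is worth getting n_jobs to work.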
I did not find much on the internet, so I thought maybe I should update the joblib package. To my surprise, joblib was not installed in my specific environment. After I installed and updated it, it worked perfectly, both with n_jobs = -1 and with n_jobs = 2.
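If you want to check this in your own environment before changing anything, something like the following should do. It is a minimal sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name, not something joblib or scikit-learn provides.

```python
from importlib import metadata


def installed_version(package):
    """Hypothetical helper: version string of an installed package, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("joblib")
if version is None:
    print("joblib is not installed in this environment")
else:
    print("joblib", version)
```

Run it with the interpreter of the environment you actually use for the grid search; if it reports that joblib is missing, install it into that same environment (for example with conda install joblib or pip install joblib).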
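I think you are using Windows. You need to wrap the grid search in a function and then call it inside an if __name__ == '__main__' block. With n_jobs=-1, joblib uses all available CPU cores, and parallel execution does not always work on Windows without that guard: worker processes there are started fresh and import the main script again, so unguarded top-level code would be executed again in every worker.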
Try wrapping grid search in a function:
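from sklearn import ensemble, model_selection

def somefunction():
    # train and targ are the training features and labels, assumed to be loaded elsewhere
    clf = ensemble.RandomForestClassifier()
    param_grid = {'n_estimators': [10, 20]}
    grid_s = model_selection.GridSearchCV(clf, param_grid=param_grid, n_jobs=-1, verbose=1)
    grid_s.fit(train, targ)
    return grid_s

if __name__ == '__main__':
    somefunction()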
Or:
if __name__ == '__main__':
    clf = ensemble.RandomForestClassifier()
    param_grid = {'n_estimators': [10, 20]}
    grid_s = model_selection.GridSearchCV(clf, param_grid=param_grid, n_jobs=-1, verbose=1)
    grid_s.fit(train, targ)
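As for what n_jobs=-1 actually resolves to: as far as I understand the joblib documentation, negative values count back from the number of CPUs (n_cpus + 1 + n_jobs). The sketch below just mirrors that rule in plain Python; `resolve_n_jobs` is a hypothetical helper written for illustration, and the real answer for your machine comes from joblib itself (joblib.effective_n_jobs).

```python
import os


def resolve_n_jobs(n_jobs, cpu_count=None):
    """Hypothetical helper mirroring joblib's documented n_jobs rule."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> all CPUs, -2 -> all but one, and so on (never fewer than 1)
        return max(cpu_count + 1 + n_jobs, 1)
    return n_jobs


print(resolve_n_jobs(-1, cpu_count=8))  # 8
print(resolve_n_jobs(-2, cpu_count=8))  # 7
print(resolve_n_jobs(2, cpu_count=8))   # 2
```

If all cores turn out to be too aggressive on your Windows machine, n_jobs=-2 leaves one core free while still running the fits in parallel.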