grid-search

XGBoost with GridSearchCV, Scaling, PCA, and Early-Stopping in sklearn Pipeline

Submitted by 夙愿已清 on 2020-06-09 11:31:45
Question: I want to combine an XGBoost model with input scaling and feature-space reduction by PCA. In addition, the hyperparameters of the model, as well as the number of components used in the PCA, should be tuned using cross-validation, and to prevent the model from overfitting, early stopping should be added. To combine the various steps, I decided to use sklearn's Pipeline functionality. At the beginning I had some problems making sure that the PCA is also applied to the validation set. But I …
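
A minimal sketch of this kind of setup, assuming xgboost and scikit-learn are installed; the dataset, parameter values, and the use of XGBRegressor are placeholders, and early stopping is left out because it needs a validation set that has already been passed through the scaler and PCA steps:

    from sklearn.datasets import make_regression
    from sklearn.decomposition import PCA
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from xgboost import XGBRegressor

    X, y = make_regression(n_samples=500, n_features=20, random_state=0)

    # Scaling and PCA are fitted inside each CV fold, so the validation
    # fold is transformed with statistics learned on the training fold only.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("pca", PCA()),
        ("xgb", XGBRegressor(n_estimators=200, objective="reg:squarederror")),
    ])

    param_grid = {
        "pca__n_components": [5, 10, 15],
        "xgb__max_depth": [3, 5],
        "xgb__learning_rate": [0.05, 0.1],
    }

    search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
    search.fit(X, y)
    print(search.best_params_, search.best_score_)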

Retrieving specific classifiers and data from GridSearchCV

Submitted by 故事扮演 on 2020-06-07 07:22:26
Question: I am running a Python 3 classification script on a server using the following code: # define knn classifier for transformed data knn_classifier = neighbors.KNeighborsClassifier() # define KNN parameters knn_parameters = [{'n_neighbors': [1, 3, 5, 7, 9, 11], 'leaf_size': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60], 'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute'], 'n_jobs': [-1], 'weights': ['uniform', 'distance']}] # Stratified k-fold (default for classifier) # n = 5 folds is default …
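
One way to pull the fitted classifier and the per-candidate results back out of the search object, sketched here with a placeholder dataset and a shortened version of the grid above:

    from sklearn import neighbors
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV, StratifiedKFold

    X, y = load_iris(return_X_y=True)

    knn_classifier = neighbors.KNeighborsClassifier()
    knn_parameters = [{"n_neighbors": [1, 3, 5, 7],
                       "weights": ["uniform", "distance"]}]

    grid = GridSearchCV(knn_classifier, knn_parameters,
                        cv=StratifiedKFold(n_splits=5), scoring="accuracy")
    grid.fit(X, y)

    best_knn = grid.best_estimator_             # refitted KNeighborsClassifier
    print(grid.best_params_)                    # winning parameter combination
    print(grid.cv_results_["mean_test_score"])  # one mean score per candidate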

D* Lite search algorithm for robot path planning gets stuck in an infinite loop. Why does my fix work and is it any slower?

Submitted by 谁说我不能喝 on 2020-05-08 17:22:32
Question: For a robotics project I've been working on, I want to use D* Lite (the optimized version) from the paper by Koenig (2002) for dynamic path planning over a changing occupancy grid / cost map. The idea of the D* Lite search algorithm, as described in the paper, is that it basically runs A* search in reverse, starting from the goal and attempting to work back to the start. The solver then gives out the current solution and waits for some kind of change in the weights or obstacles that it is …
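
The excerpt does not include the questioner's implementation, so as a rough illustration of the "A* in reverse" idea only (a plain backward Dijkstra, i.e. A* with a zero heuristic, not the full D* Lite with key pairs, rhs-values, and incremental repair) here is a sketch on a hypothetical 4-connected grid:

    import heapq

    def backward_search(grid, start, goal):
        """Expand from the goal toward the start (Dijkstra on a grid).

        grid: 2D list, 0 = free, 1 = obstacle; start/goal: (row, col) tuples.
        Returns the cost-to-goal map g; a path can be extracted by greedily
        stepping from start to the neighbour with the smallest g value.
        """
        rows, cols = len(grid), len(grid[0])
        g = {goal: 0.0}
        open_heap = [(0.0, goal)]
        while open_heap:
            cost, cell = heapq.heappop(open_heap)
            if cost > g.get(cell, float("inf")):
                continue  # stale heap entry
            if cell == start:
                break
            r, c = cell
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                    new_cost = cost + 1.0
                    if new_cost < g.get((nr, nc), float("inf")):
                        g[(nr, nc)] = new_cost
                        heapq.heappush(open_heap, (new_cost, (nr, nc)))
        return g

    # Hypothetical 4x4 map: plan from (0, 0) to (3, 3).
    world = [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 1, 0, 0],
             [0, 0, 0, 0]]
    print(backward_search(world, (0, 0), (3, 3)).get((0, 0)))  # 6.0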

Scikit-learn using GridSearchCV on DecisionTreeClassifier

Submitted by 我是研究僧i on 2020-03-21 20:01:10
Question: I tried to use GridSearchCV on DecisionTreeClassifier, but I get the following error: TypeError: unbound method get_params() must be called with DecisionTreeClassifier instance as first argument (got nothing instead). Here's my code: from sklearn.tree import DecisionTreeClassifier, export_graphviz from sklearn.grid_search import GridSearchCV from sklearn.cross_validation import cross_val_score X, Y = createDataSet(filename) tree_para = {'criterion': ['gini', 'entropy'], 'max_depth': [4, 5, 6, 7, 8, 9, 10 …
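
That particular TypeError usually means the class itself was passed to GridSearchCV instead of an instance. A sketch of the corrected call with current module paths (sklearn.grid_search and sklearn.cross_validation have since been removed in favour of sklearn.model_selection), using a placeholder dataset instead of createDataSet:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    X, Y = load_iris(return_X_y=True)  # stand-in for createDataSet(filename)

    tree_para = {"criterion": ["gini", "entropy"],
                 "max_depth": [4, 5, 6, 7, 8, 9, 10]}

    # Note the parentheses: pass an *instance*, not the class itself,
    # otherwise get_params() is called unbound and raises the TypeError.
    clf = GridSearchCV(DecisionTreeClassifier(), tree_para, cv=5)
    clf.fit(X, Y)
    print(clf.best_params_)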

How to graph grid scores from GridSearchCV?

Submitted by 假装没事ソ on 2020-03-17 03:51:58
Question: I am looking for a way to graph grid_scores_ from GridSearchCV in sklearn. In this example I am trying to grid-search for the best gamma and C parameters for an SVR algorithm. My code looks as follows: C_range = 10.0 ** np.arange(-4, 4) gamma_range = 10.0 ** np.arange(-4, 4) param_grid = dict(gamma=gamma_range.tolist(), C=C_range.tolist()) grid = GridSearchCV(SVR(kernel='rbf', gamma=0.1), param_grid, cv=5) grid.fit(X_train, y_train) print(grid.grid_scores_) After I run the code and print the grid …
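
In recent scikit-learn versions grid_scores_ has been replaced by cv_results_. A sketch of plotting the mean test scores as a C-by-gamma heatmap, with a placeholder training set; the reshape follows GridSearchCV's alphabetical parameter ordering ("C" outer, "gamma" inner), which is worth double-checking against cv_results_["params"]:

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    X_train, y_train = make_regression(n_samples=200, n_features=5, random_state=0)

    C_range = 10.0 ** np.arange(-4, 4)
    gamma_range = 10.0 ** np.arange(-4, 4)
    param_grid = dict(gamma=gamma_range.tolist(), C=C_range.tolist())

    grid = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5)
    grid.fit(X_train, y_train)

    # One mean score per (C, gamma) pair; rows index C, columns index gamma.
    scores = grid.cv_results_["mean_test_score"].reshape(len(C_range),
                                                         len(gamma_range))

    plt.imshow(scores, interpolation="nearest", cmap=plt.cm.viridis)
    plt.xlabel("gamma")
    plt.ylabel("C")
    plt.xticks(np.arange(len(gamma_range)), gamma_range, rotation=45)
    plt.yticks(np.arange(len(C_range)), C_range)
    plt.colorbar(label="mean CV score")
    plt.show()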

GridSearchCV: print some expression each time a function completes a loop

Submitted by 拥有回忆 on 2020-03-02 06:56:25
Question: Assume you have some function function in Python that works by looping: for example, it could be a function that evaluates a certain mathematical expression, e.g. x**2, for all elements from an array, e.g. ([1, 2, ..., 100]) (obviously this is a toy example). Would it be possible to write code such that, each time function goes through a loop and obtains a result, some code is executed, e.g. print("Loop %s has been executed" % i)? So, in our example, when x**1 has been computed, the …
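
For a loop you wrote yourself, the print can simply go inside the loop body; for GridSearchCV specifically, the closest built-in equivalent is the verbose argument, which logs each fit as it completes. A rough sketch of both, with placeholder values:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    # Toy version: report progress from inside a plain Python loop.
    for i, x in enumerate(np.arange(1, 101), start=1):
        _ = x ** 2
        if i % 25 == 0:
            print("Loop %s has been executed" % i)

    # GridSearchCV equivalent: verbose prints a line per candidate/fold fit.
    X, y = load_iris(return_X_y=True)
    grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3, verbose=2, n_jobs=1)
    grid.fit(X, y)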

How to save GridSearchCV object?

Submitted by 半腔热情 on 2020-02-05 13:53:08
Question: Lately, I have been working on applying grid-search cross-validation (sklearn GridSearchCV) for hyper-parameter tuning in Keras with the TensorFlow backend. As soon as my model is tuned, I try to save the GridSearchCV object for later use, without success. The hyper-parameter tuning is done as follows: x_train, x_val, y_train, y_val = train_test_split(NN_input, NN_target, train_size = 0.85, random_state = 4) history = History() kfold = 10 regressor = KerasRegressor(build_fn = create_keras …
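
For a plain scikit-learn estimator the fitted search object can simply be dumped with joblib; with a KerasRegressor inside, the wrapped network usually is not picklable, so a common workaround is to persist best_params_ and cv_results_ this way and save the Keras model itself separately through its own save method. A minimal sketch of the joblib part with a placeholder sklearn model standing in for the Keras regressor:

    import joblib
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import GridSearchCV

    X, y = make_regression(n_samples=200, n_features=10, random_state=4)

    search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=10)
    search.fit(X, y)

    # Works as-is for picklable estimators; with a Keras wrapper, dump only
    # best_params_ / cv_results_ and call save() on the network itself.
    joblib.dump(search, "grid_search.pkl")

    restored = joblib.load("grid_search.pkl")
    print(restored.best_params_)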

GridSearchCV.best_score_ meaning when scoring is set to 'accuracy' with CV

Submitted by 自古美人都是妖i on 2020-02-01 08:30:08
Question: I'm trying to find the best Neural Network model for the classification of breast cancer samples on the well-known Wisconsin Cancer dataset (569 samples, 31 features + target). I'm using sklearn 0.18.1. I'm not using normalization so far; I'll add it once I solve this question. # some init code omitted X_train, X_test, y_train, y_test = train_test_split(X, y) # Define the NN params for the GridSearchCV tuned_params = [{'solver': ['sgd'], 'learning_rate': ['constant'], "learning …
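
With scoring='accuracy', best_score_ is the mean cross-validated accuracy of the best parameter combination, i.e. the entry of cv_results_['mean_test_score'] at best_index_, not a score on a held-out test set. A small sketch with a trimmed, hypothetical parameter grid:

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    tuned_params = [{"solver": ["sgd"], "learning_rate": ["constant"],
                     "hidden_layer_sizes": [(10,), (30,)]}]

    gs = GridSearchCV(MLPClassifier(max_iter=500), tuned_params,
                      scoring="accuracy", cv=5)
    gs.fit(X_train, y_train)

    # Identical values: mean CV accuracy of the winning candidate.
    print(gs.best_score_)
    print(gs.cv_results_["mean_test_score"][gs.best_index_])

    # A separate number: accuracy of the refitted model on the test split.
    print(gs.score(X_test, y_test))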

sklearn GridSearchCV (Scoring Function error)

Submitted by 旧城冷巷雨未停 on 2020-01-24 21:47:26
Question: I was wondering if you could help me with an error I am receiving when running grid search. I think it might be due to a misunderstanding of how grid search actually works. I am now running an application where I need grid search to evaluate the best parameters using a different scoring function. I am using RandomForestClassifier to fit a large X dataset to a characterization vector Y, which is a list of 0s and 1s (completely binary). My scoring function (MCC) requires the prediction input and actual …
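
A frequent source of that kind of error is passing a metric that expects (y_true, y_pred) directly as the scoring argument; GridSearchCV wants a scorer built with make_scorer, which calls predict internally. A sketch with a placeholder binary dataset and a shortened forest grid:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import make_scorer, matthews_corrcoef
    from sklearn.model_selection import GridSearchCV

    X, Y = make_classification(n_samples=300, n_features=20, random_state=0)

    # make_scorer wraps the (y_true, y_pred) metric so GridSearchCV can call
    # it as scorer(estimator, X, y) and run predict() itself.
    mcc_scorer = make_scorer(matthews_corrcoef)

    param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
    grid = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                        scoring=mcc_scorer, cv=5)
    grid.fit(X, Y)
    print(grid.best_params_, grid.best_score_)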

RandomForestRegressor and feature_importances_ error

Submitted by 泪湿孤枕 on 2020-01-24 04:01:08
Question: I am struggling to pull the feature importances out of my RandomForestRegressor; I get: AttributeError: 'GridSearchCV' object has no attribute 'feature_importances_'. Does anyone know why there is no such attribute? According to the documentation, this attribute should exist. The full code: from sklearn.ensemble import RandomForestRegressor from sklearn.model_selection import GridSearchCV # Running a RandomForestRegressor GridSearchCV to tune the model. parameter_candidates = { 'n_estimators' : …
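
GridSearchCV itself has no feature_importances_; the attribute lives on the fitted regressor inside it, reachable through best_estimator_ after the refit. A sketch with placeholder data and a trimmed candidate grid:

    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV

    X, y = make_regression(n_samples=300, n_features=8, random_state=0)

    parameter_candidates = {"n_estimators": [50, 100], "max_depth": [None, 5]}
    grid = GridSearchCV(RandomForestRegressor(random_state=0),
                        parameter_candidates, cv=5)
    grid.fit(X, y)

    # The importances belong to the refitted RandomForestRegressor,
    # not to the GridSearchCV wrapper around it.
    print(grid.best_estimator_.feature_importances_)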