hyperparameters | 易学教程

thread.lock during custom parameter search class using Dask distributed

Question: I wrote my own parameter search implementation, mainly because I don't need the cross-validation performed by scikit-learn's GridSearchCV and RandomizedSearchCV. I use Dask to distribute the work. Here is what I have: from scipy.stats import uniform class Params(object): def __init__(self,fixed,loc=0.0,scale=1.0): self.fixed=fixed self.sched=uniform(loc=loc,scale=scale) def _getsched(self,i,size): return self.sched.rvs(size=size,random_state=i) def param(self,i,size=None): tmp

Hyperparameter tuning locally — Tensorflow Google Cloud ML Engine

Question: Is it possible to tune hyperparameters using ML Engine while training the model locally? The documentation only mentions hyperparameter tuning in the cloud (by submitting a job) and makes no mention of doing so locally. Otherwise, is there another commonly used hyperparameter tuning approach that passes command-line arguments to task.py, as in the census estimator tutorial? https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census Answer 1: You cannot perform HPTuning (Bayesian

Parameter search using dask

Question: How can I optimally search a parameter space using Dask (no cross-validation)? Here is the code (no Dask here): def build(ntries,param,niter,func,score,train,test): res=[] for i in range(ntries): cparam=param.rvs(size=niter,random_state=i) res.append( func(cparam, train, test, score) ) return res def score(test,correct): return np.linalg.norm(test-correct) def compute_optimal(res): from operator import itemgetter _sorted=sorted(res,key=itemgetter(1)) return _sorted def func(c,train,test,score): dt=1
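
The pattern this question describes (sample candidates, fit on a training set, score on a held-out set, sort by score) can be sketched with the standard library alone. This is a minimal illustration and not the asker's code: the toy ridge model, the names `fit`, `trial`, and `search`, and the use of `ThreadPoolExecutor` in place of Dask are all assumptions made for the example. With `dask.distributed`, `Client.map` followed by `Client.gather` would play a role similar to `pool.map` here.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def fit(train, lam):
    """Fit y = c * x by ridge-regularised least squares; lam is the hyperparameter."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)


def score(c, test):
    """Euclidean norm of the residuals on the held-out set (lower is better)."""
    return math.sqrt(sum((c * x - y) ** 2 for x, y in test))


def trial(seed, train, test):
    """One independent trial: sample a candidate reproducibly, fit, then score."""
    lam = random.Random(seed).uniform(0.0, 10.0)
    return lam, score(fit(train, lam), test)


def search(ntries, train, test, max_workers=4):
    """Run all trials in a worker pool and return them sorted by score."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda s: trial(s, train, test), range(ntries)))
    return sorted(results, key=lambda r: r[1])


# Hypothetical toy data: a noise-free line, split into train and test parts.
train = [(x, 2.0 * x) for x in range(1, 6)]
test = [(x, 2.0 * x) for x in range(6, 11)]
ranked = search(20, train, test)
best_lam, best_score = ranked[0]
```

Because each trial depends only on its seed and the shared data, the trials are independent of one another, which is what makes this kind of search straightforward to distribute.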

Putting together sklearn pipeline+nested cross-validation for KNN regression

Question: I'm trying to figure out how to build a workflow for sklearn.neighbors.KNeighborsRegressor that: normalizes features; performs feature selection (best subset of 20 numeric features, no specific total); cross-validates the hyperparameter K in the range 1 to 20; cross-validates the model; and uses RMSE as the error metric. There are so many different options in scikit-learn that I'm a bit overwhelmed trying to decide which classes I need. Besides sklearn.neighbors.KNeighborsRegressor, I think I need: sklearn.pipeline.Pipeline sklearn.preprocessing.Normalizer sklearn.model_selection.GridSearchCV sklearn.model_selection
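
The nested cross-validation structure asked about here can be sketched in plain Python to show how the two loops relate. This is an illustration under simplifying assumptions, not a scikit-learn recipe: it uses a single feature, skips normalization and feature selection, and the helper names (`knn_predict`, `folds`, `best_k`, `nested_cv`) and the synthetic data are invented for the example. In scikit-learn terms, the inner loop corresponds roughly to GridSearchCV and the outer loop to scoring that tuned estimator with a second round of cross-validation.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(train, test, k):
    """Root mean squared error of k-NN predictions on the test points."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
    return math.sqrt(sum(errors) / len(errors))


def folds(data, n):
    """Yield (train, test) pairs for n-fold cross-validation."""
    for i in range(n):
        test = data[i::n]
        train = [p for j, p in enumerate(data) if j % n != i]
        yield train, test


def best_k(data, ks, n_inner):
    """Inner loop: pick the k with the lowest mean RMSE across inner folds."""
    def mean_rmse(k):
        scores = [rmse(tr, te, k) for tr, te in folds(data, n_inner)]
        return sum(scores) / len(scores)
    return min(ks, key=mean_rmse)


def nested_cv(data, ks, n_outer=5, n_inner=3):
    """Outer loop: score the tuned model on data the tuning never saw."""
    results = []
    for train, test in folds(data, n_outer):
        k = best_k(train, ks, n_inner)
        results.append((k, rmse(train, test, k)))
    return results


# Hypothetical data: a noisy sine curve with one input feature.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(60)]
data = [(x, math.sin(x) + rng.gauss(0, 0.1)) for x in xs]
results = nested_cv(data, ks=range(1, 21))
```

Each entry of `results` pairs the K chosen on one outer training split with the RMSE measured on the matching outer test split, so the reported error is not biased by the tuning step.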

`warm_start` Parameter And Its Impact On Computational Time

Question: I have a logistic regression model with a defined set of parameters (warm_start=True). As always, I call LogisticRegression.fit(X_train, y_train) and then use the model to predict new outcomes. Suppose I alter some parameters, say C=100, and call the .fit method again using the same training data. Theoretically, I think the second .fit should take less computational time than it would for a model with warm_start=False. However, empirically this is not the case. Please, help me
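
The idea behind warm starting, reusing the previous solution as the starting point of the next fit, can be illustrated with a hand-written one-feature logistic regression. This sketch uses plain gradient descent and does not reproduce scikit-learn's solvers, so it says nothing about their timings; the function names and the four-point dataset are assumptions for the example. Note also that scikit-learn's documentation describes warm_start as having no effect with the liblinear solver.

```python
import math


def gradient(w, data, C):
    """Gradient of the L2-regularised logistic loss for a one-feature model."""
    g = w / C
    for x, y in data:
        g += (1.0 / (1.0 + math.exp(-w * x)) - y) * x
    return g


def fit(data, C, w_init=0.0, lr=0.1, tol=1e-6, max_iter=100_000):
    """Plain gradient descent; returns the weight and the iterations used."""
    w = w_init
    for n_iter in range(max_iter):
        g = gradient(w, data, C)
        if abs(g) < tol:
            return w, n_iter
        w -= lr * g
    return w, max_iter


# Hypothetical toy data: (feature, label) pairs.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w_first, _ = fit(data, C=1.0)                             # initial fit
w_cold, cold_iters = fit(data, C=100.0)                   # refit starting from zero
w_warm, warm_iters = fit(data, C=100.0, w_init=w_first)   # refit from the previous weight
```

Both refits reach the same weight up to the tolerance. In this toy setting the warm-started run needs no more iterations than the cold one, but how much is saved depends on how far the optimum moves when C changes.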

specify scoring metric in GridSearch function with hypopt package in python

I'm using the GridSearch function from the hypopt package to search hyperparameters against a specified validation set. The default metric for classification seems to be accuracy (I'm not sure). I want to use the F1 score as the metric instead, but I don't know where to specify it. I looked at the documentation but am still confused. Does anyone familiar with the hypopt package know how to do this? Thanks a lot in advance. from hypopt import GridSearch log_reg_params = {"penalty": ['l1'], 'C': [0.001, 0.01]} opt = GridSearch(model=LogisticRegression()) opt.fit(X_train, y_train, log_reg

Grid Search the number of hidden layers with keras

Question: I am trying to optimize the hyperparameters of my NN using Keras and sklearn. I am wrapping the model with KerasClassifier (it's a classification problem). I am trying to optimize the number of hidden layers. I can't figure out how to do it with Keras (specifically, I am wondering how to set up the function create_model in order to optimize the number of hidden layers). Could anyone please help me? My code (just the important part): ## Import `Sequential` from `keras.models` from keras.models import

Optimize the Kernel parameters of RBF kernel for GPR in scikit-learn using internally supported optimizers

Question: The basic equation of the squared exponential (RBF) kernel is as follows: [equation image omitted]. Here l is the length scale and sigma is the variance parameter. The length scale controls how similar two points appear to be, as it simply magnifies the distance between x and x'. The variance parameter controls how smooth the function is. I want to optimize/train these parameters (l and sigma) with my training data sets. My training data sets are in the following form: X : 2-D Cartesian coordinate as input data y : radio
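
The equation itself did not survive in this excerpt. For reference, the squared exponential kernel is commonly written as below, assuming that \sigma^2 denotes the signal variance and l the length scale; the asker's exact notation may differ.

```latex
k(x, x') = \sigma^2 \exp\!\left( -\frac{\lVert x - x' \rVert^2}{2 l^2} \right)
```

In scikit-learn this form is usually expressed as a product of a ConstantKernel (the \sigma^2 factor) and an RBF kernel (the length scale l), and GaussianProcessRegressor tunes such kernel hyperparameters during fit by maximising the log marginal likelihood.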

keras/scikit-learn: using fit_generator() with cross validation

Question: Is it possible to use Keras's scikit-learn API together with the fit_generator() method? Or is there another way to yield batches for training? I'm using SciPy's sparse matrices, which must be converted to NumPy arrays before being passed to Keras, but I can't convert them all at once because of high memory consumption. Here is my function to yield batches: def batch_generator(X, y, batch_size): n_splits = len(X) // (batch_size - 1) X = np.array_split(X, n_splits) y = np.array_split(y, n_splits) while
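
The memory concern in this question, converting only one batch to dense form at a time, can be sketched as a generator that slices by row range and densifies lazily. This is a standard-library illustration, not the asker's function: the `to_dense` parameter is a hypothetical hook, and plain lists stand in for the sparse matrix. For a SciPy sparse matrix one would pass something like `lambda m: m.toarray()`; the row count is read from `shape[0]` when available because `len()` is not defined for SciPy sparse matrices.

```python
from itertools import islice


def batch_generator(X, y, batch_size, to_dense=list):
    """Yield (X_batch, y_batch) pairs forever, densifying one X batch at a time."""
    n_rows = X.shape[0] if hasattr(X, "shape") else len(X)
    while True:  # fit_generator expects the generator to loop indefinitely
        for start in range(0, n_rows, batch_size):
            stop = min(start + batch_size, n_rows)
            # Only this slice is converted, so peak memory is one dense batch.
            yield to_dense(X[start:stop]), y[start:stop]


# Hypothetical stand-in data: lists instead of a sparse matrix.
X = list(range(10))
y = [v % 2 for v in X]
batches = list(islice(batch_generator(X, y, batch_size=4), 4))
```

Here the first three batches cover the ten rows (sizes 4, 4, and 2) and the fourth wraps around to the start, which is the behaviour an epoch-based training loop relies on.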