hyperparameters

thread.lock during custom parameter search class using Dask distributed

落花浮王杯 submitted on 2019-12-11 08:06:03
Question: I wrote my own parameter search implementation, mostly because I don't need the cross-validation that scikit-learn's GridSearchCV and RandomizedSearchCV perform. I use Dask to distribute the work efficiently. Here is what I have:

    from scipy.stats import uniform

    class Params(object):
        def __init__(self, fixed, loc=0.0, scale=1.0):
            self.fixed = fixed
            self.sched = uniform(loc=loc, scale=scale)

        def _getsched(self, i, size):
            return self.sched.rvs(size=size, random_state=i)

        def param(self, i, size=None):
            tmp

Hyperparameter tuning locally — Tensorflow Google Cloud ML Engine

吃可爱长大的小学妹 submitted on 2019-12-11 06:08:50
Question: Is it possible to use ML Engine's hyperparameter tuning while training the model locally? The documentation only covers hyperparameter tuning in the cloud (by submitting a job) and makes no mention of doing so locally. If not, is there another commonly used hyperparameter tuning tool that passes command-line arguments to task.py, as in the census estimator tutorial? https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census

Answer 1: You cannot perform HPTuning (Bayesian
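One way to picture a purely local search that drives a task.py-style trainer through command-line arguments is a plain grid loop. The sketch below is only an illustration under stated assumptions: the flag names `--learning-rate` and `--hidden-units`, the `parse_args` helper, and the toy `train_and_evaluate` objective are all hypothetical and are not taken from the census sample. It performs an exhaustive grid search, not the Bayesian tuning that the cloud service offers.

```python
import argparse
import itertools


def parse_args(argv):
    """Stand-in for the argument parsing a task.py-style trainer would do."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--learning-rate", type=float, required=True)  # hypothetical flag
    parser.add_argument("--hidden-units", type=int, required=True)     # hypothetical flag
    return parser.parse_args(argv)


def train_and_evaluate(args):
    """Toy objective; a real trainer would fit a model and return a metric."""
    return (args.learning_rate - 0.01) ** 2 + (args.hidden_units - 64) ** 2 * 1e-6


# Hypothetical search space: one list of candidate values per flag.
grid = {
    "--learning-rate": [0.001, 0.01, 0.1],
    "--hidden-units": [32, 64, 128],
}

results = []
for values in itertools.product(*grid.values()):
    # Build the argument list exactly as it would appear on the command line.
    argv = []
    for flag, value in zip(grid.keys(), values):
        argv += [flag, str(value)]
    loss = train_and_evaluate(parse_args(argv))
    results.append((loss, argv))

# Lowest loss wins; the key avoids comparing the argument lists themselves.
best_loss, best_argv = min(results, key=lambda r: r[0])
print(best_loss, best_argv)
```

Each argument list could instead be handed to a separate process that runs the real trainer; the in-process call here just keeps the sketch self-contained.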

Parameter search using dask

折月煮酒 submitted on 2019-12-11 05:29:46
Question: How can I search a parameter space optimally using Dask (no cross-validation)? Here is the code (no Dask here):

    import numpy as np

    def build(ntries, param, niter, func, score, train, test):
        # Draw one reproducible batch of candidates per try and evaluate it.
        res = []
        for i in range(ntries):
            cparam = param.rvs(size=niter, random_state=i)
            res.append(func(cparam, train, test, score))
        return res

    def score(test, correct):
        return np.linalg.norm(test - correct)

    def compute_optimal(res):
        from operator import itemgetter
        # key= is required in Python 3 (the original passed it positionally).
        _sorted = sorted(res, key=itemgetter(1))
        return _sorted

    def func(c, train, test, score):
        dt=1
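A search like this is embarrassingly parallel: each try depends only on its seed. The sketch below illustrates that shape with the standard library's `ThreadPoolExecutor` as a stand-in for a Dask cluster; it is not Dask code. The toy objective and the names `evaluate` and `search` are assumptions made for the example. With Dask distributed, `Client.map` followed by `Client.gather` plays the same role as `pool.map` here.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def evaluate(seed):
    """Draw one candidate from a seeded generator and score it on a toy objective."""
    rng = random.Random(seed)       # per-try seed keeps every draw reproducible
    c = rng.uniform(0.0, 1.0)
    error = abs(c - 0.3)            # hypothetical objective: distance to 0.3
    return c, error


def search(ntries, max_workers=4):
    """Evaluate all tries in parallel and rank them by error, best first."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(evaluate, range(ntries)))
    return sorted(results, key=lambda pair: pair[1])


ranked = search(50)
best_param, best_error = ranked[0]
print(best_param, best_error)
```

Seeding each try by its index, as the question's `random_state=i` does, means the ranking does not depend on which worker ran which try.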

Putting together sklearn pipeline+nested cross-validation for KNN regression

隐身守侯 submitted on 2019-12-07 15:50:21
Question: I'm trying to figure out how to build a workflow for sklearn.neighbors.KNeighborsRegressor that:

- normalizes features
- performs feature selection (best subset of 20 numeric features, no specific total)
- cross-validates hyperparameter K in the range 1 to 20
- cross-validates the model
- uses RMSE as the error metric

There are so many different options in scikit-learn that I'm a bit overwhelmed trying to decide which classes I need. Besides sklearn.neighbors.KNeighborsRegressor, I think I need: sklearn.pipeline

`warm_start` Parameter And Its Impact On Computational Time

梦想的初衷 submitted on 2019-12-07 08:23:48
Question: I have a logistic regression model with a defined set of parameters (warm_start=True). As always, I call LogisticRegression.fit(X_train, y_train) and then use the model to predict new outcomes. Suppose I alter some parameters, say C=100, and call .fit again on the same training data. In theory, I would expect the second .fit call to take less computational time than it would for a model with warm_start=False. Empirically, however, that is not the case. Please, help me
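The intuition behind warm starting can be checked on a toy problem. The sketch below is a pure-Python stand-in, not scikit-learn code: it runs gradient descent on a one-dimensional ridge objective and counts the updates needed when starting from zero versus starting from the solution found under a different regularization strength. The function name `fit_ridge_1d` and the data are hypothetical. In scikit-learn itself the effect depends on the solver; the documentation notes that `warm_start` is useless for the liblinear solver.

```python
def fit_ridge_1d(xs, ys, lam, w0=0.0, lr=0.01, tol=1e-8, max_iter=10_000):
    """Gradient descent on sum((w*x - y)^2) + lam*w^2; returns (w, updates)."""
    a = sum(x * x for x in xs)
    s = sum(x * y for x, y in zip(xs, ys))
    w = w0
    for updates in range(max_iter):
        grad = 2.0 * (a + lam) * w - 2.0 * s
        if abs(grad) < tol:
            return w, updates      # converged after `updates` steps
        w -= lr * grad
    return w, max_iter


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# First fit with strong regularization.
w_first, _ = fit_ridge_1d(xs, ys, lam=1.0)

# Refit with weaker regularization: cold start from 0 vs warm start from w_first.
_, cold_updates = fit_ridge_1d(xs, ys, lam=0.1, w0=0.0)
_, warm_updates = fit_ridge_1d(xs, ys, lam=0.1, w0=w_first)

print(cold_updates, warm_updates)
```

The saving is modest because the optimizer converges quickly from any start; when a fit is already cheap, or when the solver ignores the previous coefficients, the measured time difference can vanish.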

specify scoring metric in GridSearch function with hypopt package in python

孤人 submitted on 2019-12-06 08:06:16
I'm using the GridSearch class from the hypopt package to do my hyperparameter search with a specified validation set. The default metric for classification seems to be accuracy (I'm not entirely sure). I want to use the F1 score as the metric instead, but I don't know where to specify it. I looked at the documentation and am still confused. Does anyone familiar with the hypopt package know how to do this? Thanks a lot in advance.

    from hypopt import GridSearch
    from sklearn.linear_model import LogisticRegression

    log_reg_params = {"penalty": ['l1'], 'C': [0.001, 0.01]}
    opt = GridSearch(model=LogisticRegression())
    opt.fit(X_train, y_train, log_reg
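For comparison, the idea of scoring each grid point on a fixed validation set with F1 can be written directly with scikit-learn. This is a sketch that deliberately does not use hypopt, so it says nothing about where hypopt expects the metric. The synthetic data, the extended `C` grid, and the choice of `solver="liblinear"` (needed here for the L1 penalty) are assumptions for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import ParameterGrid, train_test_split

# Synthetic stand-in data with a fixed train/validation split.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

param_grid = {"penalty": ["l1"], "C": [0.001, 0.01, 0.1, 1.0]}

best_score, best_params = -1.0, None
for params in ParameterGrid(param_grid):
    # liblinear supports the L1 penalty used in the grid.
    model = LogisticRegression(solver="liblinear", **params)
    model.fit(X_train, y_train)
    score = f1_score(y_val, model.predict(X_val))   # metric of interest
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)
```

Swapping `f1_score` for another function from `sklearn.metrics` changes the selection criterion without touching the loop.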

Putting together sklearn pipeline+nested cross-validation for KNN regression

与世无争的帅哥 submitted on 2019-12-06 01:38:47
I'm trying to figure out how to build a workflow for sklearn.neighbors.KNeighborsRegressor that:

- normalizes features
- performs feature selection (best subset of 20 numeric features, no specific total)
- cross-validates hyperparameter K in the range 1 to 20
- cross-validates the model
- uses RMSE as the error metric

There are so many different options in scikit-learn that I'm a bit overwhelmed trying to decide which classes I need. Besides sklearn.neighbors.KNeighborsRegressor, I think I need:

- sklearn.pipeline.Pipeline
- sklearn.preprocessing.Normalizer
- sklearn.model_selection.GridSearchCV
- sklearn.model_selection
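The requirements listed in the question map fairly naturally onto a pipeline wrapped in a grid search (inner loop) that is itself cross-validated (outer loop). The following is a minimal sketch under several assumptions rather than a definitive answer: it uses `StandardScaler` instead of the `Normalizer` the question mentions, approximates "best subset" with univariate `SelectKBest`, and runs on synthetic data. RMSE is obtained by taking the square root of the negated `neg_mean_squared_error` scores.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 20 numeric features.
X, y = make_regression(
    n_samples=150, n_features=20, n_informative=5, noise=10.0, random_state=0
)

pipe = Pipeline([
    ("scale", StandardScaler()),                       # assumption: standardize
    ("select", SelectKBest(score_func=f_regression)),  # assumption: univariate selection
    ("knn", KNeighborsRegressor()),
])

param_grid = {
    "select__k": [5, 10, 20],                  # number of features kept
    "knn__n_neighbors": list(range(1, 21)),    # K from 1 to 20
}

inner = KFold(n_splits=3, shuffle=True, random_state=0)  # tunes hyperparameters
outer = KFold(n_splits=3, shuffle=True, random_state=1)  # estimates generalization

search = GridSearchCV(pipe, param_grid, cv=inner, scoring="neg_mean_squared_error")
neg_mse = cross_val_score(search, X, y, cv=outer, scoring="neg_mean_squared_error")
rmse_per_fold = np.sqrt(-neg_mse)
print(rmse_per_fold)
```

Because scaling and selection live inside the pipeline, they are refit on each training fold, which keeps information from the held-out fold out of the preprocessing.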

Grid Search the number of hidden layers with keras

微笑、不失礼 submitted on 2019-12-05 05:27:33
Question: I am trying to optimize the hyperparameters of my NN using Keras and sklearn. I am wrapping the model with KerasClassifier (it's a classification problem). I am trying to optimize the number of hidden layers, but I can't figure out how to do it with Keras (specifically, I am wondering how to set up the create_model function so that the number of hidden layers can be optimized). Could anyone please help me? My code (just the important part):

    ## Import `Sequential` from `keras.models`
    from keras.models import
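The usual pattern is to make the layer count an argument of the model-building function, so that a grid search can vary it like any other hyperparameter. The sketch below shows only that pattern and is framework-free: `create_model` returns a list of layer specifications as a stand-in for a Keras `Sequential` model, and the argument names `n_hidden` and `units` are hypothetical. In Keras, each iteration of the loop would correspond to one `model.add(Dense(...))` call.

```python
import itertools


def create_model(n_hidden=1, units=16, n_inputs=8, n_outputs=1):
    """Return a list of (layer_kind, size) specs; a stand-in for a Sequential model."""
    layers = [("input", n_inputs)]
    for _ in range(n_hidden):
        # One hidden layer per iteration, so n_hidden controls the depth.
        layers.append(("dense_relu", units))
    layers.append(("dense_sigmoid", n_outputs))
    return layers


# Hypothetical grid: depth and width are searched together.
param_grid = {"n_hidden": [1, 2, 3], "units": [16, 32]}
candidates = [
    dict(zip(param_grid, combo))
    for combo in itertools.product(*param_grid.values())
]
models = [create_model(**params) for params in candidates]

for params, model in zip(candidates, models):
    print(params, len(model) - 2, "hidden layers")
```

A scikit-learn-style wrapper would receive the same keyword arguments from the grid and pass them through to the build function.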

Optimize the Kernel parameters of RBF kernel for GPR in scikit-learn using internally supported optimizers

时间秒杀一切 submitted on 2019-12-05 05:09:30
Question: The basic equation of the squared exponential (RBF) kernel is as follows: Here l is the length scale and sigma is the variance parameter. The length scale controls how similar two points appear, since it simply rescales the distance between x and x'. The variance parameter controls how smooth the function is. I want to optimize/train these parameters (l and sigma) with my training data sets. My training data sets are in the following form:

X: 2-D Cartesian coordinates as input data
y: radio
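The kernel equation itself did not survive in the excerpt. The usual squared exponential form, which matches the description of l and sigma, is k(x, x') = sigma^2 * exp(-||x - x'||^2 / (2 * l^2)); that this is the form the question intended is an assumption. In scikit-learn this kernel can be expressed as `ConstantKernel * RBF`, and `GaussianProcessRegressor.fit` tunes both hyperparameters by maximizing the log marginal likelihood with its default optimizer, `fmin_l_bfgs_b`. The sketch below uses synthetic 2-D inputs and a made-up target as stand-ins for the question's data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Synthetic stand-in data: 2-D coordinates and a smooth target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(40, 2))
y = np.sin(X[:, 0]) + 0.5 * np.cos(X[:, 1])

# sigma^2 is the ConstantKernel value, l is the RBF length scale.
kernel = ConstantKernel(1.0, (1e-3, 1e3)) * RBF(
    length_scale=1.0, length_scale_bounds=(1e-2, 1e2)
)

gpr = GaussianProcessRegressor(
    kernel=kernel,
    alpha=1e-6,               # small jitter for numerical stability
    n_restarts_optimizer=5,   # restart the optimizer from random initial values
    random_state=0,
)
gpr.fit(X, y)

# kernel_ holds the fitted kernel; k1 and k2 are the two factors of the product.
fitted = gpr.kernel_
sigma2 = fitted.k1.constant_value
length_scale = fitted.k2.length_scale
print(fitted)
```

The bounds passed to each kernel restrict the search range; a fitted value that sits on a bound usually suggests widening it.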

keras/scikit-learn: using fit_generator() with cross validation

我怕爱的太早我们不能终老 submitted on 2019-12-05 02:23:28
Question: Is it possible to use Keras's scikit-learn API together with the fit_generator() method, or is there another way to yield batches for training? I'm using SciPy sparse matrices, which must be converted to NumPy arrays before being passed to Keras, but I can't convert them all at once because of high memory consumption. Here is my function to yield batches:

    def batch_generator(X, y, batch_size):
        n_splits = len(X) // (batch_size - 1)
        X = np.array_split(X, n_splits)
        y = np.array_split(y, n_splits)
        while
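One way to keep memory bounded is to slice the sparse matrix by rows and densify only the current slice. The sketch below is a possible shape for such a generator, not the asker's finished function; the random CSR matrix and labels are synthetic. Note that `len()` raises a `TypeError` on a SciPy sparse matrix, so the row count is taken from `X.shape[0]`.

```python
import numpy as np
from scipy import sparse


def batch_generator(X, y, batch_size):
    """Yield (dense_X_batch, y_batch) forever, converting one slice at a time."""
    n_samples = X.shape[0]          # len(X) is not defined for sparse matrices
    while True:                     # training loops expect an endless generator
        for start in range(0, n_samples, batch_size):
            stop = min(start + batch_size, n_samples)
            # Only this slice is converted to a dense NumPy array.
            yield X[start:stop].toarray(), y[start:stop]


# Synthetic stand-in data: 10 samples, 5 features, stored in CSR format.
X = sparse.random(10, 5, density=0.3, format="csr", random_state=0)
y = np.arange(10)

gen = batch_generator(X, y, batch_size=4)
first_X, first_y = next(gen)
print(first_X.shape, first_y)
```

CSR is the convenient format here because row slicing is efficient; the last batch of each pass is simply shorter when the sample count is not a multiple of the batch size.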