gradient-descent

Multi-layer neural network back-propagation formula (using stochastic gradient descent)

Submitted by 一曲冷凌霜 on 2019-12-08 05:13:20
Question: Using the notations from Backpropagation calculus | Deep learning, chapter 4, I have this back-propagation code for a 4-layer (i.e. 2 hidden layers) neural network:

def sigmoid_prime(z):
    return z * (1 - z)  # because σ'(x) = σ(x) (1 - σ(x))

def train(self, input_vector, target_vector):
    a = np.array(input_vector, ndmin=2).T
    y = np.array(target_vector, ndmin=2).T
    # forward
    A = [a]
    for k in range(3):
        a = sigmoid(np.dot(self.weights[k], a))  # zero bias here just for simplicity
        A.append(a)
    # Now A has 4 elements: the input vector + the 3 output vectors
    # back-propagation
    delta = a - y
    for k in [2, 1
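Since the excerpt cuts off inside the backward loop, here is a minimal, self-contained sketch (my own, not the asker's class) of the same forward/backward pass for a three-weight-matrix network, assuming sigmoid activations, no biases, and the quadratic cost used in the referenced chapter; note the output delta here includes σ'(a_L), whereas the excerpt's `delta = a - y` corresponds to a cross-entropy cost. Names such as `train_step` and `lr` are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime_from_output(a):
    # derivative expressed in terms of the sigmoid output a = sigmoid(z)
    return a * (1.0 - a)

def train_step(weights, x, y, lr=0.1):
    """One SGD step for a 4-layer net (3 weight matrices), no biases,
    quadratic cost C = 0.5 * ||a_L - y||^2."""
    a = np.array(x, ndmin=2).T
    y = np.array(y, ndmin=2).T

    # forward pass, keeping every activation
    A = [a]
    for W in weights:
        a = sigmoid(W @ a)
        A.append(a)

    # backward pass: output delta, then propagate it layer by layer
    delta = (A[-1] - y) * sigmoid_prime_from_output(A[-1])
    for k in [2, 1, 0]:
        grad_W = delta @ A[k].T                     # dC/dW_k = delta . a_{k-1}^T
        if k > 0:
            delta = (weights[k].T @ delta) * sigmoid_prime_from_output(A[k])
        weights[k] -= lr * grad_W                   # SGD update
    return weights

# usage sketch with arbitrary layer sizes 4 -> 5 -> 5 -> 2
rng = np.random.default_rng(0)
weights = [rng.standard_normal((5, 4)),
           rng.standard_normal((5, 5)),
           rng.standard_normal((2, 5))]
weights = train_step(weights, x=[0.1, 0.2, 0.3, 0.4], y=[0.0, 1.0])
```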

"setting an array element with a sequence" error in scikit-learn GradientBoostingClassifier

Submitted by 泄露秘密 on 2019-12-08 04:51:00
Question: Here is my code — does anyone have any idea what is wrong? The error happens when I call fit:

import pandas as pd
import numpy as np
from sklearn.ensemble import (RandomTreesEmbedding, RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer

n_estimators = 10
d = {'f1': [1, 2], 'f2': ['foo goo', 'goo zoo'], 'target': [0, 1]}
df = pd.DataFrame(data=d)
X_train, X_test, y_train, y_test = train
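The excerpt stops before the fit call, but this error typically means a raw text column (here 'f2') ends up inside the numeric feature matrix. A hedged sketch of one common fix — vectorizing the text column with a ColumnTransformer before the classifier; the column names mirror the excerpt, the rest is illustrative and not necessarily the asker's actual bug:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline

d = {'f1': [1, 2], 'f2': ['foo goo', 'goo zoo'], 'target': [0, 1]}
df = pd.DataFrame(data=d)

# Vectorize the text column so every feature fed to the booster is numeric.
preprocess = ColumnTransformer(
    transformers=[('text', CountVectorizer(), 'f2')],
    remainder='passthrough',          # keep the numeric column 'f1' as-is
)
model = Pipeline([
    ('prep', preprocess),
    ('gbm', GradientBoostingClassifier(n_estimators=10)),
])
model.fit(df[['f1', 'f2']], df['target'])
print(model.predict(df[['f1', 'f2']]))
```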

Estimating linear regression with Gradient Descent (Steepest Descent)

Submitted by 人走茶凉 on 2019-12-08 04:02:19
Question: Example data:

X <- matrix(c(rep(1, 97), runif(97)), nrow = 97, ncol = 2)
y <- matrix(runif(97), nrow = 97, ncol = 1)

I have succeeded in creating the cost function:

COST <- function(theta, X, y){
  ### Calculate half MSE
  sum((X %*% theta - y)^2) / (2 * length(y))
}

However, when I run this function, it seems to fail to converge over 100 iterations.

theta <- matrix(0, nrow = 2, ncol = 1)
num.iters <- 1500
delta = 0
GD <- function(X, y, theta, alpha, num.iters){
  for (i in num.iters){
    while (max(abs(delta)) < tolerance){
      error <
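Since the excerpt stops mid-function, here is a minimal sketch of the same steepest-descent loop for linear regression, written in NumPy rather than R (the synthetic data and names such as `cost` and `gradient_descent` are illustrative). Note that it iterates a fixed number of steps with `range(num_iters)`; the excerpt's `for (i in num.iters)` loops over the single value 1500 and so runs only one pass, which is one likely reason it appears not to converge:

```python
import numpy as np

def cost(theta, X, y):
    # half mean squared error, matching the COST function in the question
    residual = X @ theta - y
    return np.sum(residual ** 2) / (2 * len(y))

def gradient_descent(X, y, theta, alpha=0.1, num_iters=1500):
    m = len(y)
    history = []
    for _ in range(num_iters):                      # fixed number of full-batch steps
        gradient = X.T @ (X @ theta - y) / m        # gradient of the half-MSE cost
        theta = theta - alpha * gradient
        history.append(cost(theta, X, y))
    return theta, history

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(97), rng.random(97)])  # intercept column + one feature
y = rng.random((97, 1))
theta0 = np.zeros((2, 1))
theta, history = gradient_descent(X, y, theta0)
print(theta.ravel(), history[-1])
```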

How do I switch tf.train.Optimizers during training?

Submitted by 倖福魔咒の on 2019-12-07 17:40:28
Question: I want to switch from Adam to SGD after a certain number of epochs. How do I do this smoothly so that the weights/gradients are passed over to the new optimizer?

Answer 1: Just define two optimizers and switch between them:

sgd_optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
adap_optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
...
for epoch in range(100):
    for (x, y) in zip(train_X, train_Y):
        optimizer = sgd_optimizer if epoch > 50 else adap_optimizer
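A hedged, self-contained sketch of that two-optimizer pattern on a toy linear-regression problem, assuming TF1-style graph code run through the tf.compat.v1 layer of TensorFlow 2.x (the data and variable names are illustrative). Both minimize ops update the same variables, so the weights carry over across the switch; only Adam's internal moment estimates go unused afterwards:

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 2.x with the v1 compatibility layer
tf.disable_eager_execution()

train_X = np.linspace(-1.0, 1.0, 100).astype(np.float32)
train_Y = 3.0 * train_X + 0.5

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
w = tf.Variable(0.0)
b = tf.Variable(0.0)
cost = tf.reduce_mean(tf.square(w * x + b - y))

learning_rate = 0.01
sgd_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
adam_step = tf.train.AdamOptimizer(learning_rate).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(100):
        step = sgd_step if epoch > 50 else adam_step   # Adam first, SGD after epoch 50
        for xi, yi in zip(train_X, train_Y):
            sess.run(step, feed_dict={x: xi, y: yi})
    print(sess.run([w, b]))
```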

Tensorflow gradient with respect to matrix

Submitted by ♀尐吖头ヾ on 2019-12-07 13:48:05
Question: Just for context, I'm trying to implement a gradient descent algorithm with TensorFlow. I have a 2×4 matrix

X = [ x1 x2 x3 x4 ]
    [ x5 x6 x7 x8 ]

which I multiply by a 4×1 feature vector Y = (y1, y2, y3, y4)ᵀ to get the 2×1 vector Z = X·Y = (z1, z2)ᵀ. I then put Z through a softmax function and take the log. I'll refer to the output matrix as W. All this is implemented as follows (a little boilerplate added so it's runnable):

sess = tf.Session()
num_features = 4
num_actions = 2
policy_matrix = tf.get
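The asker's own code is cut off, but as a hedged sketch of the underlying question — the gradient of log(softmax(X·Y)) with respect to the matrix X — here is one way to get it with a GradientTape, assuming TensorFlow 2.x eager execution rather than the Session-based code in the excerpt:

```python
import tensorflow as tf  # assumption: TensorFlow 2.x, eager execution

num_features, num_actions = 4, 2
X = tf.Variable(tf.random.normal((num_actions, num_features)))   # the 2x4 matrix
y = tf.constant([[1.0], [2.0], [3.0], [4.0]])                     # the 4x1 feature vector

with tf.GradientTape() as tape:
    z = tf.matmul(X, y)                  # 2x1
    w = tf.nn.log_softmax(z, axis=0)     # log(softmax(z)), still 2x1
    loss = w[0, 0]                       # differentiate one scalar output, e.g. action 0

grad_X = tape.gradient(loss, X)          # same 2x4 shape as X
print(grad_X.numpy())
```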

Simple gradient descent using mxnet

Submitted by 我只是一个虾纸丫 on 2019-12-07 10:24:57
Question: I'm trying to use MXNet's gradient descent optimizers to minimize a function. The equivalent example in TensorFlow would be:

import tensorflow as tf

x = tf.Variable(2, name='x', dtype=tf.float32)
log_x = tf.log(x)
log_x_squared = tf.square(log_x)
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(log_x_squared)
init = tf.initialize_all_variables()

def optimize():
    with tf.Session() as session:
        session.run(init)
        print("starting at", "x:", session.run(x), "log(x)^2:",
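A hedged sketch of one way to do the same minimization with MXNet's NDArray autograd, assuming the MXNet 1.x imperative API (the learning rate and loop length mirror the TensorFlow snippet; everything else is illustrative):

```python
from mxnet import autograd, nd

x = nd.array([2.0])
x.attach_grad()                          # tell autograd to track gradients for x
lr = 0.5

for step in range(10):
    with autograd.record():
        loss = nd.log(x) ** 2            # the function being minimized: log(x)^2
    loss.backward()
    x[:] = x - lr * x.grad               # plain gradient-descent update, in place
    print(step, "x:", x.asscalar(), "log(x)^2:", (nd.log(x) ** 2).asscalar())
```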

`warm_start` Parameter And Its Impact On Computational Time

Submitted by 梦想的初衷 on 2019-12-07 08:23:48
Question: I have a logistic regression model with a defined set of parameters (warm_start=True). As always, I call LogisticRegression.fit(X_train, y_train) and afterwards use the model to predict new outcomes. Suppose I alter some parameters, say C=100, and call the .fit method again using the same training data. Theoretically, I think the second .fit should take less computational time than a model with warm_start=False. However, empirically this is not actually the case. Please help me
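A hedged sketch of how one might measure this on synthetic data (the solver, sizes, and helper name are illustrative). With warm_start=True the second fit merely starts from the previous coefficients, so any saving depends on how far the new optimum for C=100 lies from the old one — it is not guaranteed to be faster:

```python
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_train, y_train = make_classification(n_samples=20000, n_features=50, random_state=0)

def timed_refit(warm_start):
    clf = LogisticRegression(warm_start=warm_start, solver='lbfgs', max_iter=1000)
    clf.fit(X_train, y_train)             # first fit with the default C=1.0
    clf.set_params(C=100)                 # change the regularization strength
    t0 = time.perf_counter()
    clf.fit(X_train, y_train)             # second fit starts from old coef_ if warm_start
    return time.perf_counter() - t0

print("warm refit:", timed_refit(True))
print("cold refit:", timed_refit(False))
```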

TensorFlow's ReluGrad claims input is not finite

Submitted by 梦想与她 on 2019-12-07 04:46:49
Question: I'm trying out TensorFlow and I'm running into a strange error. I edited the deep MNIST example to use another set of images, and the algorithm converges nicely again until around iteration 8000 (91% accuracy at that point), when it crashes with the following error:

tensorflow.python.framework.errors.InvalidArgumentError: ReluGrad input is not finite

At first I thought maybe some coefficients were reaching the limit for a float, but adding L2 regularization on all weights & biases didn't
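One commonly suggested cause for this error in the old deep MNIST tutorial — not necessarily the asker's actual one — is log(0) in the hand-written cross-entropy, which produces an Inf/NaN that then flows back through ReluGrad. A hedged sketch of the usual remedies, written against the TF1-style API via tf.compat.v1 (the placeholders stand in for the tutorial's label and output tensors):

```python
import tensorflow.compat.v1 as tf  # assumption: TF1-style graph code, run under TF 2.x compat
tf.disable_eager_execution()

y_ = tf.placeholder(tf.float32, [None, 10])      # one-hot labels
logits = tf.placeholder(tf.float32, [None, 10])  # stand-in for the network's final layer
y_conv = tf.nn.softmax(logits)

# Tutorial-style loss: log(0) yields -inf, which later surfaces as
# "ReluGrad input is not finite" once a NaN/Inf propagates through the graph.
# cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))

# Clipping the softmax output away from 0 keeps the log finite:
cross_entropy = -tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y_conv, 1e-10, 1.0)))

# More robust still: compute the loss from logits directly, which is numerically stable.
stable_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits))
```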

What's different about this momentum gradient update in TensorFlow and Theano?

Submitted by 非 Y 不嫁゛ on 2019-12-07 03:55:01
Question: I'm trying to use TensorFlow for my deep learning project. Here I need to implement my gradient update following this formula (given as an image in the original post). I have also implemented this part in Theano, and it produced the expected answer. But when I try to use TensorFlow's MomentumOptimizer, the result is really bad. I don't know what is different between them.

Theano:

def gradient_updates_momentum_L2(cost, params, learning_rate, momentum, weight_cost_strength):
    # Make sure momentum is a sane value
    assert momentum < 1 and momentum
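For comparison, a small NumPy sketch of the two update rules as documented: a classic Theano-style momentum step with an L2 (weight-decay) term, and tf.train.MomentumOptimizer's accumulator form, which contains no weight-decay term. With a constant learning rate and the L2 term set to zero the two produce identical trajectories, so a missing weight-cost term is one plausible source of the mismatch; the asker's exact formula is an image and is not reproduced here:

```python
import numpy as np

def theano_style_step(w, v, grad, lr, momentum, weight_cost):
    # v <- momentum * v - lr * (grad + weight_cost * w);  w <- w + v
    v = momentum * v - lr * (grad + weight_cost * w)
    return w + v, v

def tf_momentum_style_step(w, accum, grad, lr, momentum):
    # tf.train.MomentumOptimizer: accum <- momentum * accum + grad;  w <- w - lr * accum
    accum = momentum * accum + grad
    return w - lr * accum, accum

# With weight_cost = 0 and a constant lr the two rules coincide (v == -lr * accum):
w1 = w2 = 1.0
v = accum = 0.0
for grad in [0.3, -0.1, 0.2, 0.05]:
    w1, v = theano_style_step(w1, v, grad, lr=0.1, momentum=0.9, weight_cost=0.0)
    w2, accum = tf_momentum_style_step(w2, accum, grad, lr=0.1, momentum=0.9)
    print(round(w1, 6), round(w2, 6))   # identical sequences
```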
