gradient-descent

How can I have multiple losses in a network in Caffe?

前提是你 posted on 2019-12-06 09:40:41
If I define multiple loss layers in a network, will backpropagation run from each of those loss layers back to the beginning of the network? Do they even work that way? Suppose I have something like this:

    Layer1{ }
    Layer2{ }
    ...
    Layer_n{ }
    Layer_cls1{ bottom:layer_n top:cls1 }
    Layer_cls_loss1{ type:some_loss bottom:cls1 top:loss1 }
    Layer_n1{ bottom:layer_n .. }
    Layer_n2{ }
    ...
    layer_n3{ }
    Layer_cls2{ bottom:layer_n3 top:cls2 }
    Layer_cls_loss2{ type:some_loss bottom:cls2 top:loss2 }
    layer_n4{ bottom:layer_n3 .. }
    ...
    layer_cls3End{ top:cls_end bottom:... }
    loss{ bottom:cls_end top
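As a rough illustration of how several loss layers interact: Caffe's overall objective is the weighted sum of all loss-layer outputs (each scaled by its `loss_weight`), so every loss backpropagates, and the gradients add up wherever branches share a layer. The toy below is plain Python, not Caffe. The two heads, their constants, and the function names are invented for this sketch.

```python
# Toy scalar "network": one shared weight w feeds two loss heads.
# Hypothetical stand-in for a shared trunk with two loss layers; not Caffe code.

def total_loss(w, x, t1, t2, lw1=1.0, lw2=1.0):
    h = w * x                         # shared layer output
    loss1 = (2.0 * h - t1) ** 2       # head 1: arbitrary linear head + squared error
    loss2 = (-3.0 * h - t2) ** 2      # head 2: a different arbitrary head
    return lw1 * loss1 + lw2 * loss2  # weighted sum, playing the role of loss_weight

def grad_w(w, x, t1, t2, lw1=1.0, lw2=1.0):
    h = w * x
    g1 = lw1 * 2.0 * (2.0 * h - t1) * 2.0      # d(lw1 * loss1) / dh
    g2 = lw2 * 2.0 * (-3.0 * h - t2) * (-3.0)  # d(lw2 * loss2) / dh
    return (g1 + g2) * x                       # both heads' gradients sum at h

g = grad_w(0.5, 2.0, 1.0, -1.0)
```

Setting `lw2=0.0` in this sketch switches off the second head's contribution to the shared weight, which mirrors what a zero loss weight would do.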

AdamOptimizer and GradientDescentOptimizer from tensorflow not able to fit simple data

為{幸葍}努か posted on 2019-12-06 05:26:26
Question: Similar question: here. I am trying out TensorFlow. I generated simple, linearly separable data and tried to fit a linear equation to it. Here is the code:

    import numpy as np
    import tensorflow as tf

    np.random.seed(2010)
    n = 300
    x_data = np.random.random([n, 2]).tolist()
    y_data = [[1., 0.] if v[0] > 0.5 else [0., 1.] for v in x_data]
    x = tf.placeholder(tf.float32, [None, 2])
    W = tf.Variable(tf.zeros([2, 2]))
    b = tf.Variable(tf.zeros([2]))
    y = tf.sigmoid(tf.matmul(x, W) + b)
    y_ = tf.placeholder(tf.float32, [None, 2])
    cross

Multi variable gradient descent in matlab

孤街醉人 posted on 2019-12-06 03:57:54
Question: I'm doing gradient descent in MATLAB for multiple variables, and the code does not produce the thetas I expected. The normal equation, which I have implemented, gives:

    theta =
       1.0e+05 *
        3.4041
        1.1063
       -0.0665

With gradient descent (GDM), the results I get are:

    theta =
       1.0e+05 *
        2.6618
       -2.6718
       -0.5954

I don't understand why this happens. Maybe someone can help and tell me where the mistake in the code is. Code:

    function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num
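For reference, here is a minimal sketch of the standard batch gradient descent update for linear regression, written in plain Python rather than MATLAB. It is not the asker's function (which is cut off above), and the tiny dataset is made up. It only shows the update that a correct implementation should reproduce, with all parameters updated simultaneously.

```python
def gradient_descent_multi(X, y, theta, alpha, num_iters):
    """Batch gradient descent for linear regression (illustrative sketch)."""
    m = len(y)
    theta = list(theta)
    for _ in range(num_iters):
        # residuals h(x) - y for every training example
        errors = [sum(t * xj for t, xj in zip(theta, row)) - yi
                  for row, yi in zip(X, y)]
        # gradient of (1 / 2m) * sum(errors^2) with respect to each theta_j
        grad = [sum(e * row[j] for e, row in zip(errors, X)) / m
                for j in range(len(theta))]
        # simultaneous update of all parameters
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

# Made-up data generated from y = 1 + 2 * x; the first column is the bias term.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = gradient_descent_multi(X, y, [0.0, 0.0], alpha=0.1, num_iters=2000)
```

On this toy problem the result agrees with the exact least-squares solution. When gradient descent and the normal equation disagree, it is worth checking that the update is simultaneous, that enough iterations were run, and that both were fitted on identically scaled features.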

Implementing gradient descent in TensorFlow instead of using the one provided with it

≯℡__Kan透↙ posted on 2019-12-06 03:57:11
Question: I want to use gradient descent with momentum (keeping track of previous gradients) while building a classifier in TensorFlow. So I don't want to use tensorflow.train.GradientDescentOptimizer; instead I want to use tensorflow.gradients to calculate the gradients, keep track of previous gradients, and update the weights based on all of them. How do I do this in TensorFlow?

Answer 1: TensorFlow has an implementation of gradient descent with momentum. To answer your general question about implementing your own
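To make "keep track of previous gradients" concrete, here is a framework-free sketch of the classic momentum update. The toy objective and all names are assumptions for illustration. In TensorFlow, the gradient would come from tensorflow.gradients rather than from a hand-written function.

```python
def grad(w):
    # gradient of the toy objective f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

def momentum_descent(w, learning_rate=0.1, momentum=0.9, steps=500):
    velocity = 0.0  # decaying sum of all previous gradient steps
    for _ in range(steps):
        # blend the old direction with the newest gradient
        velocity = momentum * velocity - learning_rate * grad(w)
        # move along the accumulated direction
        w = w + velocity
    return w

w_final = momentum_descent(0.0)
```

With `momentum=0.0` the loop reduces to plain gradient descent, which is a quick sanity check for a hand-rolled version.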

Tensorflow gradient with respect to matrix

房东的猫 posted on 2019-12-06 03:07:10
Just for context, I'm trying to implement a gradient descent algorithm with TensorFlow. I have a matrix X

    [ x1 x2 x3 x4 ]
    [ x5 x6 x7 x8 ]

which I multiply by some feature vector Y to get Z

            [ y1 ]
    Z = X   [ y2 ]  =  [ z1 ]
            [ y3 ]     [ z2 ]
            [ y4 ]

I then put Z through a softmax function and take the log. I'll refer to the output matrix as W. All this is implemented as follows (a little boilerplate added so it's runnable):

    sess = tf.Session()
    num_features = 4
    num_actions = 2
    policy_matrix = tf.get_variable("params", (num_actions, num_features))
    state_ph = tf.placeholder("float", (num_features, 1))
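For checking results by hand, the gradient of one entry of W = log(softmax(X Y)) with respect to the matrix X has a closed form: dW_k / dX_ij = (delta_ik - p_i) * y_j, where p = softmax(X Y). The sketch below evaluates that formula in plain Python. The numeric values and function names are made up, and it is a reference calculation, not TensorFlow code.

```python
import math

def matvec(X, y):
    # Z = X y for a list-of-rows matrix and a flat vector
    return [sum(a * b for a, b in zip(row, y)) for row in X]

def log_softmax(z):
    # subtract the max before exponentiating for numerical stability
    m = max(z)
    lse = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - lse for v in z]

def grad_log_softmax_entry(X, y, k):
    """Gradient of log_softmax(X y)[k] with respect to every entry of X."""
    p = [math.exp(v) for v in log_softmax(matvec(X, y))]
    # row i of the gradient is (delta_ik - p_i) * y
    return [[((1.0 if i == k else 0.0) - p[i]) * yj for yj in y]
            for i in range(len(X))]

X = [[0.1, -0.2, 0.3, 0.0], [0.5, 0.4, -0.1, 0.2]]  # 2 actions x 4 features
y = [1.0, 2.0, -1.0, 0.5]                           # feature vector
G = grad_log_softmax_entry(X, y, k=0)               # same shape as X
```

The gradient has the same shape as X, one matrix per output entry k, which is useful to keep in mind when comparing against what an automatic differentiation call returns.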

How do I switch tf.train.Optimizers during training?

喜你入骨 posted on 2019-12-05 19:10:24
I want to switch from Adam to SGD after a certain number of epochs. How do I do this smoothly so that the weights/gradients are passed over to the new optimizer?

Just define two optimizers and switch between them:

    sgd_optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
    adap_optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
    ...
    for epoch in range(100):
        for (x, y) in zip(train_X, train_Y):
            optimizer = sgd_optimizer if epoch > 50 else adap_optimizer
            sess.run(optimizer, feed_dict={X: x, Y: y})

An optimizer only encapsulates the way to apply the gradients to
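The reason the switch is harmless can be seen in a framework-free sketch: the weights live outside the optimizers, and each optimizer only decides how a gradient becomes a step. The classes below are simplified stand-ins written for this illustration, not TensorFlow's implementations, and the toy cost is an assumption.

```python
import math

def grad(w):
    return 2.0 * (w - 3.0)  # gradient of the toy cost (w - 3)^2

class SGD:
    def __init__(self, lr):
        self.lr = lr

    def step(self, w, g):
        return w - self.lr * g

class Adam:
    def __init__(self, lr, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m = 0.0  # first-moment estimate: optimizer state, not a weight
        self.v = 0.0  # second-moment estimate
        self.t = 0

    def step(self, w, g):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * g
        self.v = self.b2 * self.v + (1 - self.b2) * g * g
        m_hat = self.m / (1 - self.b1 ** self.t)  # bias correction
        v_hat = self.v / (1 - self.b2 ** self.t)
        return w - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)

w = 0.0                      # the shared weight both optimizers act on
adam, sgd = Adam(0.1), SGD(0.1)
for epoch in range(100):
    optimizer = sgd if epoch > 50 else adam  # same switch rule as the snippet above
    w = optimizer.step(w, grad(w))
```

Only Adam's moment estimates are left behind at the switch; the weight itself carries over unchanged.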

Simple gradient descent using mxnet

流过昼夜 posted on 2019-12-05 15:59:44
I'm trying to use MXNet's gradient descent optimizers to minimize a function. The equivalent example in TensorFlow would be:

    import tensorflow as tf

    x = tf.Variable(2, name='x', dtype=tf.float32)
    log_x = tf.log(x)
    log_x_squared = tf.square(log_x)

    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(log_x_squared)

    init = tf.initialize_all_variables()

    def optimize():
        with tf.Session() as session:
            session.run(init)
            print("starting at", "x:", session.run(x), "log(x)^2:", session.run(log_x_squared))
            for step in range(10):
                session.run(train)
                print("step", step, "x:", session
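As a framework-independent reference for what either library should compute, the same minimization can be written out by hand. This sketch uses only the standard library, with the derivative worked out manually. It reuses the starting point and step size from the TensorFlow snippet and does not attempt to show MXNet's API.

```python
import math

def log_x_squared(x):
    return math.log(x) ** 2

def grad_log_x_squared(x):
    # d/dx log(x)^2 = 2 * log(x) / x
    return 2.0 * math.log(x) / x

x = 2.0              # same starting point as the TensorFlow snippet
learning_rate = 0.5  # same step size
for step in range(10):
    x -= learning_rate * grad_log_x_squared(x)
    print("step", step, "x:", x, "log(x)^2:", log_x_squared(x))
```

The iterates approach x = 1, where log(x)^2 reaches its minimum of zero, so a port to another framework can be compared against these printed values.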

How to accumulate gradients in tensorflow?

好久不见. posted on 2019-12-05 15:53:19
Question: I have a question similar to this one. Because I have limited resources and work with a deep model (VGG-16), used to train a triplet network, I want to accumulate gradients for 128 batches of one training example each, and then propagate the error and update the weights. It's not clear to me how to do this. I work with TensorFlow, but any implementation/pseudocode is welcome.

Answer 1: Let's walk through the code proposed in one of the answers you linked to:

    ## Optimizer definition - nothing
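Since pseudocode is welcome, the idea can be sketched without any framework: compute the gradient one example at a time, add it to a running buffer, and apply a single update after the last example. The scalar model, data, and names below are assumptions for illustration only. In TensorFlow the buffer would typically be a non-trainable variable per weight.

```python
def example_grad(w, x, y):
    # gradient of the per-example loss (w * x - y)^2 with respect to w
    return 2.0 * (w * x - y) * x

def accumulated_step(w, batch, learning_rate):
    accum = 0.0
    for x, y in batch:                   # one training example at a time
        accum += example_grad(w, x, y)   # accumulate; do not update yet
    mean_grad = accum / len(batch)       # average over the accumulated examples
    return w - learning_rate * mean_grad # single weight update

# Made-up data; in the question this list would hold 128 single-example batches.
batch = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w_new = accumulated_step(0.0, batch, learning_rate=0.01)
```

Because the weights stay fixed while accumulating, the averaged gradient equals the gradient of the full batch, so the update matches what one large batch would have produced.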

What's different about momentum gradient update in Tensorflow and Theano like this?

三世轮回 posted on 2019-12-05 08:10:46
I'm trying to use TensorFlow for my deep learning project. I need to implement my gradient update using this formula: [formula image missing from the excerpt]. I have also implemented this part in Theano, and it produced the expected answer. But when I try to use TensorFlow's MomentumOptimizer, the result is really bad. I don't know what differs between them. Theano:

    def gradient_updates_momentum_L2(cost, params, learning_rate, momentum, weight_cost_strength):
        # Make sure momentum is a sane value
        assert momentum < 1 and momentum >= 0
        # List of update steps for each parameter
        updates = []
        # Just gradient descent on cost
        for param
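One way two "momentum" implementations can disagree is the scaling of the gradient inside the running buffer. TensorFlow's MomentumOptimizer documents its update as `accum = momentum * accum + grad` followed by `var -= learning_rate * accum`. The sketch below compares that with a hypothetical averaged variant that multiplies the gradient by `(1 - momentum)`. Whether the Theano function above uses such a variant is an assumption, since its body is cut off, and the L2 term is left out entirely.

```python
def grad(w):
    return 2.0 * (w - 3.0)  # gradient of the toy objective (w - 3)^2

def accumulator_momentum(w, lr, momentum, steps):
    # Form documented for TensorFlow's MomentumOptimizer.
    accum = 0.0
    for _ in range(steps):
        accum = momentum * accum + grad(w)
        w -= lr * accum
    return w

def averaged_momentum(w, lr, momentum, steps):
    # Hypothetical alternative: exponential moving average of the gradients.
    avg = 0.0
    for _ in range(steps):
        avg = momentum * avg + (1.0 - momentum) * grad(w)
        w -= lr * avg
    return w

a = averaged_momentum(0.0, lr=0.1, momentum=0.9, steps=20)
b = accumulator_momentum(0.0, lr=0.1, momentum=0.9, steps=20)              # same lr: differs
c = accumulator_momentum(0.0, lr=0.1 * (1 - 0.9), momentum=0.9, steps=20)  # rescaled lr: matches
```

In this sketch the two forms trace the same trajectory only when the accumulator form's learning rate is scaled by `(1 - momentum)`. With momentum 0.9 that is a tenfold difference in effective step size, enough to turn a well-behaved run into a poor one.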

TensorFlow's ReluGrad claims input is not finite

筅森魡賤 posted on 2019-12-05 07:42:26
I'm trying out TensorFlow and I'm running into a strange error. I edited the deep MNIST example to use another set of images, and the algorithm again converges nicely until around iteration 8000 (accuracy 91% at that point), when it crashes with the following error:

    tensorflow.python.framework.errors.InvalidArgumentError: ReluGrad input is not finite

At first I thought some coefficients might be reaching the limit for a float, but adding L2 regularization on all weights and biases didn't resolve the issue. It's always the first relu application that comes out of the stack trace:

    h_conv1 = tf.nn