gradient-descent

backward function in PyTorch

浪子不回头ぞ submitted on 2019-12-01 08:10:04
I have a question about PyTorch's backward function. I don't think I'm getting the right output:
    import numpy as np
    import torch
    from torch.autograd import Variable
    a = Variable(torch.FloatTensor([[1,2,3],[4,5,6]]), requires_grad=True)
    out = a * a
    out.backward(a)
    print(a.grad)
The output is tensor([[ 2., 8., 18.], [32., 50., 72.]]), which may be 2*a*a. But I think the output is supposed to be tensor([[ 2., 4., 6.], [8., 10., 12.]]), that is, 2*a, because d(x^2)/dx = 2x.
Please read carefully the documentation on backward() to better understand it. By default, pytorch expects backward() to be called for the last
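A library-free sketch of why the printed gradient comes out as 2*a*a: when backward() is given an argument on a non-scalar output, it computes a vector-Jacobian product, so the tensor passed in acts as a per-element weight on d(out)/d(a). The helper name `square_backward` is hypothetical and only mimics the arithmetic for out = a * a; it is not part of PyTorch.

```python
# Toy model of backward(v) for the elementwise op out = a * a.
# For an elementwise op the Jacobian is diagonal, so the
# vector-Jacobian product reduces to v[i][j] * d(out)/d(a)[i][j].

def square_backward(a, v):
    """Return v * 2a elementwise (hypothetical helper, not a PyTorch call)."""
    return [[v_ij * 2.0 * a_ij for a_ij, v_ij in zip(a_row, v_row)]
            for a_row, v_row in zip(a, v)]

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
ones = [[1.0] * 3 for _ in range(2)]

grad_weighted_by_a = square_backward(a, a)   # mirrors out.backward(a)
grad_plain = square_backward(a, ones)        # mirrors weighting every element by 1

print(grad_weighted_by_a)  # [[2.0, 8.0, 18.0], [32.0, 50.0, 72.0]]
print(grad_plain)          # [[2.0, 4.0, 6.0], [8.0, 10.0, 12.0]]
```

In PyTorch itself, the 2*a result corresponds to passing a tensor of ones, for example out.backward(torch.ones_like(a)), or to reducing to a scalar first with out.sum().backward().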

backward function in PyTorch

十年热恋 submitted on 2019-12-01 06:09:41
Question: I have a question about PyTorch's backward function. I don't think I'm getting the right output:
    import numpy as np
    import torch
    from torch.autograd import Variable
    a = Variable(torch.FloatTensor([[1,2,3],[4,5,6]]), requires_grad=True)
    out = a * a
    out.backward(a)
    print(a.grad)
The output is tensor([[ 2., 8., 18.], [32., 50., 72.]]), which may be 2*a*a. But I think the output is supposed to be tensor([[ 2., 4., 6.], [8., 10., 12.]]), that is, 2*a, because d(x^2)/dx = 2x.
Answer 1: Please read carefully the documentation

Write Custom Python-Based Gradient Function for an Operation? (without C++ Implementation)

时光总嘲笑我的痴心妄想 submitted on 2019-11-30 14:57:28
I'm trying to write a custom gradient function for 'my_op', which for the sake of the example contains just a call to tf.identity() (ideally, it could be any graph).
    import tensorflow as tf
    from tensorflow.python.framework import function
    def my_op_grad(x):
        return [tf.sigmoid(x)]
    @function.Defun(a=tf.float32, python_grad_func=my_op_grad)
    def my_op(a):
        return tf.identity(a)
    a = tf.Variable(tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32))
    sess = tf.Session()
    sess.run(tf.initialize_all_variables())
    grad = tf.gradients(my_op(a), [a])[0]
    result = sess.run(grad)
    print(result)
    sess.close()

pytorch - connection between loss.backward() and optimizer.step()

梦想的初衷 submitted on 2019-11-30 12:41:52
Where is the explicit connection between the optimizer and the loss? How does the optimizer know where to get the gradients of the loss without a call like optimizer.step(loss)?
More context: when I minimize the loss, I don't have to pass the gradients to the optimizer.
    loss.backward() # Back Propagation
    optimizer.step() # Gradient Descent
Without delving too deep into the internals of pytorch, I can offer a simplistic answer: Recall that when initializing optimizer you explicitly tell it what parameters (tensors) of the model it should be updating. The gradients are "stored" by the
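The idea in that answer (the optimizer is told which parameters to update, and gradients are stored on those parameters) can be sketched with a toy model. Everything here is hypothetical and written for illustration: `Param`, `ToySGD`, and `loss_backward` are stand-ins, not PyTorch classes or functions, and the loss is assumed to be (w - target)^2.

```python
# Toy model: the optimizer keeps references to the parameter objects,
# and the backward pass writes gradients onto those same objects.

class Param:
    """Hypothetical stand-in for a tensor with requires_grad=True."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0

class ToySGD:
    """Hypothetical stand-in for an optimizer; not torch.optim.SGD."""
    def __init__(self, params, lr):
        self.params = list(params)  # references to the objects, not copies
        self.lr = lr

    def zero_grad(self):
        for p in self.params:
            p.grad = 0.0

    def step(self):
        # Reads the gradients that the backward pass left on each parameter.
        for p in self.params:
            p.value -= self.lr * p.grad

def loss_backward(w, target):
    """For loss = (w - target)^2, accumulate d(loss)/dw into w.grad."""
    w.grad += 2.0 * (w.value - target)

w = Param(0.0)
optimizer = ToySGD([w], lr=0.1)

for _ in range(50):
    optimizer.zero_grad()
    loss_backward(w, target=3.0)  # plays the role of loss.backward()
    optimizer.step()              # never receives the loss itself

print(w.value)  # approaches the target 3.0
```

The only link between the two calls is the shared parameter object: the backward pass fills in `w.grad`, and `step()` reads it.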

Vectorization of a gradient descent code

北城以北 submitted on 2019-11-30 11:33:39
Question: I am implementing batch gradient descent in Matlab. I have a problem with the update step of theta. theta is a vector of two components (two rows). X is a matrix containing m rows (number of training samples) and n=2 columns (number of features). y is a vector of m rows. During the update step, I need to set each theta(i) to
    theta(i) = theta(i) - (alpha/m)*sum((X*theta-y).*X(:,i))
This can be done with a for loop, but I can't figure out how to vectorize it (because of the X(:,i) term). Any

scipy.optimize.fmin_l_bfgs_b returns 'ABNORMAL_TERMINATION_IN_LNSRCH'

删除回忆录丶 submitted on 2019-11-30 05:08:17
I am using scipy.optimize.fmin_l_bfgs_b to solve a Gaussian mixture problem. The means of the mixture distributions are modeled by regressions whose weights have to be optimized using the EM algorithm.
    sigma_sp_new, func_val, info_dict = fmin_l_bfgs_b(func_to_minimize, self.sigma_vector[si][pj],
        args=(self.w_vectors[si][pj], Y, X, E_step_results[si][pj]),
        approx_grad=True, bounds=[(1e-8, 0.5)], factr=1e02, pgtol=1e-05, epsilon=1e-08)
But sometimes I get the warning 'ABNORMAL_TERMINATION_IN_LNSRCH' in the information dictionary:
    func_to_minimize value = 1.14462324063e-07
    information dictionary: {'task':

Is my implementation of stochastic gradient descent correct?

笑着哭i submitted on 2019-11-30 04:13:19
I am trying to implement stochastic gradient descent, but I don't know if it is 100% correct. The cost generated by my stochastic gradient descent algorithm is sometimes very far from the one generated by fminunc or batch gradient descent. While the batch gradient descent cost converges when I set a learning rate alpha of 0.2, I am forced to set a learning rate alpha of 0.0001 for my stochastic implementation so that it does not diverge. Is this normal? Here are some results I obtained with a training set of 10,000 elements and num_iter = 100 or 500:
    FMINUNC : Iteration #100 | Cost: 5.147056e-001
    BATCH
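For reference, a minimal sketch of how the two update rules differ, on toy data. It does not diagnose the asker's implementation. The data set (y = 1 + 2x, noise-free), the function names, and the learning rate are all assumptions chosen for illustration. Batch gradient descent averages the gradient over all samples before each step, while stochastic gradient descent steps after every single sample.

```python
import random

# Toy data: y = 1 + 2x exactly (assumed for illustration only).
xs = [i / 10 for i in range(10)]
data = [([1.0, x], 1.0 + 2.0 * x) for x in xs]

def predict(theta, features):
    return sum(t * f for t, f in zip(theta, features))

def cost(theta):
    """Mean squared error divided by 2, over the whole toy data set."""
    return sum((predict(theta, f) - y) ** 2 for f, y in data) / (2 * len(data))

def batch_gd(alpha, num_iter):
    theta = [0.0, 0.0]
    m = len(data)
    for _ in range(num_iter):
        # Average the gradient over the whole training set, then step once.
        grad = [sum((predict(theta, f) - y) * f[j] for f, y in data) / m
                for j in range(2)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

def sgd(alpha, num_epochs, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    theta = [0.0, 0.0]
    samples = list(data)
    for _ in range(num_epochs):
        rng.shuffle(samples)
        for f, y in samples:
            # Step after every single example, using only that example's error.
            err = predict(theta, f) - y
            theta = [t - alpha * err * f_j for t, f_j in zip(theta, f)]
    return theta

initial_cost = cost([0.0, 0.0])
theta_batch = batch_gd(alpha=0.2, num_iter=500)
theta_sgd = sgd(alpha=0.2, num_epochs=500)
print(initial_cost, cost(theta_batch), cost(theta_sgd))
```

Note that one SGD epoch here performs as many parameter updates as there are samples, each from an un-averaged single-sample gradient, so a learning rate that suits one method is not automatically comparable with the other.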

Vectorization of a gradient descent code

我的梦境 submitted on 2019-11-30 00:32:14
I am implementing batch gradient descent in Matlab. I have a problem with the update step of theta. theta is a vector of two components (two rows). X is a matrix containing m rows (number of training samples) and n=2 columns (number of features). y is a vector of m rows. During the update step, I need to set each theta(i) to
    theta(i) = theta(i) - (alpha/m)*sum((X*theta-y).*X(:,i))
This can be done with a for loop, but I can't figure out how to vectorize it (because of the X(:,i) term). Any suggestion?
Looks like you are trying to do a simple matrix multiplication, the thing MATLAB is
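To make the matrix-multiplication remark concrete: summing (X*theta - y) .* X(:,i) over the samples for each i is the i-th entry of X^T (X*theta - y), so the whole update can be written as theta - (alpha/m) * X^T (X*theta - y). A small sketch in plain Python checks that both forms agree. The function names and the sample numbers are made up for illustration.

```python
# Compare the per-component update from the question with the
# matrix form theta - (alpha/m) * X^T (X theta - y), using plain lists.

def residuals(X, y, theta):
    """Return X*theta - y as a list with one entry per training sample."""
    return [sum(x_j * t_j for x_j, t_j in zip(row, theta)) - y_i
            for row, y_i in zip(X, y)]

def loop_update(X, y, theta, alpha):
    m = len(X)
    r = residuals(X, y, theta)
    # One component at a time, as in theta(i) = theta(i) - (alpha/m)*sum(...).
    return [theta[i] - (alpha / m) * sum(r_k * row[i] for r_k, row in zip(r, X))
            for i in range(len(theta))]

def matrix_update(X, y, theta, alpha):
    m = len(X)
    r = residuals(X, y, theta)
    columns = list(zip(*X))  # transpose: one tuple per feature column
    gradient = [sum(c_k * r_k for c_k, r_k in zip(col, r)) for col in columns]  # X^T * r
    return [t - (alpha / m) * g for t, g in zip(theta, gradient)]

# Made-up sample data: m = 3 samples, n = 2 features.
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]
y = [1.0, 2.0, 4.0]
theta = [0.0, 0.0]

theta_loop = loop_update(X, y, theta, alpha=0.1)
theta_matrix = matrix_update(X, y, theta, alpha=0.1)
print(theta_loop, theta_matrix)
```

In MATLAB notation the matrix form is the single line theta = theta - (alpha/m) * X' * (X*theta - y).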

Implementing back propagation using numpy and python for cleveland dataset

霸气de小男生 submitted on 2019-11-29 22:44:13
Question: I want to predict heart disease using the backpropagation algorithm for neural networks. For this I used the UCI heart disease data set (processed cleveland). I took the code from the blog post "Build a flexible Neural Network with Backpropagation in Python" and changed it a little to fit my own dataset. My code is as follows:
    import numpy as np
    import csv
    reader = csv.reader(open("cleveland_data.csv"), delimiter=",")
    x = list(reader)
    result = np.array(x)

Write Custom Python-Based Gradient Function for an Operation? (without C++ Implementation)

点点圈 submitted on 2019-11-29 21:12:23
Question: I'm trying to write a custom gradient function for 'my_op', which for the sake of the example contains just a call to tf.identity() (ideally, it could be any graph).
    import tensorflow as tf
    from tensorflow.python.framework import function
    def my_op_grad(x):
        return [tf.sigmoid(x)]
    @function.Defun(a=tf.float32, python_grad_func=my_op_grad)
    def my_op(a):
        return tf.identity(a)
    a = tf.Variable(tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32))
    sess = tf.Session()
    sess.run(tf.initialize_all