gradient-descent | 易学教程

backward function in PyTorch

I have a question about PyTorch's backward function. I don't think I'm getting the right output:

    import numpy as np
    import torch
    from torch.autograd import Variable

    a = Variable(torch.FloatTensor([[1,2,3],[4,5,6]]), requires_grad=True)
    out = a * a
    out.backward(a)
    print(a.grad)

The output is

    tensor([[ 2., 8., 18.], [32., 50., 72.]])

which may be 2*a*a, but I think the output is supposed to be

    tensor([[ 2., 4., 6.], [8., 10., 12.]])

that is, 2*a, because d(x^2)/dx = 2x.

Answer 1: Please read the documentation on backward() carefully to better understand it. By default, PyTorch expects backward() to be called for the last
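The numbers in the question are consistent with how the argument to backward() is used: for a non-scalar output, the tensor passed in weights each element's derivative (a vector-Jacobian product), so out.backward(a) yields a * 2*a rather than 2*a. A minimal sketch, assuming a recent PyTorch where plain tensors replace Variable; passing a tensor of ones recovers 2*a:

```python
import torch

a = torch.tensor([[1., 2., 3.], [4., 5., 6.]], requires_grad=True)

# Case 1: weight every output element by 1, so a.grad holds d(a*a)/da = 2*a.
out = a * a
out.backward(torch.ones_like(a))
grad_with_ones = a.grad.clone()

# Gradients accumulate across backward() calls, so reset before the next run.
a.grad.zero_()

# Case 2: weight by a itself (as in the question), so a.grad holds a * 2*a.
out = a * a
out.backward(a.detach())
grad_with_a = a.grad.clone()

print(grad_with_ones)
print(grad_with_a)
```

Calling out.sum().backward() is another common way to get the unweighted 2*a result, since the sum reduces the output to a scalar.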

Write Custom Python-Based Gradient Function for an Operation? (without C++ Implementation)

I'm trying to write a custom gradient function for 'my_op', which for the sake of the example contains just a call to tf.identity() (ideally, it could be any graph).

    import tensorflow as tf
    from tensorflow.python.framework import function

    def my_op_grad(x):
        return [tf.sigmoid(x)]

    @function.Defun(a=tf.float32, python_grad_func=my_op_grad)
    def my_op(a):
        return tf.identity(a)

    a = tf.Variable(tf.constant([5., 4., 3., 2., 1.], dtype=tf.float32))

    sess = tf.Session()
    sess.run(tf.initialize_all_variables())

    grad = tf.gradients(my_op(a), [a])[0]
    result = sess.run(grad)
    print(result)

    sess.close()

pytorch - connection between loss.backward() and optimizer.step()

Where is the explicit connection between the optimizer and the loss? How does the optimizer know where to get the gradients of the loss without a call like optimizer.step(loss)?

More context: when I minimize the loss, I don't have to pass the gradients to the optimizer.

    loss.backward()   # Back Propagation
    optimizer.step()  # Gradient Descent

Without delving too deep into the internals of PyTorch, I can offer a simplistic answer: recall that when initializing the optimizer you explicitly tell it which parameters (tensors) of the model it should be updating. The gradients are "stored" by the
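A minimal sketch of that link, assuming a single hypothetical parameter w and plain SGD: the optimizer is handed w at construction, loss.backward() writes the gradient into w.grad, and optimizer.step() reads w.grad from the tensors it was given.

```python
import torch

# Hypothetical one-element parameter, used only for illustration.
w = torch.tensor([1.0], requires_grad=True)

# The optimizer keeps a reference to w; it never sees the loss itself.
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()

optimizer.zero_grad()   # clear any gradient left from a previous iteration
loss.backward()         # fills w.grad with d(loss)/dw = 2*w
grad_after_backward = w.grad.clone()

optimizer.step()        # applies w <- w - lr * w.grad in place

print(grad_after_backward)  # gradient stored on the parameter
print(w)                    # parameter after one update
```

The shared object is the parameter tensor: backward() and step() communicate only through its .grad attribute, which is why no loss is passed to step().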

scipy.optimize.fmin_l_bfgs_b returns 'ABNORMAL_TERMINATION_IN_LNSRCH'

I am using scipy.optimize.fmin_l_bfgs_b to solve a Gaussian mixture problem. The means of the mixture distributions are modeled by regressions whose weights have to be optimized using the EM algorithm.

    sigma_sp_new, func_val, info_dict = fmin_l_bfgs_b(
        func_to_minimize, self.sigma_vector[si][pj],
        args=(self.w_vectors[si][pj], Y, X, E_step_results[si][pj]),
        approx_grad=True, bounds=[(1e-8, 0.5)],
        factr=1e02, pgtol=1e-05, epsilon=1e-08)

But sometimes I get the warning 'ABNORMAL_TERMINATION_IN_LNSRCH' in the information dictionary:

    func_to_minimize value = 1.14462324063e-07
    information dictionary: {'task':
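For reference, a small sketch of where that message appears, using a toy one-dimensional objective (hypothetical, not the mixture model from the question): fmin_l_bfgs_b returns the solution, the objective value, and an information dictionary whose 'warnflag' and 'task' entries report how the run ended.

```python
from scipy.optimize import fmin_l_bfgs_b

def toy_objective(x):
    # Hypothetical stand-in objective with its minimum at x = 0.2.
    return float((x[0] - 0.2) ** 2)

x_opt, f_opt, info = fmin_l_bfgs_b(
    toy_objective,
    x0=[0.4],
    approx_grad=True,          # gradient estimated by finite differences
    bounds=[(1e-8, 0.5)],
)

# warnflag is 0 on convergence; 'task' carries the termination message.
print(x_opt, f_opt)
print(info["warnflag"], info["task"])
```

A warnflag of 2 means the run stopped for a reason given in 'task', which is where a message such as the one in the question would show up.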

Is my implementation of stochastic gradient descent correct?

I am trying to implement stochastic gradient descent, but I don't know if it is 100% correct. The cost generated by my stochastic gradient descent algorithm is sometimes very far from the one generated by fminunc or batch gradient descent. While the batch gradient descent cost converges when I set a learning rate alpha of 0.2, I am forced to set a learning rate alpha of 0.0001 for my stochastic implementation for it not to diverge. Is this normal?

Here are some results I obtained with a training set of 10,000 elements and num_iter = 100 or 500:

    FMINUC : Iteration #100 | Cost: 5.147056e-001
    BACTH

Vectorization of a gradient descent code

I am implementing batch gradient descent in MATLAB. I have a problem with the update step of theta. theta is a vector of two components (two rows). X is a matrix containing m rows (number of training samples) and n=2 columns (number of features). y is a vector of m rows. During the update step, I need to set each theta(i) to

    theta(i) = theta(i) - (alpha/m)*sum((X*theta-y).*X(:,i))

This can be done with a for loop, but I can't figure out how to vectorize it (because of the X(:,i) term). Any suggestion?

Looks like you are trying to do a simple matrix multiplication, the thing MATLAB is
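The per-component update above collapses into one matrix product, because sum((X*theta - y) .* X(:,i)) is the i-th entry of X' * (X*theta - y); in MATLAB that reads theta = theta - (alpha/m) * X' * (X*theta - y). The excerpt is about MATLAB, so the sketch below is only an illustration in NumPy with made-up data, checking that the loop and the vectorized form agree.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 2                      # made-up sizes for illustration
X = rng.normal(size=(m, n))
y = rng.normal(size=m)
theta = np.zeros(n)
alpha = 0.1

# Loop version: update each component from the same (old) theta.
residual = X @ theta - y
theta_loop = theta.copy()
for i in range(n):
    theta_loop[i] = theta[i] - (alpha / m) * np.sum(residual * X[:, i])

# Vectorized version: X.T @ residual computes all n sums at once.
theta_vec = theta - (alpha / m) * (X.T @ residual)

print(np.allclose(theta_loop, theta_vec))
```

Computing the residual once from the old theta matters in the loop version: updating theta(i) in place and then recomputing X*theta would mix old and new components.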

Implementing back propagation using numpy and python for cleveland dataset

Question: I wanted to predict heart disease using the backpropagation algorithm for neural networks. For this I used the UCI heart disease data set linked here: processed cleveland. To do this, I used the code found on the following blog: Build a flexible Neural Network with Backpropagation in Python and changed it a little bit according to my own dataset. My code is as follows:

    import numpy as np
    import csv

    reader = csv.reader(open("cleveland_data.csv"), delimiter=",")
    x = list(reader)
    result = np.array(x)
