gradient-descent

How to determine the learning rate and the variance in a gradient descent algorithm?

会有一股神秘感。 submitted on 2020-01-22 09:45:08
Question: I started learning machine learning last week. When I wanted to write a gradient descent script to estimate the model parameters, I came across a problem: how to choose an appropriate learning rate and variance? I found that different (learning rate, variance) pairs may lead to different results, and sometimes you can't even get convergence. Also, if you change to another training data set, a well-chosen (learning rate, variance) pair probably won't work. For example (script below), when I set the learning…
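
The asker's script is truncated, but the sensitivity being described is easy to reproduce. A minimal sketch (my own toy data and step sizes, not the asker's script) fitting y ≈ w·x by least squares; a small learning rate converges slowly, and one that is too large makes the parameter blow up:

    import numpy as np

    # Hypothetical 1-D regression data: y = 2x plus noise.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    y = 2.0 * x + 0.1 * rng.standard_normal(50)

    def gradient_descent(lr, steps=100):
        """Minimize mean((w*x - y)**2) with a fixed learning rate."""
        w = 0.0
        for _ in range(steps):
            grad = 2.0 * np.mean((w * x - y) * x)  # d/dw of the mean squared error
            w -= lr * grad
        return w

    for lr in (0.01, 0.5, 5.0):
        print(lr, gradient_descent(lr))  # lr = 5.0 overshoots every step and diverges

A common practical remedy is to scale the features, start with a small rate, and increase it only while the cost keeps decreasing monotonically.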

What is the difference between Gradient Descent and Newton's Gradient Descent?

有些话、适合烂在心里 submitted on 2020-01-22 04:09:37
Question: I understand what gradient descent does: basically, it tries to move towards the local optimal solution by slowly moving down the curve. I am trying to understand the actual difference between plain gradient descent and Newton's method. From Wikipedia, I read this short line: "Newton's method uses curvature information to take a more direct route." What does this mean intuitively?

Answer 1: At a local minimum (or maximum) x, the derivative of the target function f vanishes: f'(x) = 0…
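
The answer is cut off above, but the curvature remark can be illustrated numerically. A toy sketch (my own example, not from the thread) minimizing f(x) = (x - 2)^4: gradient descent follows -f'(x) with a fixed step, while Newton's method also divides by the curvature f''(x) and so heads much more directly toward the minimizer:

    def f_prime(x):
        return 4.0 * (x - 2.0) ** 3

    def f_double_prime(x):
        return 12.0 * (x - 2.0) ** 2

    x_gd = x_newton = 0.0
    for _ in range(20):
        x_gd -= 0.01 * f_prime(x_gd)                              # plain gradient descent
        x_newton -= f_prime(x_newton) / f_double_prime(x_newton)  # Newton step

    print(x_gd, x_newton)  # after 20 steps Newton is far closer to x = 2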

Where is the code for gradient descent?

余生长醉 submitted on 2020-01-19 12:47:32
Question: Running some experiments with TensorFlow, I want to look at the implementation of some functions just to see exactly how some things are done. I started with the simple case of tf.train.GradientDescentOptimizer. I downloaded the zip of the full source code from GitHub, ran some searches over the source tree, and got to C:\tensorflow-master\tensorflow\python\training\gradient_descent.py:

    class GradientDescentOptimizer(optimizer.Optimizer):
        def _apply_dense(self, grad, var):
            return training_ops.apply…
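
As an aside (not from the question itself), searching a downloaded tree isn't necessary: Python's standard inspect module can point at the file that defines a class in the installed package. A sketch, assuming a TensorFlow 1.x install where tf.train.GradientDescentOptimizer exists:

    import inspect
    import tensorflow as tf

    # Path of the defining file, e.g. .../python/training/gradient_descent.py
    print(inspect.getsourcefile(tf.train.GradientDescentOptimizer))

    # The class source itself, including _apply_dense.
    print(inspect.getsource(tf.train.GradientDescentOptimizer))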

How to scale the gradient during batch update in Keras?

寵の児 submitted on 2020-01-15 06:42:09
Question: I am using a standard Keras model and I am training on batches (using the train_on_batch function). Now I want to take the gradient of each element in the batch and scale it (multiply each sample's gradient by a sample-specific value that I have); after each gradient has been scaled, they can be summed and used to update the existing weights. Is there any way to do this with Keras functions? And if not, is there a way for me to manipulate this using TensorFlow? (Given the model and the…
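
One observation (my own, not a posted answer): gradients are linear, so multiplying each sample's loss by its scale factor before summing yields exactly the scaled-then-summed per-sample gradients being asked for. A minimal TensorFlow 2 sketch under that assumption, with a hypothetical model and scale vector:

    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
    loss_fn = tf.keras.losses.MeanSquaredError(
        reduction=tf.keras.losses.Reduction.NONE)  # keep per-sample losses

    def scaled_batch_step(x, y, sample_scales):
        """One update where each sample's gradient is scaled, then summed."""
        with tf.GradientTape() as tape:
            per_sample_loss = loss_fn(y, model(x, training=True))   # shape (batch,)
            total = tf.reduce_sum(sample_scales * per_sample_loss)  # weighted sum
        grads = tape.gradient(total, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

For plain loss weighting, train_on_batch also accepts a sample_weight argument that achieves the same effect without custom code.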

How to correctly get the weights using Spark for a synthetic dataset?

六月ゝ 毕业季﹏ submitted on 2020-01-07 03:15:14
Question: I'm running LogisticRegressionWithSGD on Spark on a synthetic dataset. I've calculated the error with vanilla gradient descent in MATLAB and in R, which is about 5%, and I got weights similar to the ones used in the model that generated y. The dataset was generated using this example. While I am able to get a very close error rate at the end with different step-size tuning, the weights for the individual features aren't the same; in fact, they vary a lot. I tried LBFGS on Spark and it's able to predict both…
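
The data-generation script isn't reproduced here, but the comparison being described looks roughly like this pyspark.mllib sketch (training_data is assumed to be an RDD of LabeledPoint rows, built elsewhere):

    from pyspark.mllib.classification import (LogisticRegressionWithSGD,
                                              LogisticRegressionWithLBFGS)

    # training_data: RDD[LabeledPoint(label, features)], assumed built elsewhere.
    sgd_model = LogisticRegressionWithSGD.train(training_data, iterations=100, step=1.0)
    lbfgs_model = LogisticRegressionWithLBFGS.train(training_data, iterations=100)

    # Similar error rates do not imply similar weights: with correlated features
    # or a poorly tuned step size, SGD can settle on a very different vector.
    print(sgd_model.weights)
    print(lbfgs_model.weights)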

Python gradient-descent multi-regression - cost increases to infinity

主宰稳场 submitted on 2020-01-06 06:55:13
Question: I'm writing this algorithm for my final-year project. I used gradient descent to find the minimum, but instead the cost grows as high as infinity. I have checked the gradientDescent function and I believe it's correct, so the CSV I am importing and its formatting must be causing the error. The data in the CSV is in the format below; each quad before '|' is a row, the first 3 columns are the independent variables x, and the 4th column is the dependent y.

    600 20 0.5 0.63 | 600 20 1 1.5 | 800 20 0.5 0.9

    import numpy as np
    import…
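
A frequent cause of a cost that climbs to infinity with data like this (a guess, since the script is truncated): the columns live on very different scales (hundreds versus fractions), so any learning rate small enough for the large column is unstable overall. A sketch of standardizing the features before the descent loop, using the three rows quoted above:

    import numpy as np

    X = np.array([[600, 20, 0.5], [600, 20, 1.0], [800, 20, 0.5]], dtype=float)
    y = np.array([0.63, 1.5, 0.9])

    # Zero mean, unit variance per column; guard constant columns (the 20s).
    std = X.std(axis=0)
    std[std == 0] = 1.0
    X_scaled = (X - X.mean(axis=0)) / std

    theta = np.zeros(X.shape[1])
    lr = 0.1
    for _ in range(1000):
        grad = X_scaled.T @ (X_scaled @ theta - y) / len(y)  # MSE gradient
        theta -= lr * grad  # stays finite once features share a scale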

Java implementation of Octave's fminunc

此生再无相见时 submitted on 2020-01-04 18:32:11
Question: I am trying to find a Java version of Octave's fminunc (unconstrained function minimization) library. The goal is to use it for logistic regression. Currently I am using a home-brewed version of gradient descent for cost minimization, and I would like to be able to use an already existing library (in Java) to do that for me. This is related to my effort of porting the Octave code we have from the Coursera Machine Learning course to Java.

Answer 1: Ah, here are a few things you can check…

TensorFlow: Convert a constant tensor from a pre-trained VGG model to a variable

核能气质少年 submitted on 2020-01-01 16:21:28
Question: How can I convert a constant tensor loaded from a pre-trained VGG16 model into a tf.Variable tensor? The motivation is that I need to compute the gradient of a specific loss with respect to the Conv4_3 layer's kernel; however, the kernel seems to be of tf.Constant type, which is not accepted by the tf.Optimizer.compute_gradients method.

    F = vgg.graph.get_tensor_by_name('pretrained_vgg16/conv4_3/filter:0')
    G = optimizer.compute_gradients(losses, var_list=[F])
    # TypeError: …
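
One way out (my own sketch, not a confirmed answer; it reuses the question's vgg, optimizer, and losses objects): tf.Optimizer only differentiates with respect to tf.Variable objects, so create a variable whose initial value is the constant kernel and rebuild the loss on top of it:

    import tensorflow as tf

    F = vgg.graph.get_tensor_by_name('pretrained_vgg16/conv4_3/filter:0')

    with vgg.graph.as_default():
        # Initial value comes from the frozen kernel; every op that should be
        # differentiable w.r.t. the kernel must now consume F_var instead of F.
        F_var = tf.Variable(initial_value=F, name='conv4_3_filter_var')

    # After rewiring `losses` to use F_var:
    G = optimizer.compute_gradients(losses, var_list=[F_var])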