gradient-descent

How to include a custom filter in a Keras based CNN?

Submitted by 瘦欲@ on 2019-12-22 00:46:41

Question: I am working on a fuzzy convolution filter for CNNs. I have the function ready: it takes a 2D input matrix and a 2D kernel/weight matrix, and outputs the convolved feature (the activation map). Now I want to use Keras to build the rest of the CNN, which will also have standard 2D convolution filters. Is there any way to insert my custom filter into the Keras model so that the kernel matrix is updated by the built-in machinery of the Keras backend?
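One common route (a minimal sketch, assuming tf.keras and that the fuzzy operation can be written with differentiable TensorFlow ops; the FuzzyConv2D name and the tf.nn.conv2d placeholder are illustrative, not the asker's code) is to subclass Layer and register the kernel with add_weight, so the built-in optimizer updates it during backprop:

    import tensorflow as tf

    class FuzzyConv2D(tf.keras.layers.Layer):
        def __init__(self, filters, kernel_size, **kwargs):
            super().__init__(**kwargs)
            self.filters = filters
            self.kernel_size = kernel_size

        def build(self, input_shape):
            # Trainable kernel: registered weights are updated by the optimizer.
            self.kernel = self.add_weight(
                name="kernel",
                shape=(*self.kernel_size, input_shape[-1], self.filters),
                initializer="glorot_uniform",
                trainable=True)

        def call(self, inputs):
            # Replace tf.nn.conv2d with the fuzzy convolution, expressed in
            # TF ops so gradients can flow back into self.kernel.
            return tf.nn.conv2d(inputs, self.kernel, strides=1, padding="SAME")

A layer built this way can be mixed freely with standard Conv2D layers in a Sequential or functional model.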

How to implement mini-batch gradient descent in Python?

Submitted by 自古美人都是妖i on 2019-12-21 06:18:33

Question: I have just started to learn deep learning and found myself stuck on gradient descent. I know how to implement batch gradient descent, and I understand in theory how mini-batch and stochastic gradient descent work, but I really can't understand how to implement them in code.

    import numpy as np
    X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
    y = np.array([[0,1,1,0]]).T
    alpha, hidden_dim = (0.5, 4)
    synapse_0 = 2*np.random.random((3, hidden_dim)) - 1
    synapse_1 = 2*np.random…
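The general pattern is the same regardless of the network: shuffle the data each epoch, slice it into batches, and take one step per batch. A minimal NumPy sketch (grad_fn stands in for whatever computes the gradient on a batch; it is not part of the question's code):

    import numpy as np

    def minibatch_gd(X, y, theta, grad_fn, alpha=0.01, batch_size=32, epochs=100):
        n = X.shape[0]
        for _ in range(epochs):
            perm = np.random.permutation(n)        # reshuffle every epoch
            X_shuf, y_shuf = X[perm], y[perm]
            for start in range(0, n, batch_size):
                xb = X_shuf[start:start + batch_size]
                yb = y_shuf[start:start + batch_size]
                theta = theta - alpha * grad_fn(theta, xb, yb)  # one step per batch
        return theta

With batch_size = n this reduces to batch gradient descent, and with batch_size = 1 it becomes stochastic gradient descent.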

How to calculate optimal batch size

Submitted by 不问归期 on 2019-12-17 09:35:53

Question: Sometimes I run into a problem like: OOM when allocating tensor with shape, e.g. OOM when allocating tensor with shape (1024, 100, 160), where 1024 is my batch size and I don't know what the rest is. If I reduce the batch size or the number of neurons in the model, it runs fine. Is there a generic way to calculate the optimal batch size based on the model and GPU memory, so the program doesn't crash? In short: I want the largest batch size possible in terms of my model that will fit into my GPU memory and…
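There is no exact formula, but a rough upper bound can be estimated by dividing usable GPU memory by the per-sample memory footprint. A back-of-the-envelope sketch (assuming float32 tensors and that activations dominate; per_sample_floats must be estimated from your own architecture, and the numbers below are made up):

    def max_batch_size(gpu_bytes_free, per_sample_floats, bytes_per_float=4, safety=0.8):
        # Leave headroom for weights, gradients and framework workspace.
        usable = gpu_bytes_free * safety
        return int(usable // (per_sample_floats * bytes_per_float))

    # e.g. 8 GB free and ~100*160*50 activation floats per sample
    print(max_batch_size(8 * 1024**3, 100 * 160 * 50))

In practice people often just search empirically: double the batch size until the OOM appears, then back off.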

Neural network always predicts the same class

Submitted by 拈花ヽ惹草 on 2019-12-17 06:26:08

Question: I'm trying to implement a neural network that classifies images into one of two discrete categories. The problem, however, is that it currently always predicts 0 for any input, and I'm not really sure why. Here's my feature extraction method:

    def extract(file):
        # Resize and subtract mean pixel
        img = cv2.resize(cv2.imread(file), (224, 224)).astype(np.float32)
        img[:, :, 0] -= 103.939
        img[:, :, 1] -= 116.779
        img[:, :, 2] -= 123.68
        # Normalize features
        img = (img.flatten() - np.mean(img)) / np…
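A network that collapses onto one class is often just reproducing an imbalanced label distribution (other usual suspects are a too-high learning rate or bad initialization). A quick first check, sketched here independently of the asker's code (y is assumed to be a NumPy array of labels), is whether the model's accuracy merely equals the majority-class frequency:

    import numpy as np

    def check_majority_baseline(y, model_accuracy):
        classes, counts = np.unique(y, return_counts=True)
        freqs = counts / counts.sum()
        print("class frequencies:", dict(zip(classes, freqs)))
        if abs(model_accuracy - freqs.max()) < 0.01:
            print("Accuracy matches the majority class: the model may be "
                  "predicting a single class.")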

TypeError: only length-1 arrays can be converted to Python scalars Dot Product

Submitted by 岁酱吖の on 2019-12-13 22:08:18

Question: I'm writing this algorithm for my final-year project. I've debugged a few issues, but I'm stuck on this one. I tried changing the float call, but nothing really changed.

    ----> 8 hypothesis = np.dot(float(x), theta)
    TypeError: only length-1 arrays can be converted to Python scalars

Entire code:

    import numpy as np
    import random
    import pandas as pd

    def gradientDescent(x, y, theta, alpha, m, numIterations):
        xTrans = x.transpose()
        for i in range(0, numIterations):
            hypothesis = np.dot(x, theta)
            loss = hypothesis - y
            # avg…
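The error comes from calling Python's built-in float() on a whole NumPy array: float() only accepts a single value, so any array longer than one element raises exactly this TypeError. If the goal is a float dtype, convert elementwise instead (a minimal sketch with placeholder data, independent of the full script):

    import numpy as np

    x = np.array([[1, 2], [3, 4]])
    theta = np.array([0.5, -0.5])

    # float(x) would raise: only length-1 arrays can be converted to Python scalars
    hypothesis = np.dot(x.astype(float), theta)   # convert the whole array instead
    print(hypothesis)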

Gradient descent and normal equation method for solving linear regression gives different solutions

Submitted by 人走茶凉 on 2019-12-13 11:49:49

Question: I'm working on a machine learning problem and want to use linear regression as the learning algorithm. I have implemented two different methods to find the parameters theta of the linear regression model: gradient (steepest) descent and the normal equation. On the same data they should both give approximately equal theta vectors. However, they do not. Both theta vectors are very similar in all elements but the first one, i.e. the one that multiplies the column of ones added to the data. Here is how the theta s…
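When only the intercept disagrees, the usual cause is that gradient descent has not fully converged on unscaled features, while the normal equation is exact. A minimal sketch of the comparison (assuming X already includes the leading column of ones):

    import numpy as np

    def normal_equation(X, y):
        # Closed-form least squares: theta = (X^T X)^(-1) X^T y
        return np.linalg.solve(X.T @ X, X.T @ y)

    def gradient_descent(X, y, alpha=0.01, iters=100_000):
        m, n = X.shape
        theta = np.zeros(n)
        for _ in range(iters):
            theta -= (alpha / m) * (X.T @ (X @ theta - y))
        return theta

With unscaled features the intercept direction converges slowly, so either normalize the features or increase the iteration count until the two solutions agree.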

Mutable Vector field is not updating in F#

Submitted by 删除回忆录丶 on 2019-12-13 07:37:19

Question:

    let gradientDescent (X : Matrix<double>) (y : Vector<double>) (theta : Vector<double>) alpha (num_iters : int) =
        let J_history = Vector<double>.Build.Dense(num_iters)
        let m = y.Count |> double
        theta.At(0, 0.0)
        let x = (X.Column(0).PointwiseMultiply(X*theta-y)) |> Vector.sum
        for i in 0 .. (num_iters-1) do
            let next_theta0 = theta.[0] - (alpha / m) * ((X.Column(0).PointwiseMultiply(X*theta-y)) |> Vector.sum)
            let next_theta1 = theta.[1] - (alpha / m) * ((X.Column(1).PointwiseMultiply(X*theta-y)) |…
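The snippet computes next_theta0 and next_theta1 from the old theta before writing anything back, which is the right simultaneous-update pattern, but the update only takes effect if the new values are actually assigned into theta at the end of each iteration. The same pattern, sketched in NumPy rather than F# for illustration (gd_step is a hypothetical helper, not the asker's code):

    import numpy as np

    def gd_step(X, y, theta, alpha):
        m = len(y)
        residual = X @ theta - y
        # Every component is computed from the OLD theta, then assigned at once:
        # the vectorized form makes the update simultaneous by construction.
        return theta - (alpha / m) * (X.T @ residual)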

missing value where TRUE/FALSE needed in R [duplicate]

Submitted by 怎甘沉沦 on 2019-12-12 01:52:03

Question: This question already has answers here: Error in if/while (condition) {: missing value where TRUE/FALSE needed (2 answers). Closed 4 years ago. When I run the following code without commenting out gr.ascent(MMSE, 0.5, verbose=TRUE), I receive the error Error in b1 * x : 'b1' is missing. But when I comment that line out, I receive the following error when testing MMSE with the arguments MMSE(2,1,farmland$farm,farmland$area). Do you know where my problem lies? Error in if (abs(t[i]) <= k) { :…
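In R, this error means the if() condition itself evaluated to NA, typically because t[i] contains a missing value. The general fix is to test for missingness before comparing; a NumPy analog of the same guard (t and k are placeholder names, not the asker's data):

    import numpy as np

    t = np.array([0.5, np.nan, 1.2])
    k = 1.0

    for i in range(len(t)):
        if np.isnan(t[i]):        # R's if() would fail here with NA
            continue              # skip, impute, or otherwise handle the gap
        if abs(t[i]) <= k:
            print(i, "within threshold")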

Multi-Variable Gradient Descent using Numpy - Error in no. of coefficients

Submitted by China☆狼群 on 2019-12-11 14:09:54

Question: For the past few days I have been trying to code this application of gradient descent for my final-year project in Mechanical Engineering: https://drive.google.com/open?id=1tIGqZ2Lb0sN4GEpgYEZLFvtmhigXnot0 (the HTML file is linked above; just download it to see the results). There are only 3 values in theta, whereas x has 3 independent variables, so theta should have 4 values. The resulting theta is [-0.03312393 0.94409351 0.99853041]. The code is as follows:

    import…
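With three independent variables plus an intercept, theta needs four entries, which usually means the bias column of ones was never prepended to x before running gradient descent. A minimal sketch of that step (x here is placeholder data of the assumed shape (m, 3)):

    import numpy as np

    x = np.random.rand(10, 3)                      # stand-in for the real data
    X = np.hstack([np.ones((x.shape[0], 1)), x])   # prepend bias column -> (m, 4)
    theta = np.zeros(X.shape[1])                   # 4 coefficients incl. intercept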

Pytorch - Getting gradient for intermediate variables / tensors

Submitted by 南楼画角 on 2019-12-11 11:34:45

Question: As an exercise in the PyTorch framework (0.4.1), I am trying to display the gradient of X (gX or dSdX) in a simple linear layer (Z = X.W + B). To simplify my toy example, I call backward() on a sum of Z (not a loss). To sum up, I want gX (dSdX) of S = sum(XW + B). The problem is that the gradient of Z (dSdZ) is None, so gX is wrong too, of course.

    import torch
    X = torch.tensor([[0.5, 0.3, 2.1], [0.2, 0.1, 1.1]], requires_grad=True)
    W = torch.tensor([[2.1, 1.5], [-1.4, 0.5], [0.2, 1.1]])
    B =…
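By default autograd only populates .grad on leaf tensors, so an intermediate result like Z reports None even though gradients flowed through it. Calling retain_grad() on the intermediate keeps its gradient; a minimal sketch continuing the question's setup (the value of B is a placeholder, since the excerpt cuts off before defining it):

    import torch

    X = torch.tensor([[0.5, 0.3, 2.1], [0.2, 0.1, 1.1]], requires_grad=True)
    W = torch.tensor([[2.1, 1.5], [-1.4, 0.5], [0.2, 1.1]])
    B = torch.tensor([1.1, -0.3])   # placeholder bias, not from the question

    Z = X @ W + B                   # intermediate (non-leaf) tensor
    Z.retain_grad()                 # ask autograd to keep dS/dZ
    S = Z.sum()
    S.backward()

    print(Z.grad)                   # dS/dZ: a matrix of ones
    print(X.grad)                   # dS/dX = dS/dZ @ W.T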