Question
I have implemented the following code for gradient descent using vectorization, but the cost function does not seem to be decreasing correctly. Instead, the cost function increases with each iteration.
Assume theta is an (n+1)-vector, y is an m-vector, and X is the m×(n+1) design matrix.
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);      % number of training examples
  n = length(theta);  % number of features
  J_history = zeros(num_iters, 1);
  error = ((theta' * X')' - y) * (alpha/m);
  descent = zeros(size(theta), 1);
  for iter = 1:num_iters
    for i = 1:n
      descent(i) = descent(i) + sum(error .* X(:,i));
      i = i + 1;
    end
    theta = theta - descent;
    J_history(iter) = computeCost(X, y, theta);
    disp("the value of cost function is : "), disp(J_history(iter));
    iter = iter + 1;
  end
The computeCost function is:
function J = computeCost(X, y, theta)
  m = length(y);
  J = 0;
  for i = 1:m,
    H = theta' * X(i,:)';
    E = H - y(i);
    SQE = E^2;
    J = (J + SQE);
    i = i + 1;
  end;
  J = J / (2*m);
Answer 1:
You can vectorise it even further. Note that in your version the error term is computed only once, before the loop, and descent keeps accumulating across iterations, so the update overshoots and the cost grows; the gradient has to be recomputed from the current theta on every iteration:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
    delta = (theta' * X' - y') * X;
    theta = theta - alpha/m * delta';
    J_history(iter) = computeCost(X, y, theta);
  end
end
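As a quick sanity check, here is a minimal sketch on a made-up two-point dataset (the data, learning rate, and iteration count are illustrative assumptions; it assumes computeCost is defined as in the question or as in the second answer below), showing the vectorized update driving the cost down:

% Hypothetical toy data: fit y = 1 + 2*x on two points.
X = [1 1; 1 2];          % design matrix with a bias column
y = [3; 5];
theta0 = zeros(2, 1);

[theta, J_history] = gradientDescent(X, y, theta0, 0.1, 1500);
disp(theta);             % should approach [1; 2]
disp(J_history(end));    % cost should be near 0, and J_history non-increasing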
Answer 2:
You can vectorize it better as follows:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
    theta = theta - (alpha/m) * ((X*theta - y)' * X)';
    J_history(iter) = computeCost(X, y, theta);
  end;
end;
The computeCost function can be written as:
function J = computeCost(X, y, theta)
  m = length(y);
  J = 1/(2*m) * sum((X*theta - y).^2);  % element-wise square; plain ^2 would fail on a vector
end;
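Note that the one-line update above is algebraically the same as delta in the first answer, since (X*theta - y)' equals theta'*X' - y'. A minimal sketch (with made-up numbers) checking that this vectorized cost matches the loop-based computeCost from the question:

% Hypothetical check: vectorized cost vs. the question's loop-based cost.
X = [1 1; 1 2; 1 3];   % made-up design matrix with a bias column
y = [2; 3; 5];
theta = [0.5; 1.0];

m = length(y);
J_vec = 1/(2*m) * sum((X*theta - y).^2);        % vectorized form (this answer)

J_loop = 0;                                      % loop form (as in the question)
for i = 1:m
  J_loop = J_loop + (theta' * X(i,:)' - y(i))^2;
end
J_loop = J_loop / (2*m);

disp(abs(J_vec - J_loop));   % should print (near) zero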
Source: https://stackoverflow.com/questions/26656640/octave-code-for-gradient-descent-using-vectorization-not-updating-cost-function