My implementation (see below) gives the scalar value 3.18, which is not the right answer. The value should be 0.693. Where does my code deviate from the equation?
Your sigmoid function is incorrect. The input is a vector, but the operations you are using perform matrix division. The division needs to be element-wise:
function g = sigmoid(z)
g = 1.0 ./ (1.0 + exp(-z));
end
By writing 1 / A, where A is an expression, you are in fact computing the inverse of A. Since inverses only exist for square matrices, for a vector this effectively computes the pseudo-inverse, which is definitely not what you want.
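To see the difference concretely, here is a small sketch using a made-up column vector A (the values are illustrative, not taken from the question). Element-wise division keeps the shape of A, while 1 / A goes through matrix right division and returns a row vector instead of one reciprocal per entry.

```octave
% Illustrative column vector (assumed values, not from the question)
A = [1; 2; 4];

elementwise = 1 ./ A;    % divides 1 by each entry: [1; 0.5; 0.25]
matrix_div  = 1 / A;     % matrix right division: solves x * A = 1

disp(size(elementwise))  % 3 1, same shape as A
disp(size(matrix_div))   % 1 3, a row vector
```

The exact entries of matrix_div can differ between MATLAB and Octave, but in both cases it is not the element-wise reciprocal that the sigmoid needs.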
You can keep most of your costFunction code the same, since you're using the dot product. I would get rid of the sum, since that is already implied by the dot product. I'll mark my changes with comments:
function [J, grad] = costFunction(theta, X, y)
m = length(y); % number of training examples
% You need to return the following variables correctly
%J = 0; %#ok<NASGU> <-- Don't need to declare this as you'll create the variables later
%grad = zeros(size(theta)); %#ok<NASGU>
hx = sigmoid(X * theta); % <-- Remove transpose
J = (-y' * log(hx) - (1 - y')*log(1 - hx)) / m; % <-- Remove sum
grad = X' * (hx - y) / m;
end
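As a quick sanity check, the sketch below evaluates the same vectorized expressions on a tiny made-up data set (X, y, and the all-zero theta are assumptions for illustration, not the asker's data). With theta at zero every prediction is 0.5, so the cost comes out to -log(0.5), about 0.693, whatever the labels are. An anonymous function is used for the sigmoid only so the snippet runs as a single script.

```octave
% Toy data: the first column of X is the intercept term (assumed values)
X = [1 2; 1 3; 1 4];
y = [0; 1; 1];
theta = zeros(2, 1);
m = length(y);                       % number of training examples

sigmoid = @(z) 1 ./ (1 + exp(-z));   % element-wise logistic function

hx   = sigmoid(X * theta);                             % all 0.5 here
J    = (-y' * log(hx) - (1 - y') * log(1 - hx)) / m;   % scalar cost
grad = X' * (hx - y) / m;                              % 2x1 gradient

fprintf('J = %.4f\n', J);            % prints J = 0.6931
```

If this prints something other than 0.6931 with a zero theta, the sigmoid is the first place to look.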
Here is the code for the sigmoid function, which is where I think you made the mistake:
function g = sigmoid(z)
g = zeros(size(z));
temp=1+exp(-1.*z);
g=1./temp;
end
function [J, grad] = costFunction(theta, X, y)
m = length(y);
J = 0;
grad = zeros(size(theta));
h=X*theta;
xtemp=sigmoid(h);
temp1=(-y'*log(xtemp));
temp2=(1-y)'*log(1-xtemp);
J=1/m*sum(temp1-temp2);
grad=1/m*(X'*(xtemp-y));
end
I also think the second term should use (1-y)', as shown in temp2=(1-y)'*log(1-xtemp).
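For a column vector y, (1 - y)' and (1 - y') produce the same row vector, so either spelling works as long as the parentheses are there. A small sketch with made-up values (y and v are placeholders, with v standing in for log(1 - xtemp)) illustrates this, and also shows that dropping the parentheses changes the result because multiplication binds tighter than subtraction.

```octave
y = [0; 1; 1];          % assumed labels
v = [0.2; 0.5; 0.9];    % placeholder for log(1 - xtemp)

a = (1 - y)' * v;       % transpose after subtracting
b = (1 - y') * v;       % transpose before subtracting
c = 1 - y' * v;         % no parentheses: computes 1 - (y' * v)

disp(a == b)            % 1
disp(a == c)            % 0
```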