How to write cost function formula from Andrew Ng assignment in Octave?

鱼传尺愫 2021-01-14 16:14

My implementation (see below) gives the scalar value 3.18, which is not the right answer. The value should be 0.693. Where does my code deviate from the equation?

Her

2 Answers
  • 2021-01-14 16:34

    Your sigmoid function is incorrect. Its input is a vector, but the `/` operator you are using performs matrix division. The division needs to be element-wise (`./`).

    function g = sigmoid(z)
        g = 1.0 ./ (1.0 + exp(-z));
    end
    

    By writing 1 / A, where A is an expression, you are in fact computing the inverse of A. Since inverses only exist for square matrices, for a vector this computes the pseudo-inverse, which is definitely not what you want.
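
    To make the difference concrete, here is a small sketch in Octave. The vector `z` is made up for illustration and is not from the question; the point is only that `/` and `./` return results of different shapes.

    ```octave
    z = [-1; 0; 2];               % hypothetical 3x1 column of inputs
    A = 1 + exp(-z);              % 3x1 column, the sigmoid's denominator

    wrong = 1 / A;                % matrix right division: a 1x3 row x with x * A = 1
    right = 1 ./ A;               % element-wise division: one sigmoid value per entry

    disp(size(wrong));            % size of the matrix-division result (1 row, 3 columns)
    disp(size(right));            % size of the element-wise result (3 rows, 1 column)
    ```

    Only the element-wise version keeps the shape of `z`, which is what the rest of the cost function relies on.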

    You can keep most of your costFunction code the same, since you're already using the dot product. I would get rid of the sum, because the dot product already sums over the examples. I'll mark my changes with comments:

    function [J, grad] = costFunction(theta, X, y)
    
    m = length(y); % number of training examples
    
    % You need to return the following variables correctly 
    %J = 0; %#ok<NASGU> <-- Don't need to declare this as you'll create the variables later
    %grad = zeros(size(theta)); %#ok<NASGU>
    
    hx = sigmoid(X * theta);  % <-- Remove transpose
    % m is already length(y); length(X) would return the largest dimension of X
    
    J = (-y' * log(hx) - (1 - y')*log(1 - hx)) / m; % <-- Remove sum
    
    grad = X' * (hx - y) / m;
    
    end
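
    A quick way to sanity-check the fix, sketched under assumptions: the data below is made up, and I am assuming the expected 0.693 refers to an initial theta of all zeros. In that case every prediction is 0.5, so the cost is log(2), about 0.693, whatever the data is.

    ```octave
    1;  % marks this file as a script so the functions can be defined inline

    function g = sigmoid(z)
        g = 1.0 ./ (1.0 + exp(-z));   % element-wise division
    end

    function [J, grad] = costFunction(theta, X, y)
        m = length(y);                % number of training examples
        hx = sigmoid(X * theta);      % m x 1 vector of predictions
        J = (-y' * log(hx) - (1 - y)' * log(1 - hx)) / m;
        grad = X' * (hx - y) / m;
    end

    % Hypothetical data: 4 examples, an intercept column plus 2 features
    X = [1 34 78; 1 30 43; 1 60 86; 1 79 75];
    y = [0; 0; 1; 1];
    theta = zeros(3, 1);

    [J, grad] = costFunction(theta, X, y);
    fprintf('J = %.4f, log(2) = %.4f\n', J, log(2));
    ```

    If J comes out different from log(2) at theta = 0, the problem is in the sigmoid or the cost expression rather than in the data.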
    
  • 2021-01-14 16:41

    I think the mistake is in your sigmoid function. Here is a working version, followed by the cost function:

    function g = sigmoid(z)
       g = zeros(size(z));
       temp=1+exp(-1.*z);
       g=1./temp;
    end
    
    
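    function [J, grad] = costFunction(theta, X, y)
       m = length(y);                      % number of training examples
       h = X * theta;                      % linear scores, m x 1
       xtemp = sigmoid(h);                 % predicted probabilities, m x 1
       temp1 = -y' * log(xtemp);           % cost term for the y = 1 examples
       temp2 = (1 - y)' * log(1 - xtemp);  % cost term for the y = 0 examples (sign applied below)
       J = 1 / m * (temp1 - temp2);        % both terms are already scalars, so no sum is needed
       grad = 1 / m * (X' * (xtemp - y));  % gradient, same size as theta
    end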
    

    Also, I think the second term should transpose the whole expression, i.e. (1-y)', as in the temp2 line above.
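
    For reference, this is the cost that the code in this thread appears to implement, assuming the standard logistic regression cost from the course (the equation from the question is not reproduced here). Here $h = g(X\theta)$ with $g$ the sigmoid, and $\log$ is applied element-wise:

    ```latex
    J(\theta)
      = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)}\log h^{(i)} + \big(1-y^{(i)}\big)\log\big(1-h^{(i)}\big) \Big]
      = \frac{1}{m}\Big( -y^{T}\log(h) - (1-y)^{T}\log(1-h) \Big)
    ```

    The inner products $y^{T}\log(h)$ and $(1-y)^{T}\log(1-h)$ already carry out the sum over $i$, which is why no separate sum is needed in the vectorized code.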
