Neural Networks: Sigmoid Activation Function for continuous output variable

闹比i 2021-01-05 18:42

Okay, so I am in the middle of Andrew Ng's machine learning course on Coursera and would like to adapt the neural network that was completed as part of assignment 4.

2 answers
  • 2021-01-05 19:02

    First, your cost function should be:

    J = 1/m * sum( (a3-y).^2 );
    

    I think your Theta2_grad = (delta3'*a2)/m; should match the numerical approximation once delta3 is changed to delta3 = 1/2 * (a3 - y);.

    Check this slide for more details.
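
    As a hedged cross-check of the delta term, the chain rule can be written out explicitly. This sketch assumes the cost above, a sigmoid output unit a3 = sigmoid(z3), and no regularization; the per-example index i is notation added here for clarity.

    ```latex
    J = \frac{1}{m}\sum_{i=1}^{m}\bigl(a_3^{(i)} - y^{(i)}\bigr)^2,
    \qquad a_3^{(i)} = \sigma\bigl(z_3^{(i)}\bigr)

    \frac{\partial J}{\partial z_3^{(i)}}
      = \frac{2}{m}\,\bigl(a_3^{(i)} - y^{(i)}\bigr)\,a_3^{(i)}\bigl(1 - a_3^{(i)}\bigr)
    ```

    Under these assumptions the exact term differs from 1/2 * (a3 - y) (with the 1/m applied in Theta2_grad) by the factor 4 * a3 .* (1 - a3), which equals 1 only at a3 = 0.5. If the output layer is linear instead (a3 = z3) and the cost is scaled as 1/(2*m), the same derivation gives simply (a3 - y)/m.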

    EDIT: In case there is some minor discrepancy between our code, I have pasted mine below for reference. It has already been compared against the numerical approximation in checkNNGradients(lambda); the relative difference is less than 1e-4 (which does not meet the 1e-11 requirement set by Dr. Andrew Ng, though).
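
    The comparison against a numerical approximation can be illustrated on a much smaller problem. The following is a minimal sketch, not the course's checkNNGradients: the toy data, theta, and the anonymous function costFn are all hypothetical, and a single sigmoid unit stands in for the whole network.

    ```octave
    % Hypothetical toy problem: one sigmoid output unit, squared-error cost.
    sigmoid = @(z) 1 ./ (1 + exp(-z));

    X = [1 0.5; 1 -1.2; 1 2.0];   % m = 3 examples, first column is the bias term
    y = [0.2; 0.7; 0.9];          % continuous targets in (0, 1)
    theta = [0.1; -0.3];          % arbitrary parameter values
    m = size(X, 1);

    % Cost J = 1/m * sum((a - y).^2) with a = sigmoid(X * theta)
    costFn = @(t) sum((sigmoid(X * t) - y) .^ 2) / m;

    % Analytic gradient via the chain rule:
    % dJ/dz = 2/m * (a - y) .* a .* (1 - a), then dJ/dtheta = X' * dJ/dz
    a = sigmoid(X * theta);
    grad = X' * (2 / m * (a - y) .* a .* (1 - a));

    % Central-difference approximation, one parameter at a time
    epsilon = 1e-4;
    numgrad = zeros(size(theta));
    for k = 1:numel(theta)
        e = zeros(size(theta));
        e(k) = epsilon;
        numgrad(k) = (costFn(theta + e) - costFn(theta - e)) / (2 * epsilon);
    end

    % Relative difference between the two gradients
    relDiff = norm(numgrad - grad) / norm(numgrad + grad);
    fprintf('relative difference: %g\n', relDiff);
    ```

    When the analytic gradient matches the cost exactly, the remaining difference comes only from the finite-difference step, so a relative difference that stays around 1e-4 may point to a mismatch between the cost and the delta terms.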

    function [J grad] = nnCostFunctionRegression(nn_params, ...
                                       input_layer_size, ...
                                       hidden_layer_size, ...
                                       num_labels, ...
                                       X, y, lambda)
    
    Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                     hidden_layer_size, (input_layer_size + 1));
    
    Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                     num_labels, (hidden_layer_size + 1));
    
    m = size(X, 1);   
    J = 0;
    Theta1_grad = zeros(size(Theta1));
    Theta2_grad = zeros(size(Theta2));
    
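    % Forward pass. Note that z1 holds the hidden-layer activations
    % (bias column added below); zs keeps a copy without the bias (unused).
    X = [ones(m, 1) X];
    z1 = sigmoid(X * Theta1');
    zs = z1;
    z1 = [ones(m, 1) z1];
    z2 = z1 * Theta2';
    ht = sigmoid(z2);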
    
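    % Recode integer labels 1..num_labels as one-hot rows, as in the
    % classification assignment. This indexing requires integer-valued y;
    % a continuous target would have to be used as is, skipping this step.
    y_recode = zeros(length(y), num_labels);
    for i = 1:length(y)
        y_recode(i, y(i)) = 1;
    end
    y = y_recode;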
    
    % Squared-error cost with L2 regularization (bias columns excluded)
    regularization = lambda / 2 / m * (sum(sum(Theta1(:, 2:end) .^ 2)) + sum(sum(Theta2(:, 2:end) .^ 2)));
    J = 1 / m * sum(sum((ht - y) .^ 2)) + regularization;

    % Backpropagation of the error terms
    delta_3 = 1 / 2 * (ht - y);
    delta_2 = delta_3 * Theta2(:, 2:end) .* sigmoidGradient(X * Theta1');

    % Accumulate the gradients over all examples
    delta_cap2 = delta_3' * z1;
    delta_cap1 = delta_2' * X;

    Theta1_grad = ((1 / m) * delta_cap1) + ((lambda / m) * Theta1);
    Theta2_grad = ((1 / m) * delta_cap2) + ((lambda / m) * Theta2);

    % Undo the regularization term for the bias columns
    Theta1_grad(:, 1) = Theta1_grad(:, 1) - ((lambda / m) * Theta1(:, 1));
    Theta2_grad(:, 1) = Theta2_grad(:, 1) - ((lambda / m) * Theta2(:, 1));
    
    
    grad = [Theta1_grad(:) ; Theta2_grad(:)];
    
    end
    
  • 2021-01-05 19:14

    If you want a continuous output, do not apply the sigmoid activation in the output layer when computing the predicted value.

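    a1 = [ones(m, 1) X];          % add the bias column to the inputs
    a2 = sigmoid(a1 * Theta1');   % hidden layer keeps the sigmoid
    a2 = [ones(m, 1) a2];         % add the bias column to the hidden activations
    a3 = a2 * Theta2';            % linear output layer: no sigmoid here
    ht = a3;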
    

    Normalize the input before passing it to nnCostFunction. Everything else remains the same.
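
    One common way to normalize the input is to standardize each feature to zero mean and unit standard deviation. The snippet below is a minimal sketch of that idea, not necessarily what the answer had in mind; the data and the names mu, sigma, and X_norm are hypothetical.

    ```octave
    % Hypothetical data: 4 examples, 2 features on very different scales
    X = [2104 3; 1600 3; 2400 4; 1416 2];

    mu = mean(X);            % per-feature (column) mean, 1 x n
    sigma = std(X);          % per-feature standard deviation, 1 x n
    sigma(sigma == 0) = 1;   % avoid division by zero for constant features

    % Subtract the mean and divide by the standard deviation, column by column.
    % bsxfun is used so that this also runs on versions without broadcasting.
    X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
    ```

    The same mu and sigma would then be reused to transform any new inputs before prediction, so that training and test data share one scale.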
