Neural Networks: Sigmoid Activation Function for continuous output variable

闹比i 2021-01-05 18:42

Okay, so I am in the middle of Andrew Ng's machine learning course on Coursera and would like to adapt the neural network that was completed as part of assignment 4.

2 Answers
  • 2021-01-05 19:02

    First, your cost function should be:

    J = 1/m * sum( (a3-y).^2 );
    

    I think your Theta2_grad = (delta3'*a2)/m; is expected to match the numerical approximation once delta3 is changed to delta3 = 1/2 * (a3 - y);.

    Check this slide for more details.
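
    Concretely, a minimal sketch of that change in the backpropagation step (the variable names a2, a3 and y follow the assignment's conventions; the rest of your code stays as it is):

    delta3 = 1/2 * (a3 - y);            % output-layer error term for the squared-error cost
    Theta2_grad = (delta3' * a2) / m;   % should now match the numerical gradient check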

    EDIT: In case there is some minor discrepancy between our code, I have pasted my code below for your reference. It has already been compared against the numerical approximation with checkNNGradients(lambda); the relative difference is less than 1e-4 (though it does not meet the 1e-11 requirement from Dr. Andrew Ng).

    function [J grad] = nnCostFunctionRegression(nn_params, ...
                                       input_layer_size, ...
                                       hidden_layer_size, ...
                                       num_labels, ...
                                       X, y, lambda)
    
    Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                     hidden_layer_size, (input_layer_size + 1));
    
    Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                     num_labels, (hidden_layer_size + 1));
    
    m = size(X, 1);   
    J = 0;
    Theta1_grad = zeros(size(Theta1));
    Theta2_grad = zeros(size(Theta2));
    
    
    X = [ones(m, 1) X];          % add bias column to the input
    z1 = sigmoid(X * Theta1');   % hidden-layer activations (sigmoid)
    zs = z1;                     % copy of hidden activations (not used below)
    z1 = [ones(m, 1) z1];        % add bias unit to the hidden layer
    z2 = z1 * Theta2';           % output-layer pre-activation
    ht = sigmoid(z2);            % hypothesis (sigmoid output)
    
    
    % Recode y into a one-hot matrix (as in the classification assignment;
    % this step only makes sense when y holds integer labels in 1..num_labels)
    y_recode = zeros(length(y),num_labels);
    for i=1:length(y)
        y_recode(i,y(i))=1;
    end
    y = y_recode;
    
    
    % Regularized squared-error cost (bias columns are excluded from regularization)
    regularization = lambda/2/m * (sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2)));
    J = 1/m * sum(sum((ht - y).^2)) + regularization;
    delta_3 = 1/2 * (ht - y);                                             % output-layer error term
    delta_2 = delta_3 * Theta2(:,2:end) .* sigmoidGradient(X * Theta1');  % hidden-layer error term
    
    delta_cap2 = delta_3' * z1;   % accumulated gradient for Theta2
    delta_cap1 = delta_2' * X;    % accumulated gradient for Theta1
    
    Theta1_grad = ((1/m) * delta_cap1) + ((lambda/m) * Theta1);
    Theta2_grad = ((1/m) * delta_cap2) + ((lambda/m) * Theta2);
    
    % Remove the regularization term from the bias column (it is not regularized)
    Theta1_grad(:,1) = Theta1_grad(:,1) - ((lambda/m) * Theta1(:,1));
    Theta2_grad(:,1) = Theta2_grad(:,1) - ((lambda/m) * Theta2(:,1));
    
    
    grad = [Theta1_grad(:) ; Theta2_grad(:)];
    
    end
    
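    As a quick sanity check, here is a minimal sketch of how the gradient of this cost function can be compared against the numerical approximation on a tiny random problem (randInitializeWeights and computeNumericalGradient are the helper functions shipped with exercise 4; the layer sizes and lambda below are just example values):

    input_layer_size  = 3;
    hidden_layer_size = 5;
    num_labels        = 3;
    m                 = 5;
    lambda            = 3;
    
    Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
    Theta2 = randInitializeWeights(hidden_layer_size, num_labels);
    X = rand(m, input_layer_size);
    y = 1 + mod(1:m, num_labels)';      % small integer labels, as in the exercise
    
    nn_params = [Theta1(:) ; Theta2(:)];
    costFunc = @(p) nnCostFunctionRegression(p, input_layer_size, ...
                        hidden_layer_size, num_labels, X, y, lambda);
    
    [J, grad] = costFunc(nn_params);
    numgrad = computeNumericalGradient(costFunc, nn_params);
    fprintf('Relative difference: %g\n', norm(numgrad - grad) / norm(numgrad + grad));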
  • 2021-01-05 19:14

    If you want a continuous output, do not apply the sigmoid activation when computing the output (hypothesis) value:

    a1 = [ones(m, 1) X];          % add bias unit to the input
    a2 = sigmoid(a1 * Theta1');   % hidden-layer activation (sigmoid)
    a2 = [ones(m, 1) a2];         % add bias unit to the hidden layer
    a3 = a2 * Theta2';            % linear output layer (no sigmoid)
    ht = a3;                      % continuous-valued hypothesis
    

    Normalize the input before using it in nnCostFunction. Everything else remains the same.
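
    For that normalization step, a minimal sketch of a standard z-score scaling of the inputs (the variable names are placeholders; bsxfun keeps it compatible with older MATLAB versions):

    mu    = mean(X);
    sigma = std(X);
    sigma(sigma == 0) = 1;   % avoid dividing by zero for constant features
    X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);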
