regularized

TensorFlow - reproducing results when using dropout

Posted by 纵然是瞬间 on 2021-02-07 20:58:51
Question: I am training a neural network using dropout regularization. I save the weights and biases the network is initialized with, so that I can repeat the experiment when I get good results. However, dropout introduces some randomness into the network: since dropout drops units randomly, each time I rerun the network different units are dropped, even though I initialize the network with the exact same weights and biases (if I understand this correctly). Is there a way to make the
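The usual fix is to seed the random number generator that draws the dropout masks, in addition to saving the initial weights. A minimal numpy sketch of the idea (the `dropout` helper and seed value are illustrative, not the asker's code):

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Apply inverted dropout using an explicitly seeded generator."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

# Seeding the generator makes the sequence of masks repeatable,
# so rerunning the experiment drops the same units each time.
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)
a = np.ones((3, 4))
out1 = dropout(a, 0.8, rng1)
out2 = dropout(a, 0.8, rng2)
# out1 and out2 are identical: same seed, same masks
```

In TensorFlow itself the analogous step is setting the random seed before building the graph (`tf.random.set_seed` in TF2, `tf.set_random_seed` in TF1), so the dropout ops draw the same masks on every run.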

Lack of Sparse Solution with L1 Regularization in PyTorch

Posted by 落花浮王杯 on 2021-01-27 12:52:48
Question: I am trying to apply L1 regularization to the first layer of a simple neural network (one hidden layer). I looked at other StackOverflow posts that apply L1 regularization in PyTorch to figure out how it should be done (references: Adding L1/L2 regularization in PyTorch?, In Pytorch, how to add L1 regularizer to activations?). No matter how high I set lambda (the L1 regularization strength parameter), I never get true zeros in the first weight matrix. Why would this be?
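A common reason: adding `lambda * w.abs().sum()` to the loss and taking plain (sub)gradient steps almost never lands on exactly zero, because the gradient of |w| just nudges weights toward zero without clamping them there. Exact sparsity usually requires a proximal update (soft-thresholding) applied after each gradient step. A numpy sketch of the soft-thresholding operator (names are illustrative):

```python
import numpy as np

def soft_threshold(w, threshold):
    """Proximal operator of the L1 norm: shrinks weights toward zero
    and sets those within `threshold` of zero exactly to 0."""
    return np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)

w = np.array([0.5, -0.03, 0.2, -0.8, 0.01])
w_sparse = soft_threshold(w, 0.05)   # threshold = lr * lambda
# → [0.45, 0.0, 0.15, -0.75, 0.0]: entries with |w| <= 0.05 become exact zeros
```

In PyTorch this would correspond to applying the same shrinkage to the first layer's weight tensor inside a `torch.no_grad()` block after each `optimizer.step()`.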

Regularized logistic regression code in matlab

Posted by 本小妞迷上赌 on 2021-01-20 14:42:39
Question: I'm trying my hand at regularized logistic regression in MATLAB, using these formulas:

The cost function:
J(theta) = (1/m) * sum(-y_i*log(h(x_i)) - (1-y_i)*log(1-h(x_i))) + (lambda/(2*m)) * sum(theta_j^2)

The gradient:
∂J(theta)/∂theta_0 = (1/m) * sum((h(x_i)-y_i)*x_ij)                      if j = 0
∂J(theta)/∂theta_j = (1/m) * sum((h(x_i)-y_i)*x_ij) + (lambda/m)*theta_j  if j >= 1

(This is not MATLAB code, just the formulas.) So far I've done this:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
J = 0;
grad = zeros(size(theta));
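For reference, the formulas above translate directly into a vectorized implementation. A numpy sketch (MATLAB is 1-indexed; here `theta[0]` is the unregularized intercept, and the design matrix `X` is assumed to have a leading column of ones):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function_reg(theta, X, y, lam):
    """Regularized logistic-regression cost and gradient.
    The intercept theta[0] is not regularized."""
    m = len(y)
    h = sigmoid(X @ theta)
    # Unregularized cross-entropy cost
    J = (1.0 / m) * np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h))
    # Penalty term, skipping theta[0]
    J += (lam / (2.0 * m)) * np.sum(theta[1:] ** 2)
    grad = (1.0 / m) * (X.T @ (h - y))
    grad[1:] += (lam / m) * theta[1:]
    return J, grad

# At theta = 0, h(x) = 0.5 everywhere, so J = log(2) regardless of y.
theta = np.zeros(3)
X = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
y = np.array([0.0, 1.0])
J, grad = cost_function_reg(theta, X, y, 1.0)
```

The same structure carries over to the MATLAB version: compute the unregularized cost and gradient first, then add the penalty to every component except the first.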

How inverting the dropout compensates the effect of dropout and keeps expected values unchanged?

Posted by 南笙酒味 on 2020-05-16 04:42:25
Question: I'm learning regularization in neural networks from the deeplearning.ai course. In the dropout regularization lecture, the professor says that if dropout is applied, the computed activation values will be smaller than when dropout is not applied (at test time). So we need to scale the activations up in order to keep the testing phase simpler. I understand this fact, but I don't understand how the scaling is done. Here is a code sample used to implement inverted dropout. keep_prob = 0.8 # 0 <=
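The scaling in inverted dropout is the division by `keep_prob`: each surviving activation is boosted by 1/keep_prob, so the layer's expected output matches the no-dropout value and the test phase needs no correction. A small numpy sketch showing the expectation is preserved (variable names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
keep_prob = 0.8
a = np.full(1000, 2.0)            # activations, all equal to 2.0

# Inverted dropout: zero out ~20% of units, scale survivors by 1/keep_prob
mask = rng.random(a.shape) < keep_prob
a_dropped = (a * mask) / keep_prob

# Each unit is 2.0/0.8 = 2.5 with probability 0.8, else 0:
# E[output] = 0.8 * 2.5 = 2.0, the original activation value.
print(a_dropped.mean())   # close to 2.0
```

Without the division, the expected activation would shrink to keep_prob * 2.0 = 1.6, and the test-time network (which uses all units) would see systematically larger inputs than it saw during training.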

Is it reasonable for l1/l2 regularization to cause all feature weights to be zero in vowpal wabbit?

Posted by |▌冷眼眸甩不掉的悲伤 on 2020-02-01 08:28:37
Question: I got a weird result from vw, which uses an online learning scheme for logistic regression. When I add --l1 or --l2 regularization, I get all predictions at 0.5 (which means all features are 0). Here's my command:

vw -d training_data.txt --loss_function logistic -f model_l1 --invert_hash model_readable_l1 --l1 0.05 --link logistic

...and here's the learning process info:

using l1 regularization = 0.05
final_regressor = model_l1
Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t =
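This outcome is plausible: in vw's online setting the L1 penalty is applied at every per-example update (via truncated gradient), so --l1 0.05 is a very aggressive value. When the truncation applied at each step exceeds the typical gradient magnitude, every weight gets pulled back to exactly zero, and the logistic link then outputs 0.5 for everything. A numpy sketch of truncated-gradient online updates (the constants are illustrative, not vw's internals):

```python
import numpy as np

def online_l1_step(w, grad, lr, l1):
    """One SGD step followed by L1 truncation toward zero."""
    w = w - lr * grad
    return np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)

w = np.zeros(4)
rng = np.random.default_rng(1)
for _ in range(100):
    grad = rng.normal(scale=0.01, size=4)    # small per-example gradients
    w = online_l1_step(w, grad, lr=0.5, l1=0.05)
# Because lr * l1 = 0.025 exceeds every update's magnitude here,
# each step's progress is truncated away and w stays at zero.
```

The practical fix is usually to shrink --l1 by several orders of magnitude (e.g. 1e-6) and increase it until the desired sparsity appears without zeroing everything.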

TensorFlow - regularization with L2 loss, how to apply to all weights, not just last one?

Posted by 北城以北 on 2019-12-29 10:07:16
Question: I am playing with an ANN that is part of the Udacity DeepLearning course. I have an assignment which involves introducing regularization to a network with one hidden ReLU layer using L2 loss. I wonder how to introduce it properly so that ALL weights are penalized, not only the weights of the output layer. Code for the network without regularization is at the bottom of the post (code to actually run the training is out of the scope of the question). The obvious way of introducing the L2 is to replace the
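The answer is to sum the L2 penalty over every weight matrix (typically excluding biases) and add that single term to the data loss, instead of penalizing only the output layer. A numpy sketch of the loss composition (`beta` and the layer names are assumptions for illustration):

```python
import numpy as np

def l2_penalty(weight_matrices, beta):
    """Sum of squared weights over ALL layers, scaled by beta."""
    return beta * sum(np.sum(w ** 2) for w in weight_matrices)

w_hidden = np.ones((3, 5))     # 15 entries of 1.0
w_out = np.ones((5, 2))        # 10 entries of 1.0
data_loss = 1.0                # stand-in for the cross-entropy term

total_loss = data_loss + l2_penalty([w_hidden, w_out], beta=0.01)
# Both layers contribute: 1.0 + 0.01 * (15 + 10) = 1.25
```

In the TF1-style code from the course this would look roughly like `loss = cross_entropy + beta * (tf.nn.l2_loss(weights_hidden) + tf.nn.l2_loss(weights_out))`; note that `tf.nn.l2_loss` computes `sum(t ** 2) / 2`, i.e. it includes a factor of one half.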

How to create an autoencoder where each layer of encoder should represent the same as a layer of the decoder

Posted by 泪湿孤枕 on 2019-12-23 21:18:12
Question: I want to build an autoencoder where each layer in the encoder has the same meaning as a corresponding layer in the decoder. So if the autoencoder is perfectly trained, the values of those layers should be roughly the same. Let's say the autoencoder consists of e1 -> e2 -> e3 -> d2 -> d1, where e1 is the input and d1 the output. A normal autoencoder trains d1 to match e1, but I want the additional constraint that e2 and d2 are the same. Therefore I want an additional
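One way to express such a constraint is a composite loss: the usual reconstruction term between e1 and d1, plus a penalty on the distance between the e2 and d2 activations. A numpy sketch of the loss (layer names follow the question; `lam` is an assumed weighting factor):

```python
import numpy as np

def autoencoder_loss(e1, d1, e2, d2, lam):
    """Reconstruction loss plus a penalty tying the encoder
    layer e2 to its mirror decoder layer d2."""
    reconstruction = np.mean((d1 - e1) ** 2)
    layer_match = np.mean((d2 - e2) ** 2)
    return reconstruction + lam * layer_match

e1 = np.array([1.0, 2.0]); d1 = np.array([1.0, 2.0])   # perfect reconstruction
e2 = np.array([0.5, 0.5]); d2 = np.array([0.7, 0.3])   # mismatched middle layers
loss = autoencoder_loss(e1, d1, e2, d2, lam=0.5)
# reconstruction = 0, layer_match = mean([0.04, 0.04]) = 0.04
# loss = 0 + 0.5 * 0.04 = 0.02
```

In a framework like Keras or PyTorch the same idea is implemented by exposing the intermediate activations as additional outputs (or capturing them with hooks) and adding the matching term to the training loss.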