regularized

TensorFlow - reproducing results when using dropout

Posted by 纵然是瞬间 on 2021-02-07 20:58:51
Question: I am training a neural network using dropout regularization. I save the weights and biases the network is initialized with, so that I can repeat the experiment when I get good results. However, dropout introduces some randomness into the network: since dropout drops units randomly, each time I rerun the network different units are dropped, even though I initialize the network with the exact same weights and biases (if I understand this correctly). Is there a way to make the
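The usual fix is to seed the random number generator that draws the dropout masks, in addition to saving the initial weights. A minimal numpy sketch of the idea (the `dropout` helper and seed value are illustrative, not the asker's code):

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Apply inverted dropout using an explicitly seeded generator."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

# Seeding the generator makes the sequence of masks repeatable,
# so rerunning the experiment drops the same units each time.
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)
a = np.ones((3, 4))
out1 = dropout(a, 0.8, rng1)
out2 = dropout(a, 0.8, rng2)
# out1 and out2 are identical: same seed, same masks
```

In TensorFlow itself the analogous step is setting the random seed before building the graph (`tf.random.set_seed` in TF2, `tf.set_random_seed` in TF1), so the dropout ops draw the same masks on every run.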

Lack of Sparse Solution with L1 Regularization in PyTorch

Posted by 落花浮王杯 on 2021-01-27 12:52:48
Question: I am trying to apply L1 regularization to the first layer of a simple neural network (one hidden layer). I looked at other StackOverflow posts that apply L1 regularization in PyTorch to figure out how it should be done (references: Adding L1/L2 regularization in PyTorch?, In Pytorch, how to add L1 regularizer to activations?). No matter how high I set lambda (the L1 regularization strength parameter), I never get true zeros in the first weight matrix. Why would this be?
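A common reason: adding `lambda * w.abs().sum()` to the loss and taking plain (sub)gradient steps almost never lands on exactly zero, because the gradient of |w| just nudges weights toward zero without clamping them there. Exact sparsity usually requires a proximal update (soft-thresholding) applied after each gradient step. A numpy sketch of the soft-thresholding operator (names are illustrative):

```python
import numpy as np

def soft_threshold(w, threshold):
    """Proximal operator of the L1 norm: shrinks weights toward zero
    and sets those within `threshold` of zero exactly to 0."""
    return np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)

w = np.array([0.5, -0.03, 0.2, -0.8, 0.01])
w_sparse = soft_threshold(w, 0.05)   # threshold = lr * lambda
# → [0.45, 0.0, 0.15, -0.75, 0.0]: entries with |w| <= 0.05 become exact zeros
```

In PyTorch this would correspond to applying the same shrinkage to the first layer's weight tensor inside a `torch.no_grad()` block after each `optimizer.step()`.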

Regularized logistic regression code in matlab

Posted by 本小妞迷上赌 on 2021-01-20 14:42:39
Question: I'm trying my hand at regularized logistic regression in MATLAB, using these formulas:

The cost function:
J(theta) = (1/m) * sum(-y_i*log(h(x_i)) - (1-y_i)*log(1-h(x_i))) + (lambda/(2*m)) * sum(theta_j^2)

The gradient:
∂J(theta)/∂theta_0 = (1/m) * sum((h(x_i)-y_i)*x_ij)                      if j = 0
∂J(theta)/∂theta_j = (1/m) * sum((h(x_i)-y_i)*x_ij) + (lambda/m)*theta_j  if j >= 1

(This is not MATLAB code, just the formulas.) So far I've done this:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
J = 0;
grad = zeros(size(theta));
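For reference, the formulas above translate directly into a vectorized implementation. A numpy sketch (MATLAB is 1-indexed; here `theta[0]` is the unregularized intercept, and the design matrix `X` is assumed to have a leading column of ones):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function_reg(theta, X, y, lam):
    """Regularized logistic-regression cost and gradient.
    The intercept theta[0] is not regularized."""
    m = len(y)
    h = sigmoid(X @ theta)
    # Unregularized cross-entropy cost
    J = (1.0 / m) * np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h))
    # Penalty term, skipping theta[0]
    J += (lam / (2.0 * m)) * np.sum(theta[1:] ** 2)
    grad = (1.0 / m) * (X.T @ (h - y))
    grad[1:] += (lam / m) * theta[1:]
    return J, grad

# At theta = 0, h(x) = 0.5 everywhere, so J = log(2) regardless of y.
theta = np.zeros(3)
X = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
y = np.array([0.0, 1.0])
J, grad = cost_function_reg(theta, X, y, 1.0)
```

The same structure carries over to the MATLAB version: compute the unregularized cost and gradient first, then add the penalty to every component except the first.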

How inverting the dropout compensates the effect of dropout and keeps expected values unchanged?

Posted by 南笙酒味 on 2020-05-16 04:42:25
Question: I'm learning regularization in neural networks from the deeplearning.ai course. In the dropout regularization lecture, the professor says that if dropout is applied, the computed activation values will be smaller than when dropout is not applied (at test time). So we need to scale the activations up in order to keep the testing phase simpler. I understand this fact, but I don't understand how the scaling is done. Here is a code sample used to implement inverted dropout. keep_prob = 0.8 # 0 <=
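The scaling in inverted dropout is the division by `keep_prob`: each surviving activation is boosted by 1/keep_prob, so the layer's expected output matches the no-dropout value and the test phase needs no correction. A small numpy sketch showing the expectation is preserved (variable names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
keep_prob = 0.8
a = np.full(1000, 2.0)            # activations, all equal to 2.0

# Inverted dropout: zero out ~20% of units, scale survivors by 1/keep_prob
mask = rng.random(a.shape) < keep_prob
a_dropped = (a * mask) / keep_prob

# Each unit is 2.0/0.8 = 2.5 with probability 0.8, else 0:
# E[output] = 0.8 * 2.5 = 2.0, the original activation value.
print(a_dropped.mean())   # close to 2.0
```

Without the division, the expected activation would shrink to keep_prob * 2.0 = 1.6, and the test-time network (which uses all units) would see systematically larger inputs than it saw during training.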

Is it reasonable for l1/l2 regularization to cause all feature weights to be zero in vowpal wabbit?

Posted by |▌冷眼眸甩不掉的悲伤 on 2020-02-01 08:28:37
Question: I got a weird result from vw, which uses an online learning scheme for logistic regression. When I add --l1 or --l2 regularization, I get all predictions at 0.5 (which means all features are 0). Here's my command:

vw -d training_data.txt --loss_function logistic -f model_l1 --invert_hash model_readable_l1 --l1 0.05 --link logistic

...and here's the learning process info:

using l1 regularization = 0.05
final_regressor = model_l1
Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t =
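This outcome is plausible: in vw's online setting the L1 penalty is applied at every per-example update (via truncated gradient), so --l1 0.05 is a very aggressive value. When the truncation applied at each step exceeds the typical gradient magnitude, every weight gets pulled back to exactly zero, and the logistic link then outputs 0.5 for everything. A numpy sketch of truncated-gradient online updates (the constants are illustrative, not vw's internals):

```python
import numpy as np

def online_l1_step(w, grad, lr, l1):
    """One SGD step followed by L1 truncation toward zero."""
    w = w - lr * grad
    return np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)

w = np.zeros(4)
rng = np.random.default_rng(1)
for _ in range(100):
    grad = rng.normal(scale=0.01, size=4)    # small per-example gradients
    w = online_l1_step(w, grad, lr=0.5, l1=0.05)
# Because lr * l1 = 0.025 exceeds every update's magnitude here,
# each step's progress is truncated away and w stays at zero.
```

The practical fix is usually to shrink --l1 by several orders of magnitude (e.g. 1e-6) and increase it until the desired sparsity appears without zeroing everything.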

TensorFlow - regularization with L2 loss, how to apply to all weights, not just last one?

Posted by 北城以北 on 2019-12-29 10:07:16
Question: I am playing with an ANN that is part of the Udacity DeepLearning course. I have an assignment which involves introducing regularization to a network with one hidden ReLU layer using L2 loss. I wonder how to introduce it properly so that ALL weights are penalized, not only the weights of the output layer. Code for the network without regularization is at the bottom of the post (code to actually run the training is out of the scope of the question). The obvious way of introducing the L2 is to replace the
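The answer is to sum the L2 penalty over every weight matrix (typically excluding biases) and add that single term to the data loss, instead of penalizing only the output layer. A numpy sketch of the loss composition (`beta` and the layer names are assumptions for illustration):

```python
import numpy as np

def l2_penalty(weight_matrices, beta):
    """Sum of squared weights over ALL layers, scaled by beta."""
    return beta * sum(np.sum(w ** 2) for w in weight_matrices)

w_hidden = np.ones((3, 5))     # 15 entries of 1.0
w_out = np.ones((5, 2))        # 10 entries of 1.0
data_loss = 1.0                # stand-in for the cross-entropy term

total_loss = data_loss + l2_penalty([w_hidden, w_out], beta=0.01)
# Both layers contribute: 1.0 + 0.01 * (15 + 10) = 1.25
```

In the TF1-style code from the course this would look roughly like `loss = cross_entropy + beta * (tf.nn.l2_loss(weights_hidden) + tf.nn.l2_loss(weights_out))`; note that `tf.nn.l2_loss` computes `sum(t ** 2) / 2`, i.e. it includes a factor of one half.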

How to create an autoencoder where each layer of encoder should represent the same as a layer of the decoder

Posted by 泪湿孤枕 on 2019-12-23 21:18:12
Question: I want to build an autoencoder where each layer in the encoder has the same meaning as a corresponding layer in the decoder. So if the autoencoder is perfectly trained, the values of those layers should be roughly the same. Let's say the autoencoder consists of e1 -> e2 -> e3 -> d2 -> d1, where e1 is the input and d1 the output. A normal autoencoder trains d1 to match e1, but I want the additional constraint that e2 and d2 are the same. Therefore I want an additional
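One way to express such a constraint is a composite loss: the usual reconstruction term between e1 and d1, plus a penalty on the distance between the e2 and d2 activations. A numpy sketch of the loss (layer names follow the question; `lam` is an assumed weighting factor):

```python
import numpy as np

def autoencoder_loss(e1, d1, e2, d2, lam):
    """Reconstruction loss plus a penalty tying the encoder
    layer e2 to its mirror decoder layer d2."""
    reconstruction = np.mean((d1 - e1) ** 2)
    layer_match = np.mean((d2 - e2) ** 2)
    return reconstruction + lam * layer_match

e1 = np.array([1.0, 2.0]); d1 = np.array([1.0, 2.0])   # perfect reconstruction
e2 = np.array([0.5, 0.5]); d2 = np.array([0.7, 0.3])   # mismatched middle layers
loss = autoencoder_loss(e1, d1, e2, d2, lam=0.5)
# reconstruction = 0, layer_match = mean([0.04, 0.04]) = 0.04
# loss = 0 + 0.5 * 0.04 = 0.02
```

In a framework like Keras or PyTorch the same idea is implemented by exposing the intermediate activations as additional outputs (or capturing them with hooks) and adding the matching term to the training loss.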