activation-function

What is the intuition of using tanh in LSTM

Submitted by 人走茶凉 on 2019-11-29 19:31:56
In an LSTM network (Understanding LSTMs), why do the input gate and output gate use tanh? What is the intuition behind this? Is it just a nonlinear transformation? If so, can I change both to another activation function (e.g. ReLU)?

Sigmoid is used as the gating function for the three gates (input, output, forget) in an LSTM specifically because it outputs a value between 0 and 1, so it can let either no flow or complete flow of information through a gate. On the other hand, to overcome the vanishing gradient problem, we need a function whose second derivative can sustain for a long range before going to zero.
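
The gate arithmetic described above can be sketched in a few lines. The following is a minimal illustration in plain Python with scalar state and no learned weights (the gate pre-activations are passed in directly, which is a simplification; a real LSTM computes them from the current input and previous hidden state):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(c_prev, h_prev, z_f, z_i, z_g, z_o):
    """One LSTM step on scalar values.

    z_f, z_i, z_o: gate pre-activations. Sigmoid squashes them to [0, 1],
                   so each gate scales information flow between "none" and "all".
    z_g:           candidate pre-activation. Tanh squashes it to [-1, 1],
                   keeping the cell-state update bounded and zero-centered.
    """
    f = sigmoid(z_f)          # forget gate: how much old cell state to keep
    i = sigmoid(z_i)          # input gate: how much of the candidate to admit
    g = math.tanh(z_g)        # candidate value in [-1, 1]
    o = sigmoid(z_o)          # output gate: how much of the state to expose
    c = f * c_prev + i * g    # new cell state
    h = o * math.tanh(c)      # new hidden state, again bounded by tanh
    return c, h

# With the forget gate saturated open (z_f large) and the input gate shut
# (z_i very negative), the old cell state is carried through almost unchanged.
c, h = lstm_step(c_prev=0.5, h_prev=0.0, z_f=50.0, z_i=-50.0, z_g=1.0, z_o=0.0)
```

Because each sigmoid gate multiplies a signal by a factor in [0, 1], a gate near 1 passes information through almost untouched across many time steps, while tanh keeps the candidate and output values bounded so the cell state cannot blow up.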

How to make a custom activation function with only Python in Tensorflow?

Submitted by 久未见 on 2019-11-26 00:41:40
Question: Suppose you need to make an activation function that is not possible using only pre-defined TensorFlow building blocks. What can you do? In TensorFlow it is possible to make your own activation function, but it is quite complicated: you have to write it in C++ and recompile the whole of TensorFlow [1] [2]. Is there a simpler way?

Answer 1: Yes, there is! Credit: it was hard to find the information and get it working, but here is an example copying from the principles and code found here and here. Requirements: before we start, there are two requirements for this to succeed. First you need
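
The principle the answer builds on, supplying both a forward function and its hand-written gradient from Python rather than writing a C++ op, can be illustrated without TensorFlow at all. The sketch below uses a leaky ReLU purely as an example activation (my choice for illustration, not from the answer) and checks the analytic gradient against a finite-difference estimate; in TensorFlow the same forward/gradient pair could then be wired up via `tf.py_func` with a gradient override, as the approach the answer describes, or with `tf.custom_gradient` in newer versions:

```python
# Python-only custom activation, in principle: you provide the forward
# function AND its derivative yourself, then hand the pair to the framework.
# The leaky ReLU here is only an illustrative choice of activation.

def leaky_relu(x, alpha=0.1):
    """Forward pass of the custom activation."""
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.1):
    """Hand-written derivative of the forward pass."""
    return 1.0 if x > 0 else alpha

def numerical_grad(f, x, eps=1e-6):
    """Finite-difference estimate, used to sanity-check the gradient."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# The analytic gradient should agree with the numerical one away from x = 0
# (the kink itself is not differentiable, as with the ordinary ReLU).
for x in (-2.0, -0.5, 0.5, 2.0):
    assert abs(leaky_relu_grad(x) - numerical_grad(leaky_relu, x)) < 1e-4
```

Verifying a hand-written gradient numerically like this is worthwhile whatever mechanism you use to register it: a wrong gradient will not raise an error, it will just silently stall training.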
