Is there a simple way to extend an existing activation function? My custom softmax function returns: An operation has `None` for gradient
I want to make softmax faster by computing it only over the top k values of the logits vector. To that end I tried implementing a custom function for TensorFlow to use in a model:

```python
import tensorflow as tf

def softmax_top_k(logits, k=10):
    # keep only the k largest logits and their positions
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    # softmax over just those k values
    softmax = tf.nn.softmax(values)
    # scatter the k probabilities back into a dense tensor of the original shape
    logits_shape = tf.shape(logits)
    return_value = tf.sparse_to_dense(indices, logits_shape, softmax)
    return_value = tf.convert_to_tensor(return_value, dtype=logits.dtype, name=logits.name)
    return return_value
```
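For reference, here is a minimal sketch of the same top-k idea written with `tf.scatter_nd` instead of `tf.sparse_to_dense`; the helper name `softmax_top_k_scatter` and the restriction to a 1-D logits vector are assumptions made for the example, not part of the code above:

```python
import tensorflow as tf

def softmax_top_k_scatter(logits, k=10):
    # Sketch only: assumes `logits` is a 1-D vector of length >= k.
    # Pick the k largest logits; `indices` are int32 positions into `logits`.
    values, indices = tf.nn.top_k(logits, k=k, sorted=False)
    # Softmax over just those k values.
    probs = tf.nn.softmax(values)
    # tf.scatter_nd expects indices of shape [k, 1] for a 1-D output,
    # and it defines a gradient with respect to its `updates` argument.
    return tf.scatter_nd(tf.expand_dims(indices, -1), probs, tf.shape(logits))
```

Unlike `tf.sparse_to_dense`, `tf.scatter_nd` has a registered gradient for its `updates` input, so the scatter step stays differentiable with respect to the softmax values; whether that resolves the `None`-gradient error in a full model is left to the answers.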