Is there a simple way to extend an existing activation function? My custom softmax function returns: An operation has `None` for gradient
I want to make softmax faster by computing it only over the top k values of the logits vector. To that end I tried implementing a custom function for TensorFlow to use in a model:

```python
import tensorflow as tf

def softmax_top_k(logits, k=10):
    # keep only the k largest logits and their positions
    values, indices = tf.nn.top_k(logits, k, sorted=False)
    # softmax over just those k values
    softmax = tf.nn.softmax(values)
    # scatter the k probabilities back into a dense tensor of the original shape
    logits_shape = tf.shape(logits)
    return_value = tf.sparse_to_dense(indices, logits_shape, softmax)
    return_value = tf.convert_to_tensor(return_value, dtype=logits.dtype, name=logits.name)
    return return_value
```
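For reference, here is a minimal sketch of the same top-k idea written with `tf.scatter_nd` instead of `tf.sparse_to_dense`; the helper name `softmax_top_k_scatter` and the restriction to a 1-D logits vector are assumptions made for the example, not part of the code above:

```python
import tensorflow as tf

def softmax_top_k_scatter(logits, k=10):
    # Sketch only: assumes `logits` is a 1-D vector of length >= k.
    # Pick the k largest logits; `indices` are int32 positions into `logits`.
    values, indices = tf.nn.top_k(logits, k=k, sorted=False)
    # Softmax over just those k values.
    probs = tf.nn.softmax(values)
    # tf.scatter_nd expects indices of shape [k, 1] for a 1-D output,
    # and it defines a gradient with respect to its `updates` argument.
    return tf.scatter_nd(tf.expand_dims(indices, -1), probs, tf.shape(logits))
```

Unlike `tf.sparse_to_dense`, `tf.scatter_nd` has a registered gradient for its `updates` input, so the scatter step stays differentiable with respect to the softmax values; whether that resolves the `None`-gradient error in a full model is left to the answers.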