TensorFlow custom activation function


Question


I implemented a network with TensorFlow and created the model with the following code:

def multilayer_perceptron(x, weights, biases):
    # Hidden layer: affine transform followed by ReLU activation
    layer_1 = tf.add(tf.matmul(x, weights["h1"]), biases["b1"])
    layer_1 = tf.nn.relu(layer_1)
    # Output layer: linear (no activation)
    out_layer = tf.add(tf.matmul(layer_1, weights["out"]), biases["out"])
    return out_layer

I initialize the weights and biases as follows:

weights = {
    "h1": tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    "out": tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
}

biases = {
    "b1": tf.Variable(tf.random_normal([n_hidden_1])),
    "out": tf.Variable(tf.random_normal([n_classes]))
}

Now I want to use a custom activation function, so I replaced tf.nn.relu(layer_1) with a call to custom_sigmoid(layer_1), which is defined as:

def custom_sigmoid(x):
    # beta: one trainable slope parameter per hidden unit
    beta = tf.Variable(tf.random_normal([int(x.get_shape()[1])]))
    return tf.sigmoid(beta * x)

Here beta is a trainable parameter. I realized that this cannot work, because I don't know how to implement the derivative so that TensorFlow can use it.

Question: How can I use a custom activation function in TensorFlow? I would really appreciate any help.


Answer 1:


That's the beauty of automatic differentiation! You don't need to know how to compute the derivative of your function, as long as you build it entirely from TensorFlow constructs that are themselves differentiable (a few TensorFlow operations simply are not differentiable).

For everything else, TensorFlow computes the derivative for you: any combination of inherently differentiable operations can be used, and you never need to think about the gradient yourself. You can validate this by calling tf.gradients in a test case to show that TensorFlow is computing the gradient of your cost function with respect to beta.
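
For example, here is a minimal sketch of such a check (the toy placeholder shapes, the beta variable, and the name cost are assumptions for illustration, not part of the original question):

import tensorflow as tf

# Toy graph: a sigmoid activation with a trainable slope, reduced to a scalar cost
x = tf.placeholder(tf.float32, [None, 3])
beta = tf.Variable(tf.random_normal([3]))
cost = tf.reduce_mean(tf.sigmoid(beta * x))

# tf.gradients returns the symbolic gradient of cost w.r.t. beta;
# a non-None result means TensorFlow can differentiate through the activation
grad_beta = tf.gradients(cost, beta)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad_beta, feed_dict={x: [[1.0, 2.0, 3.0]]}))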

Here's a really nice explanation of automatic differentiation for the curious:

https://alexey.radul.name/ideas/2013/introduction-to-automatic-differentiation/

You can make sure that beta is a trainable parameter by checking that it exists in the collection tf.GraphKeys.TRAINABLE_VARIABLES. That means the optimizer will compute its derivative with respect to the cost and update it; if it's not in that collection, you should investigate why.
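
As a quick sketch of that check (assuming the graph from the question has already been built, using the TF 1.x collections API):

# List every trainable variable; beta should appear here if the
# optimizer is going to update it
for var in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES):
    print(var.name, var.shape)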




Answer 2:


I'll try to answer my own question. Here is what I did and what seems to work:

First I define a custom activation function:

def custom_sigmoid(x, beta_weights):
    # Sigmoid with trainable per-unit slope parameters
    return tf.sigmoid(beta_weights * x)

Then I create weights for the activation function:

beta_weights = {
    "beta1": tf.Variable(tf.random_normal([n_hidden_1]))
}

Finally I add beta_weights to my model function and replace the activation function in multilayer_perceptron():

def multilayer_perceptron(x, weights, biases, beta_weights):
    layer_1 = tf.add(tf.matmul(x, weights["h1"]), biases["b1"])
    #layer_1 = tf.nn.relu(layer_1) # Old
    layer_1 = custom_sigmoid(layer_1, beta_weights["beta1"]) # New
    out_layer = tf.add(tf.matmul(layer_1, weights["out"]), biases["out"])
    return out_layer
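
For completeness, here is a sketch of how the optimizer could be wired up so that beta_weights["beta1"] is trained along with the other variables; the placeholders, the softmax cross-entropy cost, and the learning rate are illustrative assumptions, not part of the original answer:

x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

logits = multilayer_perceptron(x, weights, biases, beta_weights)
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))
# minimize() differentiates the cost w.r.t. all trainable variables,
# including beta_weights["beta1"], and creates the update ops
train_op = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)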


Source: https://stackoverflow.com/questions/49923958/tensorflow-custom-activation-function
