Question
I'm trying to write a custom activation function for use with Keras. I cannot write it with TensorFlow primitives because they do not compute the derivative properly. I followed "How to make a custom activation function with only Python in Tensorflow?" and it works very well for creating a TensorFlow function. However, when I tried to use it in Keras as an activation function for the classic MNIST demo, I got errors. I also tried the tf_spiky function from the above reference.
Here is the sample code:
tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf_spiky),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
Here's my entire error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-48-73a57f81db19> in <module>
3 tf.keras.layers.Dense(512, activation=tf_spiky),
4 tf.keras.layers.Dropout(0.2),
----> 5 tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
6 x=tf.keras.layers.Activation(tf_spiky)
7 y=tf.keras.layers.Flatten(input_shape=(28, 28))
/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py in _method_wrapper(self, *args, **kwargs)
472 self._setattr_tracking = False # pylint: disable=protected-access
473 try:
--> 474 method(self, *args, **kwargs)
475 finally:
476 self._setattr_tracking = previous_value # pylint: disable=protected-access
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py in __init__(self, layers, name)
106 if layers:
107 for layer in layers:
--> 108 self.add(layer)
109
110 @property
/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py in _method_wrapper(self, *args, **kwargs)
472 self._setattr_tracking = False # pylint: disable=protected-access
473 try:
--> 474 method(self, *args, **kwargs)
475 finally:
476 self._setattr_tracking = previous_value # pylint: disable=protected-access
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py in add(self, layer)
173 # If the model is being built continuously on top of an input layer:
174 # refresh its output.
--> 175 output_tensor = layer(self.outputs[0])
176 if isinstance(output_tensor, list):
177 raise TypeError('All layers in a Sequential model '
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
728
729 # Check input assumptions set before layer building, e.g. input rank.
--> 730 self._assert_input_compatibility(inputs)
731 if input_list and self._dtype is None:
732 try:
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in _assert_input_compatibility(self, inputs)
1463 if x.shape.ndims is None:
1464 raise ValueError('Input ' + str(input_index) + ' of layer ' +
-> 1465 self.name + ' is incompatible with the layer: '
1466 'its rank is undefined, but the layer requires a '
1467 'defined rank.')
ValueError: Input 0 of layer dense_1 is incompatible with the layer: its rank is undefined, but the layer requires a defined rank.
From this I gather that the last Dense layer is unable to get the dimensions of the output of the activation function, or something to that effect. I did see in the TensorFlow code that many activation functions register a shape, but either I'm not doing that correctly or I'm going in the wrong direction. My guess is that something needs to be done to the TensorFlow function to make it an activation function that Keras can use.
I would appreciate any help you can give.
As requested, here is the sample code for tf_spiky. It works as described in the above reference; however, once it is put into Keras, I get the errors shown. It is pretty much as shown in the "How to make a custom activation function with only Python in Tensorflow?" Stack Overflow article.
import numpy as np

def spiky(x):
    print(x)
    r = x % 1
    if r <= 0.5:
        return r
    else:
        # Return a float so np.vectorize does not infer an integer output type.
        return 0.0

def d_spiky(x):
    r = x % 1
    if r <= 0.5:
        return 1
    else:
        return 0

np_spiky = np.vectorize(spiky)
np_d_spiky = np.vectorize(d_spiky)
np_d_spiky_32 = lambda x: np_d_spiky(x).astype(np.float32)
import tensorflow as tf
from tensorflow.python.framework import ops

def tf_d_spiky(x, name=None):
    with tf.name_scope(name, "d_spiky", [x]) as name:
        y = tf.py_func(np_d_spiky_32,
                       [x],
                       [tf.float32],
                       name=name,
                       stateful=False)
        return y[0]

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

def spikygrad(op, grad):
    x = op.inputs[0]
    n_gr = tf_d_spiky(x)
    return grad * n_gr

np_spiky_32 = lambda x: np_spiky(x).astype(np.float32)

def tf_spiky(x, name=None):
    with tf.name_scope(name, "spiky", [x]) as name:
        y = py_func(np_spiky_32,
                    [x],
                    [tf.float32],
                    name=name,
                    grad=spikygrad)  # <-- here's the call to the gradient
        return y[0]
Answer 1:
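The solution is in the post "Output from TensorFlow `py_func` has unknown rank/shape". The easiest fix is to add y[0].set_shape(x.get_shape()) before the return statement in the definition of tf_spiky. tf.py_func cannot infer a static shape for its output, so the shape has to be set by hand; because the activation is applied elementwise, the output shape equals the input shape.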
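As a minimal sketch of that fix, the block below repeats the question's definitions in condensed form and adds the single set_shape line. It assumes the TF1-style graph API reached through tf.compat.v1 (the alias tf1 is only a local name, not part of the question's code), and the placeholder shape [None, 512] is an arbitrary stand-in for the output of the first Dense layer.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF1-style graph API via the compat module


def spiky(x):
    r = x % 1
    return r if r <= 0.5 else 0.0


def d_spiky(x):
    return 1.0 if x % 1 <= 0.5 else 0.0


np_spiky = np.vectorize(spiky)
np_d_spiky = np.vectorize(d_spiky)


def np_spiky_32(x):
    return np_spiky(x).astype(np.float32)


def np_d_spiky_32(x):
    return np_d_spiky(x).astype(np.float32)


def tf_d_spiky(x, name=None):
    with tf1.name_scope(name, "d_spiky", [x]) as scope:
        y = tf1.py_func(np_d_spiky_32, [x], [tf.float32],
                        name=scope, stateful=False)
        return y[0]


def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Register the gradient under a unique name, then map PyFunc ops to it.
    rnd_name = "PyFuncGrad" + str(np.random.randint(0, 10**8))
    tf.RegisterGradient(rnd_name)(grad)
    g = tf1.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf1.py_func(func, inp, Tout, stateful=stateful, name=name)


def spikygrad(op, grad):
    return grad * tf_d_spiky(op.inputs[0])


def tf_spiky(x, name=None):
    with tf1.name_scope(name, "spiky", [x]) as scope:
        y = py_func(np_spiky_32, [x], [tf.float32],
                    name=scope, grad=spikygrad)
        # The fix: py_func returns a tensor of unknown rank, so copy the
        # static shape of the input onto the output before returning it.
        y[0].set_shape(x.get_shape())
        return y[0]


# Build a small graph to inspect the static shape of the result.
with tf.Graph().as_default():
    x = tf1.placeholder(tf.float32, shape=[None, 512])
    y = tf_spiky(x)

print(y.shape)  # the rank is now defined
```

With the shape set, a Dense layer placed after Dense(512, activation=tf_spiky) should no longer trip the rank check shown in the question's traceback.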
Perhaps someone out there knows how to work properly with TensorFlow shape functions. Digging around, I found an unchanged_shape shape function in tensorflow.python.framework.common_shapes, which would be appropriate here, but I don't know how to attach it to the tf_spiky function. It seems a Python decorator is in order here. It would probably be a service to others to explain how to customize TensorFlow functions with shape functions.
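One way the decorator idea could look is sketched below. It is a plain Python workaround that calls set_shape on the result, not a way of attaching a TensorFlow shape function such as unchanged_shape. The names preserves_shape, FakeTensor and shape_losing_op are hypothetical; FakeTensor is only a stand-in with the two methods the decorator uses, so the sketch runs without TensorFlow.

```python
import functools


def preserves_shape(op_fn):
    """Hypothetical helper: copy the first argument's static shape to the result."""
    @functools.wraps(op_fn)
    def wrapper(x, *args, **kwargs):
        y = op_fn(x, *args, **kwargs)
        # Same call as the manual fix: y.set_shape(x.get_shape()).
        y.set_shape(x.get_shape())
        return y
    return wrapper


class FakeTensor:
    """Stand-in exposing get_shape/set_shape; not a TensorFlow class."""

    def __init__(self, shape=None):
        self._shape = shape

    def get_shape(self):
        return self._shape

    def set_shape(self, shape):
        self._shape = shape


@preserves_shape
def shape_losing_op(x):
    # Mimics py_func: the result comes back without a static shape.
    return FakeTensor(shape=None)


out = shape_losing_op(FakeTensor(shape=(None, 512)))
print(out.get_shape())  # (None, 512)
```

Because tf.Tensor also provides get_shape and set_shape, the same decorator could presumably be placed on tf_spiky itself instead of adding the set_shape line by hand; that is only valid for elementwise functions whose output shape matches the input.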
Source: https://stackoverflow.com/questions/57470003/how-do-you-write-a-custom-activation-function-in-python-for-keras