Question
I have started using TensorFlow 2.0 and have a little uncertainty about one aspect.
Suppose I have this use case: while ingesting data with tf.data.Dataset, I want to apply some specific augmentation operations to some of the images. However, the external libraries I am using require the image to be a numpy array, not a tensor.
When using tf.data.Dataset.from_tensor_slices(), the data flowing through the pipeline is of type Tensor. Concrete example:
import tensorflow as tf

def my_function(tensor_image):
    print(tensor_image.numpy())  # fails inside map(): the argument is a symbolic Tensor
    return tensor_image

data = tf.data.Dataset.from_tensor_slices(tensor_images).map(my_function)
The code above does not work, yielding a 'Tensor' object has no attribute 'numpy' error.
I have read in the TensorFlow 2.0 documentation that if one wants to use arbitrary Python logic, one should either use tf.py_function or stick to TensorFlow primitives only, according to:
How to convert "tensor" to "numpy" array in tensorflow?
My question is the following: is there another way to use arbitrary Python code inside such a function, for example with a custom decorator, or an easier way than tf.py_function?
Honestly, it seems to me that there must be a more elegant way than passing the function to tf.py_function, converting to a numpy array, performing operations A, B, C, D, and then converting back to a tensor and yielding the result.
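For concreteness, a minimal sketch of that tf.py_function route; the flip is a purely illustrative stand-in for the external-library operations A, B, C, D, and tensor_images is the same array of images as above:

import tensorflow as tf
import numpy as np

def augment_np(image):
    # Runs eagerly, so .numpy() and any Python library are available here.
    arr = image.numpy()
    arr = np.fliplr(arr)  # stand-in for the external augmentation operations
    return arr.astype(np.float32)

def augment(image):
    # Wrap the Python-only logic so it can be called from the graph-mode map().
    return tf.py_function(augment_np, inp=[image], Tout=tf.float32)

data = tf.data.Dataset.from_tensor_slices(tensor_images).map(augment)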
Answer 1:
There is no other way of doing it, because tf.data.Dataset objects are still (and, I suppose, always will be, for performance reasons) executed in graph mode, and thus you cannot use anything other than tf.* methods, which can be easily converted by TensorFlow to their graph representation.
Using tf.py_function is the only way to mix Python execution (and thus any Python library) with graph execution when using a tf.data.Dataset object (contrary to what happens elsewhere in TensorFlow 2.0, which, being eager by default, allows this mixed execution naturally).
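A related practical point: tensors returned by tf.py_function lose their static shape, so it is common to restore it with set_shape after the call. A minimal sketch, assuming 28x28x3 float images purely for illustration:

import tensorflow as tf

def augment(image):
    out = tf.py_function(lambda x: x.numpy()[::-1], inp=[image], Tout=tf.float32)
    # tf.py_function erases static shape information; restore it for downstream code.
    out.set_shape([28, 28, 3])  # illustrative shape, adjust to the real data
    return out

dataset = tf.data.Dataset.from_tensor_slices(tensor_images).map(augment)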
Source: https://stackoverflow.com/questions/59497372/is-there-an-alternative-to-tf-py-function-for-custom-python-code