When I use the following custom metric (keras-style):
from sklearn.metrics import classification_report, f1_score
from ten
then using @tf.function(experimental_relax_shapes=True) will probably solve your problem (in newer TensorFlow releases the same option is named reduce_retracing=True).
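A minimal sketch of what that decorator option does, assuming the problem is repeated retracing when input shapes change between calls. The names `relaxed_function`, `batch_mean`, and `trace_log` are hypothetical and only for illustration; `batch_mean` stands in for the custom metric, which is not shown in full above.

```python
import tensorflow as tf


def relaxed_function(fn):
    """Wrap fn in tf.function with shape relaxation turned on."""
    try:
        # Newer TensorFlow releases name the option reduce_retracing.
        return tf.function(fn, reduce_retracing=True)
    except TypeError:
        # Older releases only accept experimental_relax_shapes.
        return tf.function(fn, experimental_relax_shapes=True)


trace_log = []  # filled only while TensorFlow traces the Python function


def batch_mean(x):
    # Hypothetical stand-in for a metric; not the metric from the question.
    trace_log.append(x.shape)  # Python side effect, runs once per trace
    return tf.reduce_mean(x)


relaxed_mean = relaxed_function(batch_mean)

# Call with several batch sizes; with relaxation enabled, TensorFlow may
# reuse one graph with an unknown batch dimension instead of retracing.
for n in (2, 3, 4, 5):
    relaxed_mean(tf.ones([n, 3]))

print("number of traces:", len(trace_log))
```

If the trace count stays well below the number of distinct input shapes, the relaxation is taking effect. If it does not help, the cause of the problem is probably something other than shape-driven retracing.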