Tensorflow 2.0 doesn't compute the gradient

问题

I want to visualize the patterns that a given feature map in a CNN has learned (in this example I'm using vgg16). To do so I create a random image, feed through the network up to the desired convolutional layer, choose the feature map and find the gradients with the respect to the input. The idea is to change the input in such a way that will maximize the activation of the desired feature map. Using tensorflow 2.0 I have a GradientTape that follows the function and then computes the gradient, however the gradient returns None, why is it unable to compute the gradient?

import tensorflow as tf
import matplotlib.pyplot as plt
import time
import numpy as np
from tensorflow.keras.applications import vgg16

class maxFeatureMap():

    def __init__(self, model):

        self.model = model
        self.optimizer = tf.keras.optimizers.Adam()

    def getNumLayers(self, layer_name):

        for layer in self.model.layers:
            if layer.name == layer_name:
                weights = layer.get_weights()
                num = weights[1].shape[0]
        return ("There are {} feature maps in {}".format(num, layer_name))

    def getGradient(self, layer, feature_map):

        pic = vgg16.preprocess_input(np.random.uniform(size=(1,96,96,3))) ## Creates values between 0 and 1
        pic = tf.convert_to_tensor(pic)

        model = tf.keras.Model(inputs=self.model.inputs, 
                               outputs=self.model.layers[layer].output)
        with tf.GradientTape() as tape:
            ## predicts the output of the model and only chooses the feature_map indicated
            predictions = model.predict(pic, steps=1)[0][:,:,feature_map]
            loss = tf.reduce_mean(predictions)
        print(loss)
        gradients = tape.gradient(loss, pic[0])
        print(gradients)
        self.optimizer.apply_gradients(zip(gradients, pic))

model = vgg16.VGG16(weights='imagenet', include_top=False)


x = maxFeatureMap(model)
x.getGradient(1, 24)

回答1:

This is a common pitfall with GradientTape; the tape only traces tensors that are set to be "watched" and by default tapes will watch only trainable variables (meaning tf.Variable objects created with trainable=True). To watch the pic tensor, you should add tape.watch(pic) as the very first line inside the tape context.

Also, I'm not sure if the indexing (pic[0]) will work, so you might want to remove that -- since pic has just one entry in the first dimension it shouldn't matter anyway.

Furthermore, you cannot use model.predict because this returns a numpy array, which basically "destroys" the computation graph chain so gradients won't be backpropagated. You should simply use the model as a callable, i.e. predictions = model(pic).

来源：https://stackoverflow.com/questions/56916313/tensorflow-2-0-doesnt-compute-the-gradient

标签

python

tensorflow

conv-neural-network

gradient-descent