What should be the Input types for Earth Mover Loss when images are rated in decimals from 0 to 9 (Keras, Tensorflow)

Question

I am trying to implement Google's NIMA research paper, which rates image quality. I am using the TID2013 dataset: 3000 images, each with a score from 0.00 to 9.00.

df.head()
>>
     Image Name    Score
0  I01_01_1.bmp  5.51429
1  i01_01_2.bmp  5.56757
2  i01_01_3.bmp  4.94444
3  i01_01_4.bmp  4.37838
4  i01_01_5.bmp  3.86486

I found the following code for the loss function:

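from tensorflow.keras import backend as K

def earth_mover_loss(y_true, y_pred):
    # Cumulative distributions over the score bins (last axis)
    cdf_true = K.cumsum(y_true, axis=-1)
    cdf_pred = K.cumsum(y_pred, axis=-1)
    # Root-mean-square difference between the CDFs, per sample
    emd = K.sqrt(K.mean(K.square(cdf_true - cdf_pred), axis=-1))
    return K.mean(emd)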

and I wrote the code for model building as:

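from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Frozen ImageNet backbone with global average pooling
base_model = InceptionResNetV2(input_shape=(W, H, 3), include_top=False, pooling='avg', weights='imagenet')
for layer in base_model.layers:
    layer.trainable = False

x = Dropout(0.45)(base_model.output)
out = Dense(10, activation='softmax')(x)  # there are 10 classes

model = Model(base_model.input, out)
optimizer = Adam(learning_rate=0.001)
model.compile(optimizer, loss=earth_mover_loss)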

PROBLEM: When I use ImageDataGenerator as:

gen=ImageDataGenerator(validation_split=0.15,preprocessing_function=preprocess_input)

train = gen.flow_from_dataframe(df,TRAIN_PATH,x_col='Image Name',y_col='Score',subset='training',class_mode='sparse')

val = gen.flow_from_dataframe(df,TRAIN_PATH,x_col='Image Name',y_col='Score',subset='validation',class_mode='sparse')

training either raises an error or produces a loss value of NaN.

I have tried a few methods:

Rounding the scores, rounded = round(score), and using class_mode='sparse'
Converting the rounded scores to strings, str(rounded), and using class_mode='categorical'

but I get an error every time.

Please help me load the images with ImageDataGenerator: how am I supposed to feed the images and their scores into this model?

The model structure should not change.

Answer 1:

Following what was introduced here, I have a couple of ideas about the NaN gradient...

I think your loss is NaN because of the sqrt. Its argument is a mean of squares, so it should not go negative, but at or near zero the gradient of sqrt blows up. So there are two possibilities:

Clamp the value before applying the sqrt. K.maximum replaces every value below K.epsilon() with that small epsilon:

def earth_mover_loss(y_true, y_pred):
    cdf_true = K.clip(K.cumsum(y_true, axis=-1), 0,1)
    cdf_pred = K.clip(K.cumsum(y_pred, axis=-1), 0,1)
    emd = K.sqrt(K.maximum(K.mean(K.square(cdf_true - cdf_pred), axis=-1), K.epsilon()))
    return K.mean(emd)
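
One way to see what the clamp changes: the value passed to sqrt is exactly zero whenever the two CDFs coincide, and the derivative of sqrt(x), 1 / (2 * sqrt(x)), is unbounded there. The pure-Python sketch below mirrors the per-sample computation without Keras. The helper names emd_inner and sqrt_grad are illustrative assumptions, not library functions, and EPS is set to what I believe is the default K.epsilon().

```python
import math
from itertools import accumulate

EPS = 1e-7  # assumed: the default value returned by K.epsilon()

def emd_inner(y_true, y_pred):
    """Mean squared difference between the two CDFs (the value passed to sqrt)."""
    cdf_true = list(accumulate(y_true))
    cdf_pred = list(accumulate(y_pred))
    return sum((t - p) ** 2 for t, p in zip(cdf_true, cdf_pred)) / len(cdf_true)

def sqrt_grad(x):
    """Derivative of sqrt(x); infinite at x == 0."""
    return math.inf if x == 0 else 1.0 / (2.0 * math.sqrt(x))

# A prediction that matches the target exactly makes the inner value 0.
dist = [0.0, 0.0, 0.5, 0.5, 0.0]
inner = emd_inner(dist, dist)

unclamped = sqrt_grad(inner)           # infinite
clamped = sqrt_grad(max(inner, EPS))   # large but finite
print(unclamped, clamped)
```

In an autodiff framework that infinite factor typically gets multiplied by a zero gradient from the squared term, which is how a NaN can appear even though the loss value itself is a harmless 0.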

Drop the sqrt. The Earth Mover Loss then becomes an MSE between the CDFs:

def earth_mover_loss(y_true, y_pred):
    cdf_true = K.clip(K.cumsum(y_true, axis=-1), 0,1)
    cdf_pred = K.clip(K.cumsum(y_pred, axis=-1), 0,1)
    emd = K.mean(K.square(cdf_true - cdf_pred), axis=-1)
    return K.mean(emd)
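
Either version takes a cumulative sum over the last axis of y_true, so the targets have to be 10-bin probability vectors with the same shape as the softmax output rather than scalar scores. The sketch below shows one possible encoding. It assumes (this is my assumption, not something stated in the question) that each score in [0, 9] is split linearly between its two neighbouring integer bins, so that the mean bin index equals the score. score_to_distribution is a hypothetical helper name.

```python
import math

NUM_BINS = 10  # matches Dense(10, activation='softmax'); bins stand for scores 0..9

def score_to_distribution(score, num_bins=NUM_BINS):
    """Spread a decimal score over its two neighbouring integer bins.

    Assumption: linear interpolation, so the mean bin index equals the score.
    """
    score = min(max(float(score), 0.0), num_bins - 1)  # keep inside [0, 9]
    low = int(math.floor(score))
    high = min(low + 1, num_bins - 1)
    frac = score - low
    dist = [0.0] * num_bins
    dist[low] += 1.0 - frac
    dist[high] += frac
    return dist

target = score_to_distribution(5.51429)  # first score shown in df.head()
mean_score = sum(i * p for i, p in enumerate(target))
print(target, mean_score)
```

Vectors like these could be stored as ten extra columns of the dataframe and passed to flow_from_dataframe as a list in y_col with class_mode='raw', which should yield targets of shape (batch, 10) without touching the model structure. I have not run that wiring against this dataset, so treat it as a starting point.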

Source: https://stackoverflow.com/questions/61672258/what-should-be-the-input-types-for-earth-mover-loss-when-images-are-rated-in-dec

Tags

python

tensorflow

keras

deep-learning

image-recognition