Question
I am trying to implement Google's NIMA research paper, which rates image quality. I am using the TID2013 dataset. I have 3000 images, each with a score from 0.00 to 9.00.
df.head()
>>
     Image Name    Score
0  I01_01_1.bmp  5.51429
1  i01_01_2.bmp  5.56757
2  i01_01_3.bmp  4.94444
3  i01_01_4.bmp  4.37838
4  i01_01_5.bmp  3.86486
I found the following code for the loss function:
from keras import backend as K

def earth_mover_loss(y_true, y_pred):
    # Compare the cumulative distributions of the target and the prediction
    cdf_true = K.cumsum(y_true, axis=-1)
    cdf_pred = K.cumsum(y_pred, axis=-1)
    emd = K.sqrt(K.mean(K.square(cdf_true - cdf_pred), axis=-1))
    return K.mean(emd)
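and I wrote the model-building code as:
from keras.applications.inception_resnet_v2 import InceptionResNetV2, preprocess_input
from keras.layers import Dense, Dropout
from keras.models import Model
from keras.optimizers import Adam

# W and H (input width and height) are defined elsewhere
base_model = InceptionResNetV2(input_shape=(W, H, 3), include_top=False, pooling='avg', weights='imagenet')
for layer in base_model.layers:
    layer.trainable = False
x = Dropout(0.45)(base_model.output)
out = Dense(10, activation='softmax')(x)  # there are 10 classes
model = Model(base_model.input, out)
optimizer = Adam(lr=0.001)
model.compile(optimizer, loss=earth_mover_loss)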
Problem: when I use ImageDataGenerator as:
from keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(validation_split=0.15, preprocessing_function=preprocess_input)
train = gen.flow_from_dataframe(df, TRAIN_PATH, x_col='Image Name', y_col='Score', subset='training', class_mode='sparse')
val = gen.flow_from_dataframe(df, TRAIN_PATH, x_col='Image Name', y_col='Score', subset='validation', class_mode='sparse')
It either raises an error during training or produces a loss value of nan.
I have tried a few methods:
- Creating the scores as rounded = math.round(score) and using class_mode='sparse'
- Creating the scores as str(rounded) and then using class_mode='categorical'
but I get an error every time.
Please help me with loading the images using ImageDataGenerator: how am I supposed to feed the images into this model?
The model structure should not change.
Answer 1:
Following what was introduced here, I have a couple of ideas about the NaN gradient...
I think your loss is nan because of the sqrt: it is undefined for negative inputs, and its gradient is unbounded when its input is exactly 0 (which happens whenever the two CDFs coincide). So there are two possibilities:
- Clip the values before applying the sqrt. This way every value below K.epsilon() (in particular anything <= 0) is replaced with a small epsilon:
def earth_mover_loss(y_true, y_pred):
    cdf_true = K.clip(K.cumsum(y_true, axis=-1), 0, 1)
    cdf_pred = K.clip(K.cumsum(y_pred, axis=-1), 0, 1)
    # Keep the sqrt argument at or above epsilon
    emd = K.sqrt(K.maximum(K.mean(K.square(cdf_true - cdf_pred), axis=-1), K.epsilon()))
    return K.mean(emd)
- Drop the sqrt. The Earth Mover loss then becomes closer to an MSE between the CDFs:
def earth_mover_loss(y_true, y_pred):
    cdf_true = K.clip(K.cumsum(y_true, axis=-1), 0, 1)
    cdf_pred = K.clip(K.cumsum(y_pred, axis=-1), 0, 1)
    # Mean squared difference between the CDFs, without the sqrt
    emd = K.mean(K.square(cdf_true - cdf_pred), axis=-1)
    return K.mean(emd)
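To sanity-check the two variants without Keras, here is a minimal pure-Python sketch of the same arithmetic for a single pair of distributions. The helper name emd_single, the use_sqrt flag, and the two one-hot example vectors are made up for illustration. EPS = 1e-7 is meant to mirror Keras' default K.epsilon().

```python
import math
from itertools import accumulate

EPS = 1e-7  # assumed stand-in for K.epsilon() (Keras' default fuzz factor)


def emd_single(p_true, p_pred, use_sqrt=True):
    """EMD-style loss for one pair of distributions over the same ordered bins."""
    # Cumulative sums, clipped to [0, 1] like K.clip(K.cumsum(...), 0, 1)
    cdf_true = [min(max(c, 0.0), 1.0) for c in accumulate(p_true)]
    cdf_pred = [min(max(c, 0.0), 1.0) for c in accumulate(p_pred)]
    mean_sq = sum((t - p) ** 2 for t, p in zip(cdf_true, cdf_pred)) / len(cdf_true)
    if not use_sqrt:
        return mean_sq  # second variant: MSE between the CDFs
    return math.sqrt(max(mean_sq, EPS))  # first variant: clamp, then sqrt


# Hypothetical targets: all probability mass on bin 3 versus bin 5
one_hot_3 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
one_hot_5 = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

print(emd_single(one_hot_3, one_hot_5))                  # sqrt(0.2)
print(emd_single(one_hot_3, one_hot_5, use_sqrt=False))  # 0.2
print(emd_single(one_hot_3, one_hot_3))                  # sqrt(EPS), not 0
print(emd_single(one_hot_3, one_hot_3, use_sqrt=False))  # 0.0
```

When the two distributions are identical, the clamped variant bottoms out at sqrt(EPS) instead of 0, which keeps the sqrt away from the point where its derivative is unbounded. The variant without the sqrt reaches exactly 0.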
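On the input side: both versions of the loss take a cumsum over the last axis, so y_true is expected to be a 10-bin distribution with the same shape as the softmax output, not a single decimal score. One possible way to get there is sketched below. It is an assumption on my part, not something taken from the question or the NIMA paper: each score in [0, 9] is split between its two neighbouring integer bins so that the expected value of the distribution equals the score. The names score_to_distribution and expected_score are hypothetical.

```python
import math

NUM_BINS = 10  # matches Dense(10, activation='softmax'); bin i stands for score i


def score_to_distribution(score, num_bins=NUM_BINS):
    """Split the probability mass between the two integer bins around `score`."""
    if not 0 <= score <= num_bins - 1:
        raise ValueError(f"score {score} is outside [0, {num_bins - 1}]")
    lower = int(math.floor(score))
    upper = min(lower + 1, num_bins - 1)  # stay inside the last bin when score == 9
    frac = score - lower
    dist = [0.0] * num_bins
    dist[lower] += 1.0 - frac
    dist[upper] += frac
    return dist


def expected_score(dist):
    """Mean of the distribution, used to check that the score is preserved."""
    return sum(i * p for i, p in enumerate(dist))


target = score_to_distribution(5.51429)  # first score from df.head()
print(target)
print(expected_score(target))
```

The ten values could then be stored as ten DataFrame columns and passed to flow_from_dataframe as a list in y_col with class_mode='raw'. That option is available in recent Keras versions, so check it against the version you have installed.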
Source: https://stackoverflow.com/questions/61672258/what-should-be-the-input-types-for-earth-mover-loss-when-images-are-rated-in-dec