Question
I just saved a model with this code:
def train():
    with tf.Session() as sess:
        saver = tf.train.Saver(max_to_keep=2)
        Loss = myYoloLoss([Scale1, Scale2, Scale3], [Y1, Y2, Y3])
        opt = tf.train.AdamOptimizer(2e-4).minimize(Loss)
        init = tf.global_variables_initializer()
        sess.run(init)
        imageNum = 0
        Num = 0
        while True:
            # get batch input
            batchImg, batchScale1, batchScale2, batchScale3 = getBatchImage(batchSize=BATCHSIZE)
            for epoch in range(75):
                _, epochloss = sess.run([opt, Loss], feed_dict={X: batchImg, Y1: batchScale1, Y2: batchScale2, Y3: batchScale3})
                if epoch % 15 == 0:
                    print(epochloss)
            imageNum = imageNum + BATCHSIZE
            Num = Num + 1
            if Num % 4 == 0:
                saver.save(sess, MODELPATH + 'MyModle__' + str(imageNum))
            if os.path.exists(STOPFLAGPATH):
                saver.save(sess, MODELPATH + 'MyModle__Stop_' + str(imageNum))
                print('checked stopfile, stop')
                break
    return 0
And then I got these files:
MyModle__Stop_288.index
MyModle__Stop_288.meta
MyModle__Stop_288.data-00000-of-00001
checkpoint
Then I tried to continue training this model:
def reTrain():
    with tf.Session() as sess:
        loder = tf.train.import_meta_graph('E:/MyYoloModel/MyModle__Stop_288.meta')
        loder.restore(sess, tf.train.latest_checkpoint('E:/MyYoloModel/'))
        graph = tf.get_default_graph()
        X = graph.get_tensor_by_name("X:0")
        Y1 = graph.get_tensor_by_name("Y1:0")
        Y2 = graph.get_tensor_by_name("Y2:0")
        Y3 = graph.get_tensor_by_name("Y3:0")
        Scale1 = graph.get_tensor_by_name("Scale1:0")
        Scale2 = graph.get_tensor_by_name("Scale2:0")
        Scale3 = graph.get_tensor_by_name("Scale3:0")
        Loss = myYoloLoss([Scale1, Scale2, Scale3], [Y1, Y2, Y3])
        # error occurs on the next line
        opt = tf.train.AdamOptimizer(2e-4).minimize(Loss)
        init = tf.global_variables_initializer()
        sess.run(init)
        batchImg, batchScale1, batchScale2, batchScale3 = getBatchImage(batchSize=BATCHSIZE)
        for epoch in range(10):
            _, epochloss = sess.run([opt, Loss], feed_dict={X: batchImg, Y1: batchScale1, Y2: batchScale2, Y3: batchScale3})
            print(epochloss)
And this error occurs:
ValueError: Duplicate node name in graph: 'conv2d_0/kernel/Adam'
How can I fix it?
Answer 1:
The reason is that AdamOptimizer creates additional variables and operations in your graph. When you save your model, those operations are saved with the graph, and they are loaded again when you restore the model. If you run
graph.get_operations()
you can see the list of operations loaded with your model; among them you will find operations whose names contain /Adam or train/Adam. When you try to fine-tune or reuse your model, the new AdamOptimizer tries to create those operations again, hence it raises the "Duplicate node name" error. One way to fix the issue is to give a name to your new AdamOptimizer:
opt = tf.train.AdamOptimizer(2e-4, name='MyNewAdam').minimize(Loss)
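As a quick way to check which optimizer operations were restored with the graph, you can filter the names returned by graph.get_operations(). A minimal sketch (pure string filtering; the helper name adam_op_names is mine, not from the answer):

```python
def adam_op_names(op_names):
    """Return the operation names that belong to Adam's slot variables."""
    return [name for name in op_names if '/Adam' in name]

# With a real TF graph this would be called as:
#   adam_op_names(op.name for op in graph.get_operations())
names = ['conv2d_0/kernel', 'conv2d_0/kernel/Adam',
         'conv2d_0/kernel/Adam_1', 'conv2d_0/bias']
print(adam_op_names(names))  # ['conv2d_0/kernel/Adam', 'conv2d_0/kernel/Adam_1']
```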
However, we are not done yet. Since you want to reuse the restored weights, you cannot run the global variable initializer (it would overwrite them). But if you skip initialization entirely, you will get an uninitialized-variable error when you run training, because the new AdamOptimizer's variables have not been initialized yet. To get around this, initialize only those new variables:
uninitialized_vars = []
for var in tf.global_variables():
    try:
        sess.run(var)
    except tf.errors.FailedPreconditionError:
        uninitialized_vars.append(var)
sess.run(tf.variables_initializer(uninitialized_vars))
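The loop above can be packaged as a small reusable helper; a minimal sketch (the function name and its error-class parameter are mine, not from the answer; with TensorFlow 1.x you would pass tf.errors.FailedPreconditionError):

```python
def get_uninitialized_vars(sess, variables, error_cls):
    """Return the subset of `variables` the session cannot evaluate yet."""
    uninitialized = []
    for var in variables:
        try:
            sess.run(var)  # fails for variables with no value assigned yet
        except error_cls:
            uninitialized.append(var)
    return uninitialized

# With TensorFlow 1.x this would be used as:
#   new_vars = get_uninitialized_vars(sess, tf.global_variables(),
#                                     tf.errors.FailedPreconditionError)
#   sess.run(tf.variables_initializer(new_vars))
```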
Note: the leftover optimizer nodes are never executed, so they won't affect training time.
Source: https://stackoverflow.com/questions/53172215/duplicate-node-name-in-graph-conv2d-0-kernel-adam