Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled

Submitted by 非 Y 不嫁゛ on 2020-03-19 05:59:52

Question


When I run a Kubeflow pipeline whose code uses TensorFlow 2.0, the warning below is displayed at the end of each epoch:

W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled

Also, after some epochs, it stops showing logs and the step fails with this error:

This step is in Failed state with this message: The node was low on resource: memory. Container main was using 100213872Ki, which exceeds its request of 0. Container wait was using 25056Ki, which exceeds its request of 0.


Answer 1:


In my case, batch_size and steps_per_epoch did not match each other.

For example,

his = Test_model.fit_generator(
    datagen.flow(trainrancrop_images, trainrancrop_labels, batch_size=batchsize),
    steps_per_epoch=len(trainrancrop_images)/batchsize,
    validation_data=(test_images, test_labels),
    epochs=1,
    callbacks=[callback])

The batch_size passed to datagen.flow has to be consistent with the steps_per_epoch passed to Test_model.fit_generator. In my code, the value I had used for steps_per_epoch was wrong.

I guess this is only one of the cases that can trigger the error.

So I think the problem arises when the batch size and the number of steps (iterations) per epoch do not correspond to each other.

A float may also be a problem when you get the number of steps by dividing, as in len(trainrancrop_images)/batchsize.

Check your code for this issue.
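To make the point about division concrete, here is a minimal sketch of one way to get an integer step count that matches the batch size. The helper name and the sample numbers are made up for illustration and are not from my original code. I am assuming the generator yields a smaller final batch when the sample count is not divisible by the batch size, which is why this rounds up; if your generator drops the last partial batch, floor division would be the matching choice.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Return the number of batches needed to cover every sample once.

    Hypothetical helper: rounds up so a final, smaller batch is counted,
    and always returns an int rather than a float.
    """
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    return math.ceil(num_samples / batch_size)


# Illustrative numbers only (assumed, not taken from the question).
num_samples = 1000
batch_size = 32

steps = steps_per_epoch(num_samples, batch_size)
print(steps)                      # integer step count
print(num_samples / batch_size)   # plain division gives a float instead
```

The result would then be passed as steps_per_epoch in place of the plain division. As far as I know, the iterator returned by datagen.flow also supports len(), so taking the length of that iterator may be an alternative way to get the same count.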

Good luck :)




Answer 2:


In my case, I installed tf-nightly and now it's working, though I am new to TensorFlow. I followed this link

You can try it.




Answer 3:


I have the same problem. People have claimed that the warning is superfluous and that it has been removed in tf-nightly, see here. But the memory leak is still there after each epoch.




Answer 4:


This was due to incompatible CUDA and TensorFlow versions. The versions below work well with each other:

tensorflow-gpu==2.0.0

tensorflow-addons==0.6.0

nvidia/cuda:10.0-cudnn7-runtime



Source: https://stackoverflow.com/questions/60000573/error-occurred-when-finalizing-generatordataset-iterator-cancelled-operation-w