Distributed TensorFlow: Check failed: size >= 0

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-07 05:11:09

Question


I'm using Keras 2.0.6 with TensorFlow 1.3.0.

My code runs with the Theano backend, but fails with the TensorFlow backend:

F tensorflow/core/framework/tensor_shape.cc:241] Check failed: size >= 0 (-14428307456 vs. 0)

I was wondering whether anyone can think of a possible cause for this.

Thank you!

----UPDATE-----

I tested exactly the same code on my PC with TensorFlow, and it runs perfectly.

However, it throws this error when I run it on a supercomputer.

Although the error looks like an overflow, I don't see how the same code could overflow on a supercomputer but not on my PC.

I suspect that it comes from a bug in TensorFlow's distributed computation.


Answer 1:


I ran into the same error, but it ran fine after I reduced the batch size.

I think the cause is that it was running out of GPU memory.
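
As a rough illustration of the batch-size workaround in this answer, here is a minimal sketch that estimates how many bytes one input batch needs and halves the batch size until it fits a memory budget. The function names, the 4-byte float32 element size, and the budget value are assumptions made for illustration, not part of Keras or TensorFlow. It also counts only the input tensor, so real GPU usage (weights, activations, gradients) will be higher.

```python
def batch_bytes(batch_size, sample_shape, bytes_per_element=4):
    """Approximate bytes for one batch of inputs (assumes float32 by default)."""
    n = batch_size
    for dim in sample_shape:
        if dim < 0:
            # A negative dimension is the kind of value a "size >= 0" check rejects.
            raise ValueError("negative dimension: %d" % dim)
        n *= dim
    return n * bytes_per_element


def largest_fitting_batch(start_batch_size, sample_shape, budget_bytes,
                          bytes_per_element=4):
    """Halve the batch size until the input batch fits the (hypothetical) budget.

    Returns 1 if even a single sample does not fit, so callers should
    still check the result against the budget.
    """
    batch_size = start_batch_size
    while (batch_size > 1 and
           batch_bytes(batch_size, sample_shape, bytes_per_element) > budget_bytes):
        batch_size //= 2
    return batch_size


# Hypothetical numbers: 224x224 RGB images and a 64 MiB budget for the input batch.
chosen = largest_fitting_batch(256, (224, 224, 3), 64 * 1024 ** 2)
print(chosen)
```

The chosen value would then be passed as batch_size to model.fit in Keras. If the error persists even with a small batch, GPU memory is probably not the only cause.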



Source: https://stackoverflow.com/questions/45423134/distributed-tensorflow-check-failed-size-0
