Very low GPU usage during training in Tensorflow

Submitted by 六眼飞鱼酱① on 2019-12-03 12:56:44

MNIST-size networks are tiny and it's hard to achieve high GPU (or CPU) efficiency for them; I think 30% is not unusual for your application. You will get higher computational efficiency with a larger batch size, meaning you can process more examples per second, but you will also get lower statistical efficiency, meaning you need to process more examples in total to reach the target accuracy. So it's a trade-off. For tiny character models like yours, statistical efficiency drops off very quickly after a batch size of about 100, so it's probably not worth trying to grow the batch size for training. For inference, you should use the largest batch size you can.
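To make that trade-off concrete, here is a minimal sketch using tf.keras on MNIST, assuming a small dense classifier; the layer sizes and batch values are illustrative choices, not taken from the answer above. It trains with a batch size of around 100 and then predicts with a much larger batch, since larger batches only help throughput at inference time.

```python
import tensorflow as tf

# Load MNIST and flatten the images into 784-dimensional vectors.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# A tiny dense network, roughly the scale discussed above.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training: a batch size around 100 keeps statistical efficiency reasonable.
model.fit(x_train, y_train, batch_size=100, epochs=5)

# Inference: use the largest batch that fits in GPU memory
# (4096 here is an arbitrary example value).
predictions = model.predict(x_test, batch_size=4096)
```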

On my NVIDIA GTX 1080, if I use a convolutional neural network on the MNIST dataset, the GPU load is ~68%.

If I switch to a simple, non-convolutional network, then the GPU load is ~20%.

You can replicate these results by building successively more advanced models in the tutorial Building Autoencoders in Keras by François Chollet.
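To run a comparison like this yourself, you can train a simple dense model and a convolutional model back to back while watching GPU utilization with nvidia-smi. The sketch below is only loosely modeled on the tutorial's autoencoders; the exact architectures and hyperparameters are illustrative assumptions, not the tutorial's code.

```python
import tensorflow as tf

# MNIST images as 28x28 floats in [0, 1]; autoencoders reconstruct their input.
(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0

# Simple, non-convolutional autoencoder (expect low GPU utilization).
dense_ae = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(784, activation="sigmoid"),
    tf.keras.layers.Reshape((28, 28)),
])
dense_ae.compile(optimizer="adam", loss="binary_crossentropy")
dense_ae.fit(x_train, x_train, batch_size=256, epochs=2)

# Convolutional autoencoder (expect noticeably higher GPU utilization).
conv_ae = tf.keras.Sequential([
    tf.keras.layers.Reshape((28, 28, 1), input_shape=(28, 28)),
    tf.keras.layers.Conv2D(16, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2, padding="same"),
    tf.keras.layers.Conv2D(8, 3, activation="relu", padding="same"),
    tf.keras.layers.UpSampling2D(2),
    tf.keras.layers.Conv2D(1, 3, activation="sigmoid", padding="same"),
    tf.keras.layers.Reshape((28, 28)),
])
conv_ae.compile(optimizer="adam", loss="binary_crossentropy")
conv_ae.fit(x_train, x_train, batch_size=256, epochs=2)
```

The exact utilization numbers will depend on your GPU and input pipeline, but the convolutional model should keep the GPU noticeably busier than the dense one.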
