OOM when allocating tensor

How do I solve the problem of ResourceExhaustedError: OOM when allocating tensor?

ResourceExhaustedError (see above for traceback): OOM when allocati

1 answer

遥遥无期

2020-12-22 13:17

The process ran out of memory (OOM) because you pushed the whole test set through the network for evaluation at once (see this question). It's easy to see that 10000 * 32 * 28 * 28 * 4 bytes is almost 1 GB, while your GPU has only 1.66 GB available in total, and most of that is already taken by the network itself.
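To make the arithmetic concrete, here is a small sketch of the estimate. The meaning of each factor is an assumption on my part (10000 test images, 32 feature maps, 28 x 28 pixels, 4 bytes per float32 value); the answer only gives the numbers. The batch size of 100 is purely illustrative.

```python
# Rough size of one float32 tensor when the whole test set is fed at once.
# Assumed meaning of the factors: images x feature maps x height x width x bytes.
num_images = 10000
feature_maps = 32
height, width = 28, 28
bytes_per_float32 = 4

tensor_bytes = num_images * feature_maps * height * width * bytes_per_float32
print(tensor_bytes)          # 1003520000 bytes
print(tensor_bytes / 1e9)    # roughly 1.0 GB

# The same tensor for a hypothetical batch of 100 images is 100x smaller.
batch_size = 100
batch_bytes = batch_size * feature_maps * height * width * bytes_per_float32
print(batch_bytes)           # 10035200 bytes, roughly 10 MB
```

A single tensor of this size already uses more than half of the 1.66 GB mentioned above, before counting the weights and the other activations.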

The solution is to feed the neural network in batches not only for training, but for testing as well. The overall accuracy is then the average across all batches (weighted by batch size if the last batch is smaller than the rest). Moreover, you don't need to do this after every epoch: are you really interested in the test results of all the intermediate networks?
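Batched evaluation could look roughly like the sketch below. It is framework-agnostic: `evaluate_in_batches`, `predict_batch` and `toy_predict` are hypothetical names, and in a real TensorFlow program `predict_batch` would presumably wrap whatever call runs your network on one batch. Counting correct predictions and dividing by the total gives the same result as a batch-size-weighted average of per-batch accuracies.

```python
def evaluate_in_batches(predict_batch, inputs, labels, batch_size=100):
    """Return accuracy over the whole set, feeding batch_size examples at a time."""
    if len(inputs) != len(labels):
        raise ValueError("inputs and labels must have the same length")
    if len(inputs) == 0:
        raise ValueError("cannot evaluate an empty set")

    correct = 0
    for start in range(0, len(inputs), batch_size):
        batch_inputs = inputs[start:start + batch_size]
        batch_labels = labels[start:start + batch_size]
        # Only one batch is passed to the model at a time.
        predictions = predict_batch(batch_inputs)
        correct += sum(int(p == y) for p, y in zip(predictions, batch_labels))
    return correct / len(inputs)


# Stand-in "model" for illustration only: predicts the parity of each input.
def toy_predict(batch):
    return [x % 2 for x in batch]


inputs = list(range(10))
labels = [0, 1, 0, 1, 0, 1, 0, 1, 1, 1]  # the label at index 8 disagrees with the toy model
accuracy = evaluate_in_batches(toy_predict, inputs, labels, batch_size=4)
print(accuracy)  # 0.9
```

The last batch here holds only 2 examples, which is why the sketch sums correct predictions instead of averaging the per-batch accuracies directly.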

Your second error message is very likely a consequence of the earlier failures, since the cuDNN driver no longer seems to work. I'd suggest restarting your machine.
