OOM when allocating tensor

轻奢々 2020-12-22 12:50

How do I solve the problem of ResourceExhaustedError: OOM when allocating tensor?

ResourceExhaustedError (see above for traceback): OOM when allocati

1 answer
  • 2020-12-22 13:17

    The process ran out of memory (OOM) because you pushed the whole test set through the network for evaluation at once (see this question). It's easy to see that 10000 * 32 * 28 * 28 * 4 bytes is almost 1 GB, while your GPU has only 1.66 GB available in total, and most of that is already taken by the network itself.
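    To make the arithmetic above concrete, here is a minimal sketch of the size estimate. The helper name `tensor_bytes` is made up for illustration, and reading the factors as 10000 test images, 32 feature maps, 28 x 28 values, and 4 bytes per float32 is an assumption about what the numbers stand for.

```python
from math import prod

def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory needed for a dense tensor of the given shape.

    bytes_per_element defaults to 4, the size of one float32 value.
    """
    return prod(shape) * bytes_per_element

# Assumed meaning of the factors: images x feature maps x height x width.
full_test_set = tensor_bytes((10000, 32, 28, 28))
one_batch = tensor_bytes((100, 32, 28, 28))

print(full_test_set)          # 1003520000 bytes
print(full_test_set / 1e9)    # roughly 1.0 GB
print(one_batch / 1e6)        # roughly 10 MB for a batch of 100
```

    Under the same assumptions, a batch of 100 images needs about 10 MB for this one tensor instead of about 1 GB.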

    The solution is to feed the neural network in batches not only for training, but for testing as well. The overall test accuracy is then the average across all batches (weighted by batch size if the last batch is smaller than the rest). Moreover, you don't need to do this after every epoch: are you really interested in the test results of all the intermediate networks?
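    Batched evaluation as described above could look roughly like the sketch below. It is framework-agnostic on purpose: `batched_accuracy`, `predict_fn`, and `toy_predict` are hypothetical names, and `predict_fn` stands in for whatever call runs your model on one batch and returns predicted class labels. Counting correct predictions rather than averaging per-batch accuracies keeps the result exact when the last batch is smaller.

```python
import numpy as np

def batched_accuracy(predict_fn, images, labels, batch_size=100):
    """Evaluate in chunks so only `batch_size` examples are processed at once."""
    correct = 0
    for start in range(0, len(images), batch_size):
        end = start + batch_size
        # predict_fn is assumed to return one predicted label per input row.
        predictions = predict_fn(images[start:end])
        correct += int(np.sum(predictions == labels[start:end]))
    # Dividing by the total count weights every example equally.
    return correct / len(images)

# Toy stand-in for a real model, used only to exercise the helper.
def toy_predict(batch):
    return (batch.sum(axis=1) > 0).astype(np.int64)

rng = np.random.default_rng(0)
images = rng.normal(size=(1050, 4))   # 1050 is deliberately not a multiple of 100
labels = (images.sum(axis=1) > 0).astype(np.int64)

print(batched_accuracy(toy_predict, images, labels, batch_size=100))
```

    With a real network, `predict_fn` would wrap the model's own inference call, and the batch size can be lowered until evaluation fits in GPU memory.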

    Your second error message is very likely a consequence of the earlier failures, because the cuDNN driver no longer seems to be working. I'd suggest restarting your machine.
