Question
Similar to this question about R, I run into out-of-memory errors when running grid search in loops in H2O. In R, calling gc() in each iteration helped. What is the recommended solution here?
Answer 1:
There appears to be no h2o.gc() function in the Python API. See "How can I debug memory issues?" in the FAQ. If you suspect the back-end is holding on to memory it should have released, you could POST that back-end command (GarbageCollect) directly using the REST API. Studying the detailed logs might help confirm whether that is the case.
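A minimal sketch of that REST call, using only the standard library. The endpoint path /3/GarbageCollect and the default address http://localhost:54321 are assumptions based on a typical local H2O-3 setup; check both against your own cluster and version. The function names are made up for illustration.

```python
import urllib.request


def build_gc_request(base_url="http://localhost:54321"):
    """Build (but do not send) a POST request for the back-end GarbageCollect command.

    The path "/3/GarbageCollect" is an assumption; confirm it for your H2O version.
    """
    url = base_url.rstrip("/") + "/3/GarbageCollect"
    # An empty body is enough; method="POST" makes the verb explicit.
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_backend_gc(base_url="http://localhost:54321", timeout=30):
    """Send the request to a running H2O node and return the HTTP status code."""
    with urllib.request.urlopen(build_gc_request(base_url), timeout=timeout) as resp:
        return resp.status
```

Calling trigger_backend_gc() at the end of each loop iteration would be the closest equivalent to the gc() trick from R. If the h2o Python client is already connected, h2o.api("POST /3/GarbageCollect") should send the same command through the existing connection; verify that on your version as well.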
Wrapping up the advice from the comments:
- Use h2o.remove() on H2O frames and models you no longer need, at the end of the loop.
- Use h2o.remove_all() (h2o.removeAll() in R) if you do not need to keep anything around and your loop will be re-loading all the data it needs.
- Use H2OGridSearch rather than your own loops and your own grid code.
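If you do keep your own loop, the first bullet can be sketched roughly as below. The helper name run_trials and its fit, score and remove parameters are hypothetical, not part of H2O. By default it falls back to h2o.remove, imported lazily so the helper can be tried without a running cluster.

```python
def run_trials(param_sets, fit, score, remove=None):
    """Train one model per parameter set, keep only its score, then free the model.

    fit(params) -> model and score(model) -> number are caller-supplied callables
    (hypothetical names). remove defaults to h2o.remove.
    """
    if remove is None:
        import h2o  # lazy import: only needed when no remove function is supplied
        remove = h2o.remove

    results = []
    for params in param_sets:
        model = fit(params)
        try:
            # Keep only a plain Python number, not a reference to the model.
            results.append((params, score(model)))
        finally:
            # Release the model on the H2O side even if scoring raised.
            remove(model)
    return results
```

In practice fit might construct and train an estimator from params and score might return a validation metric. Any temporary frames created inside fit would need the same treatment.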
I'd also add: be aware that cbind, rbind, and any function that modifies an H2O frame will make a copy of the entire frame. Sometimes rethinking your data munging steps can reduce the memory requirements.
Source: https://stackoverflow.com/questions/45435739/python-h2o-memory-management