Python H2O Memory Management

Posted by ε祈祈猫儿з on 2020-12-05 09:35:28

Question


Similar to this question about R, I run out of memory when running loops with grid search in H2O. In R, calling gc() in each iteration helped. What is the recommended solution in Python?


Answer 1:


There appears to be no h2o.gc() function in the Python API. See "How can I debug memory issues?" in the FAQ. If you suspect the back-end is holding on to memory it should have released, you could POST the back-end command (GarbageCollect) directly through the REST API. Studying the detailed logs might help confirm whether that is the case.
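A minimal sketch of posting that command from Python. It assumes a cluster is already connected (via h2o.init() or h2o.connect()) and that the back-end exposes the command as POST /3/GarbageCollect; h2o.api() is the Python client's generic REST helper, and request_backend_gc is a hypothetical name chosen for illustration.

```python
def request_backend_gc():
    """Ask the H2O back-end (the JVM) to run a garbage collection.

    Assumes h2o.init() or h2o.connect() has already been called.
    """
    # Imported inside the function so that defining the helper does not
    # require the h2o package to be loaded yet.
    import h2o

    # Assumption: the GarbageCollect command lives under version 3 of the REST API.
    h2o.api("POST /3/GarbageCollect")
```

Calling request_backend_gc() at the end of each loop iteration would be the closest equivalent of the gc() call used in R. It only helps if the back-end holds objects that are no longer referenced; frames and models still stored in the cluster have to be removed explicitly.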

Wrapping up the advice from the comments:

  • Use h2o.remove() on H2O frames and models you no longer need, at the end of the loop.
  • Use h2o.remove_all() (named h2o.removeAll() in the R API) if you do not need to keep anything around and your loop re-loads all the data it needs.
  • Use H2OGridSearch rather than your own loops and your own grid code.
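The first and last points above can be sketched as follows. This is an illustration under stated assumptions, not the asker's code: the GBM estimator, the ntrees hyperparameter, the RMSE metric, and the function names train_and_score and grid_search_gbm are all chosen for the example; train and valid are assumed to be H2OFrames already loaded into a running cluster.

```python
def train_and_score(train, valid, y, ntrees_options):
    """Hand-written loop: train one GBM per ntrees value, then free it."""
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    scores = {}
    for ntrees in ntrees_options:
        model = H2OGradientBoostingEstimator(ntrees=ntrees)
        model.train(y=y, training_frame=train, validation_frame=valid)
        try:
            # Copy the number we care about into plain Python first.
            scores[ntrees] = model.rmse(valid=True)
        finally:
            # Delete the model from the cluster at the end of the iteration.
            h2o.remove(model)
    return scores


def grid_search_gbm(train, valid, y, ntrees_options):
    """Same search, delegated to H2OGridSearch instead of a custom loop."""
    from h2o.estimators import H2OGradientBoostingEstimator
    from h2o.grid.grid_search import H2OGridSearch

    grid = H2OGridSearch(
        model=H2OGradientBoostingEstimator,
        hyper_params={"ntrees": list(ntrees_options)},
    )
    grid.train(y=y, training_frame=train, validation_frame=valid)
    return grid
```

For example, train_and_score(train, valid, "label", [50, 100]) would return a dict mapping each ntrees value to its validation RMSE, where "label" is a hypothetical response column. In the loop version only those numbers survive each iteration.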

Also be aware that cbind, rbind, and any function that modifies an H2O frame will make a copy of the entire frame. Rethinking how you do your data-munging steps can sometimes reduce the memory requirements.



Source: https://stackoverflow.com/questions/45435739/python-h2o-memory-management
