Question
The H2O site says:
H2O’s core code is written in Java. Inside H2O, a Distributed Key/Value store is used to access and reference data, models, objects, etc., across all nodes and machines. The algorithms are implemented on top of H2O’s distributed Map/Reduce framework and utilize the Java Fork/Join framework for multi-threading.
Does this mean H2O will not perform better than other libraries when it runs on a single-node cluster, but will perform well on a multi-node cluster? Is that right?
Also, what is the difference between H2O on multiple nodes and H2O on Hadoop?
Answer 1:
Please see the documentation on how to run H2O on Hadoop (http://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html#hadoop-users), as well as this presentation.
You can think of "H2O on Hadoop" as H2O's certified integration for Hadoop. However, you don't need Hadoop to run H2O in a multi-node environment; you can always set up the cluster manually if you want to.
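To illustrate the manual route, here is a minimal sketch that builds the two text artifacts a hand-rolled multi-node launch typically needs: a flatfile listing every node, and the `java -jar h2o.jar` command to run on each machine. The host addresses, cluster name, heap size, and file paths are placeholders I made up, and the helper functions are hypothetical. Only the `-name`, `-flatfile`, and `-port` launch options come from H2O itself.

```python
# Sketch only: prepares the inputs for starting H2O by hand on several machines.
# Host addresses, cluster name, heap size, and paths below are placeholders.

def build_flatfile(hosts, port=54321):
    """Return flatfile contents: one ip:port entry per line."""
    return "\n".join(f"{host}:{port}" for host in hosts) + "\n"


def build_launch_command(cluster_name, flatfile_path, port=54321,
                         jar="h2o.jar", heap="4g"):
    """Return the java command (as an argument list) to run on every node."""
    return [
        "java", f"-Xmx{heap}", "-jar", jar,
        "-name", cluster_name,       # nodes with the same name form one cluster
        "-flatfile", flatfile_path,  # tells each node where its peers are
        "-port", str(port),
    ]


hosts = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]  # placeholder addresses
flatfile_text = build_flatfile(hosts)
command = build_launch_command("my-h2o-cluster", "flatfile.txt")

print(flatfile_text, end="")
print(" ".join(command))
```

You would copy the same flatfile to every machine and run the printed command on each one. With H2O on Hadoop, the Hadoop driver takes over this step and places the nodes for you.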
Source: https://stackoverflow.com/questions/50753130/whats-the-difference-between-h2o-on-multi-nodes-and-h2o-on-hadoop