Is it possible to build a Deep Water/TensorFlow model in H2O without CUDA?

Question

My goal is to integrate H2O with TensorFlow on a machine without CUDA.

Since TensorFlow supports both CPU and GPU execution, I expected the H2O/TensorFlow integration to work without CUDA. However, I'm confused by the mention of CUDA software in the Deep Water system requirements.

I tried to build a Deep Water/TensorFlow model in H2O Flow but failed. The steps I performed:

Downloaded the H2O standalone JAR;
Created a data frame in H2O Flow as usual;
Tried to build a model with Deep Water as the algorithm and tensorflow as the backend;
Got the following exception:

java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: No backend found. Cannot build a Deep Water model.
    at hex.deepwater.DeepWaterModelInfo.setupNativeBackend(DeepWaterModelInfo.java:246)
    at hex.deepwater.DeepWaterModelInfo.<init>(DeepWaterModelInfo.java:193)
    at hex.deepwater.DeepWaterModel.<init>(DeepWaterModel.java:225)
    at hex.deepwater.DeepWater$DeepWaterDriver.buildModel(DeepWater.java:127)
    at hex.deepwater.DeepWater$DeepWaterDriver.computeImpl(DeepWater.java:114)
    at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:169)
    at hex.deepwater.DeepWater$DeepWaterDriver.compute2(DeepWater.java:107)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1220)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

So my questions are:

Is it possible to build a Deep Water/TensorFlow model in H2O without CUDA at all?
If so, what should I do to get it working? If not, are there other options for integrating H2O and TensorFlow without CUDA?

Update 1:

I set the gpu parameter to false and tried to build the model again with each available backend. Both caffe and tensorflow produce the same stack trace as shown above. mxnet also fails, but with two different stack traces.

mxnet (first attempt to build a model):

java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: null
    at hex.deepwater.DeepWaterModelInfo.setupNativeBackend(DeepWaterModelInfo.java:246)
    at hex.deepwater.DeepWaterModelInfo.<init>(DeepWaterModelInfo.java:193)
    at hex.deepwater.DeepWaterModel.<init>(DeepWaterModel.java:225)
    at hex.deepwater.DeepWater$DeepWaterDriver.buildModel(DeepWater.java:127)
    at hex.deepwater.DeepWater$DeepWaterDriver.computeImpl(DeepWater.java:114)
    at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:169)
    at hex.deepwater.DeepWater$DeepWaterDriver.compute2(DeepWater.java:107)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1220)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

mxnet (subsequent attempts):

java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: Could not initialize class deepwater.backends.mxnet.MXNetBackend$MXNetLoader
    at hex.deepwater.DeepWaterModelInfo.setupNativeBackend(DeepWaterModelInfo.java:246)
    at hex.deepwater.DeepWaterModelInfo.<init>(DeepWaterModelInfo.java:193)
    at hex.deepwater.DeepWaterModel.<init>(DeepWaterModel.java:225)
    at hex.deepwater.DeepWater$DeepWaterDriver.buildModel(DeepWater.java:127)
    at hex.deepwater.DeepWater$DeepWaterDriver.computeImpl(DeepWater.java:114)
    at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:169)
    at hex.deepwater.DeepWater$DeepWaterDriver.compute2(DeepWater.java:107)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1220)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

Update 2:

Environment:

SW: CentOS Linux release 7.3.1611 (Core), Java HotSpot 64-Bit Server VM (build 25.121-b13, mixed mode);
HW: virtual machine running on Xeon CPU E5-2620 v4 with 4 cores and 8 GB RAM available. No physical GPU is available, lspci -vnn | grep VGA returns 00:0f.0 VGA compatible controller [0300]: VMware SVGA II Adapter [15ad:0405] (prog-if 00 [VGA controller])

I cleared my /tmp directory and tried mxnet again. On the first attempt I got a new exception:

java.lang.RuntimeException: Unable to initialize the native Deep Learning backend: /tmp/libmxnet.so: libcudart.so.8.0: cannot open shared object file: No such file or directory
    at hex.deepwater.DeepWaterModelInfo.setupNativeBackend(DeepWaterModelInfo.java:246)
    at hex.deepwater.DeepWaterModelInfo.<init>(DeepWaterModelInfo.java:193)
    at hex.deepwater.DeepWaterModel.<init>(DeepWaterModel.java:225)
    at hex.deepwater.DeepWater$DeepWaterDriver.buildModel(DeepWater.java:127)
    at hex.deepwater.DeepWater$DeepWaterDriver.computeImpl(DeepWater.java:114)
    at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:169)
    at hex.deepwater.DeepWater$DeepWaterDriver.compute2(DeepWater.java:107)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1220)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

The file /tmp/libmxnet.so is present, its permissions are -rw-rw-r--.

Answer 1:

To answer your first question:

Yes, you can run Deep Water without a GPU, but it will be very slow. When using Flow, you can disable the gpu setting (which is TRUE by default).

You can also set gpu to false directly in the Flow cell:

"gpu":false

However, your main problem is that none of the backends (mxnet, tensorflow, caffe) was available to run your code. We did test the gpu flag with mxnet. Please investigate the error above further.
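One way to investigate is to check, outside of H2O, whether the native libraries named in the error can be loaded at all. The following is a minimal sketch, not part of H2O: the helper name try_load is hypothetical, and the two paths are simply the ones that appear in the exception from Update 2. It asks the dynamic loader to open each library and reports the loader's own message on failure. libcudart.so.8.0 is the CUDA 8.0 runtime library, so if it cannot be opened, a libmxnet.so that was linked against it would be expected to fail as well, whatever the gpu flag is set to.

```python
import ctypes


def try_load(path):
    """Ask the dynamic loader to open a shared library.

    Returns (True, None) on success, or (False, message) where message is
    the loader's error text, e.g. "... cannot open shared object file ...".
    """
    try:
        ctypes.CDLL(path)
        return True, None
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Paths copied from the exception in the question; adjust as needed.
    for lib in ("libcudart.so.8.0", "/tmp/libmxnet.so"):
        ok, err = try_load(lib)
        print(lib, "->", "loaded" if ok else "failed: " + err)
```

On Linux, running ldd /tmp/libmxnet.so gives similar information by listing each dependency and marking the missing ones as "not found".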

Source: https://stackoverflow.com/questions/43545511/is-it-possible-to-build-deep-water-tensorflow-model-in-h2o-without-cuda

Tags

h2o