Question
I'm catching up on h2o's MOJO and POJO model formats. I'm able to save a model as a MOJO/POJO with
h2o.download_mojo(model, path = "/media/somewhere/tmp") # ok
h2o.download_pojo(model, path = "/media/somewhere/tmp") # ok
which writes a file with a name like mymodel.zip or mymodel.java to the directory.
However, it's not clear to me how to read it back into the server in R. I tried,
saved_model2 <- h2o.loadModel("/media/somewhere/tmp/mymodel.java") # does not work
saved_model3 <- h2o.loadModel("/media/somewhere/tmp/mymodel.zip") # does not work
but got an error message like this:
ERROR: Unexpected HTTP Status code: 400 Bad Request (url = http://localhost:54321/99/Models.bin/)
java.lang.IllegalArgumentException
[1] "java.lang.IllegalArgumentException: Missing magic number 0x1CED at stream start"
....
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
Missing magic number 0x1CED at stream start
Answer 1:
If you are looking to make predictions on an H2O model in R, then you have three options (which method you choose depends on your use-case):
- You can use a binary model instead of a MOJO (or POJO). For this method, you export the model to disk using h2o.saveModel(), load it back into the H2O cluster using h2o.loadModel(), and make predictions using predict(model, test). This method requires having an H2O cluster running.
- If you'd still prefer to export a model to MOJO (or POJO) format, you can use the h2o.mojo_predict_df() or h2o.mojo_predict_csv() function in R to generate predictions on a test set (from an R data.frame or a CSV file).
- As an alternative to the second option, if your data is in JSON format, you can use h2o.predict_json(), but it will only score one row at a time.
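The first two options can be sketched in R roughly as follows. This is a minimal sketch, not a definitive recipe: the helper names roundtrip_binary and score_with_mojo are hypothetical, `model` and `test` are assumed to be an existing H2O model and H2OFrame, and the note about h2o-genmodel.jar being found next to the MOJO zip is an assumption about the default behaviour that may vary by H2O version.

```r
# Option 1: binary model round trip. Requires a running H2O cluster
# (h2o.init() already called); `model` and `test` are assumed to exist.
roundtrip_binary <- function(model, test, dir = tempdir()) {
  # h2o.saveModel() returns the path of the file it wrote
  saved_path <- h2o::h2o.saveModel(model, path = dir, force = TRUE)
  reloaded <- h2o::h2o.loadModel(saved_path)
  h2o::h2o.predict(reloaded, test)
}

# Option 2: score a plain R data.frame against a MOJO zip.
# Java must be installed. If genmodel_jar is NULL, h2o-genmodel.jar is
# assumed to sit in the same folder as the MOJO zip.
score_with_mojo <- function(df, mojo_zip, genmodel_jar = NULL) {
  stopifnot(is.data.frame(df), file.exists(mojo_zip))
  h2o::h2o.mojo_predict_df(
    frame = df,
    mojo_zip_path = mojo_zip,
    genmodel_jar_path = genmodel_jar
  )
}
```

For the files in the question, the second helper would be called as score_with_mojo(test_df, "/media/somewhere/tmp/mymodel.zip"), where test_df is a data.frame with the model's predictor columns.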
Answer 2:
h2o.loadModel is meant to be used with h2o.saveModel. If you want to compile and run a MOJO, you need to do the following.
First, let's say you created a MOJO from a GBM:
library(h2o)
h2o.init(nthreads=-1)
path = "http://h2o-public-test-data.s3.amazonaws.com/smalldata/prostate/prostate.csv"
h2o_df = h2o.importFile(path)
h2o_df$RACE = as.factor(h2o_df$RACE)
model = h2o.gbm(y="CAPSULE",
                x=c("AGE", "RACE", "PSA", "GLEASON"),
                training_frame=h2o_df,
                distribution="bernoulli",
                ntrees=100,
                max_depth=4,
                learn_rate=0.1)
and then downloaded the MOJO and the resulting h2o-genmodel.jar file to a new experiment folder. Note that the h2o-genmodel.jar file is a library that supports scoring and contains the required readers and interpreters. This file is required when MOJO models are deployed to production.
modelfile <- h2o.download_mojo(model, path="~/experiment/", get_genmodel_jar=TRUE)
cat("Model saved to", modelfile, "\n")
Model saved to /Users/user/GBM_model_R_1475248925871_74.zip
Then you would open a new terminal window and change into the experiment directory, where you have the MOJO .zip file and the h2o-genmodel.jar file.
$ cd experiment
Then you would create your main program in the experiment folder by creating a new file called main.java (for example, using "vim main.java"). Include the following contents. Note that this file is referencing the GBM model created above using R.
import java.io.*;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;

public class main {
  public static void main(String[] args) throws Exception {
    // Load the MOJO zip and wrap it for row-by-row scoring
    EasyPredictModelWrapper model = new EasyPredictModelWrapper(MojoModel.load("GBM_model_R_1475248925871_74.zip"));

    RowData row = new RowData();
    row.put("AGE", "68");
    row.put("RACE", "2");
    row.put("DCAPS", "2");
    row.put("VOL", "0");
    row.put("GLEASON", "6");

    BinomialModelPrediction p = model.predictBinomial(row);
    System.out.println("Has penetrated the prostatic capsule (1=yes; 0=no): " + p.label);
    System.out.print("Class probabilities: ");
    for (int i = 0; i < p.classProbabilities.length; i++) {
      if (i > 0) {
        System.out.print(",");
      }
      System.out.print(p.classProbabilities[i]);
    }
    System.out.println("");
  }
}
Then compile and run the program in the second terminal window to display the predicted probabilities:
$ javac -cp h2o-genmodel.jar -J-Xms2g -J-XX:MaxPermSize=128m main.java
$ java -cp .:h2o-genmodel.jar main
Source: https://stackoverflow.com/questions/45335697/r-h2o-load-a-saved-model-from-disk-in-mojo-or-pojo-format