h2o

How to use the saved .rds h2o model for prediction afterwards?

…衆ロ難τιáo~ submitted on 2019-12-02 07:40:58
Question: I have created an R model using the mlr and h2o packages as below:
    library(h2o)
    rfh20.lrn = makeLearner("classif.h2o.randomForest", predict.type = "prob")
After model tuning, the model initiates the h2o JVM and connects R to the h2o cluster. Modelling is done, and I saved the model as an .rds file:
    saveRDS(h2orf_mod, "h2orf_mod.rds")
I do the prediction as:
    pred_h2orf <- predict(h2orf_mod, newdata = newdata)
Then I shut down h2o:
    h2o.shutdown()
Later I re-call the saved model:
    h2orf_mod <- readRDS("h2orf_mod.rds
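A commonly cited cause of trouble in this situation is that an H2O model object held by the client is only a handle to a model living inside the H2O cluster, so serializing the handle does not serialize the model itself. Below is a minimal sketch of the native save/load route, written with H2O's Python client rather than the mlr wrapper from the question. The function name `save_and_reload`, the directory, and the frame are hypothetical, and whether this route fits an mlr-wrapped learner is an assumption.

```python
def save_and_reload(model, directory, new_frame):
    """Persist an H2O model with H2O's own serializer, reload it, and score.

    Assumes h2o.init() has already been called and that `model` was trained
    in the currently running cluster. All argument names are hypothetical.
    """
    import h2o  # imported lazily so the sketch can be read without a cluster

    # save_model writes the model to disk and returns the full path it used
    saved_path = h2o.save_model(model=model, path=directory, force=True)

    # In a later session (after calling h2o.init() again), load from that path
    reloaded = h2o.load_model(saved_path)

    # Score an H2OFrame with the reloaded model
    return saved_path, reloaded.predict(new_frame)
```

The returned path is what a later session needs in order to call `h2o.load_model`, so it is worth storing alongside any other project artifacts.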

h2o.saveModel throwing exception with directory on Windows 8

痴心易碎 submitted on 2019-12-02 07:32:56
I'm using h2o version 3.0.0.22 in R and I'm trying to save my model, but I can't figure out what format is expected. I've tried all sorts of variations and get a different exception each time.
    h2o.saveModel(model, dir="c:/temp", name= "my.model")
    ERROR: Unexpected HTTP Status code: 400 Bad Request (url = http://127.0.0.1:54321/3/Models.bin/DeepLearningModel__8412f3abf1699b5593a55c6861c8468d?dir=c%3A%2Ftemp%2Fmy.model&force=0)
    java.lang.IllegalArgumentException
    [1] "water.persist.PersistManager.getPersistForURI(PersistManager.java:407)"
    [2] "water.serial.ObjectTreeBinarySerializer

how many classes h2o deep learning algorithm accepts?

拜拜、爱过 submitted on 2019-12-02 07:29:58
Question: I want to predict the response variable, and it has 700 classes. Deep learning model parameters:
    from h2o.estimators import deeplearning
    dl_model = deeplearning.H2ODeepLearningEstimator(
        hidden=[200,200], epochs = 10, missing_values_handling='MeanImputation',
        max_categorical_features=4, distribution='multinomial'
    )
    # Train the model
    dl_model.train(x = Content_vecs.names, y='tags', training_frame = data_split[0], validation_frame = data_split[1] )
Original Response Variable - Tags: apps, email,

Create a map to call the POJO for each row of Spark Dataframe

你说的曾经没有我的故事 submitted on 2019-12-02 07:03:10
Question: I built an H2O model in R and saved the POJO code. I want to score parquet files in HDFS using the POJO, but I'm not sure how to go about it. I plan on reading the parquet files into Spark (Scala/SparkR/PySpark) and scoring them there. Below is the excerpt I found on H2O's documentation page. "How do I run a POJO on a Spark Cluster? The POJO provides just the math logic to do predictions, so you won’t find any Spark (or even H2O) specific code there. If you want to use the POJO to make

how many classes h2o deep learning algorithm accepts?

核能气质少年 submitted on 2019-12-02 04:39:01
I want to predict the response variable, and it has 700 classes. Deep learning model parameters:
    from h2o.estimators import deeplearning
    dl_model = deeplearning.H2ODeepLearningEstimator(
        hidden=[200,200], epochs = 10, missing_values_handling='MeanImputation',
        max_categorical_features=4, distribution='multinomial'
    )
    # Train the model
    dl_model.train(x = Content_vecs.names, y='tags', training_frame = data_split[0], validation_frame = data_split[1] )
Original Response Variable - Tags: apps, email, mail finance,freelancers,contractors,zen99 genomes gogovan brazil,china,cloudflare hauling,service
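Before asking whether a model can handle the response, it can help to count how many distinct classes the column really contains. The following is a minimal sketch in plain Python, assuming the response is available as a list of strings shaped like the tag values above. The helper `count_classes` and the sample rows are hypothetical, and the sketch says nothing about any limit H2O itself enforces.

```python
from collections import Counter

def count_classes(responses):
    """Return (number of distinct values, Counter of value frequencies)."""
    counts = Counter(r.strip() for r in responses)
    return len(counts), counts

# Hypothetical sample rows, loosely shaped like the tag values in the question
rows = [
    "apps, email, mail",
    "finance,freelancers,contractors,zen99",
    "genomes",
    "genomes",
]
n_classes, freq = count_classes(rows)
print(n_classes)             # number of distinct response strings
print(freq.most_common(1))   # the most frequent response string
```

Note that a comma-separated multi-tag string counts as a single class in this sketch, so the class count can grow quickly when tags are combined freely.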

How to specify the file name when saving the model using h2o package from R

大兔子大兔子 submitted on 2019-12-01 21:47:18
Question: I am trying to save a model built using the function h2o.saveModel(). Based on the function description on page 159 of the H2O user manual for R, the arguments only include path. I looked at a similar function, h2o.saveModelDetails(), but it uses the same argument. Please advise if there is another way to specify the name of the model. Answer 1: The name of the model file will be determined by the ID of the model. So if you specify model_id when training your model, then you can
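The answer's point, that the file name follows the model ID, can be sketched with H2O's Python client. This is an illustration under assumptions: `train_and_save`, the directory, and the ID `my_rf_model` are hypothetical, and the helper `expected_model_path` only encodes the naming rule described in the answer.

```python
import os

def expected_model_path(directory, model_id):
    """Path where the saved model is expected, given that the file is named after the model ID."""
    return os.path.join(directory, model_id)

def train_and_save(train_frame, x, y, directory, model_id="my_rf_model"):
    """Train a random forest with an explicit model_id and save it.

    Assumes h2o.init() has been called and train_frame is an H2OFrame.
    """
    import h2o  # imported lazily so the helper above works without a cluster
    from h2o.estimators import H2ORandomForestEstimator

    # The model_id chosen here is what the saved file will be named after
    model = H2ORandomForestEstimator(model_id=model_id)
    model.train(x=x, y=y, training_frame=train_frame)

    # save_model returns the full path of the file it wrote
    return h2o.save_model(model=model, path=directory, force=True)
```

Comparing the value returned by `train_and_save` with `expected_model_path(directory, model_id)` is a quick way to confirm the naming behaviour on a given H2O version.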

How to directly plot ROC of h2o model object in R

限于喜欢 submitted on 2019-12-01 21:47:04
My apologies if I'm missing something obvious. I've been thoroughly enjoying working with h2o over the last few days using the R interface. I would like to evaluate my model, say a random forest, by plotting an ROC curve. The documentation seems to suggest that there is a straightforward way to do that:
    Interpreting a DRF Model
    By default, the following output displays:
    - Model parameters (hidden)
    - A graph of the scoring history (number of trees vs. training MSE)
    - A graph of the ROC curve (TPR vs. FPR)
    - A graph of the variable importances
    - ...
I've also seen that in Python you can apply the roc function here. But
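For reference, the ROC curve mentioned in the documentation excerpt (TPR vs. FPR) can be computed by hand from predicted scores and true labels. The following is a minimal sketch in plain Python, independent of H2O; `roc_points` and the sample data are hypothetical.

```python
def roc_points(labels, scores):
    """Return a list of (FPR, TPR) points, one per distinct score threshold.

    labels: sequence of 0/1 true classes.
    scores: predicted probability of class 1 for each observation.
    Tied scores produce a single point, emitted after the whole tie group.
    """
    # Walk through observations from highest to lowest score
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, y) in enumerate(pairs):
        if y == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only when the next score differs (end of a tie group)
        if i == len(pairs) - 1 or pairs[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points

labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
curve = roc_points(labels, scores)
```

The resulting points can be handed to any plotting library, with FPR on the x-axis and TPR on the y-axis.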

Error with h2o.predict in R

泄露秘密 submitted on 2019-12-01 21:19:36
I am getting an error when trying to create deep learning predictions with h2o in R. The error occurs for about one third of predictions made with h2o.predict. Here is the model setup:
    localH2O = h2o.init(ip = "localhost", port = 54321, startH2O = TRUE, max_mem_size='20g', nthreads=6)
    model <- h2o.deeplearning(x = 2:100, y = 1, training_frame = x, l1 = 1e-5, l2 = 1e-5, epochs=500, hidden = c(800,800,100))
    prediction <- h2o.predict(model, x[,2:100])
Here is the error that occurs on and off:
    ERROR: Unexpected HTTP Status code: 500 Server Error (url = http://localhost:54321/99/Rapids)
    java

h2o failed to connect when called from R: Java version mismatch

亡梦爱人 submitted on 2019-12-01 20:30:17
h2o was working before on my laptop, but I didn't use it for a while (and have installed new packages and updated things in the meantime). Yesterday I tried using it, but it didn't work. I removed the R h2o package and reinstalled h2o from scratch with install.packages("h2o"). I tried running h2o with h2o.init(), but it gives me this error:
    java version "9"
    Java(TM) SE Runtime Environment (build 9+181)
    Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)
    Starting H2O JVM and connecting: ............................................................
    [1] "localhost"
    [1] 54321
    [1] TRUE
    [1]
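Since the log above reports `java version "9"`, checking which Java major version is on the PATH can help narrow down this kind of problem. The following is a minimal sketch in Python; `parse_java_major` and `installed_java_major` are hypothetical helpers, and which Java versions a given H2O release supports is not stated here and should be checked against that release's documentation.

```python
import re
import subprocess

def parse_java_major(version_output):
    """Extract the Java major version from `java -version` output.

    Handles the legacy scheme ("1.8.0_151" gives 8) and the newer scheme
    ("9" gives 9, "11.0.2" gives 11). Returns None if no version is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    first = int(match.group(1))
    # Legacy versions are written as 1.<major>.<minor>
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first

def installed_java_major():
    """Run `java -version` (which prints to stderr) and parse the result."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr or result.stdout)
```

Calling `installed_java_major()` from the same environment that launches R shows which Java the H2O JVM is likely to pick up, assuming both resolve `java` through the same PATH.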

Wrong Euclidean distance H2O calculations R

元气小坏坏 submitted on 2019-12-01 19:43:29
I am using H2O with R to calculate the Euclidean distance between 2 data.frames:
    set.seed(121)
    # create the data
    df1 <- data.frame(matrix(rnorm(1000), ncol=10))
    df2 <- data.frame(matrix(rnorm(300), ncol=10))
    # init h2o
    h2o.init()
    # transform to h2o
    df1.h <- as.h2o(df1)
    df2.h <- as.h2o(df2)
If I use normal calculations, i.e. for the first row:
    distance1 <- sqrt(sum((df1[1,]-df2[1,])^2))
And if I use the H2O library:
    distance.h2o <- h2o.distance(df1.h[1,], df2.h[1,], "l2")
    print(distance1)
    print(distance.h2o)
The values of distance1 and distance.h2o are not the same. Does anybody know why? Thanks!! It seems as if h2o.distance
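As a library-independent reference point, the `sqrt(sum((a - b)^2))` calculation from the question can be reproduced in a few lines. The following is a minimal sketch in plain Python; `euclidean`, `pairwise_distances`, and the sample rows are hypothetical, and the sketch makes no claim about how `h2o.distance` computes its result.

```python
import math

def euclidean(row_a, row_b):
    """Euclidean (L2) distance between two equal-length numeric rows."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same number of columns")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(row_a, row_b)))

def pairwise_distances(rows_a, rows_b):
    """Matrix of distances: entry [i][j] is the distance from rows_a[i] to rows_b[j]."""
    return [[euclidean(a, b) for b in rows_b] for a in rows_a]

d = euclidean([0.0, 3.0], [4.0, 0.0])
matrix = pairwise_distances([[0.0, 0.0]], [[3.0, 4.0], [0.0, 0.0]])
```

When a library returns a pairwise matrix rather than a scalar, the value to compare against is the matching cell of that matrix, and it is also worth checking whether the library reports the squared distance or its square root.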