h2o

How to create an H2OFrame using the H2O REST API

那年仲夏 submitted on 2019-12-06 06:15:56
Is it possible to create an H2OFrame using H2O's REST API, and if so, how? My main objective is to use models stored inside H2O to make predictions on external H2OFrames. I need to be able to generate those H2OFrames externally from JSON (I suppose by calling an endpoint). I read the API documentation but couldn't find a clear explanation. I believe the closest endpoints are /3/CreateFrame, which creates random data, and /3/ParseSetup, but I couldn't find any reliable tutorial.

Pasha: Currently there is no REST API endpoint to directly convert a JSON record into a Frame object.
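Since the answer says there is no endpoint that turns JSON straight into a Frame, one workaround is to flatten the JSON records into CSV first and send that through the usual upload and parse path. The sketch below covers only the conversion step, using the Python standard library. The function name json_records_to_csv and the sample records are made up for illustration, and the input is assumed to be a JSON array of flat objects.

```python
import csv
import io
import json


def json_records_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text with a header row."""
    records = json.loads(json_text)
    # Collect column names in first-seen order so the header is stable.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    # restval="" leaves a cell empty when a record lacks that key.
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical records; the second one is missing "species" on purpose.
sample = '[{"sepal_len": 5.1, "species": "setosa"}, {"sepal_len": 6.2, "petal_len": 4.5}]'
csv_text = json_records_to_csv(sample)
print(csv_text)
```

The resulting text can be written to a file and handed to whatever import route the cluster already supports, for example h2o.upload_file in the Python client, which runs the parse steps for you.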

How to map over a DataFrame in Spark to extract RowData and make predictions using an H2O MOJO model

五迷三道 submitted on 2019-12-06 05:37:38
I have a saved H2O model in MOJO format, and now I am trying to load it and use it to make predictions on a new dataset (df) as part of a Spark app written in Scala. Ideally, I wish to append a new row to the existing DataFrame containing the class probability based on this model. I can see how to apply a MOJO to an individual row that is already in RowData format (as per the answer here), but I am not sure how to map over an existing DataFrame so that it is in the right format to make predictions using the MOJO model. I have worked with DataFrames a fair bit, but never with the underlying RDDs. Also

Attribute selection in H2O

时光总嘲笑我的痴心妄想 submitted on 2019-12-06 05:15:54
I am a complete beginner with H2O and I want to know whether the H2O framework has any attribute selection capabilities that can be applied to H2OFrames.

No, there are currently no feature selection functions in H2O. My advice would be to use Lasso regression (in H2O this means using GLM with alpha = 1.0) to do the feature selection, or simply to let whatever machine learning algorithm you are planning to use (e.g. GBM) use all the features (it will tend to ignore the bad ones, but having bad features in the training data could still degrade the algorithm's performance). If you'd like, you can
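The Lasso suggestion above can be sketched in Python roughly as follows. The helper names fit_lasso_glm and selected_features and the example coefficients are invented for illustration. The first function needs the h2o package and a running cluster, so it is defined here but not called. The second is plain Python and keeps the predictors whose coefficient the L1 penalty did not shrink to zero.

```python
def fit_lasso_glm(training_frame, predictors, response, family="binomial"):
    """Train an L1-penalised GLM. Needs the h2o package and a running cluster."""
    # Imported lazily so the rest of this sketch runs without h2o installed.
    from h2o.estimators.glm import H2OGeneralizedLinearEstimator

    # alpha=1.0 selects a pure Lasso penalty; lambda_search tries a range of strengths.
    model = H2OGeneralizedLinearEstimator(family=family, alpha=1.0, lambda_search=True)
    model.train(x=predictors, y=response, training_frame=training_frame)
    return model.coef()  # dict mapping coefficient name to value


def selected_features(coefficients, tolerance=0.0):
    """Return the names whose coefficient is non-zero, skipping the intercept."""
    return sorted(
        name
        for name, value in coefficients.items()
        if name != "Intercept" and abs(value) > tolerance
    )


# Made-up coefficients standing in for the output of fit_lasso_glm().
example = {"Intercept": -1.2, "age": 0.8, "income": 0.0, "zip": 0.0, "balance": -0.3}
print(selected_features(example))
```

For categorical predictors the coefficient names include the level (for example "column.level"), so mapping them back to the original columns takes one extra step.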

What is the difference between h2o.ensemble and h2o.stack in the h2oEnsemble package?

怎甘沉沦 submitted on 2019-12-06 02:46:25
Question:

According to the function descriptions:

h2o.stack: This function creates a "Super Learner" (stacking) ensemble using a list of existing H2O base models specified by the user.
h2o.ensemble: This function creates a "Super Learner" (stacking) ensemble using the H2O base learning algorithms specified by the user.

Answer 1:

They are two different ways to construct an ensemble. They have different interfaces, but they produce the exact same type of object in the end. The h2o.stack() function takes as

Is H2O target mean encoding available in Python?

痴心易碎 submitted on 2019-12-06 02:23:47
I noticed H2O has released target mean encoding: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-munging/target-encoding.html It only comes with an R code example. Does anyone have a Python example?

Like this:

    from h2o.targetencoder import TargetEncoder

    # Fit target encoding on training data
    targetEncoder = TargetEncoder(x=["addr_state", "purpose"], y="bad_loan", fold_column="cv_fold_te")
    targetEncoder.fit(ext_train)

But this requires at least version 3.22. Here is a link to an example: https://github.com/h2oai/h2o-tutorials/blob/78c3766741e8cbbbd8db04d54b1e34f678b85310/best
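To make clear what target mean encoding computes, here is a minimal pure-Python sketch of the idea. It is not H2O's implementation, which also offers fold-based holdout, blending, and noise. The function names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def fit_target_means(levels, targets):
    """Map each categorical level to the mean of the target observed for it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, target in zip(levels, targets):
        sums[level] += target
        counts[level] += 1
    return {level: sums[level] / counts[level] for level in sums}


def apply_target_means(levels, means, default):
    """Replace each level by its fitted mean; unseen levels get the default."""
    return [means.get(level, default) for level in levels]


# Toy training data: a state column and a binary bad_loan-style target.
train_levels = ["CA", "CA", "NY", "NY", "NY", "TX"]
train_target = [1, 0, 1, 1, 0, 0]

means = fit_target_means(train_levels, train_target)
prior = sum(train_target) / len(train_target)  # global mean used for unseen levels
encoded = apply_target_means(["CA", "NY", "WA"], means, prior)
print(encoded)
```

Fitting the means on the same rows that are later used for training leaks the target into the feature, which is why fold columns such as cv_fold_te appear in the snippet above.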

Something similar to permutation accuracy importance in the h2o package

假如想象 submitted on 2019-12-05 20:26:02
I fitted a random forest for my multinomial target with the randomForest package in R. Looking at variable importance, I found permutation accuracy importance, which is what I was looking for in my analysis. I fitted a random forest with the h2o package too, but the only measures it shows me are relative_importance, scaled_importance, and percentage. My questions are: can I extract a measure that shows which level of the target is best classified by the variable I want to examine? And is permutation accuracy importance the best measure I can use in this case? For example: I have a 3 levels
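Permutation accuracy importance is model-agnostic, so it can be computed outside any particular package once a prediction function is available. The sketch below is a generic illustration in plain Python, not code from h2o or randomForest. The predict callable, the toy rows, and the function names are assumptions.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each column, one column at a time."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with only this column replaced by a shuffled value.
            permuted = [
                row[:col] + [value] + row[col + 1:]
                for row, value in zip(rows, shuffled)
            ]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy classifier that looks only at the first feature.
def toy_predict(row):
    return "a" if row[0] > 0 else "b"


rows = [[1, 5], [-1, 5], [2, 7], [-2, 7], [3, 9], [-3, 9], [4, 1], [-4, 1]]
labels = [toy_predict(row) for row in rows]
importances = permutation_importance(toy_predict, rows, labels)
print(importances)
```

A per-class view can be obtained the same way by restricting the accuracy calculation to the rows of one target level at a time.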

H2O GLM: interact only certain predictors

大城市里の小女人 submitted on 2019-12-05 20:06:22
I'm interested in creating interaction terms in h2o.glm(), but I do not want to generate all pairwise interactions. For example, in the mtcars dataset, I want to interact 'mpg' with all the other factors such as 'cyl', 'hp', and 'disp', but I don't want the other factors to interact with each other (so I don't want disp_hp or disp_cyl). How should I best approach this problem using the (interactions = interactions_list) parameter in h2o.glm()? Thank you.

According to ?h2o.glm, the interactions= parameter takes: A list of predictor column indices to interact. All pairwise combinations will be
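Because interactions= expands to all pairwise combinations, the pairs that are actually wanted have to be spelled out some other way. A small sketch of building just the "one column against the rest" pairs follows. The helper name pairs_with_anchor is made up, and passing the result to a GLM through an interaction_pairs argument is an assumption that only holds on H2O versions that provide that argument.

```python
def pairs_with_anchor(anchor, others):
    """Pair one anchor column with each other column, and nothing else."""
    return [(anchor, other) for other in others if other != anchor]


pairs = pairs_with_anchor("mpg", ["cyl", "hp", "disp"])
print(pairs)
```

On older versions, the same list can drive manual creation of product columns before training.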

How do I know how many deep learning epochs were done, from R?

情到浓时终转凉″ submitted on 2019-12-05 17:10:45
Early stopping is turned on by default for h2o.deeplearning(). But, from R, how do I find out if it did stop early, and how many epochs it did? I've tried this:

    model = h2o.deeplearning(...)
    print(model)

which tells me information on the layers, the MSE, R2, etc., but nothing about how many epochs were run. Over on Flow I can see the information (e.g. where the x-axis stops in the "Scoring History - Deviance" chart, or in the Scoring History table).

If your model is called m, then to get just the number of epochs trained:

    last(m@model$scoring_history$epochs)

To see what other information is
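The same idea, reading the last value of the epochs column in the scoring history, can be sketched in Python. The helpers below work on a list of row dictionaries so they run without a cluster. The function names and the sample history are invented, and comparing against the requested epoch count is only a rough heuristic for early stopping.

```python
def epochs_trained(scoring_history_rows):
    """Return the last recorded value of the 'epochs' column."""
    epochs = [
        row["epochs"] for row in scoring_history_rows if row.get("epochs") is not None
    ]
    if not epochs:
        raise ValueError("scoring history has no 'epochs' values")
    return epochs[-1]


def stopped_early(scoring_history_rows, requested_epochs):
    """Heuristic: training ended before reaching the requested number of epochs."""
    return epochs_trained(scoring_history_rows) < requested_epochs


# Made-up scoring history with the same column name used in the R answer.
history = [
    {"epochs": 0.0, "training_rmse": 0.90},
    {"epochs": 2.5, "training_rmse": 0.52},
    {"epochs": 5.0, "training_rmse": 0.41},
    {"epochs": 7.5, "training_rmse": 0.40},
]
print(epochs_trained(history), stopped_early(history, requested_epochs=10))
```

In the Python client, model.scoring_history() typically returns a pandas DataFrame, which .to_dict("records") converts into this list-of-dictionaries shape.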

How to tune hidden_dropout_ratios in h2o.grid in R

。_饼干妹妹 submitted on 2019-12-05 16:36:41
I want to tune a neural network with dropout using h2o in R. Here I provide a reproducible example for the iris dataset. I'm not tuning eta and epsilon (i.e. the ADADELTA hyper-parameters), solely to make the computations faster.

    require(h2o)
    h2o.init()
    data(iris)
    iris = iris[sample(1:nrow(iris)), ]
    irisTrain = as.h2o(iris[1:90, ])
    irisValid = as.h2o(iris[91:120, ])
    irisTest = as.h2o(iris[121:150, ])
    hyper_params <- list(
      input_dropout_ratio = list(0, 0.15, 0.3),
      hidden_dropout_ratios = list(0, 0.15, 0.3, c(0,0), c(0.15,0.15), c(0.3,0.3)),
      hidden = list(64, c(32,32)))
    grid = h2o
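One plausible source of trouble with a grid like the one above is that hidden_dropout_ratios needs one value per hidden layer (and a "WithDropout" activation), while a Cartesian grid also pairs one-layer networks with two-value dropout lists and vice versa. The sketch below enumerates only the length-compatible combinations in plain Python. The function names are made up, and the option lists mirror the question.

```python
from itertools import product


def as_layers(value):
    """Treat a bare number as a one-element list."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def valid_dropout_combinations(hidden_options, dropout_options):
    """Keep only pairs where there is exactly one dropout ratio per hidden layer."""
    combos = []
    for hidden, ratios in product(hidden_options, dropout_options):
        hidden_layers, ratio_list = as_layers(hidden), as_layers(ratios)
        if len(hidden_layers) == len(ratio_list):
            combos.append(
                {"hidden": hidden_layers, "hidden_dropout_ratios": ratio_list}
            )
    return combos


hidden_options = [64, [32, 32]]
dropout_options = [0, 0.15, 0.3, [0, 0], [0.15, 0.15], [0.3, 0.3]]
combos = valid_dropout_combinations(hidden_options, dropout_options)
for combo in combos:
    print(combo)
```

Of the 12 raw pairs, 6 survive the filter. Each surviving group (one-layer and two-layer) could then be run as its own grid.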

How to save/load a trained model in H2O?

一曲冷凌霜 submitted on 2019-12-05 03:14:38
The user tutorial says:

1. Navigate to Data > View All
2. Choose to filter by the model key
3. Hit Save Model
4. Input for path: /data/h2o-training/...
5. Hit Submit

The problem is that I do not have this menu (H2O 3.0.0.26, web interface).

I am, unfortunately, not familiar with the web interface, but I can offer a workaround involving H2O in R. The functions

    h2o.saveModel(object, dir = "", name = "", filename = "", force = FALSE)

and

    h2o.loadModel(path, conn = h2o.getConnection())

should offer what you need. I will try to have a look at H2O Flow.

Update: I cannot find the possibility to explicitly save a model
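For readers using the Python client rather than R, the same workaround can be sketched with h2o.save_model and h2o.load_model. The wrapper name save_and_reload is made up for illustration. It needs the h2o package, a running cluster, and a trained model, so it is only defined here and not called.

```python
def save_and_reload(model, directory):
    """Save a trained H2O model to a directory and load it back."""
    # Imported lazily so this sketch can be loaded without h2o installed.
    import h2o

    # save_model returns the full path of the file it wrote.
    saved_path = h2o.save_model(model=model, path=directory, force=True)
    return h2o.load_model(saved_path)
```

A model saved this way is generally meant to be loaded by the same H2O version that wrote it.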