h2o

GLM model: h2o.predict gives very different results depending on number of rows used in the validation data

谁说胖子不能爱 submitted on 2019-12-24 00:54:59
Question: I built an H2O (v. 3.14) GLM model. However, when I check the predictions using h2o.predict, I get very different results depending on how many rows of the validation set I use. Calling h2o.predict on the first 10 rows, I got:
    # Predict using the first 10 lines in validation set
    h2o.predict(glm.test, df.valid[1:10,])
    # Result:
      predict        p0           p1
    1       0 0.9999224 7.756014e-05
    2       0 0.9962711 3.728930e-03
    3       0 0.9997378 2.622195e-04
    4       0 0.9999556 4.437544e-05
    5       0 0.9998994 1.006037e-04
    6       0 0.9999394 6.062479e

Perform data transformation on training data inside cross validation

寵の児 submitted on 2019-12-23 23:42:11
Question: I would like to do 5-fold cross-validation. In each fold, I have a training set and a validation set. However, due to a data issue, I need to transform my data. First, I transform the training data and train the model; then I apply the same transformation rule to the validation data and test the model. I need to redo the transformation for every fold. How would I do that in H2O? I can't find a way to separate out the transformation step. Does anyone have any suggestions? Source: https://stackoverflow.com/questions
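The per-fold procedure described in this question can be sketched in plain Python. This is only an illustration of the idea, not an H2O API: the helper names (`fit_rule`, `apply_rule`, `cross_validate`), the standardization rule, and the modulo-based fold assignment are all assumptions made for the example. The key point is that the rule is learned from the training fold only and then reused unchanged on the validation fold.

```python
from statistics import mean, pstdev


def fit_rule(train_values):
    """Learn the transformation rule (here: mean and std) from training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # fall back to 1.0 for constant columns
    return mu, sigma


def apply_rule(values, rule):
    """Apply a previously learned rule to any data set (training or validation)."""
    mu, sigma = rule
    return [(v - mu) / sigma for v in values]


def cross_validate(xs, ys, n_folds, train_fn, score_fn):
    """Refit the transformation inside every fold, then train and score."""
    # Hypothetical fold assignment: row i goes to fold i % n_folds.
    fold_of = [i % n_folds for i in range(len(xs))]
    scores = []
    for k in range(n_folds):
        train_x = [x for x, f in zip(xs, fold_of) if f != k]
        train_y = [y for y, f in zip(ys, fold_of) if f != k]
        valid_x = [x for x, f in zip(xs, fold_of) if f == k]
        valid_y = [y for y, f in zip(ys, fold_of) if f == k]
        rule = fit_rule(train_x)  # fitted on the training fold only
        model = train_fn(apply_rule(train_x, rule), train_y)
        scores.append(score_fn(model, apply_rule(valid_x, rule), valid_y))
    return scores
```

In practice, `train_fn` and `score_fn` would wrap whatever model training and scoring calls are in use; the loop structure stays the same.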

H2O Target Mean Encoder “frames are being sent in the same order” ERROR

好久不见. submitted on 2019-12-23 22:09:42
Question: I am following the H2O example to run target mean encoding in Sparkling Water (Sparkling Water 2.4.2 and H2O 3.22.04). It runs well in all of the following paragraphs:
    from h2o.targetencoder import TargetEncoder
    # change label to factor
    input_df_h2o['label'] = input_df_h2o['label'].asfactor()
    # add fold column for Target Encoding
    input_df_h2o["cv_fold_te"] = input_df_h2o.kfold_column(n_folds = 5, seed = 54321)
    # find all categorical features
    cat_features = [k for (k,v) in input_df_h2o.types.items()
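To make clear why a fold column is added before target encoding, here is a minimal pure-Python sketch of out-of-fold target mean encoding. It does not use or reproduce H2O's `TargetEncoder`; the function name `out_of_fold_target_encode` and the fallback to the global mean are assumptions chosen for illustration. Each row's category is replaced by the mean label of rows with the same category in the other folds, so a row never sees its own label.

```python
from collections import defaultdict


def out_of_fold_target_encode(categories, labels, folds):
    """Encode each row with the mean label of same-category rows in OTHER folds."""
    global_mean = sum(labels) / len(labels)
    total_sum, total_cnt = defaultdict(float), defaultdict(int)
    fold_sum, fold_cnt = defaultdict(float), defaultdict(int)
    # Accumulate label sums and counts per category and per (fold, category).
    for c, y, f in zip(categories, labels, folds):
        total_sum[c] += y
        total_cnt[c] += 1
        fold_sum[(f, c)] += y
        fold_cnt[(f, c)] += 1
    encoded = []
    for c, f in zip(categories, folds):
        # Remove the row's own fold from the category statistics.
        s = total_sum[c] - fold_sum[(f, c)]
        n = total_cnt[c] - fold_cnt[(f, c)]
        # Assumed fallback: use the global mean when no other fold has this category.
        encoded.append(s / n if n > 0 else global_mean)
    return encoded


# Tiny usage example with a hand-written fold column.
cats = ["a", "a", "b", "b"]
labels = [1, 0, 1, 1]
folds = [0, 1, 0, 1]
print(out_of_fold_target_encode(cats, labels, folds))
```

Because the encoding of a row depends on which fold it is in, the fold column and the data rows have to stay aligned throughout the encoding.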

H2O server crash

橙三吉。 submitted on 2019-12-23 12:57:57
Question: I've been working with H2O for the last year, and I am getting very tired of server crashes. I have given up on the "nightly releases", as my data sets crash them easily. Please tell me where I can download a stable release. Charles. My environment is: Windows 10 Enterprise, build 1607, with 64 GB memory. Java SE Development Kit 8 Update 77 (64-bit). Anaconda Python 3.6.2-0. I started the server with: localH2O = h2o.init(ip = "localhost", port = 54321, max_mem_size="12G", nthreads

How to convert r data frame to h2o object

ε祈祈猫儿з submitted on 2019-12-23 06:49:14
Question: I'm new to R and H2O, and I have tried to find a way to convert an R data frame to an H2O object. I have spent some time researching how to do this, with no luck. The other way around is possible and well documented, as follows:
    prosPath = system.file("extdata", "prostate.csv", package="h2o")
    prostate.hex = h2o.importFile(localH2O, path = prosPath)
    prostate.data.frame <- as.data.frame(prostate.hex)
But what I want is the complete opposite of this. I want to convert the R "prostate.data.frame" data object

Is there any document about using Python preprocess with h2o steam?

家住魔仙堡 submitted on 2019-12-23 03:50:31
Question: The H2O Steam website says that "Python preprocess with POJO as .war" is an option, but I cannot find any step-by-step examples of doing this. Where can I find more details? Or would it be better to do it in Java only? My situation is that I have a Python preprocessing program, which mainly uses pandas to do some data munging before calling H2O to train/score the model. I want to use H2O Steam as the scoring engine. The website mentions I can wrap the Python and H2O POJO/MOJO file together as a .war

Using H2OApi Java bindings to retrieve H2O Frame

可紊 submitted on 2019-12-23 03:12:57
Question: I work on a Java project using the H2O (3.10.4.7) REST API provided by the H2O Java bindings, and I have the following problem: we need to retrieve metadata from existing H2O Frames, such as the column names and the data types of those columns, preferably using H2oApi.class. Our approach is to fetch one row from the H2O Frame and then use it to get the metadata we need. So far I have tried the following:
    FramesV3 targetFrame = new FramesV3();
    targetFrame.frameId = frameKey; // key provided by import

unable to init h2o. can somebody help me with it

♀尐吖头ヾ submitted on 2019-12-22 19:10:54
Question:
    Checking whether there is an H2O instance running at http://localhost:54321..... not found.
    Attempting to start a local H2O server...
    Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)
    Starting server from C:\Users\Ramakanth\Anaconda2\lib\site-packages\h2o\backend\bin\h2o.jar
    Ice root: c:\users\ramaka~1\appdata\local\temp\tmpeaff8n
    JVM stdout: c:\users\ramaka~1\appdata\local\temp\tmpeaff8n\h2o_Ramakanth_started_from_python.out
    JVM stderr: c:\users\ramaka~1\appdata\local\temp

How to get different Variable Importance for each class in a binary h2o GBM in R?

﹥>﹥吖頭↗ submitted on 2019-12-22 01:25:48
Question: I'm trying to explore the use of a GBM with h2o for a classification problem, to replace a logistic regression (GLM). The non-linearity and interactions in my data make me think a GBM is more suitable. I've run a baseline GBM (see below) and compared its AUC against the AUC of the logistic regression. The GBM performs much better. In a classic linear logistic regression, one can see the direction and effect of each predictor (x) on the outcome variable (y). Now, I would like

R h2o load a saved model from disk in MOJO or POJO format

家住魔仙堡 submitted on 2019-12-21 04:34:25
Question: I'm catching up on h2o's MOJO and POJO model formats. I'm able to save a model in MOJO/POJO with
    h2o.download_mojo(model, path = "/media/somewhere/tmp") # ok
    h2o.download_pojo(model, path = "/media/somewhere/tmp") # ok
which writes an object with a name like mymodel.zip or mymodel.java to the directory. However, it's not clear to me how to read it back into the server in R. I tried:
    saved_model2 <- h2o.loadModel("/media/somewhere/tmp/mymodel.java") # not work
    saved_model3 <- h2o.loadModel("