h2o

How to interpret results of h2o.predict

故事扮演 submitted on 2019-12-11 04:43:53
Question: After running h2o.deeplearning for a binary classification problem, I run h2o.predict and obtain the following results:

      predict        No       Yes
    1      No 0.9784425 0.0215575
    2     Yes 0.4667428 0.5332572
    3     Yes 0.3955087 0.6044913
    4     Yes 0.7962034 0.2037966
    5     Yes 0.7413591 0.2586409
    6     Yes 0.6800801 0.3199199

I was hoping to get a confusion matrix with only two rows, but this looks quite different. How do I interpret these results? Is there any way of getting something like a confusion matrix with
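A note on reading that output: each row is one observation, `predict` is the assigned label, and `No`/`Yes` are the class probabilities. Rows 4 to 6 are labelled `Yes` even though p(Yes) is below 0.5, so the labels evidently come from a cut-off other than 0.5 (as far as I know, H2O picks the threshold that maximises F1 for binomial models). The sketch below uses base R only; the `threshold` value and the `actual` labels are assumptions made up for illustration and are not part of the question.

```r
# Rows copied from the excerpt above: predicted label plus class probabilities.
pred <- data.frame(
  predict = c("No", "Yes", "Yes", "Yes", "Yes", "Yes"),
  No  = c(0.9784425, 0.4667428, 0.3955087, 0.7962034, 0.7413591, 0.6800801),
  Yes = c(0.0215575, 0.5332572, 0.6044913, 0.2037966, 0.2586409, 0.3199199),
  stringsAsFactors = FALSE
)

# Any cut-off between 0.0215575 and 0.2037966 reproduces the labels shown;
# 0.1 is an assumed value, not the one H2O actually used.
threshold <- 0.1
relabel <- ifelse(pred$Yes > threshold, "Yes", "No")

# Hypothetical ground truth, for illustration only.
actual <- factor(c("No", "Yes", "Yes", "No", "Yes", "No"),
                 levels = c("No", "Yes"))
predicted <- factor(pred$predict, levels = c("No", "Yes"))

# A two-row confusion matrix: rows are actual classes, columns are predictions.
confusion <- table(actual = actual, predicted = predicted)
print(confusion)
```

With a fitted model and a labelled H2O frame, h2o.confusionMatrix or h2o.performance on the new data should give the same kind of table without the manual step.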

What is the best way to store distances with H2O?

▼魔方 西西 submitted on 2019-12-11 04:43:10
Question: Suppose I have 2 data.frames and I want to calculate the Euclidean distance between all of their rows. My code is:

    set.seed(121)
    # Load library
    library(h2o)
    system.time({
      h2o.init()
      # Create the df and convert to h2o frame format
      df1 <- as.h2o(matrix(rnorm(7500 * 40), ncol = 40))
      df2 <- as.h2o(matrix(rnorm(1250 * 40), ncol = 40))
      # Create a matrix in which I will record the distances
      matrix1 <- as.h2o(matrix(0, nrow = 7500, ncol = 40))
      # Loop to calculate all the distances
      for (i in 1
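For comparison, here is a minimal base R sketch of the same computation without a loop, assuming the data fits in local memory. It is not the asker's H2O code: the small matrices `a` and `b` are stand-ins for `df1` and `df2`, and the result has one row per row of `a` and one column per row of `b`.

```r
set.seed(121)

# Small stand-ins for df1 (7500 x 40) and df2 (1250 x 40) from the question.
a <- matrix(rnorm(6 * 4), ncol = 4)
b <- matrix(rnorm(3 * 4), ncol = 4)

# Expand the squared distance:
#   ||a_i - b_j||^2 = ||a_i||^2 + ||b_j||^2 - 2 * (a_i . b_j)
sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)

# Rounding can leave tiny negative values, so clamp at zero before the root.
dist_ab <- sqrt(pmax(sq, 0))

dim(dist_ab)  # nrow(a) rows, nrow(b) columns
```

The design choice is to replace the row-by-row loop with one matrix product, which is usually much faster than filling the result cell by cell.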

h2o DRF unseen categorical values handling

坚强是说给别人听的谎言 submitted on 2019-12-11 04:27:13
Question: The documentation for DRF states:

    What happens when you try to predict on a categorical level not seen during training? DRF converts a new categorical level to a NA value in the test set, and then splits left on the NA value during scoring. The algorithm splits left on NA values because, during training, NA values are grouped with the outliers in the left-most bin.

Questions: So h2o converts unseen levels to NAs and then treats them the same way as NAs in the training data. But what if there
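The "unseen level becomes NA" step can be illustrated in base R. This is only an analogy for the conversion described in the quoted documentation, not H2O's internal code, and the colour values are made up for the example.

```r
# Levels seen during training (illustrative values).
train <- factor(c("red", "green", "blue", "green"))

# Test data with one unseen level ("purple") and one genuinely missing value.
test_raw <- c("green", "purple", "red", NA)

# Re-encoding against the training levels turns the unseen level into NA,
# so it becomes indistinguishable from a true missing value.
test <- factor(test_raw, levels = levels(train))

is.na(test)
```

After this step the model cannot tell "purple" apart from a missing value, which is why both follow the same NA branch at scoring time.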

Is it possible to build Deep Water/TensorFlow model in H2O without CUDA

流过昼夜 submitted on 2019-12-11 03:52:07
Question: My goal is to integrate H2O with TensorFlow on a machine without CUDA. As TensorFlow supports both CPU and GPU execution, I expect the H2O/TensorFlow integration to be possible without CUDA, but I'm confused by the mention of CUDA software in the Deep Water system specifications. I've tried to build a Deep Water/TensorFlow model in H2O Flow but failed. The steps I've performed: Downloaded H2O standalone JAR; Created data frame in H2O Flow as usual; Tried to build a model with Deep Water and

is multi-cpu supported by h2o-xgboost?

佐手、 submitted on 2019-12-11 03:47:27
Question: Is there a configuration that allows H2OXGBoostEstimator to run multithreaded, rather than in the minimal configuration with one CPU, with h2o version 3.15.0.4035?

Answer 1: The XGBoost implementation in H2O is multithreaded, like all other algorithms supported in H2O; however, it is platform dependent, as described in the H2O documentation. So if you try it on Linux and have all the supported libraries available, you will take advantage of distributed XGBoost; otherwise, as on OS X, you might

How to use : function in H2O ddply, R?

倖福魔咒の submitted on 2019-12-11 02:05:20
Question: Consider the code below:

    library(h2o)
    library(plyr)
    h2o.init()
    data1x <- "x row1
    1 1
    1 2
    1 3
    1 4
    2 1
    2 2
    2 3
    3 1
    4 2"
    data1x <- read.table(textConnection(data1x), header=TRUE)
    data1xH2O <- as.h2o(data1x)
    fun = function(df) { 1:2 }
    h2o.ddply(data1xH2O, "x", fun)
    ddply(data1x, "x", fun)

The h2o version of ddply gives the error below.

    ERROR: Unexpected HTTP Status code: 400 Bad Request (url = http://localhost:54321/99/Rapids)
    water.rapids.Rapids.IllegalASTException
    [1] "water.rapids.Rapids

as.h2o produces additional row when column names contain special characters

偶尔善良 submitted on 2019-12-11 01:46:06
Question: I have a matrix containing a non-ASCII character in a column name:

    df <- replicate(3, rnorm(5))
    colnames(df) <- c('A', 'B', 'Č')
    df
                  A          B          Č
    [1,]  1.6882234 0.37369538  0.1412783
    [2,] -1.4538027 0.37603834 -0.2108820
    [3,]  0.2878318 0.52661834 -0.4106152
    [4,]  1.0373949 1.41206911  0.5056488
    [5,] -2.3852925 0.05160573 -1.1288920

When I run the following, the result has one additional row and the column name changes:

    library(h2o)
    h2o.init()
    df_h2o <- as.h2o(df)
    df_h2o
              A   B "ÄŹĹĽËť
    1       NaN NaN NaN
    2 1.6882234 0

How to find the validation error as a function of the number of epochs on a fine scale using h2o.grid in R

天大地大妈咪最大 submitted on 2019-12-11 01:24:14
Question: I have a very noisy dataset with 2000 observations and 42 features (financial data), and I'm performing binary classification. I'm tuning the network using h2o.grid and providing a validation set. I've set epochs=1000, and training stops when the misclassification error does not improve by >=1% for 5 scoring events (stopping_rounds=5, stopping_tolerance=0.01). I'd like to know which value of epochs minimises the validation error. hyper_params = list

Error while using h2o.init() in R

删除回忆录丶 submitted on 2019-12-11 01:13:09
Question: I get the following error whenever I use h2o.init():

    localh2o<-h2o.init()
    H2O is not running yet, starting it now...
    Error in system2(command, "-version", stdout = TRUE, stderr = TRUE) :
      '""' not found
    In addition: Warning message:
    In .h2o.checkJava() :
      Found JRE at C:/Program Files (x86)/Java/jre7/bin/java.exe but H2O requires the JDK to run

I am running it on RStudio version 0.99.473 and R version 3.2.2, on a 64-bit OS.

Answer 1: The error message is pretty self-explanatory: Found JRE at C:/Program

Unexpected error in h2o.predict

回眸只為那壹抹淺笑 submitted on 2019-12-11 00:59:13
Question: I'm trying to use h2o.predict, but it's throwing a weird error. Any pointers on how to resolve it?

    ERROR: Unexpected HTTP Status code: 400 Bad Request (url = http://localhost:54321/99/Rapids)
    java.lang.IllegalArgumentException
    [1] "water.rapids.ASTTmpAssign.apply(ASTAssign.java:254)"
    [2] "water.rapids.ASTTmpAssign.apply(ASTAssign.java:248)"
    [3] "water.rapids.ASTExec.exec(ASTExec.java:46)"
    [4] "water.rapids.Session.exec(Session.java:56)"
    [5] "water.rapids.Exec.exec(Exec.java:63)"
    [6] "water.api