h2o

permutation importance in h2o random Forest

混江龙づ霸主 submitted on 2020-01-06 05:47:08
Question: The CRAN implementation of random forests offers both variable importance measures: the Gini importance and the widely used permutation importance, defined as follows. For classification, it is the increase in the percentage of times a case is OOB and misclassified when the variable is permuted. For regression, it is the average increase in squared OOB residuals when the variable is permuted. By default, h2o.varimp() computes only the former. Is there really no option in h2o to get the alternative
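The regression definition quoted in the question can be illustrated with a minimal sketch. It is not H2O's or the CRAN package's implementation: it assumes a fixed toy model and a plain evaluation set in place of a forest's OOB samples, and the names `mse`, `permutation_importance`, and `toy_model` are hypothetical.

```python
import random


def mse(model, rows, targets):
    """Mean squared residual of `model` over the given rows."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Average increase in squared residuals when each feature is permuted.

    Assumption: `rows` stands in for the OOB cases of a random forest.
    """
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy data: the target depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
data_rng = random.Random(42)
rows = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
targets = [3.0 * r[0] + 1.0 * r[1] for r in rows]


def toy_model(row):
    """Hypothetical fitted model that happens to match the data exactly."""
    return 3.0 * row[0] + 1.0 * row[1]


importances = permutation_importance(toy_model, rows, targets)
print(importances)
```

In this toy setup the ranking follows the coefficients: permuting feature 0 hurts the most, feature 1 less, and feature 2 leaves the error unchanged because the model never reads it.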

Starting h2o in hadoop cluster with specific connection node url

a 夏天 submitted on 2020-01-05 03:54:06
Question: Is there a way to start an h2o instance interface on a specific node of a cluster? For example, when running the command
    $ hadoop jar h2odriver.jar -nodes 4 -mapperXmx 6g -output hdfsOutputDir
from, say, the h2o install directory on node 172.18.4.62, I get the (abridged) output:
    ....
    H2O node 172.18.4.65:54321 reports H2O cluster size 1
    H2O node 172.18.4.66:54321 reports H2O cluster size 1
    H2O node 172.18.4.67:54321 reports H2O cluster size 1
    H2O node 172.18.4.63:54321 reports H2O

Empty list returned when accessing automl leader via @leader@model

微笑、不失礼 submitted on 2020-01-04 09:57:46
Question: Running h2o.automl() returns a single model in the leaderboard; however, trying to access the actual model via @leader@model raises the following error:
    Error in is.H2OFrame(x) : trying to get slot "metrics" from an object of a basic class ("NULL") with no slots
In addition, calling h2o.predict() on the leader model gives the error message:
    Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, : ERROR MESSAGE: Object 'dummy' not found in function: predict for

String UTF-8 encoding with cyrillic in H2O

[亡魂溺海] submitted on 2020-01-04 07:01:44
Question: I load a CSV file in UTF-8 encoding that contains Cyrillic strings. After parsing it in the Flow interface, I see unreadable symbols like "пїўпѕЂпѕ™пїђпѕ" instead of Cyrillic. How can I use UTF-8 Cyrillic strings in H2O?
Answer 1: This appears to be a bug in the Flow interface, but only in the setupParse command. If you continue through and do the import, the data gets imported correctly. I've reported the bug, with test data and screenshots (taken in Firefox), here: https://0xdata.atlassian.net/browse/PUBDEV-4640

Not Able to Run H2o Function

泪湿孤枕 submitted on 2020-01-04 05:57:59
Question: I was able to install h2o fine (in R), but I get the following error when I run h2o.init():
    h2o.init()
    H2O is not running yet, starting it now...
    Error in value[3L] : You have a 32-bit version of Java. H2O works best with 64-bit Java. Please download the latest Java SE JDK 7 from the following URL: http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
I updated the Java SE JDK to version 7 (64-bit) and am still receiving this error. Why is this?
Answer 1: The

R h2o.glm - issue with max_active_predictors

孤人 submitted on 2020-01-04 05:46:11
Question: I wanted to estimate an h2o.glm model with a pre-defined maximum number of active predictors (a non-default max_active_predictors parameter). Here is the example:
    set.seed(123)
    par1 <- matrix(c(100, 200, 300, 400, 40, 30, 20, 10), 4, 2)
    par2 <- c(1000, 2000, 3000, 4000)
    coef <- c(0.5, -0.5, 1, -1, 1.5, -1.5, 2, -2)
    mat <- as.data.frame(cbind(apply(par1, 1, function(x) rnorm(1000, mean = x[1], sd = x[2])), sapply(par2, function(x) rpois(1000, lambda = x))))
    mat$Y <- as.numeric(t(coef %*% t(mat)))
    h2o

H2O running slower than data.table R

人盡茶涼 submitted on 2020-01-04 05:31:29
Question: How is it possible that storing data into an H2O matrix is slower than storing it into a data.table?
    # Packages used: "h2o" and "data.table"
    library(h2o)
    library(data.table)
    # Start H2O before creating any frames
    h2o.init(nthreads=-1)
    # Create the matrices
    matrix1<-data.table(matrix(rnorm(1000*1000),ncol=1000,nrow=1000))
    matrix2<-h2o.createFrame(1000,1000)
    # data.table store
    for(i in 1:1000){ matrix1[i,1]<-3 }
    # H2O frame store
    for(i in 1:1000){ matrix2[i,1]<-3 }
Thanks!
Answer 1: H2O has a client/server architecture. (See http://docs.h2o.ai

h2o predictions sometimes fail when response variable not present in test set

百般思念 submitted on 2020-01-04 05:21:07
Question: When predicting on a test set where the response variable is not present, h2o fails in various ways if one-hot encoding was used for a factor variable during training, whether specified implicitly when training a GLM or explicitly in other methods. This error is present in R 3.4.0 and h2o 3.12.0.1. We have also tested with h2o 3.10.3.3.
    library(h2o)
    localH2O = h2o.init()
    prostatePath = system.file("extdata", "prostate.csv", package = "h2o")
    prostate.hex = read