h2o

h2o.ai - Flow UI not detecting date formatting to convert to Time

橙三吉。 submitted on 2020-05-18 21:08:32
Question: I am using h2o's Flow UI to upload a CSV file to train a model on. When I upload the file and edit the column types before parsing, this is what I am setting a date column to: After parsing, the data summary shows that all of the date column values are 'missing', and viewing the data with the View Data button shows that they are indeed blanks (.). Looking here for acceptable date formats, it says that: "The first format is for dates formatted as yyyy-MM-dd. Year is a four-digit number, the
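The question does not say which format the date column uses, so the following is only a sketch of one possible workaround: rewriting the column as yyyy-MM-dd (the first format named in the quoted documentation) before uploading the file. The function name, the sample data, and the assumed input format `%m/%d/%Y` are all hypothetical and would need to be adapted to the real file.

```python
import csv
import io
from datetime import datetime


def normalize_date_column(csv_text, column, input_format):
    """Return csv_text with `column` rewritten as yyyy-MM-dd.

    Values that do not match `input_format` are left unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        try:
            parsed = datetime.strptime(row[column], input_format)
            row[column] = parsed.strftime("%Y-%m-%d")
        except (ValueError, TypeError):
            pass  # keep the original value if it cannot be parsed
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample: the input format MM/dd/yyyy is an assumption.
sample = "id,signup\n1,05/18/2020\n2,not a date\n"
normalized = normalize_date_column(sample, "signup", "%m/%d/%Y")
print(normalized)
```

Whether this resolves the 'missing' values in Flow depends on the actual contents of the column, which the question excerpt does not show.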

h2o.ai - Flow UI not detecting date formatting to convert to Time

吃可爱长大的小学妹 submitted on 2020-05-18 21:07:46
Question: I am using h2o's Flow UI to upload a CSV file to train a model on. When I upload the file and edit the column types before parsing, this is what I am setting a date column to: After parsing, the data summary shows that all of the date column values are 'missing', and viewing the data with the View Data button shows that they are indeed blanks (.). Looking here for acceptable date formats, it says that: "The first format is for dates formatted as yyyy-MM-dd. Year is a four-digit number, the

How to pass dynamic column name to h2o arrange function

让人想犯罪 __ submitted on 2020-05-15 04:28:47
Question: Given an h2o data frame df with a numeric column col, sorting df by col works if the column name is given literally: h2o.arrange(df, "col") But the sort doesn't work when I pass the column name in a variable: var <- "A" h2o.arrange(df, var) I do not want to hard-code the column name. Is there any way to solve this? Thanks. Added an example per Darren's request: library(h2o) h2o.init() df <- as.h2o(cars) var <- "dist" h2o.arrange(df, var) # got error h2o.arrange(df, "dist") # works Answer 1: It turns

xgboost in pysparkling water throws an error: XGBoost is not available on all nodes

那年仲夏 submitted on 2020-04-18 05:46:08
Question: I am trying to run XGBoost from the H2O package in a Spark cluster. I am using h2o on an on-prem cluster on a Red Hat Enterprise Linux Server, version: '3.10.0-1062.9.1.el7.x86_64'. I start the H2O cluster inside the Spark environment: .appName('APP1')\ .config('spark.executor.memory', '15g')\ .config('spark.executor.cores', '8')\ .config('spark.executor.instances','5')\ .config('spark.yarn.queue', "DS")\ .config('spark.yarn.executor.memoryOverhead', '1096')\ .enableHiveSupport()\ .getOrCreate() from

H2O Python - how to get variable types, getTypes equivalent

送分小仙女□ submitted on 2020-01-24 10:01:30
Question: What is the Python equivalent of getTypes in R? I'm trying to extract the variable type of each column of an H2O data frame (enum, string, int, etc.). Also, more broadly, can someone send me a link to documentation listing all the properties and functions of data frames in Python? Things like df.nrow, df.shape, etc. I have a really hard time finding a clear source. Answer 1: You can get the documentation for H2O's Python API (specifically for H2OFrame methods) here: http://docs.h2o.ai/h2o/latest
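As a hedged illustration of the question above: to my knowledge, an `H2OFrame` in the Python API exposes a `types` property that returns a dictionary mapping each column name to its type string, which is the closest counterpart to R's getTypes. The sketch below groups columns by type from such a mapping. The helper `columns_by_type` and the stand-in object are hypothetical and exist only so the example runs without an H2O cluster.

```python
from collections import defaultdict
from types import SimpleNamespace


def columns_by_type(frame):
    """Group column names by type.

    `frame` is any object whose `.types` attribute maps
    column name -> type string (as H2OFrame.types is assumed to do).
    """
    grouped = defaultdict(list)
    for name, col_type in frame.types.items():
        grouped[col_type].append(name)
    return dict(grouped)


# Stand-in for an H2OFrame; the column names and types are made up.
stand_in = SimpleNamespace(
    types={"age": "int", "sex": "enum", "name": "string", "height": "real"}
)
grouped = columns_by_type(stand_in)
print(grouped)
```

With a running cluster, the same helper should accept a real frame, for example one returned by `h2o.import_file`, in place of the stand-in.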

How to map over DataFrame in spark to extract RowData and make predictions using h2o mojo model

ぐ巨炮叔叔 submitted on 2020-01-23 12:36:13
Question: I have a saved h2o model in MOJO format, and now I am trying to load it and use it to make predictions on a new dataset (df) as part of a Spark app written in Scala. Ideally, I wish to append a new row to the existing DataFrame containing the class probability based on this model. I can see how to apply a MOJO to an individual row already in RowData format (as per the answer here), but I am not sure how to map over an existing DataFrame so that it is in the right format to make predictions

How to load table from SQL server using H2o in R?

空扰寡人 submitted on 2020-01-23 02:41:10
Question: I tried to load a table into R using h2o but got the following error: my_data <- h2o.import_sql_table(my_sql_conn, table, username, password) ERROR: Unexpected HTTP Status code: 500 Server Error (url = http://localhost:54321/99/ImportSQLTable) java.lang.RuntimeException [1] "java.lang.RuntimeException: SQLException: No suitable driver found for jdbc:mysql://10.140.20.29/MySQL?&useSSL=false\nFailed to connect and read from SQL database with connection_url: jdbc:mysql://10.140.20.29/MySQL?&useSSL

Unable to convert data frame to h2o object

人走茶凉 submitted on 2020-01-22 12:03:48
Question: I am running the h2o package in RStudio version 0.99.447 on OS X 10.9.5. I would like to set up a local cluster within R, following the steps of this tutorial: http://blenditbayes.blogspot.co.uk/2014/07/things-to-try-after-user-part-1-deep.html The first step does not seem to be a problem. What does seem to be a problem is converting my data frame to a proper h2o object. library(mlbench) dat = BreastCancer[,-1] # reading in data set from mlbench package library(h2o) localH2O <- h2o

Unable to convert data frame to h2o object

蹲街弑〆低调 submitted on 2020-01-22 11:59:08
Question: I am running the h2o package in RStudio version 0.99.447 on OS X 10.9.5. I would like to set up a local cluster within R, following the steps of this tutorial: http://blenditbayes.blogspot.co.uk/2014/07/things-to-try-after-user-part-1-deep.html The first step does not seem to be a problem. What does seem to be a problem is converting my data frame to a proper h2o object. library(mlbench) dat = BreastCancer[,-1] # reading in data set from mlbench package library(h2o) localH2O <- h2o

H2O GLM model: saved MOJO's prediction is very different when running on the same validation data

吃可爱长大的小学妹 submitted on 2020-01-16 13:39:07
Question: I built a GLM model using H2O (ver 3.14) in R. Please note that the training data contains integers and also many NAs, which I handle with MeanImputation. glm <- h2o.glm( training_frame = train.truth, x=getColNames(train.truth), y="isFemale", family = "binomial", missing_values_handling = "MeanImputation", seed = 1000000) I then use a validation data set to look at the performance, and the precision looks good to me: h2o.performance(glm, newdata=valid.truth)%>% h2o.confusionMatrix() Confusion