Question
I know that I can access the predictor names of an H2OModel via the @parameters slot, but can I access the predictor data types?
I'm trying to generate an input schema for my H2OModel, and right now I have to cross-reference the training_frame and get the data types from there. Obviously, this would be a problem if my training_frame were no longer in memory.
Here's my current approach:
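    getInputSchema <- function(model) {
      library(jsonlite)
      library(h2o)
      # Fetch the training frame by the key stored in the model parameters
      training_frame <- h2o.getFrame(model@parameters$training_frame)
      # Name each column type by its column, then keep only the predictors
      types <- setNames(h2o.getTypes(training_frame), names(training_frame))
      toJSON(types[model@parameters$x], auto_unbox = TRUE)
    }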
And an example of how it could be used:
    #--- Example dataset ----
    library(h2o)
    library(data.table)
    options('h2o.use.data.table' = TRUE)
    library(rpart.plot) # for 'ptitanic' dataset
    h2o.init()
    data(ptitanic, package = 'rpart.plot')
    survival <- as.h2o(
      setDT(ptitanic)[, `:=`(age   = as.numeric(age),
                             sibsp = as.integer(sibsp),
                             parch = as.integer(parch))]
    )
    #--- Example model -----
    fit <- h2o.gbm(x = c('pclass', 'sex', 'age', 'sibsp', 'parch'),
                   y = 'survived',
                   training_frame = survival)
    #--- Example use ----
    getInputSchema(fit)
    # {"pclass":"enum","sex":"enum","age":"real","sibsp":"int","parch":"int"}
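To make the lookup inside getInputSchema easier to follow, here is a minimal sketch of just the name-to-type mapping, runnable without an H2O cluster. The vectors col_names, col_types and predictors are hypothetical stand-ins for names(training_frame), h2o.getTypes(training_frame) and model@parameters$x. It assumes h2o.getTypes returns a list of type strings in column order. It does not address the missing-frame case.

```r
library(jsonlite)

# Hypothetical stand-ins (assumptions, not real H2O output)
col_names  <- c("survived", "pclass", "sex", "age")  # names(training_frame)
col_types  <- list("enum", "enum", "enum", "real")   # h2o.getTypes(training_frame)
predictors <- c("pclass", "sex", "age")              # model@parameters$x

# Name each type by its column, then keep only the predictor columns
type_by_col <- setNames(col_types, col_names)
schema <- toJSON(type_by_col[predictors], auto_unbox = TRUE)
print(schema)
```

Because the subsetting is by name, the response column and any unused columns drop out of the schema regardless of their position in the frame.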
I'm looking for a solution that I could apply to an existing model where the dataset referenced in training_frame is missing.
Source: https://stackoverflow.com/questions/52431210/get-predictor-data-types-from-h2omodel