I want to predict the response variable, and it has 700 classes.
Deep learning model parameters
from h2o.estimators import deeplearning
dl_model = deepl
When you say you have 700 classes, do you mean your response variable is made up of arrays of those 700 unique numbers? Because you gave this example:
Response variable tags:
[74]
[156, 89]
[153, 13, 133, 40]
[150]
[474, 277, 113]
[181, 117]
[15, 87, 8, 11]
H2O cannot predict arrays. Each unique combination of numbers will be counted as a single class. You therefore likely have a lot more than 700 classes, from H2O's point of view.
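To make the difference concrete, here is a rough sketch in plain Python (no H2O involved) that counts classes both ways on the example rows from the question. It only mimics the idea that a categorical column gets one level per distinct cell value; it is not H2O's actual parsing code. The last two rows are hypothetical, added just to show a repeat and a reordering.

```python
from collections import Counter

# The example 'tags' values from the question, one string per cell,
# roughly as they would look in a single column of the input file.
tags_column = [
    "[74]",
    "[156, 89]",
    "[153, 13, 133, 40]",
    "[150]",
    "[474, 277, 113]",
    "[181, 117]",
    "[15, 87, 8, 11]",
    "[74]",        # hypothetical repeat of the first row
    "[89, 156]",   # hypothetical: same numbers as row 2, different order
]

# What a categorical column sees: one level per distinct cell value.
levels = Counter(tags_column)

# What the "700 classes" presumably refers to: the individual numbers.
individual = {
    int(n) for cell in tags_column for n in cell.strip("[]").split(",")
}

print(len(levels), "distinct levels (whole-array strings)")
print(len(individual), "distinct individual numbers")
```

Note that "[156, 89]" and "[89, 156]" end up as two different levels here, so with real data the number of distinct combinations can grow far beyond the number of individual values.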
If you look at the data over on Flow ( http://127.0.0.1:54321/ ) it will tell you how many unique levels there are in 'tags'. (You can also get it from the Python API: describe() on the frame shows it, and categories() on the column in question will list all the levels.)
Your next question is going to be what to do about this. I suggest making that a new question, where you explain what the 700 values and the arrays represent; it is almost certainly going to involve some domain-specific pre-processing. However, you could try playing with categorical_encoding: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/categorical_encoding.html