Question
In brief, I am trying to come up with an ML model (and later a DL model) for predicting the control input variable of my computer simulation model from all the other model input variables; let's call them environmental variables. Whether the simulation converges depends on the value of the control variable. The dataset was generated in a long, iterative simulation run covering different scenarios. It consists of all environmental inputs, the control input value, and the residual values used to determine convergence.
The problem is highly class-imbalanced. I'm using the macro-averaged F1 score to evaluate the results.
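To make the metric concrete, here is a minimal pure-Python sketch of macro-averaged F1 as I understand it: the per-class F1 scores are averaged without weighting by class frequency, so rare classes count as much as common ones. The function name `macro_f1` and the toy labels are illustrative assumptions, not part of my actual pipeline.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced example (made-up labels): always predicting the majority class
# gives 80% accuracy but a macro F1 of only 4/9.
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]
score = macro_f1(y_true, y_pred)
```

In the toy case the majority class gets F1 = 8/9 and the minority class gets 0, which is why this metric is harsher than accuracy on imbalanced data.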
The control variable is DISCRETE, but only because I had to discretize the input in the simulation environment. It takes values on an unequally spaced grid of 17 points.
The environmental variables are a mix of continuous numerical, discrete numerical (also due to input discretization in the simulation definition), and categorical.
I have decided on XGBoost as the ML algorithm. I'm training both a classification model and a regression model because I couldn't decide which is more applicable here. The regressor's continuous output is snapped to the closest value on the output grid. The results of the two are comparable, with a slight advantage for the classifier (macro-averaged F1 of 0.67 for the classifier vs 0.62 for the regressor).
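For clarity, the snapping step for the regressor can be sketched roughly as follows. This is a simplified illustration rather than my exact code: the helper `snap_to_grid`, the grid values, and the raw predictions are all made up for the example (the real grid has 17 points).

```python
from bisect import bisect_left


def snap_to_grid(value, grid):
    """Return the point of `grid` closest to `value`; `grid` must be sorted ascending."""
    i = bisect_left(grid, value)
    if i == 0:
        return grid[0]       # below the grid: clamp to the smallest point
    if i == len(grid):
        return grid[-1]      # above the grid: clamp to the largest point
    lower, upper = grid[i - 1], grid[i]
    # Ties go to the lower neighbour.
    return lower if value - lower <= upper - value else upper


# Hypothetical unequally spaced grid and raw regressor outputs.
GRID = [0.0, 0.1, 0.25, 0.5, 1.0, 2.0]
raw_predictions = [0.12, 0.4, 1.6, -0.3, 5.0]
snapped = [snap_to_grid(v, GRID) for v in raw_predictions]
```

After snapping, the regressor's predictions live on the same 17 labels as the classifier's, so both can be scored with the same macro-averaged F1.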
My question is: does it make sense to pose the problem as multiclass classification just because it seems to perform better, even though the problem seems to be naturally of the regression type? The next step is to develop a DL model with the fast.ai tabular toolbox. While the ML models were fast and easy to train, the DL model may take more time and effort. I wonder whether it makes sense to carry on developing two models, or whether I should decide on one approach.
I'd be grateful for any suggestions, especially if you have experience with similarly posed problems.
Source: https://stackoverflow.com/questions/60397041/prediction-of-a-discrete-numerical-target-multiclass-classifier-or-regressor