Neural network with categorical variables (enum) as inputs

暗喜 2021-02-06 01:08

I'm trying to solve some machine-learning problems using neural networks, mostly with NEAT (NeuroEvolution of Augmenting Topologies).

Some of

1 answer
  • 2021-02-06 01:42

    Unfortunately, there is no perfect solution; each approach leads to some kind of problem:

    • Your solution adds topology, as you mentioned. It may not be that bad, since a NN can fit arbitrary functions and represent "ifs", but in many cases it will be, because NNs often get stuck in local minima.
    • You can encode your data in the form is_categorical_feature_i_equal_j, which won't induce any additional topology but will grow the number of input features, adding one per possible value. So instead of "species" you get the features "is_lion", "is_leopard", etc., and only one of them equals 1 at a time.
    • If you have a large amount of data compared to the number of possible categorical values (for example, 10,000 data points and only 10 possible values), you can also split the problem into 10 independent ones, each trained on one particular value (so we have a "neural network for lions", a "neural network for jaguars", etc.).
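
    The indicator encoding from the second bullet can be sketched roughly as follows. This is a minimal illustration in plain Python, not tied to any NEAT library; the species list and the helper names `one_hot` and `encode_sample` are assumptions made up for the example.

```python
# Hypothetical category values; the species names are illustrative only.
SPECIES = ["lion", "leopard", "jaguar"]


def one_hot(value, categories):
    """Return one 0/1 indicator per category, with exactly one set to 1."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def encode_sample(numeric_features, species):
    """Concatenate the numeric inputs with the is_<species> indicators."""
    return list(numeric_features) + one_hot(species, SPECIES)


# Two numeric inputs plus three indicators give five network inputs.
encoded = encode_sample([0.5, 1.2], "leopard")
print(encoded)  # [0.5, 1.2, 0.0, 1.0, 0.0]
```

    Each categorical variable adds as many inputs as it has possible values, which is the cost this encoding pays for not asking the network to learn the "ifs" itself.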

    The first two approaches are the two "extreme" cases: one is computationally very cheap but can lead to high bias, while the second introduces a lot of complexity but should not influence the classification process itself. The last one is rarely usable (because it assumes a small number of categorical values), yet it is quite reasonable from a machine-learning point of view.
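
    The third approach, one independent model per categorical value, might look something like the sketch below. All names here (`split_by_category`, `train_per_category`, `mean_trainer`) are hypothetical, and `mean_trainer` is only a stand-in for a real training run such as a NEAT evolution; it simply predicts the mean target of its group.

```python
from collections import defaultdict


def split_by_category(samples):
    """Group (features, category, target) samples by their category value."""
    groups = defaultdict(list)
    for features, category, target in samples:
        groups[category].append((features, target))
    return dict(groups)


def train_per_category(samples, train_fn):
    """Train one independent model per category value using train_fn."""
    groups = split_by_category(samples)
    return {category: train_fn(rows) for category, rows in groups.items()}


def predict(models, features, category):
    """Dispatch to the model that was trained for this category."""
    return models[category](features)


def mean_trainer(rows):
    """Placeholder trainer (assumption): always predicts the group's mean target."""
    mean = sum(target for _, target in rows) / len(rows)
    return lambda features: mean


samples = [
    ([1.0], "lion", 2.0),
    ([2.0], "lion", 4.0),
    ([1.0], "jaguar", 10.0),
]
models = train_per_category(samples, mean_trainer)
print(predict(models, [1.5], "lion"))
```

    The categorical variable never reaches any network as an input; it only selects which model to use, which is why this needs enough data for every value.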
