Usually, when you want to get a one-hot encoding for classification in machine learning, you have an array of indices.
import numpy as np
nb_classes = 6
targets = np.array([[2, 3, 4, 0]]).reshape(-1)
one_hot_targets = np.eye(nb_classes)[targets]
The one_hot_targets is now
array([[ 0., 0., 1., 0., 0., 0.],
       [ 0., 0., 0., 1., 0., 0.],
       [ 0., 0., 0., 0., 1., 0.],
       [ 1., 0., 0., 0., 0., 0.]])
The .reshape(-1) is there to make sure the labels are in the right format (you might also have [[2], [3], [4], [0]]). The -1 is a special value which means "put all remaining elements in this dimension". As there is only one dimension, it flattens the array.
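To see what .reshape(-1) does to the two label layouts mentioned above, here is a small sketch (the variable names are made up for illustration):

```python
import numpy as np

row = np.array([[2, 3, 4, 0]])           # shape (1, 4)
column = np.array([[2], [3], [4], [0]])  # shape (4, 1)

# Both layouts collapse to the same flat index array
flat_row = row.reshape(-1)
flat_column = column.reshape(-1)
print(flat_row.shape, flat_column.shape)  # (4,) (4,)

# Without flattening, the column layout keeps an extra axis
print(np.eye(6)[column].shape)       # (4, 1, 6)
print(np.eye(6)[flat_column].shape)  # (4, 6)
```

Flattening first means you always end up with one row per label, whatever layout the labels arrived in.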
Copy-Paste solution
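def get_one_hot(targets, nb_classes):
    # Convert first so that plain lists work and .shape is available
    targets = np.asarray(targets)
    res = np.eye(nb_classes)[targets.reshape(-1)]
    # Restore the original shape, with the class axis appended
    return res.reshape(list(targets.shape) + [nb_classes])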
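A short usage sketch, with the function repeated so the snippet runs on its own. The np.asarray conversion is my own assumption about how you would want lists handled, and the example inputs are made up:

```python
import numpy as np

def get_one_hot(targets, nb_classes):
    # Same idea as above; np.asarray lets plain lists work too
    targets = np.asarray(targets)
    res = np.eye(nb_classes)[targets.reshape(-1)]
    return res.reshape(list(targets.shape) + [nb_classes])

flat = get_one_hot(np.array([2, 3, 4, 0]), 6)
batch = get_one_hot(np.array([[0, 1], [2, 3]]), 4)

print(flat.shape)   # (4, 6)
print(batch.shape)  # (2, 2, 4)
print(batch[1, 0])  # [0. 0. 1. 0.]
```

The final reshape is what lets this work for targets of any dimensionality: the output has the shape of the input with one extra axis of length nb_classes at the end.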
Package
You can use mpu.ml.indices2one_hot. It's tested and simple to use:
import mpu.ml
one_hot = mpu.ml.indices2one_hot([1, 3, 0], nb_classes=5)