I have a sequence s = [4,3,1,0,5]
and num_classes = 6
and I want to generate a Numpy matrix m
of shape (len(s), num_classes)
Broadcasting still works for a 3-D labels array; just do
(labels[:,:,:,None]==np.arange(num_classes))+0
You can use broadcasting -
(np.array(s)[:,None]==np.arange(num_classes))+0
Sample run -
In [439]: s
Out[439]: [4, 3, 1, 0, 5]
In [440]: num_classes = 9
In [441]: (np.array(s)[:,None]==np.arange(num_classes))+0
Out[441]:
array([[0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0]])
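To see why the one-liner above works, it may help to split it into steps and look at the shapes involved. The following is a minimal sketch using the question's `s` and `num_classes = 6`; the intermediate names `col`, `classes`, and `mask` are only illustrative and not part of the original answer.

```python
import numpy as np

s = [4, 3, 1, 0, 5]
num_classes = 6

col = np.array(s)[:, None]        # column vector, shape (5, 1)
classes = np.arange(num_classes)  # class ids 0..5, shape (6,)

# Comparing (5, 1) against (6,) broadcasts to a (5, 6) boolean array:
# mask[i, j] is True exactly when s[i] == j.
mask = col == classes

# Converting the booleans to integers gives the one-hot matrix;
# this is what the "+0" in the one-liner does.
m = mask.astype(int)
```

Using `astype(int)` instead of `+0` is a stylistic choice; both turn the boolean mask into zeros and ones.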
Since you want a single 1
per row, you can fancy-index using arange(len(s))
along the first axis, and using s
along the second:
s = [4,3,1,0,5]
n = len(s)
k = 6
m = np.zeros((n, k))
m[np.arange(n), s] = 1
m
=>
array([[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 1., 0., 0.],
[ 0., 1., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 1.]])
m.nonzero()
=> (array([0, 1, 2, 3, 4]), array([4, 3, 1, 0, 5]))
This can be thought of as using index (0,4), then (1,3), then (2,1), (3,0), (4,5).
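The fancy-indexing approach can be wrapped in a small reusable function. This is only a sketch: the name `one_hot` and its `dtype` parameter are hypothetical, and it assumes `s` is a 1-D sequence of integers in the range `0..num_classes-1`.

```python
import numpy as np

def one_hot(s, num_classes, dtype=int):
    """Hypothetical helper: one row per element of s, a single 1 per row."""
    s = np.asarray(s)
    m = np.zeros((s.size, num_classes), dtype=dtype)
    # Row indices 0..n-1 are paired elementwise with the column indices in s,
    # so exactly one entry per row is set.
    m[np.arange(s.size), s] = 1
    return m

m = one_hot([4, 3, 1, 0, 5], 6)
rows, cols = m.nonzero()  # cols recovers the original sequence
```

Passing an integer `dtype` avoids the float output shown above, which may or may not matter depending on how the matrix is used afterwards.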
The accepted answer won't work if your one-hot encoding adds an extra dimension to a multidimensional array. Using broadcasting will give you unexpected results (see https://scipy.github.io/old-wiki/pages/Cookbook/Indexing). The solution below is elegant but not very efficient.
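labels.shape # (80, 256, 10)
def b(labels):
    # labels must have an integer dtype; num_classes is read from the enclosing scope
    onehot = np.zeros(labels.shape + (num_classes,), dtype=float)
    # This is the slow, dumb line: it builds an index for every (i, j, k) position
    (onehot_i, onehot_j, onehot_k) = np.ones(labels.shape).nonzero()
    # Class id at each position, used as the index along the new last axis
    thehotone = labels[onehot_i, onehot_j, onehot_k]
    onehot[onehot_i, onehot_j, onehot_k, thehotone] = 1
    return onehot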
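As a sanity check, the index-based idea can be compared against the broadcasting expression from the comment above on a small array. This is a sketch under assumptions: the `(2, 3, 5)` shape, `num_classes = 4`, and the random labels are stand-ins for the `(80, 256, 10)` array, and `np.indices` is swapped in for the `np.ones(...).nonzero()` step to build the position indices.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 4
# Small integer label array standing in for the (80, 256, 10) one.
labels = rng.integers(0, num_classes, size=(2, 3, 5))

# Index-based version: one index array per axis, each with labels' shape.
onehot_idx = np.zeros(labels.shape + (num_classes,), dtype=float)
i, j, k = np.indices(labels.shape)
onehot_idx[i, j, k, labels] = 1

# Broadcasting version: add a trailing length-1 axis and compare to class ids.
onehot_bc = (labels[..., None] == np.arange(num_classes)).astype(float)

# Whether the two constructions agree on this input.
same = np.array_equal(onehot_idx, onehot_bc)
```

On this input the two constructions agree, which suggests that appending the new axis last (`labels[..., None]`) is what keeps the broadcast comparison well behaved for multidimensional labels.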