generate sequence by indices / one-hot encoding

时光说笑 2020-12-21 13:36

I have a sequence s = [4,3,1,0,5] and num_classes = 6, and I want to generate a NumPy matrix m of shape (len(s), num_classes)

4 Answers
  • 2020-12-21 13:53

    Broadcasting still works for a 3-D labels array; just do

    (labels[:,:,:,None]==np.arange(num_classes))+0
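
As a minimal sketch of the N-dimensional case, assuming `labels` is an integer array whose values all lie in `range(num_classes)`. The shape and the random data below are made up for illustration, and `labels[..., None]` is just shorthand for `labels[:,:,:,None]` on a 3-D array:

```python
import numpy as np

num_classes = 6
rng = np.random.default_rng(0)
# Hypothetical 3-D label array; any integer array works the same way.
labels = rng.integers(0, num_classes, size=(2, 3, 4))

# labels[..., None] has shape (2, 3, 4, 1); comparing it against
# np.arange(num_classes) of shape (6,) broadcasts to (2, 3, 4, 6).
onehot = (labels[..., None] == np.arange(num_classes)).astype(int)

print(onehot.shape)  # (2, 3, 4, 6)
```

Each slice along the last axis should contain exactly one 1, at the position given by the corresponding label.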

  • 2020-12-21 14:03

    You can use broadcasting -

    (np.array(s)[:,None]==np.arange(num_classes))+0
    

    Sample run -

    In [439]: s
    Out[439]: [4, 3, 1, 0, 5]
    
    In [440]: num_classes = 9
    
    In [441]: (np.array(s)[:,None]==np.arange(num_classes))+0
    Out[441]: 
    array([[0, 0, 0, 0, 1, 0, 0, 0, 0],
           [0, 0, 0, 1, 0, 0, 0, 0, 0],
           [0, 1, 0, 0, 0, 0, 0, 0, 0],
           [1, 0, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 1, 0, 0, 0]])
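
To see why the comparison yields a one-hot matrix, here is a rough step-by-step sketch of the shapes involved. The intermediate names `col`, `classes` and `mask` are only for illustration, and `num_classes = 6` is taken from the question rather than the sample run above:

```python
import numpy as np

s = [4, 3, 1, 0, 5]
num_classes = 6

col = np.array(s)[:, None]        # shape (5, 1): one label per row
classes = np.arange(num_classes)  # shape (6,): one class id per column

mask = col == classes             # broadcasts to a (5, 6) boolean array
m = mask.astype(int)              # same values as mask + 0

print(m)
```

Row i of `mask` is True only in column `s[i]`, so converting the booleans to integers gives the encoding.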
    
  • 2020-12-21 14:06

    Since you want a single 1 per row, you can fancy-index using arange(len(s)) along the first axis, and using s along the second:

    s = [4,3,1,0,5]
    n = len(s)
    k = 6
    m = np.zeros((n, k))
    m[np.arange(n), s] = 1
    m
    => 
    array([[ 0.,  0.,  0.,  0.,  1.,  0.],
           [ 0.,  0.,  0.,  1.,  0.,  0.],
           [ 0.,  1.,  0.,  0.,  0.,  0.],
           [ 1.,  0.,  0.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  0.,  0.,  1.]])
    
    m.nonzero()
    => (array([0, 1, 2, 3, 4]), array([4, 3, 1, 0, 5]))
    

    This can be thought of as using index (0,4), then (1,3), then (2,1), (3,0), (4,5).
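
The pairing of row and column indices can be made explicit with a small sketch. The names `rows`, `pairs` and `recovered` are illustrative, and `dtype=int` is an optional choice here to get integer zeros and ones instead of floats:

```python
import numpy as np

s = [4, 3, 1, 0, 5]
k = 6

rows = np.arange(len(s))
# The (row, column) positions that the fancy index touches.
pairs = [(int(r), int(c)) for r, c in zip(rows, s)]
print(pairs)  # [(0, 4), (1, 3), (2, 1), (3, 0), (4, 5)]

m = np.zeros((len(s), k), dtype=int)
m[rows, s] = 1  # sets m[0, 4], m[1, 3], m[2, 1], m[3, 0], m[4, 5]

# argmax along each row recovers the original sequence.
recovered = m.argmax(axis=1)
```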

  • 2020-12-21 14:06

    The accepted answer won't work if your one-hot encoding has to add an extra dimension to a multidimensional array, and broadcasting can give you unexpected results (see https://scipy.github.io/old-wiki/pages/Cookbook/Indexing ). The solution below is elegant but not very efficient.

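    labels.shape # (80, 256, 10)

    def b(labels):
        # One extra trailing axis holds the one-hot vector for each label.
        onehot = np.zeros(labels.shape + (num_classes,), dtype=float)

        # This is the slow, dumb line: it lists the index of every element.
        (onehot_i, onehot_j, onehot_k) = np.ones(labels.shape).nonzero()

        # Flat array of the label stored at each (i, j, k) position.
        thehotone = labels[onehot_i, onehot_j, onehot_k]
        onehot[onehot_i, onehot_j, onehot_k, thehotone] = 1
        return onehot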
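
One possible way to avoid building the full index arrays is `np.put_along_axis`, which assumes NumPy 1.15 or later. This is a different technique from the function above, shown only as a sketch; the small shape and random labels are made up for illustration:

```python
import numpy as np

num_classes = 10
rng = np.random.default_rng(0)
# Hypothetical 3-D integer labels with values in range(num_classes).
labels = rng.integers(0, num_classes, size=(4, 5, 3))

onehot = np.zeros(labels.shape + (num_classes,), dtype=float)
# For every (i, j, k), write 1.0 at position labels[i, j, k] on the last axis.
np.put_along_axis(onehot, labels[..., None], 1.0, axis=-1)

# Compare against the broadcast comparison on this sample.
broadcast = (labels[..., None] == np.arange(num_classes)).astype(float)
same_as_broadcast = np.array_equal(onehot, broadcast)
```

On this sample the two results agree, so it is worth checking both approaches against your own data before deciding which one to use.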
    