Numpy thermometer encoding

佛祖请我去吃肉 2021-01-06 00:28

I am trying to use NumPy's optimized built-in functions to generate a thermometer encoding. Thermometer encoding is basically generating n 1's in a given len

4 answers
  • 2021-01-06 00:43

    Wim's answer is incredible. I had also never heard of thermometer encoding, but if I were to do it, I would go with map. It is simply a shorter solution without an explicit for loop. The performance is quite similar.

    >>> def setValue(val):
    ...     return np.append(np.ones(val), np.zeros(8-val))
    ...
    >>> np.array(list(map(setValue, [2,3,4,5])))
    
    array([[ 1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.,  0.,  0.]])
    

    Or as a one-liner with a lambda function:

    >>> np.array(list(map(lambda v: np.append(np.ones(v), np.zeros(8-v)), [1,6,3,8])))
    
    array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  0.,  0.],
       [ 1.,  1.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]])
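
Both snippets above hard-code the width 8 and return floats. A minimal sketch of the same map-based idea that takes the width as a parameter and returns integers might look like this. The names `thermometer_row`, `thermometer_map`, and `length` are illustrative assumptions, not part of the answer:

```python
import numpy as np

def thermometer_row(val, length):
    """Return `val` ones followed by `length - val` zeros as an int array."""
    # Assumption: values outside [0, length] are treated as errors.
    if not 0 <= val <= length:
        raise ValueError(f"val must be in [0, {length}], got {val}")
    return np.append(np.ones(val, dtype=int), np.zeros(length - val, dtype=int))

def thermometer_map(values, length):
    """Encode each value with map, then stack the rows into one 2-D array."""
    return np.array(list(map(lambda v: thermometer_row(v, length), values)))

encoded = thermometer_map([1, 6, 3, 8], 8)
print(encoded)
```

Passing `dtype=int` to `np.ones` and `np.zeros` is what keeps the result integral; without it both default to float64, which is why the outputs above show `1.` and `0.`.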
    
  • 2021-01-06 00:43

    Not much different: a list comprehension inside an array-creation function.

    temps = [1,2,4,1]
    tlen = 8
    np.stack([np.pad(np.ones(t), (0, tlen-t), 'constant') for t in temps])
    
    Out[66]: 
    array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
           [ 1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
           [ 1.,  1.,  1.,  1.,  0.,  0.,  0.,  0.],
           [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])
    
  • 2021-01-06 00:47

    I'd never heard of "thermometer encoding" before, but when you realise how it's so similar to one-hot encoding, it becomes clear you can get there using bit shift ops:

    >>> a = np.array([2, 3, 4, 1], dtype=np.uint8)
    >>> print(np.fliplr(np.unpackbits((1 << a) - 1).reshape(-1,8)))
    [[1 1 0 0 0 0 0 0]
     [1 1 1 0 0 0 0 0]
     [1 1 1 1 0 0 0 0]
     [1 0 0 0 0 0 0 0]]
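
To make the bit trick easier to follow, here is a sketch that spells out the intermediate values for the same input. The variable names are illustrative, and the last step assumes NumPy 1.17 or later, where `np.unpackbits` accepts a `bitorder` argument:

```python
import numpy as np

a = np.array([2, 3, 4, 1], dtype=np.uint8)

# (1 << n) - 1 sets the lowest n bits, e.g. n = 3 gives 0b00000111 == 7.
masks = ((1 << a) - 1).astype(np.uint8)
print(masks)

# unpackbits emits the most significant bit first by default, so the ones
# end up on the right of each row and the rows have to be flipped.
big_endian = np.unpackbits(masks).reshape(-1, 8)
flipped = np.fliplr(big_endian)

# Assumption: NumPy >= 1.17. bitorder="little" emits the least significant
# bit first, which makes the explicit flip unnecessary.
little = np.unpackbits(masks[:, None], axis=1, bitorder="little")
print(little)
```

Both routes should give the same array; the `bitorder` variant just avoids building the right-aligned intermediate.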
    

    Edit: You can generalise the idea to arbitrary-size integers by working in 8-column chunks:

    a = np.array([2, 13, 4, 0, 1, 17], dtype=np.uint8)
    out = np.empty((len(a), 0), dtype=np.uint8)
    while a.any():
        block = np.fliplr(np.unpackbits((1 << a) - 1).reshape(-1,8))
        out = np.concatenate([out, block], axis=1)
        a = np.where(a<8, 0, a-8)
    
    print(out)
    [[1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
     [1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0]
     [1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
     [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
     [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
     [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0]]
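
The loop above shifts a uint8 by amounts of 8 or more, so it appears to depend on how such shifts wrap. A hedged sketch of the same chunked idea that clips each chunk's count to the range 0..8 and shifts in uint16 instead is shown below. The function name `thermometer_chunks` and the `width` parameter are assumptions for illustration, and `bitorder` again assumes NumPy 1.17 or later:

```python
import numpy as np

def thermometer_chunks(values, width):
    """Thermometer-encode `values` into `width` columns, 8 columns at a time.

    Assumes width >= 1 and 0 <= value <= width for every value.
    """
    values = np.asarray(values, dtype=np.int64)
    n_chunks = -(-width // 8)  # ceiling division
    blocks = []
    for c in range(n_chunks):
        # Number of ones that fall inside this 8-column chunk.
        k = np.clip(values - 8 * c, 0, 8)
        # Shift in uint16 so that 1 << 8 == 256 is representable; 255 fits uint8.
        mask = ((1 << k.astype(np.uint16)) - 1).astype(np.uint8)
        blocks.append(np.unpackbits(mask[:, None], axis=1, bitorder="little"))
    # Drop padding columns when width is not a multiple of 8.
    return np.concatenate(blocks, axis=1)[:, :width]

out = thermometer_chunks([2, 13, 4, 0, 1, 17], 24)
print(out)
```

Unlike the `while a.any()` loop, this version fixes the number of columns up front, so an all-zero input still produces a full-width array of zeros.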
    
  • 2021-01-06 00:56
    In [22]: x = [2, 3, 4, 1, 0, 8]
    
    In [23]: length = 8
    
    In [24]: (np.arange(length) < np.array(x).reshape(-1, 1)).astype(int)
    Out[24]:
    array([[1, 1, 0, 0, 0, 0, 0, 0],
           [1, 1, 1, 0, 0, 0, 0, 0],
           [1, 1, 1, 1, 0, 0, 0, 0],
           [1, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0],
           [1, 1, 1, 1, 1, 1, 1, 1]])
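
The comparison works through broadcasting: a row of column indices with shape `(length,)` is compared against a column of values with shape `(n, 1)`, giving an `(n, length)` boolean array. A sketch that wraps this in a function with basic input checks follows. The name `thermometer` and the range check are assumptions added for illustration:

```python
import numpy as np

def thermometer(values, length):
    """Return an (n, length) int array whose row i starts with values[i] ones."""
    values = np.asarray(values)
    if values.ndim != 1:
        raise ValueError("values must be one-dimensional")
    # Assumption: values outside [0, length] are treated as errors.
    if ((values < 0) | (values > length)).any():
        raise ValueError(f"values must lie in [0, {length}]")
    cols = np.arange(length)            # shape (length,)
    column = values.reshape(-1, 1)      # shape (n, 1)
    # Broadcasting compares every column index against every value:
    # entry (i, j) is True exactly when j < values[i].
    return (cols < column).astype(int)

print(thermometer([2, 3, 4, 1, 0, 8], 8))
```

Without the range check, a value larger than `length` would silently produce a row of all ones, which may or may not be what you want.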
    

    Or, create an array of the various lengths of "bars":

    In [46]: k = np.arange(length + 1)
    
    In [47]: bars = (k[:-1] < k.reshape(-1, 1)).astype(int)
    
    In [48]: bars
    
    Out[48]: 
    array([[0, 0, 0, 0, 0, 0, 0, 0],
           [1, 0, 0, 0, 0, 0, 0, 0],
           [1, 1, 0, 0, 0, 0, 0, 0],
           [1, 1, 1, 0, 0, 0, 0, 0],
           [1, 1, 1, 1, 0, 0, 0, 0],
           [1, 1, 1, 1, 1, 0, 0, 0],
           [1, 1, 1, 1, 1, 1, 0, 0],
           [1, 1, 1, 1, 1, 1, 1, 0],
           [1, 1, 1, 1, 1, 1, 1, 1]])
    

    and use it as a lookup table:

    In [49]: bars[x]
    Out[49]:
    array([[1, 1, 0, 0, 0, 0, 0, 0],
           [1, 1, 1, 0, 0, 0, 0, 0],
           [1, 1, 1, 1, 0, 0, 0, 0],
           [1, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0],
           [1, 1, 1, 1, 1, 1, 1, 1]])
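
As an aside, the same lookup table can probably be built with `np.tri`, which is not used in the answer but produces a lower-triangular array of ones directly. A minimal sketch, reusing the answer's `length` and `x`:

```python
import numpy as np

length = 8
x = [2, 3, 4, 1, 0, 8]

# np.tri(N, M, k) has ones where column <= row + k.
# With k=-1 that is column < row, i.e. row i holds i leading ones.
bars = np.tri(length + 1, length, k=-1, dtype=int)

# Integer-array indexing picks one row of the table per value.
encoded = bars[x]
print(encoded)
```

One thing to keep in mind with any lookup table of this kind: a value above `length` raises an IndexError, while a negative value silently wraps around to rows counted from the end.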
    

    In the above code, the preallocated array bars has shape (length+1, length). A more memory-efficient representation of bars can be created using:

    In [61]: from numpy.lib.stride_tricks import as_strided
    
    In [62]: u = np.zeros(2*length, dtype=int)
    
    In [63]: u[length:] = 1
    
    In [64]: bars = as_strided(u[length-1:], shape=(length+1, length), strides=(u.strides[0], -u.strides[0]))
    
    In [65]: bars
    Out[65]: 
    array([[0, 0, 0, 0, 0, 0, 0, 0],
           [1, 0, 0, 0, 0, 0, 0, 0],
           [1, 1, 0, 0, 0, 0, 0, 0],
           [1, 1, 1, 0, 0, 0, 0, 0],
           [1, 1, 1, 1, 0, 0, 0, 0],
           [1, 1, 1, 1, 1, 0, 0, 0],
           [1, 1, 1, 1, 1, 1, 0, 0],
           [1, 1, 1, 1, 1, 1, 1, 0],
           [1, 1, 1, 1, 1, 1, 1, 1]])
    

    Then bars is a view of the one-dimensional array u, and it only uses 2*length integers.
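
A short sketch to check the view claim and to show how the strided table behaves when used for lookups. Passing `writeable=False` is an addition not present in the answer; it is meant to guard against accidental writes through the overlapping view:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

length = 8
u = np.zeros(2 * length, dtype=int)
u[length:] = 1

# Entry (i, j) of the view reads u[length - 1 + i - j].
bars = as_strided(
    u[length - 1:],
    shape=(length + 1, length),
    strides=(u.strides[0], -u.strides[0]),
    writeable=False,  # assumption: the table is only ever read
)

print(np.shares_memory(bars, u))  # the view reuses u's buffer
print(u.size, bars.size)          # stored elements vs. visible elements

x = [2, 3, 4, 1, 0, 8]
encoded = bars[x]  # integer-array indexing copies the rows into a new array
print(encoded)
```

The memory saving applies to the table itself; the result of `bars[x]` is an ordinary array of shape `(len(x), length)` that no longer shares memory with `u`.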
