np.concatenate a ND tensor/array with a 1D array

南笙 2021-01-14 09:33

I have two arrays a & b

a.shape
(5, 4, 3)
array([[[ 0.        ,  0.        ,  0.        ],
        [ 0.        ,  0.        ,  0.        ],
        [ 0.          


        
4 Answers
  •  余生分开走
    2021-01-14 09:54

    You can also use np.insert.

    b_broad = np.expand_dims(b, axis=0) # b_broad.shape = (1, 3)
    ab = np.insert(a, 4, b_broad, axis=1)
    """ 
    Because we are inserting along axis 1:
         a's shape without axis 1 = (5, 3)
         b_broad's shape          = (1, 3)
    so b_broad can be aligned and broadcast to (5, 3).
    """
    

    In this example we insert along axis 1, and b_broad is placed before the given index, 4 here. In other words, b_broad will occupy index 4 along that axis, making ab.shape equal to (5, 5, 3).

    Note again that before inserting, we turn b into b_broad so that the broadcasting happens the way you want. b has fewer dimensions than a, so it will be broadcast at insertion; expand_dims lets us control how.
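As a self-contained illustration of the insertion described above, here is a minimal sketch. The question's data is truncated, so the array contents below (zeros for a, the values 1, 2, 3 for b) are assumptions chosen only to make the result easy to read.

```python
import numpy as np

# Hypothetical stand-ins for the question's arrays (real values are not shown).
a = np.zeros((5, 4, 3))
b = np.array([1.0, 2.0, 3.0])

# Give b an explicit leading axis so it lines up with a's shape minus axis 1.
b_broad = np.expand_dims(b, axis=0)      # shape (1, 3)

# Insert before index 4 along axis 1; b_broad is broadcast to all 5 blocks.
ab = np.insert(a, 4, b_broad, axis=1)

print(ab.shape)    # (5, 5, 3)
print(ab[0, 4])    # [1. 2. 3.]
```

Every block ab[i] gets the same new last row b, and the first four rows of each block are unchanged.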

    If a has shape (3, 4, 5), b_broad needs shape (3, 1) to match dimensions when inserting along axis 1. This can be achieved by

    b_broad = np.expand_dims(b, axis=1)  # shape = (3, 1)
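A minimal sketch of this second case, again with made-up values (zeros for a, the values 1, 2, 3 for b), might look like the following. Here b lines up with axis 0 of a, so each block receives one constant row.

```python
import numpy as np

# Hypothetical data: b has one value per block along a's first axis.
a = np.zeros((3, 4, 5))
b = np.array([1.0, 2.0, 3.0])

# Trailing axis of length 1, so b broadcasts across the last axis of a.
b_broad = np.expand_dims(b, axis=1)      # shape (3, 1)

ab = np.insert(a, 4, b_broad, axis=1)

print(ab.shape)    # (3, 5, 5)
print(ab[1, 4])    # [2. 2. 2. 2. 2.]
```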
    

    It is good practice to give b_broad the right shape explicitly, because you might have a.shape = (3, 4, 3), in which case you really need to specify which way to broadcast!
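To make the ambiguity concrete, the sketch below inserts the same b into an array of shape (3, 4, 3) in both ways. The data is hypothetical; the point is only that the two choices of expand_dims axis give different results with the same output shape.

```python
import numpy as np

a = np.zeros((3, 4, 3))            # hypothetical data
b = np.array([1.0, 2.0, 3.0])

# b aligned with the last axis: every block gets the row [1, 2, 3].
ab_last = np.insert(a, 4, np.expand_dims(b, axis=0), axis=1)

# b aligned with the first axis: block i gets a row filled with b[i].
ab_first = np.insert(a, 4, np.expand_dims(b, axis=1), axis=1)

print(ab_last.shape, ab_first.shape)   # (3, 5, 3) (3, 5, 3)
print(ab_last[0, 4])                   # [1. 2. 3.]
print(ab_first[0, 4])                  # [1. 1. 1.]
print(ab_first[2, 4])                  # [3. 3. 3.]
```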

    Timing Results

    On the OP's dataset, COLDSPEED's answer is about 3 times faster than np.insert.

    def Divakar():  # Divakar's answer
        # Repeat b once per block of a, then append it along axis 1.
        b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
        r = np.concatenate((a, b3D), axis=1)
        return r
    # COLDSPEED's result
    %timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
    2.95 µs ± 164 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    # Divakar's result
    %timeit Divakar()
    3.03 µs ± 173 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    # Mine
    %timeit np.insert(a, 4, b, axis=1)
    10.1 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    

    Dataset 2 (borrowing the timing experiment from COLDSPEED): nothing can be concluded here, because all three approaches have nearly the same mean and standard deviation.

    a = np.random.randn(100, 99, 100)
    b = np.random.randn(100)
    
    # COLDSPEED's result
    %timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1) 
    2.37 ms ± 194 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    # Divakar's
    %timeit Divakar()
    2.31 ms ± 249 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    # Mine
    %timeit np.insert(a, 99, b, axis=1) 
    2.34 ms ± 154 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    Speed depends on the size and shape of the data. Please test on your own dataset if speed is a concern.
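If you are not working in IPython, a rough way to run a comparison like this on your own data is the standard-library timeit module. The sketch below is only an assumption about how you might set it up: the helper names with_concatenate and with_insert are hypothetical, the arrays are random placeholders, and the numbers it prints will differ from the ones above.

```python
import timeit

import numpy as np

# Placeholder data with the question's shapes; substitute your own arrays.
a = np.random.randn(5, 4, 3)
b = np.random.randn(3)


def with_concatenate():
    # Repeat b once per block of a, then append along axis 1.
    b3d = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
    return np.concatenate((a, b3d), axis=1)


def with_insert():
    # Insert b after the last existing index along axis 1.
    return np.insert(a, a.shape[1], b, axis=1)


# Both approaches should produce the same array before timing them.
same = np.array_equal(with_concatenate(), with_insert())

n = 1000
t_concat = timeit.timeit(with_concatenate, number=n) / n
t_insert = timeit.timeit(with_insert, number=n) / n
print(f"concatenate: {t_concat * 1e6:.2f} us per call")
print(f"insert:      {t_insert * 1e6:.2f} us per call")
```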
