np.concatenate a ND tensor/array with a 1D array

Asked by 南笙 on 2021-01-14 09:33 · 4 answers · 1621 views

I have two arrays, a and b:

a.shape
(5, 4, 3)
b.shape
(3,)

Here a is a 3D array (filled with zeros in my example; the printout is truncated) and b is a 1D array. I want to concatenate b to a along axis 1, so that the result has shape (5, 5, 3). My current code hard-codes the repetition of b along a's first axis; is there a faster way to get the same result?

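A minimal setup reproducing these shapes (the concrete values of b are an assumption for illustration):

import numpy as np

a = np.zeros((5, 4, 3))        # 3D array, shape (5, 4, 3)
b = np.array([1.0, 2.0, 3.0])  # 1D array, shape (3,)
# desired: concatenate b to a along axis 1 -> result shape (5, 5, 3)
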
4 Answers
  • 2021-01-14 09:54

    You can also use np.insert.

    b_broad = np.expand_dims(b, axis=0)  # b_broad.shape = (1, 3)
    ab = np.insert(a, 4, b_broad, axis=1)
    """
    Because we are inserting along axis 1, the shapes
        a's shape without axis 1:  (5, 3)
        b_broad's shape:           (1, 3)
    can be aligned, and b_broad is broadcast to (5, 3).
    """
    

    In this example, we insert along axis 1, and b_broad is placed before the given index, here 4. In other words, b_broad will occupy index 4 along that axis, making ab.shape equal to (5, 5, 3).

    Note again that before the insertion we turn b into b_broad, to safely achieve the broadcasting you want. Because b has fewer dimensions, broadcasting happens at insertion time, and np.expand_dims lets us control how it happens.

    If a is of shape (3, 4, 5), you will need b_broad to have shape (3, 1) to match up dimensions if inserting along axis 1. This can be achieved by

    b_broad = np.expand_dims(b, axis=1)  # shape = (3, 1)
    

    It is good practice to put b_broad into the right shape, because you might have a.shape = (3, 4, 3), and in that case you really need to specify which way to broadcast!
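
    A runnable sketch of the basic case, with the same assumed arrays as in the question:

    import numpy as np

    a = np.zeros((5, 4, 3))
    b = np.array([1.0, 2.0, 3.0])

    b_broad = np.expand_dims(b, axis=0)    # shape (1, 3)
    ab = np.insert(a, 4, b_broad, axis=1)  # b_broad broadcasts to (5, 3)
    print(ab.shape)                        # (5, 5, 3)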

    Timing Results

    On the OP's dataset, COLDSPEED's answer is about 3 times faster than my np.insert approach.

    def Divakar():  # Divakar's answer
        b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
        r = np.concatenate((a, b3D), axis=1)
    # COLDSPEED's result
    %timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
    2.95 µs ± 164 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    # Divakar's result
    %timeit Divakar()
    3.03 µs ± 173 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    # My result (np.insert)
    %timeit np.insert(a, 4, b, axis=1)
    10.1 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    

    Dataset 2 (borrowing the timing experiment from COLDSPEED): nothing can be concluded in this case, because all three approaches share nearly the same mean and standard deviation.

    a = np.random.randn(100, 99, 100)
    b = np.random.randn(100)
    
    # COLDSPEED's result
    %timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1) 
    2.37 ms ± 194 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    # Divakar's
    %timeit Divakar()
    2.31 ms ± 249 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    # My result (np.insert)
    %timeit np.insert(a, 99, b, axis=1) 
    2.34 ms ± 154 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    Speed will depend on your data's size and shape. Please test on your own dataset if speed is a concern.

  • Here are some simple timings based on cᴏʟᴅsᴘᴇᴇᴅ's and Divakar's solutions:

    %timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
    

    Output: The slowest run took 6.44 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 3.68 µs per loop

    %timeit np.concatenate((a, np.broadcast_to(b[None,None], (a.shape[0], 1, len(b)))), axis=1)
    

    Output: The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 10.7 µs per loop

    Now here is the timing based on your original code:

    %timeit original_func(a, b)
    

    Output: The slowest run took 4.62 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 4.69 µs per loop

    Since the question asked for faster ways to get the same result, I would go with cᴏʟᴅsᴘᴇᴇᴅ's solution based on these timings.

  • 2021-01-14 10:03

    Simply broadcast b to 3D and then concatenate along the second axis -

    b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
    out = np.concatenate((a,b3D),axis=1)
    

    The broadcasting step with np.broadcast_to doesn't actually replicate or copy any data; it simply returns a replicated view. In the next step, the concatenation does the replication on the fly.
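
    As a quick check that the broadcast result really is a zero-copy view (this snippet is illustrative; the stride values assume float64 data):

    import numpy as np

    a = np.random.rand(5, 4, 3)
    b = np.random.rand(3)

    b3D = np.broadcast_to(b, (a.shape[0], 1, len(b)))
    print(b3D.shape)                 # (5, 1, 3)
    print(b3D.strides)               # (0, 0, 8): zero strides, no data copied
    print(np.shares_memory(b, b3D))  # True: b3D is just a view of b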

    Benchmarking

    In this section we compare the np.repeat version from @cᴏʟᴅsᴘᴇᴇᴅ's solution against the np.broadcast_to one, with a focus on performance. The broadcasting-based one does the replication and the concatenation together in the second step, as a merged command so to speak, while the np.repeat version makes a copy and then concatenates, in two separate steps.

    Timing the approaches as a whole :

    Case #1 : a = (500,400,300) and b = (300,)

    In [321]: a = np.random.rand(500,400,300)
    
    In [322]: b = np.random.rand(300)
    
    In [323]: %%timeit
         ...: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
         ...: r = np.concatenate((a, b3D), axis=1)
    10 loops, best of 3: 72.1 ms per loop
    
    In [325]: %%timeit
         ...: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
         ...: out = np.concatenate((a,b3D),axis=1)
    10 loops, best of 3: 72.5 ms per loop
    

    For smaller input shapes, the call to np.broadcast_to takes a bit longer than np.repeat, since the work needed to set up the broadcasting is apparently more involved, as the timings below suggest :

    In [360]: a = np.random.rand(5,4,3)
    
    In [361]: b = np.random.rand(3)
    
    In [366]: %timeit np.broadcast_to(b,(a.shape[0],1,len(b)))
    100000 loops, best of 3: 3.12 µs per loop
    
    In [367]: %timeit b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
    1000000 loops, best of 3: 957 ns per loop
    

    But the broadcasting part takes roughly constant time irrespective of the input shapes, i.e. the ~3 µs part stays around that mark, whereas the timing for the counterpart, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0), depends on the input shapes. So, let's dig deeper and see how the concatenation steps of the two approaches fare.

    Digging deeper

    Let's dig deeper and see how much time the concatenation part itself is consuming :

    In [353]: a = np.random.rand(500,400,300)
    
    In [354]: b = np.random.rand(300)
    
    In [355]: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
    
    In [356]: %timeit np.concatenate((a,b3D),axis=1)
    10 loops, best of 3: 72 ms per loop
    
    In [357]: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
    
    In [358]: %timeit np.concatenate((a,b3D),axis=1)
    10 loops, best of 3: 72 ms per loop
    

    Conclusion : Doesn't seem too different.

    Now, let's try a case where the replication needed for b is a bigger number and b has noticeably high number of elements as well.

    In [344]: a = np.random.rand(10000, 10, 1000)
    
    In [345]: b = np.random.rand(1000)
    
    In [346]: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
    
    In [347]: %timeit np.concatenate((a,b3D),axis=1)
    10 loops, best of 3: 130 ms per loop
    
    In [348]: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
    
    In [349]: %timeit np.concatenate((a,b3D),axis=1)
    10 loops, best of 3: 141 ms per loop
    

    Conclusion : Seems like the merged concatenate+replication with np.broadcast_to is doing a bit better here.

    Let's try the original case of (5,4,3) shape :

    In [360]: a = np.random.rand(5,4,3)
    
    In [361]: b = np.random.rand(3)
    
    In [362]: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
    
    In [363]: %timeit np.concatenate((a,b3D),axis=1)
    1000000 loops, best of 3: 948 ns per loop
    
    In [364]: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
    
    In [365]: %timeit np.concatenate((a,b3D),axis=1)
    1000000 loops, best of 3: 950 ns per loop
    

    Conclusion : Again, not too different.

    So, the final conclusion is that if b has a lot of elements and the first axis of a is also large (since that axis determines the replication count), np.broadcast_to would be a good option; otherwise, the np.repeat based version takes care of the other cases pretty well.

  • 2021-01-14 10:11

    You can use np.repeat:

    r = np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
    

    What this does is first reshape your b array to match the number of dimensions of a, and then repeat its values as many times as needed according to a's first axis:

    b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
    
    array([[[1, 2, 3]],
    
           [[1, 2, 3]],
    
           [[1, 2, 3]],
    
           [[1, 2, 3]],
    
           [[1, 2, 3]]])
    
    b3D.shape
    (5, 1, 3)
    

    This intermediate result is then concatenated with a -

    r = np.concatenate((a, b3D), axis=1)
    
    r.shape
    (5, 5, 3)
    

    This differs from your current answer mainly in the fact that the repetition of values is not hard-coded (i.e., it is taken care of by the repeat).

    If you need to handle this for a different number of dimensions (not 3D arrays), some changes are needed (mainly in removing the hardcoded reshape of b); see the sketch below.
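
    For illustration, here is a sketch of one way to generalize. The helper name append_vector is hypothetical, and it assumes len(b) == a.shape[-1] and that axis is not the last axis:

    import numpy as np

    def append_vector(a, b, axis=1):
        # Hypothetical helper: append 1D `b` to ND `a` along `axis`.
        # Assumes len(b) == a.shape[-1] and axis != a.ndim - 1.
        shape = [1] * (a.ndim - 1) + [-1]
        b_nd = b.reshape(shape)               # e.g. (1, 1, 3) for a 3D `a`
        target = list(a.shape)
        target[axis] = 1
        b_nd = np.broadcast_to(b_nd, target)  # a view; no copy yet
        return np.concatenate((a, b_nd), axis=axis)

    a = np.zeros((5, 4, 3))
    b = np.array([1.0, 2.0, 3.0])
    print(append_vector(a, b, axis=1).shape)  # (5, 5, 3)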


    Timings

    a = np.random.randn(100, 99, 100)
    b = np.random.randn(100)
    

    # Tai's answer
    %timeit np.insert(a, 4, b, axis=1)
    100 loops, best of 3: 3.7 ms per loop
    
    # Divakar's answer
    %%timeit 
    b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
    np.concatenate((a,b3D),axis=1)
    
    100 loops, best of 3: 3.67 ms per loop
    
    # solution in this post
    %timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
    100 loops, best of 3: 3.62 ms per loop
    

    These are all pretty competitive solutions. However, note that performance depends on your actual data, so make sure you test things first!
