numpy random shuffle by row independently

后端未结

I have the following array:

 a= array([[  1,  2, 3],
           [  1,  2, 3],
           [  1,  2, 3]])

I understand that np.random,sh


                      
5 answers

        
                         				            
            
           
            
                              
                
              
              
                
滥情空心
2020-12-17 03:22

import numpy as np
np.random.seed(2018)

def scramble(a, axis=-1):
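    """
    Return a copy of `a` with its entries reordered along the given axis.
    The same random permutation is applied to every slice, so the slices
    are not shuffled independently of one another.
    """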
    b = a.swapaxes(axis, -1)
    n = a.shape[axis]
    idx = np.random.choice(n, n, replace=False)
    b = b[..., idx]
    return b.swapaxes(axis, -1)

a = np.arange(4*9).reshape(4, 9)
# array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8],
#        [ 9, 10, 11, 12, 13, 14, 15, 16, 17],
#        [18, 19, 20, 21, 22, 23, 24, 25, 26],
#        [27, 28, 29, 30, 31, 32, 33, 34, 35]])

print(scramble(a, axis=1))


yields 

[[ 3  8  7  0  4  5  1  2  6]
 [12 17 16  9 13 14 10 11 15]
 [21 26 25 18 22 23 19 20 24]
 [30 35 34 27 31 32 28 29 33]]


while scrambling along the 0-axis:

print(scramble(a, axis=0))


yields

[[18 19 20 21 22 23 24 25 26]
 [ 0  1  2  3  4  5  6  7  8]
 [27 28 29 30 31 32 33 34 35]
 [ 9 10 11 12 13 14 15 16 17]]




This works by first swapping the target axis with the last axis:

b = a.swapaxes(axis, -1)


This is a common trick used to standardize code which deals with one axis.
It reduces the general case to the specific case of dealing with the last axis.
Since in NumPy version 1.10 or higher swapaxes returns a view, there is no copying involved and so calling swapaxes is very quick.

Now we can generate a new index order for the last axis:

n = a.shape[axis]
idx = np.random.choice(n, n, replace=False)


Now we can reorder b along the last axis. The same idx is applied to every slice, so all rows receive the same permutation:

b = b[..., idx]


and then reverse the swapaxes to return an a-shaped result:

return b.swapaxes(axis, -1)
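
Because scramble reuses one idx for every slice, each row in the output above is reordered in the same way. If every row needs its own permutation, one option is to sort random keys. The following is a minimal sketch, assuming NumPy 1.17 or later (for np.random.default_rng); the name scramble_independent and the rng parameter are hypothetical and not part of the answer above:

```python
import numpy as np

def scramble_independent(a, axis=-1, rng=None):
    """Return a copy of `a` with each 1-D slice along `axis` shuffled separately."""
    rng = np.random.default_rng() if rng is None else rng
    # Draw one random key per element; argsort along `axis` turns each
    # slice of keys into its own random permutation of indices.
    idx = rng.random(a.shape).argsort(axis=axis)
    # Gather the elements of `a` in that per-slice order.
    return np.take_along_axis(a, idx, axis=axis)

a = np.arange(4 * 9).reshape(4, 9)
out = scramble_independent(a, axis=1, rng=np.random.default_rng(2018))
print(out)
```

Each row of out contains the same values as the matching row of a, but the column order generally differs from row to row.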

                                                                        
                                                        
            
            
              
                
后悔当初
2020-12-17 03:25

If you don't want a return value and want to operate on the array directly, you can specify the indices to shuffle.

>>> import numpy as np
>>>
>>>
>>> a = np.array([[1,2,3], [1,2,3], [1,2,3]])
>>>
>>> # Shuffle row `2` independently
>>> np.random.shuffle(a[2])
>>> a
array([[1, 2, 3],
       [1, 2, 3],
       [3, 2, 1]])
>>>
>>> # Shuffle column `0` independently
>>> np.random.shuffle(a[:,0])
>>> a
array([[3, 2, 3],
       [1, 2, 3],
       [1, 2, 1]])


If you want a return value as well, you can use numpy.random.permutation, in which case replace np.random.shuffle(a[n]) with a[n] = np.random.permutation(a[n]).

Warning: do not write a[n] = np.random.shuffle(a[n]). shuffle returns None, so the assignment either fills the row/column with nan (float arrays) or raises a TypeError (integer arrays such as the one above).
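
To make the difference between the two functions concrete, here is a small sketch. The array values are illustrative only and the variable name result is introduced here:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# permutation returns a shuffled copy, so it can be assigned back to the row.
a[2] = np.random.permutation(a[2])

# shuffle reorders the column view in place and returns None,
# so its return value should not be assigned to anything useful.
result = np.random.shuffle(a[:, 0])
print(result)  # None
print(a)
```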
                                                                        
                                                        
            
            
              
                
隐瞒了意图╮
2020-12-17 03:32

Good answer above. But I will throw in a quick and dirty way:

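import numpy as np

a = np.array([[1,2,3], [4,5,6], [7,8,9]])
# shuffle works in place on each row view; the list itself only holds None values
ignore_list_output = [np.random.shuffle(x) for x in a]

Then a can be something like this:

array([[2, 1, 3],
       [4, 6, 5],
       [9, 7, 8]])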


Not very elegant but you can get this job done with just one short line.
                                                                        
                                                        
            
            
              
                
面向向阳花
2020-12-17 03:33

Building on my comment to @Hun's answer, here's the fastest way to do this:

def shuffle_along(X):
    """Minimal in place independent-row shuffler."""
    [np.random.shuffle(x) for x in X]


This works in-place and can only shuffle rows. If you need more options:

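def shuffle_along(X, axis=0, inline=False):
    """More elaborate version of the above.

    axis=0 shuffles within each row, axis=1 within each column.
    """
    if not inline:
        X = X.copy()
    if axis == 0:
        for x in X:      # each x is a row view of X
            np.random.shuffle(x)
    elif axis == 1:
        for x in X.T:    # each x is a column view of X
            np.random.shuffle(x)
    if not inline:
        return X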


This, however, has the limitation of only working on 2d-arrays. For higher dimensional tensors, I would use:

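def shuffle_along(X, axis=0, inline=True):
    """Shuffle along any axis of a tensor.

    Each 1-D slice along `axis` is shuffled, so for a 2-D array axis=0
    shuffles within each column (unlike the 2-D version above).
    """
    if not inline:
        X = X.copy()
    np.apply_along_axis(np.random.shuffle, axis, X)  # <-- I just changed this
    if not inline:
        return X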
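
On NumPy 1.20 or later, Generator.permuted appears to cover the same ground without a Python-level loop. The following is a sketch assuming that version is available; shuffle_along_rng, inplace and rng are hypothetical names introduced here, and axis follows NumPy's convention (the 1-D slices along axis are shuffled):

```python
import numpy as np

def shuffle_along_rng(X, axis=0, inplace=True, rng=None):
    """Shuffle every 1-D slice of X along `axis` independently."""
    rng = np.random.default_rng() if rng is None else rng
    if inplace:
        # out=X writes the shuffled values back into X itself.
        rng.permuted(X, axis=axis, out=X)
        return None
    # Without `out`, permuted returns a shuffled copy and leaves X untouched.
    return rng.permuted(X, axis=axis)

X = np.arange(2 * 3 * 4).reshape(2, 3, 4)
Y = shuffle_along_rng(X, axis=2, inplace=False)  # X is unchanged
Z = X.copy()
shuffle_along_rng(Z, axis=0)                     # Z is modified in place
```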

                                                                        
                                                        
            
            
              
                
没有蜡笔的小新
2020-12-17 03:42

You can do this in NumPy without any loop or helper function, and much faster. For example, given an array of shape (6, 2), suppose we want a (2, 2) sub-array whose row indices are drawn independently at random for each column.

import numpy as np

test = np.array([[1, 1],
                 [2, 2],
                 [0.5, 0.5],
                 [0.3, 0.3],
                 [4, 4],
                 [7, 7]])

# Random row indices, drawn with replacement; use np.random.choice with
# replace=False for each column if you don't want replacement.
id_rnd = np.random.randint(6, size=(2, 2))
new = np.take_along_axis(test, id_rnd, axis=0)

Out: 
array([[2. , 2. ],
       [0.5, 2. ]])


It works for any number of dimensions.
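
For the without-replacement case mentioned in the comment, one possible sketch is to permute a grid of row indices separately for each column and keep the first k of them. This assumes NumPy 1.20 or later (for Generator.permuted); k, rows and rng are names introduced here for illustration:

```python
import numpy as np

rng = np.random.default_rng()

test = np.array([[1, 1],
                 [2, 2],
                 [0.5, 0.5],
                 [0.3, 0.3],
                 [4, 4],
                 [7, 7]])

k = 2  # number of rows to draw per column

# Grid of row indices: every column holds 0..5.
rows = np.tile(np.arange(test.shape[0])[:, None], (1, test.shape[1]))
# Shuffle each column of the grid on its own, then keep the first k entries,
# which gives k distinct row indices per column.
id_rnd = rng.permuted(rows, axis=0)[:k]
new = np.take_along_axis(test, id_rnd, axis=0)
print(new)
```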
                                                                        
                                                        
            
            
              
                