Selecting Random Windows from Multidimensional Numpy Array Rows

佛祖请我去吃肉 2020-11-27 08:12

I have a large array where each row is a time series and thus needs to stay in order.

I want to select a random window of a given size for each row.

Example

2 Answers
  • 2020-11-27 08:29

    Here's one leveraging np.lib.stride_tricks.as_strided -

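    import numpy as np

    def random_windows_per_row_strided(arr, W=3):
        # One random start offset per row; the upper bound is exclusive
        idx = np.random.randint(0, arr.shape[1] - W + 1, arr.shape[0])
        strided = np.lib.stride_tricks.as_strided
        m, n = arr.shape
        s0, s1 = arr.strides
        # Zero-copy view of every length-W window: shape (m, n-W+1, W)
        windows = strided(arr, shape=(m, n - W + 1, W), strides=(s0, s1, s1))
        # Pick one window per row (advanced indexing returns a copy)
        return windows[np.arange(len(idx)), idx]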
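
As a hedged alternative, NumPy 1.20 and later provide `np.lib.stride_tricks.sliding_window_view`, which builds the same kind of windowed view without hand-written strides. The sketch below is not part of the original answer; the function name `random_windows_per_row_swv` and the `rng` parameter are hypothetical, and it assumes a 2-D input with at least `W` columns.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def random_windows_per_row_swv(arr, W=3, rng=None):
    # rng is a hypothetical convenience parameter for reproducible sampling
    rng = np.random.default_rng() if rng is None else rng
    m, n = arr.shape
    # One start offset per row; the upper bound is exclusive
    idx = rng.integers(0, n - W + 1, size=m)
    # Read-only view of every length-W window along axis 1: shape (m, n-W+1, W)
    windows = sliding_window_view(arr, W, axis=1)
    # Select one window per row; advanced indexing returns a copy
    return windows[np.arange(m), idx]

arr = np.arange(42).reshape(6, 7)  # small illustrative input (assumed)
out = random_windows_per_row_swv(arr, W=3, rng=np.random.default_rng(0))
print(out.shape)
```

Unlike raw `as_strided`, `sliding_window_view` validates the window shape against the array, so a window larger than the row raises an error instead of reading out of bounds.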
    

    Runtime test on a bigger array with 100,000 rows -

    In [469]: arr = np.random.rand(100000,100)
    
    # @Psidom's soln
    In [470]: %timeit select_random_windows(arr, window_size=3)
    100 loops, best of 3: 7.41 ms per loop
    
    In [471]: %timeit random_windows_per_row_strided(arr, W=3)
    100 loops, best of 3: 6.84 ms per loop
    
    # @Psidom's soln
    In [472]: %timeit select_random_windows(arr, window_size=30)
    10 loops, best of 3: 26.8 ms per loop
    
    In [473]: %timeit random_windows_per_row_strided(arr, W=30)
    100 loops, best of 3: 9.65 ms per loop
    
    # @Psidom's soln
    In [474]: %timeit select_random_windows(arr, window_size=50)
    10 loops, best of 3: 41.8 ms per loop
    
    In [475]: %timeit random_windows_per_row_strided(arr, W=50)
    100 loops, best of 3: 10 ms per loop
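
For readers outside IPython, a comparison like the one above can be approximated with the standard `timeit` module. This is a minimal sketch, not the original benchmark: the helper `best_time_ms` is hypothetical, the array is smaller than the one timed above to keep the run short, and absolute numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

def select_random_windows(arr, window_size):
    # Advanced-indexing approach: builds an (m, window_size) index array
    offsets = np.random.randint(0, arr.shape[1] - window_size + 1, size=arr.shape[0])
    return arr[np.arange(arr.shape[0])[:, None], offsets[:, None] + np.arange(window_size)]

def random_windows_per_row_strided(arr, W=3):
    # Strided approach: index into a zero-copy view of all windows
    idx = np.random.randint(0, arr.shape[1] - W + 1, arr.shape[0])
    m, n = arr.shape
    s0, s1 = arr.strides
    windows = np.lib.stride_tricks.as_strided(
        arr, shape=(m, n - W + 1, W), strides=(s0, s1, s1))
    return windows[np.arange(m), idx]

def best_time_ms(func, *args, repeat=3, number=5):
    # Hypothetical helper: best average time per call, in milliseconds
    times = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(times) / number * 1e3

arr = np.random.rand(20000, 100)  # assumed size, smaller than the original test
results = {}
for W in (3, 30, 50):
    results[W] = (best_time_ms(select_random_windows, arr, W),
                  best_time_ms(random_windows_per_row_strided, arr, W))
    print(f"W={W}: indexing {results[W][0]:.2f} ms, strided {results[W][1]:.2f} ms")
```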
    
  • 2020-11-27 08:41

    In the return statement, replace the slicing with advanced indexing. The sampling code also needs a small fix:

    import numpy as np

    def select_random_windows(arr, window_size):
        # One start offset per row, chosen so the window fits inside the row
        offsets = np.random.randint(0, arr.shape[1]-window_size+1, size=arr.shape[0])
        return arr[np.arange(arr.shape[0])[:,None], offsets[:,None] + np.arange(window_size)]
    
    select_random_windows(arr, 3)
    #array([[ 4,  5,  6],
    #       [ 7,  8,  9],
    #       [17, 18, 19],
    #       [25, 26, 27],
    #       [31, 32, 33],
    #       [39, 40, 41]])
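
To make the broadcasting explicit, here is a small sketch with fixed offsets in place of random ones. The question's example array did not survive on this page, so `arr = np.arange(42).reshape(6, 7)` is an assumption; it is merely consistent with the output shown above, and the `offsets` values are chosen by hand to reproduce that output.

```python
import numpy as np

arr = np.arange(42).reshape(6, 7)        # assumed input, not from the original post
window_size = 3
offsets = np.array([4, 0, 3, 4, 3, 4])   # fixed offsets instead of random ones

rows = np.arange(arr.shape[0])[:, None]            # shape (6, 1): one row index per row
cols = offsets[:, None] + np.arange(window_size)   # shape (6, 3): consecutive columns
result = arr[rows, cols]                           # indices broadcast to shape (6, 3)
print(result)
```

Because `rows` has shape (6, 1) and `cols` has shape (6, 3), the two index arrays broadcast together, so each output row is taken from its own source row at its own offset.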
    