Window overlap in Pandas

庸人自扰 2021-01-03 00:30

In pandas, there are several methods to manipulate data in a given window (e.g. pd.rolling_mean or pd.rolling_std). However, I would like to set a

2 answers
  • 2021-01-03 00:56

    Using as_strided you would do something like this:

    import numpy as np
    from numpy.lib.stride_tricks import as_strided
    
    def windowed_view(arr, window, overlap):
        arr = np.asarray(arr)
        window_step = window - overlap
        new_shape = arr.shape[:-1] + ((arr.shape[-1] - overlap) // window_step,
                                      window)
        new_strides = (arr.strides[:-1] + (window_step * arr.strides[-1],) +
                       arr.strides[-1:])
        return as_strided(arr, shape=new_shape, strides=new_strides)
    

    If you pass a 1D array to the above function, it will return a 2D view into that array, with shape (number_of_windows, window_size), so you could calculate, e.g. the windowed mean as:

    win_avg = np.mean(windowed_view(arr, win_size, win_overlap), axis=-1)
    

    For example:

    >>> a = np.arange(16)
    >>> windowed_view(a, 4, 2)
    array([[ 0,  1,  2,  3],
           [ 2,  3,  4,  5],
           [ 4,  5,  6,  7],
           [ 6,  7,  8,  9],
           [ 8,  9, 10, 11],
           [10, 11, 12, 13],
           [12, 13, 14, 15]])
    >>> windowed_view(a, 4, 1)
    array([[ 0,  1,  2,  3],
           [ 3,  4,  5,  6],
           [ 6,  7,  8,  9],
           [ 9, 10, 11, 12],
           [12, 13, 14, 15]])
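
    Since the question is about pandas, here is a minimal sketch of the same idea applied to a Series. It is an alternative to the as_strided function above, not part of it: it assumes NumPy 1.20 or later, where sliding_window_view is available and returns a read-only view by default. The helper name overlapping_window_means and the choice to label each result with the first index of its window are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20


def overlapping_window_means(series, window, overlap):
    """Mean of each overlapping window, labelled by the window's first index.

    Hypothetical helper: the name and the labelling convention are
    illustrative choices, not a pandas API.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    values = series.to_numpy()
    # Build every stride-1 window as a read-only view, then keep every
    # `step`-th window so consecutive windows share `overlap` elements.
    windows = sliding_window_view(values, window)[::step]
    # Label each result with the index of the first element of its window.
    starts = series.index[: len(values) - window + 1 : step]
    return pd.Series(windows.mean(axis=-1), index=starts)


s = pd.Series(np.arange(16, dtype=float))
means = overlapping_window_means(s, window=4, overlap=2)
print(means)
```

    As with windowed_view, trailing elements that do not fill a complete window are dropped. Because the view is read-only, an accidental write through overlapping windows raises an error instead of silently changing several windows at once.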
    
  • 2021-01-03 01:02

    I am not familiar with pandas, but in numpy you would do it something like this (untested):

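    # Assumed source of hann(); np.hanning(nwin) would also work.
    from scipy.signal.windows import hann

    def overlapped_windows(x, nwin, noverlap=None):
        if noverlap is None:
            noverlap = nwin // 2  # default: half-overlapping windows
        step = nwin - noverlap
        for i in range(0, len(x) - nwin + 1, step):
            window = x[i:i+nwin]  # this is a view, not a copy (x is a NumPy array)
            y = window * hann(nwin)  # tapered copy; x is unchanged
            # your code here with y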
    

    This is adapted from some old code that computes an averaged PSD (power spectral density), which is typically processed with half-overlapping windows. Note that window is a view into the array x: no data is copied, which makes it fast, but modifying window in place also modifies x. So avoid in-place operations such as window *= hann(nwin); an expression like window * hann(nwin) creates a new array and leaves x untouched.
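
    To make the loop above concrete, here is a hedged sketch of an averaged, unnormalised periodogram built on half-overlapping windows. It is not the original PSD code, which the answer does not show. The generator form of overlapped_windows, the function name averaged_periodogram, the use of np.hanning in place of hann, and the test signal are all assumptions made for illustration.

```python
import numpy as np


def overlapped_windows(x, nwin, noverlap=None):
    """Yield successive overlapping windows of x as views (no copying)."""
    if noverlap is None:
        noverlap = nwin // 2  # default: half-overlapping windows
    step = nwin - noverlap
    for i in range(0, len(x) - nwin + 1, step):
        yield x[i:i + nwin]


def averaged_periodogram(x, nwin, noverlap=None):
    """Average the squared FFT magnitude of Hann-tapered segments.

    Hypothetical helper; the result is not normalised to physical units.
    """
    x = np.asarray(x, dtype=float)
    taper = np.hanning(nwin)  # NumPy's Hann window, swapped in for hann()
    total = np.zeros(nwin // 2 + 1)
    count = 0
    for window in overlapped_windows(x, nwin, noverlap):
        y = window * taper  # creates a new array; x is left untouched
        total += np.abs(np.fft.rfft(y)) ** 2
        count += 1
    if count == 0:
        raise ValueError("x is shorter than one window")
    return total / count


# Illustrative signal: a sinusoid at 0.125 cycles/sample plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(1024)
signal = np.sin(2 * np.pi * 0.125 * t) + 0.1 * rng.standard_normal(t.size)
original = signal.copy()

spectrum = averaged_periodogram(signal, nwin=128)
peak_bin = int(np.argmax(spectrum))
print(peak_bin, np.array_equal(signal, original))
```

    With nwin=128, a frequency of 0.125 cycles/sample falls on bin 0.125 * 128 = 16, so that is where the spectrum peaks. The final comparison confirms that tapering each window into a new array y did not alter the input.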
