Aggregate Numpy Array By Summing

前端 未结 3 1222
耶瑟儿~
耶瑟儿~ 2021-01-21 01:37

I have a Numpy array of shape (4320,8640). I would like to have an array of shape (2160,4320).

You\'ll notice that each cell of the new array m

3条回答
  •  被撕碎了的回忆
    2021-01-21 02:31

    You are operating on sliding windows of the original array. There are numerous questions and answers on SO regarding. sliding windows and numpy and python. By manipulating the strides of an array, this process can be sped up considerably. Here is a generic function that will return (x,y) windows of the array with or without overlap. Using this stride trick appears to be just a hair slower than @mskimm's solution. It's a nice thing to have in your toolkit. This function is not mine - it was found at Efficient Overlapping Windows with Numpy

    import numpy as np
    from numpy.lib.stride_tricks import as_strided as ast
    from itertools import product
    
    def norm_shape(shape):
        '''
        Normalize numpy array shapes so they're always expressed as a tuple, 
        even for one-dimensional shapes.
    
        Parameters
            shape - an int, or a tuple of ints
    
        Returns
            a shape tuple
    
        from http://www.johnvinyard.com/blog/?p=268
        '''
        try:
            i = int(shape)
            return (i,)
        except TypeError:
            # shape was not a number
            pass
    
        try:
            t = tuple(shape)
            return t
        except TypeError:
            # shape was not iterable
            pass
    
        raise TypeError('shape must be an int, or a tuple of ints')
    
    
    def sliding_window(a,ws,ss = None,flatten = True):
        '''
        Return a sliding window over a in any number of dimensions
    
        Parameters:
            a  - an n-dimensional numpy array
            ws - an int (a is 1D) or tuple (a is 2D or greater) representing the size 
                 of each dimension of the window
            ss - an int (a is 1D) or tuple (a is 2D or greater) representing the 
                 amount to slide the window in each dimension. If not specified, it
                 defaults to ws.
            flatten - if True, all slices are flattened, otherwise, there is an 
                      extra dimension for each dimension of the input.
    
        Returns
            an array containing each n-dimensional window from a
    
        from http://www.johnvinyard.com/blog/?p=268
        '''
    
        if None is ss:
            # ss was not provided. the windows will not overlap in any direction.
            ss = ws
        ws = norm_shape(ws)
        ss = norm_shape(ss)
    
        # convert ws, ss, and a.shape to numpy arrays so that we can do math in every 
        # dimension at once.
        ws = np.array(ws)
        ss = np.array(ss)
        shape = np.array(a.shape)
    
    
        # ensure that ws, ss, and a.shape all have the same number of dimensions
        ls = [len(shape),len(ws),len(ss)]
        if 1 != len(set(ls)):
            error_string = 'a.shape, ws and ss must all have the same length. They were{}'
            raise ValueError(error_string.format(str(ls)))
    
        # ensure that ws is smaller than a in every dimension
        if np.any(ws > shape):
            error_string = 'ws cannot be larger than a in any dimension. a.shape was {} and ws was {}'
            raise ValueError(error_string.format(str(a.shape),str(ws)))
    
        # how many slices will there be in each dimension?
        newshape = norm_shape(((shape - ws) // ss) + 1)
        # the shape of the strided array will be the number of slices in each dimension
        # plus the shape of the window (tuple addition)
        newshape += norm_shape(ws)
        # the strides tuple will be the array's strides multiplied by step size, plus
        # the array's strides (tuple addition)
        newstrides = norm_shape(np.array(a.strides) * ss) + a.strides
        strided = ast(a,shape = newshape,strides = newstrides)
        if not flatten:
            return strided
    
        # Collapse strided so that it has one more dimension than the window.  I.e.,
        # the new array is a flat list of slices.
        meat = len(ws) if ws.shape else 0
        firstdim = (np.product(newshape[:-meat]),) if ws.shape else ()
        dim = firstdim + (newshape[-meat:])
        # remove any dimensions with size 1
        dim = filter(lambda i : i != 1,dim)
        return strided.reshape(dim)
    

    Usage:

    # 2x2 windows with NO overlap
    b = sliding_window(arr, (2,2), flatten = False)
    c = b.sum((1,2))
    

    Approximate 24% performance improvement using numpy.einsum

    c = np.einsum('ijkl -> ij', b)
    

    One SO Q&A example How can I efficiently process a numpy array in blocks similar to Matlab's blkproc (blockproc) function, the selected answer would work for you.

提交回复
热议问题