Are there any convenience utilities for doing blockwise operations on Numpy arrays?
I am thinking of operations like Ising spin renormalization where you divide a ma
You might be looking for superbatfish's blockwise_view. This uses np.lib.stride_tricks.as_strided
to create a view of the array which places "blocks" of the array in their own axes.
For example, suppose you have a 2D array such as,
In [97]: arr = np.arange(24).reshape(6, 4)
In [98]: arr.shape
Out[98]: (6, 4)
In [99]: arr
Out[99]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
and you wish to "chop it" into 4 blocks of shape (3, 2). You could use blockwise_view to convert it into a 4D array of shape (4, 3, 2):
In [34]: blocked = blockwise_view(arr, (3, 2)); blocked
Out[34]:
array([[[[ 0, 1],
[ 4, 5],
[ 8, 9]],
[[ 2, 3],
[ 6, 7],
[10, 11]]],
[[[12, 13],
[16, 17],
[20, 21]],
[[14, 15],
[18, 19],
[22, 23]]]])
In [37]: blocked.shape
Out[37]: (2, 2, 3, 2)
Now you could reshape it so all the values from one block are in the last axis:
In [41]: reshaped = blocked.reshape(-1, 3*2); reshaped
Out[41]:
array([[ 0, 1, 4, 5, 8, 9],
[ 2, 3, 6, 7, 10, 11],
[12, 13, 16, 17, 20, 21],
[14, 15, 18, 19, 22, 23]])
Now you can sum along that axis, or take its mean or apply some other function to the elements of each block:
In [103]: reshaped.sum(axis=-1)
Out[103]: array([ 27, 39, 99, 111])
In [104]: reshaped.mean(axis=-1)
Out[104]: array([ 4.5, 6.5, 16.5, 18.5])
Unlike my first answer, which can only be applied to 2D arrays,
blockwise_view
can be applied to arbitrary N-dimensional arrays. It returns a
2N-dimensional array where the first N axes index the blocks.
For sliding blockwise operations, you can borrow an implementation from Implement Matlab's im2col_sliding 'sliding' in python that groups each block into a column, thereby blockwise operation would become as easy as operating along the axis = 0
and as such would accept all NumPy ufuncs for vectorized solutions. Here's a formal way to define such a sliding blocks creating function -
def im2col_sliding(A,BLKSZ):
# Parameters
M,N = A.shape
col_extent = N - BLKSZ[1] + 1
row_extent = M - BLKSZ[0] + 1
# Get Starting block indices
start_idx = np.arange(BLKSZ[0])[:,None]*N + np.arange(BLKSZ[1])
# Get offsetted indices across the height and width of input array
offset_idx = np.arange(row_extent)[:,None]*N + np.arange(col_extent)
# Get all actual indices & index into input array for final output
return np.take (A,start_idx.ravel()[:,None] + offset_idx.ravel())
Sample run to calculate blockwise sum
, average
, std
, etc. -
In [6]: arr # Sample array
Out[6]:
array([[6, 5, 0, 6, 0],
[7, 4, 2, 3, 6],
[6, 3, 3, 8, 1],
[5, 5, 1, 1, 8]])
In [7]: im2col_sliding(arr,[2,3]) # Blockwise array with blocksize : (2,3)
Out[7]:
array([[6, 5, 0, 7, 4, 2, 6, 3, 3],
[5, 0, 6, 4, 2, 3, 3, 3, 8],
[0, 6, 0, 2, 3, 6, 3, 8, 1],
[7, 4, 2, 6, 3, 3, 5, 5, 1],
[4, 2, 3, 3, 3, 8, 5, 1, 1],
[2, 3, 6, 3, 8, 1, 1, 1, 8]])
In [8]: np.sum(im2col_sliding(arr,[2,3]),axis=0) # Perform blockwise summation
Out[8]: array([24, 20, 17, 25, 23, 23, 23, 21, 22])
In [9]: np.mean(im2col_sliding(arr,[2,3]),axis=0) # Perform blockwise averaging
Out[9]:
array([ 4. , 3.33333333, 2.83333333, 4.16666667, 3.83333333,
3.83333333, 3.83333333, 3.5 , 3.66666667])
In [10]: np.std(im2col_sliding(arr,[2,3]),axis=0) # Blockwise std. deviation
Out[10]:
array([ 2.38047614, 1.97202659, 2.47767812, 1.77169097, 1.95078332,
2.40947205, 1.67497927, 2.43241992, 3.14466038])
This can also be done with simple reshaping. The idea is to interleave the number of blocks and the block size for each dimension.
For example, with an array of shape (6, 12, 20)
, and targeting a block size of (2, 3, 4)
, reshaping to (3, 2, 4, 3, 5, 4)
will do.
After the reshaping, the block position and the block size axes are alternating:
but they can easily be rearranged with np.transpose()
.
A 2D example follows:
import numpy as np
block_shape = 2, 2
repeats = 3, 3
m = repeats[0] * block_shape[0]
n = repeats[1] * block_shape[1]
arr = np.arange((m * n)).reshape((m, n))
print(arr)
# [[ 0 1 2 3 4 5]
# [ 6 7 8 9 10 11]
# [12 13 14 15 16 17]
# [18 19 20 21 22 23]
# [24 25 26 27 28 29]
# [30 31 32 33 34 35]]
block_shape = tuple(x for nm in zip(repeats, block_shape) for x in nm)
# (3, 2, 3, 2)
repeat_indexes = tuple(range(0, len(block_shape), 2))
# 0, 2
block_indexes = tuple(range(1, len(block_shape), 2))
# 1, 3
block_arr = arr.reshape(block_shape).transpose(repeat_indexes + block_indexes)
print(block_arr)
# [[[[ 0 1]
# [ 6 7]]
# [[ 2 3]
# [ 8 9]]
# [[ 4 5]
# [10 11]]]
# [[[12 13]
# [18 19]]
# [[14 15]
# [20 21]]
# [[16 17]
# [22 23]]]
# [[[24 25]
# [30 31]]
# [[26 27]
# [32 33]]
# [[28 29]
# [34 35]]]]