问题
I have to apply dct2
to small windows of my image, preferably by using an overlapping window.
I have found out that there are two functions in Matlab that can make this possible blockproc
and im2col
. I am having trouble understanding either and would appreciate some clarification.
blockproc
can be used to implement my function on a sliding window using the BorderSize
and TrimBorder
arguments.
B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');
I realize that this creates a block of [64 + 2*5, 64 + 2*5]
and applies the function @fun
on each block. But since I cannot go into my function @fun
in debugging to verify proper operation I cannot be sure this is what I need.
Is my above code correct for what I need? I know that I get a concatenated result in B
but it should be on a overlapping sliding block. Will this achieve what I need?
The second is im2col
. im2col(A,[m n],block_type)
will divide the block into m by n blocks and arrange them in columns, so each column is a block? If so, how is the overlapping controlled? And if each block is a column can I successfully apply the dct2
function on each column? Because I doubt it will take vectors as input?
Some clarification would be greatly appreciated.
回答1:
OK, this is quite a complex question. I'll try and break this up into separate parts and will answer each question separately.
Question #1
blockproc
can be used to implement my function on a sliding window using theBorderSize
andTrimBorder
arguments.
B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');
I realize that this creates a block of
[64 + 2*5, 64 + 2*5]
and applies the function@fun
on each block. But since I cannot go into my function@fun
in debugging to verify proper operation I cannot be sure this is what I need. Is my above code correct for what I need? I know that I get a concatenated result inB
but it should be on a overlapping sliding block. Will this achieve what I need?
After experimenting around with blockproc
, this is indeed correct where you can use it to get sliding neighbourhood processing working. However, you're going to need an additional flag, which is PadPartialBlocks
. The purpose of this flag is so that if you are extracting a block where you're at the outer edges of the image and you can't make a block of a specified size, this will zero-pad this partial block so that it conforms to the same size. Here's a small example to get this working with sliding windows. Supposing we had a matrix such that:
>> A = reshape(1:25,5,5)
A =
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
Let's say we wanted to take the average of each 3 x 3 overlapping neighbourhood in the matrix above, and zero-padding those elements that go beyond the borders of the matrix. You would do this with blockproc
:
B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true);
What's important to note is that the block size, which is 1 x 1 in this case and BorderSize
which is 1 x 1 as well are set differently than what you'd expect for a 3 x 3 block. To go into why this is the case, we need some further insight on how BorderSize
works. For a given centre of a block, BorderSize
allows you to capture values / pixels beyond the dimensions of the originally sized block. For those locations that go beyond the borders of the matrix, we would pad these locations as zero by default. BorderSize
allows us to capture 2M + 2N
pixels more, where M
and N
are the horizontal and vertical border size you want. This would allow us to capture M
more pixels both above and below the original block and N
more pixels to the left and right of the original block.
Therefore, for the value of 1 in A
, if the block size is 1 x 1, this means that the element consists of only 1, and if our BorderSize
was 1 x 1. This means our final block would be:
0 0 0
0 1 6
0 2 7
Because our block size is 1, the next block would be centred at 6, and we would get a 3 x 3 grid of pixels and so on. It is also important that TrimBorder
is set to false
so that we can keep those pixels that were originally captured upon expansion of the block. The default is set to true
. Finally, PadPartialBlocks
is true
to ensure that all blocks are the same size. When you run the above code, the result we get is:
B =
1.7778 4.3333 7.6667 11.0000 8.4444
3.0000 7.0000 12.0000 17.0000 13.0000
3.6667 8.0000 13.0000 18.0000 13.6667
4.3333 9.0000 14.0000 19.0000 14.3333
3.1111 6.3333 9.6667 13.0000 9.7778
You can verify that we get the same result using nlfilter where we can apply the mean to 3 x 3 sliding neighbourhoods:
C = nlfilter(A, [3 3], @(x) mean(x(:)))
C =
1.7778 4.3333 7.6667 11.0000 8.4444
3.0000 7.0000 12.0000 17.0000 13.0000
3.6667 8.0000 13.0000 18.0000 13.6667
4.3333 9.0000 14.0000 19.0000 14.3333
3.1111 6.3333 9.6667 13.0000 9.7778
As such, if you want to properly use blockproc
for sliding operations, you need to be careful on how you set the block size and border size respectively. In this case, the general rule is to always set your block size to be 1 x 1, and allow BorderSize
to specify the size of each block you want. Specifically, for a block of size K x K
, you would set BorderSize
to be floor(K/2) x floor(K/2)
respectively. It would make things easy if K
was odd.
For example, if you wanted a 5 x 5
mean filtering operation on a sliding window basis, you would set BorderSize
to [2 2]
, as K = 5
and floor(K/2) = 2
. Therefore, you would do this:
B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [2 2], 'TrimBorder', false, 'PadPartialBlocks', true)
B =
2.5200 4.5600 7.2000 6.9600 6.1200
3.6000 6.4000 10.0000 9.6000 8.4000
4.8000 8.4000 13.0000 12.4000 10.8000
4.0800 7.0400 10.8000 10.2400 8.8800
3.2400 5.5200 8.4000 7.9200 6.8400
Replicating this with nlfilter
with a size of 5 x 5 also gives:
C = nlfilter(A, [5 5], @(x) mean(x(:)))
C =
2.5200 4.5600 7.2000 6.9600 6.1200
3.6000 6.4000 10.0000 9.6000 8.4000
4.8000 8.4000 13.0000 12.4000 10.8000
4.0800 7.0400 10.8000 10.2400 8.8800
3.2400 5.5200 8.4000 7.9200 6.8400
I was doing some timing tests, and it seems that blockproc
used in this context is faster than nlfilter
.
Question #2
The second is
im2col
.im2col(A,[m n],block_type)
will divide the block into m by n blocks and arrange them in columns, so each column is a block? If so, how is the overlapping controlled? And if each block is a column can I successfully apply thedct2
function on each column? Because I doubt it will take vectors as input?
You are correct in that im2col
transforms each pixel neighbourhood or block into a single column and the concatenation of these columns forms the output matrix. You can control whether the blocks are overlapping or are distinct by the block_type
parameter. Specify distinct
or sliding
(which is default) to control this. You can also control the size of each neighbourhood with m
and n
.
However, if it is your goal to apply dct2
with the output of im2col
, then you will not get what you desire. Specifically, dct2
takes the spatial location of each data point within your 2D data into account and is used as part of the transform. By transforming each pixel neighbourhood into a single column, the 2D spatial relationships that were originally there for each block are now gone. dct2
expects 2D spatial data, but you would be specifying 1D data instead. As such, im2col
is probably not what you're looking for. If I understand what you want correctly, you'll want to use blockproc
instead.
Hope this helps!
来源:https://stackoverflow.com/questions/29040688/overlapping-sliding-window-over-an-image-using-blockproc-or-im2col