Find the nXn submatrix with the highest number of 1's

Question

I tried to solve this question but without success:

Given a 3n X 3n boolean matrix, find the nXn submatrix with the highest number of 1's in O(n^2).

I had an idea but it was O(n^3).

The idea:

Count the 1's in the nXn submatrix starting at (0,0), then move to the right, checking only the new column that is added versus the column that is removed. The same idea applies when moving down.
Each such update costs O(n) and there are O(n^2) submatrix positions, so the total is O(n^3), which is too much.

I don't see how to traverse the 3n X 3n matrix (which is O(n^2)) while also calculating the number of 1's of each submatrix in O(1).

Any ideas?

Edit for MBo:

This is the original matrix: [image: the matrix]. After running the two for loops, CS looks like: [image: CS matrix].

So if you want to calculate the sum of the submatrix from (0,0) to (1,1), it will be

sum = CS[1,1] + CS[0,0] - CS[1,0] - CS[0,1] which is 2 + 1 - 2 - 1 = 0

but the real result should be 2.

Answer 1:

Calculate cumulative sums for the matrix

Copy the first row of source matrix A into matrix CS

for every row except for the first one:
   for every column:
       CS[r][c] = CS[r-1][c] + A[r][c]

for every row:
   for every column except for the first one:
       CS[r][c] += CS[r][c-1]

Now you can find the sum of any n x n submatrix with its bottom-right corner at (y, x); omit addends with indices < 0:

 Sum = CS[y][x] + CS[y - n][x - n] - CS[y-n][x] - CS[y][x-n]
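A minimal Python sketch of this lookup, assuming 0-based indices and reading "omit addends with indices < 0" as "treat any CS entry with a negative row or column as 0". The names `build_cs` and `window_sum` are illustrative and do not come from the answer; the 2 x 2 input is one matrix consistent with the CS values quoted in the question, assuming CS[r][c] is indexed by row and then column.

```python
def build_cs(a):
    """Return the 2D cumulative-sum table of matrix a (list of lists)."""
    rows, cols = len(a), len(a[0])
    cs = [row[:] for row in a]
    # Accumulate down each column.
    for r in range(1, rows):
        for c in range(cols):
            cs[r][c] += cs[r - 1][c]
    # Accumulate along each row.
    for r in range(rows):
        for c in range(1, cols):
            cs[r][c] += cs[r][c - 1]
    return cs


def window_sum(cs, y, x, n):
    """Sum of the n x n window whose bottom-right corner is (y, x)."""
    def at(r, c):
        # Addends with a negative index are omitted, i.e. count as 0.
        return cs[r][c] if r >= 0 and c >= 0 else 0
    return at(y, x) + at(y - n, x - n) - at(y - n, x) - at(y, x - n)


# Assumed 2x2 block matching CS[0,0]=1, CS[0,1]=1, CS[1,0]=2, CS[1,1]=2.
a = [[1, 0],
     [1, 0]]
cs = build_cs(a)
print(cs)                       # [[1, 1], [2, 2]]
print(window_sum(cs, 1, 1, 2))  # 2
```

For the window from (0,0) to (1,1) with n = 2, the other three addends all have index -1 and drop out, so the sum is just CS[1][1] = 2, which is the result the question expected.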

For reference: integral image in OpenCV

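import random, pprint

# Random 9x9 boolean matrix (3n x 3n with n = 3)
a = [[random.randint(0, 1) for _ in range(9)] for _ in range(9)]
pprint.PrettyPrinter(indent=2).pprint(a)

# cs is 10x10: the extra last row and column stay 0, so a negative
# index such as cs[-1][c] reads a zero, i.e. an omitted addend
cs = [[0] * 10 for _ in range(10)]
for i in range(9):
    cs[0][i] = a[0][i]
# Accumulate down each column
for r in range(1, 9):
    for c in range(9):
        cs[r][c] = cs[r - 1][c] + a[r][c]

# Accumulate along each row
for r in range(9):
    for c in range(1, 9):
        cs[r][c] += cs[r][c - 1]
pprint.PrettyPrinter(indent=2).pprint(cs)

# Try every 3x3 window, identified by its bottom-right corner (r, c)
maxs = -1
for r in range(2, 9):
    for c in range(2, 9):
        s = cs[r][c] + cs[r - 3][c - 3] - cs[r - 3][c] - cs[r][c - 3]
        if s > maxs:
            maxs = s
            best = (r, c)
print(maxs, best)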


[ [1, 1, 0, 0, 0, 1, 1, 0, 0],
  [1, 1, 0, 1, 0, 0, 0, 1, 0],
  [0, 1, 0, 1, 0, 1, 0, 1, 0],
  [0, 0, 1, 0, 1, 1, 0, 0, 1],
  [0, 0, 0, 1, 1, 0, 0, 1, 1],
  [0, 1, 1, 0, 0, 0, 0, 0, 1],
  [1, 0, 1, 0, 0, 1, 1, 0, 1],
  [1, 1, 0, 1, 0, 1, 0, 0, 0],
  [1, 0, 0, 1, 0, 0, 1, 1, 1]]
[ [1, 2, 2, 2, 2, 3, 4, 4, 4, 0],
  [2, 4, 4, 5, 5, 6, 7, 8, 8, 0],
  [2, 5, 5, 7, 7, 9, 10, 12, 12, 0],
  [2, 5, 6, 8, 9, 12, 13, 15, 16, 0],
  [2, 5, 6, 9, 11, 14, 15, 18, 20, 0],
  [2, 6, 8, 11, 13, 16, 17, 20, 23, 0],
  [3, 7, 10, 13, 15, 19, 21, 24, 28, 0],
  [4, 9, 12, 16, 18, 23, 25, 28, 32, 0],
  [5, 10, 13, 18, 20, 25, 28, 32, 37, 0],
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

6 (4, 5)
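As a sanity check on the printed result: the pair (4, 5) is the bottom-right corner of the best window, so with n = 3 the window covers rows 2 to 4 and columns 3 to 5. The sketch below hard-codes the matrix printed above and counts the 1's in that window directly.

```python
a = [[1, 1, 0, 0, 0, 1, 1, 0, 0],
     [1, 1, 0, 1, 0, 0, 0, 1, 0],
     [0, 1, 0, 1, 0, 1, 0, 1, 0],
     [0, 0, 1, 0, 1, 1, 0, 0, 1],
     [0, 0, 0, 1, 1, 0, 0, 1, 1],
     [0, 1, 1, 0, 0, 0, 0, 0, 1],
     [1, 0, 1, 0, 0, 1, 1, 0, 1],
     [1, 1, 0, 1, 0, 1, 0, 0, 0],
     [1, 0, 0, 1, 0, 0, 1, 1, 1]]
n = 3
r, c = 4, 5  # bottom-right corner reported by the run above

# Slice out rows r-n+1..r and columns c-n+1..c, then count the 1's.
window = [row[c - n + 1:c + 1] for row in a[r - n + 1:r + 1]]
count = sum(map(sum, window))
print(window)  # [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
print(count)   # 6
```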

Answer 2:

First of all, as you pointed out, there are O(n^2) submatrices of size n x n (in fact, exactly (2n+1)^2), so you need to be able to update the number of 1's in O(1) in order to obtain an O(n^2) algorithm.

To update the number of 1's in O(1) while moving along the 3n x 3n matrix, you need some preprocessing; without it, each move costs O(n) and the whole scan takes O(n^3). The idea is to store in each cell the number of 1's to its right and the number of 1's below it (the cell itself included). That is, if your initial matrix is M, you need two extra matrices:

R[i,j] = M[i,j] + R[i,j+1] (counts the number of 1's on its right)

B[i,j] = M[i,j] + B[i+1,j] (counts the number of 1's below)

Then you can compute the number of 1's of any row or column segment in O(1). For instance, the number of 1's in the first third of the first column is B[0,0] - B[n,0], and the number of 1's in the first third of the first row is R[0,0] - R[0,n]. Entries with an index equal to 3n (one past the last row or column) are taken to be 0. This allows you to update the number of 1's in O(1).

For instance, say that, as you suggested, you start by counting the number of 1's of the n x n submatrix whose top-left corner is (0,0). Say you store the counts in the matrix Count, where Count[i][j] is the number of 1's of the submatrix whose top-left corner is (i,j).

Then,

Count[0][1] = Count[0][0] - (B[0][0] - B[n][0]) + (B[0][n] - B[n][n])

That is, we subtract the number of 1's of the first third of the first column and add the number of 1's of the first third of the n-th column (counting from 0).

Similarly,

Count[1][0] = Count[0][0] - (R[0][0] - R[0][n]) + (R[n][0] - R[n][n])

That is, we subtract the number of 1's of the first third of the first row and add the number of 1's of the first third of the n-th row (counting from 0).

And, in general, when moving right and when moving down respectively,

Count[i][j] = Count[i][j-1] - (B[i][j-1] - B[i+n][j-1]) + (B[i][j-1+n] - B[i+n][j-1+n])

Count[i][j] = Count[i-1][j] - (R[i-1][j] - R[i-1][j+n]) + (R[i-1+n][j] - R[i-1+n][j+n])

Therefore, you can update the number of 1's in O(1), by moving along the initial matrix. This means that you will be able to compute the number of 1's in each submatrix, in time O(n^2). The answer will just be the maximum.
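The sliding update can be sketched in Python as follows. This is one possible arrangement, not the answerer's own code: it uses R for row segments and B for column segments, as their definitions imply, pads both with a zero border so that index 3n reads as 0, fills the first column of Count by moving down and each row by moving right, and checks the result against a brute-force count. The names `best_window` and `brute_force` are illustrative.

```python
import random


def best_window(m, n):
    """Return (max_count, (i, j)) with (i, j) the top-left corner."""
    size = len(m)
    # Suffix sums with a zero border: R along rows, B along columns.
    R = [[0] * (size + 1) for _ in range(size + 1)]
    B = [[0] * (size + 1) for _ in range(size + 1)]
    for i in range(size - 1, -1, -1):
        for j in range(size - 1, -1, -1):
            R[i][j] = m[i][j] + R[i][j + 1]
            B[i][j] = m[i][j] + B[i + 1][j]

    k = size - n + 1  # number of window positions per axis
    count = [[0] * k for _ in range(k)]
    # Count[0][0] is computed directly from its n row segments.
    count[0][0] = sum(R[i][0] - R[i][n] for i in range(n))
    for i in range(k):
        if i > 0:
            # Move down: drop row i-1, add row i-1+n.
            count[i][0] = (count[i - 1][0]
                           - (R[i - 1][0] - R[i - 1][n])
                           + (R[i - 1 + n][0] - R[i - 1 + n][n]))
        for j in range(1, k):
            # Move right: drop column j-1, add column j-1+n.
            count[i][j] = (count[i][j - 1]
                           - (B[i][j - 1] - B[i + n][j - 1])
                           + (B[i][j - 1 + n] - B[i + n][j - 1 + n]))
    return max((count[i][j], (i, j)) for i in range(k) for j in range(k))


def brute_force(m, n):
    """Slow reference that recounts every window; returns the max count."""
    k = len(m) - n + 1
    return max(sum(m[i + di][j + dj] for di in range(n) for dj in range(n))
               for i in range(k) for j in range(k))


n = 3
m = [[random.randint(0, 1) for _ in range(3 * n)] for _ in range(3 * n)]
best, corner = best_window(m, n)
assert best == brute_force(m, n)
print(best, corner)
```

Every cell of R, B and Count is filled with a constant number of operations, so the total work is proportional to the (3n)^2 cells, i.e. O(n^2). When several windows tie for the maximum, this sketch returns the one with the largest (i, j).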

Hope it helped!

In order to initialize the R matrix, we first assign the values of the last column, which are the values of the last column of the matrix, and then we apply the recursion (R[i,j] = M[i,j] + R[i,j+1]) from right to left.

for (i = 0...3n-1){
    R[i][3n-1] = M[i][3n-1]

    for (j = 3n-2...0) R[i][j] = M[i][j] + R[i][j+1]
}

The same with B, but with rows instead of columns and from bottom to top:

for (j = 0...3n-1){
    B[3n-1][j] = M[3n-1][j]

    for (i = 3n-2...0) B[i][j] = M[i][j] + B[i+1][j]
}
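A runnable Python counterpart of the two initialization loops, as a sketch. It deviates from the pseudocode in one respect, which is an assumption of this example: instead of seeding the last column and last row separately, R and B get one extra zero row and column, so the same recurrence covers every cell and lookups at index 3n return 0. The names `suffix_sums`, `row_segment` and `col_segment` are illustrative.

```python
def suffix_sums(m):
    """Build R (1's at or right of each cell) and B (1's at or below)."""
    size = len(m)
    # One extra zero row and column so that index `size` reads as 0.
    R = [[0] * (size + 1) for _ in range(size + 1)]
    B = [[0] * (size + 1) for _ in range(size + 1)]
    for i in range(size - 1, -1, -1):
        for j in range(size - 1, -1, -1):
            R[i][j] = m[i][j] + R[i][j + 1]
            B[i][j] = m[i][j] + B[i + 1][j]
    return R, B


def row_segment(R, i, j, n):
    """Number of 1's in row i, columns j .. j+n-1."""
    return R[i][j] - R[i][j + n]


def col_segment(B, i, j, n):
    """Number of 1's in column j, rows i .. i+n-1."""
    return B[i][j] - B[i + n][j]


# Small 3 x 3 input with segment length 2, chosen only for illustration.
m = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]
R, B = suffix_sums(m)
print(row_segment(R, 0, 0, 2))  # 1  (row 0, columns 0..1: 1, 0)
print(col_segment(B, 0, 0, 2))  # 2  (column 0, rows 0..1: 1, 1)
```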

Source: https://stackoverflow.com/questions/62653555/find-the-nxn-submatrix-with-the-highest-number-of-1s

Tags: algorithm, math, matrix