Find largest rectangle containing only zeros in an N×N binary matrix

前端 未结 8 2186
伪装坚强ぢ
伪装坚强ぢ 2020-11-22 06:25

Given an NxN binary matrix (containing only 0\'s or 1\'s), how can we go about finding largest rectangle containing all 0\'s?

Example:

      I
    0          


        
相关标签:
8条回答
  • 2020-11-22 06:33

    Here is a Python3 solution, which returns the position in addition to the area of the largest rectangle:

    #!/usr/bin/env python3
    
    import numpy
    
    s = '''0 0 0 0 1 0
    0 0 1 0 0 1
    0 0 0 0 0 0
    1 0 0 0 0 0
    0 0 0 0 0 1
    0 0 1 0 0 0'''
    
    nrows = 6
    ncols = 6
    skip = 1
    area_max = (0, [])
    
    a = numpy.fromstring(s, dtype=int, sep=' ').reshape(nrows, ncols)
    w = numpy.zeros(dtype=int, shape=a.shape)
    h = numpy.zeros(dtype=int, shape=a.shape)
    for r in range(nrows):
        for c in range(ncols):
            if a[r][c] == skip:
                continue
            if r == 0:
                h[r][c] = 1
            else:
                h[r][c] = h[r-1][c]+1
            if c == 0:
                w[r][c] = 1
            else:
                w[r][c] = w[r][c-1]+1
            minw = w[r][c]
            for dh in range(h[r][c]):
                minw = min(minw, w[r-dh][c])
                area = (dh+1)*minw
                if area > area_max[0]:
                    area_max = (area, [(r-dh, c-minw+1, r, c)])
    
    print('area', area_max[0])
    for t in area_max[1]:
        print('Cell 1:({}, {}) and Cell 2:({}, {})'.format(*t))
    

    Output:

    area 12
    Cell 1:(2, 1) and Cell 2:(4, 4)
    
    0 讨论(0)
  • 2020-11-22 06:35

    I propose a O(nxn) method.

    First, you can list all the maximum empty rectangles. Empty means that it covers only 0s. A maximum empty rectangle is such that it cannot be extended in a direction without covering (at least) one 1.

    A paper presenting a O(nxn) algorithm to create such a list can be found at www.ulg.ac.be/telecom/rectangles as well as source code (not optimized). There is no need to store the list, it is sufficient to call a callback function each time a rectangle is found by the algorithm, and to store only the largest one (or choose another criterion if you want).

    Note that a proof exists (see the paper) that the number of largest empty rectangles is bounded by the number of pixels of the image (nxn in this case).

    Therefore, selecting the optimal rectangle can be done in O(nxn), and the overall method is also O(nxn).

    In practice, this method is very fast, and is used for realtime video stream analysis.

    0 讨论(0)
  • 2020-11-22 06:41

    Here's a solution based on the "Largest Rectangle in a Histogram" problem suggested by @j_random_hacker in the comments:

    [Algorithm] works by iterating through rows from top to bottom, for each row solving this problem, where the "bars" in the "histogram" consist of all unbroken upward trails of zeros that start at the current row (a column has height 0 if it has a 1 in the current row).

    The input matrix mat may be an arbitrary iterable e.g., a file or a network stream. Only one row is required to be available at a time.

    #!/usr/bin/env python
    from collections import namedtuple
    from operator import mul
    
    Info = namedtuple('Info', 'start height')
    
    def max_size(mat, value=0):
        """Find height, width of the largest rectangle containing all `value`'s."""
        it = iter(mat)
        hist = [(el==value) for el in next(it, [])]
        max_size = max_rectangle_size(hist)
        for row in it:
            hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)]
            max_size = max(max_size, max_rectangle_size(hist), key=area)
        return max_size
    
    def max_rectangle_size(histogram):
        """Find height, width of the largest rectangle that fits entirely under
        the histogram.
        """
        stack = []
        top = lambda: stack[-1]
        max_size = (0, 0) # height, width of the largest rectangle
        pos = 0 # current position in the histogram
        for pos, height in enumerate(histogram):
            start = pos # position where rectangle starts
            while True:
                if not stack or height > top().height:
                    stack.append(Info(start, height)) # push
                elif stack and height < top().height:
                    max_size = max(max_size, (top().height, (pos - top().start)),
                                   key=area)
                    start, _ = stack.pop()
                    continue
                break # height == top().height goes here
    
        pos += 1
        for start, height in stack:
            max_size = max(max_size, (height, (pos - start)), key=area)    
        return max_size
    
    def area(size):
        return reduce(mul, size)
    

    The solution is O(N), where N is the number of elements in a matrix. It requires O(ncols) additional memory, where ncols is the number of columns in a matrix.

    Latest version with tests is at https://gist.github.com/776423

    0 讨论(0)
  • 2020-11-22 06:45

    Here is J.F. Sebastians method translated into C#:

    private Vector2 MaxRectSize(int[] histogram) {
            Vector2 maxSize = Vector2.zero;
            int maxArea = 0;
            Stack<Vector2> stack = new Stack<Vector2>();
    
            int x = 0;
            for (x = 0; x < histogram.Length; x++) {
                int start = x;
                int height = histogram[x];
                while (true) {
                    if (stack.Count == 0 || height > stack.Peek().y) {
                        stack.Push(new Vector2(start, height));
    
                    } else if(height < stack.Peek().y) {
                        int tempArea = (int)(stack.Peek().y * (x - stack.Peek().x));
                        if(tempArea > maxArea) {
                            maxSize = new Vector2(stack.Peek().y, (x - stack.Peek().x));
                            maxArea = tempArea;
                        }
    
                        Vector2 popped = stack.Pop();
                        start = (int)popped.x;
                        continue;
                    }
    
                    break;
                }
            }
    
            foreach (Vector2 data in stack) {
                int tempArea = (int)(data.y * (x - data.x));
                if(tempArea > maxArea) {
                    maxSize = new Vector2(data.y, (x - data.x));
                    maxArea = tempArea;
                }
            }
    
            return maxSize;
        }
    
        public Vector2 GetMaximumFreeSpace() {
            // STEP 1:
            // build a seed histogram using the first row of grid points
            // example: [true, true, false, true] = [1,1,0,1]
            int[] hist = new int[gridSizeY];
            for (int y = 0; y < gridSizeY; y++) {
                if(!invalidPoints[0, y]) {
                    hist[y] = 1;
                }
            }
    
            // STEP 2:
            // get a starting max area from the seed histogram we created above.
            // using the example from above, this value would be [1, 1], as the only valid area is a single point.
            // another example for [0,0,0,1,0,0] would be [1, 3], because the largest area of contiguous free space is 3.
            // Note that at this step, the heigh fo the found rectangle will always be 1 because we are operating on
            // a single row of data.
            Vector2 maxSize = MaxRectSize(hist);
            int maxArea = (int)(maxSize.x * maxSize.y);
    
            // STEP 3:
            // build histograms for each additional row, re-testing for new possible max rectangluar areas
            for (int x = 1; x < gridSizeX; x++) {
                // build a new histogram for this row. the values of this row are
                // 0 if the current grid point is occupied; otherwise, it is 1 + the value
                // of the previously found historgram value for the previous position. 
                // What this does is effectly keep track of the height of continous avilable spaces.
                // EXAMPLE:
                //      Given the following grid data (where 1 means occupied, and 0 means free; for clairty):
                //          INPUT:        OUTPUT:
                //      1.) [0,0,1,0]   = [1,1,0,1]
                //      2.) [0,0,1,0]   = [2,2,0,2]
                //      3.) [1,1,0,1]   = [0,0,1,0]
                //
                //  As such, you'll notice position 1,0 (row 1, column 0) is 2, because this is the height of contiguous
                //  free space.
                for (int y = 0; y < gridSizeY; y++) {                
                    if(!invalidPoints[x, y]) {
                        hist[y] = 1 + hist[y];
                    } else {
                        hist[y] = 0;
                    }
                }
    
                // find the maximum size of the current histogram. If it happens to be larger
                // that the currently recorded max size, then it is the new max size.
                Vector2 maxSizeTemp = MaxRectSize(hist);
                int tempArea = (int)(maxSizeTemp.x * maxSizeTemp.y);
                if (tempArea > maxArea) {
                    maxSize = maxSizeTemp;
                    maxArea = tempArea;
                }
            }
    
            // at this point, we know the max size
            return maxSize;            
        }
    

    A few things to note about this:

    1. This version is meant for use with the Unity API. You can easily make this more generic by replacing instances of Vector2 with KeyValuePair. Vector2 is only used for a convenient way to store two values.
    2. invalidPoints[] is an array of bool, where true means the grid point is "in use", and false means it is not.
    0 讨论(0)
  • 2020-11-22 06:45

    Here is a version of jfs' solution, which also delivers the position of the largest rectangle:

    from collections import namedtuple
    from operator import mul
    
    Info = namedtuple('Info', 'start height')
    
    def max_rect(mat, value=0):
        """returns (height, width, left_column, bottom_row) of the largest rectangle 
        containing all `value`'s.
    
        Example:
        [[0, 0, 0, 0, 0, 0, 0, 0, 3, 2],
         [0, 4, 0, 2, 4, 0, 0, 1, 0, 0],
         [1, 0, 1, 0, 0, 0, 3, 0, 0, 4],
         [0, 0, 0, 0, 4, 2, 0, 0, 0, 0],
         [0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
         [4, 3, 0, 0, 1, 2, 0, 0, 0, 0],
         [3, 0, 0, 0, 2, 0, 0, 0, 0, 4],
         [0, 0, 0, 1, 0, 3, 2, 4, 3, 2],
         [0, 3, 0, 0, 0, 2, 0, 1, 0, 0]]
         gives: (3, 4, 6, 5)
        """
        it = iter(mat)
        hist = [(el==value) for el in next(it, [])]
        max_rect = max_rectangle_size(hist) + (0,)
        for irow,row in enumerate(it):
            hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)]
            max_rect = max(max_rect, max_rectangle_size(hist) + (irow+1,), key=area)
            # irow+1, because we already used one row for initializing max_rect
        return max_rect
    
    def max_rectangle_size(histogram):
        stack = []
        top = lambda: stack[-1]
        max_size = (0, 0, 0) # height, width and start position of the largest rectangle
        pos = 0 # current position in the histogram
        for pos, height in enumerate(histogram):
            start = pos # position where rectangle starts
            while True:
                if not stack or height > top().height:
                    stack.append(Info(start, height)) # push
                elif stack and height < top().height:
                    max_size = max(max_size, (top().height, (pos - top().start), top().start), key=area)
                    start, _ = stack.pop()
                    continue
                break # height == top().height goes here
    
        pos += 1
        for start, height in stack:
            max_size = max(max_size, (height, (pos - start), start), key=area)
    
        return max_size
    
    def area(size):
        return size[0] * size[1]
    
    0 讨论(0)
  • 2020-11-22 06:47

    Please take a look at Maximize the rectangular area under Histogram and then continue reading the solution below.

    Traverse the matrix once and store the following;
    
    For x=1 to N and y=1 to N    
    F[x][y] = 1 + F[x][y-1] if A[x][y] is 0 , else 0
    
    Then for each row for x=N to 1 
    We have F[x] -> array with heights of the histograms with base at x.
    Use O(N) algorithm to find the largest area of rectangle in this histogram = H[x]
    
    From all areas computed, report the largest.
    

    Time complexity is O(N*N) = O(N²) (for an NxN binary matrix)

    Example:

    Initial array    F[x][y] array
     0 0 0 0 1 0     1 1 1 1 0 1
     0 0 1 0 0 1     2 2 0 2 1 0
     0 0 0 0 0 0     3 3 1 3 2 1
     1 0 0 0 0 0     0 4 2 4 3 2
     0 0 0 0 0 1     1 5 3 5 4 0
     0 0 1 0 0 0     2 6 0 6 5 1
    
     For x = N to 1
     H[6] = 2 6 0 6 5 1 -> 10 (5*2)
     H[5] = 1 5 3 5 4 0 -> 12 (3*4)
     H[4] = 0 4 2 4 3 2 -> 10 (2*5)
     H[3] = 3 3 1 3 2 1 -> 6 (3*2)
     H[2] = 2 2 0 2 1 0 -> 4 (2*2)
     H[1] = 1 1 1 1 0 1 -> 4 (1*4)
    
     The largest area is thus H[5] = 12
    
    0 讨论(0)
提交回复
热议问题