How can I sort contours from left to right and top to bottom?

[愿得一人] 2020-12-06 07:14

I am trying to build a character recognition program using Python. I am stuck on sorting the contours. I am using this page as a reference.

I managed to find the c

4 Answers
  • 2020-12-06 07:44

    This is the approach I used when solving my task (it is not optimized and can probably be improved):

    import cv2
    import matplotlib
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    # Jupyter/IPython magic; remove this line when running as a plain script.
    %matplotlib inline
    matplotlib.rcParams['figure.figsize'] = (20.0, 10.0)
    matplotlib.rcParams['image.cmap'] = 'gray'
    
    imageCopy = cv2.imread("./test.png")
    imageGray = cv2.imread("./test.png", 0)
    image = imageCopy.copy()
    
    # OpenCV 4.x returns (contours, hierarchy); OpenCV 3.x returns three values.
    contours, hierarchy = cv2.findContours(imageGray, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
    bboxes = [cv2.boundingRect(c) for c in contours]
    bboxes = sorted(bboxes, key=lambda b: b[0])  # initial sort by x

    df = pd.DataFrame(bboxes, columns=['x', 'y', 'w', 'h'], dtype=int)
    df["x2"] = df["x"] + df["w"]  # x coordinate of the right edge
    df = df.sort_values(["x", "y", "x2"])
    
    # Make a couple of passes, swapping neighbouring rows by their coordinates
    # to finish sorting them.
    for i in range(2):
        for ind in range(len(df) - 1):
            cur, nxt = df.iloc[ind], df.iloc[ind + 1]
            # Swap when the current box reaches past the next box's left edge
            # and also sits lower on the page.
            if cur["x2"] > nxt["x"] and cur["y"] > nxt["y"]:
                df.iloc[ind], df.iloc[ind + 1] = nxt.copy(), cur.copy()
    num = 0
    for box in df.values.tolist():
        x, y, w, h, x2 = box
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 255), 2)
        # Label the box with its position in the sorted order
        cv2.putText(image, "{}".format(num + 1), (x + 40, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
        num += 1
    plt.imshow(image[:, :, ::-1])  # convert BGR to RGB for display
    

    [Images omitted: the original sorting; the top-to-bottom, left-to-right sorting; and the original test image.]

  • 2020-12-06 07:59

    Sorting with contours.sort(key=lambda r: round(float(r[1] / nearest))) has an effect similar to using the key int(nearest * round(float(r[1]) / nearest)) * max_width + r[0].
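
    A small sketch of how the two keys differ, using made-up boxes and assumed values for nearest (70) and max_width (600). The function names row_only_key and combined_key are hypothetical. Because Python's sort is stable, the row-only key leaves boxes within a row in their incoming order, while the combined key also orders them by x:

```python
# Hypothetical (x, y, w, h) boxes: two rows, deliberately out of order.
boxes = [(300, 117, 20, 45), (40, 33, 20, 45),
         (100, 116, 20, 45), (200, 32, 20, 45)]
nearest = 70     # assumed row-grouping step
max_width = 600  # assumed to exceed every x value


def row_only_key(r):
    # Groups by row only; ties keep their input order (stable sort).
    return round(float(r[1] / nearest))


def combined_key(r):
    # Row band scaled by max_width, plus x, gives one scalar per box.
    return int(nearest * round(float(r[1]) / nearest)) * max_width + r[0]


by_row = sorted(boxes, key=row_only_key)
by_row_then_x = sorted(boxes, key=combined_key)
print(by_row)
print(by_row_then_x)
```

    In this sketch the second row comes out as x=300, x=100 with the row-only key and as x=100, x=300 with the combined key.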

  • 2020-12-06 08:07

    I don't think you are going to be able to generate the contours directly in the correct order, but a simple sort as follows should do what you need:

    import numpy as np
    
    # rect.npy holds one (x, y, w, h) bounding box per row
    c = np.load(r"rect.npy")
    contours = list(c)

    # Example - contours = [(287, 117, 13, 46), (102, 117, 34, 47), (513, 116, 36, 49),
    #   (454, 116, 32, 49), (395, 116, 28, 48), (334, 116, 31, 49),
    #   (168, 116, 26, 49), (43, 116, 30, 48), (224, 115, 33, 50),
    #   (211, 33, 34, 47), (45, 33, 13, 46), (514, 32, 32, 49),
    #   (455, 32, 31, 49), (396, 32, 29, 48), (275, 32, 28, 48),
    #   (156, 32, 26, 49), (91, 32, 30, 48), (333, 31, 33, 50)]

    max_height = np.max(c[:, 3])  # tallest bounding box
    nearest = max_height * 1.4    # row step: tallest box times a line-spacing factor

    # Key: [estimated row, x offset]
    contours.sort(key=lambda r: [int(nearest * round(float(r[1]) / nearest)), r[0]])
    
    for x, y, w, h in contours:
        print(f"{x:4} {y:4} {w:4} {h:4}") 
    

    This would display the following output:

      36   45   33   40
      76   44   29   43
     109   43   29   45
     145   44   32   43
     184   44   21   43
     215   44   21   41
     241   43   34   45
     284   46   31   39
     324   46    7   39
     337   46   14   41
     360   46   26   39
     393   46   20   41
     421   45   45   41
     475   45   32   41
     514   43   38   45
      39  122   26   41
      70  121   40   48
     115  123   27   40
     148  121   25   45
     176  122   28   41
     212  124   30   41
     247  124   91   40
     342  124   28   39
     375  124   27   39
     405  122   27   43
      37  210   25   33
      69  199   28   44
     102  210   21   33
     129  199   28   44
     163  210   26   33
     195  197   16   44
     214  210   27   44
     247  199   25   42
     281  212    7   29
     292  212   11   42
     310  199   23   43
     340  199    7   42
     355  211   43   30
     406  213   24   28
     437  209   31   35
     473  210   28   43
     506  210   28   43
     541  210   17   31
      37  288   21   33
      62  282   15   39
      86  290   24   28
     116  290   72   30
     192  290   23   30
     218  290   26   41
     249  288   20   33
    

    It works by grouping similar y values into row values, and then sorting by the x offset of the rectangle. The key is a list holding the estimated row and then the x offset.

    The maximum height of a single rectangle is calculated to determine a suitable grouping value for nearest. The 1.4 value is a line spacing value. This could also be calculated automatically. So for both of your examples nearest is about 70.
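
    To make the grouping concrete, here is a minimal pure-Python sketch that applies the same key to the sample boxes from the example comment above. The helper name row_band is hypothetical and only pulls the row part of the key out so it can be inspected:

```python
# Sample (x, y, w, h) boxes taken from the example comment in the answer.
contours = [(287, 117, 13, 46), (102, 117, 34, 47), (513, 116, 36, 49),
            (454, 116, 32, 49), (395, 116, 28, 48), (334, 116, 31, 49),
            (168, 116, 26, 49), (43, 116, 30, 48), (224, 115, 33, 50),
            (211, 33, 34, 47), (45, 33, 13, 46), (514, 32, 32, 49),
            (455, 32, 31, 49), (396, 32, 29, 48), (275, 32, 28, 48),
            (156, 32, 26, 49), (91, 32, 30, 48), (333, 31, 33, 50)]

max_height = max(h for _, _, _, h in contours)  # tallest box
nearest = max_height * 1.4                      # row-grouping step


def row_band(y):
    # Snap a y coordinate to the nearest multiple of `nearest`.
    return int(nearest * round(float(y) / nearest))


# Distinct bands: one per text line in this sample.
bands = sorted({row_band(y) for _, y, _, _ in contours})
ordered = sorted(contours, key=lambda r: [row_band(r[1]), r[0]])
print(bands)
print([r[0] for r in ordered])
```

    Here the y values 31 to 33 fall into one band and 115 to 117 into another, so the boxes come out as two rows, each ordered by x. Boxes whose y lies close to a half multiple of nearest can end up in different bands, so the spacing factor may need tuning for other images.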

    The calculations could also be done directly in numpy.
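
    One way this could look in numpy, as a sketch rather than the answer's own code: compute a row index for every box at once and let np.lexsort order by row and then x. The array below is a subset of the example boxes standing in for the loaded file, and the names rows and order are assumptions:

```python
import numpy as np

# Subset of the example (x, y, w, h) boxes, standing in for the loaded array.
c = np.array([(287, 117, 13, 46), (102, 117, 34, 47), (224, 115, 33, 50),
              (211, 33, 34, 47), (45, 33, 13, 46), (333, 31, 33, 50)])

max_height = c[:, 3].max()
nearest = max_height * 1.4

# Row index for every box at once.
rows = np.round(c[:, 1] / nearest).astype(int)

# np.lexsort uses the LAST key as the primary key: rows first, then x.
order = np.lexsort((c[:, 0], rows))
sorted_boxes = c[order]
print(sorted_boxes)
```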

  • 2020-12-06 08:10

    After finding the contours with contours, hierarchy = cv2.findContours(...), use:

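    import cv2
    import numpy as np

    boundary = []
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        boundary.append((x, y, w, h))
    count = np.asarray(boundary)
    max_height = np.max(count[:, 3])
    nearest = max_height * 1.4
    # Group similar y values into rows so that small vertical offsets within
    # a line do not break the left-to-right order.
    rows = np.round(count[:, 1] / nearest).astype(int)
    # np.lexsort treats the last key as primary: row first, then x.
    ind_list = np.lexsort((count[:, 0], rows))

    c = count[ind_list]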
    

    Now c holds the bounding boxes sorted top to bottom and, within each row, left to right.
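
    The index array can also reorder the contours themselves, not only their bounding boxes. A minimal sketch under stated assumptions: bounding_rect is a stand-in for cv2.boundingRect so the example runs without OpenCV, make_box is a hypothetical helper that builds rectangular contours in OpenCV's (N, 1, 2) point layout, and the y values are grouped into rows with the same nearest step as in the answer above:

```python
import numpy as np


def bounding_rect(cnt):
    # Stand-in for cv2.boundingRect so the sketch runs without OpenCV.
    pts = cnt.reshape(-1, 2)
    x, y = pts.min(axis=0)
    w, h = pts.max(axis=0) - pts.min(axis=0) + 1
    return int(x), int(y), int(w), int(h)


def make_box(x, y, w, h):
    # Hypothetical helper: a rectangular contour shaped like OpenCV's (N, 1, 2).
    return np.array([[[x, y]], [[x + w - 1, y]],
                     [[x + w - 1, y + h - 1]], [[x, y + h - 1]]])


# Made-up contours: two rows of two boxes, in scrambled order.
contours = [make_box(200, 118, 30, 48), make_box(40, 33, 30, 50),
            make_box(120, 30, 30, 48), make_box(45, 116, 30, 49)]

boxes = np.array([bounding_rect(cnt) for cnt in contours])
nearest = boxes[:, 3].max() * 1.4
rows = np.round(boxes[:, 1] / nearest).astype(int)
ind_list = np.lexsort((boxes[:, 0], rows))  # rows primary, x secondary

# Apply the same index order to the contours, not just the boxes.
sorted_contours = [contours[i] for i in ind_list]
sorted_boxes = boxes[ind_list]
print(sorted_boxes)
```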
