Text extraction - line-by-line

夙愿已清 提交于 2019-12-23 10:58:26

问题


I am using Google Vision API, primarily to extract texts. I works fine, but for specific cases where I would need the API to scan the enter line, spits out the text before moving to the next line. However, it appears that the API is using some kind of logic that makes it scan top to bottom on the left side and moving to right side and doing a top to bottom scan. I would have liked if the API read left-to-right, move down and so on.

For example, consider the image:

The API returns the text like this:

“ Name DOB Gender: Lives In John Doe 01-Jan-1970 LA ”

Whereas, I would have expected something like this:

“ Name: John Doe DOB: 01-Jan-1970 Gender: M Lives In: LA ”

I suppose there is a way to define the block size or margin setting (?) to read the image/scan line by line?

Thanks for your help. Alex


回答1:


This might be a late answer but adding it for future reference. You can add feature hints to your JSON request to get the desired results.

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "https://i.stack.imgur.com/TRTXo.png"
        }
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ]
    }
  ]
}

For text which are very far apart the DOCUMENT_TEXT_DETECTION also does not provide proper line segmentation.

The following code does simple line segmentation based on the character polygon coordinates.

https://github.com/sshniro/line-segmentation-algorithm-to-gcp-vision




回答2:


You can extract the text based on the bounds per line too, you can use boundyPoly and concatenate the text in the same line

"boundingPoly": {
        "vertices": [
          {
            "x": 87,
            "y": 148
          },
          {
            "x": 411,
            "y": 148
          },
          {
            "x": 411,
            "y": 206
          },
          {
            "x": 87,
            "y": 206
          }
        ]

for example this 2 words are in the same "line"

"description": "you",
      "boundingPoly": {
        "vertices": [
          {
            "x": 362,
            "y": 1406
          },
          {
            "x": 433,
            "y": 1406
          },
          {
            "x": 433,
            "y": 1448
          },
          {
            "x": 362,
            "y": 1448
          }
        ]
      }
    },
    {
      "description": "start",
      "boundingPoly": {
        "vertices": [
          {
            "x": 446,
            "y": 1406
          },
          {
            "x": 540,
            "y": 1406
          },
          {
            "x": 540,
            "y": 1448
          },
          {
            "x": 446,
            "y": 1448
          }
        ]
      }
    }



回答3:


Here a simple code to read line by line. y-axis for lines and x-axis for each word in the line.

items = []
lines = {}

for text in response.text_annotations[1:]:
    top_x_axis = text.bounding_poly.vertices[0].x
    top_y_axis = text.bounding_poly.vertices[0].y
    bottom_y_axis = text.bounding_poly.vertices[3].y

    if top_y_axis not in lines:
        lines[top_y_axis] = [(top_y_axis, bottom_y_axis), []]

    for s_top_y_axis, s_item in lines.items():
        if top_y_axis < s_item[0][1]:
            lines[s_top_y_axis][1].append((top_x_axis, text.description))
            break

for _, item in lines.items():
    if item[1]:
        words = sorted(item[1], key=lambda t: t[0])
        items.append((item[0], ' '.join([word for _, word in words]), words))

print(items)


来源:https://stackoverflow.com/questions/42391009/text-extraction-line-by-line

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!