Does anyone know the meaning of the output of the image_to_data and image_to_osd methods of pytesseract?

暖寄归人 2021-01-22 10:39

I'm trying to extract data from an image using pytesseract. This module has the image_to_data and image_to_osd methods. These two m

2 answers
  • 2021-01-22 11:18

    Column level:

    1. Page: item with no block_num, par_num, line_num, or word_num
    2. Block: item with block_num but no par_num, line_num, or word_num
    3. Paragraph: item with block_num and par_num but no line_num or word_num
    4. Line: item with block_num, par_num, and line_num but no word_num
    5. Word: item with all of those numbers

    Column block_num: Block number of the detected text or item
    Column par_num: Paragraph number of the detected text or item
    Column line_num: Line number of the detected text or item
    Column word_num: Word number of the detected text or item

    These four columns are interconnected. When an item starts a new line, word_num restarts instead of continuing from the last word of the previous line: the row for the line itself has word_num 0 and its first word has word_num 1. The same applies to line_num within a paragraph, par_num within a block, and block_num within a page.

    See the image below for reference.
    1st column: block_num
    2nd column: par_num
    3rd column: line_num
    4th column: word_num
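The way the four counters nest can be sketched by grouping word rows back into lines. This is only a minimal illustration: the `sample` dictionary below is hand-written to have the same shape as the result of `image_to_data(..., output_type=Output.DICT)`, its values are made up rather than real OCR output, and `group_words_by_line` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical sample shaped like image_to_data(..., output_type=Output.DICT).
# The values are invented for illustration; they are not real OCR output.
sample = {
    "level":     [1, 2, 3, 4, 5, 5, 4, 5, 5],
    "block_num": [0, 1, 1, 1, 1, 1, 1, 1, 1],
    "par_num":   [0, 0, 1, 1, 1, 1, 1, 1, 1],
    "line_num":  [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "word_num":  [0, 0, 0, 0, 1, 2, 0, 1, 2],
    "text":      ["", "", "", "", "Hello", "world", "", "second", "line"],
}


def group_words_by_line(data):
    """Join word-level rows (level 5) that share block, paragraph and line numbers."""
    lines = defaultdict(list)
    for i, level in enumerate(data["level"]):
        if level != 5:  # skip page, block, paragraph and line rows
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append(data["text"][i])
    return {key: " ".join(words) for key, words in lines.items()}


for key, text in group_words_by_line(sample).items():
    print(key, text)
```

Because word_num restarts on every line, it cannot identify a word on its own; the tuple of block_num, par_num and line_num is what makes the grouping unambiguous. For multi-page input, page_num would presumably need to be part of the key as well.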

  • 2021-01-22 11:25

    my_image.jpg

    For example, running image_to_data on my_image.jpg with the code at the end of this answer gives results like those in results.png.

    results.png

    • level: 1, 2, 3, 4, or 5; the hierarchy level of the current item (1 = page, 2 = block, 3 = paragraph, 4 = line, 5 = word).

    • page_num: the page index of the current item. In most cases an image has only one page.

    • block_num: the block index of the current item. When Tesseract runs OCR on an image, it splits the image into several blocks according to the PSM parameter and some internal rules. The words in a line are usually in the same block.

    • par_num: the paragraph index of the current item. It comes from the page layout analysis.

    • line_num: the line index of the current item. It comes from the page layout analysis.

    • word_num: the word index of the current item within its line.

    • left/top/width/height: the top-left coordinate, width, and height of the current item's bounding box, in pixels.

    • conf: the confidence of the current word, in the range -1 to 100. A value of -1 means the row carries no recognized text (it is a page, block, paragraph, or line row). 100 is the highest value.

    • text: the recognized text of the word; empty for rows that are not words.
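As a rough sketch of how conf and the box columns are typically used together, the snippet below keeps only words above a confidence threshold and converts left/top/width/height into corner coordinates. The `sample` dictionary is hand-written in the shape of an `Output.DICT` result with made-up values, and both the helper name `confident_words` and the threshold of 60 are assumptions chosen for illustration.

```python
# Hypothetical sample shaped like image_to_data(..., output_type=Output.DICT).
# The values are invented for illustration; they are not real OCR output.
sample = {
    "level":  [1, 4, 5, 5, 5],
    "left":   [0, 10, 10, 70, 140],
    "top":    [0, 12, 12, 12, 14],
    "width":  [200, 170, 50, 60, 40],
    "height": [50, 20, 20, 20, 18],
    "conf":   ["-1", "-1", "96", "41", "88"],
    "text":   ["", "", "123", "4S6", "789"],
}


def confident_words(data, min_conf=60.0):
    """Return (text, (left, top, right, bottom)) for words with conf >= min_conf."""
    words = []
    for i, text in enumerate(data["text"]):
        # conf can arrive as int, float or str, so normalise it first.
        conf = float(data["conf"][i])
        if conf < min_conf or not text.strip():
            continue  # drops the -1 rows and low-confidence words
        left, top = data["left"][i], data["top"][i]
        box = (left, top, left + data["width"][i], top + data["height"][i])
        words.append((text, box))
    return words


for text, box in confident_words(sample):
    print(text, box)
```

The corner form (left, top, right, bottom) is what drawing calls such as OpenCV's `cv2.rectangle` expect as two points, which is why the width and height are added to the origin here.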

    The meaning of the results from image_to_osd:

    • Page number: the page index of the current item. In most cases an image has only one page.

    • Orientation in degrees: the clockwise rotation angle of the text in the current image relative to its normal reading orientation. Possible values are 0, 90, 180, and 270.

    • Rotate: the angle by which the current image has to be rotated clockwise to make its text readable. Possible values are 0, 90, 180, and 270. It is complementary to the [Orientation in degrees] value.

    • Orientation confidence: the confidence of the detected [Orientation in degrees] and [Rotate] values. The greater the confidence, the more reliable the result, but I have not found any explanation of its value range so far.

    • Script: the writing system (script) of the text in the current image, for example Latin.

    • Script confidence: the confidence of the detected script.
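By default image_to_osd returns these fields as plain text, one "name: value" pair per line, so they can be split into a dictionary. The sketch below does that for a hand-written `sample_osd` string with made-up values; `parse_osd` is a hypothetical helper name. The final check reflects my reading of "complementary", namely that Rotate equals (360 - Orientation) mod 360, which should be treated as an assumption.

```python
# Hand-written sample in the textual layout of image_to_osd output.
# The numbers are invented for illustration; they are not real OSD results.
sample_osd = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 4.21
Script: Latin
Script confidence: 1.90
"""


def parse_osd(osd_text):
    """Split 'name: value' lines into a dict of stripped strings."""
    fields = {}
    for line in osd_text.splitlines():
        if ":" not in line:
            continue  # ignore blank or malformed lines
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip()
    return fields


osd = parse_osd(sample_osd)
orientation = int(osd["Orientation in degrees"])
rotate = int(osd["Rotate"])

# Assumed relation between the two angles (see the note above).
print(orientation, rotate, (360 - orientation) % 360 == rotate)
```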

    from pytesseract import Output
    import pytesseract
    import cv2

    image = cv2.imread("my_image.jpg")

    # OpenCV stores images in BGR order by default, while pytesseract assumes RGB,
    # so convert from BGR to RGB before running OCR.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Path to the Tesseract executable (adjust for your installation).
    pytesseract.pytesseract.tesseract_cmd = r'C:\mypath\tesseract.exe'
    # Recognize digits only, with page segmentation mode 6.
    custom_config = r'-c tessedit_char_whitelist=0123456789 --psm 6'
    results = pytesseract.image_to_data(rgb, output_type=Output.DICT, lang='eng', config=custom_config)
    print(results)
    