Return coordinates for bounding boxes Google's Object Detection API

How can i get the coordinates of the produced bounding boxes using the inference script of Google's Object Detection API? I know that printing boxes[0][i] returns the predictions of the ith detection in an image but what exactly is the meaning of these returned numbers? Is there a way that i can get xmin,ymin,xmax,ymax? Thanks in advance.

The boxes array that you mention contains this information and the format is a [N, 4] array where each row is of the format: [ymin, xmin, ymax, xmax] in normalized coordinates relative to the size of the input image.

Google Object Detection API returns bounding boxes in the format [ymin, xmin, ymax, xmax] and in normalised form (full explanation here). To find the (x,y) pixel coordinates we need to multiply the results by width and height of the image. First get the width and height of your image:

width, height = image.size

Then, extract ymin,xmin,ymax,xmax from the boxes object and multiply to get the (x,y) coordinates:

ymin = boxes[0][i][0]*height
xmin = boxes[0][i][1]*width
ymax = boxes[0][i][2]*height
xmax = boxes[0][i][3]*width

Finally print the coordinates of the box corners:

print 'Top left'
print (xmin,ymin,)
print 'Bottom right'
print (xmax,ymax)

来源：https://stackoverflow.com/questions/47110528/return-coordinates-for-bounding-boxes-googles-object-detection-api

标签

tensorflow

object-detection

object-detection-api

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!