Question
Hi! I am annotating image data through an online platform that generates output coordinates like this: "bbox":{"top":634,"left":523,"height":103,"width":145}. However, I want to use these annotations to train YOLO, so I have to convert them to YOLO format, like this: 4 0.838021 0.605556 0.177083 0.237037
I need help with how to do this conversion.
Answer 1:
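For size, pass the image dimensions (w, h). For box, pass (x, x+w, y, y+h), i.e. (xmin, xmax, ymin, ymax) in pixels, where x is the left edge and y is the top edge of the box. The function below is from https://github.com/ivder/LabelMeYoloConverter/blob/master/convert.py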
def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
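To tie this back to the bbox in the question, here is a minimal sketch that builds the size and box arguments from the annotation dictionary and formats a YOLO label line. The image size of 1280x720 and the class id 4 are assumptions for illustration only (the question does not state the real image size), and bbox_to_yolo_line is a hypothetical helper name, not part of the linked converter.

```python
def convert(size, box):
    """size = (img_w, img_h); box = (xmin, xmax, ymin, ymax), all in pixels."""
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 * dw  # normalized box center x
    y = (box[2] + box[3]) / 2.0 * dh  # normalized box center y
    w = (box[1] - box[0]) * dw        # normalized box width
    h = (box[3] - box[2]) * dh        # normalized box height
    return (x, y, w, h)


def bbox_to_yolo_line(bbox, image_size, class_id):
    """Hypothetical helper: top/left/height/width dict -> 'class x y w h' string."""
    left, top = bbox["left"], bbox["top"]
    box = (left, left + bbox["width"], top, top + bbox["height"])
    x, y, w, h = convert(image_size, box)
    return f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"


bbox = {"top": 634, "left": 523, "height": 103, "width": 145}
# Assumed image size (width, height) and class id; replace with your real values.
line = bbox_to_yolo_line(bbox, (1280, 720), 4)
print(line)
```

Each line of a YOLO label file describes one box, so a full converter would call this helper once per annotation and write one line per box.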
Alternatively, you can use the following:
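def convert(x, y, w, h, img_w, img_h):
    # x, y: top-left corner of the box in pixels; w, h: box size in pixels
    # img_w, img_h: image size in pixels, used for normalization
    dw = 1.0/img_w
    dh = 1.0/img_h
    x = (2*x + w)/2.0  # box center x in pixels
    y = (2*y + h)/2.0  # box center y in pixels
    x = x*dw
    y = y*dh
    w = w*dw
    h = h*dh
    return (x, y, w, h)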
Each grid cell predicts B bounding boxes as well as C class probabilities. The bounding box prediction has 5 components: (x, y, w, h, confidence). The (x, y) coordinates represent the center of the box, relative to the grid cell location (remember that, if the center of the box does not fall inside the grid cell, then this cell is not responsible for it). These coordinates are normalized to fall between 0 and 1. The (w, h) box dimensions are also normalized to [0, 1], relative to the image size.
What does the coordinate output of yolo algorithm represent?
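Note that the passage above describes the network's predictions. In a YOLO label file such as the line in the question, all four values (center x, center y, width, height) are normalized by the full image width and height. A quick way to sanity-check a conversion is to invert it. The sketch below is an illustration with a hypothetical helper name, yolo_to_pixel_box, and an assumed image size of 1920x1080, which the question does not state.

```python
def yolo_to_pixel_box(yolo_box, image_size):
    """Convert normalized (x_center, y_center, w, h) back to a pixel bbox dict."""
    x_c, y_c, w, h = yolo_box
    img_w, img_h = image_size
    width = w * img_w
    height = h * img_h
    left = x_c * img_w - width / 2.0   # center minus half the width
    top = y_c * img_h - height / 2.0   # center minus half the height
    return {"top": top, "left": left, "height": height, "width": width}


# The example label line from the question: class id, then four normalized values.
fields = "4 0.838021 0.605556 0.177083 0.237037".split()
class_id = int(fields[0])
yolo_box = tuple(float(v) for v in fields[1:])

# Assumed image size (width, height); use the real size of your image.
pixel_box = yolo_to_pixel_box(yolo_box, (1920, 1080))
print(class_id, pixel_box)
```

If the recovered pixel box does not match the original annotation to within rounding, the image size or the argument order used in the forward conversion is probably wrong.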
Answer 2:
Convert bbox dictionary into list with relative coordinates
If you want to convert a Python dictionary with the keys top, left, width, height into a list in the format [x1, y1, x2, y2], where x1, y1 are the relative coordinates of the top-left corner of the bounding box and x2, y2 are the relative coordinates of the bottom-right corner, you can use the following function:
def bbox_dict_to_list(bbox_dict, image_size):
    h = bbox_dict.get('height')
    l = bbox_dict.get('left')
    t = bbox_dict.get('top')
    w = bbox_dict.get('width')
    img_w, img_h = image_size
    x1 = l/img_w
    y1 = t/img_h
    x2 = (l+w)/img_w
    y2 = (t+h)/img_h
    return [x1, y1, x2, y2]
You must pass two arguments: the bbox dictionary and the image size as a tuple (image_width, image_height).
Example
bbox = {"top":634,"left":523,"height":103,"width":145}
bbox_dict_to_list(bbox, (1280, 720))
>> [0.40859375, 0.8805555555, 0.521875, 1.02361111111]
You can change the return order to suit your needs.
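The function above returns relative corner coordinates, while the YOLO format asked about in the question uses the box center plus width and height. A minimal sketch of that last step follows; corners_to_yolo is a hypothetical helper name, and the input reuses the example above with its (1280, 720) image size.

```python
def corners_to_yolo(rel_box):
    """Convert relative [x1, y1, x2, y2] corners to [x_center, y_center, w, h]."""
    x1, y1, x2, y2 = rel_box
    w = x2 - x1
    h = y2 - y1
    return [x1 + w / 2.0, y1 + h / 2.0, w, h]


# Relative corners for the example bbox, computed as in bbox_dict_to_list.
rel_box = [523 / 1280, 634 / 720, (523 + 145) / 1280, (634 + 103) / 720]
yolo_box = corners_to_yolo(rel_box)
print(yolo_box)
```

In this example y2 is greater than 1, which means the box extends past the bottom of a 1280x720 image. That usually indicates the real image is larger than the size passed in, so check the image size before training.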
Source: https://stackoverflow.com/questions/64634300/how-to-convert-2d-bounding-box-pixel-coordinates-x-y-w-h-into-relative-coor