YOLO object detection: how does the algorithm predict bounding boxes larger than a grid cell?

醉梦人生 2021-02-15 14:54

I am trying to better understand how the YOLO2 & 3 algorithms work. The algorithm processes a series of convolutions until it gets down to a 13x13 grid. Then i

3 answers
  •  说谎 (original poster)
    2021-02-15 15:09

    "Everything outside of the grid cell should be unknown to the neurons predicting the bounding boxes for an object detected in that cell, right?"

    That's not quite right. The cells form a partition of the image, and each output neuron has learned to respond when the center of an object falls inside its cell.
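    As a rough illustration of the "center cell" idea, here is a minimal sketch that maps a box to the cell containing its center. The 416-pixel input size and the `center_cell` helper are assumptions made for the example (only the 13x13 grid comes from the question); this is not code from any YOLO implementation.

```python
def center_cell(box, image_size=416, grid_size=13):
    """Return (row, col) of the grid cell that contains the box center.

    box is (x_min, y_min, x_max, y_max) in pixels.
    image_size=416 is an assumed input resolution; with a 13x13 grid
    each cell is then 32x32 pixels.
    """
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0
    cell = image_size / grid_size
    # Clamp so a center exactly on the right/bottom edge stays in the grid.
    col = min(int(cx // cell), grid_size - 1)
    row = min(int(cy // cell), grid_size - 1)
    return row, col


# A 300x300 pixel object, far larger than one 32x32 cell,
# is still assigned to exactly one cell: the one holding its center.
box = (50, 100, 350, 400)
print(center_cell(box))  # (7, 6)
```

    The cell only decides which output neuron is responsible for the object; it puts no limit on the size of the box that neuron predicts.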

    However, the receptive field of those output neurons is much larger than the cell and actually covers the entire image. The network is therefore able to recognize, and draw a bounding box around, an object much larger than its assigned "center cell".
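    To get a feel for how quickly the receptive field outgrows a cell, the sketch below applies the standard recurrence for stacked convolution and pooling layers. The layer list is a hypothetical toy stack with a total stride of 32, chosen for illustration; it is not the actual Darknet backbone, which is deeper and therefore sees even more of the image.

```python
def receptive_field(layers):
    """Compute (receptive_field, total_stride) in input pixels.

    layers is a list of (kernel_size, stride) pairs, applied in order.
    Each layer widens the field by (kernel_size - 1) * current_stride.
    """
    field, jump = 1, 1
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field, jump


# Hypothetical toy stack: five (3x3 conv, 2x2 max-pool) blocks,
# followed by three more 3x3 convolutions at the final resolution.
toy_layers = [(3, 1), (2, 2)] * 5 + [(3, 1)] * 3

field, stride = receptive_field(toy_layers)
print(field, stride)  # 286 32
```

    Even in this shallow toy stack, each output neuron sees a 286-pixel-wide window while its cell is only 32 pixels wide, so information well outside the cell is available when the box is predicted.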

    So a cell sits at the center of the output neuron's receptive field but is a much smaller part of it. The partition is also somewhat arbitrary: one could imagine, for example, having overlapping cells, in which case you would expect neighboring neurons to fire simultaneously when an object is centered in the overlapping zone of their cells.
