kitti dataset camera projection matrix

Posted by 廉价感情 on 2019-12-22 01:31:06

Question


I am looking at the KITTI dataset, and in particular at how to convert a world point into image coordinates. The README (quoted below) says that I need to transform the point into camera coordinates first and then multiply by the projection matrix. Coming from a non-computer-vision background, I have two questions:

  1. I looked at the numbers in calib.txt, and in particular the matrix is 3x4 with non-zero values in the last column. I always thought this matrix = K[I|0], where K is the camera's intrinsic matrix. So why is the last column non-zero, and what does it mean? For example, P2 is
array([[7.070912e+02, 0.000000e+00, 6.018873e+02, 4.688783e+01],
       [0.000000e+00, 7.070912e+02, 1.831104e+02, 1.178601e-01],
       [0.000000e+00, 0.000000e+00, 1.000000e+00, 6.203223e-03]])
  2. After applying the projection to get [u, v, w] and dividing u and v by w, are these values with respect to an origin at the center of the image, or at the top left of the image?

README:

calib.txt: Calibration data for the cameras: P0/P1 are the 3x4 projection matrices after rectification. Here P0 denotes the left and P1 denotes the right camera. Tr transforms a point from velodyne coordinates into the left rectified camera coordinate system. In order to map a point X from the velodyne scanner to a point x in the i'th image plane, you thus have to transform it like:

  x = Pi * Tr * X
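
To make that concrete, here is how I understand the mapping in code (a minimal sketch; the Tr values below are placeholders rather than numbers from a real calib.txt):

import numpy as np

# P2 from calib.txt (3x4 rectified projection matrix)
P = np.array([[7.070912e+02, 0.000000e+00, 6.018873e+02, 4.688783e+01],
              [0.000000e+00, 7.070912e+02, 1.831104e+02, 1.178601e-01],
              [0.000000e+00, 0.000000e+00, 1.000000e+00, 6.203223e-03]])

# Tr: velodyne -> rectified camera transform (3x4). These numbers are
# placeholders standing in for the Tr row of calib.txt.
Tr = np.array([[0., -1.,  0., 0.],
               [0.,  0., -1., 0.],
               [1.,  0.,  0., 0.]])
Tr = np.vstack([Tr, [0., 0., 0., 1.]])  # pad to 4x4 so it composes with P

X = np.array([20.0, 1.0, 1.5, 1.0])  # homogeneous point in velodyne coordinates
u, v, w = P @ Tr @ X                 # x = Pi * Tr * X, giving [u, v, w]
print(u / w, v / w)                  # pixel coordinates after dividing by w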

Answer 1:


Refs:

  1. How to understand the KITTI camera calibration files?
  2. Format of parameters in KITTI's calibration file
  3. http://www.cvlibs.net/publications/Geiger2013IJRR.pdf

Answer:

I strongly recommend that you read the references above; they should answer most, if not all, of your questions.

For question 2: the projected points in the image are with respect to an origin at the top left. See refs 2 and 3: the image coordinates of a very distant 3D point converge to (center_x, center_y), whose values are given in the P_rect matrices. You can also verify this with some simple code:

import numpy as np

# P2 from calib.txt: the 3x4 rectified projection matrix of camera 2
p = np.array([[7.070912e+02, 0.000000e+00, 6.018873e+02, 4.688783e+01],
              [0.000000e+00, 7.070912e+02, 1.831104e+02, 1.178601e-01],
              [0.000000e+00, 0.000000e+00, 1.000000e+00, 6.203223e-03]])
x = [0, 0, 1E8, 1]  # a very far 3D point, in homogeneous coordinates
y = np.dot(p, x)    # project into [u, v, w]
y[0] /= y[2]        # u / w
y[1] /= y[2]        # v / w
y = y[:2]
print(y)

You will see some output like:

array([6.018873e+02, 1.831104e+02])

which is very close to (p[0, 2], p[1, 2]), i.e. (center_x, center_y).



Source: https://stackoverflow.com/questions/53218743/kitti-dataset-camera-projection-matrix
