The RPP algorithm gives more stable tracking (less jitter) than ARToolKit's pose estimation algorithm.
The robust pose estimator algorithm has been provided by G. Schweighofer and A. Pinz (Inst. of El. Measurement and Measurement Signal Processing, Graz University of Technology). Details about the algorithm are given in Technical Report TR-EMT-2005-01. Thanks go to Thomas Pintaric for implementing the C++ version of this algorithm.
Computer vision
1. Internal parameter calibration.
2. External parameter calibration is the pose estimation problem: estimating the 3D pose of an object from a set of 2D point projections.
3. Recovering the pose from three point correspondences requires the least information; this is called the "perspective-three-point problem", or P3P. Extended to N points, it is called "PnP".
4. Vision-based pose estimation is divided into monocular and multi-camera vision according to the number of cameras used. According to the algorithm, it can also be divided into model-based and learning-based pose estimation.
5. OpenCV provides solvePnP and solvePnPRansac, which can determine the translation and rotation of the camera relative to the world coordinate system from the known coordinates of four coplanar points. cvPOSIT is based on orthographic projection: it approximates the perspective projection model with an affine projection model and iterates to compute the estimate. It may fail to converge when the depth of the object is large relative to the distance from the object to the camera.
6. Transforming from the world coordinate system to the camera coordinate system requires the matrix [R | t], where R is the rotation matrix and t is the translation vector. If a point has coordinates X in the world coordinate system (written in homogeneous form) and X′ in the camera coordinate system, then X′ = [R | t] * X. Transforming from the camera coordinate system to the ideal screen coordinate system requires the internal parameter matrix C, so the ideal screen coordinates are L = C * [R | t] * X. Roughly, [R | t] is obtained as follows: the coordinates X of several key points on the template are known in the world coordinate system, and the coordinates L of the corresponding points are measured in the frame captured by the camera. An initial value of [R | t] is obtained by solving a system of linear equations, and the optimal [R | t] is then found iteratively by nonlinear least squares.
7. In most cases the background is a two-dimensional plane, and the recognized object is also planar. For ARToolKit, the recognized targets are planar (but this method is not robust). If the internal parameter matrix is known, the camera pose can be computed from four or more coplanar, non-collinear points.
8. Camera pose estimation means finding the external parameters of the camera, which is the problem of minimizing an error function. Some error functions are defined in image space and some in object space.
9. The RPP algorithm provides a way to visualize the error function, based on object space. The error function has two local minima. Under noise-free conditions, the first local minimum corresponds to the correct pose. The other local minimum is the reason standard pose estimation algorithms jitter. Because pose estimation algorithms always minimize the error function iteratively, an initial value is required. If the initial value is close to the second local minimum, the iterative algorithm converges to the wrong result.
10. To estimate the first pose, the RPP algorithm can use any known pose estimation algorithm; here an iterative algorithm is used. From the first pose, a second pose is estimated with the P3P algorithm. This pose is close to the second local minimum of the error function. Using the estimated second pose as the initial value, the iterative algorithm is run again to obtain the refined second pose. Finally, the correct pose is the one with the smaller error.
11. Problems of this type ultimately reduce to solving a system of linear equations Ax = b. When b ∈ R(A) (the column space of A), a solution is x = A⁺b, where A⁺ is the generalized inverse of A. When b ∉ R(A), the question is whether Ax can be made close to b, that is, whether there is an x that minimizes ||Ax - b||. The 2-norm (the Euclidean norm) is customarily used to measure this. A least-squares solution always exists, but it is not necessarily unique. When the equations have no exact solution, the optimal solution is the one that minimizes the sum of squared errors, which is the least-squares solution. Minimizing here means minimizing the length of the error vector.
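The projection chain in item 6, L = C * [R | t] * X, can be illustrated with a small sketch. It uses plain Python lists instead of a matrix library, and the intrinsic values (focal length 800, principal point (320, 240)) and the pose are made-up numbers for illustration only, not taken from any real calibration.

```python
import math


def rotation_z(angle_rad):
    """Rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def world_to_camera(R, t, X):
    """X' = R X + t, which is [R | t] applied to the homogeneous point (X, 1)."""
    return [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]


def project(C, R, t, X):
    """L = C [R | t] X, then divide by the third component to get pixels."""
    Xc = world_to_camera(R, t, X)
    u, v, w = (sum(C[i][j] * Xc[j] for j in range(3)) for i in range(3))
    return (u / w, v / w)


# Hypothetical intrinsic matrix C (illustrative values only).
C = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]

# Hypothetical pose: no rotation, template 5 units in front of the camera.
R = rotation_z(0.0)
t = [0.0, 0.0, 5.0]

pixel = project(C, R, t, [0.5, -0.25, 0.0])
print(pixel)  # (400.0, 200.0)
```

Pose estimation runs this chain in reverse: the pixel coordinates L and the model coordinates X are known, and [R | t] is the unknown.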
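The final step in item 10, keeping the candidate pose with the smaller error, can be sketched as follows. This is not the RPP implementation. It assumes one common form of object-space error (the squared distance between each transformed model point and the line of sight through its observed image point), and the second candidate is hand-picked as a mirrored tilt rather than derived from the first pose as RPP does. All function names and numbers are hypothetical.

```python
import math


def rot_x(angle_rad):
    """Rotation matrix for a rotation about the x-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def transform(R, t, p):
    """Map a model point into the camera frame: R p + t."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def object_space_error(R, t, model_points, image_points):
    """Sum of squared distances from each transformed point to its line of sight."""
    total = 0.0
    for p, (u, v) in zip(model_points, image_points):
        q = transform(R, t, p)
        ray = (u, v, 1.0)  # line of sight through the normalized image point
        scale = sum(q[i] * ray[i] for i in range(3)) / sum(r * r for r in ray)
        total += sum((q[i] - scale * ray[i]) ** 2 for i in range(3))
    return total


def select_pose(candidates, model_points, image_points):
    """Return the candidate (R, t) with the smallest object-space error."""
    return min(
        candidates,
        key=lambda pose: object_space_error(pose[0], pose[1], model_points, image_points),
    )


# Planar square target in its own coordinate system (z = 0).
model_points = [(-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0), (-1.0, 1.0, 0.0)]

# Hypothetical true pose, and a candidate tilted the opposite way.
true_pose = (rot_x(0.4), [0.0, 0.0, 6.0])
mirrored_pose = (rot_x(-0.4), [0.0, 0.0, 6.0])

# Noise-free normalized image points generated from the true pose.
image_points = []
for p in model_points:
    q = transform(true_pose[0], true_pose[1], p)
    image_points.append((q[0] / q[2], q[1] / q[2]))

error_true = object_space_error(true_pose[0], true_pose[1], model_points, image_points)
error_mirrored = object_space_error(mirrored_pose[0], mirrored_pose[1], model_points, image_points)
best = select_pose([true_pose, mirrored_pose], model_points, image_points)
print(error_true < error_mirrored, best is true_pose)
```

With noise-free data the true pose has essentially zero error, so the comparison is clear-cut. With noisy measurements and a distant, nearly frontal target the two errors can become close, which is the situation in which item 9 says standard algorithms jitter.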
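As a worked example of item 11, the sketch below solves a small overdetermined system in the least-squares sense through the normal equations (AᵀA)x = Aᵀb. The function name and the data are made up for illustration, and the code handles only the two-unknown case so that it stays dependency-free.

```python
def least_squares_2(A, b):
    """Minimize ||A x - b||_2 for an m-by-2 matrix A via the normal equations."""
    # Entries of the symmetric 2x2 matrix A^T A.
    s00 = sum(row[0] * row[0] for row in A)
    s01 = sum(row[0] * row[1] for row in A)
    s11 = sum(row[1] * row[1] for row in A)
    # Entries of the vector A^T b.
    g0 = sum(row[0] * bi for row, bi in zip(A, b))
    g1 = sum(row[1] * bi for row, bi in zip(A, b))
    det = s00 * s11 - s01 * s01
    if abs(det) < 1e-12:
        # Rank-deficient A: a minimizer exists but is not unique.
        raise ValueError("A^T A is singular; the least-squares solution is not unique")
    # Closed-form solution of the 2x2 system.
    return ((s11 * g0 - s01 * g1) / det, (s00 * g1 - s01 * g0) / det)


def squared_error(A, b, x):
    """Squared length of the error vector A x - b."""
    return sum((row[0] * x[0] + row[1] * x[1] - bi) ** 2 for row, bi in zip(A, b))


# Fit y = x0 + x1 * s to the points (0, 1), (1, 2), (2, 2).
# No line passes through all three, so b is not in R(A).
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 2.0]

x = least_squares_2(A, b)
print(x)                       # approximately (1.1667, 0.5)
print(squared_error(A, b, x))  # approximately 0.1667
```

The normal equations are adequate for a small, well-conditioned example like this one. Numerical libraries usually solve least-squares problems with a QR or singular value decomposition instead, because forming AᵀA squares the condition number.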
Original link: https://blog.csdn.net/leowangzi/article/details/16846405
Source: CSDN
Author: qq_36537774
Link: https://blog.csdn.net/qq_36537774/article/details/103943461