Question
We are working on an AR application in which we need to overlay a 3D model of an object on a video stream of that object. The 3D model lives in a Unity scene, and a physical camera films the real object. The camera pose is initially unknown.
▶ What we have tried
We did not find a good way to estimate the camera pose directly in Unity, so we used OpenCV, which provides an extensive library of computer vision functions. In particular, we detect ArUco tags and pass their matching 3D-2D point correspondences to solvePnP. solvePnP returns a camera pose that is consistent with reality to within a few centimeters. We also check the reprojection error, which is low.
Each tag corner used in the estimation is reprojected and drawn as a red point on the image; as you can see, the difference is minimal.
These results look decent and should be sufficient for our use case. We therefore consider the camera pose validated, both against reality and within OpenCV.
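For reference, the reprojection check boils down to the following (a plain C# sketch; the names ReprojectionError, corners3d, corners2d, etc. are only illustrative, and OpenCV's projectPoints performs the same projection):

```csharp
using UnityEngine;

// Sketch of what the reprojection-error check computes: project each 3D tag
// corner through the estimated pose and intrinsics (pinhole model, no
// distortion term since the image is already undistorted) and compare it
// with the detected 2D corner.
public static class ReprojectionError
{
    // rot: 3x3 world->camera rotation, t: translation, both from solvePnP.
    public static float Rms(Vector3[] corners3d, Vector2[] corners2d,
                            float[,] rot, Vector3 t,
                            float fx, float fy, float cx, float cy)
    {
        float sumSq = 0f;
        for (int i = 0; i < corners3d.Length; i++)
        {
            Vector3 p = corners3d[i];
            // Transform into the camera frame: x_cam = R * p + t.
            float xc = rot[0, 0] * p.x + rot[0, 1] * p.y + rot[0, 2] * p.z + t.x;
            float yc = rot[1, 0] * p.x + rot[1, 1] * p.y + rot[1, 2] * p.z + t.y;
            float zc = rot[2, 0] * p.x + rot[2, 1] * p.y + rot[2, 2] * p.z + t.z;
            // Pinhole projection to pixel coordinates.
            float u = fx * xc / zc + cx;
            float v = fy * yc / zc + cy;
            sumSq += (u - corners2d[i].x) * (u - corners2d[i].x)
                   + (v - corners2d[i].y) * (v - corners2d[i].y);
        }
        return Mathf.Sqrt(sumSq / corners3d.Length);
    }
}
```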
▶ The problem
When we place the camera at the estimated pose in the Unity scene, however, the 3D objects do not line up well.
In this Unity screenshot, you can see that the virtual green tags (Unity objects) do not line up with the real ones from the video feed.
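For comparison, the Unity-side reprojection can be obtained with Camera.WorldToScreenPoint and logged in OpenCV-style pixel coordinates, so the values can be put next to OpenCV's red points. This is only a sketch; ReprojectionCheck, tagCornersWorld and imageHeight are hypothetical names:

```csharp
using UnityEngine;

// Minimal diagnostic: reproject the known tag corners with the Unity camera
// and log OpenCV-style pixel coordinates (origin top-left, y down) so they
// can be compared with the points from OpenCV's reprojection.
public class ReprojectionCheck : MonoBehaviour
{
    public Camera arCamera;            // the camera posed from solvePnP
    public Vector3[] tagCornersWorld;  // hypothetical: tag corners in Unity world space
    public int imageHeight = 1080;     // height of the video frame in pixels

    void LateUpdate()
    {
        foreach (var corner in tagCornersWorld)
        {
            Vector3 screen = arCamera.WorldToScreenPoint(corner);
            // Unity's screen origin is bottom-left; OpenCV's image origin is top-left.
            float uOpenCv = screen.x;
            float vOpenCv = imageHeight - screen.y;
            Debug.Log($"corner -> ({uOpenCv:F1}, {vOpenCv:F1}), depth {screen.z:F3}");
        }
    }
}
```

Note that WorldToScreenPoint uses the camera's pixel rect, so this comparison is only meaningful if the Unity camera renders at the same resolution and aspect ratio as the video frame (for example into a RenderTexture of that size).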
▶ Possible root causes
We identified several possible root causes that could explain the mismatch between Unity and OpenCV:
- Differences in camera intrinsic parameters: we tried several sets of parameters, none with complete success. We first calibrated the camera with OpenCV and ported the resulting parameters to Unity. We also tried the manufacturer's data, which did not give better results. Lastly, we manually measured the field of view (FoV) and combined it with the known camera sensor size. The results did not differ much between these tests.
- Differences in camera model between Unity and OpenCV: OpenCV works with a pinhole camera model, but we could not find a conclusive answer on which model Unity simulates (see the sketch after this list).
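One way to sidestep the question of which camera model Unity simulates is to build the Unity projection matrix directly from the OpenCV intrinsics instead of going through a FoV value. Unity's Camera lets you override projectionMatrix, so something along these lines should make the two projections agree by construction. This is a sketch: it assumes fx, fy, cx, cy are the intrinsics of the undistorted image with OpenCV's top-left pixel origin, and the signs of the off-center terms may need flipping depending on how the video background is displayed.

```csharp
using UnityEngine;

// Sketch: override Unity's projection with an off-axis matrix built from the
// OpenCV intrinsics of the (undistorted) video frame.
public class IntrinsicsToProjection : MonoBehaviour
{
    public Camera arCamera;
    public float fx = 1000f, fy = 1000f;  // focal lengths in pixels (from calibration)
    public float cx = 960f, cy = 540f;    // principal point in pixels (from calibration)
    public int imageWidth = 1920, imageHeight = 1080;

    void Start()
    {
        float near = arCamera.nearClipPlane;
        float far = arCamera.farClipPlane;

        var p = new Matrix4x4();  // all entries start at zero
        p[0, 0] = 2f * fx / imageWidth;
        p[1, 1] = 2f * fy / imageHeight;
        // Off-center terms: OpenCV's origin is top-left with y pointing down,
        // Unity/OpenGL clip space is centered with y pointing up.
        p[0, 2] = 1f - 2f * cx / imageWidth;
        p[1, 2] = 2f * cy / imageHeight - 1f;
        p[2, 2] = -(far + near) / (far - near);
        p[2, 3] = -2f * far * near / (far - near);
        p[3, 2] = -1f;

        arCamera.projectionMatrix = p;
    }
}
```

For a centered principal point this reduces to the usual vertical FoV, fovY = 2 * atan(imageHeight / (2 * fy)), which is what Unity's fieldOfView expects (in degrees); if the principal point is noticeably off-center, a single FoV value cannot represent the camera at all.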
▶ Notes
Our camera has a large field of view (115°).
The images passed to OpenCV and to Unity are both already well undistorted.
We went through most of the SO questions tagged OpenCV and Unity. Most were concerned with the different coordinate systems and rotation conventions. This does not seem to be the problem in our case, as the camera appears at its expected location in the 3D Unity scene.
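For reference, this is the kind of OpenCV-to-Unity pose conversion those questions describe (a sketch only; it assumes the OpenCV world frame maps to Unity's left-handed, y-up world by flipping the Y axis, which may not match every setup, and PoseConversion/rot are illustrative names):

```csharp
using UnityEngine;

// Sketch of a common OpenCV -> Unity pose conversion. rot is the 3x3 rotation
// obtained from cv::Rodrigues(rvec) and t is the tvec from solvePnP; both map
// OpenCV-world points into the camera frame.
public static class PoseConversion
{
    public static void ApplyTo(Camera cam, float[,] rot, Vector3 t)
    {
        // Maps a vector from the assumed OpenCV world frame into Unity's world
        // frame by flipping the Y component (assumption, see lead-in above).
        Vector3 Flip(Vector3 v) => new Vector3(v.x, -v.y, v.z);

        // The camera's axes expressed in OpenCV world coordinates are the rows
        // of R; its position is -R^T * t.
        Vector3 right   = new Vector3(rot[0, 0], rot[0, 1], rot[0, 2]);
        Vector3 down    = new Vector3(rot[1, 0], rot[1, 1], rot[1, 2]);  // OpenCV camera y points down
        Vector3 forward = new Vector3(rot[2, 0], rot[2, 1], rot[2, 2]);
        Vector3 posCv   = -(right * t.x + down * t.y + forward * t.z);

        Vector3 position = Flip(posCv);
        Vector3 fwdUnity = Flip(forward);
        Vector3 upUnity  = Flip(-down);

        cam.transform.SetPositionAndRotation(position, Quaternion.LookRotation(fwdUnity, upUnity));
    }
}
```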
▶ Questions
- Is there any fundamental difference in the camera model used by Unity and OpenCV?
- Do you see any other possible causes that could explain the difference in projection between Unity and OpenCV?
- Do you know of any reliable way to estimate the camera pose without OpenCV?
Source: https://stackoverflow.com/questions/59375303/mismatch-between-opencv-projected-points-and-unity-camera-view