What are we trying to achieve by creating right and down vectors in this ray tracing depiction?

Question

Often, we see the following picture when talking about ray tracing.

Here, I read the Z axis as the direction the camera would face if it pointed straight ahead, and the XY grid as what the camera sees. From the camera's point of view, this is the usual Cartesian grid my classmates and I are used to.

Recently I was examining code that simulates this. One thing that is not obvious from this picture to me is the requirement for the "right" and "down" vectors. Obviously we have look_at, which shows where the camera is looking. And campos is where the camera is located. But why do we need camright and camdown? What are we trying to construct?

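    // World-space axes.
    Vect X (1, 0, 0);
    Vect Y (0, 1, 0);
    Vect Z (0, 0, 1);

    // Where the camera sits.
    Vect campos (3, 1.5, -4);

    // The point the camera looks at.
    Vect look_at (0, 0, 0);
    Vect diff_btw (
        campos.getVectX() - look_at.getVectX(),
        campos.getVectY() - look_at.getVectY(),
        campos.getVectZ() - look_at.getVectZ()
    );
    // Unit vector from the camera toward look_at.
    Vect camdir = diff_btw.negative().normalize();
    // Perpendicular to both world Y and camdir.
    Vect camright = Y.crossProduct(camdir);
    // Perpendicular to both camright and camdir.
    Vect camdown = camright.crossProduct(camdir);
    Camera scene_cam (campos, camdir, camright, camdown);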

I was searching about this question recently and found this post as well: Setting the up vector in the camera setup in a ray tracer

Here, the answerer says this: "My answer assumes that the XY plane is the "ground" in world space and that Z is the altitude. Imagine standing on the floor holding a camera. It's position has a positive Z component, and it's view vector is nearly parallel to the ground. (In my diagram, it's pointing slightly upwards.) The plane of the camera's film (the uv grid) is perpendicular to the view grid. A common approach is to transform everything until the film plane coincides with the XY plane and the camera points along the Z axis. That's equivalent, but not what I'm describing here."

I'm not entirely sure why "transformations" are necessary. How is this point of view different from the picture at the top? Here they also say that they need an "up" vector and a "right" vector to "construct an image plane". I'm not sure what an image plane is.

Could someone explain better the relationship between the physical representation and code representation?

Answer 1:

How do you know that you always want the camera's "up" to be aligned with the vertical lines in the grid in your image?

Trying to explain it another way: the grid is not really there. That imaginary grid is the result of calculations based on the camera's direction vectors and the resolution you are rendering at. The grid is not what decides the camera angle.

When you are holding a physical camera in your hand, like the camera in a cell phone, don't you ever rotate the camera a little bit for effect? Or when filming, you may want to slowly rotate the camera. Have you not seen any movies where the camera is rotated?

In the same way, you may want to rotate the "camera" in your ray-traced world. And rotating only the camera is much easier than rotating all the objects in your scene (there may be millions!).

Check out the example of rotating the camera from the movie Ice Age here: https://youtu.be/22qniGtZhZ8?t=61
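To make the "rotate only the camera" idea concrete, here is a minimal sketch of a camera roll. It assumes a hypothetical Vec3 struct and CameraBasis struct (not the Vect and Camera classes from the question) and assumes the three basis vectors are already unit length and mutually perpendicular. Rolling leaves the viewing direction alone and only rotates the right and down vectors around it:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type, standing in for the question's Vect class.
struct Vec3 {
    double x, y, z;
};

// Hypothetical bundle of the three camera axes (assumed orthonormal).
struct CameraBasis {
    Vec3 dir;    // viewing direction
    Vec3 right;  // points toward the right edge of the image
    Vec3 down;   // points toward the bottom edge of the image
};

// Roll the camera about its viewing direction by angleRadians.
// Only right and down change; dir (and every object in the scene) stays put.
CameraBasis roll(const CameraBasis& cam, double angleRadians) {
    assert(std::isfinite(angleRadians));
    const double c = std::cos(angleRadians);
    const double s = std::sin(angleRadians);
    // Standard 2D rotation inside the plane spanned by right and down.
    Vec3 right = {c * cam.right.x + s * cam.down.x,
                  c * cam.right.y + s * cam.down.y,
                  c * cam.right.z + s * cam.down.z};
    Vec3 down = {-s * cam.right.x + c * cam.down.x,
                 -s * cam.right.y + c * cam.down.y,
                 -s * cam.right.z + c * cam.down.z};
    return {cam.dir, right, down};
}
```

Under these assumptions, a roll of half a turn flips both right and down, which corresponds to holding the camera upside down while still looking at the same point.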

Answer 2:

The (up or down) and right vectors construct the plane you project the scene onto. Since the scene is in 3D, you need to project it onto a 2D plane in order to render a picture to display on your screen.

If you have the camera position and direction, you still don't know whether you're holding the camera right-side up, upside down, or tilted to the left or right. Using the camera position, lookat, up (down) and right vectors, we can uniquely define how the 3D scene is projected into a 2D picture.

Concretely, look at the code and the picture. The 3D scene is the set of objects displayed. The image (projection) plane is the grid in front of the camera. Its orientation is defined by the camright and camdir vectors (because we assume the image plane is perpendicular to camdir, the camera's line of sight, camdown is uniquely determined by the other two). The placement of the grid depends on the camera's position and intrinsic properties (not shown here, but the camera will have a specific field of view).
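As a rough illustration of how these vectors span the image plane, the sketch below builds a camera basis and then computes the ray direction through a given pixel. Everything here is an assumption for illustration: the Vec3 and CameraBasis types and the makeBasis and pixelRayDirection helpers are hypothetical (they are not the question's Vect or Camera classes), camright is normalized explicitly, and the image plane is placed at unit distance from the camera with a height of one unit, which fixes the field of view.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type, standing in for the question's Vect class.
struct Vec3 {
    double x, y, z;
};

Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 scale(Vec3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 a) {
    const double len = std::sqrt(dot(a, a));
    assert(len > 0.0);
    return scale(a, 1.0 / len);
}

struct CameraBasis {
    Vec3 dir, right, down;
};

// Same cross-product order as the question's code, with camright normalized.
// Assumes worldUp is not parallel to the viewing direction.
CameraBasis makeBasis(Vec3 campos, Vec3 lookAt, Vec3 worldUp) {
    Vec3 dir = normalize({lookAt.x - campos.x, lookAt.y - campos.y, lookAt.z - campos.z});
    Vec3 right = normalize(cross(worldUp, dir));
    Vec3 down = cross(right, dir);  // unit length because right and dir are orthonormal
    return {dir, right, down};
}

// Direction of the ray through the center of pixel (px, py).
// right and down act as the two axes of the 2D image grid; dir pushes the
// grid one unit in front of the camera.
Vec3 pixelRayDirection(const CameraBasis& cam, int px, int py, int width, int height) {
    assert(width > 0 && height > 0);
    const double aspect = static_cast<double>(width) / height;
    const double u = ((px + 0.5) / width - 0.5) * aspect;  // horizontal offset
    const double v = (py + 0.5) / height - 0.5;            // vertical offset
    return normalize(add(cam.dir, add(scale(cam.right, u), scale(cam.down, v))));
}
```

With this setup, the center pixel's ray points along camdir, pixels further right lean toward camright, and pixels further down lean toward camdown. Each ray would start at campos; that is the sense in which position, direction, right and down together pin down the projection.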

Source: https://stackoverflow.com/questions/57932651/what-are-we-trying-to-achieve-by-creating-a-right-and-down-vectors-in-this-ray-t

Tags

c++

raytracing