I have an input 3D vector, along with the pitch and yaw of the camera. Can anyone describe or provide a link to a resource that will help me understand and implement the req
The world-to-camera transformation matrix is the inverse of the camera-to-world matrix. The camera-to-world matrix is the combination of a translation to the camera's position and a rotation to the camera's orientation. Thus, if M is the 3x3 rotation matrix corresponding to the camera's orientation and t is the camera's position, then the 4x4 camera-to-world matrix is:
M00 M01 M02 tx
M10 M11 M12 ty
M20 M21 M22 tz
0   0   0   1
Note that I've assumed that vectors are column vectors which are multiplied on the right to perform transformations. If you use the opposite convention, make sure to transpose the matrix.
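As a rough illustration of the layout above, here is a minimal sketch in plain Python. The function names `camera_to_world` and `transform_point` are my own, not from any library, and matrices are assumed to be stored as nested lists in row-major order with points treated as column vectors (x, y, z, 1).

```python
def camera_to_world(M, t):
    """Build the 4x4 camera-to-world matrix from a 3x3 rotation M and a position t.

    Hypothetical helper: M is a list of three rows, t is (tx, ty, tz).
    """
    return [
        [M[0][0], M[0][1], M[0][2], t[0]],
        [M[1][0], M[1][1], M[1][2], t[1]],
        [M[2][0], M[2][1], M[2][2], t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(T, p):
    """Multiply the 4x4 matrix T by the column vector (x, y, z, 1) and return (x', y', z')."""
    v = (p[0], p[1], p[2], 1.0)
    out = [sum(T[r][c] * v[c] for c in range(4)) for r in range(4)]
    return out[:3]


# Example: an unrotated camera sitting at (1, 2, 3).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = camera_to_world(identity, (1.0, 2.0, 3.0))
camera_origin_in_world = transform_point(T, (0.0, 0.0, 0.0))
```

With this matrix, the camera's own origin lands on the camera's world position, which is a quick sanity check that the translation column is in the right place.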
To find M, you can use one of the formulas listed on Wikipedia, depending on your particular convention for roll, pitch, and yaw. Keep in mind that those formulas use the convention that vectors are row vectors which are multiplied on the left.
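For the pitch-and-yaw case in the question, one possible way to build M is sketched below. Everything about the convention here is an assumption on my part, so adjust it to match yours: angles are in radians, the world is Y-up, yaw rotates about the Y axis, pitch rotates about the X axis, there is no roll, the rotations are combined as M = Ry(yaw) * Rx(pitch), and vectors are column vectors. The function names are hypothetical.

```python
import math


def rotation_from_pitch_yaw(pitch, yaw):
    """Return the 3x3 matrix Ry(yaw) * Rx(pitch) as a list of rows.

    Assumed convention: radians, Y-up, column vectors, no roll.
    """
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    # Product of
    #   Ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    #   Rx = [[1, 0, 0], [0, cp, -sp], [0, sp, cp]]
    # multiplied out by hand.
    return [
        [cy, sy * sp, sy * cp],
        [0.0, cp, -sp],
        [-sy, cy * sp, cy * cp],
    ]


def rotate(M, v):
    """Multiply the 3x3 matrix M by the column vector v."""
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]


# Example: a quarter turn of yaw applied to the +Z axis.
M = rotation_from_pitch_yaw(0.0, math.pi / 2)
rotated = rotate(M, (0.0, 0.0, 1.0))
```

Under these assumptions, a yaw of 90 degrees carries the +Z axis onto the +X axis (up to floating-point rounding). If your engine uses a different up axis or handedness, the signs and the multiplication order will differ.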
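Instead of computing the camera-to-world matrix and inverting it, a more efficient (and numerically stable) alternative is to calculate the world-to-camera matrix directly. Because the inverse of a rotation matrix is its transpose, the rotation part of the world-to-camera matrix is the transpose of M, and the translation part is the negated camera position rotated by that transpose. In other words, first translate by the inverted camera position (negating all 3 coordinates), then apply the inverse rotation. If you would rather work with the angles, the inverse rotation is obtained by negating the roll, pitch, and yaw angles and applying the individual rotations in the reverse order; negating the angles alone, without reversing the order, does not in general give the inverse.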
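Here is a minimal sketch of building the world-to-camera matrix directly, using the fact that the inverse of a rotation matrix is its transpose. The names `world_to_camera` and `apply` are hypothetical, and the same assumptions as before hold: nested lists in row-major order, column vectors, M is the camera's 3x3 rotation and t its world position.

```python
def world_to_camera(M, t):
    """Return the 4x4 inverse of the camera-to-world matrix [M | t].

    The rotation block is M transposed; the translation block is -(M^T) t.
    """
    Mt = [[M[c][r] for c in range(3)] for r in range(3)]  # transpose of M
    d = [-sum(Mt[r][c] * t[c] for c in range(3)) for r in range(3)]
    return [
        Mt[0] + [d[0]],
        Mt[1] + [d[1]],
        Mt[2] + [d[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(T, p):
    """Multiply the 4x4 matrix T by the column vector (x, y, z, 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return [sum(T[r][c] * v[c] for c in range(4)) for r in range(3)]


# Example: a camera at (1, 2, 3), rotated a quarter turn about the Y axis.
M = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]]
t = (1.0, 2.0, 3.0)
V = world_to_camera(M, t)
at_camera = apply(V, t)               # the camera's own position
ahead = apply(V, (2.0, 2.0, 3.0))     # world point M * (0, 0, 1) + t
```

Two checks are worth doing with your own numbers: the camera's world position should map to the origin of camera space, and a world point obtained by pushing a camera-space point through the camera-to-world matrix should map back to where it started.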