https://docs.opencv.org/4.x/d0/dbd/group__triangulation.html → https://www.robots.ox.ac.uk/~vgg/hzbook/
The Direct Linear Transformation (DLT) is a general algorithm that solves for a set of unknowns from a set of similarity relations, i.e. equalities that hold only up to an unknown scale factor. We primarily care about it in projective geometry, where the relation is between 3D points in a scene and their projections onto the image plane of a pinhole camera.
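Given the known $3 \times 4$ projection matrices $P$ and $P'$ of two cameras and two corresponding image points $x = (u, v, 1)^T$ and $x' = (u', v', 1)^T$ (both in homogeneous form), the equation $\lambda x = PX$ needs to be solved for each camera, where $x$ is the 2D point, $X$ is the 3D point in homogeneous coordinates, and $\lambda$ is some unknown scalar.
$\lambda$ can be eliminated using the idea that two vectors are collinear iff their cross product is zero, $x \times (PX) = 0$. Expanding this cross product, with $p^{iT}$ denoting the $i$-th row of $P$, yields

$$
\begin{aligned}
u\,(p^{3T}X) - p^{1T}X &= 0 \\
v\,(p^{3T}X) - p^{2T}X &= 0 \\
u\,(p^{2T}X) - v\,(p^{1T}X) &= 0
\end{aligned}
$$

Only 2 of these equations are linearly independent (the third equals $v$ times the first minus $u$ times the second). Therefore, for two cameras, you end up with $AX = 0$, where

$$
A = \begin{bmatrix}
u\,p^{3T} - p^{1T} \\
v\,p^{3T} - p^{2T} \\
u'\,p'^{3T} - p'^{1T} \\
v'\,p'^{3T} - p'^{2T}
\end{bmatrix}
$$

is a 4x4 matrix. This is the system of equations that needs to be solved.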
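The claim that the cross-product constraint gives only two independent equations per camera can be checked numerically. The following is a minimal sketch, assuming NumPy; the camera `P`, the point `X_true`, and the helper `skew` are made up for illustration.

```python
import numpy as np

def skew(v):
    """Return the 3x3 matrix [v]_x such that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical camera: identity intrinsics, located at the origin, looking down +Z.
P = np.hstack([np.eye(3), np.zeros((3, 1))])
X_true = np.array([0.2, -0.1, 4.0, 1.0])  # homogeneous 3D point (illustrative)

# Project and normalize so the image point has the form (u, v, 1).
x = P @ X_true
x = x / x[2]

# x cross (P X) = 0 can be written as (skew(x) @ P) @ X = 0: three linear equations in X.
C = skew(x) @ P
rank = np.linalg.matrix_rank(C)  # only 2 of the 3 rows are independent
residual = C @ X_true            # the true point satisfies all three equations

print(rank, residual)
```

Because each camera contributes rank 2, two cameras give four independent rows in general, which is why two rows per view are stacked into the 4x4 system.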
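This system is solved using SVD (black box this concept): the solution is the right singular vector of the 4x4 matrix that belongs to its smallest singular value, which gives the 3D point $X$ in homogeneous coordinates. You now just need to de-homogenize using the methods in Homogeneous System, which yields the Euclidean triangulated 3D point.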
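The whole linear triangulation step can be sketched in a few lines. This is a minimal illustration assuming NumPy, not a reference implementation; the function names `triangulate_dlt` and `project`, the two cameras, and the test point are all hypothetical.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.

    P1, P2: 3x4 projection matrices.
    x1, x2: corresponding image points as (u, v).
    Returns the Euclidean 3D point as a length-3 array.
    """
    u1, v1 = x1
    u2, v2 = x2
    # Two independent rows per camera: u * p3 - p1 and v * p3 - p2.
    A = np.stack([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # The null vector of A is the last row of Vt (smallest singular value).
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]
    # De-homogenize: divide by the last coordinate.
    return X_h[:3] / X_h[3]

def project(P, X):
    """Project a Euclidean 3D point with P and return (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical stereo pair: identity intrinsics, second camera centered at (1, 0, 0).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, -0.1, 4.0])

X_est = triangulate_dlt(P1, P2, project(P1, X_true), project(P2, X_true))
print(X_est)
```

With noise-free projections the estimate matches the true point up to floating-point error. With noisy image points the matrix no longer has an exact null vector, and the smallest singular vector is the least-squares solution under the constraint that the homogeneous vector has unit norm.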
Limitations
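- Sensitive to noise when the viewing rays from the two cameras are nearly parallel (short baseline relative to the point's depth), because the depth is then poorly constrained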
Nonlinear refinement usually follows, because DLT minimizes an algebraic error rather than the geometric reprojection error.
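One common form of refinement is to minimize the reprojection error starting from the linear estimate. The sketch below uses a plain Gauss-Newton loop with a finite-difference Jacobian, assuming NumPy; this is one possible choice rather than the method these notes prescribe. The names `refine_point`, `residuals`, and `project`, the cameras, and the perturbed starting point `X0` (a stand-in for a noisy DLT result) are all hypothetical.

```python
import numpy as np

def project(P, X):
    """Project a Euclidean 3D point with a 3x4 matrix P and return (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def residuals(X, Ps, xs):
    """Stacked reprojection residuals (predicted minus observed) over all views."""
    return np.concatenate([project(P, X) - x for P, x in zip(Ps, xs)])

def refine_point(X0, Ps, xs, iters=10, eps=1e-6):
    """Gauss-Newton refinement of a 3D point using a forward-difference Jacobian."""
    X = np.asarray(X0, dtype=float).copy()
    for _ in range(iters):
        r = residuals(X, Ps, xs)
        J = np.empty((r.size, 3))
        for k in range(3):
            dX = np.zeros(3)
            dX[k] = eps
            J[:, k] = (residuals(X + dX, Ps, xs) - r) / eps
        # Solve the linearized least-squares problem J @ step = -r.
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        X = X + step
    return X

# Hypothetical stereo pair: identity intrinsics, second camera centered at (1, 0, 0).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, -0.1, 4.0])
xs = [project(P1, X_true), project(P2, X_true)]

X0 = X_true + np.array([0.05, -0.05, 0.3])  # stand-in for an imperfect DLT estimate
X_ref = refine_point(X0, [P1, P2], xs)
print(X_ref)
```

In practice a damped solver such as Levenberg-Marquardt is often preferred over plain Gauss-Newton, since it is more forgiving when the starting point is poor.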