Computation of the Camera Matrix \(\mathtt{P}\)
(i)
\(\mathtt{K}\) provides the transformation between an image point and a ray in \(\mathbb{R}^3\). Once \(\mathtt{K}\) is known, the camera is termed calibrated. A calibrated camera is a direction sensor, able to measure the direction of rays — like a 2D protractor (8.7).
Suppose the camera matrix has zero skew:
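In the usual parameterization (assumed here; the symbols \(\alpha_x, \alpha_y, x_0, y_0\) are not fixed by the surrounding text), zero skew means the calibration matrix takes the form

\[
\mathtt{K} =
\begin{bmatrix}
\alpha_x & 0 & x_0 \\
0 & \alpha_y & y_0 \\
0 & 0 & 1
\end{bmatrix},
\]

so \(\mathtt{K}\) has four free parameters instead of five.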
Recall that one notion of degrees of freedom is the number of free parameters minus the number of constraints. The degrees of freedom of a fitting procedure describe the effective number of parameters used by this procedure, and hence provide a quantitative measure of estimator complexity [Tib].
Although \(\mathtt{P}\) has twelve entries, it is defined only up to scale in homogeneous coordinates, so a constraint such as \(\left\Vert \mathtt{P} \right\Vert_F = 1\) is required to obtain a unique solution. Hence there are eleven degrees of freedom. For the linear estimate, that remains unchanged even when the skew is known, because the entries of \(\mathtt{P}\) are products of the unknowns \(\left\{ \mathtt{K}, \mathtt{R}, \tilde{\mathbf{C}} \right\}\) and the zero-skew condition is therefore not a linear constraint on \(\mathtt{P}\).
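As demonstrated in Least-squares Minimization (Appendix 5), the minimal solution requires 5.5 point correspondences, that is, five points plus one coordinate of a sixth. The corresponding DLT system \(A \mathbf{p} = \boldsymbol{0}\), where \(\mathbf{p}\) stacks the twelve entries of \(\mathtt{P}\), has rank eleven due to the use of homogeneous coordinates. What follows is one way to extract the intrinsic parameters [MR]. Suppose the DLT estimates \(\mathtt{P}\) up to some unknown scale \(\rho\) such that
\[
\rho \mathtt{P} = \mathtt{K} \mathtt{R} \begin{bmatrix} \mathtt{I} & -\tilde{\mathbf{C}} \end{bmatrix}.
\]
Observe that, with \(M\) denoting the left \(3 \times 3\) block of \(\mathtt{P}\),
\[
B = M M^\top = \rho^{-2} \mathtt{K} \mathtt{R} \mathtt{R}^\top \mathtt{K}^\top = \rho^{-2} \mathtt{K} \mathtt{K}^\top.
\]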
The first step is to normalize \(B\) to cancel out \(\rho^{-2}\). Let \(\tilde{B}\) denote the normalized version of \(B\). The intrinsic parameters can be extracted as follows:
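One way to write the extraction out, as a sketch: assume an upper-triangular \(\mathtt{K}\) with entries \(\alpha_x, s, x_0, \alpha_y, y_0\) (these symbols are assumptions, not fixed by the text above) and take \(\tilde{B} = B / B_{33}\), which works because the bottom-right entry of \(\mathtt{K} \mathtt{K}^\top\) is one.

\[
\mathtt{K} =
\begin{bmatrix}
\alpha_x & s & x_0 \\
0 & \alpha_y & y_0 \\
0 & 0 & 1
\end{bmatrix},
\qquad
\tilde{B} = \mathtt{K} \mathtt{K}^\top =
\begin{bmatrix}
\alpha_x^2 + s^2 + x_0^2 & s \alpha_y + x_0 y_0 & x_0 \\
s \alpha_y + x_0 y_0 & \alpha_y^2 + y_0^2 & y_0 \\
x_0 & y_0 & 1
\end{bmatrix}
\]

Reading off the entries gives

\[
x_0 = \tilde{B}_{13}, \quad
y_0 = \tilde{B}_{23}, \quad
\alpha_y = \sqrt{\tilde{B}_{22} - y_0^2}, \quad
s = \frac{\tilde{B}_{12} - x_0 y_0}{\alpha_y}, \quad
\alpha_x = \sqrt{\tilde{B}_{11} - x_0^2 - s^2}.
\]

Taking the positive square roots is a further assumption (positive focal lengths). With zero skew, \(s = 0\) and the formula for \(\alpha_x\) simplifies accordingly.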
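The extrinsic parameters can be calculated as \(\mathtt{R} \propto \mathtt{K}^{-1} M\), rescaled so that \(\mathtt{R}\) is a rotation, and \(\tilde{\mathbf{C}} = -M^{-1} \mathbf{b}\), where \(\mathtt{P} = \begin{bmatrix} M & \mathbf{b} \end{bmatrix}\).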
Solving for \(\tilde{\mathbf{C}}\) without decomposing \(M\) gives an unambiguous solution because \(\rho\) cancels out. However, if \(M\) is singular, then the camera centre is at infinity, \(\mathbf{C} = \begin{bmatrix} \mathbf{d}^\top & 0 \end{bmatrix}^\top\) where \(M \mathbf{d} = \boldsymbol{0}\), because the camera centre is the right null vector of \(\mathtt{P}\). The foregoing illustrates that \(\mathtt{R}\) is the reason behind the four different solutions.
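The decomposition described above can be sketched in NumPy. This is an illustration under stated assumptions rather than the method of [MR]: the function name `decompose_camera` is hypothetical, \(M\) is assumed invertible, the focal lengths are assumed positive, and the sign of the unknown scale is resolved by requiring \(\det \mathtt{R} = +1\).

```python
import numpy as np


def decompose_camera(P):
    """Split P ~ K R [I | -C] into (K, R, C) for a finite camera.

    Hypothetical helper: assumes the left 3x3 block M is invertible and
    that K is upper triangular with positive diagonal and K[2, 2] = 1.
    """
    P = np.asarray(P, dtype=float)
    M, b = P[:, :3], P[:, 3]

    # Camera centre: M C = -b, so the unknown scale of P cancels out.
    C = -np.linalg.solve(M, b)

    # B = M M^T is a scaled copy of K K^T; dividing by B[2, 2] removes
    # the scale because (K K^T)[2, 2] = 1.
    B = M @ M.T
    Bn = B / B[2, 2]

    x0, y0 = Bn[0, 2], Bn[1, 2]
    ay = np.sqrt(Bn[1, 1] - y0**2)
    s = (Bn[0, 1] - x0 * y0) / ay
    ax = np.sqrt(Bn[0, 0] - x0**2 - s**2)
    K = np.array([[ax, s, x0], [0.0, ay, y0], [0.0, 0.0, 1.0]])

    # K^-1 M is R times the unknown scale; its magnitude is sqrt(B[2, 2]).
    R = np.linalg.solve(K, M) / np.sqrt(B[2, 2])
    if np.linalg.det(R) < 0:  # the scale was negative
        R = -R
    return K, R, C
```

Because the centre is obtained from \(M\) and the last column alone, it is unaffected by the sign and magnitude choices made when recovering \(\mathtt{K}\) and \(\mathtt{R}\).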
Given the intrinsic matrix \(\mathtt{K}\), the problem reduces to pose estimation, whose minimal case is P3P. [FB81] describes a unique solution when the number of point correspondences is at least six. However, it will fail when all of them are coplanar. Configurations in which the estimation fails are known as critical configurations: all 3D points and the camera centre lie on a special twisted cubic space curve (the horopter) that wraps around a circular cylinder (the dangerous cylinder). Notable degenerate cases of this geometry include:
- all object points at infinity (camera translation not estimable);
- the projection centre is coplanar with any three of the four object points;
- a 3D line and a circle in an orthogonal plane touching the line.
The last case is particularly troublesome for pose from any three points, or from a square or rectangle of four coplanar points, when the camera is in the region directly above the points.
[Tri99] explores the accuracy and robustness of P4P and P5P with partially uncalibrated intrinsics. The former assumes the focal length is the only unknown calibration parameter, while the latter assumes that both the focal length and the principal point are unknown. The proposed approach separates the problem into a DLT step and a multiresultant step. The results suggest that P4P and P5P are rarely worth the trouble: their accuracy is not as good as with six or more points, and their robustness is not even close to that of RANSAC P3P.
(ii)
[HLONolle94] is an excellent introduction to Perspective-n-Point (PnP). An instance of PnP where \(n = 3\) has in general four solutions. See [Haa][Par] for better illustrations.
(iii)
What follows are linear algorithms for estimating the finite projective camera \(\mathtt{P}\) under different conditions. Once \(\mathtt{P}\) is known, the intrinsics and extrinsics can be extracted using the classical formulation.
(a)
Given the camera location \(\tilde{\mathbf{C}}\), the minimal solution needs to satisfy
where \(\mathbf{p}_i^\top\) is a 3-vector, the \(i\text{-th}\) row of the left \(3 \times 3\) block \(\mathtt{K} \mathtt{R}\) of \(\mathtt{P}\).
By inspection, the camera matrix \(\mathtt{P}\) has nine free parameters remaining. Imposing \(\left\Vert \mathtt{K} \mathtt{R} \right\Vert_F = 1\) yields a unique solution, and hence only four point correspondences are needed to constrain the eight degrees of freedom.
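A minimal sketch of case (a), assuming the camera is written as \(\mathtt{P} = M \begin{bmatrix} \mathtt{I} & -\tilde{\mathbf{C}} \end{bmatrix}\) with \(M = \mathtt{K} \mathtt{R}\), so that each image point is proportional to \(M (\tilde{\mathbf{X}}_i - \tilde{\mathbf{C}})\). The function name and the row layout of the linear system are illustrative choices, not taken from the text.

```python
import numpy as np


def camera_from_known_centre(X, x, C):
    """Estimate P = M [I | -C] from >= 4 correspondences and a known centre C.

    X: (n, 3) world points, x: (n, 2) image points, C: (3,) camera centre.
    Hypothetical helper; returns P with the left block M at unit Frobenius norm.
    """
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)

    Y = X - C  # ray directions from the centre to each point
    rows = []
    for Yi, (u, v) in zip(Y, x):
        # u * (m3 . Y) - m1 . Y = 0 and v * (m3 . Y) - m2 . Y = 0
        rows.append(np.concatenate([-Yi, np.zeros(3), u * Yi]))
        rows.append(np.concatenate([np.zeros(3), -Yi, v * Yi]))
    A = np.array(rows)

    # The right singular vector of the smallest singular value is the
    # unit-norm least-squares solution of the homogeneous system A m = 0.
    _, _, Vt = np.linalg.svd(A)
    M = Vt[-1].reshape(3, 3)
    return np.hstack([M, -(M @ C)[:, None]])
```

With exactly four points the system has eight equations in nine unknowns, matching the eight degrees of freedom counted above; additional points turn it into a least-squares fit.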
(b)
Recall that \(\mathbf{v} = \det(M) \mathbf{m}_3\) is a vector in the direction of the principal axis, directed towards the front of the camera.
Given the direction of the camera’s principal ray \(\mathbf{v}\), the minimal solution needs to satisfy
where \(\mathbf{p}_i^\top\) is a 4-vector, the \(i\text{-th}\) row of \(\mathtt{P}\).
By inspection, \(\mathtt{P}\) has nine free parameters remaining, and the DLT is no longer a homogeneous system of linear equations. A least squares approximate solution can be obtained when there are at least 4.5 point correspondences, that is, nine equations (five points in practice), constraining the nine degrees of freedom.
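A minimal sketch of case (b), assuming the overall scale of \(\mathtt{P}\) is fixed by setting the first three entries of its third row equal to \(\mathbf{v}\). The nine unknowns are then the first two rows of \(\mathtt{P}\) and the entry \(p_{34}\), and the known terms move to the right-hand side. The function name and the ordering of the unknowns are illustrative assumptions.

```python
import numpy as np


def camera_from_known_axis(X, x, v):
    """Estimate P from >= 5 correspondences when the third row of M equals v.

    X: (n, 3) world points, x: (n, 2) image points, v: (3,) principal-ray
    direction scaled so that it coincides with the third row of M.
    Hypothetical helper solving an inhomogeneous least-squares problem.
    """
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)

    A, b = [], []
    for Xi, (u, w) in zip(X, x):
        Xh = np.append(Xi, 1.0)
        d = v @ Xi  # known part of the third-row product
        # p1 . Xh - u * p34 = u * d   and   p2 . Xh - w * p34 = w * d
        A.append(np.concatenate([Xh, np.zeros(4), [-u]]))
        b.append(u * d)
        A.append(np.concatenate([np.zeros(4), Xh, [-w]]))
        b.append(w * d)

    theta, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.vstack([theta[:4], theta[4:8], np.append(v, theta[8])])
```

Fixing the third row removes the scale ambiguity, which is why the system has a non-zero right-hand side.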
(c)
Given the camera location \(\tilde{\mathbf{C}}\) and the direction of the camera’s principal ray \(\mathbf{v}\), the minimal solution needs to satisfy
By inspection, \(\mathtt{P}\) has six free parameters remaining, and the DLT is no longer a homogeneous system of linear equations. A least squares approximate solution can be obtained when there are at least three point correspondences constraining the six degrees of freedom.
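A minimal sketch of case (c), assuming \(\mathtt{P} = M \begin{bmatrix} \mathtt{I} & -\tilde{\mathbf{C}} \end{bmatrix}\) with the third row of \(M\) set equal to \(\mathbf{v}\). The six unknowns are the first two rows of \(M\), and they decouple into two three-unknown systems that share the same coefficient matrix. The function name is hypothetical.

```python
import numpy as np


def camera_from_centre_and_axis(X, x, C, v):
    """Estimate P from >= 3 correspondences given centre C and third row v of M.

    X: (n, 3) world points, x: (n, 2) image points.
    Hypothetical helper; solves for the first two rows of M by least squares.
    """
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    v = np.asarray(v, dtype=float)

    Y = X - C                  # ray directions from the centre
    d = Y @ v                  # known third homogeneous coordinate
    rhs = x * d[:, None]       # m1 . Y = u * d and m2 . Y = w * d

    # One call solves both right-hand sides; the columns are m1 and m2.
    sol, *_ = np.linalg.lstsq(Y, rhs, rcond=None)
    M = np.vstack([sol.T, v])
    return np.hstack([M, -(M @ C)[:, None]])
```

With three points whose rays are not coplanar, the coefficient matrix is square and invertible, which matches the minimal count above.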
(d)
Given the camera location \(\tilde{\mathbf{C}}\) and orientation \(\mathtt{R}\), the minimal solution needs to satisfy
where \(\mathbf{k}_i^\top\) is the \(i\text{-th}\) row of \(\mathtt{K}\).
By inspection, \(\mathtt{P}\) has five free parameters remaining, and the DLT is no longer a homogeneous system of linear equations. A least squares approximate solution can be obtained when there are at least 2.5 point correspondences, that is, five equations, constraining the five degrees of freedom.
Note that in terms of rank deficiency, the second and third row equations are more robust than the first and second row equations.
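A minimal sketch of case (d), assuming an upper-triangular \(\mathtt{K}\) with unit bottom-right entry, so that each image point is proportional to \(\mathtt{K} \mathbf{Y}_i\) with \(\mathbf{Y}_i = \mathtt{R} (\tilde{\mathbf{X}}_i - \tilde{\mathbf{C}})\) known. The function name and the entry names `ax, s, x0, ay, y0` are illustrative. Under this parameterization the first row of \(\mathtt{K}\) carries three unknowns and the second row two, so the sketch uses the \(x\)-coordinates of at least three points and the \(y\)-coordinates of at least two.

```python
import numpy as np


def intrinsics_from_pose(X, x, C, R):
    """Estimate K from >= 3 correspondences given camera centre C and rotation R.

    X: (n, 3) world points, x: (n, 2) image points.
    Hypothetical helper; K is assumed upper triangular with K[2, 2] = 1.
    """
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    R = np.asarray(R, dtype=float)

    Y = (X - C) @ R.T  # points expressed in the camera frame

    # First row of K:  ax * Y1 + s * Y2 + x0 * Y3 = u * Y3
    ax, s, x0 = np.linalg.lstsq(Y, x[:, 0] * Y[:, 2], rcond=None)[0]
    # Second row of K:           ay * Y2 + y0 * Y3 = w * Y3
    ay, y0 = np.linalg.lstsq(Y[:, 1:], x[:, 1] * Y[:, 2], rcond=None)[0]

    return np.array([[ax, s, x0], [0.0, ay, y0], [0.0, 0.0, 1.0]])
```

If some entries of \(\mathtt{K}\) are already known, their terms can be moved to the right-hand side in the same way, leaving fewer unknowns to solve for.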
(e)
Given the setup in (d) and some subset of the internal camera parameters, the minimal solution needs to satisfy
By inspection, \(\mathtt{P}\) has fewer than five free parameters remaining, and the DLT is no longer a homogeneous system of linear equations. Suppose there are \(N\) degrees of freedom. A least squares approximate solution can be obtained when there are at least \(N / 2\) point correspondences. The rest of the linear algorithm proceeds as in (d).
(iv) Conflation of Focal Length and Position on Principal Axis
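The finite projective camera is defined as \(\mathtt{P} = \mathtt{K} \mathtt{R} \begin{bmatrix} \mathtt{I} & -\tilde{\mathbf{C}} \end{bmatrix}\).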
Given a 3D point represented in homogeneous coordinate as \(\mathbf{X} = \begin{bmatrix} \bar{\mathbf{x}}^\top & 1 \end{bmatrix}^\top\), the imaged position is at
The imaged point after dehomogenizing is given by \(\mathbf{x} = \tilde{\mathbf{x}}_0 + \frac{f}{\delta} \tilde{\mathbf{x}}\).
\(\Delta f\)
The imaged position of \(\mathbf{X}\) after an increase in camera focal length is at
The imaged point after dehomogenizing is given by
\(\Delta t_3\)
The camera matrix after a displacement \(\Delta t_3\) backwards along the principal axis is given by
where \(d_t = -\mathbf{r}_3^\top \tilde{\mathbf{C}} + \Delta t_3\). The image position is at
The imaged point after dehomogenizing is given by
Since \(\mathbf{X}\) is imaged at depth \(d\), (6.15) gives \(\delta = d\).
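The two cases can be compared directly. As a sketch, write \(\mathbf{x}'\) for the imaged point after the change and start from the dehomogenized form \(\mathbf{x} = \tilde{\mathbf{x}}_0 + \frac{f}{\delta} \tilde{\mathbf{x}}\) with \(\delta = d\). The factors \(k^f\) and \(k^t\) are names introduced here for illustration, and the displacement is assumed to increase the depth from \(d\) to \(d + \Delta t_3\).

\[
\Delta f: \quad
\mathbf{x}' = \tilde{\mathbf{x}}_0 + \frac{f + \Delta f}{d} \tilde{\mathbf{x}}
= \mathbf{x} + k^f \left( \mathbf{x} - \tilde{\mathbf{x}}_0 \right),
\qquad k^f = \frac{\Delta f}{f}
\]

\[
\Delta t_3: \quad
\mathbf{x}' = \tilde{\mathbf{x}}_0 + \frac{f}{d + \Delta t_3} \tilde{\mathbf{x}}
= \mathbf{x} - \frac{\Delta t_3}{d + \Delta t_3} \left( \mathbf{x} - \tilde{\mathbf{x}}_0 \right)
\approx \mathbf{x} + k^t \left( \mathbf{x} - \tilde{\mathbf{x}}_0 \right),
\qquad k^t = -\frac{\Delta t_3}{d}
\]

Both changes move the image point radially about the principal point \(\tilde{\mathbf{x}}_0\). For a point at depth \(d\) and \(\Delta t_3 \ll d\), an increase \(\Delta f\) in focal length is indistinguishable to first order from a displacement \(\Delta t_3 = -d \, \Delta f / f\), that is, a move forwards along the principal axis.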
(v) Pushbroom Camera Computation
[GH97] describes how the linear pushbroom (LP) camera can be computed using a DLT method with a minimum of seven points.
References
- FB81
Martin A Fischler and Robert C Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.
- GH97
Rajiv Gupta and Richard I Hartley. Linear pushbroom cameras. IEEE Transactions on pattern analysis and machine intelligence, 19(9):963–975, 1997.
- Haa
Trym Vegard Haavardsholm. Pose from known 3d points. http://www.uio.no/studier/emner/matnat/its/UNIK4690/v16/forelesninger/lecture_5_2_pose_from_known_3d_points.pdf. Accessed on 2017-09-14.
- MR
Ajmal Mian and Mehdi Ravanbakhsh. Cits 4402 computer vision: camera calibration. http://teaching.csse.uwa.edu.au/units/CITS4402/lectures/Lecture08-CameraCalibration.pdf. Accessed on 2017-09-14.
- Par
HyunSoo Park. Least square ++ 3d triangulation + camera registration. http://cis.upenn.edu/~cis580/Spring2015/Lectures/cis580-13-LeastSq-PnP.pdf. Accessed on 2017-09-14.
- Tib
Ryan Tibshirani. Degrees of freedom. http://www.stat.cmu.edu/~ryantibs/advmethods/notes/df.pdf. Accessed on 2017-09-14.
- Tri99
Bill Triggs. Camera pose and calibration from 4 or 5 known 3d points. In Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, volume 1, 278–284. IEEE, 1999.