1 Modelling Camera Residual Terms Using Reprojection Error and Photometric Error | June | Pranav Ganti
2 Outline: Overview; Geometry; Cameras - Part 1; Multi-view Geometry; Reprojection Error; Cameras - Part 2; Photometric Error; Application (SVO)
3 Overview Residual: the difference between the observed value and the estimated value. The cost/loss function is the function to be minimized, and is generally a function of the residual. Camera residuals: the formulation depends on indirect vs. direct methods; a value to be minimized, from which the camera pose can be estimated.
4 Geometry | Euclidean Space Euclidean geometry describes lines, circles, and angles. Issue: how do we represent points at infinity?
5 Geometry | Projective Space Projective space = Euclidean space + ideal points. Ideal points: points at infinity. Now, 2 lines always meet in a point! Projective space is derived from Euclidean space by adding points at infinity, which can be described using homogeneous coordinates.
6 Geometry | Homogeneous Coordinates Homogeneous coordinates in R^n are written as an (n+1)-vector. P^2: (x, y, 1)^T; P^3: (x, y, z, 1)^T. Ideal points: (a, b, ..., 0)^T. What about scaled points? (kx, ky, k)^T is in the equivalence class of (x, y, 1)^T (we'll revisit why later!). Euclidean space can be extended to projective space using homogeneous vectors.
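The conversions above can be sketched in a few lines of numpy; the helper names are ours, chosen for illustration:

```python
import numpy as np

def to_homogeneous(x):
    """Append a 1 to a Euclidean point, e.g. (x, y) -> (x, y, 1)."""
    return np.append(np.asarray(x, dtype=float), 1.0)

def from_homogeneous(xh):
    """Divide by the last coordinate to recover the Euclidean point."""
    return xh[:-1] / xh[-1]

# (kx, ky, k) represents the same point as (x, y, 1) for any k != 0
p = np.array([2.0, 3.0])
ph = to_homogeneous(p)           # [2, 3, 1]
scaled = 5.0 * ph                # [10, 15, 5]: same equivalence class
print(from_homogeneous(scaled))  # recovers [2. 3.]
```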
7 Geometry | Transformations Euclidean transform: rotation + translation. Affine transform: rotation + translation + stretching (linear scaling). For both Euclidean and affine transforms, points at infinity remain at infinity. What about a projective transform?
8 Geometry | Projective Transformations What properties of an object are preserved? Shape? Angles? Lengths? Distances? Straightness? A projective transformation is any mapping that preserves straight lines.
9 Geometry | Projective Transformations (cont.) A projective transformation is a mapping of homogeneous coordinates. Ideal points are not preserved: points at infinity are mapped to arbitrary points. For computer vision, projective space is convenient: treat 3D space as P^3 instead of R^3, and images as P^2. Useful for practical applications, even though we know points at infinity are our own construct.
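A quick numerical check of the straightness property: a projective transformation is any invertible 3x3 matrix acting on homogeneous coordinates, and collinear points stay collinear after mapping. The matrix H below is an arbitrary example, not from the slides:

```python
import numpy as np

# Arbitrary invertible 3x3 projective transformation (homography)
H = np.array([[1.0, 0.2, 3.0],
              [0.1, 0.9, -1.0],
              [0.01, 0.02, 1.0]])

# Three collinear points on the line y = 2x + 1, in homogeneous coordinates
pts = np.array([[0.0, 1.0, 1.0],
                [1.0, 3.0, 1.0],
                [2.0, 5.0, 1.0]])

mapped = (H @ pts.T).T
mapped = mapped / mapped[:, 2:3]  # dehomogenize (last coordinate -> 1)

# Three homogeneous points are collinear iff their stacked determinant is 0
det = np.linalg.det(mapped)
print(abs(det) < 1e-9)  # True: straight lines map to straight lines
```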
10 Cameras, Part 1 | Pinhole Camera Also known as the "camera obscura": the first type of camera. Light passes through an opening, and the image forms on the other side.
11 Cameras, Part 1 | Central Projection Cameras are a map between the 3D world and a 2D image; the projection loses 1 dimension. The mapping can be modelled via central projection: a ray from a 3D point passes through the camera's center of projection (COP) and intersects the image plane. If the 3D structure is planar, then there is no drop in dimension.
12 Cameras, Part 1 | Central Projection
13 Cameras, Part 1 | Central Projection For convenience, the image plane can be placed in front of the COP.
14 Cameras, Part 1 | Central Projection In essence, central projection is just a mapping P^3 -> P^2. The camera matrix P is a 3x4 matrix of rank 3: (x, y, w)^T = P (X, Y, Z, T)^T, where (x, y, w)^T are homogeneous coordinates of image space (P^2) and (X, Y, Z, T)^T are homogeneous coordinates of the 3D world (P^3).
15 Cameras, Part 1 | Rays and Points A ray passing through the COP projects to a single image point, so all points on the ray can be considered equal. Rays are image points, and rays can be represented as homogeneous coordinates. Calibration is needed to express the relative Euclidean geometry between image and world: with a calibrated camera, we can back-project 2 points in an image and then determine the angle between the two rays.
16 Cameras, Part 1 | Matrix Derivation Let's derive the camera matrix. Assumptions: the center of projection is the origin of R^3, and we use the pinhole camera model. By similar triangles: (X, Y, Z)^T -> (fX/Z, fY/Z, f)^T.
17 Cameras, Part 1 | Pinhole Recall: the image plane is located at a distance from the COP equal to the focal length. (Figure: the point (X, Y, Z, 1)^T projects through the COP (0, 0, 0)^T to (fX/Z, fY/Z, f)^T.)
18 Cameras, Part 1 | Pinhole Mapping from P^3 to P^2 using similar triangles. (Figure: (X, Y, Z, 1)^T maps through the COP (0, 0, 0)^T to (fX/Z, fY/Z, f)^T.)
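The similar-triangles mapping on this slide can be sketched directly; the function name is ours:

```python
import numpy as np

def pinhole_project(X, f):
    """Central projection by similar triangles: (X, Y, Z) -> (fX/Z, fY/Z)."""
    X = np.asarray(X, dtype=float)
    return f * X[:2] / X[2]

# A point half a unit right, one unit up, two units in front of the camera
print(pinhole_project([0.5, 1.0, 2.0], f=1.0))  # [0.25 0.5 ]
```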
19 Cameras, Part 1 | Camera Matrix With the Euclidean world frame at the COP, central projection becomes a linear map between homogeneous coordinates. It can be written as: (X, Y, Z, 1)^T -> (fX, fY, Z)^T = diag(f, f, 1) [I | 0] (X, Y, Z, 1)^T.
20 Cameras, Part 1 | Camera Matrix The previous equation assumes the origin of image coordinates is at the principal point. A more general mapping includes the principal point offset (p_x, p_y): (X, Y, Z, 1)^T -> (fX + Zp_x, fY + Zp_y, Z)^T.
21 Cameras, Part 1 | Camera Matrix K is the camera calibration matrix; a skew parameter s can also be added. We can then express x = K [I | 0] X_cam, where X_cam is (X, Y, Z, 1)^T expressed in a coordinate frame at the COP.
22 Cameras, Part 1 | Camera Matrix The world coordinate frame is not always expressed at the COP; for example, a moving camera! The two coordinate frames are related through a rotation and a translation.
23 Cameras, Part 1 | Camera Matrix The equation can now be expressed as x = K [R | t] X, i.e. P = K [R | t].
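Putting K, R, and t together, a minimal numpy sketch of building P = K[R|t] and projecting a world point. The intrinsic values below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical intrinsics: focal length f, principal point (px, py), no skew
f, px, py = 800.0, 320.0, 240.0
K = np.array([[f, 0.0, px],
              [0.0, f, py],
              [0.0, 0.0, 1.0]])

# Identity rotation, camera shifted 1 unit back along the optical axis
R = np.eye(3)
t = np.array([[0.0], [0.0], [1.0]])

P = K @ np.hstack([R, t])            # 3x4 camera matrix

Xw = np.array([0.0, 0.0, 1.0, 1.0])  # homogeneous world point on the optical axis
x = P @ Xw
x = x[:2] / x[2]                     # dehomogenize
print(x)                             # [320. 240.]: lands on the principal point
```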
24 Cameras, Part 1 | Projections Forward projection maps a point in 3D space to an image point: x = PX. Back projection: from a point x in an image, determine the set of 3D points that map to x, i.e. a ray in space passing through the camera center. How can we obtain the back projection?
25 Cameras, Part 1 | Back Projection The camera center C is the null space of P (PC = 0). We know 2 points on the ray: the COP (since PC = 0) and the image point back-projected as P+ x, where P+ = P^T (P P^T)^(-1) is the pseudo-inverse of P. Why is P+ x the second point? It projects to x: P (P+ x) = I x = x. The ray is then the line connecting these two points.
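The back-projection recipe can be checked numerically for a toy camera P = [I | 0]:

```python
import numpy as np

# Simple camera: K = I, R = I, t = 0, so P = [I | 0]
P = np.hstack([np.eye(3), np.zeros((3, 1))])

# Camera center C is the null space of P (here the origin, (0, 0, 0, 1)^T)
_, _, Vt = np.linalg.svd(P)
C = Vt[-1]
C = C / C[-1]

# Pseudo-inverse P+ = P^T (P P^T)^-1 gives a second point on the ray
P_pinv = P.T @ np.linalg.inv(P @ P.T)

x = np.array([0.25, 0.5, 1.0])  # homogeneous image point
X0 = P_pinv @ x                 # one 3D point that projects to x

# Every point on the ray X(lam) = P+ x + lam * C maps back to x
for lam in (0.0, 2.0, -3.0):
    proj = P @ (X0 + lam * C)
    print(proj / proj[2])       # always [0.25 0.5  1.  ]
```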
26 Cameras, Part 1 | Lenses The pinhole camera is an idealization, not a true representation of a real camera. We need to correct for distortions, so that images appear as if taken with a pinhole camera. Distortion can be radial or tangential.
27 Cameras, Part 1 | Lens Distortion Barrel distortion; pincushion distortion.
28 Cameras, Part 1 | Lens Correction Lens distortion occurs during the initial projection onto the image plane. With (x, y) the ideal coordinates and (x_d, y_d) the actual (distorted) coordinates, (x_d, y_d) = L(r) (x, y), where r is the Euclidean distance from the center of distortion and L(r) is the distortion factor, which can be solved for through calibration.
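As a sketch of what L(r) might look like: a common choice (assumed here, not stated on the slide) is a low-order polynomial L(r) = 1 + k1 r^2 + k2 r^4, with coefficients found by calibration:

```python
import numpy as np

def apply_radial_distortion(x, y, k1, k2, xc=0.0, yc=0.0):
    """Map ideal coordinates to distorted ones using a polynomial
    distortion factor L(r) = 1 + k1*r^2 + k2*r^4, centered at (xc, yc)."""
    r2 = (x - xc) ** 2 + (y - yc) ** 2
    L = 1.0 + k1 * r2 + k2 * r2 ** 2
    return xc + L * (x - xc), yc + L * (y - yc)

# Negative k1 pulls points toward the center: barrel distortion
print(apply_radial_distortion(1.0, 0.0, k1=-0.1, k2=0.0))  # (0.9, 0.0)
```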
29 Multi-view Geometry | Epipolar Geometry Motivation: to search for corresponding points in stereo matching. Baseline: the line joining the camera centers. Epipole: the point of intersection between the baseline and the image plane. Epipolar plane: a plane containing the baseline. Epipolar line: the intersection of an epipolar plane with the image plane.
30 Multi-view Geometry | Epipolar constraints
31 Multi-view Geometry | Fundamental Matrix The fundamental matrix F is the algebraic representation of epipolar geometry: it maps a point in one image to its epipolar line in the other, x -> l'. Two steps: map the point x to a point x' on the epipolar line; obtain l' as the line joining x' and the epipole e'.
32 Multi-view Geometry | Fundamental Matrix Properties: Correspondence: x'^T F x = 0. Transpose: if F is the fundamental matrix for the camera pair (P, P'), F^T is the fundamental matrix for the pair in the opposite order (P', P). Epipolar lines: l' = F x, l = F^T x'. Epipoles: F e = 0, e'^T F = 0. Methods to solve: 7-point algorithm, 8-point algorithm, RANSAC...
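These properties are easy to verify on a concrete F. For a rectified stereo pair (pure translation along x), the fundamental matrix takes a simple known form and epipolar lines are horizontal scanlines:

```python
import numpy as np

# Fundamental matrix for a rectified stereo pair: epipolar lines are y' = y
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

x = np.array([3.0, 2.0, 1.0])  # homogeneous point in the first image
l_prime = F @ x                # epipolar line in the second image
print(l_prime)                 # [ 0. -1.  2.]: the line y' = 2

# Any correspondence on that scanline satisfies x'^T F x = 0
x_prime = np.array([7.0, 2.0, 1.0])  # same row, different disparity
print(x_prime @ F @ x)               # 0.0
```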
33 Multi-view Geometry | Stereo Cameras
34 Reprojection Error The summed squared Euclidean distance between the projections of the estimated 3D point X and the measured image points, taken over 2 images.
35 Reprojection Error | Applications Fundamental matrix: the MLE of F (assuming Gaussian noise) minimizes the reprojection error, where x, x' are the ideal (noise-free) points obtained from x = P X. Both P and X can be modified to minimize this error. Recall P = K [R | t]: R and t represent the camera pose in the world frame! Bundle adjustment is similar, except the intrinsic parameters can also be modified.
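The two-view reprojection error can be sketched as follows; the two toy cameras and the slightly noisy measurements are made up for illustration:

```python
import numpy as np

def reprojection_error(P1, P2, X, x1, x2):
    """Summed squared Euclidean distance between the measured image points
    (x1, x2) and the projections of the estimated 3D point X."""
    def project(P, X):
        p = P @ X
        return p[:2] / p[2]
    d1 = project(P1, X) - x1
    d2 = project(P2, X) - x2
    return d1 @ d1 + d2 @ d2

# Two toy cameras: P1 = [I | 0], P2 translated by 1 unit along x
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X = np.array([0.5, 0.5, 2.0, 1.0])  # estimated 3D point
x1 = np.array([0.25, 0.25])         # measured image points (slightly noisy)
x2 = np.array([-0.24, 0.25])

err = reprojection_error(P1, P2, X, x1, x2)
print(err)
```

Minimizing this quantity over P and X (or over R, t, and X with fixed K) is exactly the refinement step described above.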
36 Cameras, Part 2 | Photosites Camera sensors consist of photosites, each of which quantifies the amount of light collected; the digitized information is a pixel. Sensor types: CCD (charge-coupled device), CMOS (complementary metal-oxide semiconductor).
37 Cameras, Part 2 | Shutter Rolling shutter; soft global shutter; hard global shutter.
38 Cameras, Part 2 | Intensity Image The resulting information from the image capture is an intensity image. This allows use of the entire image, as opposed to just keypoints; because this becomes dense, some direct methods only use patches of interest. The intensity image is defined as a map I on the image domain Omega. Recall that previously, cameras were maps P^3 -> P^2.
39 Photometric Error | SVO Notation Notation (from SVO): I_{k-1}, I_k: intensity images. T_{k,k-1}: frame transform. u: image coordinate. p: 3D point. d_u: depth at u. pi: R^3 -> R^2: camera projection model. pi^{-1}: inverse (back-)projection. k: camera frame of reference, or timestep. xi: twist coordinates in se(3).
40 Photometric Error | Principles Photometric error: the intensity difference between pixels observing the same 3D point in 2 scenes.
41 Photometric Error | Principles The intensity residual can be computed by back-projecting a 2D point from the previous image and reprojecting it into the current camera view. We then minimize the negative log-likelihood of the intensity residuals over the relative camera pose.
42 Photometric Error | Solving Intensity residuals are assumed normally distributed. The resulting problem is nonlinear in T_{k,k-1} and can be solved via the Gauss-Newton algorithm with an incremental update T(xi), where the current T_{k,k-1} is an estimate of the relative transformation and xi is in se(3).
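A minimal sketch of Gauss-Newton on a photometric residual, reduced to a 1D toy problem with a single unknown (a translation t between two 1D "images") rather than SVO's full se(3) pose; the signals and setup are invented for illustration:

```python
import numpy as np

xs = np.linspace(0.0, 2.0 * np.pi, 200)
I_prev = lambda x: np.sin(x)           # previous image (continuous model)
true_t = 0.3
I_cur = lambda x: np.sin(x + true_t)   # current image: shifted by true_t

t = 0.0                                # initial estimate of the shift
for _ in range(20):
    r = I_cur(xs) - I_prev(xs + t)     # intensity residuals r_i(t)
    J = -np.cos(xs + t)                # Jacobian dr_i/dt
    dt = -(J @ r) / (J @ J)            # Gauss-Newton step: -(J^T J)^-1 J^T r
    t += dt

print(t)  # converges to approximately 0.3
```

In SVO the same structure holds, except the residuals are pixel intensity differences after warping through the depth and projection model, and the unknown is the 6-DoF twist xi.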
43 Camera Residual Terms Reprojection error: a binary factor between a feature and a camera pose. Photometric error: a unary factor (at least in SVO), since there are no feature locations whose positions must be estimated.
44 Application | SVO Reprojection error: indirect VO/SLAM. Photometric error: direct VO/SLAM. SVO (Semi-direct Visual Odometry) takes advantage of both: an initial pose estimate using direct methods, with further refinement using indirect methods on keyframes.
45 Application | SVO Indirect methods extract features, match them, and then recover camera pose (+ structure) using epipolar geometry and reprojection error. Pros: robust matches even with high inter-image motion. Cons: extraction, matching, correspondence... can be quite costly. Direct methods estimate camera pose (+ structure) directly from intensity values and image gradients. Pros: can use all information in the image; more robust to motion blur and defocus; can outperform indirect methods. Cons: can also be costly, due to density.
46 Application | SVO SVO steps: Initial pose estimate through minimizing photometric error. Relaxation through feature alignment. Further refinement through reprojection error. In parallel: determine keyframes and extract features; estimate depth through the projection model.
47 Application | SVO Results:
48 Application | SVO 2.0 SVO 2.0:
49 References R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, 2003. C. Forster, M. Pizzoli, and D. Scaramuzza, "SVO: Fast semi-direct monocular visual odometry," in Robotics and Automation (ICRA), 2014 IEEE International Conference on, pp. 15-22, IEEE, 2014. C. Forster, Z. Zhang, M. Gassner, M. Werlberger, and D. Scaramuzza, "SVO: Semidirect visual odometry for monocular and multicamera systems," IEEE Transactions on Robotics, 2016.
50 Image References https://i.stack.imgur.com/SitTF.png https://en.wikipedia.org/wiki/Errors_and_residuals https://en.wikipedia.org/wiki/Euclidean_space https://en.wikipedia.org/wiki/Distortion_(optics) https://en.wikipedia.org/wiki/Camera_obscura