
3D reconstruction project

Required functionalities

Each group should implement a system with the following functionality:

  1. Gold standard F-matrix estimation from known correspondences. (Many of the required functions exist in LAB3, and in OpenCV).
  2. Computation of an E-matrix from F and known K, followed by extraction of R and t from E using the visible-points constraint.
  3. Robust Perspective-n-point (PnP) estimation for adding a new view and detecting outliers.
  4. Bundle adjustment for N cameras (N≥2) using known correspondences, intrinsic camera parameters (K-matrix), and an initial guess for the camera poses.
  5. Use a sparsity mask for the Jacobian in the bundle adjustment step (this speedup is absolutely essential if you use scipy.optimize.least_squares). Alternatively, in C++ you can use Ceres Solver, which handles the sparsity in a different way. Yet another option is to use PyTorch (ask your guide for details).
  6. Evaluate the robustness to noise in the implemented system, by measuring the camera pose errors of your reconstruction. Details can be found here.
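
As a rough illustration of the sparsity mask in item 5, the sketch below builds one for scipy.optimize.least_squares. It rests on assumptions that the project does not prescribe: each camera has 6 pose parameters, each 3D point has 3 parameters, the parameter vector stacks all cameras first and then all points, and each observation contributes two residuals (the x and y reprojection errors). The names ba_sparsity, cam_idx and pt_idx are hypothetical.

```python
import numpy as np
from scipy.sparse import lil_matrix


def ba_sparsity(n_cams, n_pts, cam_idx, pt_idx, n_cam_params=6):
    """Sparsity mask of the bundle adjustment Jacobian (illustrative sketch).

    cam_idx[i] and pt_idx[i] give the camera and the 3D point of observation i.
    Assumed parameter layout: [all camera parameters, all point parameters].
    """
    cam_idx = np.asarray(cam_idx)
    pt_idx = np.asarray(pt_idx)
    n_obs = cam_idx.size

    n_rows = 2 * n_obs                          # two residuals per observation
    n_cols = n_cam_params * n_cams + 3 * n_pts  # cameras first, then points
    A = lil_matrix((n_rows, n_cols), dtype=int)

    obs = np.arange(n_obs)
    # A residual depends only on the parameters of its own camera ...
    for k in range(n_cam_params):
        A[2 * obs, cam_idx * n_cam_params + k] = 1
        A[2 * obs + 1, cam_idx * n_cam_params + k] = 1
    # ... and on the coordinates of its own 3D point.
    for k in range(3):
        A[2 * obs, n_cam_params * n_cams + pt_idx * 3 + k] = 1
        A[2 * obs + 1, n_cam_params * n_cams + pt_idx * 3 + k] = 1
    return A
```

The mask can then be passed as jac_sparsity=A to scipy.optimize.least_squares (with method='trf'), so that the finite-difference Jacobian is only evaluated where a residual can actually depend on a parameter.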

In addition to the above, groups with four students should implement at least one of the following functionalities:

  1. Visualisation of the 3D model. Pick one of the following:
    • Use Poisson surface reconstruction (or screened PSR) to obtain a volumetric representation. Try e.g. the version in Meshlab, or Kazhdan's own implementation (which has more options). Both require you to estimate colour and surface normals for the 3D points, and to save these in PLY format. Use visibility in cameras to determine normal signs. See Kazhdan and Hoppe for a theoretical description.
    • Use space carving (as in Fitzgibbon, Cross and Zisserman) to generate a volumetric representation of your object.
    • Use texture patches (e.g. PMVS) or billboards, to represent the texture of the estimated 3D model.
  2. Find correspondences between pairs of views, and remove false matches using epipolar geometry and cross-checking. (This allows the implemented system to use any image set with known K.)
  3. Use your own camera to take images of an object, and reconstruct it. This requires estimation of the K-matrix, and possibly lens distortion, followed by image rectification. Note that this also requires finding your own correspondences (functionality #2), and that you need to carefully plan how to take the images.
  4. Densify your initial sparse SfM model, using dense stereo between selected frame pairs. PatchMatch can be used, for example, or one of the approaches in the Multiview Stereo tutorial.
  5. Use next-best-view selection (e.g. as in Schönberger et al., or some simplified form thereof) to select the order in which cameras are added to SfM.
  6. Replace Incremental SfM with Global SfM. Use global estimation of all rotation matrices as an initialisation of Bundle Adjustment (as in Martinec and Pajdla).
  7. Write your own non-linear solver, e.g. Levenberg-Marquardt (see the report by Madsen et al.), and use the Schur complement trick to speed up the computation of the update step (see the IREG compendium).

Groups with five students are required to implement two of the above, and groups with six members should implement three.
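
The Schur complement trick mentioned in item 7 can be sketched as follows. The notation is an assumption made for this example, not taken from the IREG compendium: the (already damped) normal equations are partitioned into a camera block A, a block-diagonal point block C with one 3x3 block per 3D point, and a coupling block B. The function name schur_solve is hypothetical.

```python
import numpy as np


def schur_solve(A, B, C_blocks, a, b):
    """Solve [[A, B], [B.T, C]] @ [dc, dp] = [a, b] via the Schur complement.

    A        : (nc, nc) camera block
    B        : (nc, 3 * n_pts) camera/point coupling block
    C_blocks : (n_pts, 3, 3) diagonal blocks of the point block C
    a, b     : right-hand sides for the camera and point parameters
    """
    n_pts = C_blocks.shape[0]
    # C is block diagonal, so inverting it only needs n_pts small 3x3 inverses.
    C_inv = np.linalg.inv(C_blocks)

    # B @ C^{-1}, computed one point block at a time.
    B3 = B.reshape(B.shape[0], n_pts, 3)
    BCinv = np.einsum('cpi,pij->cpj', B3, C_inv).reshape(B.shape)

    # Reduced camera system: (A - B C^{-1} B^T) dc = a - B C^{-1} b
    S = A - BCinv @ B.T
    dc = np.linalg.solve(S, a - BCinv @ b)

    # Back-substitution for the points: dp = C^{-1} (b - B^T dc)
    rhs = (b - B.T @ dc).reshape(n_pts, 3)
    dp = np.einsum('pij,pj->pi', C_inv, rhs).reshape(-1)
    return dc, dp
```

The only large linear system that has to be solved is the reduced camera system S, whose size depends on the number of cameras and not on the number of 3D points. This sketch keeps B dense for readability; a real implementation would presumably exploit its sparsity as well.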

Datasets

The Visual Geometry Group at Oxford University has a number of datasets available on their web site, e.g., the dinosaur sequence:

These datasets contain coordinates of image points that are tracked between images. This means that the image coordinates carry some amount of noise, and that the points are not visible over the entire sequence, since they are occluded in some or even most of the images. The datasets also contain a form of ground truth: estimated camera matrices for each view.

Note that the dinosaur sequence is a turn-table sequence, generated by rotating the object around a fixed axis. The acquisition geometry of this dataset makes it sensitive to the quality of the point correspondences used.

For debugging it is useful to work with a noise-free dataset. For this purpose we have "cleaned" the dinosaur dataset of any noise (up to the numerical accuracy of Matlab) and produced a dataset that contains 2D image points, 3D points, and the camera matrices. These are found in the BAdino2.mat file on the local ISY file system:

  • /courses/TSBB15/sequences/BAdino/BAdino2.mat
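
If you work in Python, a file like this can probably be read with scipy.io.loadmat. The minimal sketch below only lists the variables stored in a .mat file and their shapes, since the variable names inside BAdino2.mat are not stated here. The helper name list_mat_variables is hypothetical, and the sketch assumes the file was saved in a format older than Matlab v7.3, which is what loadmat can read.

```python
from scipy.io import loadmat


def list_mat_variables(path):
    """Return {variable name: array shape} for the variables in a .mat file."""
    data = loadmat(path)
    # loadmat also returns metadata entries such as '__header__'; skip them.
    return {name: getattr(value, "shape", None)
            for name, value in data.items()
            if not name.startswith("__")}
```

For example, list_mat_variables("/courses/TSBB15/sequences/BAdino/BAdino2.mat") should show which arrays hold the 2D points, the 3D points and the camera matrices.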

The EPFL multi-view stereo datasets are also highly recommended:

They all have undistorted high-res images, with ground-truth camera poses and known intrinsic camera parameters. However, you have to find the correspondences yourself here.
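
When you have to find the correspondences yourself, cross-checking is a simple first filter for false matches: a match is kept only if the two descriptors are each other's nearest neighbour. The sketch below is one possible brute-force version in NumPy. It assumes that feature descriptors have already been extracted and are stored as the rows of two arrays, and the function name cross_check_matches is hypothetical. Filtering with epipolar geometry would still be needed afterwards.

```python
import numpy as np


def cross_check_matches(desc1, desc2):
    """Mutual nearest-neighbour matching between two descriptor sets.

    desc1 : (n1, d) array, desc2 : (n2, d) array.
    Returns an (m, 2) array of index pairs (i in desc1, j in desc2).
    """
    # Pairwise squared Euclidean distances, shape (n1, n2).
    d = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=2)
    nn12 = d.argmin(axis=1)   # best match in desc2 for each row of desc1
    nn21 = d.argmin(axis=0)   # best match in desc1 for each row of desc2
    i = np.arange(desc1.shape[0])
    keep = nn21[nn12] == i    # keep only matches that agree in both directions
    return np.column_stack([i[keep], nn12[keep]])
```

The distance matrix needs memory proportional to n1 * n2 * d, so for large images this brute-force version may have to be replaced by a tree-based or library matcher.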

Python code

Please look at the utility code for CE3. You can find it at /courses/TSBB15/python/ in the computer labs. Also look at the Extra exercises in the lab sheet; they are meant to help you get started with the projects.

Matlab code

If you use Matlab for solving the tasks in this project, there are some software packages that will spare you from implementing all functionalities yourself:

C code

You are absolutely NOT allowed to use complete SfM systems such as COLMAP, SBA, Bundler, Visual SFM, SSBA, and OpenMVG.

On the other hand, you are encouraged to use a non-linear least squares package, and if you intend to make a more advanced representation of your 3D model, you may use a package for this. Some useful (and allowed) packages are listed below.

  • Ceres Solver, a library for efficient non-linear optimization.
  • levmar, and lmfit, Levenberg-Marquardt optimisation packages.
  • sparseLM, a sparse Levenberg-Marquardt optimisation package (much faster, but also more complex to set up).
  • PMVS, a package to compute 3D models from images and camera poses.
  • OpenCV, implements several local invariant features, and some geometry functions.
  • Lambda twist P3P by Mikael Persson and Klas Nordberg.
  • P3P by Laurent Kneip.
  • OpenGV, a collection of geometric solvers for calibrated epipolar geometry.

Deliverables

In order to pass, each project group should do the following:

  1. Make a design plan, and get it approved by the guide.
    The design plan should contain the following:
    • A list of the tasks/functionalities that will be implemented
    • A group member list, with responsibilities for each person (e.g. what task)
    • A flow-chart of the system components
    • A brief description of the components
  2. Deliver a good presentation at the seminar.
  3. Hand in a written report to the project guide.
    The report should contain the following:
    • A group member list, with responsibilities for each person
    • A description of the problem that is solved
    • How the problem is solved
    • What the result is (i.e. performance relative to ground truth)
    • Why the result is what it is
    • References to used methods

We recommend that you use the CVPR LaTeX template when writing the report.

References


Last updated: 2021-12-22