This project is devoted to research into combining sketch-based modeling with computer vision techniques to reconstruct complex real-world objects from videos and image sequences. The project was started in 2004 in cooperation with the Samsung Advanced Institute of Technology (SAIT) and continued as an internal project of the laboratory.
Despite serious progress in shape-from-X reconstruction techniques, fully automated reconstruction of arbitrary real-world objects still fails to yield acceptable results across a broad range of object shapes, image types, and scene complexity. Hence, research into an efficient and easy-to-use combination of automated and manual 3D modeling from images is very promising.
Figure: Images from input sequences of 35 frames.
The key idea of the project is to support computer vision techniques with user interaction, which makes the overall scheme more robust. The project goals include:
The overall reconstruction process comprises the following steps:
Camera calibration is vital for all 3D reconstruction methods that are essentially based on triangulating the positions of points and lines in 3D space from their projections extracted from images. Common interactive 3D reconstruction software such as “Canoma” or “ImageModeler” relies on tedious manual matching of a large set of points and lines between several input images for camera calibration. The amount of tedious user interaction limits the applicability and ease of use of such systems. We have developed an automatic camera calibration system based on the well-known sparse structure and motion (SAM) estimation approach, which is also the basis of film-industry camera calibration software like “Boujou” by 2d3.
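To illustrate the triangulation that calibration makes possible, here is a minimal sketch of the midpoint method: given two calibrated viewing rays, it returns the 3D point closest to both. The function name and the ray representation (camera center plus direction) are assumptions made for this example, not part of the system described here.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Return the 3D point closest to two viewing rays.

    Each ray is given by a camera center c and a direction d (3-tuples).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    # With noisy rays the two closest points differ; return their midpoint.
    return tuple((x + y) / 2 for x, y in zip(p1, p2))
```

With noise-free rays the two closest points coincide and the original 3D point is recovered. With noisy rays the midpoint is only an approximation, which is one reason camera poses and points are usually refined jointly afterwards.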
Sparse SAM consists of two steps: feature tracking and camera calibration. We have developed a feature tracking framework that provides several enhancements over existing methods. Combining guided tracking with matching reduces the number of erroneously “lost” tracks. Uniform feature detection across the image increases the robustness of camera tracking in the presence of outlier-dense natural textures such as trees.
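One common way to obtain a uniform feature distribution is grid bucketing: the image is divided into cells and only the strongest few detections per cell are kept. The sketch below assumes this strategy. The text does not say how uniformity is actually achieved, and the function name, cell size, and per-cell limit are hypothetical.

```python
def select_uniform_features(features, width, height, cell=32, per_cell=2):
    """Keep at most `per_cell` strongest features in each grid cell.

    `features` is a list of (x, y, strength) tuples in pixel coordinates.
    """
    buckets = {}
    for x, y, strength in features:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore detections outside the image
        key = (int(x) // cell, int(y) // cell)
        buckets.setdefault(key, []).append((x, y, strength))

    selected = []
    for key in sorted(buckets):
        # Strongest detections first within each cell.
        strongest = sorted(buckets[key], key=lambda f: f[2], reverse=True)
        selected.extend(strongest[:per_cell])
    return selected
```

Capping the count per cell prevents a densely textured region, such as foliage, from taking most of the feature budget.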
Our current implementation of camera calibration relies on supplied intrinsic camera parameters (e.g. focal length), which can be obtained directly from the camera or with a pattern-based off-line camera calibration toolbox. Extrinsic calibration (the camera trajectory) is then calculated automatically using a sequential or hierarchical scheme. Camera poses are finally refined by bundle adjustment.
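Bundle adjustment minimizes the reprojection error over all cameras and 3D points. The sketch below only evaluates that error for a simple pinhole model, leaving out the optimizer. It assumes known intrinsics (focal length `f`, principal point `cx`, `cy`) shared by all frames and no lens distortion. All names are illustrative.

```python
import math

def project(point, R, t, f, cx, cy):
    """Project a 3D point with rotation R (3x3 rows) and translation t."""
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    return (f * xc[0] / xc[2] + cx, f * xc[1] / xc[2] + cy)

def rms_reprojection_error(observations, points, cameras, f, cx, cy):
    """RMS distance in pixels between observed and projected features.

    observations: list of (camera_index, point_index, u, v).
    cameras: list of (R, t) extrinsic parameters, one per frame.
    """
    total = 0.0
    for cam_i, pt_i, u, v in observations:
        R, t = cameras[cam_i]
        pu, pv = project(points[pt_i], R, t, f, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(total / len(observations))
```

A nonlinear least-squares solver would adjust the `(R, t)` pairs and the 3D points to reduce this value.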
Figure: Example of feature tracking and SAM estimation.
The camera calibration step provides a number of 3D points on the surfaces of scene objects. This information is exploited for shape estimation by guided model fitting. The user selects the object of interest in several images with a rectangular marquee. A parametric model of the selected type (box, cylinder, or plane) is then robustly fitted to the 3D point cloud.
Many real-world objects can be modeled as a combination of simple and complex shapes (e.g. a cup). We have developed a sketch-based modeling tool for reconstructing complex shapes such as a cup handle without tedious work for the user.
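As an illustration of robust model fitting, the sketch below fits a plane to a 3D point cloud with plain RANSAC. This is a stand-in chosen for brevity. The estimator actually used in the system may differ, and the threshold, iteration count, and function name are assumptions.

```python
import random

def fit_plane_ransac(points, threshold=0.01, iterations=200, seed=0):
    """Return ((normal, d), inliers) for the plane n.x + d = 0."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        # Minimal sample: three points define a plane.
        p, q, r = rng.sample(points, 3)
        u = [q[i] - p[i] for i in range(3)]
        v = [r[i] - p[i] for i in range(3)]
        n = [u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0]]
        norm = sum(c * c for c in n) ** 0.5
        if norm < 1e-12:
            continue  # degenerate (collinear) sample
        n = [c / norm for c in n]
        d = -sum(n[i] * p[i] for i in range(3))
        # Points within `threshold` of the plane support this hypothesis.
        inliers = [x for x in points
                   if abs(sum(n[i] * x[i] for i in range(3)) + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = (n, d), inliers
    return best_plane, best_inliers
```

In practice the winning hypothesis is usually refined by a least-squares fit to its inliers. Box and cylinder models follow the same sample-and-score pattern with a different minimal sample and distance function.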
Figure: Cup handle reconstruction from 2 silhouettes.
We are currently investigating additional approaches to shape reconstruction that can improve the user-guided geometric fitting. One of them is quasi-dense reconstruction, which is an intermediate step between sparse and dense reconstruction. Sparse SAM suffers from a low number of 3D points, while dense SAM is unreliable and noisy. Quasi-dense reconstruction benefits from the precision and reliability of the sparse matches, which are used as input, and greatly increases the number of matches and corresponding 3D points.
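A common formulation of quasi-dense matching is best-first propagation: reliable sparse matches act as seeds, and new matches are grown in their neighborhoods in order of decreasing similarity. The sketch below is a heavily simplified version of that idea and is not a description of the method under investigation here. The `score` callable is hypothetical, and neighbors are shifted by the same offset in both images. Real implementations allow a small disparity change between neighbors.

```python
import heapq

def propagate_matches(seeds, score, width, height, min_score=0.8, radius=1):
    """Grow pixel matches from sparse seeds, best score first.

    seeds: list of ((x1, y1), (x2, y2)) integer pixel matches.
    score: hypothetical callable score(p, q) -> similarity in [0, 1],
           for example a correlation of small windows around p and q.
    Both images are assumed to have the same width and height.
    """
    heap = [(-score(p, q), p, q) for p, q in seeds]
    heapq.heapify(heap)
    used1, used2, matches = set(), set(), []
    offsets = [(dx, dy)
               for dx in range(-radius, radius + 1)
               for dy in range(-radius, radius + 1)
               if (dx, dy) != (0, 0)]
    while heap:
        _, p, q = heapq.heappop(heap)
        if p in used1 or q in used2:
            continue  # uniqueness: each pixel is matched at most once
        used1.add(p)
        used2.add(q)
        matches.append((p, q))
        for dx, dy in offsets:
            np_ = (p[0] + dx, p[1] + dy)
            nq = (q[0] + dx, q[1] + dy)
            inside = all(0 <= a[0] < width and 0 <= a[1] < height
                         for a in (np_, nq))
            if not inside or np_ in used1 or nq in used2:
                continue
            s = score(np_, nq)
            if s >= min_score:
                heapq.heappush(heap, (-s, np_, nq))
    return matches
```

Because the most similar candidates are accepted first, a wrong candidate is less likely to claim a pixel before a better one.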
Figure: Reprojection and 3D view of quasi-dense 3D points.
Originally, our image-based modeling system was designed for small objects, with image sequences taken in a lab environment. Several examples of reconstructed objects are shown below. Note how precisely the complex shape of the cup handle is captured.



Later, the feature tracking and camera calibration algorithms were adapted to video sequences of city street objects captured with a generic hand-held photo camera such as the Canon IXUS 500. This allows the reconstruction of large objects such as car garages and buildings:




[1] Anton Konouchine, Victor Gaganov, Vladimir Vezhnevets. "Combined Guided Tracking and Matching with Adaptive Track Initialization". Graphicon-2005, Novosibirsk Akademgorodok, Russia, 2005. PDF (260 KB)
[2] Anton Konouchine. "A system for 3D real-world model reconstruction from image sequences". Lomonosov-2005, Moscow, Russia, 2005. PDF (101 KB) (in Russian)
[3] Anton Konushin, Victor Gaganov, Vladimir Vezhnevets. "AMLESAC: A New Maximum Likelihood Robust Estimator". Graphicon-2005, Novosibirsk Akademgorodok, Russia, 2005. PDF (419 KB)
[4] Anton Konushin, Kirill Marinichev, Vladimir Vezhnevets. "A survey of robust parameter estimation methods based on random sampling". Graphicon-2004, Moscow, Moscow State University, 2004. PDF (240 KB) (in Russian)
[5] Eugeney Lisitsin, Anton Konushin, Vladimir Vezhnevets. "Point feature tracking in video sequences with sharpness changes". Graphicon-2004, Moscow, 2004. PDF (324 KB) (in Russian)
Principal researcher
Lead researcher
Researchers
Collaborators at the SAIT side
Dr. Anton Konouchine
ktosh [at] graphics [dot] cs [dot] msu [dot] ru