Incremental Structure from Motion for Small-Scale Scenes Based on Auxiliary Calibration
Abstract
1. Introduction
2. Related Works
2.1. Structure from Motion
2.2. Feature Extraction and Matching
2.3. Planar Marker
3. Materials and Methods
3.1. Camera Pose Estimation During Reconstruction
3.2. Feature-Point Enhancement Algorithm
3.3. Sparse Reconstruction
4. Results
4.1. Analysis of Factors Influencing Calibration Board
4.2. Comparing Feature Point Extraction Algorithms on the Local Dataset
4.3. Evaluation of Feature Point Extraction Algorithms on the ETH3D Dataset
4.4. Sparse-Point-Cloud Reconstruction
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Ullman, S. The interpretation of structure from motion. Proc. R. Soc. London Ser. B Biol. Sci. 1979, 203, 405–426.
- Agarwal, S.; Furukawa, Y.; Snavely, N.; Simon, I.; Curless, B.; Seitz, S.M.; Szeliski, R. Building Rome in a day. Commun. ACM 2011, 54, 105–112.
- Chan, K.H.; Tang, C.Y.; Hor, M.K.; Wu, Y.L. Robust trifocal tensor constraints for structure from motion estimation. Pattern Recognit. Lett. 2013, 34, 627–636.
- Wu, C. Towards linear-time incremental structure from motion. In Proceedings of the 2013 International Conference on 3D Vision-3DV, Seattle, WA, USA, 29 June–1 July 2013; pp. 127–134.
- Wang, X.; Xiao, T.; Kasten, Y. A hybrid global structure from motion method for synchronously estimating global rotations and global translations. ISPRS J. Photogramm. Remote Sens. 2021, 174, 35–55.
- Triggs, B.; McLauchlan, P.F.; Hartley, R.I.; Fitzgibbon, A.W. Bundle adjustment: A modern synthesis. In Proceedings of the 1999 International Workshop on Vision Algorithms: Theory and Practice, Corfu, Greece, 21–22 September 1999; pp. 298–372.
- Cui, Z.; Tan, P. Global Structure-from-Motion by Similarity Averaging. In Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 864–872.
- Ozyesil, O.; Singer, A. Robust Camera Location Estimation by Convex Programming. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2674–2683.
- Yang, M.; Shen, C.L. Uncalibrated Two-Views 3D Reconstruction Based on Geometric Constraints in Scene. J. Image Graph. 2003, 8, 26–30.
- Szeliski, R.; Torr, P.H. Geometrically Constrained Structure from Motion: Points on Planes. In European Workshop on 3D Structure from Multiple Images of Large-Scale Environments; Springer: Berlin/Heidelberg, Germany, 1998; pp. 171–186.
- Dzitsiuk, M.; Sturm, J.; Maier, R.; Ma, L.; Cremers, D. De-noising, stabilizing and completing 3D reconstructions on-the-go using plane priors. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 3976–3983.
- Yang, X.; Jiang, G. A practical 3D reconstruction method for weak texture scenes. Remote Sens. 2021, 13, 3103.
- Gao, Y.; Liu, T.; Li, H. Stereo Matching Algorithm Based on Pixel Category Optimized Patch Match. Acta Opt. Sin. 2019, 39, 0715006.
- Liu, J.; Wang, Y. 3D surface reconstruction of small height object based on thin structured light scanning. Micron 2021, 143, 103022.
- Zhang, Q.P.; Cao, Y. Research on Three-Dimensional Reconstruction Algorithm of Weak Textured Objects in Indoor Scenes. Laser Optoelectron. Prog. 2021, 58, 197–203.
- Muñoz-Salinas, R.; Marín-Jimenez, M.J.; Yeguas-Bolivar, E.; Medina-Carnicer, R. Mapping and localization from planar markers. Pattern Recognit. 2018, 73, 158–171.
- DeGol, J.; Bretl, T.; Hoiem, D. Improved structure from motion using fiducial marker matching. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 273–288.
- Munoz-Salinas, R.; Marin-Jimenez, M.J.; Medina-Carnicer, R. SPM-SLAM: Simultaneous localization and mapping with squared planar markers. Pattern Recognit. 2019, 86, 156–171.
- Jia, Z.; Rao, Y.; Fan, H.; Dong, J. An efficient visual SfM framework using planar markers. IEEE Trans. Instrum. Meas. 2023, 72, 1–12.
- Shunyi, Z.; Xiaonan, W.; Dian, M. A Convenient 3D Reconstruction Method of Small Objects. Geomat. Inf. Sci. Wuhan Univ. 2015, 40, 147–152+158.
- Beardsley, P.; Torr, P.; Zisserman, A. 3D model acquisition from extended image sequences. In Lecture Notes in Computer Science, Proceedings of the Computer Vision—ECCV’96, 4th European Conference on Computer Vision, Cambridge, UK, 15–18 April 1996; Proceedings Volume II 4; Springer: Berlin/Heidelberg, Germany, 1996; pp. 683–695.
- Dellaert, F.; Seitz, S.M.; Thorpe, C.E.; Thrun, S. Structure from motion without correspondence. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No. PR00662), Seattle, WA, USA, 13–19 June 2000; Volume 2, pp. 557–564.
- Snavely, N.; Seitz, S.M.; Szeliski, R. Photo tourism: Exploring photo collections in 3D. ACM Trans. Graph. 2006, 25, 835–846.
- Fuhrmann, S.; Langguth, F.; Goesele, M. MVE - a multi-view reconstruction environment. GCH 2014, 3, 4.
- Schonberger, J.L.; Frahm, J.M. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 4104–4113.
- OpenMVS, O. Open Multi-View Stereo Reconstruction Library. GitHub Repos. 2020. Available online: https://github.com/cdcseacave/openMVS (accessed on 1 December 2024).
- Cignoni, P.; Callieri, M.; Corsini, M.; Dellepiane, M.; Ganovelli, F.; Ranzuglia, G. MeshLab: An open-source mesh processing tool. In Proceedings of the Eurographics Italian Chapter Conference 2008, Salerno, Italy, 2–4 July 2008; pp. 129–136.
- Moulon, P.; Monasse, P.; Marlet, R. Global fusion of relative motions for robust, accurate and scalable structure from motion. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, NSW, Australia, 1–8 December 2013; pp. 3248–3255.
- Yang, J.; Liu, L.; Xu, J.; Wang, Y.; Deng, F. Efficient global color correction for large-scale multiple-view images in three-dimensional reconstruction. ISPRS J. Photogramm. Remote Sens. 2021, 173, 209–220.
- Duggal, S.; Wang, S.; Ma, W.C.; Hu, R.; Urtasun, R. DeepPruner: Learning efficient stereo matching via differentiable patchmatch. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4384–4393.
- Viswanathan, D.G. Features from accelerated segment test (FAST). In Proceedings of the 10th Workshop on Image Analysis for Multimedia Interactive Services, London, UK, 6–8 May 2009.
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
- Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-up robust features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
- Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
- Alcantarilla, P.F.; Solutions, T. Fast explicit diffusion for accelerated features in nonlinear scale spaces. IEEE Trans. Patt. Anal. Mach. Intell. 2011, 34, 1281–1298.
- Leutenegger, S.; Chli, M.; Siegwart, R.Y. BRISK: Binary robust invariant scalable keypoints. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2548–2555.
- Jakubović, A.; Velagić, J. Image feature matching and object detection using brute-force matchers. In Proceedings of the 2018 International Symposium ELMAR, Zadar, Croatia, 16–19 September 2018; pp. 83–86.
- Wang, Z.; Zhang, Z.; Zhu, W.; Hu, X.; Deng, H.; He, G.; Kang, X. A robust planar marker-based visual SLAM. Sensors 2023, 23, 917.
- Munoz-Salinas, R.; Medina-Carnicer, R. UcoSLAM: Simultaneous localization and mapping by fusion of keypoints and squared planar markers. Pattern Recognit. 2020, 101, 107193.
- Xie, Y.; Huang, Z.; Chen, K.; Zhu, L.; Ma, J. MCGMapper: Light-Weight Incremental Structure from Motion and Visual Localization With Planar Markers and Camera Groups. arXiv 2024, arXiv:2405.16599.
- Germanese, D.; Leone, G.R.; Moroni, D.; Pascali, M.A.; Tampucci, M. Long-term monitoring of crack patterns in historic structures using UAVs and planar markers: A preliminary study. J. Imaging 2018, 4, 99.
- Gatrell, L.B.; Hoff, W.A.; Sklair, C.W. Robust image features: Concentric contrasting circles and their image extraction. Coop. Intell. Robot. Space II 1992, 1612, 235–244.
- Calvet, L.; Gurdjos, P.; Griwodz, C.; Gasparini, S. Detection and accurate localization of circular fiducials under highly challenging conditions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 562–570.
- Fiala, M. ARTag, a fiducial marker system using digital techniques. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 2, pp. 590–596.
- Olson, E. AprilTag: A robust and flexible visual fiducial system. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 3400–3407.
- Scaramuzza, D.; Fraundorfer, F. Visual odometry [tutorial]. IEEE Robot. Autom. Mag. 2011, 18, 80–92.
- Fraundorfer, F.; Scaramuzza, D. Visual odometry: Part II: Matching, robustness, optimization, and applications. IEEE Robot. Autom. Mag. 2012, 19, 78–90.
- Zhang, Z. Flexible camera calibration by viewing a plane from unknown orientations. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999; Volume 1, pp. 666–673.

Shooting Distance/cm | Chessboard Grid Edge Length/mm | Error/pix (Scene1) | Error/pix (Scene2) | Error/pix (Scene3) | Error/pix (Average)
---|---|---|---|---|---
50 | 10 | 1.7967 | 1.7979 | 1.7982 | 1.7976
50 | 13 | 0.4735 | 0.4763 | 0.4767 | 0.4755
50 | 16 | 0.2461 | 0.2453 | 0.2495 | 0.2470
50 | 19 | 0.7483 | 0.7534 | 0.7569 | 0.7528
75 | 10 | 1.9869 | 1.9827 | 1.9785 | 1.9827
75 | 13 | 0.5763 | 0.5726 | 0.5803 | 0.5764
75 | 16 | 0.3833 | 0.3820 | 0.3865 | 0.3840
75 | 19 | 0.8059 | 0.8037 | 0.7942 | 0.8012
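
The Average column above appears to be the arithmetic mean of the three per-scene errors. A minimal Python sketch, using only values copied from the table, recomputes that mean and picks the grid edge length with the lowest reported average at each shooting distance. The names `ROWS`, `mean_error`, and `best_edge_length` are illustrative and do not come from the paper.

```python
# Per-scene errors (pix) and the reported average, copied from the table above.
# Key: (shooting distance in cm, chessboard grid edge length in mm).
ROWS = {
    (50, 10): ([1.7967, 1.7979, 1.7982], 1.7976),
    (50, 13): ([0.4735, 0.4763, 0.4767], 0.4755),
    (50, 16): ([0.2461, 0.2453, 0.2495], 0.2470),
    (50, 19): ([0.7483, 0.7534, 0.7569], 0.7528),
    (75, 10): ([1.9869, 1.9827, 1.9785], 1.9827),
    (75, 13): ([0.5763, 0.5726, 0.5803], 0.5764),
    (75, 16): ([0.3833, 0.3820, 0.3865], 0.3840),
    (75, 19): ([0.8059, 0.8037, 0.7942], 0.8012),
}


def mean_error(errors):
    """Arithmetic mean of the per-scene errors (assumed definition of 'Average')."""
    return sum(errors) / len(errors)


def best_edge_length(rows, distance_cm):
    """Grid edge length (mm) with the lowest reported average at one distance."""
    candidates = {
        edge: reported
        for (dist, edge), (_, reported) in rows.items()
        if dist == distance_cm
    }
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    for (dist, edge), (errors, reported) in ROWS.items():
        print(dist, edge, round(mean_error(errors), 4), reported)
    for dist in (50, 75):
        print("lowest average error at", dist, "cm:", best_edge_length(ROWS, dist), "mm")
```

With these values, the recomputed means agree with the reported averages to within one unit in the last printed digit, and the 16 mm grid gives the lowest average error at both 50 cm and 75 cm.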

Scenes | SIFT CMN | SIFT Re-projection Error/pix | SIFT Time/s | AKAZE CMN | AKAZE Re-projection Error/pix | AKAZE Time/s | Ours CMN | Ours Re-projection Error/pix | Ours Time/s
---|---|---|---|---|---|---|---|---|---
Scene1 | 190 | 0.3445 | 9.4221 | 55 | 0.3389 | 7.7324 | 304 | 0.3389 | 9.3913
Scene2 | 118 | 0.2633 | 10.1486 | 106 | 0.2137 | 6.4070 | 198 | 0.2063 | 9.1821
Scene3 | 933 | 0.2323 | 10.0145 | 1194 | 0.2485 | 7.8439 | 1419 | 0.1841 | 10.4204

Scenes | Methods | Points Number | Re-projection Error/pix | Time/s
---|---|---|---|---
Scene1 | Paper [19] | 2216 | 0.7079 | 93.5579
Scene1 | Paper [20] | 2680 | 0.6402 | 137.6154
Scene1 | Ours | 4538 | 0.5245 | 123.2882
Scene2 | Paper [19] | 634 | 0.6459 | 97.8644
Scene2 | Paper [20] | 439 | 0.6493 | 104.6297
Scene2 | Ours | 822 | 0.4996 | 100.1918
Scene3 | Paper [19] | 4917 | 0.6336 | 95.0657
Scene3 | Paper [20] | 4609 | 0.6096 | 113.6048
Scene3 | Ours | 4925 | 0.4151 | 102.6768
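
The tables above report re-projection error in pixels without restating how it is computed. As a rough illustration only, the sketch below uses one common definition: the mean Euclidean distance between each observed pixel and the pinhole projection of its reconstructed 3D point. It assumes the points are already expressed in the camera frame and ignores lens distortion; the function names and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical, and the paper's own formula may differ.

```python
import math


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D point in camera coordinates (Z > 0) to pixels.

    Assumption: no lens distortion; fx, fy, cx, cy are hypothetical intrinsics.
    """
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)


def mean_reprojection_error(points_cam, observations, fx, fy, cx, cy):
    """Mean pixel distance between observed and re-projected image points."""
    if len(points_cam) != len(observations) or not observations:
        raise ValueError("need equally many points and observations, at least one")
    total = 0.0
    for point, (u, v) in zip(points_cam, observations):
        pu, pv = project(point, fx, fy, cx, cy)
        total += math.hypot(pu - u, pv - v)
    return total / len(observations)


if __name__ == "__main__":
    # Toy data, not from the paper: two points and their measured pixels.
    points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
    pixels = [(53.0, 54.0), (100.0, 50.0)]
    print(mean_reprojection_error(points, pixels, 100.0, 100.0, 50.0, 50.0))
```

In the toy data the first point projects to (50, 50), which is 5 pixels from its observation, and the second projects exactly onto its observation, so the mean is 2.5 pixels.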

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, S.; Li, J.; Yang, T.; A, X.; Liu, J. Incremental Structure from Motion for Small-Scale Scenes Based on Auxiliary Calibration. Sensors 2025, 25, 415. https://doi.org/10.3390/s25020415