Feature Pyramid Network Based Efficient Normal Estimation and Filtering for Time-of-Flight Depth Cameras
Abstract
1. Introduction
2. Background and Method
2.1. Related Work
2.1.1. Normal Estimation
2.1.2. 3D Filtering
2.2. The Proposed FPN Based Architecture Details
- Bottom-up path: The bottom-up construction is the feedforward computation of the convolutional neural network, which combines specific feature maps on each level. Because the feature maps differ in size, pixel-shuffle with bilinear interpolation was used for the upsampling. The information flows through the layers serially. With ResNet, we used the feature activation output from the last residual block. We denoted this with at every i-th layer. Unlike the work in [32], we also used the 0 layer with the strides of between the layers, to obtain the required resolution for the output image. A stable output was achieved by using ReLU.
- Top-down path: The purpose of this path is to simulate higher-resolution features. This is obtained by upsampling the more descriptive, yet sparser feature maps, denoted with . Between the bottom-up and top-down paths, the lateral connections and upsampling enforce the main features. The connection among the different paths is shown in Figure 1. Two convolutional layers follow the last layers and process the final feature map. Finally, a Sigmoid activation function is applied.
- Lateral connections: The layers are connected through a convolution with a stride of 1 and by element-wise addition. At every layer a traditional feature design is performed, with the corresponding dimensions from the two paths, thus generating the layers. We found this architecture to be relatively efficient in terms of runtime, although other architectures can be configured. The predicted layer summarizes the contribution of the individual layers and upsamples it to the required output resolution (in our case identical to the input depth image resolution).
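The interplay of the three components can be sketched in a few lines. The following is a minimal pure-Python illustration, not the authors' implementation. It assumes a 1×1 lateral kernel (as in the original FPN design [32]; the kernel size is not recoverable from the text here), an upscale factor of 2, and feature maps whose sizes already match after upsampling, so the bilinear resizing step is omitted. The function names `pixel_shuffle`, `conv1x1`, and `merge_level` are hypothetical.

```python
from typing import List

Map2D = List[List[float]]      # one channel, H x W
FeatureMap = List[Map2D]       # C x H x W


def pixel_shuffle(x: FeatureMap, r: int) -> FeatureMap:
    """Rearrange a (C*r*r, H, W) map into (C, H*r, W*r) (sub-pixel upsampling)."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(h * r):
            for j in range(w * r):
                # Each output pixel is read from one of the r*r input channels.
                src = c * r * r + (i % r) * r + (j % r)
                out[c][i][j] = x[src][i // r][j // r]
    return out


def conv1x1(x: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """1x1 convolution with stride 1: a per-pixel linear mix of input channels.

    weights[o][k] is the weight from input channel k to output channel o.
    (Assumed kernel size; bias omitted for brevity.)
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wk * x[k][i][j] for k, wk in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]


def merge_level(coarse: FeatureMap, fine: FeatureMap,
                lateral_weights: List[List[float]], r: int = 2) -> FeatureMap:
    """One top-down step: upsample the coarse map, add the lateral branch."""
    up = pixel_shuffle(coarse, r)            # top-down path
    lat = conv1x1(fine, lateral_weights)     # lateral connection
    if (len(up), len(up[0]), len(up[0][0])) != (len(lat), len(lat[0]), len(lat[0][0])):
        raise ValueError("shapes must match before element-wise addition")
    return [
        [[u + l for u, l in zip(row_u, row_l)] for row_u, row_l in zip(ch_u, ch_l)]
        for ch_u, ch_l in zip(up, lat)
    ]


if __name__ == "__main__":
    coarse = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]   # 4 channels, 1 x 1
    fine = [[[10.0, 10.0], [10.0, 10.0]],           # 2 channels, 2 x 2
            [[100.0, 100.0], [100.0, 100.0]]]
    print(merge_level(coarse, fine, [[1.0, 0.5]]))
```

In a real network the lateral weights are learned and the merged map is passed on to the next, finer level; the sketch only shows how the shapes line up so that element-wise addition is possible.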
2.3. ToFNest Normal Estimation
2.3.1. Normal Loss Function
2.3.2. Training Details
2.4. ToFClean Filtering
2.4.1. Loss Function
2.4.2. Derived Test Cases
3. Tests and Results
3.1. Comparing ToFNest to Other Methods
3.1.1. Dataset Used for Evaluation
3.1.2. Performance Evaluation and Comparison
3.1.3. Performance Evaluation on Noisy Data
3.1.4. Runtime Performance Evaluation on Different Platforms
3.1.5. Performance Evaluation on Custom Data
3.1.6. Cross-Validation
3.2. Comparing ToFClean to Other Methods
3.2.1. Performance Evaluation and Comparison
3.2.2. Runtime Performance Analysis
3.2.3. Integration into the PCN Pipeline
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
DDD | Deep Depth Denoising |
FPN | Feature Pyramid Network |
GT | Ground Truth |
MS | MultiScale |
PCN | PointCleanNet |
PCL | Point Cloud Library |
RGB-D | Red-Green-Blue-Depth |
SS | SingleScale |
References
1. Tamas, L.; Jensen, B. All-Season 3D Object Recognition Challenges. In Proceedings of the ICRA Workshop on Visual Place Recognition in Changing Environments, Hong Kong, China, 24 May 2014.
2. Frohlich, R.; Tamas, L.; Kato, Z. Absolute Pose Estimation of Central Cameras Using Planar Regions. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 377–391.
3. Che, E.; Olsen, M.J. Multi-scan segmentation of terrestrial laser scanning data based on normal variation analysis. J. Photogramm. Remote. Sens. 2018, 143, 233–248.
4. Hashimoto, T.; Saito, M. Normal Estimation for Accurate 3D Mesh Reconstruction with Point Cloud Model Incorporating Spatial Structure. In Proceedings of the CVPR Workshops, Long Beach, CA, USA, 16–20 June 2019; pp. 54–63.
5. Peng, S.; Jiang, C.M.; Liao, Y.; Niemeyer, M.; Pollefeys, M.; Geiger, A. Shape As Points: A Differentiable Poisson Solver. arXiv 2021, arXiv:2106.03452.
6. Blaga, A.; Militaru, C.; Mezei, A.D.; Tamas, L. Augmented reality integration into MES for connected workers. Robot. Comput. Integr. Manuf. 2021, 68, 102057.
7. Taubin, G. Curve and surface smoothing without shrinkage. In Proceedings of the IEEE International Conference on Computer Vision, Cambridge, MA, USA, 20–23 June 1995.
8. Alexa, M.; Behr, J.; Cohen-Or, D.; Fleishman, S.; Levin, D.; Silva, C. Computing and rendering point set surfaces. IEEE Trans. Vis. Comput. Graph. 2003, 9, 3–15.
9. Berger, M.; Tagliasacchi, A.; Seversky, L.M.; Alliez, P.; Guennebaud, G.; Levine, J.A.; Sharf, A.; Silva, C.T. A Survey of Surface Reconstruction from Point Clouds. Comput. Graph. Forum 2017, 36, 301–329.
10. Pistilli, F.; Fracastoro, G.; Valsesia, D.; Magli, E. Learning Robust Graph-Convolutional Representations for Point Cloud Denoising. IEEE J. Sel. Top. Signal Process. 2021, 15, 402–414.
11. Hermosilla, P.; Ritschel, T.; Ropinski, T. Total Denoising: Unsupervised Learning of 3D Point Cloud Cleaning. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 52–60.
12. Hyeon, J.; Lee, W.; Kim, J.H.; Doh, N. NormNet: Point-wise normal estimation network for three-dimensional point cloud data. Int. J. Adv. Robot. Syst. 2019, 16, 1729881419857532.
13. Lenssen, J.E.; Osendorfer, C.; Masci, J. Deep Iterative Surface Normal Estimation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 11244–11253.
14. Boulch, A.; Marlet, R. Deep Learning for Robust Normal Estimation in Unstructured Point Clouds. Comput. Graph. Forum 2016, 35, 281–290.
15. Wang, X.; Fouhey, D.F.; Gupta, A. Designing deep networks for surface normal estimation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 539–547.
16. Zhou, H.; Chen, H.; Feng, Y.; Wang, Q.; Qin, J.; Xie, H.; Wang, F.L.; Wei, M.; Wang, J. Geometry and Learning Co-Supported Normal Estimation for Unstructured Point Cloud. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 13235–13244.
17. Guerrero, P.; Kleiman, Y.; Ovsjanikov, M.; Mitra, N.J. PCPNet: Learning Local Shape Properties from Raw Point Clouds. Comput. Graph. Forum 2018, 37, 75–85.
18. Lu, D.; Lu, X.; Sun, Y.; Wang, J. Deep Feature-preserving Normal Estimation for Point Cloud Filtering. Comput. Aided Des. 2020, 125, 102860.
19. Ben-Shabat, Y.; Lindenbaum, M.; Fischer, A. Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
20. Molnar, S.; Kelenyi, B.; Tamas, L. ToFNest: Efficient normal estimation for time-of-flight depth cameras. In Proceedings of the ICCV Workshop on Assistive Computer Vision and Robotics, Virtual, 11 October 2021.
21. Hoppe, H.; DeRose, T.; Duchamp, T.; McDonald, J.; Stuetzle, W. Surface Reconstruction from Unorganized Points. In Proceedings of the 19th Annual Conference on Computer Graphics and Interactive Techniques, Chicago, IL, USA, 26–31 July 1992; Association for Computing Machinery: New York, NY, USA, 1992; pp. 71–78.
22. Wang, Z.; Prisacariu, V.A. Neighbourhood-Insensitive Point Cloud Normal Estimation Network. arXiv 2020, arXiv:2008.09965.
23. Mérigot, Q.; Ovsjanikov, M.; Guibas, L.J. Voronoi-Based Curvature and Feature Estimation from Point Clouds. IEEE Trans. Vis. Comput. Graph. 2011, 17, 743–756.
24. Dey, T.K.; Li, G.; Sun, J. Normal estimation for point clouds: A comparison study for a Voronoi based method. In Proceedings of the Eurographics/IEEE VGTC Symposium Point-Based Graphics, Stony Brook, NY, USA, 21–22 June 2005; pp. 39–46.
25. Dey, T.K.; Goswami, S. Provable surface reconstruction from noisy samples. Comput. Geom. 2006, 35, 124–141.
26. Cazals, F.; Pouget, M. Estimating differential quantities using polynomial fitting of osculating jets. Comput. Aided Geom. Des. 2005, 22, 121–146.
27. Guennebaud, G.; Gross, M. Algebraic Point Set Surfaces. ACM Trans. Graph. 2007, 26, 23.
28. Bormann, R.; Hampp, J.; Hägele, M.; Vincze, M. Fast and accurate normal estimation by efficient 3D edge detection. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 3930–3937.
29. Jordan, K.; Mordohai, P. A quantitative evaluation of surface normal estimation in point clouds. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014; pp. 4220–4226.
30. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
31. Ladický, L.; Zeisl, B.; Pollefeys, M. Discriminatively trained dense surface normal estimation. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Cham, Switzerland, 2014; pp. 468–484.
32. Lin, T.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
33. Buckland, W.R. Outliers in Statistical Data. J. Oper. Res. Soc. 1979, 30, 674–675.
34. Han, X.; Jin, J.; Wang, M.; Jiang, W.; Gao, L.; Xiao, L. A review of algorithms for filtering the 3D point cloud. Signal Process. Image Commun. 2017, 57, 103–112.
35. Pincus, R.; Barnett, V.; Lewis, T. Outliers in Statistical Data; John Wiley & Sons/Wiley: Hoboken, NJ, USA, 1994.
36. Zhang, D.; Lu, X.; Qin, H.; He, Y. Pointfilter: Point Cloud Filtering via Encoder-Decoder Modeling. IEEE Trans. Vis. Comput. Graph. 2021, 27, 2015–2027.
37. Buades, A.; Coll, B.; Morel, J.M. A non-local algorithm for image denoising. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–26 June 2005; Volume 2, pp. 60–65.
38. Cazals, F.; Pouget, M. Algorithm 889: Jet fitting 3: A Generic C Package for Estimating the Differential Properties on Sampled Surfaces via Polynomial Fitting. ACM Trans. Math. Softw. 2008, 35, 1–20.
39. Dinesh, C.; Cheung, G.; Bajić, I.V. Point Cloud Denoising via Feature Graph Laplacian Regularization. IEEE Trans. Image Process. 2020, 29, 4143–4158.
40. Hu, W.; Gao, X.; Cheung, G.; Guo, Z. Feature Graph Learning for 3D Point Cloud Denoising. IEEE Trans. Signal Process. 2020, 68, 2841–2856.
41. Dinesh, C.; Cheung, G.; Bajic, I. Super-Resolution of 3D Color Point Clouds Via Fast Graph Total Variation. In Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 1983–1987.
42. Dinesh, C.; Cheung, G.; Wang, F.; Bajić, I.V. Sampling Of 3d Point Cloud Via Gershgorin Disc Alignment. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 2736–2740.
43. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155.
44. Rakotosaona, M.J.; La Barbera, V.; Guerrero, P.; Mitra, N.J.; Ovsjanikov, M. PointCleanNet: Learning to Denoise and Remove Outliers from Dense Point Clouds. Comput. Graph. Forum 2020, 39, 185–203.
45. Jia, C.; Yang, T.; Wang, C.; Fan, B.; He, F. A new fast filtering algorithm for a 3D point cloud based on RGB-D information. PLoS ONE 2019, 14, e0220253.
46. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
47. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241.
48. Mujeeb, A.; Dai, W.; Erdt, M.; Sourin, A. One class based feature learning approach for defect detection using deep autoencoders. Adv. Eng. Inform. 2019, 42, 100933.
49. Silberman, N.; Hoiem, D.; Kohli, P.; Fergus, R. Indoor Segmentation and Support Inference from RGBD Images. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2012.
50. Rusu, R.B.; Cousins, S. 3D is here: Point Cloud Library (PCL). In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, 9–13 May 2011; pp. 1–4.
51. NVIDIA Isaac Sim | NVIDIA Developer. 2019. Available online: https://developer.nvidia.com/isaac-sim (accessed on 20 August 2021).
52. Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets robotics: The KITTI dataset. Int. J. Robot. Res. 2013, 32, 1231–1237.
53. Boulch, A.; Marlet, R. Fast and Robust Normal Estimation for Point Clouds with Sharp Features. Comput. Graph. Forum 2012, 31, 1765–1774.
54. Sterzentsenko, V.; Saroglou, L.; Chatzitofis, A.; Thermos, S.; Zioulis, N.; Doumanoglou, A.; Zarpalas, D.; Daras, P. Self-Supervised Deep Depth Denoising. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 1242–1251.
55. Rusu, R.B.; Marton, Z.C.; Blodow, N.; Dolha, M.; Beetz, M. Towards 3D Point Cloud Based Object Maps for Household Environments; Elsevier: Amsterdam, The Netherlands, 2008; Volume 56, pp. 927–941.
Comparison between the Normal Estimation Methods on Public Dataset

Metric | Own | Nesti-Net [19] | PCPNet SS [17] | PCPNet MS [17] | PCL [50] | Hough [53] | Ladicky [31] |
---|---|---|---|---|---|---|---|
Avg. hist. [%] | 0.94 | 0.93 | 0.89 | 0.91 | 0.90 | 0.85 | 0.90 |
Abs. angle [deg] | 19.61 | 21.25 | 27.75 | 24.35 | 25.31 | 31.90 | 26.23 |
Avg. runtime [s] | 0.02 | 1200.00 | 234.00 | 596.00 | 7.09 | 2.70 | - |
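The "Abs. angle [deg]" row reports an angular error between estimated and ground-truth (GT) normals. The exact evaluation protocol is not given in this section, so the following sketch shows only one common way to compute such a metric: the mean absolute angle between pairs of normals, with the sign of the normal ignored by default. The function names and the `unoriented` flag are hypothetical.

```python
import math
from typing import Iterable, Sequence


def angle_deg(pred: Sequence[float], gt: Sequence[float],
              unoriented: bool = True) -> float:
    """Angle in degrees between a predicted and a ground-truth normal."""
    dot = sum(p * g for p, g in zip(pred, gt))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(g * g for g in gt))
    if norm == 0.0:
        raise ValueError("normals must be non-zero vectors")
    cos = dot / norm
    if unoriented:
        # Assumption: n and -n describe the same surface orientation.
        cos = abs(cos)
    cos = max(-1.0, min(1.0, cos))   # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))


def mean_abs_angle_deg(preds: Iterable[Sequence[float]],
                       gts: Iterable[Sequence[float]],
                       unoriented: bool = True) -> float:
    """Mean angular error over paired normals (one pair per valid pixel)."""
    angles = [angle_deg(p, g, unoriented) for p, g in zip(preds, gts)]
    if not angles:
        raise ValueError("at least one pair of normals is required")
    return sum(angles) / len(angles)
```

Whether flipped normals count as errors changes the result considerably, so the orientation convention should be stated whenever numbers from different methods are compared.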
Runtime Comparison on Different Platforms

Device | RTX 3080 | Jetson NX | Jetson AGX | GTX 1060 | Colab |
---|---|---|---|---|---|
Time [s] | 0.02 | 0.31 | 0.23 | 0.05 | 0.11 |
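For a sense of throughput, the per-frame times in the runtime table can be converted to frame rates. The sketch below assumes one depth image per inference call with no batching or pipelining (an assumption, since the measurement setup is not described here). The values are copied from the table and are rounded, so the derived rates are approximate.

```python
# Per-frame inference times in seconds, copied from the runtime table.
runtime_s = {
    "RTX 3080": 0.02,
    "Jetson NX": 0.31,
    "Jetson AGX": 0.23,
    "GTX 1060": 0.05,
    "Colab": 0.11,
}


def frames_per_second(seconds_per_frame: float) -> float:
    """Convert a per-frame runtime into frames per second."""
    if seconds_per_frame <= 0:
        raise ValueError("runtime must be positive")
    return 1.0 / seconds_per_frame


fps = {device: frames_per_second(t) for device, t in runtime_s.items()}

if __name__ == "__main__":
    for device, rate in fps.items():
        print(f"{device:>10}: {rate:5.1f} frames/s")
```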
Custom Dataset Performance

Metric | Indoor | Outdoor |
---|---|---|
Avg. hist. [%] | 0.959 | 0.952 |
Abs. angle [deg] | 16.46 | 17.82 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Molnár, S.; Kelényi, B.; Tamas, L. Feature Pyramid Network Based Efficient Normal Estimation and Filtering for Time-of-Flight Depth Cameras. Sensors 2021, 21, 6257. https://doi.org/10.3390/s21186257