Train Distance Estimation in Turnout Area Based on Monocular Vision
Abstract
1. Introduction
- (1) A monocular vision-based distance estimation framework is designed to detect trains in turnout areas. To the best of the authors’ knowledge, this problem has not previously been addressed with other sensors or strategies, such as LiDAR or stereo vision.
- (2) An instance segmentation strategy is proposed for train side window detection. By treating the multiple side windows in each train carriage as separate entities, segmentation can be performed robustly and accurately even under poor illumination.
- (3) A geometric feature extraction strategy is proposed to obtain a quantitative representation of train side window contours, so that scale-based distance estimation can be achieved with accuracy acceptable for practical use.
2. Method
2.1. Framework Overview
2.2. Instance Segmentation of Train Side Windows
2.2.1. Train Side Window Segmentation Strategy
2.2.2. Instance Segmentation with YOLOv8
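As a concrete illustration of this step, a minimal inference sketch with the Ultralytics YOLOv8 segmentation API (the checkpoint and image path are placeholders; the paper’s trained side-window model is not public):

```python
# Minimal sketch of instance segmentation with Ultralytics YOLOv8-seg.
# The checkpoint and image path below are illustrative assumptions.
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")        # pretrained segmentation checkpoint
results = model("turnout_frame.jpg")  # run inference on a single frame

for r in results:
    if r.masks is None:
        continue
    # One binary mask per detected side-window instance (N x H x W).
    masks = r.masks.data.cpu().numpy()
    boxes = r.boxes.xyxy.cpu().numpy()  # matching bounding boxes
    print(f"{len(masks)} side-window instances detected")
```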
2.3. Distance Estimation Based on Geometric Features
2.3.1. Vertical Directions of Side Windows
- (1) Minimum bounding box fitting is performed on the instance segmentation regions to obtain a rough estimate of the width and height of the windows, depicted with red rectangles in Figure 8a,b.
- (2) Otsu’s thresholding and Canny edge detection are applied locally within the instance segmentation regions to extract the contour edges of the individual side windows in each train carriage, as shown in Figure 8c.
- (3) Hough line detection is applied to the contour edges, yielding a set of straight lines. Based on the dimensions and principal directions of the fitted bounding boxes, the lines corresponding to the vertical edges of the side windows are selected, and the directional average of these vertical lines gives the optimal estimate of the side windows’ vertical direction, depicted with green lines in Figure 8d (a code sketch follows this list).
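The three steps map naturally onto standard OpenCV primitives. A minimal sketch, assuming `gray` is a grayscale image crop and `mask` a binary instance mask of one carriage’s side windows (thresholds and the vertical tolerance are illustrative, not the paper’s tuned values):

```python
# Minimal OpenCV sketch of steps (1)-(3) above; thresholds and the
# 15-degree vertical tolerance are illustrative assumptions.
import cv2
import numpy as np

def estimate_vertical_direction(gray: np.ndarray, mask: np.ndarray) -> float:
    # (1) Rough width/height from the minimum-area bounding box.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    pts = np.vstack([c.reshape(-1, 2) for c in contours])
    (cx, cy), (w, h), _ = cv2.minAreaRect(pts)

    # (2) Local Otsu thresholding + Canny edges inside the segmented region.
    region = cv2.bitwise_and(gray, gray, mask=mask)
    _, binary = cv2.threshold(region, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    edges = cv2.Canny(binary, 50, 150)

    # (3) Hough lines; keep near-vertical ones and average their direction.
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=30,
                            minLineLength=int(0.5 * min(w, h)), maxLineGap=5)
    if lines is None:
        return np.pi / 2  # fall back to the image vertical
    angles = []
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        theta = np.arctan2(y2 - y1, x2 - x1) % np.pi  # direction in [0, pi)
        if abs(theta - np.pi / 2) < np.deg2rad(15):   # near vertical
            angles.append(theta)
    return float(np.mean(angles)) if angles else np.pi / 2
```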
2.3.2. Upper and Lower Edges of Side Windows
- (1) Auxiliary lines perpendicular to the side window’s vertical direction are constructed through the center of the minimum bounding box obtained from instance segmentation. Multiple sampling points are selected along these auxiliary lines at a fixed interval, depicted in khaki in Figure 8e.
- (2) From each sampling point, extension lines are drawn in both the positive and negative senses of the side window’s vertical direction. The intersections of these extension lines with the instance segmentation contours are computed as estimates of the side window’s upper and lower edges, depicted in yellow in Figure 8e.
- (3) Linear fitting is applied separately to the point sets representing the upper and lower edges, yielding linear estimates of the side window’s upper and lower edges, depicted with blue lines in Figure 8e (a code sketch follows this list).
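A minimal sketch of this sampling-and-fitting procedure, under the simplifying assumption that the windows’ vertical direction coincides with the image y-axis (the function name and sampling interval are illustrative):

```python
# Minimal sketch of steps (1)-(3) above; `mask` is a binary instance
# mask and `step` an illustrative sampling interval.
import cv2
import numpy as np

def fit_upper_lower_edges(mask: np.ndarray, step: int = 10):
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    pts = np.vstack([c.reshape(-1, 2) for c in contours])
    x, y, w, h = cv2.boundingRect(pts)

    upper, lower = [], []
    # (1)-(2) Sample along the horizontal auxiliary line through the box
    # center and scan up/down to where each column meets the mask contour.
    for sx in range(x, x + w, step):
        col = np.nonzero(mask[:, sx])[0]
        if col.size:
            upper.append((sx, col.min()))  # first contour hit going up
            lower.append((sx, col.max()))  # first contour hit going down

    # (3) Separate linear fits y = a*x + b to the upper and lower point sets.
    upper_fit = np.polyfit(*zip(*upper), 1)
    lower_fit = np.polyfit(*zip(*lower), 1)
    return upper_fit, lower_fit
```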
2.3.3. Distance Estimation Based on Approximated Pinhole Imaging Model
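The relation behind scale-based estimation is the classical pinhole proportion: with the camera focal length f (in pixels) and the known physical height H of a side window, a measured pixel height h gives the distance Z = fH/h. A minimal sketch with illustrative numbers (not the paper’s calibration):

```python
# Scale-based distance from the approximated pinhole model Z = f * H / h.
# The focal length, window height, and example values are illustrative
# assumptions; the paper uses its own calibrated camera and the known
# side-window dimensions of the target train.
def estimate_distance(f_pixels: float, window_height_m: float,
                      pixel_height: float) -> float:
    return f_pixels * window_height_m / pixel_height

# Example: f = 2000 px and a 0.8 m window imaged 32 px tall -> 50.0 m.
print(estimate_distance(2000.0, 0.8, 32.0))
```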
3. Experiments
3.1. Experiment Setup
3.2. Instance Segmentation
3.2.1. Dataset and Model Training
3.2.2. Instance Segmentation Performance
3.3. Target Train Distance Estimation
3.3.1. Ground Truth Acquisition
3.3.2. Distance Estimation Performance
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
YOLOv8 training parameters:

Parameter | Value | Parameter | Value |
---|---|---|---|
Epochs | 500 | Patience | 50 |
Batch | 8 | Image Size | 640 |
Save | True | Save Period | −1 |
Device | Auto | Workers | 8 |
Project | Null | Name | Null |
Pretrained | True | Optimizer | Auto |
Verbose | True | Seed | 0 |
Deterministic | True | Cosine LR | False |
Mixed Precision | True | Validation | True |
Validation Split | Val | Conf. Threshold | Null |
IoU Threshold | 0.7 | Max Detection | 300 |
Overlap Mask | True | Mask Ratio | 4 |
Dropout | 0.0 | | |
Data augmentation parameters:

Parameter | Value | Parameter | Value |
---|---|---|---|
HSV Hue Range | 0.015 | HSV Saturation Range | 0.7 |
HSV Value Range | 0.4 | Degrees Range | 0.0 |
Translation Range | 0.1 | Scale Range | 0.5 |
Shear Range | 0.0 | Perspective Range | 0.0 |
Vertical Flip Prob. | 0.0 | Horizontal Flip Prob. | 0.5 |
Mosaic Augmentation | 1.0 | | |
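The two tables above correspond closely to the Ultralytics YOLOv8 training arguments. A hedged reconstruction of the training call (the dataset YAML path and model size are assumptions, since the paper’s data is not public):

```python
# Reconstruction of the tabulated training configuration as an
# Ultralytics YOLOv8 training call; "side_windows.yaml" and the
# yolov8n-seg checkpoint are illustrative assumptions.
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")
model.train(
    data="side_windows.yaml",            # assumed dataset config
    epochs=500, patience=50, batch=8, imgsz=640,
    save=True, save_period=-1, device=None, workers=8,
    pretrained=True, optimizer="auto", verbose=True,
    seed=0, deterministic=True, cos_lr=False, amp=True,  # mixed precision
    val=True, iou=0.7, max_det=300,
    overlap_mask=True, mask_ratio=4, dropout=0.0,
    # augmentation settings (second table)
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,
    degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0,
    flipud=0.0, fliplr=0.5, mosaic=1.0,
)
```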
Distance estimation RMSE per scene:

Scene | RMSE (m) |
---|---|
1 | 0.7020 |
2 | 1.0203 |
3 | 1.2598 |
4 | 0.5239 |
Mean | 0.9523 |