RSIn-Dataset: An UAV-Based Insulator Detection Aerial Images Dataset and Benchmark
Abstract
1. Introduction
- Object detection models pre-trained on traditional datasets perform poorly when applied directly to insulator detection. The insulator detection task therefore requires new high-quality datasets for model training and testing.
- Although the existing object detection methods can be transferred to insulator detection, there is still a lack of efficient models, evaluation statistics, and benchmarks specifically for insulator object detection.
- We construct a novel dataset (RSIn-Dataset) for insulator detection in the electric power patrol scene. Compared with other datasets, RSIn-Dataset covers more power-specific scenarios and a greater diversity of targets, providing an important foundation for intelligent, deep-learning-based UAV electric power patrol.
- We propose the YoloV4++ network by improving YoloV4 for insulator detection. The experimental results show that YoloV4++ achieves better performance compared to other advanced networks on RSIn-Dataset.
- By analyzing several baseline object detection methods, we construct a benchmark for RSIn-Dataset, which provides an important reference for future work.
2. Related Work
2.1. Existing Datasets for Object Detection
- Pascal VOC Dataset: This dataset “http://host.robots.ox.ac.uk/pascal/VOC/ (accessed on 10 November 2022)” is a standard dataset for image detection and classification. There are two versions, VOC2007 and VOC2012. VOC2007 has 9963 images, while VOC2012 has 17,125 images. The dataset contains 20 categories of everyday objects, such as person, bicycle, cat, and bottle. Landscape images are about 500 × 375 pixels, and portrait images are about 375 × 500 pixels. The dataset is widely used as an evaluation standard for object detection methods.
- Microsoft Common Objects in Context (COCO): This dataset “http://mscoco.org/ (accessed on 10 November 2022)” is a large-scale dataset for image detection, semantic segmentation, material recognition, and image description. It has more than 330,000 images, of which 220,000 have annotated labels, containing 1.5 million targets, 80 object categories, and 91 material categories. Because of its abundance of images, deep learning methods are often pre-trained on it.
- ImageNet Dataset: This dataset “https://image-net.org/ (accessed on 10 November 2022)” has more than 14 million images covering more than 20,000 categories, of which more than 1 million images have explicit category labels and bounding box annotations. Deep learning methods usually train and test on a subset of the whole dataset.
- Dataset for Object Detection in Aerial Images (DOTA Dataset): This dataset “https://captain-whu.github.io/DOTA/dataset.html (accessed on 10 November 2022)” is a common dataset for object detection in aerial remote sensing images. It has 2806 aerial images with resolutions ranging from 800 × 800 to 4000 × 4000 pixels, containing 188,282 instances in 15 categories. The images mainly contain large objects such as airplanes, ships, ports, and basketball courts. The dataset is characterized by large variations in spatial resolution and a large number of densely arranged small objects.
- Git Dataset: This dataset “https://github.com/InsulatorData/InsulatorDataSet (accessed on 10 November 2022)” is publicly available, with 848 images divided into normal insulators (600 images) and defective insulators (248 images). The defective-insulator images are synthetic. The dataset contains only one type of insulator and too few kinds of power line inspection backgrounds, so its application scope for insulator detection is limited.
2.2. Insulator Detection Methods
3. Dataset Construction
3.1. Insulator Image Acquisition
3.2. Dataset Labeling
3.3. Dataset Statistical Analysis
4. Baseline Methods and The Proposed YoloV4++
4.1. Baseline Methods
4.2. The Proposed Method
4.2.1. The Structure of the Proposed YoloV4++
4.2.2. Focal Loss for Insulator Detection
4.2.3. Implementation and Evaluation Metrics
5. Results and Discussion
5.1. Ablation Studies
- YoloV4+ employs MobileNetv1 as the backbone network, which reduces the model parameter size: the model size of YoloV4+ is about 1/5 that of YoloV4. In terms of detection accuracy, YoloV4+ lowers COCO mAP by about 1.5 percentage points relative to YoloV4, while raising FPS from 17.01 to 54.74. This comparison indicates that a lightweight backbone such as MobileNetv1 can substantially improve processing efficiency and reduce model size while retaining good accuracy for insulator detection.
- Compared with YoloV4+, YoloV4++ introduces the focal loss to alleviate the imbalance between positive and negative samples, increasing COCO mAP by 6.54 percentage points and model size by 0.39 MB. In addition, the FPS of YoloV4++ is only 0.92 lower than that of YoloV4+. Considering efficiency, accuracy, and model size together, this strategy is effective: it brings a clear accuracy improvement at a very small cost in efficiency for insulator detection.
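To illustrate how a focal loss reduces the influence of the many easy negative samples, the following is a minimal sketch of the standard binary focal loss of Lin et al., FL(p_t) = −α_t (1 − p_t)^γ log(p_t). The values `alpha=0.25` and `gamma=2.0` are commonly used defaults and are assumptions here, as are the function names and the two example predictions; the exact formulation and settings used in YoloV4++ are not given in this section.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction p (probability of the positive class)."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(min(max(p_t, eps), 1.0))


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction (alpha and gamma are assumed defaults)."""
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical predictions, chosen only for illustration.
easy_negative = (0.05, 0)  # background confidently scored as background
hard_positive = (0.30, 1)  # insulator scored with low confidence

for name, (p, y) in [("easy negative", easy_negative),
                     ("hard positive", hard_positive)]:
    ce = cross_entropy(p, y)
    fl = focal_loss(p, y)
    print(f"{name}: CE={ce:.4f}  FL={fl:.6f}  FL/CE={fl / ce:.4f}")
```

The modulating factor (1 − p_t)^γ shrinks the loss of well-classified samples far more than that of poorly classified ones, so training is driven less by the abundant easy background samples.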
5.2. Results and Benchmark
5.2.1. Qualitative Evaluation
- In ordinary scenes without complicated backgrounds, such as Figure 10a–d, SSD, Faster R-CNN, YoloV3, and YoloV4 can identify insulator targets. However, the category confidence of each target is generally low. This indicates that the identification ability of these models is limited, and they easily make mistakes against a complex background, such as in Figure 10d,f. Yolo X and our algorithm show very good recognition ability in this simple scenario: they complete the detection task with no false or missed detections, and the category confidence is generally high, close to 1.
- In the scenario with dense insulator targets, as shown in Figure 10d, some baseline methods miss real insulator targets. The SSD algorithm misses the insulator on the lower right side of the image. Faster R-CNN, YoloV3, and YoloV4 perform poorly: the category confidence of some insulator targets is just over the threshold. In this scene, Yolo X and our algorithm still perform very well, maintaining a high category confidence for each insulator target with no false or missed detections.
- With dense insulator targets against a complex background, the detection results of SSD, Faster R-CNN, YoloV3, and YoloV4 are worse. As shown in Figure 10f, where the insulators are dense and mutually occluded, the YoloV4 algorithm misses the insulator target, although it has no missed detection in the scenario of Figure 10d. This shows that dense target detection and correctly separating the foreground from the background remain open goals. Our algorithm adapts well, detecting each insulator target with high category confidence and no false detections.
5.2.2. Quantitative Evaluation
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

Dataset | Pascal VOC 2007 | COCO | ImageNet | DOTA | Git | Ours |
---|---|---|---|---|---|---|
Electric Scene | No | No | No | No | Yes | Yes |
Resolution | 375 × 500 | / | / | 800 × 800 to 4000 × 4000 | 1152 × 864 | 1152 × 864 to 7360 × 4912 |
Number of Categories | 20 | 91 | 20,000+ | 15 | 1 | 4 |
Number of Images | 9963 | 330,000+ | 14,000,000+ | 2806 | 848 | 1887 |
Number of Samples | 24,640 | 1,500,000+ | 1,000,000+ | 188,282 | 1262 | 3286 |
Number of Insulators | Few | Few | Few | Few | 1262 | 3286 |

Hyperparameters | SSD | Faster R-CNN | YoloV3 | YoloV4 | Yolo X |
---|---|---|---|---|---|
Epoch | 100 | 100 | 100 | 100 | 100 |
Initial learning rate | 0.001 | 0.0001 | 0.001 | 0.0001 | 0.001 |
Batch size | 4 | 4 | 4 | 4 | 4 |
Momentum | 0.9 | 0.9 | 0.9 | 0.9 | 0.9 |
IoU threshold | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
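
The IoU threshold of 0.5 in the table above refers to the intersection over union of two boxes. The sketch below shows how that quantity is computed and compared against the threshold. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the use of the threshold for matching a prediction to a ground-truth box are assumptions made for illustration; the table does not state where in the pipeline the threshold is applied.

```python
IOU_THRESHOLD = 0.5  # value taken from the hyperparameter table


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=IOU_THRESHOLD):
    """Assumed usage: a prediction matches a ground-truth box if IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the second prediction overlaps too little to count.
ground_truth = (0, 0, 10, 10)
print(is_match((2, 0, 12, 10), ground_truth))
print(is_match((5, 0, 15, 10), ground_truth))
```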

Method | Focal loss | MobileNetv1 | COCO mAP (%) | Param (MB) | FPS |
---|---|---|---|---|---|
YoloV4 | No | No | 50.56 | 245.53 | 17.01 |
YoloV4+ | No | Yes | 49.10 | 48.42 | 54.74 |
YoloV4++ | Yes | Yes | 55.64 | 48.81 | 53.82 |
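
The differences quoted in the ablation discussion (about 1.5 points of COCO mAP lost, 6.54 points gained, 0.39 MB added, 0.92 FPS lost, roughly 1/5 the model size) can be rechecked with a few lines of arithmetic. This is only a sketch: the dictionary layout and the helper name `delta` are illustrative, and the YoloV4+ COCO mAP of 49.10 is the value reported for that method in the benchmark table.

```python
# Values copied from the ablation and benchmark tables.
rows = {
    "YoloV4":   {"coco_map": 50.56, "param_mb": 245.53, "fps": 17.01},
    "YoloV4+":  {"coco_map": 49.10, "param_mb": 48.42,  "fps": 54.74},
    "YoloV4++": {"coco_map": 55.64, "param_mb": 48.81,  "fps": 53.82},
}


def delta(base, new, key):
    """Change in one metric when moving from method `base` to method `new`."""
    return round(rows[new][key] - rows[base][key], 2)


size_ratio = rows["YoloV4+"]["param_mb"] / rows["YoloV4"]["param_mb"]

print("YoloV4 -> YoloV4+  COCO mAP:", delta("YoloV4", "YoloV4+", "coco_map"))
print("YoloV4 -> YoloV4+  size ratio:", round(size_ratio, 3))
print("YoloV4+ -> YoloV4++ COCO mAP:", delta("YoloV4+", "YoloV4++", "coco_map"))
print("YoloV4+ -> YoloV4++ Param (MB):", delta("YoloV4+", "YoloV4++", "param_mb"))
print("YoloV4+ -> YoloV4++ FPS:", delta("YoloV4+", "YoloV4++", "fps"))
```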

Method | mAP (%) | AP1 − mAP | AP2 − mAP | AP3 − mAP | AP4 − mAP |
---|---|---|---|---|---|
YoloV3 | 75.52 | +4.29 | −6.93 | +2.18 | +0.44 |
YoloV4 | 91.93 | +4.38 | −2.50 | +2.72 | −4.21 |
YoloV4+ | 90.24 | +5.64 | −1.30 | +3.54 | −7.89 |
YoloV4++ | 94.24 | +1.08 | −3.03 | −0.03 | +1.98 |

Method | COCO mAP (%) | mAP (%) | AP1 | AP2 | AP3 | AP4 | Param (MB) | FPS |
---|---|---|---|---|---|---|---|---|
SSD | 44.54 | 84.98 | 87.80 | 80.77 | 80.25 | 91.07 | 99.7 | 35.41 |
Faster R-CNN | 54.58 | 92.72 | 90.97 | 88.42 | 95.42 | 96.05 | 522.91 | 4.56 |
YoloV3 | 42.52 | 75.52 | 79.81 | 68.59 | 77.70 | 75.96 | 236.32 | 22.04 |
YoloV4 | 50.56 | 91.93 | 96.31 | 89.43 | 94.65 | 87.72 | 245.53 | 17.01 |
YoloV4+ | 49.10 | 90.24 | 95.88 | 88.94 | 93.78 | 82.35 | 48.42 | 54.74 |
Yolo X | 56.51 | 93.33 | 95.37 | 92.34 | 95.22 | 90.40 | 34.21 | 38.46 |
YoloV4++ | 55.64 | 94.24 | 95.32 | 91.21 | 94.21 | 96.22 | 48.81 | 53.82 |
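
The signed per-class values in the preceding table for the Yolo family correspond to each class AP minus the reported mAP. A minimal sketch of that calculation, using the reported mAP and AP1 to AP4 values from the benchmark table above, is given below; the dictionary layout and the function name `deviations` are illustrative choices, not part of the original work.

```python
# (reported mAP, [AP1, AP2, AP3, AP4]) copied from the benchmark table.
benchmark = {
    "YoloV3":   (75.52, [79.81, 68.59, 77.70, 75.96]),
    "YoloV4":   (91.93, [96.31, 89.43, 94.65, 87.72]),
    "YoloV4+":  (90.24, [95.88, 88.94, 93.78, 82.35]),
    "YoloV4++": (94.24, [95.32, 91.21, 94.21, 96.22]),
}


def deviations(method):
    """Per-class AP minus the reported mAP, rounded to two decimals."""
    reported_map, aps = benchmark[method]
    return [round(ap - reported_map, 2) for ap in aps]


for method in benchmark:
    devs = deviations(method)
    spread = round(max(devs) - min(devs), 2)  # gap between best and worst class
    print(f"{method}: {devs}  spread={spread}")
```

A smaller spread means the detector performs more evenly across the four insulator categories.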
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shuang, F.; Han, S.; Li, Y.; Lu, T. RSIn-Dataset: An UAV-Based Insulator Detection Aerial Images Dataset and Benchmark. Drones 2023, 7, 125. https://doi.org/10.3390/drones7020125