1. Introduction
Submarine cables, including electric cables, fiber optic cables, and photoelectric composite cables, are laid on the seabed and protected by a reinforced sheath. They offer advantages that cannot be obtained by other means in the domains of electrical energy transmission, transoceanic communication, marine engineering, and the development of new energy [1]. With the increasing demand for marine resources in countries around the world, the construction scale of domestic and foreign submarine cables has been expanding [2]. According to research on the submarine cable market, 25 additional submarine cables were constructed worldwide between 2018 and 2020, bringing the total length to more than 250,000 km, and the global submarine cable market had reached USD 5.14 billion by 2021. The first expansion in submarine cable construction began around the year 2000, and submarine cables typically must be replaced every 20 years. Therefore, the global submarine cable market will enter a new construction period in the next few years, and submarine cable testing will also enter a new phase.
The working environment of the submarine cable is extremely harsh. Natural damage includes long-term erosion by seawater, the impact of ocean currents, and fish bites, while human-derived external forces include ship anchoring and fishing. These can damage or even break the outer protective sheath of the submarine cable and displace the cable from its position, disrupting the regular operation of the submarine transmission and communication network [3]. In order to ensure that the submarine cable can work properly, regular inspection of the cable condition is essential. Traditional diving detection techniques not only restrict the duration, range, and depth of submarine cable detection, but also threaten divers' lives. Therefore, unmanned and intelligent inspection methods for submarine cables are particularly crucial [4,5,6]; among these, deep learning-based machine vision detection methods draw the most attention in producing new applications [7,8].
With the development of machine vision, many image-related problems have been solved [9]. In recent years, deep learning [10,11] techniques have increasingly been applied to underwater target detection [12,13], in which convolutional neural networks achieve high accuracy in image classification. Han et al. [14] used deep convolutional neural networks for underwater image processing and target detection with an accuracy of 90%, and the proposed method was applied in underwater vehicles. Li et al. [15] used Faster-RCNN [16] for fish detection and identification. They modified AlexNet [17], making the detection accuracy 9.4% higher than that of the deformable parts model (DPM). Jalal et al. [18] used a combination of a Gaussian mixture model and the YOLO-V1 [19] deep neural network for the detection and classification of underwater species, and their detection F1 score and classification accuracy reached 95.47% and 91.64%, respectively. Hu et al. [20] used YOLO-V4 [21] to achieve high detection accuracy for low-quality underwater images and very small targets. They modified the feature pyramid network (FPN) [22] and path aggregation network (PANet) [23] to perform a de-redundancy operation, which increased the average accuracy of the prototype network to 92.61%.
Machine vision inspection is insensitive to underwater noise, captures rich environmental information, and can identify submarine cables even in blurred high-definition images. Fatan et al. [24] used a multi-layer perceptron (MLP) [25] and a support vector machine (SVM) [26] to classify the edges of submarine cables. They used morphological filtering and the Hough transform for edge repair and detection, achieving 95.95% detection accuracy. However, their method only detects the straight line of the cable and does not perform well on blurred cable images. Stamoulakatos et al. [27] used a deep convolutional neural network (ResNet-50) to detect submarine pipelines under low light, with accuracy ranging from 95% to 99.7%. However, they used a two-stage detection network, whose long average detection time fails to meet the requirements of real-time detection. Balasuriya et al. [28] used visual detection to address the partial absence of submarine cables and the selection problem when multiple submarine cables are present at the same time, and inferred the position of the submarine cable by combining it with the position information of an autonomous underwater vehicle (AUV). Chen et al. [29] considered the poor quality of underwater images caused by underwater scattering and absorption. They preprocessed the images by enhancing the contrast of the original grayscale images and finally extracted the edge of the submarine cable by mathematical morphological processing and edge detection. However, their method is not applicable to complex backgrounds, where submarine cable edge extraction becomes difficult and recognition accuracy is low.
The underwater images [30,31] of submarine cables are always blurred and blue–green [32,33], which makes their position and feature information difficult to extract. In this paper, the YOLO-SC network, based on an improved YOLO-V3 [34] network, is proposed. The detection model improves the detection accuracy while simplifying the prediction network structure and shortening the average detection time. The main contributions of this article are listed as follows:
(1) A target detection model is proposed for submarine cables to fill the gaps in the current research domain for submarine cable detection. The proposed YOLO-SC model outperforms other network models that have been applied to underwater target detection.
(2) An image preprocessing method is added in front of the feature extraction module to enhance the performance of detection. The method can effectively solve the problem of difficult feature extraction due to the blue–green color of underwater images of submarine cables.
(3) Skip connections [35] and multi-structured, multi-size feature fusion are added to the feature extraction and feature fusion stages, respectively, to solve the problem of insufficient position and feature information of submarine cable targets due to blurred images (an illustrative sketch of this pattern is given after this list).
(4) A lightweight prediction network is proposed, exploiting the relatively large proportion of the image that submarine cable targets occupy, which shortens the detection time of the model and meets the standard of real-time underwater detection.
(5) A diverse submarine cable dataset was created to supply data for the study of submarine cable object recognition. Our image dataset contains 3104 images with different disturbances, such as motion blur, partial absence, occlusion, and the absorption and scattering effects of water on light.
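To make contribution (3) concrete, the sketch below illustrates the general pattern of a residual-style skip connection combined with FPN-style multi-size feature fusion in PyTorch. The module names, channel counts, and tensor shapes are illustrative placeholders and do not reproduce the exact YOLO-SC layers described in Section 3.

```python
# Illustrative sketch only: a skip connection in the backbone and
# multi-size feature fusion before the prediction head. Sizes are placeholders.
import torch
import torch.nn as nn


def conv_bn_leaky(in_ch, out_ch, k, s=1):
    """Darknet-style convolution block: Conv + BatchNorm + LeakyReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, s, k // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )


class SkipBlock(nn.Module):
    """Residual-style skip connection: the input bypasses two conv layers."""

    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            conv_bn_leaky(ch, ch // 2, 1),
            conv_bn_leaky(ch // 2, ch, 3),
        )

    def forward(self, x):
        return x + self.body(x)  # the skip preserves position cues


class MultiSizeFusion(nn.Module):
    """Fuse a deep (small) feature map with a shallow (large) one by
    upsampling and channel concatenation, as in FPN-style fusion."""

    def __init__(self, deep_ch, shallow_ch, out_ch):
        super().__init__()
        self.reduce = conv_bn_leaky(deep_ch, out_ch, 1)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.merge = conv_bn_leaky(out_ch + shallow_ch, out_ch, 3)

    def forward(self, deep, shallow):
        x = self.up(self.reduce(deep))      # match the shallow resolution
        x = torch.cat([x, shallow], dim=1)  # multi-size feature fusion
        return self.merge(x)


if __name__ == "__main__":
    deep = torch.randn(1, 512, 13, 13)     # hypothetical deep feature map
    shallow = torch.randn(1, 256, 26, 26)  # hypothetical shallow feature map
    deep = SkipBlock(512)(deep)            # skip connection on the deep features
    fused = MultiSizeFusion(512, 256, 256)(deep, shallow)
    print(fused.shape)                     # torch.Size([1, 256, 26, 26])
```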
The rest of this paper is organized as follows: Section 2 describes submarine cable image preprocessing, including several algorithms for image data enhancement. In Section 3, the YOLO-V3 network and the proposed YOLO-SC network, which is based on an improved YOLO-V3 prototype network, are described. The model's assessment via experiments is discussed in Section 4. Section 5 presents the conclusions and future work.
4. Experiment and Discussion
This section is divided into five parts. The first part introduces the experimental setup; the second part reports the experimental results of the YOLO-SC model on the submarine cable dataset; the third part compares the performance of the proposed YOLO-SC model with that of other algorithms; the fourth part analyzes the impact of the data enhancement methods on the detection model; and, finally, the fifth part presents ablation studies of the YOLO-SC model.
4.1. Datasets and Experimental Settings
4.1.1. Image Data Acquisition
The submarine cable dataset used in this paper was collected at the pool test site of Hangzhou Dianzi University (Dongyue Campus). The submarine cable body was simulated with PVC pipe according to a real submarine cable. Images were taken with a deep-sea high-definition, high-frame-rate network camera jointly developed by Dahua and Hangzhou Dianzi University, with a resolution of 2688 × 1520 pixels. A total of 3104 images were taken, including 2399 images containing the submarine cable target object. All the images were taken under natural conditions, including some disturbing factors: motion blur, partial absence, occlusion, and the absorption and scattering effects of water on light. Some samples of the dataset under different disturbances are shown in Figure 11.
4.1.2. Image Annotation and Dataset Production
In this paper, the acquired image data are first filtered to eliminate images that do not contain the submarine cable target object. Secondly, the filtered images are subjected to image data enhancement, after which each image is numbered. The numbered images are manually annotated, and the submarine cable target object is selected with a horizontal bounding box. Finally, the annotated images are converted to the PASCAL VOC format so that the model's performance can be easily compared with that of other algorithms. The dataset is randomly divided into a training set, a validation set, and a test set: the training set consists of 3886 images, the validation set consists of 432 images, and the test set consists of 480 original images. The specific parameters of the training, validation, and test sets are shown in Table 3.
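As an illustration of the split described above, the following Python sketch randomly divides the image list into the reported training, validation, and test subsets and writes PASCAL-VOC-style split lists. The directory layout, file naming, and fixed random seed are hypothetical; only the split sizes come from the text.

```python
# Minimal sketch of the random dataset split; paths and seed are illustrative.
import random
from pathlib import Path

random.seed(0)
root = Path("submarine_cable_dataset")                 # hypothetical layout
images = sorted((root / "JPEGImages").glob("*.jpg"))
random.shuffle(images)

n_train, n_val, n_test = 3886, 432, 480                # split sizes from the text
splits = {
    "train.txt": images[:n_train],
    "val.txt": images[n_train:n_train + n_val],
    "test.txt": images[n_train + n_val:n_train + n_val + n_test],
}

out_dir = root / "ImageSets" / "Main"
out_dir.mkdir(parents=True, exist_ok=True)
for name, subset in splits.items():
    # PASCAL-VOC-style split list: one image stem per line
    (out_dir / name).write_text("".join(p.stem + "\n" for p in subset))
```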
4.1.3. YOLO-SC Model Initialization Parameters
The YOLO-SC model in this experimental study was trained and tested on a desktop computer; the hardware parameters are shown in Table 4. The whole model is built on the PyTorch platform, with Python as the programming language and PyCharm as the development environment. The input images of the model are resized to a fixed pixel size. Considering the limitation of CPU memory, the batch size is set to 8 in this paper. The model is trained for 100 epochs, and the other initialization parameters, such as the momentum and initial learning rate, are shown in Table 5.
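The training setup described above can be summarized in a short PyTorch sketch. The optimizer type, learning rate, momentum, and weight decay values below are placeholder assumptions standing in for the entries of Table 5 (not reproduced here); only the batch size of 8 and the 100 training epochs are taken from the text.

```python
# Hedged sketch of the training loop; hyperparameter values are placeholders.
import torch
from torch.utils.data import DataLoader


def train(model, dataset, device="cuda"):
    loader = DataLoader(dataset, batch_size=8, shuffle=True, num_workers=4)
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=1e-3,           # hypothetical initial learning rate
        momentum=0.9,      # hypothetical momentum
        weight_decay=5e-4, # hypothetical weight decay
    )
    model.to(device).train()
    for epoch in range(100):                  # 100 epochs, as stated in the text
        for images, targets in loader:
            images = images.to(device)
            loss = model(images, targets)     # assumes the model returns its loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```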
4.2. Experimental Results of YOLO-SC Model on Submarine Cable Dataset
In this paper, the YOLO-SC model is obtained after completing the improvements to the YOLO-V3 model. After training, the YOLO-SC model is applied to the test set of the submarine cable dataset, and the measured results of submarine cable detection are obtained; some of them are shown in Figure 12. The red part of each figure shows the confidence score of the prediction box, which is enlarged next to it because the original font size is too small.
From the measured results, the detection rate of the YOLO-SC model for submarine cables is high. The confidence score for straight submarine cables ultimately reaches above 0.97. The model also performs well on bent submarine cables and on submarine cables that are partially occluded in the images.
4.3. Comparison of Different Algorithms
To further validate the effectiveness of the submarine cable detection model proposed in this paper, the proposed YOLO-SC model is compared with models that have been applied in underwater target detection (YOLO-V3, SSD [42], and Faster-RCNN) in terms of average detection accuracy, F1 score, and detection speed, with a consistent dataset and confidence threshold (0.8). The results are shown in Table 6. The precision–recall curves (P-R curves) for the different models are shown in Figure 13.
According to the experimental results, the YOLO-SC model proposed in this paper has the best overall performance among the above models. The SSD model has the lowest average detection time, but also the lowest detection accuracy. SSD is a one-stage detection algorithm that localizes and classifies the target in a single pass, so its average detection time is short. It detects targets by applying convolutions directly to the feature maps without fully connected layers; as a result, it loses a great deal of spatial information and has lower average detection accuracy. YOLO-V3 achieves medium detection accuracy and average detection time. The detection accuracy of the Faster-RCNN model is higher than that of the YOLO-V3 model, but its average detection time reaches 2.068 s, which does not allow for the real-time detection of underwater targets. Finally, the YOLO-SC model proposed in this paper achieves the highest detection accuracy, with an AP of 99.41% and an average detection time of 0.452 s, which allows for the real-time detection of underwater targets. The comparison of the different algorithms further verifies the effectiveness of the proposed YOLO-SC model.
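For reference, the comparison metrics can be computed as in the following sketch, which uses the common all-point interpolation for average precision. The helper names and inputs (precomputed true/false-positive counts and a recall-sorted P-R curve) are illustrative, not the exact evaluation code used in this paper.

```python
# Illustrative computation of F1 and AP from detection statistics.
import numpy as np


def f1_score(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def average_precision(recalls, precisions):
    """Area under the P-R curve using all-point interpolation
    (recalls must be sorted in increasing order)."""
    r = np.concatenate(([0.0], recalls, [1.0]))
    p = np.concatenate(([0.0], precisions, [0.0]))
    # make precision monotonically non-increasing from right to left
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```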
4.4. Impact of Data Enhancement Methods on Detection Models
In this paper, the luminance transformation, color balance, and rotation transformation methods of data enhancement are used to enhance the image dataset. Since underwater images are strongly influenced by the absorption and scattering of light by water, this paper analyzes the effect of these data enhancement methods on the detection model. First, the original submarine cable images are used as the original dataset; this set is randomly divided into three equal parts, to which the luminance transformation, color balance, and rotation transformation operations are applied, respectively, and the enhanced cable images form the enhanced dataset. Secondly, the two datasets are input to the YOLO-SC model for training. Finally, the results of the effect of the two datasets on the detection model are obtained. The loss curves and precision–recall curves (P-R curves) of the YOLO-SC model for the two datasets are shown in Figure 14 and Figure 15, respectively. The average precision (AP) and F1 scores of the YOLO-SC model for the two datasets are shown in Table 7.
As can be seen in Figure 14, when the data-enhanced dataset is used to train the YOLO-SC model, the model presents a lower loss. As can be seen in Figure 15, when the YOLO-SC model is trained using the data-enhanced dataset, its P-R curve lies considerably above that obtained with the original image dataset. As can be seen from Table 7, the YOLO-SC model trained on the data-enhanced dataset achieves a higher average precision and F1 score than the model trained on the original dataset. Therefore, the image data enhancement methods improve the detection performance of the model, and removing them from the dataset weakens the model's detection capability.
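A minimal sketch of the augmentation procedure described in this subsection is given below: the original images are split into three equal parts, and each part receives one of the three transformations. The concrete OpenCV operations and parameter values are illustrative stand-ins for the methods detailed in Section 2.

```python
# Illustrative three-way augmentation; operations and parameters are assumptions.
import random
import cv2
import numpy as np


def luminance_transform(img, gain=1.3):
    """Scale pixel intensities (simple brightness adjustment)."""
    return cv2.convertScaleAbs(img, alpha=gain, beta=0)


def color_balance(img):
    """Gray-world white balance to reduce the blue-green color cast."""
    img = img.astype(np.float32)
    mean_b, mean_g, mean_r = img[..., 0].mean(), img[..., 1].mean(), img[..., 2].mean()
    gray = (mean_b + mean_g + mean_r) / 3.0
    img[..., 0] *= gray / mean_b
    img[..., 1] *= gray / mean_g
    img[..., 2] *= gray / mean_r
    return np.clip(img, 0, 255).astype(np.uint8)


def rotation_transform(img, angle=15):
    """Rotate around the image center, keeping the original size."""
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))


def build_enhanced_dataset(images):
    """Split the image list into three equal parts and apply one transform each."""
    random.shuffle(images)
    third = len(images) // 3
    parts = [images[:third], images[third:2 * third], images[2 * third:]]
    transforms = [luminance_transform, color_balance, rotation_transform]
    return [t(img) for part, t in zip(parts, transforms) for img in part]
```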
4.5. Ablation Studies with Different Variations
In order to verify the effectiveness of the improvements to the prototype YOLO-V3 network, this paper performs ablation studies on the skip connections, the lightweight prediction network, and the multi-structured, multi-size feature fusion. The model with only the lightweight prediction network is called YOLO-LW; the model with skip connections added on this basis is called YOLO-LWS; and, finally, the model with multi-structured, multi-size feature fusion added is the proposed YOLO-SC. The precision–recall curves (P-R curves) for the above models are shown in Figure 16. The average precision, F1 score, and detection speed for each of the above models are shown in Table 8.
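Purely as an illustration of this ablation protocol, the following sketch shows how such model variants could be assembled from boolean flags; the flag names and the configuration class are hypothetical and are not the authors' code.

```python
# Hypothetical ablation configuration; names and flags are illustrative only.
from dataclasses import dataclass


@dataclass
class VariantConfig:
    lightweight_head: bool = False   # single-branch lightweight prediction network
    skip_connections: bool = False   # skip connections in feature extraction
    multisize_fusion: bool = False   # multi-structured, multi-size feature fusion


VARIANTS = {
    "YOLO-V3": VariantConfig(),
    "YOLO-LW": VariantConfig(lightweight_head=True),
    "YOLO-LWS": VariantConfig(lightweight_head=True, skip_connections=True),
    "YOLO-SC": VariantConfig(lightweight_head=True, skip_connections=True,
                             multisize_fusion=True),
}

for name, cfg in VARIANTS.items():
    print(name, cfg)  # each variant adds one module on top of the previous one
```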
According to the experimental results, as the three modules are added to the YOLO-V3 model in turn, the detection accuracy of the proposed model increases step by step. When the last module is added, the overall detection accuracy improves substantially, with the AP reaching 99.41%. The F1 score drops slightly when the second module is added, but it is not significantly different from that of YOLO-V3. The average detection time first decreases and then increases. The decrease is due to the fact that the first module removes one branch of the prediction network of the YOLO-V3 model, shortening the detection process. The latter two modules enhance the extraction of position and feature information, making the model structure richer and therefore increasing the average detection time. Overall, the average detection time of the proposed YOLO-SC model increases by 0.036 s compared to YOLO-V3. Since an underwater vehicle basically operates at a low cruising speed, YOLO-SC only needs to provide at least one detection result per second during real-time detection. The average detection time of the YOLO-SC model is 0.452 s; therefore, the model can meet the requirements of real-time underwater detection.