Article

Study of a Machine Vision Approach to Leak Monitoring of a Marine System

1 Marine Engineering College, Dalian Maritime University, Dalian 116026, China
2 Collaborative Innovation Research Institute of Autonomous Ship, Dalian Maritime University, Dalian 116026, China
* Authors to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2023, 11(7), 1275; https://doi.org/10.3390/jmse11071275
Submission received: 22 May 2023 / Revised: 20 June 2023 / Accepted: 21 June 2023 / Published: 23 June 2023
(This article belongs to the Section Ocean Engineering)

Abstract: Leak monitoring is essential for the intelligent operation and maintenance of marine systems, and can effectively prevent catastrophic accidents on ships. In response to this challenge, a machine vision-based leak model is proposed in this study and applied to leak detection in different types of marine system in complex engine room environments. Firstly, an image-based leak database is established, and image enhancement and expansion methods are applied to the images. Then, Standard Convolution and Fast Spatial Pyramid Pooling modules are added to the YOLOv5 backbone network to reduce the floating-point operations involved in the leak feature channel fusion process, thereby improving the detection speed. Additionally, Bottleneck Transformer and Shuffle Attention modules are introduced to the backbone and neck networks, respectively, to enhance the feature representation performance, select critical information for the leak detection task, and suppress non-critical information to improve detection accuracy. Finally, the proposed model's effectiveness is verified using leak images collected by the ship's video system. The test results demonstrate that the proposed model exhibits excellent recognition performance for various types of leak, especially for drop-type leaks (for which the accuracy reaches 0.97).

1. Introduction

The modern ship is a floating city on the water that is often equipped with numerous pieces of mechanical equipment and systems, such as a marine propulsion plant, an auxiliary plant, and machinery and hull piping systems, to meet the functions of navigation, operation, and crew living. The equipment and systems of ships operate under harsh working conditions, including high temperatures, high humidity, vibrations, and deformations. Additionally, they are continuously subjected to adverse effects such as fluid erosion and corrosion. These factors frequently result in fluid leak failures at vulnerable points, such as sealing surfaces, valves, flanges, and welded joints [1]. These frequent leak failures have become a significant safety hazard, threatening the crew, cargo, and the ship itself and damaging the marine environment. Therefore, there is an urgent need for a practical method to monitor leaks in real time, accurately identify leak points, and ensure safety and reliability, to avoid catastrophic accidents.
Engineers and researchers have made great efforts to explore leak detection strategies. Moreover, leak detection technology has significantly progressed, and many mature methods have been applied in engineering systems. In general, leak detection methods can be divided into hardware-based or software-based methods [2].
Hardware-based approaches are usually associated with hardware devices. Liu et al. [3] proposed a leak detection and localization model for detecting background leaks and even micro-leaks in liquid pipelines based on the integration of a dynamic monitoring module and a static testing module. Li et al. [4] used a negative pressure wave attenuation algorithm for pipeline leak location that depends on pressure changes. Liu et al. [5] introduced a new intrinsic permutation entropy method based on variational mode decomposition (VMD) to suppress noise disturbances and preserve the mutation characteristics of the leakage in order to locate pipeline leaks. Hardware-based methods typically rely on sensor modules to measure pressure with low energy consumption, and specialized hardware devices with highly optimized designs can provide higher efficiency than general-purpose computing devices. However, these methods require manual extraction of signal characteristics in advance, suffer from low detection efficiency and lengthy analysis times, and tend to overlook small leaks and the identification of leak types.
The software-based approach uses software algorithms for leak detection analysis. Yao et al. [6] implemented an internal leak fault detection method with an artificial neural network by collecting data signals in a closed-loop control system, preprocessing the data, and training the network with randomly selected training samples. Diao et al. [7] developed an improved VMD method based on the processing of acoustic emission signals, and finally adapted the support vector machine (SVM) for leak pattern identification. Wang et al. [2] combined a sparse autoencoder network with an improved SVM to build an integrated learning framework to improve classification accuracy for pipeline leak detection. Rai and Kim [8] proposed a health index method for pipeline leak detection based on multiscale analysis, a Kolmogorov–Smirnov test, and a Gaussian mixture model. Zhang et al. [9] used a hidden Markov model based on deep neural networks to detect pipe leak locations. Software-based approaches offer the advantages of high flexibility, low cost, rapid iteration and improvement, wide application, and ease of integration and interaction. These advantages make software methods the preferred solution in many application domains, and they are widely used in artificial intelligence. Therefore, this paper adopts a software-based approach to achieve marine system leak detection. However, while the above-mentioned algorithms have effectively improved detection efficiency, their reliance on a single type of feature leads to specific defects in feature extraction and leak detection. On the one hand, even when relying on large amounts of data, the detection accuracy is unsatisfactory, and not all leak types can be detected. On the other hand, leaks cannot be detected accurately and quickly in real time. Moreover, the equipment and piping systems of ships are becoming increasingly complex. If intelligent ships lack personnel to detect and troubleshoot these leaks in time, the consequences of failure may affect the ship's navigation safety. Research on leak detection is therefore essential for the safe operation of ships. Faced with the deficiencies described above, machine vision technology can deliver better detection results.
Machine vision mainly uses computers to simulate human visual functions: extracting information from images of real objects, processing and understanding it, and ultimately using it for practical detection, measurement, and control [10]. It can detect features in the environment and make decisions based on these features. At present, machine vision methods are mainly divided into traditional and deep learning detection algorithms [11]. Traditional detection algorithms involve excessive computation, slow down the detection process, and have significant limitations and weak generalization, which affect the effectiveness of the overall algorithm. As a result, researchers have shifted their focus from traditional algorithms to deep learning detection algorithms. Deep learning detection algorithms mainly comprise two-stage and one-stage detection algorithms. Classical two-stage detection algorithms include R-CNN [12] and Faster R-CNN [13]. Although these algorithms achieve high detection accuracy, they cannot meet the requirements of real-time detection due to high training costs, slow detection speed, and network depth [14]. The main one-stage object detection algorithms belong to the YOLO [15,16,17,18] family, which has the advantage of fast detection speed because the network depth and number of parameters are smaller than in two-stage networks. YOLOv5, proposed by the Ultralytics company, currently achieves the best balance between accuracy and speed. However, when directly applied to the marine system leak detection task, YOLOv5 has low detection accuracy and cannot quickly detect the leak position or effectively determine the leak type. Moreover, the model does not fully account for the complex environment of the engine room (E/R), so direct application will not achieve satisfactory detection results.
Based on the above analysis, this paper proposes a machine vision model based on an improved YOLOv5 to realize marine system leak detection. Firstly, a leak database is established to collect various leak images, which strongly supports the subsequent leak detection training. More images are added to the model's training process to enable it to better adapt to the complex lighting changes in the marine environment. Secondly, a Transformer module is added to the backbone network to capture the features of marine system leaks and the potential global spatial relationships in the leak images, which strengthens the critical information and thus enables the learning of insensitive leak features. Finally, an attention module is combined with the neck network so that the model can autonomously learn the spatial weights of different elements, improve the interactivity between feature images, and capture more detailed features, thus speeding up processing and quickly detecting the location of a system leak.
The remainder of this paper is as follows: Section 2 describes related work and the framework of the machine vision model. Section 3 describes the improved detection model and evaluation metrics. Section 4 presents the experimental design and a discussion of the results obtained. Finally, Section 5 presents the conclusions of this study.

2. Related Work and the Framework

2.1. Related Work

The core of leak detection based on machine vision is object detection, which aims to identify the types of leaks and their locations. With the development of deep learning, more and more algorithms related to object detection have been proposed [19]. YOLO algorithms have achieved good results in object detection and classification. A YOLO algorithm divides an image into multiple regions and uses convolutional neural networks (CNNs) to predict the bounding boxes. Each bounding box has a class label and a confidence level for detecting objects in the image [18]. However, even as the strongest member of the YOLO family, YOLOv5 applied directly to this task suffers from a slow detection speed and falsely detects leaks on large objects, which are not detected well. Therefore, researchers have started to study improved YOLOv5 algorithms, the main improvement approaches being Transformers [20,21,22,23], attention mechanisms [24,25,26,27], etc.
A Transformer is a deep neural network primarily based on a self-attention mechanism, and it has achieved significant success in several research fields, such as natural language processing [28] and computer vision [29]. Several improved Transformer-based YOLOv5 models with outstanding detection results have been proposed recently. These works mainly use multi-head attention mechanisms to capture global feature relationships. Guo et al. [20] used a Transformer module to extend the receptive field of a convolutional layer, combining global information to provide more features. Liu et al. [21] proposed a nested residual Transformer structure to obtain global information; their model improved the accuracy of tiny object detection and reduced its complexity. Wang et al. [22] used a visual Transformer structure to make their model focus more on the global features of an object, compensating for CNNs' tendency to focus only on local features and substantially improving the model's accuracy.
Attention mechanisms are widely used in many fields of deep learning. Such a module improves the efficiency of the network by separating relevant from irrelevant information features, so that useless information is weakened and vital information is enhanced. Dong et al. [24] adopted a convolutional block attention module (CBAM) to combine spatial attention and channel attention, enabling the model to focus on the most critical information in the image and improve the detection accuracy. Zhang et al. [30] proposed a squeeze-and-excitation (SE) module to obtain a global description, which uses excitation operations to obtain weights for each feature channel and extract key features. Guo et al. [25] introduced the coordinate attention (CA) module to encode channel relationships and long-term dependencies with precise location information to extract important object features.
In summary, the Transformer module has advantages in capturing global features, the attention mechanism can enhance the critical information for detection, and both can improve the extraction of feature information. Therefore, this paper adds Transformer and attention mechanisms to the basic YOLOv5 model, enhancing the model's ability to capture global and local feature relationships.

2.2. Framework

Next, an improved machine vision approach for the leak monitoring of marine systems is proposed, as shown in Figure 1, and includes steps 1–5.
Step 1: The leak is classified based on the type of ship leak, and images are collected from the marine system to prepare for establishing a leak database.
Step 2: Image enhancement and expansion methods are used to process leak images and enhance their quality.
Step 3: A YOLOv5 basic model is constructed, which consists of four parts: the input of leak images, the feature extraction of images, feature fusion, and the output of the final recognition and detection results.
Step 4: Bottleneck Transformer and Shuffle Attention modules are introduced into the basic model to make it more adaptable to the complex marine system environment and to improve the extraction of leak features.
Step 5: The ship’s E/R video monitoring system is utilized to obtain equipment and pipeline system leak images for training the model and verifying its effectiveness.

3. Methodology

This section details the methodology of this paper, which is divided into four parts: the leak database, the basic model of YOLOv5, the improved model framework, and the evaluation metrics.

3.1. Leak Database

In the maritime industry, leaks stand out as one of the primary causes of equipment and system failures on board ships. They encompass the unintended release of substances such as ship fuel, lubricants, and freshwater, leading to substantial resource losses. Furthermore, leaks can induce corrosion in ship equipment and systems, posing significant risks to both vessel integrity and the marine environment. Leaks can be classified into four types according to their form: seepage, drop, flow, and spray. In the case of seepage, there are obvious traces of the leak on the surface, which reappear within a few minutes even if wiped off. However, in the real environment of a ship's E/R, due to the encapsulation of specific ship equipment and the shielding effect of complex pipeline structures, some leak points cannot be visually located, and the leak can only be found by observing fluid flow traces. A drop leak is one that slowly flows or drips down. Flow refers to a leak that flows heavily and continuously, but not to the extent of spray. Spray refers to a leak that spreads heavily and gushes constantly.
All images were taken from the historical image database in the training ship's E/R video monitoring system, as shown in Figure 2. The system is arranged with multiple monitoring cameras to continuously record the operation of critical equipment and systems. However, the quality of the generated leak images is affected by many unfavorable factors, such as the dark environment of the ship's E/R, the limited space of some areas, the uneven distribution of light sources, restricted camera angles, and ship vibration. These factors can change the angle, size, and brightness of the images and introduce blur; as a result, leak images can exhibit low brightness, large distortion, and blurring, making leaks difficult to distinguish and complicating the training of the model.
Multi-Scale Retinex with Color Restoration (MSRCR) [31] is an image enhancement technique that improves image quality. It restores the details and colors of distorted images by enhancing them at multiple scales, thereby increasing the clarity of the image. In addition, MSRCR can restore the colors of the image, making it more realistic and natural. MSRCR can therefore effectively improve the quality of the leak images and, in turn, the model's performance. Figure 3 shows the result of MSRCR processing on a ship seepage image.
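As an illustration of this enhancement step, a compact MSRCR routine with OpenCV and NumPy is sketched below. The scale and gain constants are common illustrative choices from the Retinex literature, not values reported in this paper, and the file names are placeholders.

```python
import cv2
import numpy as np

def msrcr(img, sigmas=(15, 80, 250), alpha=125.0, beta=46.0, gain=192.0, offset=-30.0):
    """Multi-Scale Retinex with Color Restoration on a BGR image."""
    img = img.astype(np.float64) + 1.0                 # avoid log(0)
    # Multi-scale Retinex: average log-domain difference over several scales
    retinex = np.zeros_like(img)
    for sigma in sigmas:
        blur = cv2.GaussianBlur(img, (0, 0), sigma)
        retinex += np.log10(img) - np.log10(blur)
    retinex /= len(sigmas)
    # Color restoration compensates for the desaturation of plain MSR
    color = beta * (np.log10(alpha * img) - np.log10(img.sum(axis=2, keepdims=True)))
    out = gain * (retinex * color + offset)
    # Stretch back to the displayable 0-255 range
    out = (out - out.min()) / (out.max() - out.min()) * 255.0
    return out.astype(np.uint8)

enhanced = msrcr(cv2.imread("seepage_example.jpg"))    # placeholder file name
cv2.imwrite("seepage_example_msrcr.jpg", enhanced)
```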
During the training process, the accuracy of the model is influenced by the quantity of image samples. However, the availability of leak images that can be used as training samples is severely limited by factors such as the size, location, and background complexity of the leak scenes. Therefore, in order to ensure an adequate number of training samples, this study employed image expansion techniques to increase the quantity of images in the training sample set. The OpenCV library in the Python framework was used to translate, rotate, randomly scale, and crop the leak images, thereby expanding their number. In the end, we obtained 1995 images, which formed the leak database. Following an approximately 8:2 split between the training and validation sets, 1620 images were randomly selected as the training set, and the remaining 375 images were used as the validation set, some of which are shown in Figure 4. The leak database was then manually annotated in the visual object class (VOC) format.
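The expansion step can be sketched with OpenCV as follows. The translation, rotation, scaling, and cropping ranges are illustrative choices, and in a real detection pipeline the bounding-box labels would have to be transformed together with the pixels.

```python
import random
import cv2
import numpy as np

def expand(img):
    """Apply one random translate/rotate/scale/crop pass to an image."""
    h, w = img.shape[:2]
    # Random translation by up to 10% of the image size
    tx = random.uniform(-0.1, 0.1) * w
    ty = random.uniform(-0.1, 0.1) * h
    out = cv2.warpAffine(img, np.float32([[1, 0, tx], [0, 1, ty]]), (w, h))
    # Random rotation (degrees) and scaling about the image centre
    M = cv2.getRotationMatrix2D((w / 2, h / 2),
                                random.uniform(-15, 15),
                                random.uniform(0.8, 1.2))
    out = cv2.warpAffine(out, M, (w, h))
    # Random crop back to 90% of the original size
    ch, cw = int(0.9 * h), int(0.9 * w)
    y = random.randint(0, h - ch)
    x = random.randint(0, w - cw)
    return out[y:y + ch, x:x + cw]
```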

3.2. Base YOLOv5 Model

The structure of YOLOv5 consists of input and output layers and backbone and neck networks. The backbone network is the core structure, consisting of Focus, Conv, C3, and Spatial Pyramid Pooling (SPP) modules. Focus performs slicing operations on images, splitting a high-resolution feature map into multiple low-resolution feature maps to reduce information loss. Conv convolves the input features, combining a two-dimensional convolution (Conv2d), normalization, and an activation operation, where the activation uses the SiLU function. C3 consists of three Conv blocks, whose main purpose is to increase the depth and receptive field of the network and improve its feature extraction ability. In addition, SPP uses three convolutional kernels of different sizes for maximum pooling to achieve the fusion of different features [32]. The neck network comprises the Feature Pyramid Network (FPN) and the Path Aggregation Network (PAN), which together form a bidirectional aggregation network running top-down and bottom-up. FPN performs upsampling, PAN performs downsampling, and the combination of the two modules greatly enhances the fusion effect of the network [24].
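For concreteness, the Conv block described above (Conv2d, batch normalization, and SiLU activation) can be rendered in PyTorch roughly as follows; this is a minimal sketch of the block's structure rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv2d + BatchNorm + SiLU, the basic building block described above."""
    def __init__(self, c_in, c_out, k=1, s=1, p=None):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s,
                              k // 2 if p is None else p, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))
```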

3.3. Improved YOLOv5 Model

The YOLOv5 model has a broader domain of acceptance in that it focuses on representing abstract image features. However, when faced with complex marine systems and multiple background color interferences, the algorithm’s classification and recognition performance is not ideal. Therefore, an improved machine vision model based on YOLOv5 is proposed to improve the accuracy and speed of leak detection.
The improvement measures mainly include two aspects, as shown in Figure 5.
(1) The Focus module is replaced with the Conv module and the SPP module is replaced with the SPPF module, which further reduces the number of parameters and floating-point operations (FLOPs). The calculation speed is increased by this improvement, enabling the model to better adapt to the leak detection timeliness requirements of marine systems.
(2) The Bottleneck Transformer module is inserted behind the SPPF module. Meanwhile, the Shuffle Attention module is introduced into the neck network. This improvement makes the model more adaptable to the E/R environment and enhances the network's feature extraction capabilities by highlighting the essential leak information, thereby improving detection accuracy.

3.3.1. Improved Detection Performance

1. Conv replaces Focus
The most intuitive function of the Focus module is the slicing operation, which plays the role of sampling the image. However, the Focus module has some drawbacks, such as limited supported device models, a lack of user-friendliness, high cost, and high subsequent computation. In addition, if the slices are not aligned, it can cause the model to crash. The Conv module also has the role of image sampling in the convolution operation. It can not only extract local features in the image, but also downsample the image by reducing the size of the feature map. A comparison of the calculation of Focus and Conv is shown in Table 1.
As shown in Table 1, Focus has more calculations and parameters, with the latter being four times that of Conv. In addition, the slicing process of Focus has several more Concat operations compared to Conv, which increases the memory overhead to some extent. As a result, the Conv module will be lighter and simpler. Using Conv instead of Focus can reduce the complexity and number of parameters of the model network, making the network lighter and more efficient. It will also improve the model performance and accuracy while increasing the speed of leak detection operation. The Focus and Conv modules are shown in Figure 6.
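To make the contrast concrete, the sketch below renders both sampling routes in PyTorch: Focus slices the image into four pixel-interleaved quarters and concatenates them before a convolution, while a single stride-2 convolution downsamples directly. The channel and kernel sizes are illustrative choices, not values from the paper.

```python
import torch
import torch.nn as nn

def conv_bn_silu(c_in, c_out, k, s, p):
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p, bias=False),
                         nn.BatchNorm2d(c_out), nn.SiLU())

class Focus(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = conv_bn_silu(4 * c_in, c_out, k=3, s=1, p=1)

    def forward(self, x):
        # Four pixel-interleaved slices: (B, C, H, W) -> (B, 4C, H/2, W/2)
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                                    x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))

x = torch.randn(1, 3, 640, 640)
print(Focus(3, 64)(x).shape)                         # torch.Size([1, 64, 320, 320])
print(conv_bn_silu(3, 64, k=6, s=2, p=2)(x).shape)   # torch.Size([1, 64, 320, 320])
```

Both routes reduce a 640 × 640 input to a 320 × 320 feature map, which is why the substitution preserves the network's downstream shapes while avoiding the Concat overhead of slicing.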
2. SPPF replaces SPP
The SPP module is located at the end of the backbone network. As shown in Figure 7a, three pooling kernels (5, 9, and 13) are designated for feature fusion. When the SPP module is applied to image leak detection, if the leak features are not obvious after fusion, the overlapping position information of small targets becomes inaccurate and, in severe cases, may even be lost. At the same time, many leak points in the leak image overlap or intersect with the background, resulting in missed detections and reduced accuracy. To address this problem effectively, a Fast Spatial Pyramid Pooling (SPPF) module, as depicted in Figure 7b, is introduced. The SPPF module takes a set of feature maps as input, extracts leak features of different scales from them, and then fuses them to form stronger leak feature maps, enabling more accurate detection of different types of leak. Furthermore, the SPPF module specifies a single pooling kernel: the output of each pooling becomes the input to the next, so that multiple small pooling kernels replace one large pooling kernel. This reduces the number of fusion parameters and speeds up the fusion pooling while preserving the original features of the fused feature maps for different types of leak, enriching the expressiveness of the leak features, and improving both the accuracy and the speed of leak detection after fusion.
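The sequential-pooling idea can be sketched in PyTorch as follows: one 5 × 5 max-pooling layer applied three times in a row reproduces the 5/9/13 receptive fields of SPP's parallel kernels while sharing intermediate results. The 1 × 1 layers are plain convolutions for brevity; this is a structural sketch, not the authors' code.

```python
import torch
import torch.nn as nn

class SPPF(nn.Module):
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_hid = c_in // 2
        self.cv1 = nn.Conv2d(c_in, c_hid, 1)
        self.cv2 = nn.Conv2d(4 * c_hid, c_out, 1)
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)    # 5x5 receptive field
        y2 = self.pool(y1)   # equivalent to a 9x9 kernel
        y3 = self.pool(y2)   # equivalent to a 13x13 kernel
        return self.cv2(torch.cat([x, y1, y2, y3], 1))

x = torch.randn(1, 1024, 20, 20)
print(SPPF(1024, 1024)(x).shape)   # torch.Size([1, 1024, 20, 20])
```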

3.3.2. Improved Feature Extraction

1. Bottleneck Transformer
In general, leak image features of marine systems are sparse and obscure, and detection accuracy is lower than in open-source datasets. To reduce the impact on detection, the Transformer module was added to the YOLOv5 model to calculate the self-attention value of the leak images, which enables the detection results to prioritize the leak targets over the background or empty regions, enhancing the focus of the detection process.
A Transformer is a full-attention model. The Bottleneck Transformer [33] used in this paper, referred to as BoT, is shown in Figure 8a. It is a hybrid model that combines a CNN and a traditional Transformer. The CNN network is responsible for extracting local features from leak images, whereas the Transformer captures global leak features. Therefore, by combining these two modules, the BoT module was built to pay attention to both the local recognition features of the leak images and the global features of the leak images, thus fusing local and global leak features to accurately detect leak images.
Compared to the traditional Transformer, the Multi-Headed Attention mechanism (MHSA) in BoT was changed, as shown in Figure 8b. The specific changes were as follows:
(1) Normalization: the Transformer uses layer normalization, whereas BoT uses batch normalization.
(2) Non-linear activation: the Transformer uses only one non-linear activation in its FFN block, while the BoT block uses three non-linear activations.
(3) Output projection: the MHSA in the Transformer contains an output projection, but BoT does not.
(4) Optimizer: BoT uses the SGD momentum optimizer, while a Transformer typically uses the Adam optimizer for training.
Therefore, when this new MHSA mechanism is used to extract global features, the relationships between different pixels can be computed, capturing the potential global spatial relationships and enhancing the key information. When used for leak image detection, it can capture complex leak relationships, especially when there are multiple leak points in one leak image, thus improving the accuracy and precision of the model.
The base YOLOv5 backbone network is composed of convolutional layers and C3 modules. While this module has a strong capacity for learning local object features, its ability to learn global features is relatively weak. Consequently, we swapped out the last C3 module in the network and placed the BoT behind the SPPF to extract features. This integration facilitates the fusion of both local and global features, consequently improving the accuracy of model detection.
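A simplified, single-head rendering of the MHSA layer of Figure 8b is sketched below: the content term q kᵀ is combined with learned relative height/width position encodings via q rᵀ, as described above. The feature map size is fixed at construction time, and the shapes and scaling are illustrative rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class RelPosSelfAttention(nn.Module):
    """Single-head self-attention with 2D relative position encodings."""
    def __init__(self, dim, h, w):
        super().__init__()
        self.q = nn.Conv2d(dim, dim, 1)                    # 1x1 pointwise convs
        self.k = nn.Conv2d(dim, dim, 1)
        self.v = nn.Conv2d(dim, dim, 1)
        self.rel_h = nn.Parameter(torch.randn(dim, h, 1))  # R_h
        self.rel_w = nn.Parameter(torch.randn(dim, 1, w))  # R_w

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)           # (B, HW, C)
        k = self.k(x).flatten(2)                           # (B, C, HW)
        v = self.v(x).flatten(2).transpose(1, 2)           # (B, HW, C)
        r = (self.rel_h + self.rel_w).flatten(1)           # (C, HW)
        logits = q @ k + q @ r.unsqueeze(0)                # q k^T + q r^T
        attn = torch.softmax(logits / c ** 0.5, dim=-1)
        return (attn @ v).transpose(1, 2).reshape(b, c, h, w)

x = torch.randn(1, 64, 20, 20)
print(RelPosSelfAttention(64, 20, 20)(x).shape)            # torch.Size([1, 64, 20, 20])
```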
2. Shuffle Attention
The attention mechanism allows the network to accurately focus on all relevant elements of the input and has become an important part of improving the performance of deep networks. The Shuffle Attention (SA) [34] module is introduced into YOLOv5; its overall structure is shown in Figure 9. SA groups the input features and, within each group, integrates channel attention and spatial attention into one block. This enables the network to focus on the useful features of the leak image rather than treating the entire image uniformly. It can achieve the best network learning results, enhance the expression of useful leak features, and suppress unimportant features.
SA modules offer the advantages of high flexibility, high modularity, and light weight. The addition of the SA module makes it possible to better detect complex marine system leaks, adjust the channels of the leak image features so that more useful information receives greater retention weights, suppress useless information, and improve the feature extraction ability for leak images while minimizing any adverse effects introduced by the attention mechanism. In this way, leaks can be detected in complex contexts with higher accuracy.
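The grouping, dual-branch gating, and channel shuffle described above can be sketched in PyTorch as follows. The parameter shapes follow the SA-Net design [34], while the group count of 8 is an illustrative choice.

```python
import torch
import torch.nn as nn

class ShuffleAttention(nn.Module):
    def __init__(self, channels, groups=8):
        super().__init__()
        self.groups = groups
        c = channels // (2 * groups)                     # channels per branch
        self.cw = nn.Parameter(torch.zeros(1, c, 1, 1))  # channel-branch scale
        self.cb = nn.Parameter(torch.ones(1, c, 1, 1))   # channel-branch shift
        self.sw = nn.Parameter(torch.zeros(1, c, 1, 1))  # spatial-branch scale
        self.sb = nn.Parameter(torch.ones(1, c, 1, 1))   # spatial-branch shift
        self.gn = nn.GroupNorm(c, c)

    def forward(self, x):
        b, c, h, w = x.shape
        x = x.reshape(b * self.groups, c // self.groups, h, w)
        xc, xs = x.chunk(2, dim=1)                       # split each group in two
        # Channel branch: global average pooling, scale/shift, sigmoid gate
        attn_c = torch.sigmoid(self.cw * xc.mean((2, 3), keepdim=True) + self.cb)
        # Spatial branch: group-normalized features, scale/shift, sigmoid gate
        attn_s = torch.sigmoid(self.sw * self.gn(xs) + self.sb)
        out = torch.cat([xc * attn_c, xs * attn_s], dim=1).reshape(b, c, h, w)
        # Channel shuffle: let information cross between the groups
        return out.reshape(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)

x = torch.randn(1, 64, 40, 40)
print(ShuffleAttention(64)(x).shape)                     # torch.Size([1, 64, 40, 40])
```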

3.4. Evaluation Metrics

Establishing uniform metrics is important for comparing different detection models. Evaluation metrics usually include precision, recall, F1 score, Intersection over Union (IoU), mean average precision (mAP), parameters, and FLOPs.
Precision indicates prediction accuracy, reflecting the proportion of true positives among the samples predicted as positive, and can be calculated using Equation (1).
$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$
Recall is the proportion of correct results predicted by the model to the total true results. It is defined in Equation (2).
$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$
where true positives (TP) refer to the number of positive samples that are correctly classified, true negatives (TN) refer to the number of negative samples that are correctly classified, false positives (FP) refer to the number of negative samples that are incorrectly classified as positive, and false negatives (FN) refer to the number of positive samples that are incorrectly classified as negative.
Furthermore, the F1 score is the harmonic average of precision and recall, reflecting the model’s overall performance, and can be formulated as Equation (3).
$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$
IoU refers to the ratio of the intersection to the union of the predicted and ground-truth regions for each category. AP is the area under the precision–recall curve, and mAP is the mean of the AP values over all categories predicted by the model. mAP@0.5 is the mean AP of all categories when the IoU threshold is set to 0.5, and mAP@0.5:0.95 is the mean AP averaged over IoU thresholds from 0.5 to 0.95. They can be formulated as follows:
$$AP = \int_{0}^{1} \text{Precision}(\text{Recall}) \, d(\text{Recall}) \tag{4}$$
$$mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{5}$$
The parameters metric is the size of the final saved model after training, and FLOPs refers to the number of floating-point operations that can be completed per unit of time. In addition, the model loss is used to estimate the error between the model predictions and the ground truth. As shown in Equation (6), the loss function consists of three terms: objective confidence loss (Obj_loss), classification loss (Cls_loss), and localization loss (Box_loss).
$$loss = Obj\_loss + Cls\_loss + Box\_loss \tag{6}$$
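As a worked example of the metrics above, the snippet below computes precision, recall, F1, and mAP from per-class counts; the counts are hypothetical, and the AP values are the per-class mAP@0.5 figures from Table 2, used only to illustrate the averaging.

```python
def precision_recall_f1(tp, fp, fn):
    """Equations (1)-(3) from per-class detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def mean_ap(ap_per_class):
    """Equation (5): the mean of the per-class average precisions."""
    return sum(ap_per_class) / len(ap_per_class)

print(precision_recall_f1(tp=97, fp=3, fn=3))   # (0.97, 0.97, 0.97), drop-like
print(mean_ap([0.793, 0.972, 0.864, 0.727]))    # 0.839; cf. the overall mAP@0.5 in Table 2
```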

4. Case Study

The proposed model’s effectiveness was assessed using the marine system leak database constructed in Section 3.1, and the limitations and applicability of the model are discussed in this section.

4.1. Running Environment

The development environment of the model was Python 3.7, and the algorithm was implemented with the PyTorch deep learning framework. In the training process, the input image size was 640 × 640, the batch size was 16, the number of epochs was 300, the learning rate was set to 0.02, the optimizer was SGD, the momentum was 0.9, and the weight decay was 0.0005.
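For reference, the optimizer portion of this configuration can be written with the standard PyTorch API as below; the single convolution layer is merely a stand-in for the improved network defined in Section 3.3.

```python
import torch

IMG_SIZE, BATCH_SIZE, EPOCHS = 640, 16, 300

model = torch.nn.Conv2d(3, 16, 3)          # placeholder for the detection network
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.02,        # initial learning rate
                            momentum=0.9,
                            weight_decay=0.0005)
```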

4.2. Experimental Results

The training loss curve of a model reflects the change in the loss function during the training process. This visual representation demonstrates the model’s performance and training effect. As such, this paper uses the loss curve as a proposed model evaluation index. The results are shown in Figure 10. It is evident from the figure that during the first 50 epochs, both the training loss and the validation loss decrease rapidly, indicating faster model convergence. In the subsequent epochs, the training and validation loss curves tend to be smooth, and the loss value of the validation set is lower than that of the training set, indicating a good convergence state for the model.
The effectiveness of the visual model for leak detection in marine systems was measured using accuracy as the main metric. In this paper, the leak database (seepage, drop, flow, and spray) established in Section 3.1 was used to train the model and validate its accuracy. The PR curves reflect the relationship between precision and recall, better assessing the model’s performance. The closer the PR curve is to the upper right corner, the better the model’s performance. The PR curve of the model after training is shown in Figure 11, in which the drop curve is close to the upper right corner, indicating high recognition accuracy of drop detection. The low accuracy of spray and seepage detection is due to the similarity between these two leak surface features, which affects the detection results. Drop leakage is the initial stage of a marine system leak, and if no proper intervention is taken, it can lead to further development and evolution towards the flow and spray forms, resulting in serious failure. The accurate and fast detection and localization of drop faults and timely repairs can prevent catastrophic accidents resulting from marine system failures.
Based on the F1 score in Figure 12, it is evident that the accuracy of drop and flow detection is higher than that of seepage and spray. This is primarily because the seepage and spray feature edges are less obvious. The transition between them and the normal leak part is gradual, making it difficult to define the transition region. Consequently, the detection results are similar.
Table 2 presents the detection results for the four leak types. Drop exhibits the best detection effect, with mAP@0.5 and mAP@0.5:0.95 values of 97.2% and 56.5%, respectively. This can be attributed to the clearly defined characteristics of a drop leak. Overall, the model achieves high detection accuracy, with the indexes reaching more than 80%, meeting the industrial requirements for marine leak detection.
Figure 13 depicts the detection results for the four leak types on a real ship. The boxes in the figure indicate the detection regions, and the label values represent the confidence level of the model, a value between 0 and 1; a higher confidence value indicates a better match for the current model, and vice versa. As shown in Figure 13, the model accurately identifies all input samples, with high detection results for all four leak types and a confidence range of 0.65–0.85. The confidence levels of the drop, seepage, and spray types are above 0.7, with the highest confidence level occurring for drop at 0.85. However, the confidence level of the flow type is relatively lower, ranging between 0.6 and 0.75, which may be attributed to the characteristics and color of the flow itself and to the darker environment at the leak location, which has a stronger influence after background fusion.
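In deployment, detections would typically be kept or discarded by comparing their confidence scores against a threshold; the minimal sketch below illustrates this with hypothetical detections and an illustrative 0.5 cutoff.

```python
def keep_confident(detections, threshold=0.5):
    """Filter (class, confidence, box) tuples by a confidence threshold."""
    return [d for d in detections if d[1] >= threshold]

# Hypothetical outputs: (class name, confidence, box as x, y, w, h)
dets = [("drop", 0.85, (120, 40, 60, 80)), ("flow", 0.42, (300, 200, 90, 60))]
print(keep_confident(dets))   # only the 0.85 drop detection is kept
```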

4.3. Comparisons

To verify the effectiveness of the proposed improved model, this study selected several advanced models for comparison. Four models were compared in total: the basic YOLOv5; the basic model with the Conv and SPPF modules added; that model with the BoT module added; and, finally, the full proposed model with the SA module added. The experimental results are shown in Table 3 and Figure 14, respectively.
Table 3 presents the training results of the different models on the leak database. First, the proposed model achieves an mAP@0.5 of 83.5% and an mAP@0.5:0.95 of 40.1%, which are 2.4% and 0.8% better than the basic model, respectively. In terms of accuracy, the model outperforms most existing leak models. Second, the proposed model has 6.7 M parameters, 0.4 M fewer than the basic model, making it easier to deploy in real industrial applications. Finally, the proposed model requires 15.5 GFLOPs, 0.8 G fewer than the basic model, and its average detection processing time also decreases slightly. In summary, the proposed model detects leaks with higher accuracy, is lighter, and achieves a better balance between accuracy and parameters, and it is thus more suitable for application in practical scenarios.
The comparative results of the four experiments are given in Figure 14. As shown in Figure 14a,b, the proposed model has higher mAP values than the basic model, indicating better detection performance. Figure 14c,d show that the basic model exhibits obvious curve fluctuations during the rising phase, a phenomenon that continues until the end, whereas the curve of the proposed model rises smoothly with insignificant oscillation, and the detection effect is greatly improved.
Table 4 shows the comparison of mAP values between the proposed improved model and the basic model. Our model achieved higher mAP values for all four types of leak, with increases of 3.2%, 1.5%, 6.1%, and 1.5%, respectively. For leak types with blurry edges and a large source region, the leak source is difficult to determine, so the improvement of the proposed model over the basic YOLOv5 is less significant. In practice, drop detection is the most important for ship leaks, and it reaches an accuracy of 97.2%; the model can thus be effectively applied to leak detection in actual marine systems.

5. Discussion and Conclusions

5.1. Discussion

In this paper, the proposed machine vision detection model is an improvement of the basic YOLOv5 model. Considering the problem of the existing image dataset containing few leak images, which is not conducive to model training, image enhancement and expansion methods are used to create a specific leak database to address this issue. This method allows for more targeted model training and feature extraction. In addition, the addition of Conv and SPPF modules reduces the computational cost and increases the processing speed of the model. Then, BoT and SA modules are introduced in the backbone network to capture potential global leak relationships and enhance critical information, thus improving the accuracy and precision of the model. Meanwhile, the overall performance of the model is improved compared with the basic model. The proposed improved model is found to have higher detection accuracy for drop-type leaks, with an accuracy rate of 97.2%. In real ship piping system failures, the initial characteristic of a leak is drop leakage. Flow and spray evolve gradually from the drop. Therefore, accurately detecting the occurrence of a drop is crucial for making necessary maintenance decisions and avoiding the occurrence of flow and spray failures in ship leak detection.
Machine vision technology faces several influencing factors when applied to E/R leak detection. First of all, owing to the influence of the shooting angle, the obtained leak image features may not be obvious, which can easily lead to missed detections. The pan-tilt control system installed in the ship's E/R video monitoring system can adjust the shooting angle to solve this problem, enabling the obtained images to contain more features; it can control the camera angle, focal length, and other parameters in conjunction with the detection results to achieve active sensing of the leak features. Second, the visual detection model requires a large image database for training to improve its performance. However, the current database of leak images contains a relatively small number of images, so collecting more images with leak features to expand the database can improve the model's detection capability. Finally, the ship's E/R is a closed space with poor lighting and a complex structure, and some leak features blend into the background. Deep image processing techniques could be applied to eliminate background interference and improve the quality of the images.

5.2. Conclusions

This paper presents a marine system leak monitoring method based on machine vision technology. Considering the characteristics of the ship's E/R environment, the shortcomings of existing leak detection methods, and the problems of visual detection algorithms, the paper proposes an improved visual algorithm based on YOLOv5 to achieve leak detection in marine systems. Although artificial intelligence (AI) algorithms have been used in some leak monitoring tasks, to the best of our knowledge, no previous research has used machine vision to detect the occurrence of marine leaks, and this paper is the first to apply it. In addition, we built a unique ship leak database and adopted effective strategies such as image enhancement and image expansion to capture various types of leak feature and adapt to the ship E/R environment. Finally, Bottleneck Transformer and Shuffle Attention modules were added to the basic YOLOv5 network, giving the detection model good adaptability to the E/R environment and improving image detection accuracy. The model was tested and verified with leak images obtained on a real ship. The results show that seepage, drop, flow, and spray are all recognized with high accuracy, with mAP improvements of 3.2%, 1.5%, 6.1%, and 1.5% over the basic model, respectively; the initial drop feature in particular is detected with the highest accuracy. The visual leak detection model can be applied to the online monitoring of marine systems, effectively preventing leaks from threatening ship navigation safety.
Although the proposed method achieves good detection of leak anomalies, it cannot accurately distinguish seepage from spray because of the limited leak image features and the large target size. Therefore, further research will introduce a richer database into the model to enhance its generalization capability, compress the model to fit E/R scenarios well, and achieve long-term continuous observation of monitored objects. Considering that the existing visual technology involves passive recognition, AI methods could be fully utilized to accomplish proactive visual perception of leaks, track leak paths, focus on and locate the leak source accurately, and, based on leak characteristics, estimate leak volume and predict leak evolution trends, which is an important research direction for the future.

Author Contributions

X.J. and Y.D. contributed equally to this work. Conceptualization, X.J. and Y.D.; methodology, Y.D. and P.Z.; software, Y.D.; validation, Y.D.; data curation, X.J. and Y.D.; writing—original draft preparation, Y.D.; writing—review and editing, X.J., Y.W., T.D. and Y.Z. (Yongjiu Zou); supervision, P.Z., Y.Z. (Yuwen Zhang) and P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China (2022YFB4300805) and the Fundamental Research Funds for the Central Universities (3132023214, LJKMZ20220360).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, P.; Gao, Z.; Cao, L.; Dong, F.; Zou, Y.; Wang, K.; Zhang, Y.; Sun, P. Marine Systems and Equipment Prognostics and Health Management: A Systematic Review from Health Condition Monitoring to Maintenance Strategy. Machines 2022, 10, 72.
  2. Wang, C.; Han, F.; Zhang, Y.; Lu, J. An SAE-based resampling SVM ensemble learning paradigm for pipeline leakage detection. Neurocomputing 2020, 403, 237–246.
  3. Liu, C.; Li, Y.; Xu, M. An integrated detection and location model for leakages in liquid pipelines. J. Pet. Sci. Eng. 2019, 175, 852–867.
  4. Li, J.; Zheng, Q.; Qian, Z.; Yang, X. A novel location algorithm for pipeline leakage based on the attenuation of negative pressure wave. Process Saf. Environ. Prot. 2019, 123, 309–316.
  5. Liu, B.; Jiang, Z.; Nie, W.; Wen, H. Application of VMD in Pipeline Leak Detection Based on Negative Pressure Wave. J. Sens. 2021, 2021, 8699362.
  6. Yao, Z.; Yu, Y.; Yao, J. Artificial neural network–based internal leakage fault detection for hydraulic actuators: An experimental investigation. Proc. Inst. Mech. Eng. Part I J. Syst. Control Eng. 2017, 232, 369–382.
  7. Diao, X.; Jiang, J.; Shen, G.; Chi, Z.; Wang, Z.; Ni, L.; Mebarki, A.; Bian, H.; Hao, Y. An improved variational mode decomposition method based on particle swarm optimization for leak detection of liquid pipelines. Mech. Syst. Signal Process. 2020, 143, 106787.
  8. Rai, A.; Kim, J.-M. A novel pipeline leak detection approach independent of prior failure information. Measurement 2021, 167, 108284.
  9. Zhang, M.; Chen, X.; Li, W. A Hybrid Hidden Markov Model for Pipeline Leakage Detection. Appl. Sci. 2021, 11, 3138.
  10. Hafiz, A.M.; Parah, S.A.; Bhat, R.U. Attention mechanisms and deep learning for machine vision: A survey of the state of the art. arXiv 2021, arXiv:2106.07550.
  11. Liu, L.; Ouyang, W.; Wang, X.; Fieguth, P.; Chen, J.; Liu, X.; Pietikäinen, M. Deep Learning for Generic Object Detection: A Survey. Int. J. Comput. Vis. 2019, 128, 261–318.
  12. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 23–28 June 2014; pp. 580–587.
  13. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
  14. Xu, J.; Zou, Y.; Tan, Y.; Yu, Z. Chip Pad Inspection Method Based on an Improved YOLOv5 Algorithm. Sensors 2022, 22, 6685.
  15. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
  16. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
  17. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
  18. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
  19. Feng, J.; Yi, C. Lightweight Detection Network for Arbitrary-Oriented Vehicles in UAV Imagery via Global Attentive Relation and Multi-Path Fusion. Drones 2022, 6, 108.
  20. Guo, Z.; Wang, C.; Yang, G.; Huang, Z.; Li, G. MSFT-YOLO: Improved YOLOv5 Based on Transformer for Detecting Defects of Steel Surface. Sensors 2022, 22, 3467.
  21. Liu, Y.; He, G.; Wang, Z.; Li, W.; Huang, H. NRT-YOLO: Improved YOLOv5 Based on Nested Residual Transformer for Tiny Remote Sensing Object Detection. Sensors 2022, 22, 4953.
  22. Wang, C.; Sun, W.; Wu, H.; Zhao, C.; Teng, G.; Yang, Y.; Du, P. A Low-Altitude Remote Sensing Inspection Method on Rural Living Environments Based on a Modified YOLOv5s-ViT. Remote Sens. 2022, 14, 4784.
  23. Yu, Y.; Zhao, J.; Gong, Q.; Huang, C.; Zheng, G.; Ma, J. Real-Time Underwater Maritime Object Detection in Side-Scan Sonar Images Based on Transformer-YOLOv5. Remote Sens. 2021, 13, 3555.
  24. Dong, X.; Yan, S.; Duan, C. A lightweight vehicles detection network model based on YOLOv5. Eng. Appl. Artif. Intell. 2022, 113, 104914.
  25. Guo, G.; Zhang, Z. Road damage detection algorithm for improved YOLOv5. Sci. Rep. 2022, 12, 15523.
  26. Hou, H.; Chen, M.; Tie, Y.; Li, W. A Universal Landslide Detection Method in Optical Remote Sensing Images Based on Improved YOLOX. Remote Sens. 2022, 14, 4939.
  27. Jin, Y.; Gao, H.; Fan, X.; Khan, H.; Chen, Y. Defect Identification of Adhesive Structure Based on DCGAN and YOLOv5. IEEE Access 2022, 10, 79913–79924.
  28. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805.
  29. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
  30. Zhang, J.-L.; Su, W.-H.; Zhang, H.-Y.; Peng, Y. SE-YOLOv5x: An Optimized Model Based on Transfer Learning and Visual Attention Mechanism for Identifying and Localizing Weeds and Vegetables. Agronomy 2022, 12, 2061.
  31. Zhai, X.; Wei, H.; He, Y.; Shang, Y.; Liu, C. Underwater Sea Cucumber Identification Based on Improved YOLOv5. Appl. Sci. 2022, 12, 9105.
  32. Dai, G.; Hu, L.; Fan, J.; Yan, S.; Li, R. A Deep Learning-Based Object Detection Scheme by Improving YOLOv5 for Sprouted Potatoes Datasets. IEEE Access 2022, 10, 85416–85428.
  33. Srinivas, A.; Lin, T.-Y.; Parmar, N.; Shlens, J.; Abbeel, P.; Vaswani, A. Bottleneck Transformers for Visual Recognition. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 16514–16524.
  34. Zhang, Q.-L.; Yang, Y. SA-Net: Shuffle Attention for Deep Convolutional Neural Networks. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 2235–2239.
Figure 1. Flow chart for leak detection.
Figure 2. Ship's E/R video monitoring system.
Figure 3. MSRCR algorithm processing of seepage image results: (a) the original image; (b) the processed image.
Figure 4. Leak database composed of four different leak types: (a) seepage type; (b) drop type; (c) flow type; (d) spray type.
Figure 5. The improved YOLOv5 model (the red parts are the improved modules).
Figure 6. Focus and Conv modules.
Figure 7. SPP and SPPF modules.
Figure 8. The overall diagram of the Bottleneck Transformer: (a) the multi-head self-attention (MHSA) block and its connections, with the block boundaries marked; (b) a multi-head self-attention (MHSA) layer. $R_h$ and $R_w$ denote the relative position encodings for height and width, respectively. The attention logits are $qk^T + qr^T$, where $q$, $k$, and $r$ denote the query, key, and position encodings, respectively (BoT uses relative distance encodings). $\oplus$ and $\otimes$ denote matrix addition and matrix multiplication, respectively, while $1 \times 1$ denotes a pointwise convolution. With the use of multiple heads, feature extraction becomes more comprehensive.
Figure 9. An overview of Shuffle Attention (SA). SA groups the input features, and each group is processed internally by an SA unit. The green section represents the channel attention branch, which uses a pair of parameters to scale and shift the channel vector. The blue part represents the spatial attention branch, which generates features in a similar way to the channel branch. The information from the two branches is then fused within each group. Finally, a Channel Shuffle operation is used to enable information communication between the groups.
Figure 10. The training loss curve.
Figure 11. PR curve for four different types of leak.
Figure 12. F1 score curves for four different types of leak.
Figure 13. Confidence of detection results for four different types of leak: (a) seepage; (b) drop; (c) flow; (d) spray.
Figure 14. Experimental results of each model: (a) mAP@0.5 curve; (b) mAP@0.5:0.95 curve; (c) precision curve; (d) recall curve.
Table 1. A comparison of Focus and Conv.

Class | FLOPs  | Parameters
Focus | 0.35 s | 0.034 M
Conv  | 0.08 s | 0.008 M
Table 2. Detection results of different indicators of leak.

Class   | Precision | Recall | F1 Score | mAP@0.5 | mAP@0.5:0.95
Seepage | 82.7%     | 69.7%  | 75.6%    | 79.3%   | 33.0%
Drop    | 97.4%     | 96.7%  | 97.1%    | 97.2%   | 56.5%
Flow    | 92.4%     | 87.2%  | 89.7%    | 86.4%   | 46.8%
Spray   | 84.1%     | 63.7%  | 72.5%    | 72.7%   | 56.5%
All     | 89.1%     | 79.3%  | 84.0%    | 83.5%   | 40.1%
Table 3. The performance of different models on the leak database.

Model              | mAP@0.5 | mAP@0.5:0.95 | Parameters (M) | FLOPs (G) | Speed-GPU (ms)
YOLOv5 (baseline)  | 81.1%   | 39.3%        | 7.07           | 16.3      | 2.2
+Conv, SPPF        | 82.9%   | 39.7%        | 7.02           | 15.8      | 2.1
+BoT               | 83.1%   | 40.0%        | 6.69           | 15.5      | 2.1
The proposed model | 83.5%   | 40.1%        | 6.69           | 15.5      | 2.0
Table 4. Comparison of mAP values for four types of leak.

Category | YOLOv5 | The Proposed Model
Seepage  | 76.0%  | 79.2%
Drop     | 95.7%  | 97.2%
Flow     | 80.3%  | 86.4%
Spray    | 71.2%  | 72.7%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
