1. Introduction
Laver aquaculture is an important part of the marine fishery economy and occupies a dominant position in aquaculture. However, with economic development, the rapid growth of laver aquaculture zones has also brought marine environmental problems, such as green tides [1]. In addition, the large-scale reproduction of Enteromorpha can cover the aquaculture boxes and suspended nets, hiding the mariculture zone, which may affect marine traffic and port transportation. Therefore, timely monitoring of the growth status and coverage of laver and other marine products is highly important [2].
Routine identification and management of laver aquaculture zones mainly relies on statistics from sea area applications and on the registration records of farmers. This approach ensures accurate sea area usage statistics but involves a large workload and a long cycle, so it is not suitable as the mainstream method for identifying and monitoring aquaculture areas [3]. Satellite remote sensing has become an important means of surface monitoring because it is not restricted by time and space and covers a wide area [4]. Various methods have been developed based on remote sensing, including visual interpretation, traditional classification based on spectral statistics, morphological classification, and object-oriented methods. Visual interpretation based on expert experience [5,6] can reflect the aquaculture area and location more realistically. However, although this method meets the requirements of a classified survey, it involves a large amount of work, takes a long time, places high demands on interpreters, and generalizes poorly. The authors of Reference [7] automatically extracted offshore aquaculture zones based on spectral information, but with low precision and data redundancy. The authors of Reference [8] used edge detection and multi-scale feature fusion to obtain the shape and texture information of the aquaculture zone and then extract the zone itself. This method can automatically extract different types of aquaculture zones with high precision but places higher requirements on data sources. Pixel-based analysis struggles with salt-and-pepper noise because of the large internal variation in high-spatial-resolution imagery. The authors of Reference [9] overcame the salt-and-pepper noise with an object-oriented extraction method. However, the classification accuracy of this method depends on the segmentation scale, accurate sample acquisition, and the construction of the feature space. These elements must be rebuilt whenever the image is updated, which makes the method difficult to repeat.
Recently, deep learning has made remarkable progress in object recognition, image classification, and other fields of machine vision. Deep learning models rely on multiple neural network layers to learn representations of data with multiple levels of abstraction. These models are fitted through forward computation and the backpropagation algorithm, and are then used for data recognition, classification, and prediction [10,11]. Networks such as the RNN (recurrent neural network) and its variant, the LSTM (long short-term memory), have a memory function that carries the output of a neuron from one time step to the next [12,13], so they are better suited to time series data and slightly inferior on image classification problems. The authors of Reference [14] used the original fully connected neural network to identify different classes of thumbnails. The method is easy to implement, but it performs poorly on complex tasks that involve many parameters and high-dimensional data. To address the disadvantages of fully connected neural networks, the authors of References [15,16] proposed the convolutional neural network, whose weight sharing greatly reduces the number of training parameters and improves training performance. AlexNet was then proposed to classify the 1.2 million high-resolution images of the ImageNet LSVRC-2010 contest into 1000 classes, further improving the classification accuracy of convolutional neural networks [17]. However, the above networks do not consider the semantic information of the images and cannot accurately classify the categories inside an image. An aquaculture net contains a large amount of seawater, which acts as noise and seriously affects the extraction and identification of the net. Semantic segmentation methods address this problem. Examples include FCN (fully convolutional network) [18,19,20,21], U-Net [22], SegNet [23], and the DeepLab series [24,25]. These algorithms learn end to end, assign semantic information to each pixel to complete pixel-level classification, take spectral, spatial, and contextual information into account, and achieve high classification accuracy. Among them, FCN is the forerunner of semantic segmentation; its main characteristics are an encoder-decoder structure and skip connections. Most semantic segmentation networks are improvements on the FCN structure.
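The idea behind a skip connection can be illustrated with a toy example: a coarse score map from a deep layer is upsampled and added to a finer score map from a shallower layer, so that semantic strength and edge detail are combined. The sketch below is only an illustration under stated assumptions. The function names `upsample2x` and `fuse` and all numeric values are hypothetical, and nearest-neighbor upsampling stands in for the learned upsampling that real FCN implementations typically use.

```python
def upsample2x(score):
    """Nearest-neighbor 2x upsampling of a 2-D score map (list of rows)."""
    out = []
    for row in score:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse(coarse, fine):
    """Skip connection: upsample the coarse scores and add the finer scores."""
    up = upsample2x(coarse)
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


# Toy score maps for one class at two resolutions (values are made up).
deep_scores = [[1.0]]                         # coarse, semantically strong
shallow_scores = [[0.5, -0.5], [-0.5, 0.5]]   # finer, keeps edge detail
fused = fuse(deep_scores, shallow_scores)
```

In this toy case the fused map keeps the overall positive response of the deep layer while the shallow layer decides which pixels score higher, which is the effect the skip connection is meant to provide.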
The above methods are all supervised learning methods. They rely on a large number of training samples, and they can only extract the aquaculture zone; they cannot accurately count the aquaculture carriers inside it. In addition, the repeated down-sampling of the image by FCN loses a large amount of edge information and smooths the pixels at object edges. Moreover, convolution only extracts local receptive fields. Although repeated convolution can eventually cover the entire image, the correlation between a pixel and all other pixels in the image cannot be captured, even at the last convolutional layer. The conditional random field (CRF) is a model that generates pixel classification results by calculating the position and color relationships between each pixel and all other pixels in the image, which offers a new way to accurately extract culture carriers in the aquaculture zone. Based on the dense and regular distribution of seawater and nets in the laver aquaculture zone, an inaccurate supervised classification method based on FCN and CRF is proposed. The Lianyungang laver aquaculture area in Jiangsu Province is taken as the study area to design a classification model that can extract the aquaculture zone and simultaneously count the area and number of laver aquaculture nets.
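The position and color relationship that a CRF evaluates between two pixels can be illustrated with a Gaussian affinity. The sketch below assumes the appearance kernel commonly used in fully connected CRFs; it is not taken from this paper. The function name `pairwise_affinity`, the bandwidth names `theta_pos` and `theta_color`, and their default values are hypothetical.

```python
import math


def pairwise_affinity(p_i, p_j, c_i, c_j, theta_pos=30.0, theta_color=10.0):
    """Gaussian affinity between two pixels from position and color distance.

    p_i, p_j: (row, col) positions; c_i, c_j: (r, g, b) colors.
    theta_pos and theta_color are hypothetical bandwidths (assumed values).
    """
    d_pos = sum((a - b) ** 2 for a, b in zip(p_i, p_j))
    d_col = sum((a - b) ** 2 for a, b in zip(c_i, c_j))
    return math.exp(-d_pos / (2 * theta_pos ** 2)
                    - d_col / (2 * theta_color ** 2))


# Nearby pixels with similar color are strongly encouraged to share a label,
near_similar = pairwise_affinity((10, 10), (10, 12), (40, 60, 90), (42, 61, 88))
# while a strong color difference (e.g. net vs. seawater) weakens the link.
near_different = pairwise_affinity((10, 10), (10, 12), (40, 60, 90), (120, 140, 90))
```

Under these assumptions, two adjacent pixels only pull each other toward the same class when their colors are also close, which is why such a model can separate thin net strips from the seawater between them.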
4. Discussion
In this paper, we designed an inaccurate supervised classification model for the extraction of laver aquaculture nets, based mainly on FCN and CRF. In the experiment, three FCN structures, FCN-8s, FCN-16s, and FCN-32s, were compared. The accuracy evaluation of the three structures under different parameters and the visual comparison of their results show that FCN-8s, with three skip architectures, gives the best results and had fully converged by 25,000 iterations. FCN-32s, which does not use a skip architecture to combine the low-level features of the encoder, and FCN-16s, which uses only one skip architecture, do not work well. They mistake fishing boats for laver, lose the boundary information of the aquaculture zones, and generate more noise inside the classified aquaculture zones. The results therefore indicate that FCN-8s is the best of the three structures and is better suited to extracting aquaculture zones in areas with complex features. We then used MLC, SVM, and NN to extract aquaculture zones. Compared with these methods, the output of FCN has clearer boundaries and less noise; for abandoned aquaculture zones in particular, FCN identifies them better, with almost no misclassification. After using FCN to extract the aquaculture areas, we ran CRFs with different parameters for comparison. CRFs run for more epochs have higher recognition accuracy, and the accuracy is mainly affected by two parameters. After CRF processing, laver strips can be accurately extracted without damaging the output of FCN, and the number and area of nets can be counted from the CRF results.
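Counting nets and their areas from a binary classification result can be sketched as connected-component labeling. The example below is a minimal illustration, not the procedure used in this paper: it assumes that each net appears as one 4-connected region of foreground pixels, and the function name `count_nets`, the toy mask, and the pixel area `pixel_area_m2` are all hypothetical.

```python
from collections import deque


def count_nets(mask, pixel_area_m2=1.0):
    """Count 4-connected foreground regions and their areas.

    mask: list of rows of 0/1 values (1 = laver net pixel).
    pixel_area_m2: ground area of one pixel, an assumed value.
    Returns (number_of_regions, list_of_region_areas_in_m2).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill of one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(size * pixel_area_m2)
    return len(areas), areas


# Toy mask with three separate regions (made-up data).
toy_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
n_nets, net_areas = count_nets(toy_mask, pixel_area_m2=4.0)
```

A count of this kind is only meaningful when neighboring nets are separated by at least one background pixel in the classification result, which is why a post-processing step that sharpens the boundaries between strips matters before counting.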
Previous studies extracted offshore aquaculture zones with supervised classification and did not separate the culture carriers inside the zones, so they achieved only a rough extraction. Segmenting the internal carriers requires more detailed labels to train the network model, which inevitably increases the workload and preprocessing time of the experiment. The inaccurate supervised classification model we propose can not only extract the laver aquaculture zones but also accurately obtain the area and number of culture carriers, which benefits the management of marine resources. However, our research still has many limitations. For example, floating objects near the aquaculture zones are easily mistaken for laver culture carriers, which affects the final area statistics. In addition, the aquaculture zone labels used in this paper contain more than 40% seawater, which shows that the model has a high fault tolerance. This is mainly because the features in our study area are relatively simple: they include only aquaculture zones and seawater, which is quite advantageous for our model. Much work remains to improve the generality of the model.
5. Conclusions
This paper designed an inaccurate supervised classification model that addresses the dense and regular distribution of laver aquaculture zones and the large number of samples that supervised learning requires. The proposed model can extract the aquaculture zone and simultaneously count the area and number of laver aquaculture nets. The study area was the Lianyungang laver aquaculture zone in Jiangsu Province. The conclusions are as follows.
(1) The coefficient of the classification results obtained by FCN-8s can reach 0.984 and was 0.99, which shows that the FCN model can classify laver with high accuracy.
(2) With CRF post-processing, individual laver aquaculture nets can be separated accurately and the overall effect is good, which shows that the proposed model can extract the area and number of laver cultivation nets well and has high reference value.
(3) The inaccurate supervised classification model can effectively identify the laver aquaculture zones and has a high fault tolerance, meeting the requirement of fine classification from coarse labels. It saves considerable labeling time without affecting the final classification accuracy. The results provide a foundation for future laver farming estimation and offshore resource planning, and technical support for marine ecological regulation and maritime traffic management.
Although the model proposed in this paper has achieved certain results, much work remains. For example, the recognition accuracy of the target object needs to be improved by adding post-processing operations that avoid misclassifying floating objects at sea. In addition, building a more complete network model by adding experimental data from complex regions is an essential research direction. This work would further improve fine classification in complex scenes that are not limited to a single feature type.