Improved Classification of Coastal Wetlands in Yellow River Delta of China Using ResNet Combined with Feature-Preferred Bands Based on Attention Mechanism

Li, Yirong; Yu, Xiang; Zhang, Jiahua; Zhang, Shichao; Wang, Xiaopeng; Kong, Delong; Yao, Lulu; Lu, He

doi:10.3390/rs16111860


by Yirong Li ¹, Xiang Yu ¹, Jiahua Zhang ¹,²,*, Shichao Zhang ¹, Xiaopeng Wang ¹, Delong Kong ¹, Lulu Yao ¹ and He Lu ¹

¹ Remote Sensing Information and Digital Earth Center, College of Computer Science and Technology, Qingdao University, Qingdao 266071, China

² Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(11), 1860; https://doi.org/10.3390/rs16111860

Submission received: 27 March 2024 / Revised: 6 May 2024 / Accepted: 10 May 2024 / Published: 23 May 2024

(This article belongs to the Special Issue Remote Sensing for the Study of the Changes in Wetlands)


Abstract

The Yellow River Delta wetlands in China are a coastal wetland ecosystem and among the youngest and most characteristic wetlands in the world. They are constantly reshaped by inland sediment and by waves and storm surges, so accurate classification of the coastal wetlands in the Yellow River Delta is of great significance for the rational utilization, development, and protection of wetland resources. In this study, Sentinel-2 multispectral data of the Yellow River Delta were processed by super-resolution synthesis, and the feature bands were optimized. The optimal feature-band combination scheme was screened using the OIF algorithm. A deep learning model, AM_ResNet, is proposed that builds on feature optimization and integrates an attention mechanism into the ResNet network. Compared with classical machine learning models, the AM_ResNet model effectively improves the classification accuracy of the wetlands in the Yellow River Delta. The overall accuracy was 94.61% with a Kappa of 0.93, improvements of about 6.99 percentage points and 0.1, respectively, over Random Forest classification, the best-performing machine learning method.

Keywords:

wetland classification; Yellow River Delta; band selection optimization; attention mechanism; ResNet; deep learning

1. Introduction

Wetlands are transitional zones between terrestrial and aquatic systems where the water table is usually at or near the surface or where the land is covered by shallow water [1]. Healthy natural wetland ecosystems play an important role in China’s sustainable development. The wetland area, though small in size compared to the total area, plays a significant role in the ecosystem [2]. It is essential for maintaining the integrity of the biological chain, protecting biodiversity, controlling climate change, conserving water, and cleaning up the environment [3]. The wetlands of the Yellow River Delta are an important ecosystem in China with extensive biodiversity and play an important role in ecological sustainable development [4]. The uniqueness of this wetland area lies in its diversity of wetland types and ecosystem functions, including a variety of interrelated components that make it unique in terms of material supply capacity and function [5]. However, mounting pressures threaten the delicate balance of this ecosystem. The escalation of industrial, agricultural, and domestic water usage in the Yellow River Basin, coupled with the frequent occurrence of extreme arid climates, has led to a reduction in the inbound water flow of the Yellow River, resulting in the degradation of certain wetlands [6]. Simultaneously, inland sediment continues to accumulate in the estuary due to hydraulic transportation, fostering the formation of new wetlands. Conversely, coastal erosion caused by waves and storm surges exacerbates wetland loss, with the interaction of these factors contributing to the continuous transformation of wetland landscapes. This poses a great challenge for the conservation of wetland ecosystems [7]. The morphological evolution of the Yellow River Delta is significantly influenced by climatic variations and anthropogenic alterations within both the river basin and the deltaic region [8]. 
Therefore, understanding and monitoring the classification and changes of the wetlands in the Yellow River Delta is crucial for protecting this valuable resource, maintaining ecological balance, and responding to major challenges such as climate change.

Remote sensing is a comprehensive earth observation technology [9]. Remote sensing science can provide accurate, comprehensive, and repeatable active observation and monitoring of large land areas [10]. Remote sensing technology can provide large-scale and high-resolution geographic data, allowing us to quickly gather dynamic information on wetland areas [11]. Remote sensing technology enables the systematic observation and analysis of wetland evolution, facilitating the monitoring of changes in water bodies, wetland vegetation, and land use dynamics. This information holds significant scientific value and serves as a robust foundation for informed decision-making regarding wetland protection and resource management [12].

In remote sensing image classification, combining spectral data with texture features extracts more information from the data [13]. Xue and Liu [14] improved the classification accuracy of wetland marshes using spectral and textural features and principal component analysis. Wu et al. [15] used data fusion to effectively improve the quality of the data. Index bands and texture features need special consideration in improving land use classification performance [16]. Hui et al. [17] enhanced the extraction of wetland information using feature optimization methods. Meanwhile, most studies select data with stable hydrological features and few clouds for classification. Zheng et al. [18] used Sentinel-2 data from 28 February 2017 for the performance evaluation of land use and land cover classification. Wang et al. [19] used Landsat 8 imagery from 1 October 2012 for coastal wetland cover classification, and the best classification results were obtained. Cui et al. [20] used the Landsat 8 image of July 20, which has relatively stable hydrological features, for the classification of the Yellow River Delta.

Traditional wetland extraction methods mainly use manual visual interpretation techniques [21]. Traditional classification methods are often hindered by factors like image resolution, noise, and occlusion, making accurate recognition and classification challenging. Additionally, the wetlands in the Yellow River Delta exhibit both homogeneous and heterogeneous characteristics, further complicating the classification process [22]. In recent years, with the emergence of deep learning (DL) techniques, significant breakthroughs have been made in remote sensing (RS) image classification, providing new opportunities for its research and development [23].

Deep learning technology is becoming increasingly important, and it has been proven to be a breakthrough and extremely powerful tool in many fields [24]. DL provides significant advantages in wetland classification in the Yellow River Delta through automation, feature learning, large-scale data processing, spatio-temporal data processing, and higher classification accuracy, which is expected to improve wetland monitoring and protection. Mahdianpari et al. [25] used a deep convolutional neural network (CNN) classification framework for wetland classification and obtained better classification accuracy than random forests. Jamali and Mahdianpari [26] utilized the Swin Transformer for coastal wetland classification, resulting in significantly improved accuracy. Jamali et al. [27] used a CNN combined with a 3D generative adversarial network for wetland classification with limited training data and obtained optimal results. Also, to speed up model training, many algorithms use transfer learning for pre-training [28]. Transfer learning is a method of transferring labeled data or knowledge structures from related domains to accomplish or improve learning in the target domain or task. As more and more machine learning application scenarios emerge, the application of transfer learning in deep learning is becoming more extensive [29].

Nevertheless, the predominant wetland classification methodologies employed in the Yellow River Delta rely on machine learning techniques, which often incur high computational resource demands and exhibit sluggish performance improvements in achieving accuracy. Additionally, when employing deep learning methodologies to handle large datasets, the classification accuracy of Convolutional Neural Networks (CNNs) tends to decline as network depth increases. In response to this challenge, He et al. [30] introduced a pioneering residual learning framework called ResNet for image classification. This innovative approach aims to streamline the training process of networks with greater depth compared to conventional methodologies. Gulsan Alp et al. obtained the best training accuracy using ResNet for land cover and land use classification [31]. Raoof Naushad et al. obtained the best results using ResNet combined with transfer learning for land cover classification [32].

Meanwhile, to make the network focus on local features, attention mechanisms can be introduced into deep learning networks so that limited attention is concentrated on the key information, saving resources and obtaining the most effective information quickly. With an attention mechanism, the neural network can automatically learn to focus selectively on the important information in the input, improving the performance and generalization ability of the model [33]. Attention mechanisms have also been applied to remote sensing. Duan et al. introduced a channel attention mechanism to improve land classification accuracy [34]. Zhang et al. introduced a channel attention mechanism in the network to improve the classification accuracy of high-resolution remote sensing images [35].

In this study, we employed a feature preference scheme to select dataset images, followed by processing the data through an enhanced ResNet network integrated with an attention mechanism. This approach aimed to achieve a comprehensive classification of wetlands within the Yellow River Delta. Subsequently, we conducted a comparative analysis with existing methodologies to evaluate the effectiveness of our proposed method. Finally, we discussed and interpreted the results, drawing meaningful conclusions.

2. Study Area and Data

2.1. Study Area

The Yellow River Delta is the area within China where the Yellow River enters the sea, and is one of the largest deltas in China. Located at the confluence of fresh and salt water [2], it is bordered by the Bohai Sea to the east, the Jiaodong Peninsula to the south, and the Liaodong Peninsula to the north across the Bohai Sea. The climate is mild and belongs to the temperate monsoon climate. It is relatively humid in the summer and dry in the winter, with obvious seasonality. The Yellow River Delta Wetland is the most extensive, most complete, and youngest wetland ecosystem in the warm temperate zone of China [36]. With its unique geographic location and ecosystem, the Yellow River Delta wetland is one of the few estuarine wetland ecosystems in the world. In this study, the selected study area spans 118°34′~119°20′E and 37°35′~38°11′N. The study area is shown in Figure 1.

2.2. Data Description

2.2.1. Sentinel-2 Data

Sentinel-2 is a wide-swath, fine spatial resolution satellite imaging mission developed by the European Space Agency (ESA) in the framework of the Copernicus program of the European Union, primarily for terrestrial monitoring [37]. It carries a multispectral imager (MSI) capable of acquiring images of vegetation, soil and water cover, inland waterways, coastal areas, etc. Sentinel-2 combines high spatial resolution, a short revisit period, and rich spectral information [38]. Sentinel-2 data are high-resolution and multispectral and include three bands in the red-edge range, which is very effective for monitoring vegetation health. Satellite data are widely used in areas such as land monitoring, disaster support, climate and environmental monitoring, and ocean and polar monitoring [39]. In this study, we used Sentinel-2 Level-2A multispectral data acquired on 29 October 2019 with low cloudiness; crops are harvested during this month, so land characteristics vary widely and features are distinctive [20]. The levels and product descriptions of the Sentinel data are shown in Table 1, and all the spectral bands of Sentinel-2 are shown in Table 2.

In this work, four scenes of Sentinel-2A L2A data acquired on 29 October 2019 are selected, with less than 1% cloud cover; the data have been atmospherically corrected.

2.2.2. Wetland Classification System

The classification criteria and system of this study were based on the specific distribution of wetlands in the Yellow River Delta study area. Referring to the Convention on Wetlands of International Importance Especially as Waterfowl Habitat (the Convention on Wetlands, also known as the Ramsar Convention), and using the national standard of China's wetland classification [40] as the main reference, the wetland classification criteria shown in Table 3 were formulated. Since the physical characteristics of the Yellow River Delta are affected by human factors, the study area is divided into three major parts: natural wetlands that are not affected by human beings, artificial wetlands that are affected by human factors, and non-wetlands. The natural wetlands include shrub swamps, herbaceous swamps, rivers, and mudflats; the artificial wetlands include reservoirs and salt flats; and the non-wetlands include building land, agricultural land, offshore waters, and other land.

The training and validation sample sets were partly derived from the China Wetland Marsh Classification and Distribution dataset of the Geo-Ecology Network (www.gisrs.cn/, accessed on 29 October 2019); the data of the Field Scientific Observatory of Wetland Ecosystems in the Yellow River Estuary and the China Wetland Distribution dataset [41] were also referenced. Building on these datasets, visual interpretation using Google Earth was used for further validation and analysis, and the classification results were calibrated and corrected against a large number of field survey sample points. The test set is randomly sampled, and the dataset has a total of 12,866,085 pixels. The proportions of the training set and test set are 20% and 80%, respectively. Figure 2 shows the ground-truth images of the samples.

3. Methodology

The research flow of this study is shown in Figure 3. Data preprocessing is performed first, and the dataset then undergoes feature optimization. Initially, the processed dataset comprises three components: 12 spectral bands, 10 texture bands, and 8 index bands. The Random Forest algorithm is employed to assess the importance of the 18 non-spectral bands, resulting in an importance ranking from which bands with high importance are retained. Subsequently, scores for the various band combinations are computed using the Optimum Index Factor (OIF). The three bands with the highest score are selected, and Principal Component Analysis (PCA) is applied to them to derive texture features. The final dataset consists of 4 index bands, 12 spectral bands, and 10 texture features, which are stacked to yield a total of 26 bands. These data are then fed into a ResNet coupled with an attention mechanism for experimentation, and the results are compared with those obtained using Random Forest (RF) and the ResNet series networks. The experimental methodology is described in Section 3, while the experimental findings are discussed in Section 4.

3.1. Data Preprocessing

We processed Sentinel-2 L2A images with less than 1% cloud cover. First, the data are super-resolved to obtain data with a much finer resolution than the original image. Next, we extract the index bands from the super-resolved image. The image is also subjected to principal component analysis (PCA) to extract its important information, and texture features are then extracted from the PCA output. Finally, the extracted index bands and texture features are band-fused to facilitate subsequent processing. The data processing flow is shown in Figure 4.

3.1.1. Super-Resolution Synthesis

Because the Sentinel-2 bands have different resolutions, a resampling operation is usually performed before processing the data, but this is not necessarily the best approach. The 60 m spatial resolution of the Sentinel-2 imaging data can be changed to a finer spatial resolution, enabling more information to be extracted from the observational data [42]. If we directly resample the 60 m data to 10 m, each original raster cell becomes 36 cells, which inevitably introduces considerable bias. Moreover, other optical sensors provide panchromatic images that can be fused with multispectral bands to improve resolution. However, Sentinel-2 imagery has no panchromatic band, so this approach cannot be applied. Xinya Wang et al. [43] used a combination of super-resolution and deep learning for feature classification, urban change detection, and ecosystem monitoring. For Sentinel-2, Sen2Res can super-resolve the 20 m and 60 m bands to 10 m, with results that are substantially better than resampling. In this resolution enhancement method, starting from the highest resolution bands, band-dependent information is separated from information that is common to all bands. This model is then applied to unmix the low-resolution bands, preserving their reflectance while propagating band-independent information to preserve the sub-pixel details. This approach has been integrated into the processing plug-in for Sentinel data [44]. Figure 5 shows a comparison for the B1 band, whose original spectral data have a resolution of 60 m.
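The 1-to-36 cell expansion described above can be illustrated with a minimal NumPy sketch. It shows plain nearest-neighbour resampling only, not the Sen2Res super-resolution itself; the function name `resample_nearest` and the toy reflectance values are hypothetical.

```python
import numpy as np

def resample_nearest(band_60m: np.ndarray, factor: int = 6) -> np.ndarray:
    """Nearest-neighbour upsampling: each coarse cell is copied into factor x factor fine cells."""
    rows = np.repeat(band_60m, factor, axis=0)
    return np.repeat(rows, factor, axis=1)

# Toy 2 x 2 "60 m" band (hypothetical reflectance values).
coarse = np.array([[0.10, 0.20],
                   [0.30, 0.40]])
fine = resample_nearest(coarse)  # 60 m grid -> 10 m grid

# Every 6 x 6 block of the fine grid holds one repeated coarse value,
# so no sub-pixel detail is gained by resampling alone.
print(fine.shape)
```

Because each 6 × 6 block is constant, resampling adds cells but no information, which is the motivation for a super-resolution step.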

3.1.2. Feature Band Extraction

The feature information extracted from remote sensing images generally comprises spectral bands, spectral indices, texture features, geometric features, etc. [13]. Vegetation and water body indices were extracted first; vegetation indices (VIs) describe remotely sensed vegetation characteristics [45]. The water body index significantly increases the contrast between water bodies and buildings, which is conducive to the accurate extraction of water body information [46]. The Remote Sensing Ecological Index (RSEI) is a remote sensing technique that constructs and enhances spectral characteristics by combining different visible and infrared bands of satellite multispectral imagery to reflect the characteristics of a certain feature. Generally speaking, remote sensing indices help differentiate between features, for example in feature identification and classification, and different features have corresponding indices with distinctive characteristics that can serve as a basis for differentiation. Because of their different construction principles, different remote sensing indices may differ in how well they extract the same kind of feature. Therefore, choosing the appropriate remote sensing index for the application scenario helps improve the accuracy of feature extraction [47]. In this study, indices related to vegetation, soil, and water bodies are extracted. More prominent feature indices also help with further processing and analysis. Using PCA (Principal Component Analysis), more effective comprehensive indices can be obtained. Finally, texture features are extracted. Texture is a combination of complex visual entities or sub-patterns, characterized by brightness, color, steepness, size, etc. Hence, texture can be considered a combination of similar sub-images [48].
Textures are generally categorized into two types: structural texture (a deterministic process) and statistical texture (a random process). Li et al. [40] described texture characteristics using features like homogeneity, density, roughness, regularity, linearity, directionality, frequency, phase, etc.

In this study, based on the characteristics of the Yellow River Delta study area, feature bands such as the soil index, vegetation and water indices, spectral features, the K-T transformation, and texture features were extracted. In total, 12 spectral feature bands, 8 index bands, and 10 texture feature bands (30 bands) were obtained as the initial dataset. Table 4 shows the classes, names, and expressions of the extracted bands.
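As an illustration of how index bands of this kind are computed, the sketch below evaluates NDVI and MNDWI with their standard normalized-difference definitions, using the usual Sentinel-2 band roles (B3 green, B4 red, B8 near-infrared, B11 short-wave infrared). Table 4 is not reproduced here, so whether the study uses exactly these expressions is an assumption; the helper name and the reflectance values are hypothetical.

```python
import numpy as np

def normalized_difference(a, b):
    """Compute (a - b) / (a + b) element-wise, returning 0 where a + b == 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.zeros_like(total)
    np.divide(a - b, total, out=out, where=total != 0)
    return out

# Hypothetical surface reflectances for two pixels: [vegetation, water].
b3_green = np.array([0.08, 0.06])
b4_red = np.array([0.05, 0.04])
b8_nir = np.array([0.45, 0.02])
b11_swir = np.array([0.20, 0.01])

ndvi = normalized_difference(b8_nir, b4_red)       # high over vegetation
mndwi = normalized_difference(b3_green, b11_swir)  # high over water
print(ndvi, mndwi)
```

In this toy case the vegetation pixel has a high NDVI and a negative MNDWI, while the water pixel shows the opposite pattern, which is why the two indices help separate these classes.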

3.2. Feature Extraction and Combination Scheme

3.2.1. Band Importance Analysis

In this study, the multispectral data of remote sensing images, vegetation index, vegetation red edge position index, water body index, soil brightness index, and texture features were selected to construct the dataset according to the characteristics of the study area. The next experiment was conducted using the random forest (RF) algorithm, which has a wide range of applications in remote sensing image processing. RF copes well with small sample sizes, complex interactions, and correlated variables [51]. In addition, feature selection can be conducted based on its feature importance metrics [52]. This study uses 70% of the sample data as the training set and 30% as the test set. Experiments showed that the error tends to stabilize once the number of decision trees exceeds 120, so the number of decision trees was set to 120. After removing the spectral bands, all remaining bands in the dataset were put into the RF algorithm for band importance ranking. The resulting ranking is shown in Figure 6.

The texture features all rank near the top in importance, so all of them are retained. Among the index bands, MNDWI, NDVI, BI2, and S2REP are retained.

3.2.2. Combination Scheme

Different bands carry and reflect different spectral characteristics, so different remote sensing bands represent ground information differently. Remote sensing band selection is based on the information content of the band combination, the strength of the correlation between the bands, and the spectral response characteristics of the features to be identified in the study area. The best bands and their combinations need to meet the requirements of high information content, low correlation, large spectral differences between features, and good separability. When remote sensing technology is used for such analysis, the best combination often has to be selected by some method to serve the purpose of image interpretation. OIF can be used to select the optimal band combination [30]. Zhao et al. used OIF to improve crop category recognition accuracy [53]. Acharya et al. used OIF to improve land cover classification accuracy [54]. OIF is a method for determining a three-band combination that maximizes variability in a given multispectral scene [55]. Currently, OIF is the most commonly used method for selecting the optimal band combination.

OIF = \sum_{i=1}^{3} \{ S_{i} / \sum_{j=1}^{3} |R_{ij}| \}

(1)

The Optimum Index Factor (OIF) is calculated using Formula (1), where S_{i} is the standard deviation of the i^{th} band, and R_{ij} is the correlation coefficient between bands i and j. For n-band images, we first calculate their standard deviations and correlation coefficient matrix. Then, the OIF values of all possible band combinations are derived separately, and the strengths and weaknesses of the band combinations are judged by the size of this index. The larger the OIF, the more information the corresponding band combination contains. The OIF values are finally sorted from largest to smallest, and the band combination with the largest OIF is the best.
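The ranking procedure described above can be sketched as follows. The sketch assumes the widely used form of OIF, in which the sum of the three standard deviations is divided by the sum of the absolute pairwise correlation coefficients; whether Formula (1) is evaluated in exactly this way in the study is an assumption. The function name, band names, and random data are hypothetical.

```python
from itertools import combinations

import numpy as np

def rank_band_triples(bands: np.ndarray, names):
    """Rank all three-band combinations by OIF (largest first).

    bands: array of shape (n_bands, n_pixels).
    Returns a list of (oif, (name_i, name_j, name_k)) tuples.
    """
    std = bands.std(axis=1)    # standard deviation of each band
    corr = np.corrcoef(bands)  # n_bands x n_bands correlation matrix
    scores = []
    for i, j, k in combinations(range(len(names)), 3):
        s = std[i] + std[j] + std[k]
        r = abs(corr[i, j]) + abs(corr[i, k]) + abs(corr[j, k])
        scores.append((s / r, (names[i], names[j], names[k])))
    return sorted(scores, key=lambda item: item[0], reverse=True)

# Hypothetical data: band_2 is almost a copy of band_1 (highly correlated).
rng = np.random.default_rng(0)
band_1 = rng.normal(size=1000)
band_2 = band_1 + 0.01 * rng.normal(size=1000)
band_3 = rng.normal(size=1000)
band_4 = 2.0 * rng.normal(size=1000)
bands = np.stack([band_1, band_2, band_3, band_4])
names = ["band_1", "band_2", "band_3", "band_4"]

ranking = rank_band_triples(bands, names)
for oif, triple in ranking:
    print(f"{oif:10.2f}  {triple}")
```

Combinations that contain both near-duplicate bands are penalized by their high mutual correlation, so they fall to the bottom of the ranking.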

We performed an OIF band combination analysis of the texture features, spectral data, and index data. The calculation showed that the super-resolution spectral band SRB11, MNDWI, and NDVI formed the preferred combination. We extracted the texture information of these three bands for band fusion and finally obtained 26 bands of data, including the NDVI vegetation index, MNDWI water index, REPI vegetation red edge position index, BI soil brightness index, 10 texture feature bands, and 12 spectral bands. These 26 bands were selected as the final dataset to be input into the model for classification in the next step.

3.3. Attention Mechanism Combined with ResNet Network

In this study, a deep learning network that integrates the attention mechanism with the ResNet-34 and ResNet-50 architectures was employed, and images were input into this model for experimental analysis. We used the dataset of 26 selected bands for further feature processing. First, Principal Component Analysis (PCA) is used to obtain a three-channel input. The input images are then downsampled directly by a 7 × 7 convolution, followed by a BN (Batch Normalization) layer, which stabilizes and speeds up training and improves recognition accuracy. Subsequently, the feature map undergoes max pooling and passes through a Convolutional Block Attention Module (CBAM) before entering the ResNet blocks. Finally, the output is produced through average pooling and a fully connected layer. The structure of the model is illustrated in Figure 7.
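The PCA step that turns the 26-band stack into a three-channel input can be sketched with NumPy as below. This is a minimal illustration under stated assumptions: the function name `pca_reduce`, the tile size, and the random data are hypothetical, and the study's actual PCA settings are not given in the text.

```python
import numpy as np

def pca_reduce(cube: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Project an (H, W, B) band stack onto its first principal components."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(float)  # one row per pixel (copy)
    flat -= flat.mean(axis=0)                 # centre each band
    cov = np.cov(flat, rowvar=False)          # B x B band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return (flat @ eigvecs[:, order]).reshape(h, w, n_components)

# Hypothetical 32 x 32 tile with 26 bands.
rng = np.random.default_rng(0)
cube = rng.random((32, 32, 26))
reduced = pca_reduce(cube)
print(reduced.shape)
```

The three output channels are ordered by explained variance, which lets a network pre-trained on three-channel images accept the multiband data.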

3.3.1. ResNet Network

During deep neural network training, the accuracy of the model saturates or even decreases as the network gets deeper and deeper [56], mainly because of vanishing gradients, which make it difficult for the model to converge. ResNet lets the network fit not the objective function H(x) directly but its residual F(x) = H(x) − x. The structure of residual learning is shown in Figure 8.

The residual unit can be expressed as follows:

x_{l + 1} = x_{l} + F (x_{l}, W_{l})

(2)
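Formula (2) can be illustrated with a small NumPy sketch. Here the residual function F is a stand-in built from two dense layers with a ReLU in between, which is an assumption for illustration only; in ResNet, F consists of convolutional layers. All names and values are hypothetical.

```python
import numpy as np

def residual_unit(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """x_{l+1} = x_l + F(x_l, W_l), with F a two-layer map and ReLU in between."""
    f = np.maximum(x @ w1, 0.0) @ w2  # residual branch F(x_l, W_l)
    return x + f                      # identity shortcut plus residual

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))
w1 = 0.1 * rng.normal(size=(4, 4))
w2 = 0.1 * rng.normal(size=(4, 4))
out = residual_unit(x, w1, w2)

# With zero weights the unit reduces to the identity mapping.
identity_out = residual_unit(x, np.zeros((4, 4)), np.zeros((4, 4)))
print(np.array_equal(identity_out, x))
```

When the residual branch outputs zero, the unit passes its input through unchanged, so adding such units should not make a deeper network harder to fit than a shallower one.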

3.3.2. Attention Mechanism

Attention Mechanism is an approach that mimics the human visual and cognitive systems, which allows the neural network to focus on the relevant parts of the input data as it is processed [57]. The neural network can automatically learn and selectively focus on the important information in the input by introducing an attention mechanism, improving the performance and generalization of the model [58].

This study uses both channel attention and spatial attention. First, global average pooling and global max pooling over the spatial dimensions compress the feature map into two 1 × 1 × C channel descriptors. These are then fed into a shared two-layer neural network, and the two output features are added together. Finally, the weight coefficient Mc is obtained through the sigmoid activation function [59]. Multiplying the weight coefficient with the original feature F yields the scaled new feature. The channel attention structure is shown in Figure 9.

Mc (F) = σ (MLP (AvgPool (F)) + MLP (MaxPool (F))) = σ (W_{1} (W_{0} (F_{avg}^{C})) + W_{1} (W_{0} (F_{max}^{C})))

A spatial attention module is introduced after the channel attention module to focus on where the meaningful features are located [60]. After average pooling and max pooling along the channel dimension, a weight coefficient Ms is obtained through a 7 × 7 convolutional layer. The structure of spatial attention is shown in Figure 10.

Finally, multiplying the weight coefficient with feature F yields the scaled new feature. Using an intermediate feature map in the network, this module sequentially infers the attention weights along two separate dimensions, channel and space, and then multiplies the attention map by the original feature map, culminating in the adaptive tuning of the feature map. Channel attention is combined with spatial attention to form CBAM, and the structure of CBAM is shown in Figure 11.
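The channel and spatial attention steps described above can be sketched as a NumPy forward pass. Several details are assumptions taken from the general CBAM design rather than from the text: a ReLU between W_0 and W_1, a sigmoid after the 7 × 7 convolution, zero padding, and no bias terms. The weights are random and all names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(f, w0, w1):
    """f: (C, H, W). Returns Mc of shape (C,) from avg- and max-pooled descriptors."""
    avg = f.mean(axis=(1, 2))
    mx = f.max(axis=(1, 2))

    def mlp(v):
        # Shared two-layer network; the ReLU is an assumption (see lead-in).
        return w1 @ np.maximum(w0 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(f, kernel):
    """f: (C, H, W); kernel: (2, k, k). Returns Ms of shape (H, W)."""
    stacked = np.stack([f.mean(axis=0), f.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = f.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)

def cbam(f, w0, w1, kernel):
    """Apply channel attention, then spatial attention, to feature map f."""
    f1 = channel_attention(f, w0, w1)[:, None, None] * f
    return spatial_attention(f1, kernel)[None, :, :] * f1

rng = np.random.default_rng(0)
feature = rng.normal(size=(8, 16, 16))    # C = 8 channels
w0 = rng.normal(size=(2, 8))              # reduction ratio 4 (hypothetical)
w1 = rng.normal(size=(8, 2))
kernel = 0.1 * rng.normal(size=(2, 7, 7))
refined = cbam(feature, w0, w1, kernel)
print(refined.shape)
```

Both weight maps lie between 0 and 1, so the module rescales the feature map without changing its shape, which is what allows it to be inserted into existing residual units.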

CBAM is added to each residual unit; the structure of the residual units of ResNet-34 and ResNet-50 after the addition of the attention mechanism is shown in Figure 12. The left figure corresponds to the deep network, while the right figure corresponds to the shallow network. For shortcut connections, when the input and output dimensions are the same, the inputs can be added directly to the outputs.

3.3.3. Transfer Learning

In this study, the network to be used is first pre-trained on the ImageNet dataset, and then the pre-trained model is applied to our dataset. The amount of data in this study is large, and it is too costly to train directly on the target domain, so we use transfer learning to train on the experimental data quickly. Transfer learning is a method of applying knowledge or patterns learned in one domain or task to a different but related domain or problem, that is, the ability to systematically identify and apply knowledge and skills learned in a previous domain or task to a new domain or task [61]. Transfer learning aims to provide a framework for solving new but similar problems faster and more efficiently using previously acquired knowledge [29]. As machine learning application scenarios multiply, and because the better-performing supervised learning methods require large amounts of labeled data that are costly to produce, transfer learning is receiving more and more attention.

Transfer learning typically focuses on scenarios with a source domain D_{s} and a target domain D_{t}. The source domain is represented as D_{s} = \{x_{i}, y_{i}\}_{i=1}^{N_{s}}, where x_{i} and y_{i}, respectively, denote data samples and their corresponding category labels. The target domain is represented as D_{t} = \{x_{i}, y_{i}\}_{i=1}^{N_{t}}. Given the source domain D_{s} with a learning task T_{s}, and the target domain D_{t} with a learning task T_{t}, the objective of transfer learning is to leverage the knowledge from the source domain D_{s} and learning task T_{s} to enhance the learning of the predictive function f_{t}(\cdot) in the target domain (Pan Sinno Jialin). This is under the condition that either D_{s} ≠ D_{t} or T_{s} ≠ T_{t}.

Transfer learning applies knowledge, patterns, and understanding gained in one source domain and task to different but related target domains and tasks by means of parameter transfer to achieve rapid learning. The process of transfer learning in this experiment is shown in Figure 13.

4. Classification Results and Accuracy Evaluation

4.1. Classification Effectiveness

Based on the super-resolution image of Sentinel-2, the feature-preferred bands were extracted, and wetland category information in the Yellow River Delta was extracted using RF, ResNet-34, ResNet-50, AM_ResNet-34 (AR34), and AM_ResNet-50 (AR50), respectively. The classification results are shown in Figure 14.

Visual interpretation and comparison with the original images show that the AM_ResNet-34 and AM_ResNet-50 models classify shrub swamps, reservoirs, farmland, and building land much better than RF, and shallow sea waters, mudflats, shrub swamps, and salt fields slightly better than RF. The classification of rivers still needs improvement. A possible reason is that the sediment content near the estuary is much higher in October than in other months, so offshore waters near the estuary are easily misclassified as river.

4.2. Accuracy Evaluation

To compare and analyze the classification results, this study uses User Accuracy (UA), Producer Accuracy (PA), Overall Accuracy (OA), and the Kappa coefficient, which are commonly used to evaluate the classification accuracy of remotely sensed imagery [29]. The Kappa coefficient evaluates the overall performance of the classification method. Each evaluation index is defined in Table 5, and the accuracy of the classification results is shown in Table 6.
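As a minimal sketch of how these metrics follow from a confusion matrix, the snippet below computes UA, PA, OA, and Kappa using the formulas in Table 5. The function name `accuracy_metrics`, the matrix layout (rows are reference classes, columns are predicted classes), and the toy numbers are assumptions made for illustration; they are not taken from the study.

```python
from typing import List, Tuple


def accuracy_metrics(cm: List[List[int]]) -> Tuple[List[float], List[float], float, float]:
    """Compute (UA, PA, OA, Kappa) from a square confusion matrix.

    Assumed layout: cm[i][j] is the number of samples whose reference class
    is i and whose predicted class is j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]  # reference totals a_i
    col_sums = [sum(cm[i][j] for i in range(n_classes))  # predicted totals b_i
                for j in range(n_classes)]
    diag = [cm[i][i] for i in range(n_classes)]

    # User accuracy: correct / all samples predicted as the class.
    ua = [diag[i] / col_sums[i] if col_sums[i] else 0.0 for i in range(n_classes)]
    # Producer accuracy: correct / all reference samples of the class.
    pa = [diag[i] / row_sums[i] if row_sums[i] else 0.0 for i in range(n_classes)]
    oa = sum(diag) / total
    # Chance agreement P_e and Kappa as defined in Table 5.
    pe = sum(a * b for a, b in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return ua, pa, oa, kappa


# Toy 3-class confusion matrix (invented numbers, not results from this study).
toy_cm = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 39],
]
ua, pa, oa, kappa = accuracy_metrics(toy_cm)
# OA = 129 / 150 = 0.86 for this toy matrix.
```

Because Kappa subtracts the agreement expected by chance, it is lower than OA whenever the classifier is imperfect, which is why both values are reported side by side.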

This study compares SVM, Random Forest, ResNet-34, ResNet-50, AM_ResNet-34, and AM_ResNet-50, which achieve Kappa coefficients of 0.81, 0.83, 0.91, 0.90, 0.92, and 0.93 and OA values of 85.03%, 87.62%, 92.59%, 91.77%, 93.76%, and 94.61%, respectively, as shown in Table 6. The deep learning models ResNet-34 and ResNet-50 have higher PA and UA than the machine learning RF model, but the accuracy for herbaceous swamp, shrub swamp, and building land is not much improved; ResNet-50 improves more than ResNet-34 on herbaceous swamp and shrub swamp. With the AM_ResNet-34 and AM_ResNet-50 models, accuracy improves for all categories, and the OA and Kappa coefficients increase further. Because of the large amount of data, the network with ResNet-50 as the base model is more advantageous, giving some improvement in both producer accuracy and user accuracy for all categories.

Ultimately, this paper concludes that combining an attention mechanism with the ResNet network can effectively improve classification accuracy on the feature-preferred Sentinel-2 data.

5. Discussion

5.1. Advantages of This Study

In this study, a deep learning method with feature preference was used to classify the wetland types in the Yellow River Delta based on Sentinel-2 images. The method has the following advantages: (1) Other studies mostly use data from the Yellow River estuary, whereas the observation scale of this study is larger. The preprocessing in those studies mostly relied on spectral bands, whereas we fused texture features with the super-resolution multispectral data to obtain a higher-quality dataset. Most of those studies also used machine learning classification models, whereas we used deep learning models to improve the classification accuracy and ensure the stability of the models. (2) The feature-preferred bands greatly improve the separability of the data and make the features of each category more distinctive. Various studies have also shown that texture features contribute strongly to the data. (3) Deep learning is well suited to large-scale remote sensing data. Residual networks in particular allow deeper networks to be trained without vanishing gradients, which simplifies steps such as feature extraction on large-scale data. (4) The approach is highly portable: the feature preference can be reused in remote sensing image data processing, and the deep learning network can be applied to most remote sensing research, so it is of general significance for wetland classification and can also be applied to the classification of other geographical scenarios. (5) SVM and RF are commonly used classifiers in wetland classification, but the ResNet deep learning network in this study is more effective than these machine learning methods. Compared with SVM, the OA and Kappa improved by 9.58 percentage points and 0.12, respectively; compared with RF, they improved by 6.99 percentage points and 0.1. Current classification research on remote sensing images mostly uses random forests, but deep learning performs better on large-scale data.
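The improvements quoted above can be reproduced directly from the OA and Kappa values in Table 6. The short check below is only an arithmetic sketch; the dictionary `table6` and the helper `gain_over` are illustrative names, and the gains in OA are differences in percentage points.

```python
from typing import Tuple

# OA (%) and Kappa values copied from Table 6.
table6 = {
    "SVM": (85.03, 0.81),
    "RF": (87.62, 0.83),
    "AM_ResNet-50": (94.61, 0.93),
}


def gain_over(baseline: str, model: str = "AM_ResNet-50") -> Tuple[float, float]:
    """Return (OA gain in percentage points, Kappa gain), rounded to 2 decimals."""
    oa_base, kappa_base = table6[baseline]
    oa_model, kappa_model = table6[model]
    return round(oa_model - oa_base, 2), round(kappa_model - kappa_base, 2)


gain_svm = gain_over("SVM")  # gain of AM_ResNet-50 over SVM
gain_rf = gain_over("RF")    # gain of AM_ResNet-50 over Random Forest
```

The differences are 9.58 points and 0.12 relative to SVM, and 6.99 points and 0.1 relative to RF, matching the figures reported in the text.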

5.2. Limitations and Future Improvements

This study still has shortcomings, and follow-up work will address the following four aspects: (1) The October data show distinct land-cover features, but the water body features are not distinct; follow-up work will add multi-temporal data to improve the accuracy for water bodies. (2) Although deep learning greatly improves accuracy, sample labeling is cumbersome; machine learning could be used to pre-label the data before deep learning classification. (3) More bands can be added during feature optimization to further improve the quality of the dataset. (4) Other deep learning models could be explored, such as CNNs combined with Transformer algorithms [62] or TransUNet, which combines U-Net with a Transformer [63].

6. Conclusions

Based on the Sentinel-2 super-resolution image, we used the feature-preferred-band preprocessing method to obtain index features and texture features and then fused them into bands. The ResNet with an attention mechanism was then applied to classify the wetland types in the Yellow River Delta. The conclusions are as follows: the feature-preference preprocessing method greatly improves data separability; the deep residual network is more effective than the random forest algorithm; and the model works best after the attention mechanism is introduced into the residual network, with an overall accuracy of 94.61% and a Kappa coefficient of 0.93, which are 6.99 percentage points and 0.1 higher than those of random forest.

Author Contributions

Y.L., data curation, investigation, software, code, writing—original draft; X.Y., investigation, software, code, analysis, writing—original draft; J.Z., conceptualization, funding acquisition, supervision; writing—review; S.Z., software, visualization, writing—review; X.W., visualization, analysis, writing—review; D.K., code, visualization, writing—review; L.Y., visualization, writing—review; H.L., writing—review. All authors have read and agreed to the published version of the manuscript.

Funding

This work was jointly supported by the Finance Science and Technology Project of Hainan Province (No. ZDYF2021SHFZ063), the Shandong Key Research and Development Project (No. 2018GNC110025, No. ZR2020QE281), the Science and Technology Support Plan for Youth Innovation of Colleges and Universities of Shandong Province of China (No. 2023KJ232), and “Taishan Scholar” Project of Shandong Province (No. TSXZ201712).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data generated or analyzed during this study are included in this published article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Mitsch, W.J.; Gosselink, J.G. The Value of Wetlands: Importance of Scale and Landscape Setting. Ecol. Econ. 2000, 35, 25–33.
Zhang, X.; He, S.; Yang, Y. Evaluation of Wetland Ecosystem Services Value of the Yellow River Delta. Environ. Monit. Assess. 2021, 193, 353.
Liu, R.; Liang, S.; Zhao, H.; Qi, G.; Li, L.; Jiang, Y.; Niu, Z. Progress of Chinese Coastal Wetland Based on Remote Sensing. Remote Sens. Technol. Appl. 2017, 32, 998–1011.
Wei, Z.; Jian, Z.; Sun, Y.; Pan, F.; Han, H.; Liu, Q.; Mei, Y. Ecological Sustainability and High-Quality Development of the Yellow River Delta in China Based on the Improved Ecological Footprint Model. Sci. Rep. 2023, 13, 3821.
Yan, J.; Zhu, J.; Zhao, S.; Su, F. Coastal Wetland Degradation and Ecosystem Service Value Change in the Yellow River Delta, China. Glob. Ecol. Conserv. 2023, 44, e02501.
Yu, B.; Zang, Y.; Wu, C.; Zhao, Z. Spatiotemporal Dynamics of Wetlands and Their Future Multi-Scenario Simulation in the Yellow River Delta, China. J. Environ. Manag. 2024, 353, 120193.
Liu, L.; Wang, H.; Yue, Q. China’s Coastal Wetlands: Ecological Challenges, Restoration, and Management Suggestions. Reg. Stud. Mar. Sci. 2020, 37, 101337.
Fu, Y.; Chen, S.; Ji, H.; Fan, Y.; Li, P. The Modern Yellow River Delta in Transition: Causes and Implications. Mar. Geol. 2021, 436, 106476.
Jia, Y.-Y.; Tang, L.; Li, C.; Yuan, X.; Qian, Y. Current Status and Development of Remote Sensing Technology Standardization in China. In Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 22–27 July 2012; IEEE: Munich, Germany, 2012; pp. 2775–2777.
Huadong, G.; Changlin, W. Building up National Earth Observing System in China. Int. J. Appl. Earth Obs. Geoinf. 2005, 6, 167–176.
Bing, Z. Current Status and Future Prospects of Remote Sensing. Bull. Chin. Acad. Sci. 2017, 32, 12.
Aslam, R.W.; Shu, H.; Javid, K.; Pervaiz, S.; Mustafa, F.; Raza, D.; Ahmed, B.; Quddoos, A.; Al-Ahmadi, S.; Hatamleh, W.A. Wetland Identification through Remote Sensing: Insights into Wetness, Greenness, Turbidity, Temperature, and Changing Landscapes. Big Data Res. 2023, 35, 100416.
Zhu, Q.; Zhong, Y.; Zhang, L. Multi-Feature Probability Topic Scene Classifier for High Spatial Resolution Remote Sensing Imagery. In Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 13–18 July 2014; IEEE: Quebec City, QC, Canada, 2014; pp. 2854–2857.
Xingyu, X.; Hongyu, L. Study on the Classification Approaches of Yancheng Coastal Wetlands based on ALOS Image. Remote Sens. Technol. Appl. 2013, 27, 248–255.
Wu, Z.; Zhang, J.; Deng, F.; Zhang, S.; Zhang, D.; Xun, L.; Javed, T.; Liu, G.; Liu, D.; Ji, M. Fusion of GF and MODIS Data for Regional-Scale Grassland Community Classification with EVI2 Time-Series and Phenological Features. Remote Sens. 2021, 13, 835.
Bai, Y.; Sun, G.; Li, Y.; Ma, P.; Li, G.; Zhang, Y. Comprehensively Analyzing Optical and Polarimetric SAR Features for Land-Use/Land-Cover Classification and Urban Vegetation Extraction in Highly-Dense Urban Area. Int. J. Appl. Earth Obs. Geoinf. 2021, 103, 102496.
Sheng, H.; Wei, J.; Hu, Y.; Xu, M.; Cui, J.; Zheng, H. Wetland Information Extraction Based on Multifeature Optimization of Multitemporal Sentinel-2 Images. Mar. Sci. 2023, 47, 105–112.
Zheng, H.; Du, P.; Chen, J.; Xia, J.; Li, E.; Xu, Z.; Li, X.; Yokoya, N. Performance Evaluation of Downscaling Sentinel-2 Imagery for Land Use and Land Cover Classification by Spectral-Spatial Features. Remote Sens. 2017, 9, 1274.
Wang, X.; Gao, X.; Zhang, Y.; Fei, X.; Chen, Z.; Wang, J.; Zhang, Y.; Lu, X.; Zhao, H. Land-Cover Classification of Coastal Wetlands Using the RF Algorithm for Worldview-2 and Landsat 8 Images. Remote Sens. 2019, 11, 1927.
Cui, L.; Zhang, J.; Wu, Z.; Xun, L.; Wang, X.; Zhang, S.; Bai, Y.; Zhang, S.; Yang, S.; Liu, Q. Superpixel Segmentation Integrated Feature Subset Selection for Wetland Classification over Yellow River Delta. Environ. Sci. Pollut. Res. 2023, 30, 50796–50814.
Lin, X.; Cheng, Y.; Chen, G.; Chen, W.; Chen, R.; Gao, D.; Zhang, Y.; Wu, Y. Semantic Segmentation of China’s Coastal Wetlands Based on Sentinel-2 and Segformer. Remote Sens. 2023, 15, 3714.
Lin, Z.; Wang, J.; Li, W.; Jiang, X.; Zhu, W.; Ma, Y.; Wang, A. OBH-RSI: Object-Based Hierarchical Classification Using Remote Sensing Indices for Coastal Wetland. J. Beijing Inst. Technol. 2021, 30, 159–171.
Li, Y.; Zhang, H.; Xue, X.; Jiang, Y.; Shen, Q. Deep Learning for Remote Sensing Image Classification: A Survey. WIREs Data Min. Knowl. Discov. 2018, 8, e1264.
Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.-S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36.
Mahdianpari, M.; Rezaee, M.; Zhang, Y.; Salehi, B. Wetland Classification Using Deep Convolutional Neural Network. In Proceedings of the IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; IEEE: Valencia, Spain, 2018; pp. 9249–9252.
Jamali, A.; Mahdianpari, M. Swin Transformer and Deep Convolutional Neural Networks for Coastal Wetland Classification Using Sentinel-1, Sentinel-2, and LiDAR Data. Remote Sens. 2022, 14, 359.
Jamali, A.; Mahdianpari, M.; Mohammadimanesh, F.; Brisco, B.; Salehi, B. 3-D Hybrid CNN Combined with 3-D Generative Adversarial Network for Wetland Classification with Limited Training Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 8095–8108.
Li, X.-C.; Zhan, D.-C.; Yang, J.-Q.; Shi, Y.; Hang, C.; Lu, Y. Towards Understanding Transfer Learning Algorithms Using Meta Transfer Features. In Advances in Knowledge Discovery and Data Mining; Springer: Cham, Switzerland, 2020; pp. 855–866.
Lu, J.; Behbood, V.; Hao, P.; Zuo, H.; Xue, S.; Zhang, G. Transfer Learning Using Computational Intelligence: A Survey. Knowl.-Based Syst. 2015, 80, 14–23.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Las Vegas, NV, USA, 2016; pp. 770–778.
Alp, G.; Sertel, E. Deep Learning Based Patch-Wise Land Cover Land Use Classification: A New Small Benchmark Sentinel-2 Image Dataset. In Proceedings of the IGARSS 2022—2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; IEEE: Kuala Lumpur, Malaysia, 2022; pp. 3179–3182.
Naushad, R.; Kaur, T.; Ghaderpour, E. Deep Transfer Learning for Land Use and Land Cover Classification: A Comparative Study. Sensors 2021, 21, 8083.
Brauwers, G.; Frasincar, F. A General Survey on Attention Mechanisms in Deep Learning. IEEE Trans. Knowl. Data Eng. 2023, 35, 3279–3298.
Duan, S.; Zhao, J.; Huang, X.; Zhao, S. Semantic Segmentation of Remote Sensing Data Based on Channel Attention and Feature Information Entropy. Sensors 2024, 24, 1324.
Zhang, H.; Liu, S. Double-Branch Multi-Scale Contextual Network: A Model for Multi-Scale Street Tree Segmentation in High-Resolution Remote Sensing Images. Sensors 2024, 24, 1110.
Jiang, W.; Li, J.; Wang, W.; Xie, Z.; Mai, S. Assessment of Wetland Ecosystem Health Based on RS and GIS in Liaohe River Delta. In Proceedings of the 2005 IEEE International Geoscience and Remote Sensing Symposium, 2005—IGARSS ’05, Seoul, Republic of Korea, 29 July 2005; IEEE: Seoul, Republic of Korea, 2005; Volume 4, pp. 2384–2386.
Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36.
Xu, L.; Su, T.; Lei, B.; Wang, R.; Liu, X.; Meng, C.; Qi, J. The method of algal bloom extraction in Lake Chaohu waters based on FAI-L method. J. Lake Sci. 2023, 35, 1222–1233.
Tian, Y.; Chen, Z.; Hui, F.; Cheng, X.; Ouyang, L. ESA Sentinel-2A/B satellite: Characteristics and applications. J. Beijing Norm. Univ. Sci. 2019, 55, 57.
Li, Y.F.; Liu, H.Y. Advances in wetland classification and wetland landscape classification. Wetl. Sci. 2014, 12, 102–108.
Mao, D.; Wang, Z.; Du, B.; Li, L.; Tian, Y.; Jia, M.; Zeng, Y.; Song, K.; Jiang, M.; Wang, Y. National Wetland Mapping in China: A New Product Resulting from Object-Based and Hierarchical Classification of Landsat 8 OLI Images. ISPRS J. Photogramm. Remote Sens. 2020, 164, 11–25.
Panagiotopoulou, A.; Charou, E.; Stefouli, M.; Platis, K.; Madamopoulos, N.; Bratsolis, E. Sentinel-2 “Low Resolution Band” Optimization Using Super-Resolution Techniques: Lysimachia Lake Pilot Area of Analysis. In Proceedings of the 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA), Patras, Greece, 15–17 July 2019; IEEE: Patras, Greece, 2019; pp. 1–2.
Wang, X.; Hu, Q.; Cheng, Y.; Ma, J. Hyperspectral Image Super-Resolution Meets Deep Learning: A Survey and Perspective. IEEE/CAA J. Autom. Sin. 2023, 10, 1668–1691.
Brodu, N. Super-Resolving Multiresolution Images with Band-Independant Geometry of Multispectral Pixels. In IEEE Transactions on Geoscience and Remote Sensing; IEEE: Washington, DC, USA, 2017.
Zeng, Y.; Hao, D.; Huete, A.; Dechant, B.; Berry, J.; Chen, J.M.; Joiner, J.; Frankenberg, C.; Bond-Lamberty, B. Optical Vegetation Indices for Monitoring Terrestrial Ecosystems Globally. Nature Reviews Earth & Environment. Available online: https://www.nature.com/articles/s43017-022-00298-5 (accessed on 5 January 2024).
Dan, L.; Baosheng, W. Review of Water Body Information Extraction Based on Satellite Remote Sensing. J. Tsinghua Univ. Technol. 2020, 60, 147–161.
Li, H.; Huang, J.; Liang, Y.; Wang, H.; Zhang, Y. Evaluating the quality of ecological environment in Wuhan based on remote sensing ecological index. J. Yunnan Univ. Nat. Sci. Ed. 2020, 42, 81–90.
Rosenfeld, A.; Kak, A.C. Digital Picture Processing, 2nd ed.; Morgan Kaufmann Publishers: San Mateo, CA, USA, 2014; Available online: https://www.oreilly.com/library/view/digital-picture-processing/9780323139915/ (accessed on 5 January 2024).
Liu, H.Q.; Huete, A. A Feedback Based Modification of the NDVI to Minimize Canopy Background and Atmospheric Noise. In IEEE Transactions on Geoscience and Remote Sensing; IEEE: Piscataway, NJ, USA, 1995; Available online: https://ieeexplore.ieee.org/document/8746027 (accessed on 5 January 2024).
Jordan, C.F. Derivation of Leaf-Area Index from Quality of Light on the Forest Floor. Ecology 1969, 50, 663–666. Available online: https://esajournals.onlinelibrary.wiley.com/doi/abs/10.2307/1936256 (accessed on 5 January 2024).
Epifanio, I. Intervention in Prediction Measure: A New Approach to Assessing Variable Importance for Random Forests. BMC Bioinform. 2017, 18, 230.
Rui, C.; Xue, W. Wavelength Selection Method of Near-Infrared Spectrum Based on Random Forest Feature Importance and Interval Partial Least Square Method. Spectrosc. Spectr. Anal. 2023, 43, 1043–1050.
Zhao, L.; Li, Q.; Chang, Q.; Shang, J.; Du, X.; Liu, J.; Dong, T. In-Season Crop Type Identification Using Optimal Feature Knowledge Graph. ISPRS J. Photogramm. Remote Sens. 2022, 194, 250–266.
Acharya, T.; Yang, I.; Lee, D. Land-Cover Classification of Imagery from Landsat Operational Land Imager Based on Optimum Index Factor. Sens. Mater. 2018, 30, 1753–1764.
Kienast-Brown, S.; Boettinger, J.L. Applying the Optimum Index Factor to Multiple Data Types in Soil Survey. In Digital Soil Mapping; Boettinger, J.L., Howell, D.W., Moore, A.C., Hartemink, A.E., Kienast-Brown, S., Eds.; Springer: Dordrecht, The Netherlands, 2010; pp. 385–398. ISBN 978-90-481-8862-8.
Glafkides, J.-P.; Sher, G.I.; Akdag, H. Phylogenetic Replay Learning in Deep Neural Networks. Jordanian J. Comput. Inf. Technol. 2022, 8, 112–126.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010.
Gao, G. Survey on Attention Mechanisms in Deep Learning Recommendation Models. Comput. Eng. Appl. 2022, 58, 9.
Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Computer Vision—ECCV 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 11211, pp. 3–19. ISBN 978-3-030-01233-5.
Yang, Z.; Zhang, T.; Yang, J. Research on Classification Algorithms for Attention Mechanism. In Proceedings of the 2020 19th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), Xuzhou, China, 16–19 October 2020; IEEE: Xuzhou, China, 2020; pp. 194–197.
Weiss, K.; Khoshgoftaar, T. Evaluation of Transfer Learning Algorithms Using Different Base Learners. In Proceedings of the 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), Boston, MA, USA, 6–8 November 2017; IEEE: Boston, MA, USA, 2017; pp. 187–196.
Guo, J.; Han, K.; Wu, H.; Tang, Y.; Chen, X.; Wang, Y.; Xu, C. CMT: Convolutional Neural Networks Meet Vision Transformers. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 12165–12175.
Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. arXiv 2021, arXiv:2102.04306.

Figure 1. Study Area (Note: The image is a true-color composite from Sentinel-2 (Red: Band 4, Green: Band 3, Blue: Band 2)).

Figure 2. Training sample set.

Figure 3. Feature selection flowchart.

Figure 4. Data processing flowchart.

Figure 5. Comparison between the original B1 band and the super-resolution composite band.

Figure 6. Band importance ranking.

Figure 7. Data processing flowchart in the AM_ResNet Network.

Figure 8. Residual unit.

Figure 9. Channel attention structures.

Figure 10. Spatial attention structures.

Figure 11. CBAM structures.

Figure 12. Structure of residual unit with attention mechanism.

Figure 13. The transfer learning process.

Figure 14. Classification results of wetlands in the Yellow River Delta.

Table 1. Levels and product descriptions of Sentinel data.

Product Level	Product Introduction
Level-0	Raw data
Level-1A	Geometric coarse correction products containing metainformation
Level-1B	Radiance product, embedded in a GCP-optimized geometric model but without the corresponding geometric corrections
Level-1C	Atmospheric apparent reflectance products after orthorectification and sub-image-level geometric refinement corrections
Level-2A	Contains primarily atmospherically corrected bottom-of-the-atmosphere reflectance data

Table 2. Spectral Bands of Sentinel-2.

Sensor	Band Number	Band Name	Sentinel-2A Central Wavelength (nm)	Sentinel-2A Bandwidth (nm)	Sentinel-2B Central Wavelength (nm)	Sentinel-2B Bandwidth (nm)	Resolution (m)
MSI	1	Coastal aerosol	443.9	20	442.3	20	60
	2	Blue	496.6	65	492.1	65	10
	3	Green	560.0	35	559	35	10
	4	Red	664.5	30	665	30	10
	5	Vegetation Red Edge	703.9	15	703.8	15	20
	6	Vegetation Red Edge	740.2	15	739.1	15	20
	7	Vegetation Red Edge	782.5	20	779.7	20	20
	8	NIR	835.1	115	833	115	10
	8A	Narrow NIR	864.8	20	864	20	20
	9	Water vapor	945.0	20	943.2	20	60
	10	SWIR-Cirrus	1373.5	30	1376.9	30	60
	11	SWIR	1613.7	90	1610.4	90	20
	12	SWIR	2202.4	180	2185.7	180	20

Table 3. Wetland Classification System—Legends and Explanations.

First Level Classification	Second Level Classification	Identification and Map Color
Natural wetlands	Shrub swamps	SS Dark green
	Herbaceous swamp	HS Light green
	Rivers	RV Purple
	Mudflat	MF Light yellow
Artificial wetland	Reservoir and pond	RE Light blue
	Salt pan	SP Orange
Non-wetland	Construction land	CL Red
	Farmland	FA Brown
	Shallow water	SW Light blue
	Other land use	OL White

Table 4. Categories, names, and expressions of the extracted bands.

Feature Category	Feature Name	Feature Expression
Spectral characteristics	Band	Blue (B2), green (B3), red (B4), near-infrared (B8), red edge (B5), near-infrared NIR (B6, B7, and B8A), shortwave infrared SWIR (B11 and B12), coastal atmospheric aerosols (B1), and cirrus bands (B10)
Vegetation/Water Index	NDVI [49]	$\frac{R_{nir} - R_{red}}{R_{nir} + R_{red}}$
	MNDWI	$\frac{R_{green} - R_{mir}}{R_{green} + R_{mir}}$
	NDWI	$\frac{R_{green} - R_{nir}}{R_{green} + R_{nir}}$
	REPI	$REP = 705 + 35 \times \frac{0.5 \times (B_{4} + B_{7}) - B_{5}}{B_{6} - B_{5}}$
	DVI [50]	$R_{nir} - R_{red}$
	RVI	$\frac{R_{nir}}{R_{red}}$
Soil Index	BI	$\frac{R_{red} - R_{nir}}{R_{red} + R_{nir}}$
	SAVI	$\frac{R_{nir} - R_{red}}{R_{nir} + R_{red} + 0.6}$
Tasseled Cap	Brightness	$0.3029 R_{blue} + 0.2786 R_{green} + 0.4733 R_{red} + 0.5599 R_{nir} + 0.5080 R_{swir1} + 0.1872 R_{swir2}$
	Greenness	$- 0.2941 R_{blue} - 0.2430 R_{green} + 0.5424 R_{red} + 0.7276 R_{nir} - 0.7170 R_{swir1} - 0.1680 R_{swir2}$
	Wetness	$0.1511 R_{blue} + 0.1973 R_{green} + 0.3283 R_{red} + 0.3407 R_{nir} - 0.7117 R_{swir1} - 0.4559 R_{swir2}$
Texture Features	GLCM_Variance	$Variance = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} p(i, j) \times {(i - Mean)}^{2}$
	GLCM_Contrast	$Contrast = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} p(i, j) \times {(i - j)}^{2}$
	GLCM_Entropy	$Entropy = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} p(i, j) \times \ln p(i, j)$
	GLCM_Correlation	$Correlation = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} \frac{(i - Mean) \times (j - Mean) \times {p(i, j)}^{2}}{Variance}$
	GLCM_Homogeneity	$Homogeneity = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} p(i, j) \times \frac{1}{1 + {(i - j)}^{2}}$
	GLCM_ASM	$ASM = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} {p(i, j)}^{2}$
	GLCM_Mean	$Mean = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} p(i, j) \times i$
	GLCM_Dissimilarity	$Dissimilarity = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} p(i, j) \times |i - j|$
	GLCM_Energy	$Energy = \sum_{i = 0}^{{quant}_{k}} \sum_{j = 0}^{{quant}_{k}} {p(i, j)}^{2}$
	GLCM_Max	$Max = \max_{i, j} p(i, j)$
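To make the feature expressions above concrete, the following sketch evaluates several of the spectral indices for one pixel and several texture measures for a tiny image. It is an illustration under stated assumptions rather than the processing chain used in the study: the function names, the single horizontal pixel offset, the symmetric co-occurrence counting, and all input values are invented for the example. The red-edge position is written with the difference B6 − B5 as the denominator, the usual linear-interpolation form, which is assumed to be what the table intends. The `swir` argument stands in for the mid-infrared reflectance $R_{mir}$ of the MNDWI row.

```python
from typing import Dict, List


def spectral_indices(green: float, red: float, nir: float, swir: float,
                     b5: float, b6: float, b7: float) -> Dict[str, float]:
    """Per-pixel indices from the table; all inputs are reflectances."""
    return {
        "NDVI": (nir - red) / (nir + red),
        "NDWI": (green - nir) / (green + nir),
        "MNDWI": (green - swir) / (green + swir),
        "DVI": nir - red,
        "RVI": nir / red,
        # Red-edge position in nm; the red band is B4.
        "REP": 705 + 35 * (0.5 * (red + b7) - b5) / (b6 - b5),
    }


def glcm(image: List[List[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p: List[List[float]]) -> Dict[str, float]:
    """A few of the texture measures, computed from p(i, j)."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    return {
        "Mean": sum(v * i for i, j, v in cells),
        "Contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "Dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "Homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "ASM": sum(v ** 2 for i, j, v in cells),
    }


# Invented reflectances for a vegetated pixel and a tiny 2-level image.
idx = spectral_indices(green=0.08, red=0.05, nir=0.40, swir=0.20,
                       b5=0.10, b6=0.30, b7=0.38)
tex = texture_features(glcm([[0, 0, 1], [0, 1, 1]], levels=2))
```

In practice the texture measures are computed in a moving window over a quantized band, so each pixel receives its own value; the single-matrix version here only shows how each expression is evaluated.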

Table 5. Accuracy evaluation metrics.

Evaluation Indicators	Calculation Methods	Explanations
User Accuracy (UA)	$UA = \frac{R_{i}}{R_{a}}$	$R_{i}$ represents the number of correctly classified samples in a category, and $R_{a}$ represents the total number of samples assigned to that category.
Producer Accuracy (PA)	$PA = \frac{R_{i}}{R_{i} + R_{m}}$	$R_{i}$ represents the number of correctly classified samples in a category, and $R_{m}$ represents the number of samples of that category that were misclassified into other categories.
F1-score	$F1 = \frac{2PR}{P + R}$	P stands for precision and R stands for recall.
Overall Accuracy (OA)	$OA = \frac{\sum_{i = 1}^{C} T_{i}}{n}$	C represents the total number of categories, $T_{i}$ represents the number of correctly classified samples for each category, and n represents the total number of samples.
Kappa	$K = \frac{P_{o} - P_{e}}{1 - P_{e}}$	$P_{o}$ represents the overall classification accuracy, $P_{e} = \frac{\sum_{i = 1}^{C} a_{i} \times b_{i}}{n^{2}}$, $a_{i}$ is the number of real samples for each category, and $b_{i}$ is the number of predicted samples for each category.

Table 6. Accuracy evaluation for all categories.

Method	OA (%)	Kappa	F1-Score	Accuracy	SW	MF	RV	HS	SS	RE	FA	SP	CL	OL
SVM	85.03	0.81	81.26	PA	84.33	83.21	86.03	78.93	84.54	79.89	87.05	89.13	87.72	85.56
SVM	85.03	0.81	81.26	UA	88.87	87.46	85.81	82.35	85.73	81.02	86.36	87.79	90.02	81.06
RF	87.62	0.83	84.55	PA	86.2	86.4	86.68	69.15	89.27	80.9	90.53	94.44	91.8	88.67
RF	87.62	0.83	84.55	UA	93.82	93.79	92.74	66.97	83.79	85	93.78	96.9	74.74	94.25
ResNet34	92.59	0.91	90.62	PA	94.59	92.29	93.85	82.07	82.16	89.38	94.33	93.18	91.81	93.86
ResNet34	92.59	0.91	90.62	UA	94.92	92.03	94.09	85.92	85.02	94.08	97.18	95.67	65.15	92.95
ResNet50	91.77	0.9	89.43	PA	94.44	94.15	93.71	89.86	89.74	85.88	90.65	95.15	87.65	91.65
ResNet50	91.77	0.9	89.43	UA	96.59	91.45	96.29	94.66	94.27	91.76	93.59	96.94	71.65	95.65
AR34	93.76	0.92	91.02	PA	94.04	95.37	94.86	91.17	92.14	90.48	94.45	97.43	91.56	95.11
AR34	93.76	0.92	91.02	UA	93.52	94.08	96.46	95.24	95.26	93.75	91.02	92.43	84.58	96.46
AR50	94.61	0.93	91.93	PA	95.2	94.48	93.91	92.73	92.38	90.31	96.29	97.68	93.37	94.35
AR50	94.61	0.93	91.93	UA	94.64	93.63	95.59	94.52	94.52	95.49	92.48	91.32	85.25	95.85

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

