Article

Mountain Vegetation Classification Method Based on Multi-Channel Semantic Segmentation Model

Baoguo Wang and Yonghui Yao *
1 State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(2), 256; https://doi.org/10.3390/rs16020256
Submission received: 11 December 2023 / Revised: 1 January 2024 / Accepted: 5 January 2024 / Published: 9 January 2024
(This article belongs to the Special Issue Remote Sensing of Mountain and Plateau Vegetation)

Abstract

With the development of satellite remote sensing technology, a substantial quantity of remote sensing data can be obtained every day, but the ability to extract information from these data remains poor, especially regarding intelligent extraction models for vegetation information in mountainous areas. Because the features of remote sensing images (such as spectral, textural and geometric features) vary with illumination, viewing angle, scale and spectrum, it is difficult for a remote sensing intelligent interpretation model with a single data source as input to meet the requirements of engineering-level or large-scale vegetation information extraction and updating. The effective use of multi-source, multi-resolution and multi-type data for remote sensing classification is still a challenge. The objective of this study is to develop a highly intelligent and generalizable classification model of mountain vegetation utilizing multi-source remote sensing data to achieve accurate vegetation extraction. Therefore, a multi-channel semantic segmentation model based on deep learning, FCN-ResNet, is proposed to integrate the features and textures of multi-source, multi-resolution and multi-temporal remote sensing data, thereby enhancing the differentiation of different mountain vegetation types by capturing their characteristics and dynamic changes. In addition, several sets of ablation experiments are designed to investigate the effectiveness of the model. The method is validated on Mt. Taibai (part of the Qinling-Daba Mountains), where the pixel accuracy (PA) of vegetation classification reaches 85.8%. The results show that the proposed multi-channel semantic segmentation model can effectively discriminate different vegetation types and generalizes well to mountainous areas with similar vegetation distributions. The model can therefore be used for the rapid updating of vegetation type maps in mountainous areas.

1. Introduction

With the continuing development of sensor-bearing satellites, the amount of remote sensing data available is growing rapidly [1]. Currently, China has more than 500 operational satellites in orbit, and the amount of data collected daily has reached the petabyte level. The types of data are also becoming more diverse, and data are being acquired ever more frequently [2]. The use of high-resolution remote sensing imagery for intelligent interpretation has become an important means of studying vegetation cover [3,4], its structural composition [5] and its dynamic changes [6,7].
In recent years, many intelligent vegetation classification methods have been developed and significant progress has been made. These methods mainly comprise traditional machine learning methods and deep learning methods [8]. Traditional machine learning methods, such as support vector machines (SVM) [9,10,11], decision trees [12,13,14], random forests [15,16,17] and k-nearest neighbors (KNN) [18], typically involve the extraction of hand-crafted features from remote sensing data, such as spectral indices (e.g., NDVI, NDWI), texture features and statistical measures. The extracted features are then used to train a machine learning model that can classify different vegetation types. Many studies have focused on exploring and refining machine learning methods to improve the accuracy and efficiency of vegetation classification [19,20,21,22,23,24]. Traditional machine learning methods are often interpretable and require fewer computational resources, but they rely heavily on hand-crafted features and may not capture intricate patterns in the data. Moreover, limited by the performance of traditional machine learning, these methods cannot be applied to large-scale production practice because of their poor transferability and generalization [25]. Therefore, deep learning methods, with their higher degree of automation and intelligence, have been more widely developed [26].
Deep learning methods, especially convolutional neural networks (CNNs), have gained considerable attention in remote sensing vegetation classification [27,28,29]. CNNs can automatically learn hierarchical features from raw remote sensing data, eliminating the need for manual feature extraction [30]. These models are trained on large datasets and can effectively capture complex patterns and spatial dependencies in vegetation images [31]. For example, Flood et al. [32] used the U-Net neural network structure to map trees and large shrubs in Queensland, Australia, and achieved accuracy of about 90%; Shi et al. [33] proposed a deep neural network, BIT-DNN, which demonstrated excellent performance when using hyperspectral image data for plant species classification, land cover classification, urban scene identification and crop disease identification, and whose superiority was confirmed by an ablation analysis. However, deep learning algorithms often require large amounts of labeled training data, which are difficult to obtain for vegetation classification [34]. Some studies have sought to reduce the model's need for labeled data; e.g., Langford et al. [35] generated labels by combining unsupervised remote sensing clustering methods with CNNs. However, such approaches still do not fundamentally solve the problem of insufficient samples for the vegetation classification task.
Semantic segmentation is an active topic in computer vision [36,37]; it aims to classify each pixel in an image and segment the image into different semantic categories, providing fine-grained, pixel-level classification that preserves spatial structure and supports multiple categories. Semantic segmentation methods are therefore widely applied to vegetation classification. For example, Ayhan [38] improved the classification accuracy of underrepresented vegetation categories (trees, shrubs and grasses) by combining a median frequency weighting strategy with Deeplabv3+ [39], which effectively addressed the problem of sample imbalance; Liu [40] also used Deeplabv3+ to quantitatively analyze the effect of different remote sensing data on deep-learning-based wetland vegetation classification; and Gonzalez-Perez et al. [41] demonstrated the superiority of deep learning networks (U-Net [42] and DeepLabv3+) over traditional machine learning methods (support vector machines and random forests) in accurately classifying land cover types in coastal wetlands, especially in diverse coastal landscapes, using high-resolution UAV imagery.
However, most of the above methods use a single data source for feature extraction, and the information provided by a single data source may not be sufficient to accurately describe and differentiate complex vegetation types; it may even provide erroneous or redundant information, leading to errors in the feature extraction process. For example, multispectral and hyperspectral images, which are commonly used in vegetation classification, are susceptible to the influence of incident light sources and weather conditions and display the phenomena of "same object, different spectrum" and "same spectrum, different object" [1]. Different data sources have different perceptual mechanisms and resolutions, and the features extracted from a single data source may not adequately capture the subtle differences and characteristics of ground features. For example, vegetation types usually have small spectral differences, making it difficult to achieve high classification accuracy based on the raw spectral information from high-resolution remote sensing images alone [43]. Meanwhile, different data sources can provide different types of information, such as spectra, structure and texture. When a single data source is used for remote sensing vegetation classification, generalization to large areas is relatively poor due to factors such as the scale and viewpoint of remote sensing image acquisition [44,45,46], whereas multi-source data can provide information at multiple scales and viewpoints to cope with large scale variations. In addition, features such as the spectral signature of a vegetation type often change with the seasons, making vegetation difficult to discriminate effectively using data from a single temporal phase, while analyzing data from different temporal phases can capture the changes and evolution of vegetation at different growth stages, provide dynamic information on vegetation change over time and improve classification accuracy [47]. Current remote sensing intelligent classification methods cannot effectively integrate multi-source remote sensing data or fully exploit the geographic knowledge contained in these data, so their applications cannot meet the requirements of engineering-level or large-scale vegetation information extraction and updating.
This study aims to develop a multi-channel semantic segmentation model based on deep learning that uses multi-source remote sensing data to achieve highly accurate and generalizable intelligent vegetation classification. The model is trained using knowledge extracted from multiple sources, including multi-temporal and multi-resolution remote sensing images and other remote sensing information such as digital elevation models. With multi-channel inputs, the model can perform data fusion and enhancement, obtain complementary information from different data sources, improve the effectiveness of intelligent vegetation classification and increase its generalizability, so as to meet the requirements of engineering-level or large-scale vegetation information extraction and updating. The experiments of this study were carried out on Mt. Taibai in Central China, which has a rich variety of vegetation types and landscapes.

2. Data and Method

2.1. Study Area

The study area is located on Mt. Taibai (107°41′23″E–107°51′40″E, 33°49′31″N–34°08′11″N), the main peak of the Qinling-Daba Mountains (Figure 1). The climate of Mt. Taibai is influenced by Mongolian cold air masses in winter and the Pacific subtropical high-pressure belt in summer, resulting in a transitional climate [48]. The mountain has an elevation of 3767.2 m and a relative height difference of over 3000 m; the water and thermal conditions vary in a regular pattern as the terrain rises, producing five distinct climatic zones characterized by specific temperature and precipitation patterns at different altitudes. Correspondingly, the vegetation landscape shows a clear vertical zonation pattern, known as the vertical (altitudinal) belt distribution [49]. Mt. Taibai is a typical mountain of Central China with a rich variety of vegetation types and landscapes, making it an ideal study area for vegetation classification and mapping [50].
In order to evaluate the generalization ability of the model, the model was first trained in the training and testing areas outside Mt. Taibai; the training results were then validated in the validation area (Figure 1).

2.2. Data Source and Preprocessing

The remote sensing data used in this study consist of three types: ZY3 and GF2 satellite data with a resolution of 2 m acquired in winter, and multi-temporal GF1 satellite data with a resolution of 16 m acquired in both winter and summer. The ZY3, GF2 and GF1 satellites were developed and are operated by the China Academy of Space Technology (https://www.cast.cn/, accessed on 4 January 2024), a subsidiary of the China Aerospace Science and Technology Corporation (CASC). As mentioned above, the spectral information of vegetation changes with the seasons; the inclusion of multi-temporal data in the model therefore allows us to capture seasonal variations and improve the discriminative features of the vegetation. The digital elevation model (DEM) data used in this study are derived from a digital surface model (DSM) product with a resolution of 10 m, which provides altitudinal information on the vegetation distribution. The main data used in this study are summarized in Table 1.
To facilitate data input and standardized processing, the data were cropped: the 2 m resolution images were cut into 224 × 224 pixel tiles. To keep the spatial extent of the inputs consistent, the 16 m resolution images were cropped to 1/8 of the side length of the 2 m tiles, i.e., 28 × 28 pixels. In addition, spatial resampling was used to align the DEM data with the 16 m resolution data, which were likewise cropped into 28 × 28 pixel tiles.
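As a concrete illustration of this preprocessing step, the sketch below uses Python's GDAL bindings to resample the DEM to the 16 m grid and cut aligned 224 × 224 / 28 × 28 tiles; the file names, tile grid and the assumption that all rasters share one origin are ours, not the authors' code.

```python
# A rough tiling sketch: 224 x 224 tiles at 2 m and 28 x 28 tiles at 16 m
# cover the same 448 m x 448 m ground extent.
import os
from osgeo import gdal

TILE_HR, TILE_LR = 224, 28
os.makedirs("tiles", exist_ok=True)

hr = gdal.Open("winter_2m.tif")        # assumed input paths
lr = gdal.Open("winter_16m.tif")
dem = gdal.Warp("dem_16m.tif", gdal.Open("dem_10m.tif"),
                xRes=16, yRes=16, resampleAlg="bilinear")  # align DEM to the 16 m grid

for r in range(hr.RasterYSize // TILE_HR):
    for c in range(hr.RasterXSize // TILE_HR):
        # srcWin = [xoff, yoff, xsize, ysize] cuts one tile from each raster
        gdal.Translate(f"tiles/hr_{r}_{c}.tif", hr,
                       srcWin=[c * TILE_HR, r * TILE_HR, TILE_HR, TILE_HR])
        gdal.Translate(f"tiles/lr_{r}_{c}.tif", lr,
                       srcWin=[c * TILE_LR, r * TILE_LR, TILE_LR, TILE_LR])
        gdal.Translate(f"tiles/dem_{r}_{c}.tif", dem,
                       srcWin=[c * TILE_LR, r * TILE_LR, TILE_LR, TILE_LR])
```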
The Mt. Taibai vegetation type map at a 1:100,000 scale, obtained by manual visual interpretation, was used to automatically label the samples and generate labeled data. This dataset provides hierarchical information on the vegetation classification in the study area, covering the vegetation type group, the vegetation type and sub-type, and the vegetation formation and sub-formation; these hierarchical levels provide detailed insight into the different vegetation types present in the study area.
Based on the 1:100,000 scale vegetation map of Mt. Taibai, the input images were automatically labeled with the four dominant vegetation types in the study area: cultivated vegetation, broadleaf forest, coniferous forest and mixed coniferous and broadleaf forest, allowing labels to be assigned accurately to the corresponding areas in the imagery.
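One plausible way to implement such automatic labeling is to rasterize the class codes of the vegetation map onto the 2 m grid so that each image tile has a pixel-aligned label tile; the sketch below is hypothetical (the shapefile name and its "class_id" attribute are assumptions, not the authors' pipeline).

```python
# Hypothetical label rasterization with GDAL.
from osgeo import gdal

hr = gdal.Open("winter_2m.tif")
gt = hr.GetGeoTransform()
xmin, ymax = gt[0], gt[3]
xmax = xmin + gt[1] * hr.RasterXSize
ymin = ymax + gt[5] * hr.RasterYSize          # gt[5] is negative

gdal.Rasterize("labels_2m.tif", "vegetation_map_100k.shp",
               outputBounds=[xmin, ymin, xmax, ymax],
               xRes=2, yRes=2, attribute="class_id",  # assumed class-code field
               outputType=gdal.GDT_Byte, noData=0)
```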
In total, 15,000 samples were selected in the training area, 7400 samples in the testing area and 2236 samples in the validation area (Figure 1); the three groups of samples did not overlap.

2.3. Methods

The deep learning method proposed in this study is based on the FCN-ResNet architecture, which combines two important network architectures, the Fully Convolutional Network (FCN) [51] and the Residual Network (ResNet) [52], to achieve efficient and accurate image segmentation. The FCN is a special type of convolutional neural network for image segmentation: unlike traditional convolutional neural networks, its fully connected layers are converted into fully convolutional layers, allowing the input to be an image of arbitrary size [51]. This enables FCNs to make dense pixel-level predictions for the entire image, rather than assigning a single label to the whole image. ResNet is a deep residual network designed to address the problems of vanishing and exploding gradients during deep neural network training. It efficiently trains deeper networks by introducing residual connections that allow information to skip some layers directly [52]. Such residual connections retain more low-level feature information and help the network learn better feature representations. Compared with other deep network models, ResNet requires fewer parameters and less computation while maintaining accuracy, making it well suited to this experiment.
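For orientation, the stock combination of these two ideas ships with torchvision; the minimal sketch below shows such an off-the-shelf FCN with a ResNet-50 backbone (the class count and tensor shapes mirror this study, everything else is generic, not the authors' modified model).

```python
import torch
from torchvision.models.segmentation import fcn_resnet50

# num_classes=3 mirrors the three directly trained vegetation types;
# in recent torchvision the backbone defaults to ImageNet weights.
model = fcn_resnet50(weights=None, num_classes=3)
x = torch.randn(4, 3, 224, 224)      # a batch of 224 x 224 tiles
logits = model(x)["out"]             # (4, 3, 224, 224) per-pixel scores
```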
In the FCN-ResNet architecture, ResNet is used as the base network of the FCN to provide powerful feature extraction and representation capabilities. By using ResNet’s residual connections, FCN-ResNet is able to efficiently train deep networks and extract rich spectral and texture features from remote sensing images. In the image segmentation task, FCN-ResNet performs feature extraction by passing the input image to ResNet, and then restores the feature map to the same size as the original image through an upsampling operation. Finally, the upsampled feature map is further processed using a convolutional layer to obtain the final segmentation result. The advantage is that it is able to combine the pixel-level prediction of the FCN and the deep feature representation of ResNet to achieve better performance in image segmentation tasks. It is able to accurately capture the detailed information in an image and maintain high efficiency in processing remote sensing images. To make the model more suitable for our task, the following adjustments and improvements have been made.
(1)
The model was modified to have a four-channel structure, extracting features from the following sources: 2 m resolution winter remote sensing imagery, 16 m resolution winter imagery, 16 m resolution summer imagery and 16 m resolution DEM data. Since the four channels have different feature distributions, a ResNet-50 model pre-trained on ImageNet was used to extract features from each channel individually, rather than simply stacking the different data bands.
(2)
The model was built to accept input images of multiple sizes, depending on the data source. Since each 16 m resolution pixel covers 64 (8 × 8) times the area of a 2 m resolution pixel, a scale check was performed, and the features were aligned using resampling or interpolation to ensure that feature fusion could take place in the subsequent stages after the model extracted features from the input images.
There are four vegetation types in the validation area (based on the 1:100,000 scale vegetation map of Mt. Taibai): cultivated vegetation, coniferous forest, broadleaved forest and mixed coniferous and broadleaved forest. Because training on the mixed coniferous and broadleaved forest gave poor results, only the other three vegetation types were trained and classified directly, and the remaining pixels were assigned to the mixed coniferous and broadleaved forest class. The final model architecture was designed as follows (Figure 2; a code sketch is given after this paragraph). The 2 m resolution winter imagery, the 16 m resolution winter imagery, the 16 m resolution summer imagery and the DEM data were each passed through their own feature extractor network, rather than a shared one. The spectral and texture features of different sizes extracted by the different channels were aligned by interpolation to the same size and concatenated at different stages. In the classifier, the features obtained from feature extractors of different depths were combined by successive convolutions and then recovered to a size of 224 × 224 by upsampling. To improve the training speed, the classifier used in this study was based on FCN-8s [51], which ensured the accuracy and quality of the generated segmentation results while avoiding excessive memory overhead during training. The output of the model is a three-channel heat map in which the pixel value of each channel represents the probability of that pixel belonging to one class. The final vegetation classification map was produced by applying the argmax operation to the heat map to obtain the most probable vegetation type for each pixel.
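The following sketch is our illustrative reconstruction of this four-channel fusion idea under the assumptions stated above (one ResNet-50 branch per source, interpolation-based feature alignment, concatenation and an FCN-style head upsampled back to 224 × 224); it is not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

def branch(in_ch):
    net = resnet50(weights=None)  # in the paper, initialized from ImageNet
    net.conv1 = nn.Conv2d(in_ch, 64, 7, stride=2, padding=3, bias=False)
    return nn.Sequential(*list(net.children())[:-2])  # drop avgpool and fc

class MultiChannelFCN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        # 4-band winter 2 m, winter 16 m and summer 16 m imagery; 1-band DEM
        self.branches = nn.ModuleList([branch(4), branch(4), branch(4), branch(1)])
        self.classifier = nn.Sequential(
            nn.Conv2d(2048 * 4, 512, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(512, n_classes, 1))

    def forward(self, hr_win, lr_win, lr_sum, dem):
        feats = [b(x) for b, x in zip(self.branches, (hr_win, lr_win, lr_sum, dem))]
        size = feats[0].shape[-2:]  # align all feature maps to the 2 m branch
        feats = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
                 for f in feats]
        logits = self.classifier(torch.cat(feats, dim=1))
        return F.interpolate(logits, size=(224, 224),  # back to tile size, as in FCN
                             mode="bilinear", align_corners=False)

model = MultiChannelFCN()
heat = model(torch.randn(2, 4, 224, 224), torch.randn(2, 4, 28, 28),
             torch.randn(2, 4, 28, 28), torch.randn(2, 1, 28, 28))
veg_map = heat.argmax(dim=1)  # per-pixel vegetation type from the 3-channel heat map
```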
In addition, a series of ablation experiments was performed to validate the effectiveness of the method, as shown in Table 2. The five ablation experiments, along with the multi-channel model, were grouped as follows.
Group 1: consisting of Experiments 1 and 2, both single-channel models, comparing the use of high-resolution remote sensing imagery with low-resolution remote sensing imagery.
Group 2: including Experiments 3 and 4, adding DEM data based on Experiments 1 and 2, both two-channel models, comparing the performance of DEM data fused with remote sensing images of different resolutions.
Group 3: including Experiment 5 (a three-channel model) and Experiment 6 (the four-channel, multi-channel model), comparing the performance of multi-resolution image fusion before and after adding the DEM.
The results of each experimental model were then compared with those of the multi-channel model to evaluate the effectiveness of the multi-channel model proposed in this study.
The training process of the model begins with training in the training and testing areas; the accuracy of the model is then validated in the validation area. During training, we adopted the cross-entropy loss function commonly used for semantic segmentation and employed the Adam optimization method. The initial learning rate was set to 10⁻⁵, and the batch size for the multi-channel task was set to 4. The training process consisted of 50 iterations: for the first 20 iterations, the learning rate remained constant, and over the remaining 30 iterations it gradually decreased to 0.
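A hedged training-loop sketch matching this stated configuration might look as follows; `model` is the network sketched above and `train_loader` is an assumed DataLoader of sample tuples, not part of the original code.

```python
import torch
import torch.nn as nn

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
# LambdaLR multiplies the base lr by the returned factor at each epoch:
# held at 1.0 for 20 epochs, then decayed linearly to 0 over the last 30.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda epoch: 1.0 if epoch < 20 else (50 - epoch) / 30)
criterion = nn.CrossEntropyLoss()  # pixel-wise cross-entropy

for epoch in range(50):
    for hr_win, lr_win, lr_sum, dem, labels in train_loader:  # batch size 4
        optimizer.zero_grad()
        logits = model(hr_win, lr_win, lr_sum, dem)  # (B, 3, 224, 224)
        loss = criterion(logits, labels)             # labels: (B, 224, 224) class ids
        loss.backward()
        optimizer.step()
    scheduler.step()
```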
In this study, the neural network was trained using the above training and testing datasets. Model performance was evaluated using the pixel accuracy (PA) and mean intersection over union (MIoU), two commonly used metrics for semantic segmentation, calculated as follows:
$$\mathrm{PA} = \frac{\sum_{i=0}^{k} p_{ii}}{\sum_{i=0}^{k}\sum_{j=0}^{k} p_{ij}}$$

$$\mathrm{MIoU} = \frac{1}{k+1}\sum_{i=0}^{k}\frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}$$

where the classes are indexed from 0 to $k$ ($k+1$ categories in total) and $p_{ij}$ denotes the number of pixels of class $i$ predicted as class $j$; for a given class $i$, $p_{ii}$ counts the true positives (TP), the off-diagonal $p_{ij}$ ($j \neq i$) the false negatives (FN) and $p_{ji}$ ($j \neq i$) the false positives (FP).
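Both metrics follow directly from a confusion matrix; the small self-contained illustration below uses our own helper functions (cm[i, j] counts pixels of true class i predicted as class j) and a toy 3-class matrix.

```python
import numpy as np

def pixel_accuracy(cm):
    return np.diag(cm).sum() / cm.sum()

def mean_iou(cm):
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp   # class-i pixels predicted as something else
    fp = cm.sum(axis=0) - tp   # other pixels predicted as class i
    return np.mean(tp / (tp + fp + fn))

cm = np.array([[50, 2, 1],
               [3, 40, 5],
               [2, 4, 60]])
print(pixel_accuracy(cm), mean_iou(cm))  # ~0.898, ~0.812
```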
After model training was completed, classification validation was performed on the entire model using the validation dataset. The classification results were visualized, and the accuracy was evaluated using a confusion matrix computed over the pixels.

3. Results

3.1. Training Results of the Models

Because the ResNet-50 backbone network was pre-trained on ImageNet, the models trained faster. After 50 iterations, the models converged and the loss dropped below 0.1. The training process of the models is shown in Figure 3: as the iterations progressed, the loss continued to decrease and the training accuracy stabilized.
(1)
Training results of single-channel models: According to the results of Experiments 1 and 2 in Group 1 (Figure 3), the classification result of the 16 m resolution remote sensing images was better than the result of the 2 m resolution images. This suggested that although the high-resolution images contained richer textural and structural features, this information was not fully utilized in the single-channel model.
(2)
Training results of two-channel models: The training results of Experiments 3 and 4 in Group 2 (Figure 3) were very close to each other and higher than the training results of Group 1, indicating that the DEM data could significantly improve the classification results, and the results of the two-channel models were better than the results of the single-channel models.
(3)
Training results of the multi-channel models: The training result of Experiment 5 (three-channel images in the model) in Group 3 (Figure 3) was poor, only slightly higher than that of Experiment 1 (single-channel images with 2 m-resolution) and significantly lower than the results of Experiments 3 and 4 in Group 2 and Experiment 2 (single-channel images with 16 m resolution) in Group 1. This showed that the classification model with multi-channel images did not necessarily significantly improve the classification result. However, the training results of Experiment 6 (multi-channel model: three-channel images and one-channel DEM) were significantly better than the results of other experiments, indicating that the multi-channel model fusing DEM and images could significantly improve the results of mountain vegetation classification. Therefore, the multi-channel model, which integrates data from multiple sources for classification, proves to be effective and greatly enhances the accuracy of mountain vegetation classification.

3.2. Classification Results of Mountain Vegetation

(1)
Classification results of the models
The trained models were applied to classification in the validation area, and the classification results are shown in Figure 4. Figure 4a is the 1:100,000 scale vegetation type map generated by manual visual interpretation (used as the ground truth labels in this study) and Figure 4b is the vegetation type map generated by the multi-channel segmentation model (Experiment 6). Overall, the classification results were close to the ground truth labels, with an overall accuracy of 85.8%. The comparison between the two maps shows several misclassified areas, mainly along the boundary between the coniferous forests and the mixed coniferous and broadleaved forests. Since there is a gradual transition between these two types, it is difficult to make accurate judgments in the transitional zone; additionally, the image features in these transitional areas are not distinct enough to separate the two types correctly. As a result, the model's classification contains some errors, leading to confusion between the coniferous forests and the mixed coniferous and broadleaved forests. There was also a confusion zone between broadleaved forests and cultivated vegetation in the upper right corner of the validation area (Figure 4a,b). Inspection of the original image showed that this area was covered by a significant amount of shadow (at an elevation of about 1000 m), making its image features similar to those of cultivated vegetation such as farmland; the model consequently misclassified this area. Figure 4c–g show the classification results of the other ablation experiment models, which contain many misclassifications. It is worth noting the many horizontal stripes in Figure 4g. This phenomenon is caused by the image cropping and mosaicking methods: because the cropping in this study was performed without overlap, the predicted values at the edge of each slice tended to have low confidence and were subject to abrupt changes, producing stripes along the adjacent borders of slices. The poor accuracy of Experiment 5 made these stripes more obvious.
Typical slices of the classification results and the heat maps of the multi-channel model (Experiment 6) in the validation area are shown in Figure 5 and Figure 6. Figure 5a and Figure 6a are the false-color satellite images with a 2 m resolution. As shown in the figures, comparing the classification results (Figure 5c and Figure 6c) and the truth labels (Figure 5b and Figure 6b), it can be observed that the classification results of the multi-channel model are generally consistent with the ground truth labels.
(2)
Classification accuracy of the models
The classification accuracy of the vegetation types was calculated by the statistical analysis of pixels; details are shown in Table 3. Compared with PA, MIoU is a stricter and widely used indicator for accuracy assessment, and it is strongly affected by the number and distribution of classification types. Because broadleaved forest and coniferous forest predominate in the study area, the distribution of vegetation types is uneven, resulting in unsatisfactory MIoU values. However, the PA of the method was very high, indicating good overall classification accuracy; the analysis therefore focused on the overall pixel accuracy (PA). Of all the models evaluated, the multi-channel model demonstrated the highest accuracy, achieving an overall accuracy of 85.8%. This suggests that combining DEM data with multi-source spectral remote sensing images yields the best classification results. Among the single-channel models, the accuracy of the models using only 2 m or 16 m resolution winter imagery (Experiments 1 and 2) was relatively low, at only 58–59%. The accuracy of the classification model based on multi-source remote sensing images alone (Experiment 5) was even lower, at 44.5%. This suggests that neither a single image source nor multi-source images alone are sufficient for vegetation classification. Incorporating DEM information significantly improves the classification accuracy, which increases to 65.8% and 68.8% (Experiments 3 and 4), respectively. Overall, the model that integrates multi-source data (Experiment 6, multi-source remote sensing images and DEM) performs best, confirming the importance of multi-source data fusion; multi-channel learning methods have clear advantages in this type of remote sensing classification task.
Since the 1:100,000 scale vegetation type map used as the ground truth was generated by manual visual interpretation, it also contains some errors, which may lower the apparent accuracy of our predictions. For example, the boundaries of the cultivated vegetation in the ground truth label in Figure 7b are not quite correct according to Figure 7a, while the boundaries and classification results of the multi-channel model in Figure 7c are more consistent with the real situation. However, when the classification accuracy was calculated against the ground truth labels of the 1:100,000 scale map, such correctly classified areas were counted as errors, reducing the reported accuracy. In other words, the actual classification accuracy of the multi-channel model is higher than that shown in Table 3. Through multi-channel abstraction and learning based on ResNet-50, the model can capture and extract key features from the data and gradually optimize its weights and parameters during training. Despite some errors or noise, the multi-channel deep learning model can still learn valuable information and patterns from the data, resulting in good overall performance and prediction accuracy. As a result, the multi-channel model based on ResNet-50 can achieve even better performance in practice.

4. Discussion

(1)
The effect of the labeled data on the classification accuracy
The labeled data used in this study were extracted from the vegetation distribution maps by manual visual interpretation, and they still contained some errors. These errors arose from a number of sources, such as the subjective judgment of the interpreters, the incomplete accuracy of the annotation process and the noise in the data itself, which may cause the model to learn some incorrect features or perform inaccurate classification during the training process, thus affecting the classification accuracy of the model.
As we know, deep learning models usually have a certain level of error tolerance due to their strong nonlinear fitting and generalization capabilities. Even with a certain amount of error in the labeled data, deep learning models still have the ability to learn key features and patterns from them and achieve accurate classification. In addition, deep learning models typically have a large number of parameters and hidden layers that enable them to learn subtle features in the data. This means that deep learning models can compensate for some of the errors in the labeled data through a large number of training samples and an iterative optimization process. By training on large amounts of data, the model can learn a more robust and generalized representation of the features, improving the overall classification performance. Therefore, despite some errors in the labeled data, the multi-channel model proposed in this study can still achieve better results in practical applications, especially if the model has sufficient capacity and training data.
(2)
Effect of data slicing on classification accuracy
The remote sensing data and DEM used in this study were sliced using Python's GDAL (2.2.3) library. Slicing introduces boundaries between adjacent slices, and since the slices do not overlap, the confidence values at these boundaries tend to be low because the information there is incomplete or limited by the choice of slices. In other words, data slicing can make the model's classifications less reliable or accurate at slice boundaries.
To address the problems caused by data slicing, the classification results at the boundaries can be improved by merging or post-processing in a subsequent stage. For example, pixel-level post-processing techniques such as boundary smoothing or pixel redistribution may be used to reduce errors at the boundaries. Additionally, adding overlapping regions or using sliding windows can address the slice boundary problem (a sketch of the overlapping-window approach is given below). These approaches can improve the classification accuracy at the boundaries.
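A minimal sketch of overlapping sliding-window inference follows, assuming a `predict_tile` helper (not in the original) that returns per-class softmax maps for one tile; overlapping predictions are accumulated and averaged, which damps the low-confidence jumps at slice borders that produce stripes.

```python
import numpy as np

def sliding_window_predict(image, predict_tile, tile=224, stride=112, n_classes=3):
    h, w = image.shape[:2]
    probs = np.zeros((n_classes, h, w), dtype=np.float32)
    hits = np.zeros((h, w), dtype=np.float32)
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            # predict_tile returns a (n_classes, tile, tile) softmax map
            probs[:, y:y + tile, x:x + tile] += predict_tile(image[y:y + tile, x:x + tile])
            hits[y:y + tile, x:x + tile] += 1
    probs /= np.maximum(hits, 1)   # average where windows overlap
    return probs.argmax(axis=0)    # final per-pixel class map
```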
(3)
Effect of DEM on the vegetation classification
According to the classification results of the experiments in this study (Figure 4 and Table 3), the DEM has a strong influence on the vegetation classification. The distribution of vegetation in mountainous areas is strongly controlled by the topography and shows a clear vertical zonal pattern, known as mountain altitudinal belts [53,54,55,56,57]. Therefore, using a DEM as one of the data sources in the multi-channel model can significantly improve the classification accuracy. The advantage of the multi-channel model based on ResNet-50 is its ability to combine data from different sources for learning and training, obtaining more robust features and thereby improving the classification accuracy.
Although the multi-channel model proposed in this study can fuse multi-source data for deep learning and achieves remarkable results in vegetation classification, it cannot yet make full use of the geomorphological knowledge contained in the data or of other vegetation knowledge graphs; the model therefore remains deficient for finer vegetation classification. Using geoscientific knowledge to mitigate the "black-box" nature of deep learning models and improve their interpretability remains a major challenge. Furthermore, remote sensing images typically contain high-dimensional and complex spatial information, such as land cover, terrain and various land features, and their image characteristics often vary with factors such as the time and angle of acquisition. All of these pose significant challenges for deep learning networks based on feature learning.
(4)
Comparison with other vegetation classification methods in Mt. Taibai
Some similar studies in the study area have also achieved high accuracy in vegetation classification. For example, Zhang et al. [49] used the mountain altitudinal belts of vegetation on Mt. Taibai to construct topographic constraint factors; then, combining high-resolution remote sensing images and DSM data, they classified and mapped the vegetation of Mt. Taibai using object-oriented classification and obtained a validation accuracy of 92.9%, about 7.1% higher than that of this study (85.8%). However, constructing the topographic constraint factors and post-processing the classification required a great deal of manual work. Wu et al. [50] produced an intelligent mapping framework for vegetation types based on geo-objects as the basic units derived from HSR-RS images, with a final classification accuracy of 87.59%, close to the accuracy of this study. However, that method required large amounts of auxiliary information to be overlaid on the geo-objects, including three types of image-based features (spectrum, shape and texture), topographic and geomorphological features (elevation, slope, aspect, degree of hill shade and geomorphological type), meteorological factors, soil factors, land cover types, net primary productivity (NPP) and a vegetation index sequence from the multi-source data; many tasks therefore had to be performed before classification. The multi-channel semantic segmentation model based on deep learning proposed in this study automatically learns the features and knowledge hidden in the input data, so its efficiency and generalization are higher than those of the other methods; additionally, its accuracy could be further improved by post-processing the classification.

5. Conclusions

In this study, a multi-channel semantic segmentation model based on deep learning was proposed for mountain vegetation classification. The model adopts the FCN-ResNet architecture and introduces a multi-channel framework for feature extraction and the multi-level deep fusion of multi-source remote sensing data. It fuses the features of multi-resolution and multi-temporal remote sensing images and DEM data, which enhances its ability to extract different types of remote sensing information, and it automatically learns and captures information at multiple scales and levels, thus making full use of remote sensing data from multiple sources for vegetation classification. Moreover, the multi-channel semantic segmentation model trained in one location (the training area) was transferred to another location (the validation area) and achieved an overall pixel accuracy (PA) of 85.8% in vegetation classification. The classification results in the validation area verify the effectiveness and generalizability of the model, meeting the requirements of engineering-level or large-scale vegetation information extraction and updating.

Author Contributions

Conceptualization, B.W. and Y.Y.; methodology, B.W. and Y.Y.; software, B.W.; validation, B.W.; formal analysis, B.W.; writing—original draft preparation, B.W.; writing—review and editing, Y.Y.; visualization, B.W.; supervision, Y.Y.; project administration, Y.Y.; funding acquisition, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Key R&D Program (grant number: 2022YFB3904204); the National Natural Science Foundation of China (grant number: 41871350); and the Key Project of Innovation LREIS (grant number: KPI008).

Data Availability Statement

The raw data supporting the conclusions of this article are available from the authors upon request.

Acknowledgments

We are grateful to Yang Jiaqi for his advice and suggestions on programming. Our appreciation is also extended to the editors and three anonymous reviewers, whose comments and suggestions helped to greatly improve the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Guo, Q.; Guan, H.; Hu, T.; Jin, S.; Su, Y.; Wang, X.; Wei, D.; Ma, Q.; Sun, Q. Remote sensing-based mapping for the new generation of Vegetation Map of China (1:500,000). Sci. China Life Sci. 2021, 51, 229–241. (In Chinese)
2. Yang, C.; Wu, G.; Li, Q.; Wang, J.; Qu, L.; Ding, K. Research progress on remote sensing classification of vegetation. Geogr. Geo-Inf. Sci. 2018, 34, 24–32.
3. Liu, H.; Zhang, A.; Zhao, Y.; Zhao, A.; Wang, D. Spatial scale transformation–based estimation model for fresh grass yield: A case study of the Xilingol Grassland, Inner Mongolia, China. Environ. Sci. Pollut. Res. 2023, 30, 1085–1095.
4. Miao, Y.; Zhang, R.; Guo, J.; Yi, S.; Meng, B.; Liu, J. Vegetation Coverage in the Desert Area of the Junggar Basin of Xinjiang, China, Based on Unmanned Aerial Vehicle Technology and Multisource Data. Remote Sens. 2022, 14, 5146.
5. Wu, D.; Liu, Q.; Xia, R.; Li, T. Study on the changes in vegetation structural coverage and its response mechanism to hydrology. Open Geosci. 2022, 14, 79–88.
6. Dong, X.; Hu, C. Remote Sensing Monitoring and Evaluation of Vegetation Changes in Hulun Buir Grassland, Inner Mongolia Autonomous Region, China. Forests 2022, 13, 2186.
7. Tian, Y.; Wu, Z.; Li, M.; Wang, B.; Zhang, X. Forest Fire Spread Monitoring and Vegetation Dynamics Detection Based on Multi-Source Remote Sensing Images. Remote Sens. 2022, 14, 4431.
8. Deng, Y. Analysis of the research progress of forest vegetation remote sensing classification. China Sci. Technol. Inf. 2020, 8, 74–75+78.
9. Cui, B.; Wu, J.; Li, X.; Ren, G.; Lu, Y. Combination of deep learning and vegetation index for coastal wetland mapping using GF-2 remote sensing images. Natl. Remote Sens. Bull. 2023, 27, 1376–1386.
10. Zhang, L.; Luo, W.; Zhang, H.; Yin, X.; Li, B. Classification scheme for mapping wetland herbaceous plant communities using time series Sentinel-1 and Sentinel-2 data. Natl. Remote Sens. Bull. 2023, 27, 1362–1375.
11. Rapinel, S.; Mony, C.; Lecoq, L.; Clément, B.; Thomas, A.; Hubert-Moy, L. Evaluation of Sentinel-2 time-series for mapping floodplain grassland plant communities. Remote Sens. Environ. 2019, 223, 115–129.
12. Huang, K.; Meng, X.; Yang, G.; Sun, W. Spatio-temporal probability threshold method of remote sensing for mangroves mapping in China. Natl. Remote Sens. Bull. 2022, 26, 1083–1095.
13. Qin, H.; Zhou, W.; Yao, Y.; Wang, W. Individual tree segmentation and tree species classification in subtropical broadleaf forests using UAV-based LiDAR, hyperspectral, and ultrahigh-resolution RGB data. Remote Sens. Environ. 2022, 280, 113143.
14. Marconi, S.; Weinstein, B.G.; Zou, S.; Bohlman, S.A.; Zare, A.; Singh, A.; Stewart, D.; Harmon, I.; Steinkraus, A.; White, E.P. Continental-scale hyperspectral tree species classification in the United States National Ecological Observatory Network. Remote Sens. Environ. 2022, 282, 113264.
15. Xia, Q.; Li, J.; Dai, S.; Zhang, H.; Xing, X. Mapping high-resolution mangrove forests in China using GF-2 imagery under the tide. Natl. Remote Sens. Bull. 2023, 27, 1320–1333.
16. Gao, C.; Jiang, X.; Zhen, J.; Wang, J.; Wu, G. Mangrove species classification with combination of WorldView-2 and Zhuhai-1 satellite images. Natl. Remote Sens. Bull. 2022, 26, 1155–1168.
17. Ruiz, L.F.C.; Guasselli, L.A.; Simioni, J.P.D.; Belloli, T.F.; Barros Fernandes, P.C. Object-based classification of vegetation species in a subtropical wetland using Sentinel-1 and Sentinel-2A images. Sci. Remote Sens. 2021, 3, 100017.
18. Su, H.; Yao, W.; Wu, Z. Hyperspectral remote sensing imagery classification based on elastic net and low-rank representation. Natl. Remote Sens. Bull. 2022, 26, 2354–2368.
19. Liu, S.; Dong, X.; Lou, X.; Larisa, D.R.; Elena, N. Classification and Density Inversion of Wetland Vegetation Based on the Feature Variables Optimization of Random Forest Model. J. Tongji Univ. (Nat. Sci.) 2021, 49, 695–704.
20. Xing, X.; Yang, X.; Xu, B.; Jin, Y.; Guo, J.; Chen, A.; Yang, D.; Wang, P.; Zhu, L. Remote sensing estimation of grassland aboveground biomass based on random forest. J. Geo-Inf. Sci. 2021, 23, 1312–1324.
21. Li, D.; Chen, S.; Chen, X. Research on method for extracting vegetation information based on hyperspectral remote sensing data. Trans. CSAE 2010, 26, 181–185+386.
22. Liang, J.; Zheng, Z.; Xia, S.; Zhang, X.; Tang, Y. Crop recognition and evaluation using red edge features of GF-6 satellite. Natl. Remote Sens. Bull. 2020, 24, 1168–1179.
23. Su, Y.; Qi, Y.; Wang, J.; Xu, F.; Zhang, J. Land cover extraction in Ejina Oasis by hyperspectral remote sensing. Remote Sens. Technol. Appl. 2018, 33, 202–211.
24. Zhang, X.; Yang, Y.; Gai, L.; Li, L.; Wang, Y. Research on Vegetation Classification Method Based on Combined Decision Tree Algorithm and Maximum Likelihood Ratio. Remote Sens. Inf. 2010, 25, 88–92.
25. Li, L.; Qiao, J.; Yao, J.; Li, J.; Li, L. Automatic freezing-tolerant rapeseed material recognition using UAV images and deep learning. Plant Methods 2022, 18, 5.
26. Minallah, N.; Tariq, M.; Aziz, N.; Khan, W.; Rehman, A.U.; Belhaouari, S.B. On the performance of fusion based planet-scope and Sentinel-2 data for crop classification using inception inspired deep convolutional neural network. PLoS ONE 2020, 15, e0239746.
27. Guo, Q.; Jin, S.; Li, M.; Yang, Q.; Xu, K.; Ju, Y.; Zhang, J.; Xuan, J.; Liu, J.; Su, Y.; et al. Application of deep learning in ecological resource research: Theories, methods, and challenges. Sci. China Earth Sci. 2020, 50, 1354–1373.
28. Hu, Y.; Zhang, J.; Ma, Y.; An, J.; Ren, G.; Li, X. Hyperspectral Coastal Wetland Classification Based on a Multiobject Convolutional Neural Network Model and Decision Fusion. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1110–1114.
29. Kou, W.; Shen, Z.; Liu, D.; Liu, Z.; Li, J.; Chang, W.; Wang, H.; Huang, L.; Jiao, S.; Lei, Y.; et al. Crop classification methods and influencing factors of reusing historical samples based on 2D-CNN. Int. J. Remote Sens. 2023, 44, 3278–3305.
30. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
31. Li, Q.; Tian, J.; Tian, Q. Deep Learning Application for Crop Classification via Multi-Temporal Remote Sensing Images. Agriculture 2023, 13, 906.
32. Flood, N.; Watson, F.; Collett, L. Using a U-net convolutional neural network to map woody vegetation extent from high resolution satellite imagery across Queensland, Australia. Int. J. Appl. Earth Obs. Geoinf. 2019, 82, 101897.
33. Shi, Y.; Han, L.; Huang, W.; Chang, S.; Dong, Y.; Dancey, D.; Han, L. A Biologically Interpretable Two-Stage Deep Neural Network (BIT-DNN) for Vegetation Recognition from Hyperspectral Imagery. IEEE Trans. Geosci. Remote Sens. 2021, 60, 4401320.
34. Kartchner, D.; Nakajima An, D.; Ren, W.; Zhang, C.; Mitchell, C.S. Rule-Enhanced Active Learning for Semi-Automated Weak Supervision. Artif. Intell. 2022, 3, 211–228.
35. Langford, Z.L.; Kumar, J.; Hoffman, F.M.; Breen, A.L.; Iversen, C.M. Arctic Vegetation Mapping Using Unsupervised Training Datasets and Convolutional Neural Networks. Remote Sens. 2019, 11, 69.
36. Sharma, S.; Ball, J.E.; Tang, B.; Carruth, D.W.; Doude, M.; Islam, M.A. Semantic Segmentation with Transfer Learning for Off-Road Autonomous Driving. Sensors 2019, 19, 2577.
37. Qiao, K.; Chen, J.; Wang, L.; Zeng, L.; Yan, B. A top-down manner-based DCNN architecture for semantic image segmentation. PLoS ONE 2017, 12, e0174508.
38. Ayhan, B.; Kwan, C.; Larkin, J.; Kwan, L.; Skarlatos, D.; Vlachos, M. Deep Learning Model for Accurate Vegetation Classification Using RGB Image Only. In Proceedings of the Conference on Geospatial Informatics X, Online, 27 April–8 May 2020; Doucette, P.J., Ed.; SPIE: Bellingham, WA, USA, 2020.
39. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 833–851.
40. Liu, M.; Fu, B.; Xie, S.; He, H.; Lan, F.; Li, Y.; Lou, P.; Fan, D. Comparison of multi-source satellite images for classifying marsh vegetation using DeepLabV3 Plus deep learning algorithm. Ecol. Indic. 2021, 125, 107562.
41. Gonzalez-Perez, A.; Abd-Elrahman, A.; Wilkinson, B.; Johnson, D.J.; Carthy, R.R. Deep and Machine Learning Image Classification of Coastal Wetlands Using Unpiloted Aircraft System Multispectral Images and Lidar Datasets. Remote Sens. 2022, 14, 3937.
42. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Part III 18; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241.
43. Xu, M.; Xu, H.; Kong, P.; Wu, Y. Remote Sensing Vegetation Classification Method Based on Vegetation Index and Convolution Neural Network. Laser Optoelectron. Prog. 2022, 59, 273–285.
44. Bazi, Y.; Bashmal, L.; Rahhal, M.M.A.; Dayil, R.A.; Ajlan, N.A. Vision Transformers for Remote Sensing Image Classification. Remote Sens. 2021, 13, 516.
45. Zhou, Z.; Li, S.; Wu, W.; Guo, W.; Li, X.; Xia, G.; Zhao, Z. NaSC-TG2: Natural Scene Classification with Tiangong-2 Remotely Sensed Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3228–3242.
46. Zhou, D.; Wang, G.; He, G.; Long, T.; Yin, R.; Zhang, Z.; Chen, S.; Luo, B. Robust Building Extraction for High Spatial Resolution Remote Sensing Images with Self-Attention Network. Sensors 2020, 20, 7241.
47. Immitzer, M.; Neuwirth, M.; Böck, S.; Brenner, H.; Vuolo, F.; Atzberger, C. Optimal Input Features for Tree Species Classification in Central Europe Based on Multi-Temporal Sentinel-2 Data. Remote Sens. 2019, 11, 2599.
48. Yao, Y.; Suonan, D.; Zhang, J. Compilation of 1:50,000 vegetation type map with remote sensing images based on mountain altitudinal belts of Taibai Mountain in the north-south transitional zone of China. Acta Geogr. Sin. 2020, 75, 620–630.
49. Zhang, J.; Yao, Y.; Suonan, D.; Gao, L.; Wang, J.; Zhang, X. Mapping of mountain vegetation in Taibai Mountain based on mountain altitudinal belts with remote sensing. J. Geo-Inf. Sci. 2019, 21, 1284–1294.
50. Wu, T.; Luo, J.; Gao, L.; Sun, Y.; Dong, W.; Zhou, Y.; Liu, W.; Hu, X.; Xi, J.; Wang, C.; et al. Geo-Object-Based Vegetation Mapping via Machine Learning Methods with an Intelligent Sample Collection Scheme: A Case Study of Taibai Mountain, China. Remote Sens. 2021, 13, 249.
51. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. arXiv 2014, arXiv:1411.4038.
52. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. arXiv 2015, arXiv:1512.03385.
53. Zhang, J.; Yao, Y.; Suo, N. Automatic classification of fine-scale mountain vegetation based on mountain altitudinal belt. PLoS ONE 2020, 15, e0238165.
54. Zhang, B. Natural phenomena in mountains: Vertical zones. For. Hum. 2015, 2, 2–4.
55. Zhao, F.; Zhang, B.; Zhu, L.; Yao, Y.; Cui, Y.; Liu, J. Spectra structures of altitudinal belts and their significance for determining the boundary between warm temperate and subtropical zones in the Qinling-Daba Mountains. Acta Geogr. Sin. 2019, 74, 889–901.
56. Zhang, B.; Yao, Y.; Xiao, F.; Zhou, W.; Zhu, L.; Zhang, J.; Zhao, F.; Bai, H.; Wang, J.; Yu, F.; et al. The finding and significance of the super altitudinal belt of montane deciduous broad-leaved forests in central Qinling Mountains. Acta Geogr. Sin. 2022, 77, 2236–2248.
57. Li, J.; Yao, Y.; Liu, J.; Zhang, B. Variation Analysis of the Typical Altitudinal Belt Width in the Qinling-Daba Mountains. Nat. Prot. Areas 2023, 3, 12–25.
Figure 1. The study area.
Figure 2. FCN-ResNet architecture.
Figure 3. Training curves of the experiments. The curves in the top half of the image are the pixel accuracy during model training, and below them are the corresponding losses.
Figure 4. Classification results of the models in the validation area. (a) The 1:100,000 scale vegetation type map used as the ground truth labels; (b) the results of the multi-channel model (Experiment 6); (c) the results of the model with only 2 m winter (Experiment 1); (d) the results of the model with only 16 m winter (Experiment 2); (e) the results of the model with DEM and 2 m winter (Experiment 3); (f) the results of the model with DEM and 16 m winter (Experiment 4); and (g) the results of the model with all 16 m and 2 m (Experiment 5).
Figure 5. Classification results and their heat maps (cultivated vegetation and broadleaved forest). (a) The remote sensing image with 2 m resolution; (b) the ground truth label of the vegetation map; (c) the classification results of the multi-channel model, where the yellow color represents cultivated vegetation, green represents broadleaved forest, blue represents coniferous forest and gray represents the mixed coniferous and broadleaved forest in this experiment; (d–f) are the heat maps of cultivated vegetation, broadleaved forest and coniferous forest, respectively, where the red color represents high probability and blue represents low probability.
Figure 6. Classification results and their heat maps (cultivated vegetation and coniferous forest). (a) The remote sensing image with 2 m resolution; (b) the ground truth label of the vegetation map; (c) the classification results of the multi-channel model, where the yellow color represents cultivated vegetation, green represents broadleaved forest and blue represents coniferous forest in this experiment; (d–f) are the heat maps of cultivated vegetation, broadleaf forest and coniferous forest, respectively, where the red color represents high probability and blue represents low probability.
Figure 7. Comparison of classification results with ground truth. (a) The remote sensing image with 2 m resolution; (b) the ground truth label of the vegetation map; (c) the classification results of the multi-channel model, where the yellow color represents cultivated vegetation, green represents broadleaved forest and blue represents coniferous forest in this experiment; (d–f) are the heat maps of cultivated vegetation, broadleaf forest and coniferous forest, respectively, where the red color represents high probability and blue represents low probability.
Table 1. Remote sensing data used in this study.

Data Type | Sensor | Time | Bands
2 m resolution remote sensing image | ZY3 and GF2 | winter | 4
16 m resolution remote sensing image | GF1 | winter and summer | 4
digital elevation model (DEM) data | ZY3 | -- | 1
Table 2. Settings of the ablation experiments.

Experiment | 2 m Resolution Winter Imagery | 16 m Resolution Winter Image | 16 m Resolution Summer Image | DEM Image
1. Only 2 m winter | ✓ | – | – | –
2. Only 16 m winter | – | ✓ | – | –
3. DEM and 2 m winter | ✓ | – | – | ✓
4. DEM and 16 m winter | – | ✓ | – | ✓
5. 16 m and 2 m | ✓ | ✓ | ✓ | –
6. Multi-channel model | ✓ | ✓ | ✓ | ✓
Table 3. Classification accuracy of vegetation type mapping in the validation area.

Experiments | Correct Pixels (Cultivated Vegetation) | Correct Pixels (Broadleaved Forests) | Correct Pixels (Coniferous Forests) | Correct Total | Total Pixels | PA (%) | MIoU (%)
1. Only 2 m winter | 144,658 | 2,546,833 | 7,654,571 | 10,346,062 | 17,567,654 | 58.9 | 30.5
2. Only 16 m winter | 559,270 | 3,971,499 | 5,851,712 | 10,382,481 | 17,567,654 | 59.1 | 36.1
3. DEM and 2 m winter | 388,856 | 3,439,351 | 7,723,454 | 11,551,661 | 17,567,654 | 65.8 | 39.5
4. DEM and 16 m winter | 459,949 | 5,153,889 | 6,475,331 | 12,089,169 | 17,567,654 | 68.8 | 43.9
5. All 16 m and 2 m | 1229 | 7,099,200 | 724,230 | 7,824,659 | 17,567,654 | 44.5 | 16.7
6. Multi-channel model | 537,760 | 7,468,935 | 7,063,657 | 15,070,352 | 17,567,654 | 85.8 | 65.7