#### *4.4. Impact of Different Training Scales*

In this section, we examine how the size of the training set affects mapping performance. We trained the Attn+ISeg+ResNet-152 structure on progressively smaller subsets of the training data and evaluated each model on the test data from the GAN-GL-D dataset, as shown in Figure 6. Because the OA varied only slightly, the figure plots the other four indicators against the sample scale. In general, the extraction accuracy of glacial lakes improves steadily as the amount of training data increases, and it is particularly sensitive to the sample scale when the subset is below 60% of the training set. A sufficient number of training samples is therefore conducive to reliable mapping. However, once the ratio of the training set exceeds 60%, the accuracy increases slowly and approaches saturation.
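The subsampling procedure can be sketched as follows. This is a minimal illustration and not the authors' code: the ratios, the patch count, the fixed seed, and the choice to make the subsets nested (each smaller subset contained in the larger ones) are all assumptions, and `nested_subsets` is a hypothetical helper name.

```python
import random


def nested_subsets(sample_ids, ratios, seed=0):
    """Return {ratio: ids}; each smaller subset is a prefix of the larger ones."""
    rng = random.Random(seed)       # fixed seed so the subsets are reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    subsets = {}
    for ratio in ratios:
        n = max(1, round(ratio * len(shuffled)))
        subsets[ratio] = shuffled[:n]
    return subsets


# Hypothetical setup: 1000 training patches, ratios from 20% to 100%.
ratios = [0.2, 0.4, 0.6, 0.8, 1.0]
subsets = nested_subsets(range(1000), ratios)
for ratio in ratios:
    print(f"{ratio:.0%} of the training set: {len(subsets[ratio])} patches")
```

Nesting the subsets means that a change in accuracy between two ratios reflects the added samples rather than a different random draw, which is one reasonable way to read a curve such as the one in Figure 6.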

#### *4.5. Comparison with Other State-of-the-Art Mapping Methods*

#### 4.5.1. Experimental Materials

For a comprehensive evaluation of the robustness of the proposed model (Attn + ISeg + ResNet-152), we compared its performance in mapping glacial lakes over the Eastern Himalayas with that of two state-of-the-art mapping methods: the widely used global–local iterative segmentation algorithm [14] and the classical random forest classification [1]. The Eastern Himalayas was chosen as our test site because this region has a high density of glacial lakes [1] and a high probability of outburst hazards [41]. Ten Landsat-8 OLI images from the year 2017 covering the entire Eastern Himalayas were used for the experiments.

**Figure 6.** Accuracy for the proposed Attn+ISeg+ResNet-152 structure using different ratios of the training sets as input.

The global–local iterative segmentation algorithm has previously been used successfully for glacial lake mapping in mountainous areas. The algorithm consists of two main steps. First, potential glacial lake pixels are delineated using a global-level threshold segmentation of NDWI, coupled with the NIR and SWIR bands, to filter out background and noise pixels whose spectral reflectance is similar to that of glacial lakes. Second, a buffer zone is established around each potential lake, and a local NDWI threshold is used to determine the final lake extent within this buffer zone. Here, the local threshold is calculated based on the rule that the NDWI of glacial lakes and backgrounds conforms to a bimodal distribution. In our experiments, the global thresholds of NDWI (≥ 0.10), NIR (< 0.15), and SWIR (< 0.05) were set according to the literature [4,14,17,42]. The local-level threshold in each buffer zone is computed as follows:

$$Threshold = \frac{\mu_{\text{background}} \cdot \sigma_{\text{water}} + \mu_{\text{water}} \cdot \sigma_{\text{background}}}{\sigma_{\text{water}} + \sigma_{\text{background}}} \tag{10}$$

where *μ*<sub>water</sub> and *μ*<sub>background</sub> are the mean NDWIs of the water and background regions, respectively, and *σ*<sub>water</sub> and *σ*<sub>background</sub> are the variances of the NDWI of the water and background regions, respectively.
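The two steps and Equation (10) can be illustrated with a short sketch. It is not the reference implementation of [14]. The global thresholds are those quoted above, but the function names, the toy pixel values, the use of the population variance for σ, and the use of the global mask as the initial water/background split inside a buffer are assumptions made only for illustration.

```python
from statistics import mean, pvariance

# Global thresholds quoted in the text.
NDWI_MIN, NIR_MAX, SWIR_MAX = 0.10, 0.15, 0.05


def is_potential_lake(ndwi, nir, swir):
    """Global step: a pixel must pass all three thresholds."""
    return ndwi >= NDWI_MIN and nir < NIR_MAX and swir < SWIR_MAX


def local_threshold(water_ndwi, background_ndwi):
    """Equation (10): class means weighted by the other class's variance."""
    mu_w, mu_b = mean(water_ndwi), mean(background_ndwi)
    var_w, var_b = pvariance(water_ndwi), pvariance(background_ndwi)
    if var_w + var_b == 0:          # both classes constant: fall back to midpoint
        return (mu_w + mu_b) / 2
    return (mu_b * var_w + mu_w * var_b) / (var_w + var_b)


def refine_buffer(buffer_pixels):
    """Local step: re-label the (ndwi, nir, swir) pixels of one buffer zone."""
    initial = [is_potential_lake(*p) for p in buffer_pixels]
    water = [p[0] for p, w in zip(buffer_pixels, initial) if w]
    background = [p[0] for p, w in zip(buffer_pixels, initial) if not w]
    if not water or not background:  # nothing to separate in this buffer
        return initial
    t = local_threshold(water, background)
    return [p[0] >= t for p in buffer_pixels]


# Toy buffer zone: three lake-like pixels and three background pixels.
buffer_pixels = [(0.40, 0.05, 0.02), (0.50, 0.04, 0.01), (0.60, 0.03, 0.01),
                 (-0.30, 0.30, 0.20), (-0.10, 0.25, 0.15), (0.10, 0.20, 0.10)]
threshold = local_threshold([0.40, 0.50, 0.60], [-0.30, -0.10, 0.10])
labels = refine_buffer(buffer_pixels)
print(round(threshold, 4), labels)
```

Because each class mean is weighted by the spread of the other class, the threshold lies closer to the mean of the more compact class. In the toy buffer, the water NDWI values are less dispersed than the background values, so the threshold sits nearer the water mean.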

The random forest is a classical ensemble learning method that employs many individual decision trees to vote for the best decision. Owing to the random sampling of input data and the random selection of feature subsets, the method has better robustness and generalization ability than methods that use an individual decision tree. Random forest has been widely applied in the field of lake mapping [1,43]. In this study, we grew 100 trees and randomly selected 1000 pixels from the NDWI, NIR, and SWIR for glacial lakes and non-glacial lakes to train the classifier. Note that, to alleviate the effects of terrain conditions, additional experiments were undertaken by introducing auxiliary ASTER DEM data (with a spatial resolution of 30 m) for the two methods. Topographic shadows were masked using slopes larger than 15° [4,33].
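The slope-based masking mentioned in the last sentence can be sketched as below. The 30 m cell size and the 15° limit come from the text; the central-difference slope estimate, the handling of border cells, and the function names are assumptions, since the source does not state how slope was derived from the DEM.

```python
import math


def slope_degrees(dem, cell_size=30.0):
    """Slope (degrees) of interior cells via central differences; borders stay None."""
    rows, cols = len(dem), len(dem[0])
    slope = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


def steep_mask(slope, max_slope=15.0):
    """True where the slope exceeds the limit; cells without a slope are not masked."""
    return [[s is not None and s > max_slope for s in row] for row in slope]


# Toy DEM: elevation rises 10 m per 30 m cell from west to east.
dem = [[10.0 * c for c in range(4)] for _ in range(4)]
slope = slope_degrees(dem)
mask = steep_mask(slope)
print(round(slope[1][1], 2), mask[1][1])
```

Pixels flagged by `steep_mask` would be excluded from the lake candidates before or after classification. Which order the authors used is not stated, so the sketch leaves that step out.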

#### 4.5.2. Results and Analysis

Mapping glacial lakes at a large scale is a challenging task due to the influence of varied and complex climatic, geological, and terrain conditions. Figure 7 presents the spatial distribution of glacial lakes in the Eastern Himalayas. The results of GAN-GL (without DEM) and the other two methods (with DEM) are shown in the three enlarged images. In Region A, some small glacial lakes have formed around the glaciers, and the proposed GAN-GL model extracts almost all of the lakes without misclassified objects. However, the lake areas obtained by the global–local iterative segmentation algorithm and random forest are affected by a high degree of noise from melting glaciers and parts of shadows, as shown in the blue ellipse. The images in Regions B and C are largely contaminated by mountain shadows, clouds, and cloud shadows, but GAN-GL effectively eliminated the interference from these factors, so the lakes could be easily detected and their details preserved. In contrast, the lake areas detected by the other two methods mistakenly contained vast non-glacial lake regions, most of the glacial lakes were not precisely delineated (see the blue ellipses in Region B, where a lake was divided into many small parts), and the complex structure of the lake boundary, for example where the topography is undulating, was lost (see the blue ellipses in Region C). These results can be attributed to the fact that our GAN-GL model automatically computes numerous mid- and high-level features through convolutional operations and employs an effective training strategy under the two constraints of content loss and adversarial loss to distinguish between different objects. As a pixel-based approach, the global–local iterative segmentation algorithm cannot effectively deal with noise pixels whose spectral values are similar to those of lakes, or with regional heterogeneity. Random forest may have several similar decision trees that mask true results and easily overfit strong noise; this eventually leads to incomplete and noise-polluted extraction results.

Table 5 shows the accuracy assessment of the mapping results over the whole Eastern Himalayas. Except for Recall, the other indicators obtained using the GAN-GL model are extremely high (P = 93.19%; OA = 99.85%; F1 = 73.31%; IoU = 58.46%). This means that most glacial lake pixels can be accurately extracted with only a few commission errors. Although the low Recall indicates that some lakes confused with the background are not detected, GAN-GL balances high accuracy against low noise and performs well on the other indicators. The global–local iterative segmentation algorithm achieved the highest Recall (88.47%) but the lowest Precision (44.81%) since large quantities of background pixels were also mapped. Random forest outperformed the global–local iterative segmentation algorithm for all of the indicators. However, the performance of these two methods was significantly improved with the assistance of DEM, meaning many small glacial lakes were not identified in mountainous regions.
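For reference, the five indicators can be computed from pixel-level confusion counts as in the following sketch. It assumes the standard definitions of Precision, Recall, F1, IoU, and OA; the confusion counts are made up for illustration and are not taken from Table 5. The helper `recall_from_precision_f1` simply inverts the F1 formula and gives only an approximate implied Recall, valid if the reported F1 was computed from the same aggregate Precision and Recall.

```python
def mapping_metrics(tp, fp, fn, tn):
    """Standard pixel-level indicators from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "iou": tp / (tp + fp + fn),
        "oa": (tp + tn) / (tp + fp + fn + tn),
    }


def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R."""
    return f1 * precision / (2 * precision - f1)


# Made-up counts: lakes are rare, so OA stays high even when Recall is modest.
m = mapping_metrics(tp=60, fp=5, fn=40, tn=9895)
for name, value in m.items():
    print(f"{name}: {value:.4f}")

# Recall implied by the reported P and F1 of GAN-GL, under the stated assumption.
implied_recall = recall_from_precision_f1(0.9319, 0.7331)
print(f"implied recall: {implied_recall:.4f}")
```

The toy counts also show why OA alone is uninformative for glacial lake mapping: background pixels dominate, so OA remains close to 100% while Recall and IoU vary substantially.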


**Table 5.** Accuracy assessment of the three mapping methods in the Eastern Himalayas.

**Figure 7.** Distribution of glacial lakes (marked in red contours) overlaid on Landsat-8 imagery of the Eastern Himalayas, and the compared results of the three methods. Note that the results of G-L Seg and random forest were computed using Landsat-8 imagery and DEM. Region A shows some small glacial lakes around the melting glaciers. Region B shows glacial lakes and extensive mountain shadows. Region C shows image interference from clouds and cloud shadows.

#### **5. Discussion**

#### *5.1. Exploration of the Improvement of the Effects of our GAN-GL Model*

To obtain accurate large-scale glacial lake mapping results in HMA, we designed this GAN-based model. As with any deep learning model, it still has some possible limitations, and there are several ways to improve its generalization performance. (1) Sufficient and varied data: In our study, we collected the glacial lake patches from part of HMA in a single year, so some special glacial lakes may not be sampled in our dataset. A sufficient dataset containing lakes that vary in size, color, type, and shape can give the model more lake features and further improve the lake mapping results. (2) Adaptive input image setting: We used the Landsat series as the data source, which includes MSS/TM/ETM+/OLI imagery. These images provide a long time-series record of glacial lakes, which is advantageous for mining lake information. Our model only accepts Landsat OLI imagery as input; therefore, an adaptive input image setting would enhance its scalability to other Landsat data. (3) Hierarchical structure for detecting lakes under scale variation: Scale variation in lake areas hampers the model efficiency when mapping glacial lakes in large-scale regions. Multi-level feature concatenation is an instrumental design for small object detection, but it has a high computational cost. A hierarchical structure that detects both small lakes and large lakes has great potential for large-scale glacial lake mapping.

#### *5.2. Performance for Different Lake Sizes*

Small lakes account for a large share of the glacial lakes in HMA. Statistically, 15,456 glacial lakes (72.73%) in the 2016 HMA mapping results were smaller than 0.1 km<sup>2</sup> [9]. These lakes are highly variable and sensitive to climate change, but they are hard to identify because they are easily confused with the background.

To explore the extraction performance of our model (Attn + ISeg + ResNet-152) for different lake sizes, we counted the glacial lakes of various sizes detected with our GAN-GL dataset and GAN-GL-D dataset and assessed their accuracy; the results are given in Table 6.


**Table 6.** Statistical results for lakes of different sizes using the proposed model.

\* Note: The accuracies of lakes smaller than 0.01 km<sup>2</sup> were not computed since the Hi-MAG only considered lakes greater than nine pixels (>0.0081 km<sup>2</sup>).

The smallest lake detected by the GAN-GL is only one pixel (area = 0.0009 km<sup>2</sup>), far smaller than the smallest lakes in the Hi-MAG (nine pixels). This also explains why the proportion of small lakes (<0.1 km<sup>2</sup>) is greater than that in Hi-MAG. Considering that some isolated lake pixels may be produced when a lake area is split at the edge of cropped image patches, we kept these small lakes without conducting accuracy assessments. According to Table 6, our glacial lake mapping results are almost consistent with the ground truth when the lake area is greater than 0.01 km<sup>2</sup>.
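The size statistics can be reproduced in outline by converting pixel counts to areas and binning them, as sketched below. The per-pixel area of 0.0009 km<sup>2</sup> is taken from the text; the three size classes use only the 0.01 km<sup>2</sup> and 0.1 km<sup>2</sup> limits mentioned in this section and are not necessarily the classes of Table 6, and the list of lake pixel counts is hypothetical.

```python
from collections import Counter

PIXEL_AREA_M2 = 900  # 0.0009 km2 per pixel, as stated in the text


def size_class(n_pixels):
    """Assign a lake to a size class; areas are kept in whole m2 to avoid rounding."""
    area_m2 = n_pixels * PIXEL_AREA_M2
    if area_m2 < 10_000:        # below 0.01 km2: kept, but not assessed
        return "<0.01 km2"
    if area_m2 < 100_000:       # 0.01 km2 up to, but not including, 0.1 km2
        return "0.01-0.1 km2"
    return ">=0.1 km2"


# Hypothetical lakes, given as the number of pixels in each detected polygon.
lake_pixel_counts = [1, 9, 12, 111, 112, 2500]
counts = Counter(size_class(n) for n in lake_pixel_counts)
for label, n in counts.items():
    print(f"{label}: {n} lakes")
```

Under this convention, a nine-pixel lake (0.0081 km<sup>2</sup>) falls below the 0.01 km<sup>2</sup> limit, and a lake needs at least 12 pixels to enter the smallest class for which accuracy is assessed.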
