**1. Introduction**

During the last several decades, glacial lakes have increased dramatically in area and number in High Mountain Asia (HMA) due to the ongoing impact of global warming and glacier melting [1]. This has considerably increased the risk of outburst flood hazards. Monitoring and evaluating the dynamics of glacial lakes is therefore of great significance for understanding ecosystem stability and preventing outburst hazards in downstream areas. Fast and accurate mapping of glacial lakes is a prerequisite for any comprehensive investigation of these lakes.

As a unique water resource, glacial lakes have several remarkable characteristics. (1) Small size: small glacial lakes (<0.1 km²) make up the majority of the glacial lakes in HMA. For example, more than 72.7% of the glacial lakes in 2016 were small [2,3].

**Citation:** Zhao, H.; Zhang, M.; Chen, F. GAN-GL: Generative Adversarial Networks for Glacial Lake Mapping. *Remote Sens.* **2021**, *13*, 4728. https://doi.org/10.3390/rs13224728

Academic Editors: Alban Kuriqi and Luis Garrote

Received: 9 October 2021 Accepted: 19 November 2021 Published: 22 November 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Although these small lakes pose a limited threat to downstream regions, they are still a key indicator of climate change dynamics and a major source of uncertainty in glacial lake mapping [4]. (2) Various physical properties: affected by environmental components such as soil, geology, vegetation, and glaciers, glacial lakes show varying degrees of turbidity and coloring in remote sensing imagery. Moreover, some objects, such as mountain shadows and clouds [1], have a spectrum similar to that of glacial lakes. Thus, the spectral characteristics of glacial lakes are complex and vary with environmental conditions. (3) Wide distribution: glacial lakes of different types, sizes, and shapes are widely distributed around glaciers in the alpine regions of Central and South Asia [5], including the Altai Mountains [6], Himalayas [7,8], Tianshan Mountains [9], and Kunlun Mountains [10], as well as the Karakoram-Pamir Plateau [11,12]. All of these characteristics pose great challenges for the automatic and accurate mapping of glacial lakes over very large glaciated areas.

Although much progress has been made in mapping glacial lakes, the mapping methods involved require significant post-processing work and the use of other ancillary data, such as the digital elevation model (DEM) and feature maps. One fundamental problem in glacial lake mapping is that all the features used to highlight glacial lake information are manually designed. This means that while certain spectral or handcrafted features are used, other useful high-level and complex features are ignored. For instance, water indices [13] are the most commonly used spectral features for the detection of glacial lakes, and they are designed as band ratios that involve green/blue (G/B) bands and near-infrared/short-wave infrared (NIR/SWIR) bands. However, many phenomena (such as mountain shadows, melting glaciers, and clouds) generate spectral responses similar to those of glacial lakes, resulting in low mapping accuracy and unavoidable manual correction.

To alleviate the effects of these factors, most semi-automatic methods use auxiliary data to minimize the amount of post-processing required. Song et al. [4] presented a hierarchical image segmentation method to explore the distribution and evolution of glacial lakes in the Southeastern Tibetan Plateau. The method combined the normalized difference water index (NDWI) derived from Landsat TM/ETM+/OLI imagery with DEM-based terrain analysis results to extract glacial lake areas. Li et al. [14] proposed a global–local iterative segmentation algorithm to delineate glacial lake extent using Landsat TM/ETM+ and DEM data. Shen et al. [15] applied an object-oriented classification method to extract glacial lake information using a water extraction decision ruleset. This method, however, requires many experiments to determine which features should be considered and how to set parameter values, such as the segmentation scale, shape index, and NDWI. Bhardwaj et al. [16] designed a lake detection algorithm (LDA), which combined inputs from the moisture index, vegetation index, and NDWI to detect lake pixels and filtered out noise pixels based on the DEM and thermal information. Gao et al. [17] established a lake hydrological network to identify the attributes of each lake in the Third Pole using Landsat images, topographic maps, and DEM data. Wangchuk et al. [1] employed a random forest classifier to map glacial lakes using multi-source optical and radar data, including Sentinel-1 synthetic aperture radar, Sentinel-2 multispectral instrument, and DEM. Zhao et al. [18] integrated the advantages of the threshold segmentation method and the active contour model to improve the efficiency of glacial lake extraction and the removal of mountain shadows with the help of DEM. Li et al. [19] created a two-stage segmentation workflow for mapping glacial lakes. First, the object-oriented method was used to segment the target image into lake, potential lake, and unknown regions. Then the potential lake zone was refined using the watershed algorithm. All of these methods depend on auxiliary data to some extent, and checking and editing the mapping results requires great effort. This significantly limits their use for the fast and accurate extraction of large-scale glacial lake distribution information. Developing a more automatic and less data-dependent mapping method suitable for large glaciated regions is therefore essential for exploring the relationship between climate change and glacial lake change, and for giving early warning of glacial lakes with high outburst risk.
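To make the band-ratio idea concrete, the following is a minimal, dependency-free sketch of the widely used green/NIR form of the NDWI, (green − NIR) / (green + NIR), followed by a simple threshold. The function names, the toy reflectance values, and the threshold of 0.0 are illustrative assumptions and are not taken from the methods reviewed above; a real workflow would operate on full Landsat bands with an array library.

```python
def ndwi(green, nir):
    """NDWI for one pixel: (green - nir) / (green + nir).

    Returns 0.0 when both bands are zero to avoid dividing by zero.
    """
    total = green + nir
    return (green - nir) / total if total else 0.0


def water_mask(green_band, nir_band, threshold=0.0):
    """Flag pixels whose NDWI exceeds `threshold`.

    `threshold` is an illustrative value, not one reported in the paper.
    Both bands are 2D grids (lists of rows) of the same shape.
    """
    return [
        [ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


# Toy 2 x 2 reflectance grids: the left column is water-like (green > NIR).
green = [[0.10, 0.05], [0.12, 0.04]]
nir = [[0.02, 0.30], [0.03, 0.25]]
mask = water_mask(green, nir)
```

A single threshold of this kind cannot separate lakes from mountain shadows or clouds with similar spectral responses, which is why the reviewed methods fall back on DEMs and manual correction.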

With the explosive growth in remote sensing imaging data, many effective data processing methods have been proposed. Among these, deep learning models have attracted considerable attention and shown great potential in extracting high-level information about objects for classification [20], segmentation [21], and generation [22]. To date, there has been scant research that uses deep learning models for glacial lake mapping. Qayyum et al. [23] attempted to map glacial lakes from four-band PlanetScope imagery of the Hindu Kush, Karakoram, and Himalaya (HKKH) region using a U-Net architecture. Wu et al. [24] employed a U-Net-based model to extract the contours of glacial lakes in Southeastern Tibet, with input from Landsat-8 OLI and Sentinel-1A SAR images. Although the pooling operations in the U-Net model can reduce the number of model parameters without changing the image features, they omit some details of the lake boundaries. This is not conducive to the extraction of complex-shaped and small glacial lakes. Considering that the Landsat series of satellites provides the most extensive and longest records for glacial lake mapping, this paper proposes a new solution for glacial lake extraction. We used a deep learning model and Landsat images to facilitate the development of a glacial lake inventory and disaster management in HMA.

As a distinctive class of deep learning model, the generative adversarial network (GAN) has achieved much in image generation [22], classification [25], object detection [26], image super-resolution [27], and image deblurring [28]. However, GANs have rarely been applied to image segmentation framed as a domain transfer task. Compared to other segmentation models, a GAN defines a generator and a discriminator to learn the distribution of real data and generates segmentation masks without distribution assumptions [29]. Using a GAN, Xue et al. [30] proposed the SegAN model, which uses a fully convolutional network in the generator to segment the mask of a brain tumor in an MRI image at the pixel level. Their model had better precision and sensitivity than other state-of-the-art models when tested against the BRATS 2013 and 2015 datasets. Son et al. [31] used a GAN-based model to precisely map vessels in retinal images and obtained good results on the DRIVE and STARE datasets. To improve mapping accuracy and avoid human-interactive processing, in this paper, we propose a novel end-to-end GAN-based architecture for glacial lake mapping (GAN-GL), in which the only input data are remote sensing images. The water attention module and image segmentation module are cascaded in the generator of GAN-GL to focus on lake information. A ResNet backbone is used in the discriminator. To the best of our knowledge, this is the first time that water attention has been used in a deep learning method for glacial lake mapping. Moreover, we built a large-scale glacial lake dataset for training GAN-GL and evaluating its performance. This dataset contains about 4600 Landsat image patches, each cropped around glacial lakes and with dimensions of 256 × 256 × 7. We further divided the dataset into three subsets according to the cropping method: random cropping, uniform cropping, and density cropping. This model greatly improves the segmentation of glacial lakes over a large-scale area with low data dependence. The robustness and relative accuracy of the proposed method were also tested under different environmental conditions using a global–local iterative segmentation algorithm and random forest classification as benchmarks.
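For orientation, the generator–discriminator interplay described above is commonly written as the standard conditional adversarial objective below. This is the generic textbook form, given here only as a sketch: the symbols are assumptions introduced for illustration ($x$ an input image patch, $y$ its reference lake mask, $G$ the generator, $D$ the discriminator), and the loss actually used by GAN-GL may differ.

$$
\min_{G}\,\max_{D}\;\; \mathbb{E}_{(x,\,y)}\big[\log D(x, y)\big] \;+\; \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]
$$

The discriminator is rewarded for scoring real image–mask pairs $(x, y)$ high and generated pairs $(x, G(x))$ low, while the generator is rewarded for producing masks that the discriminator cannot tell apart from reference masks.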

The rest of this paper is organized as follows. Section 2 introduces the collection and statistical analysis of the dataset. In Section 3, we describe the methodology and the architecture of the proposed GAN-GL model. The evaluation metrics and experimental results are given in Section 4. The factors that may influence the mapping performance are discussed in Section 5. Finally, we conclude this work in Section 6.

#### **2. Dataset**

Although many glacial lake inventories have been compiled and published [5,10], the inventory data cannot be used directly as training samples for deep learning models because their properties are inconsistent with how glacial lakes appear in the images. In addition, format transformation and region cropping are needed to comply with the input format of the GAN. In this section, we describe the details of the collection and production of a complete glacial lake dataset. Such a dataset can be used both to train deep learning models for automatic glacial lake mapping and to evaluate their performance.

#### *2.1. Collection of Dataset*

Owing to its moderate spatial resolution (30 m) and continuous record, Landsat imagery has become one of the most extensively used data resources to retrieve glacial lake information. In this study, Landsat-8 OLI imagery was employed as basic data to create the GAN-GL dataset, as shown in Table 1. To minimize the interference from seasonal snow/ice cover and clouds in glacial lake detection, the acquisition times of the images were all between July and early November. During this period, the boundaries of glacial lakes are very clear and stable because of the balanced state of glacier mass gains and losses [32,33]. The High Mountain Asia Glacial Lake Inventory (Hi-MAG) database [10], which mapped the annual glacial lake coverage from 2008 to 2017 at a 30 m resolution using Landsat series satellite imagery, was used to assist in the creation of ground truth labels for each element (glacial lake or non-glacial lake).

**Table 1.** Details of Landsat-8 OLI images used in this study.


Note: E.: East; W.: West; C.: Central.

#### *2.2. Production of the GAN-GL Dataset*

Glacial lakes are generally clustered around glaciers, and their areas are extremely small compared to the background: a Landsat scene contains large areas with no glacial lakes at all. Therefore, 103 tiles of 1024 × 1024 pixels containing glacial lakes were first cropped from the original Landsat-8 OLI images and used as the basis for the subsequent production of the GAN-GL dataset. The spatial distribution of these tiles is shown in Figure 1.

**Figure 1.** Spatial location of High Mountain Asia and the distribution of 103 image tiles (red rectangles), which cover the main mountain ranges.

Glacial lakes are unevenly distributed and vary greatly in size. Many glacial lakes in HMA are too small (<0.1 km²) to be identified, but they account for a large proportion of the total lake area (in Nyainqêntanglha, the area of small glacial lakes accounts for 69.47% of the total area [10]). These small lakes are sensitive indicators of global climate change trends, yet they are easily overlooked in studies of lake evolution in HMA. Moreover, the density of glacial lakes is highly heterogeneous across the glaciated regions. It is relatively high in the ranges of Southwestern Pamir and in the Himalayas, whereas few glacial lakes exist in parts of Western Pamir. The scale and density of glacial lakes thus vary significantly across the HMA region and should be fully considered in the production of a glacial lake dataset. In this study, three forms of image cropping (uniform cropping, density cropping, and random cropping) were used to build a complete glacial lake dataset, as shown in Figure 2. Notably, the density map-based cropping method is proposed here for the first time to fully utilize the spatial and contextual information from glacial lakes and to improve the detection performance of the model.

The following are the detailed steps in the production of the three glacial lake subsets:

GAN-GL-U: Each image tile from the original GAN-GL dataset was uniformly cropped into 16 patches of 256 × 256 pixels. Patches without any lakes were discarded. This subset consists of 683 patches, and each lake appears only once.

GAN-GL-D: We cropped patches of 256 × 256 pixels covering the glacial lakes in each image tile, and then counted the number of glacial lakes and their pixels in each patch. Only patches with more than five lakes and a total lake area greater than 1% of the patch area were retained. In total, 1540 density-cropped patches were acquired.

GAN-GL-R: To create this subset, 50 image patches of 256 × 256 pixels were randomly cropped from each image tile, and only image patches containing glacial lakes were retained. The resulting subset has a total of 2382 patches, and some glacial lakes may appear more than once.
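The three cropping rules above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the authors' implementation: all function names are hypothetical, the label mask is assumed to be a 2D grid in which non-zero values mark lake pixels, random crops are assumed to be drawn with a uniformly sampled top-left corner, and the per-patch lake count for the density rule is assumed to be supplied from elsewhere (for example, from inventory polygons). How candidate windows were placed for density cropping is not specified in the text, so only its retention rule is encoded.

```python
import random


def uniform_windows(tile=1024, patch=256):
    """Top-left corners of the non-overlapping patch grid (16 for 1024/256)."""
    return [(r, c) for r in range(0, tile, patch) for c in range(0, tile, patch)]


def random_windows(n=50, tile=1024, patch=256, rng=None):
    """`n` random top-left corners whose patches lie fully inside the tile."""
    rng = rng or random.Random()
    limit = tile - patch  # largest valid top-left offset (inclusive)
    return [(rng.randint(0, limit), rng.randint(0, limit)) for _ in range(n)]


def lake_pixels(mask, window, patch=256):
    """Number of non-zero (lake) pixels of `mask` inside one patch window."""
    r0, c0 = window
    return sum(
        sum(1 for v in row[c0:c0 + patch] if v)
        for row in mask[r0:r0 + patch]
    )


def keep_if_lake(mask, windows, patch=256):
    """Uniform/random rule: keep windows that contain at least one lake pixel."""
    return [w for w in windows if lake_pixels(mask, w, patch) > 0]


def keep_density_patch(n_lakes, n_lake_pixels, patch=256):
    """Density rule: more than five lakes and lake area above 1% of the patch."""
    return n_lakes > 5 and n_lake_pixels > 0.01 * patch * patch


# Toy example: an 8 x 8 tile with a single lake pixel, cut into 4 x 4 patches.
mask = [[0] * 8 for _ in range(8)]
mask[5][6] = 1
kept = keep_if_lake(mask, uniform_windows(tile=8, patch=4), patch=4)
```

With the sizes given in the text, uniform cropping yields 103 × 16 = 1648 candidate patches (683 retained) and random cropping yields 103 × 50 = 5150 candidates (2382 retained); the 1% area threshold corresponds to more than 655 lake pixels in a 256 × 256 patch.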

**Figure 2.** Schematic diagram showing the three methods of creating the glacial lake subsets from the image tiles. (**a**) Uniform cropping: Image tiles were cropped evenly, and image patches without glacial lakes were discarded. (**b**) Density cropping: Image tiles were cropped according to glacial lake density. (**c**) Random cropping: Image tiles were cropped randomly and image patches without glacial lakes were discarded.

Table 2 lists the statistical results associated with these three subsets. GAN-GL-R and GAN-GL-U have similar values for the average number, the average area of glacial lakes in each patch, and the size of glacial lakes. GAN-GL-D has the highest density of glacial lakes.

**Table 2.** Properties of three glacial lake subsets.

