**Remote Sensing for Climate Change**

Editor

**Xander Wang**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editor* Xander Wang University of Prince Edward Island Canada

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Remote Sensing* (ISSN 2072-4292) (available at: https://www.mdpi.com/journal/remotesensing/special_issues/RS_for_Climate_Change).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-6628-3 (Hbk) ISBN 978-3-0365-6629-0 (PDF)**

© 2023 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **Contents**


across China under Climate Change Using ERA5-Land Reanalysis Dataset Reprinted from: *Remote Sens.* **2022**, *14*, 2400, doi:10.3390/rs14102400


## **About the Editor**

#### **Xander Wang**

Dr. Xander Wang is an Associate Professor in the School of Climate Change and Adaptation at the University of Prince Edward Island (UPEI). He is also the Director of Climate Smart Lab in the Canadian Centre for Climate Change and Adaptation. Dr. Wang's research is mainly focused on regional climate modeling, climate downscaling, hydrological modeling and flooding risk analysis, energy systems modeling under climate change, climate change impact assessment and adaptation studies, GIS and remote sensing applications, and big data analysis and visualization.

### *Editorial* **Remote Sensing Applications to Climate Change**

**Xander Wang 1,2**


#### **1. Introduction**

Climate change research remains a challenging task, as it requires vast quantities of long-term data to investigate the past, present, and future scenarios of Earth's climate system and other biophysical systems at global to local scales. Both traditional ground-based observation methods and remote-sensing technologies are available options for gathering data for climate change research. Observations from weather stations have been widely used to study climate change over long periods of time. However, due to the scarcity of point-based weather observations, our understanding of the Earth's changing climate is very limited. This impedes the advancement of our knowledge of the Earth's climate system and our ability to develop well-suited climate models to simulate future climate change, which further results in considerable uncertainties associated with future climate projections. Thus, the determination of a method for quantifying and minimizing these uncertainties is quickly becoming one of the most challenging issues yet to be addressed by climate change impact assessment and adaptation studies. Remote sensing offers a new method for observing the Earth's climate system with continuous and high-resolution spatial coverage through satellite-based, aircraft-based, or drone-based sensor technologies. This can significantly improve our understanding of climate change and its potential impacts at global, regional, and local scales. The data collected with remote-sensing technologies can also be used to validate our climate models, improve our knowledge of the physical and dynamic processes of the climate system, and help us project future climate change and its impacts with minimized uncertainties.

This Special Issue intends to capture the latest research advances regarding remote-sensing technologies and their applications in climate change research. Sixteen original research articles authored by one hundred and five researchers were published in this Special Issue, presenting recent advances in remote-sensing technologies (two articles) and the application of remote-sensing technologies to climate change modelling (five articles), monitoring (six articles), and impact assessment (three articles). While the articles span multidisciplinary perspectives and methodologies, they are clustered into four themes.

#### **2. The Advances in Remote-Sensing Technologies**

The continuous development of remote-sensing technologies, including advanced satellites and new procedures for large-scale data processing, is essential for increasing the accuracy and reliability of climate change research based on remote-sensing data. Meftah et al. [1] present the technological development of Ultraviolet and Infrared Sensors at high Quantum efficiency onboard a small Satellite (UVSQ-SAT). They also present the first in-orbit observations mapping the solar radiation reflected by the Earth and the outgoing longwave radiation at the top of the atmosphere throughout February 2021. Their study shows the feasibility of using a miniaturized satellite to measure the Earth's energy imbalance (EEI) over a short period.

**Citation:** Wang, X. Remote Sensing Applications to Climate Change. *Remote Sens.* **2023**, *15*, 747. https://doi.org/10.3390/rs15030747

Received: 16 January 2023 Accepted: 27 January 2023 Published: 28 January 2023

**Copyright:** © 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

While satellite technologies are improving, the performance of remote-sensing data analysis techniques remains of great concern. Basheer et al. [2] evaluate the land-use/land-cover (LULC) classification performance of two commonly used platforms (i.e., ArcGIS Pro and Google Earth Engine) with different satellite datasets (i.e., Landsat, Sentinel, and Planet) through a case study concerning the city of Charlottetown, Canada, over the period of 2017 to 2021. The study provides the scientific basis for the selection of remote-sensing classifiers and satellite imagery with which to develop accurate LULC maps.

#### **3. Climate Change Modelling**

The application of remote sensing to climate change modelling constitutes a crucial domain in climate research. This Special Issue includes five articles presenting regional-level climate change modelling studies using different applications of remote-sensing data. Yan et al. [3] explore potential changes in future extreme precipitation events in China based on the Coupled Model Intercomparison Project Phase 6 (CMIP6) (under the SSP2-4.5 scenario), using a machine learning approach to integrate and fit multiple models. The study estimates the distribution and trends in the precipitation amount (PRCPTOT), very heavy precipitation days (R20mm), extreme precipitation intensity (SDII95), extreme precipitation amount (R95pTOT), maximum consecutive 5-day precipitation (Rx5day), and precipitation intensity (SDII) for the early 21st century (2023–2050), mid-21st century (2051–2075), and late 21st century (2076–2100). Gnitou et al. [4] adopt a two-way approach to CORDEX-CORE RegCM4-7 seasonal precipitation simulations' Added Value (AV) analysis over Africa with the aim of quantifying the potential improvements introduced by a downscaling approach at high- and low-resolution using satellite-based observational products. Zhou et al. [5] use Regional Climatic Model (RegCM) simulations to downscale the boundary conditions of Geophysical Fluid Dynamics Laboratory Earth System Model Version 2M (GFDL-ESM2M) over the Prairie Provinces, and extract the daily mean, maximum, and minimum temperatures from the historical and future climate simulations. The study investigates temperature variations in two future periods (i.e., 2036 to 2065 and 2065 to 2095) compared to the baseline period (i.e., 1985 to 2004). Lu et al. [6] explore the long-term spatial and seasonal variations of three dominant variables with respect to the water cycle (i.e., precipitation, evapotranspiration, and runoff) in China using the Regional Climate Model system (RegCM) developed by the International Centre for Theoretical Physics. The research compares the simulation results with remote-sensing data and gridded observations to validate the model's performance. Ahmad et al. [7] present a two-dimensional (2D) hydrodynamic model combined with remote sensing (RS) and a geographic information system (GIS)-based approach to generate additional flood characteristic maps (e.g., flood velocity, duration, arrival time, and recession time) for the transboundary river Deg Nullah in Pakistan. The study simulates the extreme flood event that occurred in the study area in 2014 to evaluate the performance of the approach.

#### **4. Monitoring Climate Change**

Remote-sensing technologies are now being extensively applied to climate change monitoring at the global, regional, and local scales at an unprecedented rate, especially where ground observation data are scarce. Accordingly, six articles published in this Special Issue demonstrate the remote-sensing-based, long-term monitoring of climate parameters (e.g., precipitation, temperature, etc.) and glaciers. Lu et al. [8] explore possible causes for the interdecadal shift in the interannual variability in summertime precipitation (IVSP) over South China (SC) following the mid-2000s. The study uses climate datasets such as the Precipitation Reconstruction dataset (from NOAA), monthly atmospheric circulation (from NCEP/NCAR), a daily outgoing longwave radiation (OLR) dataset (from NOAA), and the NOAA OI SST V2 High-Resolution Dataset to analyze summertime precipitation variability over South China. Ran et al. [9] present a deep-learning algorithm for fog detection at dawn and dusk under terrain restrictions and an enhanced channel domain attention mechanism (DDF-Net) using advanced Himawari-8 imager data (H8/AHI). Satellite- and ground-based observational climate data concerning northern China, including the Inner Mongolia Plateau and the Loess Plateau, during the winter months (November to December) of 2015 to 2017 were used in this study. Fan et al. [10] employ downscaled climate data using machine learning algorithms and develop a Batch Gradient Descent Linear Regression model to calculate the contributions of temperature and precipitation to runoff in data-scarce high mountains. A case study of six mountainous basins originating from the Tianshan Mountains in Northwest China is used to demonstrate the application of this novel approach. Li et al. [11] demonstrate the use of ERA5-Land, a reanalysis dataset with high spatial and temporal resolution, to quantify the trends and variations in Frost-Free Periods (FFP) and Frost Days (FD) across China from 1950 to 2020. Zhou et al. [12] investigate glacier velocity changes and glacier flow patterns in the Himalayas over the period from 1999 to 2018 using 220 scenes of Landsat-7 panchromatic images taken between 1999 and 2000 and Sentinel-2 panchromatic images taken between 2017 and 2018. Wang et al. [13] analyze the spatial and temporal variations in the glaciers in the Ebi Lake basin during the period from 1964 to 2019 based on the first and second Chinese Glacier Inventories (CGI) and remote-sensing data. The study also investigates the response of glaciers to the warming climate by analyzing digital elevation modeling and meteorology.

#### **5. Climate Change Impact Assessment**

The assessment of climate change's impacts on biophysical systems is a highly complex area of research wherein remote-sensing technologies are widely used to evaluate the changes in these systems over time. In this Special Issue, three articles present case studies demonstrating integrated impact assessments with the aid of remote-sensing data. Wang et al. [14] investigate the changes in vegetation across the Three-Rivers Headwaters Region (TRHR) (the Yangtze, the Yellow, and the Lancang (Mekong) rivers) by mapping the normalized difference vegetation index (NDVI) over the growing season from 1982 to 2015. Moreover, the study examines how the vegetation cover has changed with the changing temperature and precipitation at different altitudes of the TRHR region throughout the past few decades. Guild et al. [15] estimate forest loss within Kutai National Park (KNP) in Indonesia precipitated by illegal logging and wildfires over various time periods since 1997 using an extensive catalogue and the processing power of the Google Earth Engine. Comparing time-series estimates of precipitation, the ENSO index, burned area, and forest loss, the paper demonstrates that the risk of fire within KNP mainly depends on drought severity, and that the rates of non-fire- (gradual) and fire-related (extreme) forest loss threaten the remaining forests of this National Park. Chen et al. [16] investigate the relative importance of plant nutrient traits and soil nutrient availability with respect to mediating the climatic sensitivity of desert scrub biomes in the Tibetan Plateau. This study analyzes the vegetation sensitivity index (VSI) of desert scrub for the period from 2000 to 2015 and field measurements of the nutrients in the soil and plant leaves from research conducted in 2016 for seven types of desert scrub communities in the Qaidam Basin (NE Tibetan Plateau). 
Multiple linear and structural equation models were used to reveal how leaf and soil nutrient regimes affect desert scrubs' sensitivity to climate variability.

This collection of articles addresses some of the knowledge gaps in the field of 'remote sensing applied to climate change.' I hope this will encourage further investigation in this area and thus improve the performance of remote-sensing technologies and data analysis techniques as well as widen applications in research related to climate change modelling, monitoring, and impact assessment.

**Funding:** This research was funded by the Natural Sciences and Engineering Research Council of Canada, the New Frontiers in Research Fund, and the Government of Prince Edward Island.

**Acknowledgments:** I would like to thank all the authors for their contributions and all the reviewers for their valuable comments and feedback.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

#### *Article*

## **Long-Term Tibetan Alpine Vegetation Responses to Elevation-Dependent Changes in Temperature and Precipitation in an Altered Regional Climate: A Case Study for the Three Rivers Headwaters Region, China**

**Keyi Wang 1,2,3, Yang Zhou 2,3,\*, Jingcheng Han 2,3, Chen Chen 3,4 and Tiejian Li 3,4,5**


**Abstract:** Recent studies offer more evidence that the rate of warming is amplified with elevation, indicating thereby that high-elevation ecosystems tend to be exposed to more accelerated changes in temperature than ecosystems at lower elevations. The phenomenon of elevation-dependent warming (EDW), as one of the regional climate-change impacts, has been observed across the Tibetan Plateau. Studies have often found large-scale greening trends, but the drivers of vegetation dynamics are still not fully understood in this region, such that the local implications of vegetation change have been infrequently discussed. This study was designed to quantify and characterize the seasonal changes in vegetation across the Three Rivers Headwaters Region (TRHR), where the land cradles the headwaters of the Yangtze, the Yellow, and the Lancang (Mekong). By mapping the normalized difference vegetation index (NDVI) over the growing season from 1982 to 2015, we were able to evaluate seasonal changes in vegetation cover over time. The results show a slightly increased tendency in green vegetation cover, which could possibly be attributed to sustained warming in this region over the past three decades, whereas a decline in the green-up rate with elevation was found, indicating an inconsistent trend of vegetation greening with EDW. The cause of the green-up rate decline at high elevations could be linked to the reduced soil water availability induced by the fast increase in warming rates associated with EDW. The findings of this study have important implications for devising adaptation strategies for alpine ecosystems in a changing climate.

**Keywords:** elevation-dependent warming; climate change; vegetation greening; NDVI; Three Rivers Headwaters Region

#### **1. Introduction**

Climate change has become a recognized cause of major alterations to natural ecosystems, with many climate-driven changes already being observed in biotic community structure and plant species composition [1,2]. Growing evidence suggests that the rate of warming is amplified with elevation, such that high-elevation ecosystems tend to be exposed to more accelerated changes in temperature than ecosystems at lower elevations [3–5]. The phenomenon of elevation-dependent warming (EDW), as one of the regional climate-change impacts, has been observed and recognized as a contributing cause of shifts in species distribution, hydrological regimes, and biochemical cycles in high-elevation regions [6]. Alpine ecosystems are known to be particularly susceptible to climate change because of their unique thermodynamic properties [7]. In light of the vulnerability of ecosystem services in alpine regions, e.g., biodiversity, water cycling, and carbon sequestration, shifts in ecosystem composition, structure, and function are likely to have far-reaching social and economic implications for local and regional areas [8]. Vegetation is a vital part of healthy, functioning ecosystems, and serves as a foundation for terrestrial food webs and habitat for animals [9,10]. It helps to cycle energy and nutrients throughout an ecosystem as well as improving water quality and reducing soil erosion [11–13]. Therefore, monitoring vegetation change across remote regions and quantifying ecosystem feedbacks to climate change has become a topic of increasing interest among academics in recent years [14,15]. Rapid technological advancement has accelerated the growth of remote sensing and enabled a more comprehensive assessment of vegetation change across a variety of spatial and temporal scales [16–20]. The normalized difference vegetation index (NDVI) is a standardized metric that describes the difference between visible and near-infrared reflectance of vegetation cover, allowing for more detailed information that links vegetation greening patterns to drivers [21]. Moreover, because of its simple estimation, high availability, and noise elimination, the NDVI has been widely used for regional-to-global-scale vegetation monitoring and assessment [22–26].

**Citation:** Wang, K.; Zhou, Y.; Han, J.; Chen, C.; Li, T. Long-Term Tibetan Alpine Vegetation Responses to Elevation-Dependent Changes in Temperature and Precipitation in an Altered Regional Climate: A Case Study for the Three Rivers Headwaters Region, China. *Remote Sens.* **2023**, *15*, 496. https://doi.org/10.3390/rs15020496

Academic Editor: Inge Jonckheere

Received: 30 November 2022 Revised: 10 January 2023 Accepted: 10 January 2023 Published: 13 January 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The NDVI has been successfully used by many researchers to measure trends in greening at northern latitudes and high elevations [27–30]. In general, the interactions between different climatic variables, such as precipitation, temperature, and solar-radiation-driven evapotranspiration, could affect vegetation greenness [31]. High warming rates have been reported as the essential driver of vegetation growth [32]. However, different results may be obtained for different terrestrial characteristics and regional climate environments. For instance, Angert et al. found drought-induced reduction in photosynthesis throughout the growing season at both middle and high latitudes, indicating that the relationship between temperature changes and vegetation growth may change with elevation [33]. Trujillo et al. demonstrated that vegetation greening trends varied with elevation and maximum greenness could be observed in mid-latitude mountain regions [34]. Piao et al. suggested that vegetation growth was generally positively correlated with high warming rates but the correlation could change over time following alterations in other environmental factors, particularly in areas above a latitude of 30 degrees north [35]. Kumari et al. indicated that vegetation growth was more related to evapotranspiration than precipitation in the UKR Basin in the Himalayas [25]. These findings indicate that the driving mechanism of the correlation between elevation-dependent warming (EDW) and vegetation greening is not yet fully understood and that there are large uncertainties regarding climate change impacts at higher elevations. Given the knowledge gaps in the current understanding of EDW and high-elevation vegetation change, expanded efforts to shed further light on this research topic could advance the understanding of the exact driving mechanism.

The Three Rivers Headwaters Region (TRHR) lies in the hinterland of the Tibetan Plateau, and its 363,000 km<sup>2</sup> of land cradles the headwaters of the Yangtze, the Yellow, and the Lancang (Mekong). This region is also known as China's water tower, providing runoff for these three rivers as well as for several hundred million people downstream. Specifically, the TRHR provides 25% of the annual runoff of the Yangtze, 49% of the Yellow, and 15% of the Lancang [36]. The alpine ecosystem of the TRHR is characterized by high biodiversity, including endemic species not found anywhere else, and a high-plateau climate, meaning large diurnal and monthly temperature variations across the region. The phenomenon of EDW, as one of the regional climate-change impacts, has been commonly observed here, possibly owing to the warming rate in this region, which is twice as high as the global average [37]. The increased rate of warming could have a profound effect on the biosphere processes in the TRHR, with many and diverse impacts on local biological resources, e.g., common vegetation and habitat types. In addition, the distinctive landform and geologic characteristics have also made the TRHR highly vulnerable to climate change; for example, soils in this region are thin and coarse-textured, making them more susceptible to soil erosion [38]. The TRHR is of significant value to nature conservation and ecological services within regional and national contexts, and has become a hotspot of regional climate concerns due to the escalating number of droughts, vegetation degradation, and extreme-climate-related conditions. Additionally, further research on this region can also improve the understanding of ecosystem dynamics and vegetation change at high altitudes [8,39–41]. However, some studies within this region also find positive ecological feedbacks associated with the local anthropogenic and climatic changes. For example, grassland productivity has been found to be significantly improved, possibly as a result of successful efforts to implement alpine conservation and ecological restoration programs as well as consistent increases in temperature [42,43]. Given the spatial heterogeneity of vegetation change in high-elevation regions, researchers often focus their attention on ecological and environmental impacts with regard to specific factors, e.g., climate change, grazing activities, water-use patterns, and environmental conservation programs [43–45]. However, another factor in vegetation change, also driven by EDW, has not been studied adequately in the TRHR. As noted above, many studies have found that EDW plays a potentially important role in altering vegetation growth patterns at higher elevations. Therefore, there is a need to ensure adequate analysis and interpretation of EDW in this hotspot region. In addition, the TRHR is composed of a variety of vegetation cover types, including grasslands, barren lands, forests, croplands, wetlands, etc. In light of the diverse vegetation cover types, the vegetation patterns and dynamics within the TRHR may also be altered fundamentally by warmer temperatures, but few researchers have discussed the potential impacts of vegetation composition on vegetation change in this region. It is therefore imperative that researchers make new efforts to advance the understanding of the internal vegetation variability across vegetation types and the external drivers associated with spatial heterogeneity.

This study was designed to quantify and characterize the seasonal changes in vegetation using more than 30 years of remote sensing data across the TRHR so as to help further the understanding of the role of EDW in changing alpine vegetation growth patterns in this region. Specifically, our research steps included: (i) investigating the interannual variability in the growing-season NDVI (NDVIgs) from May to September across the TRHR between 1982 and 2015; (ii) examining the correlation between NDVIgs and elevation as well as driving forces behind changes in NDVIgs using multiple statistical approaches; (iii) assessing the effects of temperature and precipitation changes on vegetation over time so as to identify the climate variables that most affect vegetation productivity; and (iv) exploring the linkages between vegetation composition and greening trends to help further the understanding of NDVIgs change across vegetation types. Results from this study will be beneficial for advancing the understanding of EDW in high-elevation regions, e.g., the TRHR, and of particular value to the local authorities in evaluating ecological and environmental conservation programs.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The TRHR (31°39′–37°10′ N, 89°24′–102°27′ E) is located in the northeastern part of the Qinghai–Tibet Plateau, with a total area of about 363,000 km<sup>2</sup>. The elevation in the TRHR ranges from about 2000 m in the northeastern corners to above 6800 m in the ridges of the western mountains, and the average elevation of the region is nearly 4500 m (Figure 1). This region has a typical continental climate with warm, humid summers and cold, arid winters [46]. Additionally, there is significant temporal variability in precipitation and temperature across the TRHR. For instance, about 60% of the region's annual rainfall occurs in the summer months, and the monthly average temperature is above 0 °C only during the growing season, from May to September [47]. Over the past decades, the TRHR has experienced a striking change in local climate and been confronted with a series of climate-related issues, such as grassland degradation, soil erosion, glacier retreat, and lake and wetland decline [48]. Therefore, central and local governments have long undertaken ecological interventions to reverse the degradation of local ecosystems and to help them regain their ecological functionality in a changing climate. The central government has also set aside protected land for the Three-Rivers Nature Reserve in this region to preserve the natural heritage of the Tibetan Plateau.

**Figure 1.** Location of study area in the Qinghai–Tibet Plateau, China. The TRHR cradles the headwaters of the Yangtze, the Yellow, and the Lancang, and is known as China's water tower. The elevation in the area ranges from 1961 m to 6876 m. The average elevation is 4484 m, but most of the area is above 4500 m.

#### *2.2. Remote Sensing Data*

#### 2.2.1. Land Cover Data

The land cover data were derived from the Finer Resolution Observation and Monitoring of Global Land Cover dataset (FROM-GLC 2017v1, http://data.ess.tsinghua.edu.cn/ (accessed on 1 December 2021)). The FROM-GLC dataset is the first 30 m resolution global land cover product generated using Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) data, and is publicly accessible from Tsinghua University. As shown in Figure 2b, grassland and barren land are the most common land cover types in the TRHR, together accounting for about 94% of the region; the other categories account for the remaining 6%.

#### 2.2.2. NDVI Data

We used the Global Inventory Modeling and Mapping Studies (GIMMS) NDVI3g dataset in this study. This dataset was generated from several Advanced Very High Resolution Radiometer (AVHRR) sensors onboard the U.S. National Oceanic and Atmospheric Administration (NOAA) series satellites at 0.083° resolution for the years 1982 through 2015. The GIMMS-NDVI3g dataset has been corrected for sensor degradation, inter-sensor differences, cloud cover, solar zenith angle, etc., and has been widely shown to be suitable for assessing temporal changes in vegetation [49,50]. The monthly NDVI data were calculated from the semi-monthly values using the maximum-value composite method, which greatly reduces the impact of clouds and aerosols on the retrieved vegetation index [51]. The mean growing-season NDVI (NDVIgs) was used to map the vegetation greenness changes across the TRHR (Figure 2a).
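The compositing and averaging steps described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the array layout (24 semi-monthly layers per year, ordered from January) and the function names `monthly_mvc` and `growing_season_mean` are hypothetical, and the 0.01 masking threshold is taken from the Figure 2 caption rather than from any published processing code.

```python
import numpy as np

def monthly_mvc(semi_monthly):
    """Maximum-value composite: 24 semi-monthly layers -> 12 monthly layers.

    semi_monthly has shape (24, rows, cols), ordered Jan-a, Jan-b, Feb-a, ...
    (an assumed layout). Each monthly layer is the per-pixel maximum of the
    two composites that fall in that month.
    """
    arr = np.asarray(semi_monthly, dtype=float)
    if arr.shape[0] != 24:
        raise ValueError("expected 24 semi-monthly layers for one year")
    pairs = arr.reshape(12, 2, *arr.shape[1:])
    # np.fmax keeps the valid value when only one of the two is NaN.
    return np.fmax(pairs[:, 0], pairs[:, 1])

def growing_season_mean(monthly, first_month=5, last_month=9, min_ndvi=0.01):
    """Mean NDVI over May-September; pixels below min_ndvi become NaN."""
    season = monthly[first_month - 1:last_month]
    ndvi_gs = season.mean(axis=0)
    return np.where(ndvi_gs < min_ndvi, np.nan, ndvi_gs)
```

Applying `monthly_mvc` to each of the 34 years and then `growing_season_mean` to the result would give one NDVIgs map per year, which is the quantity the trend analysis operates on.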

**Figure 2.** (**a**) The 34-year average growing-season NDVI greenness map for the TRHR (the pixels with NDVIgs values smaller than 0.01 were deemed to be non-vegetation areas and displayed as blank); (**b**) the land cover map for the TRHR, represented by nine land cover categories.

#### 2.2.3. Climatic Data

The monthly precipitation data were obtained from the Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) dataset at 0.05° resolution for 1982 to 2015. The monthly temperature data were derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) dataset at 0.125° resolution for 1982 to 2015. To allow pixel-by-pixel comparison with the vegetation data, both climatic datasets were interpolated to 0.0833°, the resolution of the NDVI data.
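The regridding step can be illustrated as follows. The text does not state which interpolation scheme was used, so bilinear interpolation is an assumption here, and the helper names `resample_bilinear` and `target_axis` are hypothetical. The sketch relies only on `numpy.interp` and exploits the fact that bilinear interpolation on a regular grid is separable.

```python
import numpy as np

def resample_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a 2-D field between two regular lat/lon grids.

    src_lat and src_lon must be strictly increasing cell-centre coordinates.
    Target points outside the source extent take the nearest edge value.
    """
    field = np.asarray(field, dtype=float)
    # Interpolate along longitude, one source row at a time...
    along_lon = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # ...then along latitude, one target column at a time.
    return np.array([np.interp(dst_lat, src_lat, col) for col in along_lon.T]).T

def target_axis(start, stop, step=1.0 / 12.0):
    """Cell centres of a grid with the given spacing (1/12 deg is about 0.0833 deg)."""
    n = int(round((stop - start) / step))
    return start + step * (np.arange(n) + 0.5)
```

For the finer CHIRPS grid, moving to 0.0833° is a coarsening step, so block averaging might be preferable to point interpolation in practice; the sketch above covers only the interpolation case.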

#### 2.2.4. Evapotranspiration Estimation from GLEAM

The monthly evapotranspiration (ET) and potential ET (PET) data were obtained from the Global Land Evaporation Amsterdam Model (GLEAM) dataset at 0.25° resolution for 1982 to 2015, which can reasonably capture the evaporation variability in all vegetation types and climate conditions [52,53]. GLEAM-ET is estimated by using reanalysis net radiation and air temperature, satellite-, reanalysis-, and gauge-based precipitation, satellite-based vegetation optical depth, and snow water equivalents. The *ET* modeling algorithm is defined by Martens et al. [52] as follows:

$$ET = E_p S + E_i \tag{1}$$

where *E<sub>p</sub>* is the potential evapotranspiration, estimated using the Priestley and Taylor equation driven by observations of surface net radiation and near-surface air temperature; *S* is the evaporative stress factor, calculated from the observations of microwave vegetation optical depth (VOD) and estimates of root-zone soil moisture; and *E<sub>i</sub>* is the interception loss, calculated using a Gash analytical model.
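As a small worked illustration of Equation (1), the function below evaluates the expression for given inputs. It does not reproduce the GLEAM algorithm itself: the function name `gleam_style_et` is hypothetical, the numbers in the usage note are made up, and the check that the stress factor lies between 0 and 1 is an assumption about how *S* is normally defined.

```python
def gleam_style_et(potential_et, stress_factor, interception_loss):
    """Evaluate Equation (1): ET = E_p * S + E_i.

    All fluxes must share the same units (e.g., mm per month). The stress
    factor S is assumed to be a dimensionless value in [0, 1].
    """
    if not 0.0 <= stress_factor <= 1.0:
        raise ValueError("stress factor S is expected to lie in [0, 1]")
    return potential_et * stress_factor + interception_loss
```

For instance, a hypothetical month with a potential evapotranspiration of 100 mm, a stress factor of 0.6, and an interception loss of 5 mm would give an actual evapotranspiration of about 65 mm.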

The data used in this study are summarized in Table 1.

**Table 1.** Summary of data sources used for this study.


#### 2.2.5. Methods

Multiple analysis methods were used in this study to reflect the characteristics of NDVI trends at different elevations. Firstly, the non-parametric Mann–Kendall (MK) test was used to evaluate long-term changes in vegetation cover and to identify greening patterns across the TRHR. Previous studies indicated that serial correlation in hydrological or climatic series may result in larger standard errors and lead to misleading trend estimates [54–56]. Thus, the trend-free pre-whitening (TFPW) procedure was integrated with the MK test to improve the reliability of trend analysis [55]. After application of the TFPW-MK test, the trend slope was estimated as follows:

$$b = \operatorname{Median}\left(\frac{x_j - x_i}{j - i}\right), \quad \forall\, i < j \tag{2}$$

where *b* is the trend slope, and *xi* and *xj* denote the *i*th and *j*th data values, respectively.
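The trend procedure can be sketched end to end as follows. This is a minimal illustration, assuming the commonly described TFPW sequence (estimate the slope, detrend, remove lag-1 autocorrelation, restore the trend, then apply the MK test) and a normal approximation without tie correction; the function names are hypothetical and the study's own implementation may differ in detail.

```python
import math
from itertools import combinations
from statistics import mean, median


def sen_slope(x):
    """Median of pairwise slopes (x_j - x_i) / (j - i), i < j (Equation (2))."""
    return median((x[j] - x[i]) / (j - i)
                  for i, j in combinations(range(len(x)), 2))


def lag1_autocorr(x):
    """Lag-1 serial correlation coefficient; 0.0 for a constant series."""
    m = mean(x)
    den = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(len(x) - 1))
    return num / den if den else 0.0


def mk_test(x):
    """Mann-Kendall S, Z, and two-sided p (normal approximation, no ties)."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i, j in combinations(range(n), 2))
    var = n * (n - 1) * (2 * n + 5) / 18
    z = 0.0 if s == 0 else (s - math.copysign(1, s)) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


def tfpw_mk(x):
    """Trend-free pre-whitening followed by the MK test; returns (b, z, p)."""
    b = sen_slope(x)
    detrended = [v - b * t for t, v in enumerate(x)]
    r1 = lag1_autocorr(detrended)
    # Pre-whiten the residuals, then add the trend back (first value is lost).
    blended = [detrended[t] - r1 * detrended[t - 1] + b * t
               for t in range(1, len(x))]
    _, z, p = mk_test(blended)
    return b, z, p
```

In a pixel-based analysis, `tfpw_mk` would be applied to the 34-value NDVIgs series of each pixel, and pixels with `p < 0.05` would be flagged as significant.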

Next, by using both average-data and pixel-based methods, the trend changes in the NDVI across the region were linked to drivers such as elevation. This allows for a better understanding of the role of elevation in driving vegetation changes across elevational gradients.
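The two approaches are not defined in detail here. One plausible reading, shown below as an assumption, is that the average-data method groups pixels into elevation bands and averages the trend within each band, whereas the pixel-based method relates the trend to elevation using every pixel individually. The 100 m band width and the function name `band_means` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def band_means(elevation, trend, band_width=100):
    """Mean trend per elevation band; keys are each band's lower edge (m)."""
    bands = defaultdict(list)
    for z, t in zip(elevation, trend):
        bands[int(z // band_width) * band_width].append(t)
    return {edge: mean(vals) for edge, vals in sorted(bands.items())}
```

Averaging by band smooths out local landscape differences, which is one reason the two methods can yield different significance levels for the same underlying relationship.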

The relationships between trends in the NDVI and climatic variables were investigated using the following linear least-squares regression model:

$$y_i = a x_i + b + \varepsilon_i, \quad i = 1, \dots, n \tag{3}$$

where *yi* denotes the dependent variable; *xi* denotes the independent variable; *n* is the sample size; *a* is the slope; *b* is the intercept; and *ε<sub>i</sub>* is the random error.
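The closed-form least-squares estimates of *a* and *b* in Equation (3) can be computed directly. The sketch below is illustrative; the function name `ols_fit` is hypothetical.

```python
from statistics import mean


def ols_fit(x, y):
    """Least-squares slope a and intercept b for y = a*x + b (Equation (3))."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx
```

Here `x` could hold a climatic trend (for example, the Tgs change per pixel or per elevation band) and `y` the corresponding NDVIgs trend.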

The correlations between the trend changes in the NDVI and elevation as well as relevant climatic variables could be evaluated with Spearman rank correlation analysis:

$$r = \frac{\sum_{i} (x_i - \overline{x})(y_i - \overline{y})}{\sqrt{\sum_{i} (x_i - \overline{x})^2 \sum_{i} (y_i - \overline{y})^2}}, \quad i = 1, \dots, n \tag{4}$$

where *x* and *y* are the two variables (replaced by their ranks for the Spearman coefficient) and *n* is the sample size. Student's *t*-test was applied to evaluate the statistical significance in a trend analysis.
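Equation (4) has the form of the product-moment correlation, and the Spearman coefficient is obtained by applying that same formula to the ranks of the data. The following sketch makes this explicit, using average ranks for ties; the function names are hypothetical.

```python
import math
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(x, y):
    """Product-moment correlation in the form of Equation (4)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den


def spearman(x, y):
    """Spearman rank correlation: Equation (4) applied to ranks."""
    return pearson(ranks(x), ranks(y))
```

Because it depends only on ranks, the Spearman coefficient equals 1 for any strictly increasing relationship, even a nonlinear one, whereas the product-moment coefficient does not.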

With regard to the roles of climatic factors in vegetation growth, their effects on elevation-dependent changes in NDVI trends can be understood through partial correlation analysis. Partial correlation analysis is an effective way to investigate the net effect of one independent variable when multiple independent variables act on a dependent variable simultaneously [57,58]. The partial correlation coefficient is defined as follows:

$$R_{xy|z} = \frac{r_{xy} - r_{xz} r_{yz}}{\sqrt{(1 - r_{xz}^{2})(1 - r_{yz}^{2})}} \tag{5}$$

where *x* and *y* denote the two variables being correlated; *z* denotes the variable whose influence is controlled for; *r* denotes the correlation coefficient between each pair of variables; and *R* represents the partial correlation between *x* and *y*. The null hypothesis assumed that there was no linear correlation between *x* and *y* after controlling for *z*, and the significance level of the hypothesis test was set at *α* = 0.05 in this study.
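A first-order partial correlation can be computed from the three pairwise coefficients. The sketch below uses the standard form of Equation (5), with squared coefficients in the denominator; the function name `partial_corr` and the example values are hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of x and y with z held fixed (Equation (5))."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

As an illustration, if two variables correlate at 0.6 but correlate with a third variable at 0.8 and 0.75, the partial correlation is essentially zero, because the apparent relationship is fully carried by the third variable. This is the kind of situation the analysis tests for when asking whether elevation drives NDVIgs trends directly or only through temperature and precipitation.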

#### **3. Results**

#### *3.1. Spatiotemporal Variations in NDVIgs and Climatic Variables*

According to Figure 2a, the 34-year average growing-season NDVI greenness map showed clear spatial contrasts in vegetation greenness. The southeast part of the TRHR was mostly covered in grassland with NDVIgs values spanning from 0.5 to 0.7, whereas barren land was prevalent in the northwest part with NDVIgs values of less than 0.2. To further investigate the spatiotemporal variations in vegetation and climatic variables across the TRHR, we applied trend analysis to the NDVI and related climatic variables in this study. Comparisons of trend patterns among long-term vegetation and climate records across the study region showed similar yet statistically divergent results with respect to the temporal changes in NDVIgs and climatic variables. First, there was a slight upward trend in NDVIgs at an average rate of 0.003 per decade during the period 1982–2015 (Figure 3a), although this overall trend was not statistically significant (*p* > 0.05). However, the spatial pattern of NDVIgs changes showed that over 70% of the region experienced greening between 1982 and 2015, and that the greening pattern was spatially fragmented (Figure 3b). Negative NDVIgs trends were found mainly in the southeastern corners of the region, whereas positive NDVIgs trends were more prevalent in the western and northeastern parts. Second, there was a clear uptrend in the growing-season temperature (Tgs) over time, which was statistically significant (*p* < 0.01). The average increase rate was about 0.39 °C per decade during the period 1982–2015 (Figure 3c).
Although the overall warming trend was evident across the TRHR, the northeastern corners of the region, near Qinghai Lake, had flat changes or declining trends in temperature; interestingly, the eastern half of the region also experienced the highest warming rate, whereas the most widespread warming was seen in the mostly higher-elevation western half (Figure 3d). The phenomenon of EDW appears to exist within the TRHR as well. Third, the growing-season precipitation (Pgs) also showed a statistically significant uptrend at an average rate of 31 mm per decade during the period 1982–2015 (Figure 3e). The westernmost parts of the TRHR had experienced greater increases in precipitation than others, while a few mountainous areas in the southern and southeastern corners had seen minimal and insignificant increases in precipitation (Figure 3f). We conducted correlation analysis to assess the association between NDVIgs and the two climatic variables (i.e., Tgs and Pgs). We found that there was a weak positive correlation between NDVIgs and Tgs (r = 0.43, *p* < 0.05), while no statistically significant correlation was found between NDVIgs and Pgs (r = 0.18, *p* > 0.05). These findings indicate that the vegetation change could be mainly attributed to the growing-season warming across the TRHR.

**Figure 3.** The average trends in (**a**) NDVI, (**c**) temperature, and (**e**) precipitation during the growing season from 1982 to 2015; the decadal changes in (**b**) NDVI, (**d**) temperature, and (**f**) precipitation across the TRHR. The pixels with significant changes at the 95% confidence level are marked by crosses.

#### *3.2. Elevation-Dependent Responses of Climatic Variables to NDVIgs*

We used multi-year averages of the indicators to account for effects of the elevational gradient on spatial disparity in temperature and precipitation, and to investigate linkages between elevation-dependent changes in temperature and precipitation and trends in NDVIgs. Spatially, the TRHR showed a gradual increase in the average temperature from 0.24 °C in the westernmost parts to 11.8 °C in the northeastern corners, indicating a strong correlation between temperature and elevation; temperature gradually increased as elevation decreased (Figure 4a). Mountainous areas in the southern and southeastern parts received much more annual rainfall than the flat-lying areas in the west and northeast, indicating that variations in precipitation are closely tied to landscapes rather than elevations (Figure 4b). The NDVIgs values show an inverted V-shape trend across elevational gradients, peaking at around 3900 m (Figure 4c). By comparing both NDVIgs and Tgs trends at different elevations, we noticed a negative correlation between NDVIgs and Tgs in low-elevation areas, which gradually turned positive with increases in elevation. In contrast, there was a stronger correlation between NDVIgs and Pgs over elevational gradients as a similar inverted V-shape trend was observed for the two indicators (Figure 4d).

**Figure 4.** The growing-season (**a**) temperature and (**b**) precipitation patterns across the TRHR from 1982 to 2015; comparisons of elevational trends in (**c**) temperature and NDVIgs, and (**d**) precipitation and NDVIgs.

#### *3.3. Elevation-Dependent Changes in Climatic and NDVIgs Trends*

To better understand how the rates of change in temperature, precipitation, and vegetation vary with elevation within the study region, we used both average-data and pixel-based methods to conduct a trend analysis. The average NDVIgs values showed a statistically significant declining trend with elevation, indicating that the green-up rate was slower at high elevations (Figure 5a). However, this type of trend analysis failed to factor in differences in the landscapes throughout the TRHR, so the statistical significance could be somewhat overstated. After re-analyzing the NDVIgs value across each pixel grid within the region, we found that the declining trend was confirmed but less statistically significant (Figure 5d). The phenomenon of EDW was highlighted in the results from both analysis methods, suggesting that the rate of warming was amplified at high elevations (Figure 5b,e). In fact, this regional manifestation of global warming has also been observed in other high-elevation regions across the Tibetan Plateau. In addition, the high-elevation areas within the TRHR experienced much greater yearly precipitation increments than the low-elevation areas between 1982 and 2015, and the widespread changes in precipitation exhibited a statistically significant dependency on elevation (Figure 5c,f). Overall, elevation-dependent changes in temperature and precipitation appeared to be happening faster at high elevations, whereas changes in the green-up rate were relatively insignificant.

#### *3.4. Correlations between Changing Climatic Conditions and NDVIgs Trends*

We also used both average-data and pixel-based methods to conduct a correlation analysis to improve our understanding of greening trends and drivers in an altered local climate. Both methods showed a statistically significant correlation between Tgs changes and NDVIgs trends (Figure 6a,c), and this correlation was overall negative (i.e., decreasing NDVIgs with warming). The average-data method found a weak positive correlation between Pgs changes and NDVIgs trends (i.e., increasing NDVIgs with higher precipitation increments), but this finding shows limited statistical significance (Figure 6b). After reanalyzing the multi-year average Pgs value across each pixel grid within the region, this weak positive correlation was confirmed statistically (Figure 6d). Despite the identified statistical significance of both climatic variables for the NDVIgs trends, temperature appeared to be a more explicit driver of vegetation change as warming resulted in more variability in NDVIgs across the study region.

**Figure 5.** The elevational trends in (**a**) NDVIgs, (**b**) temperature, and (**c**) precipitation using the average-data method; the elevational trends in (**d**) NDVIgs, (**e**) temperature, and (**f**) precipitation using the pixel-based method.

**Figure 6.** Correlation analysis between trends in temperature and precipitation and trend in NDVIgs. (**a**,**b**) Correlations from the average-data method, and (**c**,**d**) correlations from the pixel-based method.

#### *3.5. Spearman's Rank Correlation and Partial Correlation Analysis*

Spearman's rank correlation and partial correlation coefficients were calculated to allow for more detailed statistical information that accounts for the elevation dependency of temperature, precipitation, and NDVIgs changes as well as the relationships between temperature and precipitation changes and NDVIgs trends (Figure 7). Both temperature and precipitation changes were positively correlated with elevation, and these strong correlations were of sufficient statistical significance. According to the Spearman correlation coefficient, NDVIgs trends were significantly and negatively correlated with elevation; however, the partial correlation coefficient suggested that this negative correlation was much less statistically significant if the confounding variables, i.e., temperature and precipitation, were factored out. This implied that elevation was not a direct driver of NDVIgs trends. In addition, NDVIgs trends were also negatively correlated with temperature changes, indicating warming is an identified driver of vegetation change. In contrast, precipitation changes exhibited no statistically significant correlation with NDVIgs trends.

**Figure 7.** Summary of Spearman correlation and partial correlation analysis. Bar opacity represents statistical significance at the 95% confidence level.

#### *3.6. Land-Cover-Based NDVIgs Quantitative Analysis*

Land cover composition is an important explanatory factor when calculating NDVIgs values and evaluating their associated trends. In order to refine our understanding of the relationships between land coverage and NDVIgs values and trends, we used the pixel-based quantitative method to evaluate different land cover types, including grassland, barren land, cropland, developed land, forest and shrubland. The majority of the land within the TRHR was covered in grassland and barren land, accounting for 94% of the total land coverage. Thus, the two land cover types had more chances to have 100% coverage in a pixel grid. As the coverage of grassland increased, the NDVIgs values increased significantly (Figure 8a). In contrast, more coverage in barren land resulted in decreased NDVIgs values (Figure 8b). Furthermore, forest and shrubland, in addition to grassland, were the land cover types most positively related to the NDVIgs values (Figure 8e,f). The results from the pixel-based quantitative analysis also allowed for detection of the spatial patterns of land cover distributions. For example, cropland was mainly distributed in densely vegetated areas (Figure 8c), while developed land was evenly distributed between densely and sparsely vegetated areas (Figure 8d).

**Figure 8.** Correlations between NDVIgs and per-pixel land cover percentage (**a**–**f**); correlations between NDVIgs trend and per-pixel land cover percentage (**g**–**l**).

We also evaluated the decadal average NDVIgs changes for land cover types to improve the understanding of the trends of land cover changes within the TRHR between 1982 and 2015. There was a slight declining trend in the positive changes in NDVIgs with an increasing coverage rate of grassland within each pixel grid, indicating that the greening trends in grassland were likely to be slower than those in other land cover types between 1982 and 2015 (Figure 8g). The reason for this decline could be linked to the continued rise in livestock population and the adverse effects of grazing activities. In contrast, more areas covered in barren land appeared to experience slight positive changes in NDVIgs, indicating a growing trend of barren land greening within the TRHR (Figure 8h). This implies that the significant changes in temperature and precipitation across the TRHR could have facilitated the process of vegetation rehabilitation in barren areas, while the warmer temperature and increased precipitation during the growing season could also have greatly slowed down the progress of grassland degradation. Both cropland and developed land had lower coverage rates within each pixel grid; however, more pronounced changes in greening were found in areas partially covered by these two land cover types (Figure 8i,j). This implies that the local government's long-term efforts to deliver targeted interventions for ecological restoration, e.g., the Grain for Green program, have brought positive results in areas where human activities are concentrated. Forest showed no statistically significant trend (Figure 8k), while shrubland seemed to experience a decline in NDVIgs changes (Figure 8l).

#### **4. Discussion**

It is now accepted that continued climate warming could further boost vegetation growth at high elevations, as warmer temperatures and increased rainfall might create more favorable growing environments in local areas. Our results show that since 1982, most parts of the TRHR have experienced a sustained warming trend, and are getting wetter during the growing season (Figure 3). This is consistent with findings from other studies in this region [37,59,60]. Moreover, the interannual variability in NDVIgs was found to be positively correlated with Tgs at a high significance level based on the correlation analysis results, suggesting a driving effect of temperature on vegetation greening.

Both our study and previous studies show that temperature appears to be a dominant driver of vegetation change within the TRHR; therefore, there do seem to be reasons to suppose that the change in the vegetation green-up rate would be consistent with the trend in warming. However, Figure 5a,d show that there was a slight downward trend in the vegetation green-up rate as elevation increased, which appears to be contradictory to the amplified effect of EDW at high elevations. This result indicates that, despite being a dominant driver of interannual vegetation greening, temperature may play a different role in the vegetation dynamics altering the green-up rate.

Water availability is critical for vegetation growth and is usually deemed to be closely related to precipitation. Our results show that high-elevation areas tend to experience greater increases in precipitation than other areas, and this tendency is statistically significant (Figure 5c,f). However, the decreased NDVIgs trend at high elevations indicates that precipitation plays a limited role in regulating the green-up rate. In fact, Figure 6b,d also present a relatively weak correlation between Pgs and NDVIgs trends. Notwithstanding a few studies reporting a strong dependency of water availability on precipitation [30,35,48], the TRHR has seen no sign of an evident growth in water availability.

As we know, water availability on land is not only related to precipitation but may also be directly affected by increased evapotranspiration (ET). Previous studies suggest that the ratio of evapotranspiration to potential evapotranspiration (PET), i.e., ET/PET, is a useful indicator of soil moisture conditions, which can also be used to assess the water availability conditions for vegetation growth during the growing season [61]. To discern the roles of temperature and precipitation changes in altering the vegetation green-up rate, we further investigated the pattern of the growing-season ET/PET ratio (ET/PETgs) with elevation. We observed that most areas had increased decadal ETgs, particularly in the western TRHR, which is covered mostly by barren land (Figure 9a). The increased decadal ratios of ET/PETgs were mainly found in the western and northeastern parts, while the decreased ET/PETgs ratios were observed in the central and southeastern areas (Figure 9b). The ETgs values showed a V-shape trend across elevational gradients, suggesting that ET tended to increase sharply at high elevations (Figure 9c). The ET/PETgs ratios showed a statistically significant declining trend with elevation (*p* < 0.01), which was broadly consistent with the NDVIgs trend (Figure 9d). Moreover, we found that the vegetation green-up rate was positively correlated to the ET/PETgs ratio at a higher significance level (r = 0.52, *p* < 0.01) and the correlation between ET and the green-up rate was less significant (*p* > 0.05). Therefore, the increased ET within the TRHR could mainly be attributed to the overall increase in temperature.

**Figure 9.** The growing-season (**a**) ETgs and (**b**) ET/PETgs patterns across the TRHR from 1982 to 2015; comparisons of elevational trends in (**c**) ETgs and NDVIgs, and (**d**) ET/PETgs and NDVIgs.

Overall, the reductions in water availability brought about by climate warming could greatly slow the vegetation green-up rate at high elevations, although EDW appears to create a warmer environment for vegetation growth. In addition, the negative correlation between Tgs and NDVIgs shown in Figure 6 also highlights the adverse effect of warming on vegetation greening within high-elevation areas. Thus, the decline in the vegetation green-up rate with elevation could be attributed to the sustained reductions in water availability for vegetation growth at high elevations.

#### **5. Conclusions**

In this study, we quantified and characterized the seasonal changes in vegetation across the Three Rivers Headwaters Region (TRHR) to advance the understanding of linkages between elevation-dependent changes in temperature and precipitation and vegetation greening trends over this region. By mapping the growing-season normalized difference vegetation index (NDVIgs) from 1982 to 2015, we were able to evaluate vegetation changes across different years and detect greening trends over time. The results provide evidence that the rate and magnitude of elevation-dependent changes in temperature and precipitation may vary across elevation and vegetation types in alpine regions. The major findings of this study can be summarized as follows:


(iv) The implementation of alpine conservation and ecological restoration programs within the TRHR appears to be effective in driving barren land greening. However, the presence of anthropogenic activities could also adversely affect local alpine ecosystems, and in some cases was also deemed the primary cause of environmental degradation, e.g., via loss of productivity in grasslands with overgrazing.

**Author Contributions:** Conceptualization, K.W., Y.Z., J.H., C.C. and T.L.; methodology, K.W.; formal analysis, K.W., Y.Z. and J.H.; data curation, K.W. and C.C.; writing—original draft preparation, K.W. and Y.Z.; writing—review and editing, Y.Z. and T.L.; visualization, Y.Z.; funding acquisition, K.W. and Y.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Open Research Fund Program of the State Key Laboratory of Hydroscience and Engineering (sklhse-2022-A-02; sklhse-2021-A-02), the Science and Technology Project of Shenzhen Institute of Information Technology (SZIIT2022KJ011), and the Key Research Project of Qinghai Province (2021-SF-A7-1).

**Data Availability Statement:** The NDVI data are available at https://data.tpdc.ac.cn/en/data/9775f2b4-7370-4e5e-a537-3482c9a83d88/ (accessed on 1 December 2020). The monthly precipitation data are available at https://data.chc.ucsb.edu/products/CHIRPS-2.0/ (accessed on 31 December 2020). The monthly temperature data are available at https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5 (accessed on 31 December 2020). The monthly ET and PET data are available at https://www.gleam.eu/ (accessed on 1 December 2020).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Tracking Deforestation, Drought, and Fire Occurrence in Kutai National Park, Indonesia**

**Ryan Guild 1,2, Xiuquan Wang 1,2,\* and Anne E. Russon <sup>3</sup>**


**Abstract:** The dry lowland and mangrove forests of Kutai National Park (KNP) in Indonesia provide invaluable ecosystem services to local human populations (>200,000 in number), serve as immense carbon sinks to recapture anthropogenic emissions, and safeguard habitats for thousands of wildlife species including the critically endangered Northeast Bornean orangutan (*Pongo pygmaeus morio*). With recent reports of ongoing illegal logging and large-scale wildfires within this National Park, we sought to leverage the extensive catalogue and processing power of Google Earth Engine to track the rates and influences of forest loss within KNP over various time periods since 1997. We present estimates of forest loss from the Hansen Global Forest Change v1.9 dataset (2000–2021), which detected a loss of 15% (272 km²) of forest cover within KNP since 2000, half of which (137 km²) coincided with the El Niño-induced wildfires of 2015–2016. Using the MCD64A1 C6.1 MODIS dataset, we found significant spatial overlap between burned area and forest loss detections during the 2015–2016 period but identified considerable omissions in the burned area dataset over smallholder farms within KNP. We discuss the implications of deforestation in areas of primary orangutan habitat and how patterns of forest loss have influenced drought and fire dynamics within KNP. Finally, we compare time-series estimates of precipitation, the ENSO index, burned area, and forest loss to demonstrate that fire risk within KNP depends largely, but not exclusively, on drought severity, and that rates of non-fire (gradual) and fire-related (extreme) forest loss threaten the remaining forests of this National Park.

**Keywords:** deforestation; wildfire; drought; satellite; Kutai National Park; Google Earth Engine

#### **1. Introduction**

Kutai National Park (KNP) represents one of the last relatively intact areas of protected lowland and mangrove forest of Indonesian Borneo (Kalimantan) [1]. This protected area is home to over 300 species of birds, a significant population of the critically endangered Northeast Bornean orangutan (*Pongo pygmaeus morio*), and some of the last tracts of ecologically significant and commercially valuable Bornean ironwood trees in the region [1,2]. In addition to catastrophic wildfires reported during the extreme El Niño droughts of 1982–1983, 1997–1998 and 2015–2016 [3,4], KNP has continuously experienced illegal clearing by nearby human populations for timber, expanding agricultural fields, and land speculation, which the National Park authority struggles to address [5,6].

**Citation:** Guild, R.; Wang, X.; Russon, A.E. Tracking Deforestation, Drought, and Fire Occurrence in Kutai National Park, Indonesia. *Remote Sens.* **2022**, *14*, 5630. https://doi.org/10.3390/rs14225630

Academic Editor: Sandra Eckert

Received: 13 October 2022 Accepted: 5 November 2022 Published: 8 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The use of high-resolution (<30 m) multiband satellite imagery in fields such as ecosystem monitoring and biodiversity conservation has kept pace with gradual improvements in its access and availability [7]; today, it represents the primary resource by which researchers track changes in forest cover over time. Yet only two studies have attempted to track the extent of forest loss within KNP using satellite imagery. [5] first used visual inspection of Landsat imagery to detect areas of loss between 1990 and 2007 over KNP, while [8] used supervised classification of Landsat imagery to detect forest loss in the eastern (populated) region of the park between 2006 and 2009. The latter study estimated an annual deforestation rate of 2.15% in the park's eastern region and attributed much of it to conversion to settlements and agricultural land. At this rate, complete loss of forest in the eastern half of KNP would occur by the mid-century; however, plans to relocate the capital city of Indonesia from Java to East Kalimantan by 2024 are likely to intensify pressures on KNP's forests and accelerate this time frame. This would represent a substantial loss to both the *morio* orangutan habitat within Indonesia and to the ecosystem services upon which local residents (more than 200,000 people within and immediately surrounding KNP) rely for fresh water, forest resources, and tourism (see [9]). Thus, an updated analysis of the rates, extents, and causes of forest loss within KNP using modern sensing and processing technologies is imperative.
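The mid-century projection can be checked with simple arithmetic. The text does not say whether the 2.15% annual rate is relative to the baseline forest area or to the forest remaining each year, so the sketch below shows both readings; the function names are hypothetical.

```python
def years_to_total_loss(annual_rate_percent):
    """Years to 100% loss if a fixed share of the baseline is lost yearly."""
    return 100.0 / annual_rate_percent


def remaining_fraction(annual_rate_percent, years):
    """Fraction left if the rate instead applies to the remaining forest."""
    return (1 - annual_rate_percent / 100.0) ** years
```

Under the first reading, a 2.15% rate implies complete loss after roughly 46.5 years, which places the end point in the 2050s when counted from the 2006–2009 study period and is consistent with the mid-century statement. Under the second reading, a little over a third of the forest would still remain after the same number of years.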

The objective of this study is to track forest loss, drought, and fires over the past two decades in KNP using historical collections of thermal and spectral satellite imagery mostly sourced from Google Earth Engine (GEE). Specifically, we utilize the Hansen Global Forest Change (v1.9) dataset to track annual forest loss (2000–2021); the MODIS MCD64A1 C6.1 product to track fire-affected areas (2002–2021); and the CHIRPS (Climate Hazards Group InfraRed Precipitation with Stations) Pentad dataset to track drought patterns within KNP (1997–2021). We attempt to validate each of these using the appropriate data and methods. The data presented here provide information to researchers and governments to demarcate vulnerable areas of protected forests and to devise strategies to avert future forest loss in this National Park.

#### **2. Data and Methods**

#### *2.1. Study Area*

KNP covers 1927 km² along the eastern coast of Borneo in the Indonesian province of East Kalimantan (Figure 1) [9]. The ecology of the region comprises inland areas of primary and secondary lowland forests (<250 m above sea level) dominated by Dipterocarpaceae and Lauraceae family trees and coastal areas featuring mangrove forests and brackish estuaries [9]. KNP is bounded by two cities at its southeastern (Bontang) and northeastern (Sangatta) corners (Figure 1) that have each seen considerable population expansion since the park's formal declaration in 1995, which has prompted the excision of several illegally settled areas from KNP's formal boundaries [10]. The Sangatta River forms part of KNP's eastern and northern boundaries, while the remaining northern, western, and southern perimeters abut a contiguous series of industrial concessions, including an open-pit coal mine, plus wood fibre, timber, and oil palm plantations (Figure 1). Several small villages of a few thousand people remain within the bounds of KNP in the eastern and coastal areas [9] where many active and abandoned smallholder farms dominate the landscape (pers. obs.). Oil palm monocultures represent a large portion of planted agriculture within KNP primarily because of the high returns on land and labour afforded by the crop [11]. Encouragement by the Indonesian government to cultivate oil palms as a poverty-reduction strategy and growing contracts offered by palm oil companies in the region [12,13] have further incentivized farmers to illegally establish and enlarge oil palm plantations throughout eastern KNP.

In an intact old-growth state prior to 1970, the forests of KNP rarely experienced wildfires due to low combinations of fire hazards (i.e., dryness and fuel loads) and fire risks (i.e., prevalence of ignition sources) [14]. While ignition sources including lightning, swidden agriculture, and burning coal seams have existed in this region for millennia, neither natural nor human-induced fires are believed to have driven large-scale deforestation here prior to 1970 [14,15]. However, fire hazards in this region have since increased from more frequent extreme droughts and unprecedented amounts of logging-related fuel loads, as have fire risks from the proliferation of croplands that use fire for various purposes (see [6]). Over the past 40 years, three super El Niño droughts have facilitated the spread of wildfires throughout KNP's forests. In 1982/83, around half of the park's forests burned [16] and high mortality rates were reported for canopy trees, figs, and lianas [17], while in 1997/98, a much larger extent of the park burned and most of its forests were reset to a secondary successional stage [4,18]. The mixed-dipterocarp forests of this region can recover their structure when left undisturbed for a period of 10–20 years [19,20]. Most recently, during the super El Niño drought of 2015/16, island-wide studies revealed a significant number of fire detections over KNP [3,21], but the extent of, and damage caused by, burning during this event is yet unknown.

**Figure 1.** Land use designations and population density within and around Kutai National Park of East Kalimantan, Indonesia. Density is displayed as the estimated number of people per 250 m × 250 m grid cell [22]. Southeastern and northeastern population centers represent the respective cities of Bontang and Sangatta.

#### *2.2. Datasets*

#### 2.2.1. Hansen Global Forest Change (HGFC)

The HGFC v1.9 dataset is a global gridded (30 m) forest loss product available on GEE that uses a time-series change detection analysis of multispectral Landsat TM (L4 and L5), ETM+ (L7), and OLI/TIRS (L8) scenes to track annual changes in global forest cover between 2000 and 2021 [23,24]. The first release of the dataset (2000–2012) reported a producer's accuracy of 83.1% (overall accuracy 99.5%) over tropical domains [23]. The most recent version (v1.9) reports an improved detection accuracy since 2013 due to the inclusion of L8 data; thus, we only discretize post-2013 years annually in our analyses. Forest loss in HGFC is defined as a stand-replacement disturbance [of tree canopy cover] via removals or mortality and it uses a detection algorithm described by [24,25]. This dataset does not account for potential regrowth after a pixel has been assigned as 'lost', which prevents discernment of outcomes of canopy loss (i.e., regrowth, further degradation, conversion to other land uses) in our study area. Although high-resolution satellite imagery indicates that conversion to settlements and croplands predominates among these outcomes over the frontline [loss] areas of eastern KNP, additional data and analyses are required to quantify the capacity for, and extent of, regrowth following canopy loss in our study area.

The 'treecover2000' and 'lossYear' bands of HGFC v1.9 contain classified pixels of (1) global tree canopy cover (vegetation >5 m in height) in the year 2000 (baseline) and (2) annual canopy loss between this baseline and 2021, respectively. We clipped these bands to the boundary of KNP, reduced them to area estimates using GEE functions, and exported them for colour rendering in QGIS. The full script (see Data Availability Statement) used to generate the raster images of annual forest loss in KNP was adapted from a Google Earth Engine tutorial (https://developers.google.com/earth-engine/tutorials/tutorial\_forest\_02, accessed on 1 September 2022).
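The reduction of 'lossYear' pixels to annual area estimates can be illustrated with a minimal sketch in plain Python. It assumes a small hypothetical grid of pixel codes (0 = no loss, n = loss detected in year 2000 + n) and a nominal 30 m × 30 m pixel; the study itself used GEE reducers, so the function name, constant, and sample tile below are illustrative assumptions rather than the authors' script.

```python
from collections import Counter

# Assumption: every pixel covers a nominal 30 m x 30 m = 0.0009 km^2.
PIXEL_AREA_KM2 = (30 * 30) / 1_000_000


def loss_area_by_year(loss_codes, base_year=2000):
    """Sum nominal pixel area per loss year from a 2-D grid of lossYear codes.

    Code 0 means no loss was detected; code n means loss in base_year + n.
    """
    counts = Counter(code for row in loss_codes for code in row if code > 0)
    return {base_year + code: n * PIXEL_AREA_KM2 for code, n in sorted(counts.items())}


# Hypothetical 3 x 4 tile of lossYear codes (not real data).
tile = [
    [0, 15, 15, 0],
    [16, 16, 16, 0],
    [0, 0, 21, 0],
]
areas = loss_area_by_year(tile)
for year, km2 in areas.items():
    print(year, round(km2, 4))
```

A nominal pixel area is a simplification; an area-aware reducer would weight each pixel by its true ground area.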

We employ two additional datasets to validate HGFC-detected areas of forest canopy loss over two sample regions within KNP between 2017–2020. First, we utilize the novel GEE-based Forest Canopy Disturbance Monitoring (FCDM) Tool using the developer-recommended parameter values to detect annual canopy loss within the sample regions. The FCDM Tool detects new canopy openings from L8 scenes across user-defined time periods and reports an overall accuracy of 77.8% within tropical evergreen rainforests [26]. Second, we visually inspected high-resolution (5 m) cloud-free composite images (June–November) via Planet/NICFI on GEE to manually delineate canopy loss over the sample regions for each year of the validation period. For this "reference" dataset (as referred to in this study), we first traced the extent of non-forest areas in the 2016 composite over both sample regions, then overlaid composites from each subsequent year to trace additional areas of canopy loss. Since each dataset uses a different temporal approach to detect canopy loss between years (see Figure A1), we focus our comparison on the total, rather than annual, extent(s) of detected forest loss between datasets.

#### 2.2.2. MCD64A1 Burned Area

The MCD64A1 C6.1 burned area (BA) product (500 m) contains estimates of global monthly BA based on a time-series detection analysis of daily MODIS C6.1 observations from 07/2000 to 02/2022 [27]. Its most recent version (C6.1) reports an improved detection accuracy of BA (63% producer's; 97% overall), yet it still performs poorly over croplands due to spectral confusion between burnt and harvested land and a limited ability to detect small fires [27–29]. We first generated annual mosaics of BA detections within KNP between 2002–2021 in GEE (see Data Availability Statement), and then colour-rendered and reduced each mosaic to area estimates (using the GRASS r.series algorithm) in QGIS. Using active fire (AF) observations from NASA's VIIRS (Visible Infrared Imaging Radiometer Suite) product (375 m), we graphed annual aggregates of daily fire counts within KNP (2012–2021) and compared the location of AF and BA detections during the wildfire years of 2015–2016. While the two datasets detect different signals of fire activity (thermal anomalies for AF vs. post-fire burn scars for BA), the AF (VIIRS) product can detect relatively smaller fires and is less susceptible to harvest-related confusion, making it valuable to compare against BA detections over cropland regions [30,31].

We attempted to validate the MCD64A1 dataset for the 2015 wildfires within KNP following the burn severity mapping approach developed by UN-SPIDER (United Nations Space-based Information for Disaster Management and Emergency Response). We first selected two sample areas of detected BA and determined the estimated date(s) of burning for each according to MCD64A1 and VIIRS datasets. We also selected one 'unburned' sample area (without BA detections) over SW KNP where numerous AF were detected in 2015. In GEE, we (1) generated cloud-free pre- and post-fire composite images over each sample region using L8 scenes, (2) calculated the normalized burn ratio (NBR) of each composite, and (3) produced a differenced normalized burn ratio (dNBR) image to classify burn severity for each of the three sample regions (see Data Availability Statement). We classified dNBR images using the burn severity thresholds and colour coding developed by the USGS (United States Geological Survey) and UN-SPIDER, respectively, and compared the classified images against the MCD64A1-detected burned areas over each sample region.
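The NBR and dNBR steps described above can be sketched per pixel as follows. This is a minimal illustration, not the authors' GEE script: it assumes surface reflectance values for the near-infrared and shortwave-infrared (SWIR2) bands (bands 5 and 7 on Landsat 8), and the class breaks are the commonly cited USGS dNBR thresholds as we understand them, which should be checked against the UN-SPIDER documentation before use. The function names and sample reflectances are hypothetical.

```python
def nbr(nir, swir2):
    """Normalized burn ratio from NIR and SWIR2 reflectance."""
    return (nir - swir2) / (nir + swir2)


def dnbr(pre_nir, pre_swir2, post_nir, post_swir2):
    """Differenced NBR: pre-fire NBR minus post-fire NBR."""
    return nbr(pre_nir, pre_swir2) - nbr(post_nir, post_swir2)


# Upper bound of each class on unscaled dNBR (assumed USGS-style breaks).
SEVERITY_CLASSES = [
    (-0.25, "enhanced regrowth, high"),
    (-0.10, "enhanced regrowth, low"),
    (0.10, "unburned"),
    (0.27, "low severity"),
    (0.44, "moderate-low severity"),
    (0.66, "moderate-high severity"),
]


def classify(value):
    """Return the first class whose upper bound exceeds the dNBR value."""
    for upper, label in SEVERITY_CLASSES:
        if value < upper:
            return label
    return "high severity"


# Hypothetical pixel: healthy canopy before the fire, charred surface after.
burned = dnbr(pre_nir=0.40, pre_swir2=0.10, post_nir=0.20, post_swir2=0.25)
# Hypothetical pixel with identical pre- and post-fire reflectance.
stable = dnbr(pre_nir=0.40, pre_swir2=0.10, post_nir=0.40, post_swir2=0.10)
print(round(burned, 3), classify(burned))
print(round(stable, 3), classify(stable))
```

Because the composites span several months, phenological or harvest-related changes can also shift dNBR, which is the caveat raised in the following paragraph.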

We chose to generate cloud-masked composite scenes of pre- and post-fire conditions to address the challenge of year-round high cloud cover that obscures most satellite scenes over KNP. This approach is not recommended given the potential for signal interference over the composited time-period (e.g., through changes in vegetation greenness, agricultural harvests, etc.). However, our approach detected signals of browned (i.e., burnt) areas that are uncommon in scenes or composites from non-wildfire years over this region, and thus this offers value to our analysis of BA in KNP.

#### 2.2.3. Climate Hazards Group InfraRed Precipitation with Station Data (CHIRPS)

The CHIRPS Pentad reanalysis gridded rainfall dataset combines infrared measurements of Cold Cloud Duration taken every 5 days (from the Tropical Rainfall Measuring Mission Multisatellite Precipitation Analysis) with in situ weather station data sourced from the World Meteorological Organization's Global Telecommunication System [32]. This global dataset has a spatial resolution of 5.5 km and is available from January 1981 to present.

To generate a time series of monthly rainfall estimates, we first used a GEE mapping function to calculate total monthly precipitation (mm) per pixel across KNP for each year between 2002–2021 (date range of MCD64A1 data), then averaged the pixel estimates across KNP for each month using a reducer function (see Data Availability Statement). Monthly estimates were exported from GEE as csv files and transformed into rolling three-monthly total precipitation (e.g., DJF, JFM, FMA, etc.) to reduce monthly variation and to better reflect medium-term drought conditions (cf. [33]). To evaluate spatial variations in rainfall patterns, we also mapped the 25-year (1997–2021) average of total fire-season (Aug–Nov) rainfall for each pixel across KNP using the GRASS r.series algorithm in QGIS.
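The rolling three-month transformation, and its comparison against the 4–5 mm/day fire risk threshold, can be sketched as below. The monthly values are hypothetical, and the conversion of a daily rate to a three-month total assumes a 90-day window (consistent with the 360–450 mm range quoted for Figure 7); function names are illustrative only.

```python
def rolling_three_month_totals(monthly_mm):
    """Total precipitation for each consecutive three-month window."""
    return [sum(monthly_mm[i:i + 3]) for i in range(len(monthly_mm) - 2)]


def daily_rate_to_three_month_total(mm_per_day, days=90):
    """Convert a daily rainfall rate to a three-month total (assumes 90 days)."""
    return mm_per_day * days


# Hypothetical monthly park-wide means in mm (not CHIRPS data).
monthly = [310, 280, 250, 190, 120, 60, 40, 90]
totals = rolling_three_month_totals(monthly)

lower = daily_rate_to_three_month_total(4)  # 360 mm per three months
upper = daily_rate_to_three_month_total(5)  # 450 mm per three months
below_threshold = [t < lower for t in totals]

print(totals)
print(lower, upper)
print(below_threshold)
```

Each window overlaps its neighbour by two months, which is what smooths month-to-month variation in the resulting series.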

CHIRPS-based rainfall estimates were compared with daily observations from the only available rainfall gauge within KNP, located at the Bendili research station (est. July 2010; 0.56°N, 117.4°E; founded and operated by the Orangutan Kutai Project) in the NE corner of the park (Figure A2). It is important to note that this comparison serves to visualize conformity between datasets at a single location within KNP and that a standard validation analysis could not be performed without access to additional rainfall observations.

#### **3. Results and Discussion**

#### *3.1. Forest Loss within KNP*

Figure 2 depicts areas of baseline tree canopy cover (>30% coverage in 2000) and canopy loss throughout KNP between 2001–2021, as detected by the HGFC v1.9 dataset. Baseline canopy cover is shown in green and canopy loss is presented as an aggregate between 2001–2012 but annually thereafter (2013–2021) to reflect the greater detection accuracy of the post-2013 period (Figure 2). Much of the forest loss detected since 2000 is located along the trans-provincial highway that runs north-south through the eastern region of KNP (est. 1991) where illegal settlements and agricultural fields have proliferated over time. Outside of this highway region, notable patches of recent (dark blue) forest loss were also detected along the park's southern border (Figure 2A). The southwest patch borders an industrial tree plantation (oil palm and acacia) [34,35] while the easternmost patch borders an industrial open-pit coal mine [36].

To assess the validity of the HGFC estimates over KNP, we compared them with estimates from two other datasets for the years 2017–2020 over two sample areas of NE KNP (Figure 3). The three datasets conformed well in terms of the spatial pattern and area of canopy loss they detected over the two validation areas (Figure 3). Only two estimates by the reference dataset (in 2018 and 2020 over validation area B) were markedly different from those of the detection datasets (Figure 3), perhaps an artifact of the different approaches that each dataset uses in the temporal detection of canopy loss (Figure A1). Overall, our validation supports the detection performance of the HGFC v1.9 dataset over this region, yet a standard validation analysis is warranted to quantify its accuracy over KNP.

The significant spike of canopy loss detected in 2016 (Figure 2C) coincides with the super El Niño drought of 2015–2016 that sparked major wildfires throughout the region according to Russon AE (pers. obs.) and [3,21]. Although this spike was detected in 2016, more than twice as many active fires and seven times as much burned area were detected throughout KNP in 2015 than in 2016 (Figure 4). Since the HGFC dataset relies on clear-sky observations for accurate detection, it is plausible that wildfire smoke and high cloud cover obscured most Landsat scenes during the 2015 fire season (August–November) so that most of the fire-related forest loss from 2015 could not be detected until the following year. This is one known caveat of the HGFC dataset, making it more useful in detecting multi-year trends in forest canopy loss than deriving definitive area estimates per year. Overall, the HGFC dataset detected 272 km<sup>2</sup> of total forest canopy loss between 2001–2021 (15% of the 2000 baseline canopy cover), half of which (137 km<sup>2</sup>) was detected between 2015–2016. On average, the annual rate of loss within KNP during non-wildfire years between 2013–2021 was 5.6 km<sup>2</sup> (±1.3) per year, or a loss of 0.3% of baseline forest cover each year.
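As a back-of-envelope consistency check on these figures (a sketch only: the 15% share is rounded, so the implied baseline area is approximate and is not a value reported in the study):

$$
\begin{aligned}
A_{\text{baseline}} &\approx \frac{272\ \text{km}^2}{0.15} \approx 1813\ \text{km}^2,\\
\frac{137\ \text{km}^2}{272\ \text{km}^2} &\approx 0.50,\\
\frac{5.6\ \text{km}^2\,\text{yr}^{-1}}{1813\ \text{km}^2} &\approx 0.003 = 0.3\%\ \text{yr}^{-1}.
\end{aligned}
$$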

**Figure 2.** Classified areas of forest loss within Kutai National Park (**A**) and the Greater Mentoko Area [GMA] (**B**) between 2001–2021 and corresponding area estimates (**C**) as detected by the Hansen Global Forest Change (v1.9) dataset. Only years post-2013 are discretized annually due to higher detection accuracy of the 2013–2021 period. Tree cover with >70% canopy closure in year 2000 are depicted in green. Images of an illegal logging site (**D**) and the resultant canopy opening (**E**) in the GMA from 2018 by RG.

The only area of KNP for which accounts of pre-disturbance forest conditions are available is the Greater Mentoko Area (GMA) of NE KNP (Figure 2B). As the site of the first orangutan research in Indonesia in 1970 [37], its forests were described as "near-pristine", free of major logging activities, and only affected by natural disturbances such as windthrows, droughts, and recurrent river flooding [38,39]. Since 2000, nearly half of its original forest cover has been lost (Figure 2B) to both fire- and non-fire disturbances, the latter of which include illegal logging, settlements, and agricultural expansion to support the local population (Sangatta) that has grown 500-fold over the past five decades [6]. Attributing precise contributions of fire- and non-fire disturbances to annual forest loss within KNP, however, requires further analysis beyond the scope of this study.

The forest loss detected in the GMA has replaced primary habitat for the resident orangutans in this area (est. 1000 individuals) [9] and placed their home ranges in closer proximity to humans and their livelihoods, exacerbating the record-setting rates of human-orangutan conflict reported within KNP [40,41]. Further loss in this region along a westward trajectory will also have negative implications for the local human population by weakening crucial forest-related ecosystem services (e.g., flood management, water quality, climate regulation, availability of non-timber forest products, etc.) and by compromising park tourism that relies in large part on the forested orangutan habitats around the Prevab research station (Figure 2B).

**Figure 3.** Validation of the Hansen Global Forest Change (HGFC; v1.9) product for the years 2017–2020 over two sample regions (**A**,**B**) in northeastern KNP using (1) high-resolution Planet/NICFI satellite imagery (Reference dataset) and (2) the Forest Canopy Disturbance Monitoring Tool (FCDM Tool) developed by [26].

#### *3.2. Hotspot Mapping within KNP*

Between 2002–2021, a significant amount of BA was detected throughout KNP in 2004, 2015, and 2016 by the MCD64A1 dataset (Figure 4). The BA from 2015 covers settlement areas and agricultural fields in NE KNP and coincides with recently disturbed forests in SW KNP that had lost canopy cover only a year prior (Figures 2A and 4A). Date-of-burning estimates suggest that fires (2015) in the latter region spread from the neighbouring industrial plantation into the recently disturbed area, resulting in a footprint of burning that dwarfs the patches of earlier (2014) forest loss (Figures 2A and 4A). The effect of recent canopy openings facilitating the spread and development of wildfires that cause further deforestation has been well documented in past studies in this region [42,43].

AF observations (VIIRS) revealed a significant footprint of fires in eastern KNP between 2015–2016 that coincides with detected forest loss over this period (Figure 2) but not with BA (Figure 4D). This is perhaps because, relative to BA datasets, AF products are known to detect more small-scale fires [28], such as those originating from the individual smallholder farms that predominate in eastern KNP. Lower detection accuracy over croplands and other areas of predominantly small fires is a known limitation of the MCD64A1 dataset, owing largely to its coarse resolution (500 m) [27,28,44].

**Figure 4.** MCD64A1 (left): location (**A**) and annual estimates (**B**) of detected burned area (BA) within KNP between 2002–2021. VIIRS (right): active fire detections within and outside the perimeters of detected BA in KNP between 2015–2016 (**C**) and cumulative active fire detections across days of the year (DOY) between 2012–2021 within KNP (**D**).

To further assess the validity of MCD64A1-detected BA over KNP, we followed a validation approach defined by UN-SPIDER that maps burn severity using the difference of normalized burn ratios (dNBR) between pre- and post-fire satellite images (Figure 5). We chose to validate BA from the 2015 wildfires using L8 cloud-free composite images over three sample areas of KNP, covering both burned (regions A and C) and non-burned (region B) land according to MCD64A1 (Figure 5; see caveats in Section 2.2.2). The MCD64A1 dataset performed well over the SW region of KNP where croplands are absent, but poorly over the other sampled areas where croplands predominate; it missed all of the moderate- to severe-burn scars over SE KNP (Figure 5) despite a large number of active fire detections over this region in 2015 (Figure 4). Despite its poor performance over croplands, the MCD64A1 dataset still provides a good indicator of years of high BA within KNP (Figure 4), but alternative BA datasets and/or mapping methods are required to estimate the contributions of fire to forest loss in this region.

#### *3.3. Climatic Influences of Fire and Forest Loss within KNP*

According to estimates of total "fire season" (Aug–Nov) rainfall over the past 25 years, the deforested eastern region of the park receives the least amount of rainfall, as low as 5.5 mm/day (500 mm/3 mo.) on average around the bounds of Sangatta city (Figure 6A). Previous research over Borneo and other tropical regions has found patterns of lower rainfall and higher ambient temperatures over deforested areas (compared to unlogged), with the most pronounced effects observed during the dry season and El Niño-related droughts [45,46]. Deforested areas are also more accessible to humans, experience higher temperature extremes, and contain relatively higher fuel loads from logging debris and woody successional plants, all of which contribute to an elevated fire hazard/risk in such areas [43,47]. These implications are supported by studies that have connected rampant deforestation patterns throughout eastern Borneo with stronger El Niño droughts [45,47] and more severe and widespread wildfires [42].

During the 2015 fire season, rainfall estimates (<2.5 mm/day) and single-point observations (<0.75 mm/day) reveal extreme drought conditions over much of the burned area (Figure 6B), well below the reported 4–5 mm/day fire risk threshold for this region [3,45]. These severe drought conditions—driven by a 20-year peak of El Niño strength—combined with a prevalence of potential ignition sources (see [6]) and widespread degradation to facilitate large-scale wildfires within KNP (Figure 7). It is likely that such wildfires contributed both directly (i.e., drought- and fire-related mortality) and indirectly (i.e., facilitating clearing/conversion) to the spike in detected forest loss over this period (Figure 7), but further analysis is necessary to confirm this interpretation.

Rainfall estimates also suggest that most, but not all, of the six El Niño events over the past 20 years have induced drought conditions below the fire risk threshold for this region (Figure 7). Spikes in BA were only detected during one weak (2004–2005) and one strong (2015–2016) EN event, but none were detected during several moderate-strength EN events that induced comparable rainfall deficits (Figure 7). Assuming that the MCD64A1 dataset is a sufficient indicator of large-scale fire occurrences within KNP, we posit that other factors, such as temperature and the incidental presence of ignition sources, have regulated the intensity and spread of fire throughout KNP during severe droughts. However, ongoing deforestation in KNP and predictions of more frequent and extreme super El Niño events [48–51] will worsen the fire hazard of this reserve during future droughts.

**Figure 5.** Validation of MCD64A1 burned area in 2015 over northwestern (**A**), southeastern (**B**), and southwestern (**C**) sample regions of Kutai National Park using differenced normalized burn ratio (dNBR) of pre- and post-fire Landsat 8 composite imagery. In the bottom right chart, the spatial extent of the MCD64A1-detected burned areas is compared against the top three classes of dNBR burn severity over each sample region.

**Figure 6.** Per-pixel averages (1997–2021) of total rainfall (**A**) and daily average rainfall vs. burned area in 2015 (**B**) during the standard fire season (August–November) within Kutai National Park. Rainfall estimates from CHIRPS Pentad; burned area estimates from MCD64A1.

**Figure 7.** Estimates of monthly burned area (MCD64A1) and three-monthly total precipitation (CHIRPS) within Kutai National Park, as well as the Oceanic Niño Index (ONI) between 2002–2021, with phases of El Niño in orange and La Niña in green. Green bubbles (top) represent annual estimates of forest loss within KNP (HGFC v1.9) between 2013–2021 in square kilometers (aligned to Jan 1 of the detection year). The reported fire risk threshold for this region of 4–5 mm/day (360–450 mm/3 month period) is denoted by the grey band.

#### **4. Conclusions**

In this study, we leveraged the extensive collection of historical satellite data and the processing power of Google Earth Engine to investigate trends in and influences on forest loss within Indonesia's Kutai National Park over various time frames since 1997. We found a gradual yet considerable westward progression of forest loss across the eastern (settled/farmed) region of KNP since 2000 and patches of fire-related forest loss along the park's southern border associated with the El Niño-related drought and fires of 2015–2016. In total, the HGFC v1.9 dataset detected 272 km<sup>2</sup> of forest loss (or 15% of baseline forest cover) since 2000, half of which (137 km<sup>2</sup>) occurred between 2015–2016 with significant overlap of detected burn scars. Our validation analyses support the detection performance of the HGFC v1.9 dataset over NE KNP but caution against the use of the MCD64A1 burned area dataset over eastern KNP, where considerable croplands compromise its accuracy. Our analysis of rainfall estimates also suggests that most El Niño events since 2000 have induced prolonged droughts over KNP below the fire risk threshold (4–5 mm/day) for this region. However, we found that drought severity alone cannot explain fire intensity within KNP, indicating the influences of other modulating factors (e.g., anomalous temperatures, incidental presence of fire sources, recent logging) in this phenomenon. Unlike all other areas of KNP, its eastern settled/farmed region receives average rainfall amounts that approach the fire risk threshold during the standard fire season (August–November), which is potentially (in part) attributable to widespread deforestation as demonstrated elsewhere in Borneo (Chapman et al. 2020; McAlpine et al. 2018). Lastly, we found that the most intense wildfire event of the past two decades, in 2015, coincided with rainfall estimates of <2.5 mm/day over the most fire-affected areas of the park.

Limitations of our analyses include the use of composite imaging in our validation assessments to address the obstacle of high-cloud cover over KNP. With this approach, we were still able to detect clear signals of canopy openings and burn scars in our validation analyses, thus adding confidence to our discussion of forest loss and its relative contributions over our study area. However, precise contributions of fire- and non-fire disturbances to annual forest loss within KNP could not be estimated with our approach, which along with standard validation assessments of the datasets used, should be the subject of future studies within KNP. Future studies of fire dynamics over KNP should also examine mechanisms of fire spread during wildfire years to identify targets of fire management efforts in this region.

The rates and influences of forest loss reported here contribute to undermining tourism-related revenue, degrading essential ecosystem services, heightening fire risk and severity in the region, and threatening the considerable biodiversity that resides within the park, including a stronghold population of the critically endangered Northeast Bornean orangutan. With the relocation of the Indonesian government capital to East Kalimantan scheduled for 2024, unprecedented new human migration into East Kalimantan and subsequent land and building requirements are soon likely to exert added pressures on KNP's forests. Wildfires and illegal logging within KNP must therefore be the priority subject of future conservation, management, and research programs to avoid further loss of forest cover in this protected area.

**Author Contributions:** Conceptualization, R.G.; formal analysis, R.G.; funding acquisition, X.W.; methodology, R.G.; resources, A.E.R.; supervision, X.W.; writing—original draft, R.G.; writing—review and editing, X.W. and A.E.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Natural Sciences and Engineering Research Council of Canada and the New Frontiers in Research Fund.

**Data Availability Statement:** The source codes for the Google Earth Engine scripts used in this paper are available at: https://github.com/ryan-guild/Kutai-National-Park-Deforestation-Fire-Rainfall-Mapping-Study.git. The Hansen Global Forest Change v1.9 dataset is available at: https://developers.google.com/earth-engine/datasets/catalog/UMD\_hansen\_global\_forest\_change\_2021\_v1\_9 (accessed on 1 September 2022). The MCD64A1.061 MODIS dataset is available at: https://developers.google.com/earth-engine/datasets/catalog/MODIS\_061\_MCD64A1 (accessed on 1 September 2022). The CHIRPS Pentad dataset is available at: https://developers.google.com/earth-engine/datasets/catalog/UCSB-CHG\_CHIRPS\_PENTAD?hl=en (accessed on 1 September 2022).

**Conflicts of Interest:** The authors declare no conflict of interest.


**Figure A1.** Differences in detection methodology between the three forest loss datasets used in this study, including the reference dataset (Planet/NICFI imagery), FCDM Tool [26], and the Hansen Global Forest Change product [23].

**Figure A2.** Comparison of rainfall estimates (CHIRPS) and rain gauge observations (RG) at the Bendili research station in Kutai National Park for annual, September, and October time periods between 2010–2018. Bendili RG observations are courtesy of the Orangutan Kutai Project.

#### **References**


## *Article* **Weakened Impacts of the East Asia-Pacific Teleconnection on the Interannual Variability of Summertime Precipitation over South China since the Mid-2000s**

**Wei Lu <sup>1</sup>, Yimin Zhu <sup>1,2</sup>, Zhong Zhong <sup>1,2</sup>, Yijia Hu <sup>1</sup> and Yao Ha <sup>1,2,</sup>\***


**Abstract:** The current study concentrates on the interdecadal shift in the interannual variability of summertime precipitation (IVSP) over South China (SC). Possible causes for the interdecadal shift are explored. The IVSP on a decadal time scale presents a significant weakening after the mid-2000s. The results show that the variances of the interannual precipitation variability over the SC region between 1993 and 2004 (hereafter S1) and 2005 and 2020 (hereafter S2) are 1.40 mm d<sup>−1</sup> and 0.58 mm d<sup>−1</sup>, respectively. The variance of the IVSP has decreased by 58.6% since the mid-2000s. The current study reveals that the reduction in the IVSP over SC after the mid-2000s is prominently attributed to the weakened impact of the East Asia-Pacific (EAP) teleconnection. Before the mid-2000s, the interannual variation of the east-west movement of the western Pacific subtropical high was more significant. The warming over the tropical central-eastern Pacific (CEP) and cooling over the western Pacific (WP) suppress the Walker cell in the tropical Pacific and induce an anomalous Hadley cell with its descending branch over the WP in the wet years. The anomalies of SST and atmospheric circulation show opposite phases in the dry years. This SSTA pattern enhances the northward propagation of the EAP teleconnection through a Rossby-wave-type response, which triggers an ascending/descending branch with active/suppressed convection over the northwestern Pacific in the wet/dry years. Therefore, the cooling WP and El Niño in its developing phase provide an ideal condition for more precipitation over SC. However, the above ocean–atmosphere interactions changed after the mid-2000s. The significant SST changes in the tropical CEP and the WP weaken the EAP teleconnection and atmospheric circulation anomalies over SC, leading to a significant interdecadal reduction in the IVSP over SC after the mid-2000s.

**Keywords:** summertime precipitation; interannual variability; interdecadal shift; South China

#### **1. Introduction**

The East Asian summer monsoon (EASM) exhibits significant interannual and interdecadal variabilities, which have a great influence on summertime precipitation in East Asia [1]. Huang et al. gave a comprehensive and systematic summary of the spatiotemporal features, processes, and causes of the EASM variability and pointed out that the EASM has significant variabilities on interannual and interdecadal time scales [2]. Moreover, the EASM experienced an interdecadal weakening around the late 1970s, while summertime precipitation increased in the Yangtze-Huai River Basin (YHRB) and decreased in North China correspondingly. Summertime precipitation in South China (SC) increased after 1992 and decreased in YHRB after 1999 [3]. Correspondingly, the correlation between interannual variations of the EASM and ENSO became unstable [4]. SC is located to the east of the Qinghai-Tibet Plateau and south of YHRB. Due to the interactions between tropical and middle- to high-latitude weather systems, the weather and climate in SC are complicated and diverse. SC is the area with the amplest precipitation and the longest rainy season in China. Severe drought and flood disasters associated with abnormal precipitation cause huge losses of lives and properties and greatly influence local economies [5–8].

**Citation:** Lu, W.; Zhu, Y.; Zhong, Z.; Hu, Y.; Ha, Y. Weakened Impacts of the East Asia-Pacific Teleconnection on the Interannual Variability of Summertime Precipitation over South China since the Mid-2000s. *Remote Sens.* **2022**, *14*, 5098. https://doi.org/10.3390/rs14205098

Academic Editor: Xander Wang

Received: 19 August 2022 Accepted: 7 September 2022 Published: 12 October 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Influenced by the EASM, summertime precipitation over eastern China shows multiscale variability, including seasonal and intraseasonal variability, as well as interannual and interdecadal variability [2,9–15]. Previous studies have shown that there were three interdecadal variations of summertime precipitation that occurred in the mid-1970s and the early and late 1990s [16]. The interdecadal decrease of summertime precipitation in SC in the 1970s [17,18] is attributed to the weakening of the EASM [19] and the warming of the tropical Indian Ocean (IO) and the tropical central-eastern Pacific (CEP) [20,21].

Summertime precipitation in SC exhibited an interdecadal intensification after the early 1990s [22] and a weakening after the late 1990s [23,24]. Most studies pointed out that uninterrupted warming in the tropical IO and CEP in summertime can cause anomalous Walker Cell and Hadley Cell circulations [25,26]. The interdecadal intensification of precipitation over SC in the early 1990s is attributed to joint effects of the Walker Cell and Hadley Cell anomalies and the reduction in spring snowfall in northern Eurasia [25,27]. Due to the strengthening of the western Pacific subtropical high (WPSH) caused by the tropical Pacific SSTA [23] and the significant reduction in tropical cyclone activity in the western Pacific [28], summertime precipitation in SC decreased in the late 1990s.

Previous studies have shown that there is a link between the East Asian climate and the Northwest Pacific climate, which can be established through atmospheric teleconnection [29]. On the basis of previous studies, Nitta found that there is a seesaw pattern in the atmospheric circulations between regions near Japan and the Philippine Sea and proposed the Pacific-Japan (PJ) teleconnection mode for the first time [30]. Based on the analysis of observations, Huang and Li also found that there is a tele-connected wave train similar to the "tripole" structure of PJ in the atmosphere extending from the Philippines to the Sea of Okhotsk, which is named East Asia-Pacific (EAP) teleconnection type [31]. Many studies focused on the formation mechanism of this teleconnection type. Based on numerical experiments using a barotropic model, Kurihara and Tsuyuki found that convective activities near the Philippines can trigger two-dimensional northward propagating Rossby waves with a horizontal structure similar to the observed PJ/EAP teleconnection [32]. Wakabayashi and Kawamura defined the PJ index based on the decomposition of the orthogonal function through the distribution position of the central node of the graph [33]. Later, the PJ index was widely applied to find the connection between the Northwest Pacific climate anomalies and the East Asian climate anomalies. It is also used in climate forecast, especially the prediction of summertime precipitation anomalies in eastern China. Similar to summertime precipitation in South China, the position and intensity of the EAP teleconnection also underwent significant interdecadal variations. After the late 1970s, the position of the EAP teleconnection pattern changed significantly with an obvious shift to the west and south [34]. Yin et al. pointed out that rainfall anomalies over YHRB associated with the EAP teleconnection were weaker and less evident after the 1980s [35]. Xu et al. 
revealed that the PJ/EAP (hereafter EAP) teleconnection shifted to the east after the late 1990s, and its intensity weakened significantly [36].

The above studies are helpful for understanding the interdecadal change in the mean state of summertime precipitation over SC. However, less attention has been paid to the interdecadal change of the IVSP in SC [37]. Many studies have revealed that the factors influencing the interannual variation of the East Asian summertime climate have changed considerably on the interdecadal time scale in the past decades [38–42]. The IVSP could contribute substantially to the occurrence of extreme precipitation events and should be further investigated. Moreover, knowledge of interannual changes in various meteorological elements could improve the precision of seasonal prediction by statistical climate prediction models [43]. Therefore, it is necessary to study the interdecadal shift of the IVSP over SC and explore the possible mechanisms behind it.

#### **2. Materials and Methods**

The precipitation data used in this study are the Precipitation Reconstruction dataset on a 2.5° × 2.5° grid provided by NOAA [44–46]. The monthly atmospheric circulation data are derived from the NCEP/NCAR Reanalysis at a 2.5° × 2.5° resolution [47]. The daily outgoing longwave radiation (OLR) dataset, produced from NOAA polar-orbiting satellite remote sensing data with a horizontal resolution of 2.5° × 2.5°, is also used [48]. In addition, the NOAA OI SST V2 High Resolution Dataset at a 0.25° × 0.25° resolution is used [49].

The time span of this study is the summertime season (JAS, the second rainy season in SC, which usually has a strong relationship with tropical systems) from 1979 to 2020. A 10-year running *F*-test is performed on the time series of summertime precipitation to assess the interdecadal change. The Student's *t* test is used to examine the confidence level of the composites. The wave activity flux (WAF) is applied to describe the energy propagation of wave trains [50]. The conversion of local kinetic energy (CK) is calculated to depict the conversion of kinetic energy from mean flow to turbulence [51].
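As a rough illustration of the running *F*-test described above, the following Python sketch computes, for each candidate year, the ratio of the sample variances in the windows before and after it. The function name `running_f_ratio`, the window layout (the 10 values before the candidate year versus the 10 values from the candidate year onward), and the synthetic series are assumptions made for illustration. The study does not specify these details, and the comparison with the critical *F* value at the 95% confidence level is omitted here.

```python
from statistics import variance


def running_f_ratio(series, window=10):
    """Return (index, F) pairs, where F = var(before) / var(after)."""
    results = []
    for i in range(window, len(series) - window + 1):
        before = series[i - window:i]   # window ending just before index i
        after = series[i:i + window]    # window starting at index i
        results.append((i, variance(before) / variance(after)))
    return results


# Synthetic anomalies: large swings for 12 steps, then small swings for 16 steps
series = [2.0 if k % 2 == 0 else -2.0 for k in range(12)]
series += [0.5 if k % 2 == 0 else -0.5 for k in range(16)]

ratios = running_f_ratio(series)
# The largest variance ratio marks the most likely shift point
shift_index, peak_f = max(ratios, key=lambda pair: pair[1])
print(shift_index, round(peak_f, 2))
```

In this toy series the variance ratio peaks at the index where the amplitude drops, which mirrors how a peak in the running *F*-test value is read as the turning point.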

#### **3. Results**

#### *3.1. Weakened Interannual Variability of Summer Precipitation over SC since the Mid-2000s*

To describe the interdecadal shift of the IVSP in the mid-2000s, Figure 1 shows the regional mean time series, interannual difference, and 10-year running *F*-test values of summertime precipitation over SC (20–35°N, 100–125°E) from 1979 to 2020. The time series and interannual difference both reveal a significant weakening of the interannual variability after the mid-2000s. To determine the interdecadal turning point, the 10-year running *F*-test and 11-year running standard deviation of summertime precipitation over SC are analyzed (Figure 1b). The results suggest that the IVSP over SC experienced a significant weakening around 2004/05. Hence, the year 2004 is taken as the shift point, and the periods 1993–2004 and 2005–2020, with strong and weak IVSP, are defined as S1 and S2, respectively. The variance is 1.40 mm d<sup>−1</sup> in S1 and 0.58 mm d<sup>−1</sup> in S2, and the mean values of the interannual difference in summertime precipitation are 1.82 mm d<sup>−1</sup> in S1 and 0.81 mm d<sup>−1</sup> in S2. Furthermore, the mean value of the 11-year running standard deviation decreases from 1.33 mm d<sup>−1</sup> in S1 to 0.64 mm d<sup>−1</sup> in S2. All the above results support the conclusion that the interdecadal shift occurred around 2004/05.
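The 11-year running standard deviation used to locate the turning point can be sketched as follows. This is a minimal illustration, assuming a centered window and the sample (n − 1) standard deviation; the helper names `running_std` and `period_mean` are hypothetical, and the input series is a placeholder rather than the precipitation data.

```python
from statistics import stdev


def running_std(series, window=11):
    """Return (center_index, sample std) for each full centered window."""
    half = window // 2
    return [
        (i, stdev(series[i - half:i + half + 1]))
        for i in range(half, len(series) - half)
    ]


def period_mean(pairs, start, end):
    """Average the running std over center indices start..end (inclusive)."""
    values = [s for i, s in pairs if start <= i <= end]
    return sum(values) / len(values)


# Placeholder series of 13 values, giving three full 11-point windows
pairs = running_std([float(v) for v in range(13)])
print(pairs)
print(period_mean(pairs, 5, 6))
```

Averaging the running standard deviation separately over the years of each period, as `period_mean` does for index ranges, yields the kind of per-period mean values quoted above for S1 and S2.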

The time series of summertime precipitation with the linear trends in S1 and S2 removed is shown in Figure 2. The variance is 1.32 mm d<sup>−1</sup> in S1 and reduces to 0.37 mm d<sup>−1</sup> in S2. The variance of the detrended time series in S2 decreases by 72.0% compared to that in S1, which is larger than the decrease of 58.6% for the original time series that includes the linear trend. Based on the detrended time series, the years with detrended summertime mean precipitation larger (smaller) than 0 are determined to be wet (dry) years. The wet and dry years during S1 and S2 are listed in Table 1. The differences between the wet and dry years are then composited to explore the reasons for the interdecadal shift of the IVSP over SC. Linear trends are removed from all environmental variables before the composite analysis is conducted.
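The detrending and wet/dry classification described above, together with the quoted percentage decreases, can be sketched in a few lines of Python. The function names and the six-year series are hypothetical and only illustrate the procedure; a least-squares linear fit is assumed for the trend removal.

```python
def detrend(years, values):
    """Remove the least-squares linear trend and return the residuals."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(years, values)]


def classify_years(years, residuals):
    """Split years into wet (residual > 0) and dry (residual < 0)."""
    wet = [yr for yr, r in zip(years, residuals) if r > 0]
    dry = [yr for yr, r in zip(years, residuals) if r < 0]
    return wet, dry


def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Hypothetical six-year series: a linear trend plus a zero-mean pattern
years = [2005, 2006, 2007, 2008, 2009, 2010]
pattern = [1.0, -2.0, 1.0, 1.0, -2.0, 1.0]
values = [5.0 + 0.1 * (yr - 2005) + p for yr, p in zip(years, pattern)]

wet_years, dry_years = classify_years(years, detrend(years, values))
print(wet_years, dry_years)

# Variance values quoted in the text: original and detrended series
print(round(percent_reduction(1.40, 0.58), 1))  # 58.6
print(round(percent_reduction(1.32, 0.37), 1))  # 72.0
```

The last two lines reproduce the 58.6% and 72.0% decreases from the variance values reported for S1 and S2.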

The composite differences in summertime precipitation between wet and dry years during S1 and S2 are shown in Figure 3b,d, respectively. Summertime precipitation over SC shows positive anomalies in both periods but with different spatial distributions and intensities. In S1, the center of positive anomalies is located at (26°N, 113°E), and the maximum anomaly reaches 3.2–3.4 mm d<sup>−1</sup> (Figure 3b). In S2, the two centers of positive anomalies are located at (29°N, 113°E) and (23°N, 115°E), which are to the north and south of the positive center in S1, respectively. The maximum value at the two positive centers in S2 is around 1.2–1.8 mm d<sup>−1</sup>, weaker than that in S1 (Figure 3d). Consistent with the composite differences of summertime precipitation, the center of the maximum standard deviation is located at (26°N, 113°E) in S1, but it splits into two centers in S2, which shift to the north and south, respectively. The values of standard deviation at the two positive anomaly centers also decrease in S2 compared to those in S1, which agrees with the interdecadal weakening of the IVSP over SC (Figure 3a,c).
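The composite analysis at a single grid point can be illustrated with the sketch below, which returns the wet-minus-dry mean difference and a two-sample Student's *t* statistic. The pooled-variance form of the test, the function name `composite_difference`, and the sample values are assumptions; the study states only that a Student's *t* test is used. Comparing the statistic with the critical value at the 95% confidence level is left out.

```python
from math import sqrt
from statistics import mean, variance


def composite_difference(wet_values, dry_values):
    """Return (wet minus dry mean difference, t statistic, degrees of freedom)."""
    n_wet, n_dry = len(wet_values), len(dry_values)
    diff = mean(wet_values) - mean(dry_values)
    dof = n_wet + n_dry - 2
    # Pooled variance of the two samples (equal-variance assumption)
    pooled_var = (
        (n_wet - 1) * variance(wet_values) + (n_dry - 1) * variance(dry_values)
    ) / dof
    t_stat = diff / sqrt(pooled_var * (1.0 / n_wet + 1.0 / n_dry))
    return diff, t_stat, dof


# Hypothetical precipitation values (mm per day) at one grid point
diff, t_stat, dof = composite_difference([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
print(diff, round(t_stat, 3), dof)
```

Applying the same function at every grid point would give the composite difference map and the mask of significant points shown as dots in the figures.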

**Figure 1.** (**a**) Time series (black solid line) and interannual difference (blue bars) of summertime precipitation over SC. (**b**) *F*-test values (red solid line) and 11-year running standard deviations (blue solid lines) of summertime precipitation over SC (black dashed lines in (**a**) are averaged values of interannual variability in S1 and S2, respectively; blue dashed lines in (**b**) are averaged values of running standard deviation in S1 and S2, respectively; the red dashed line in (**b**) denotes the significant value exceeding the 95% confidence level; unit: mm d<sup>−1</sup>).

**Figure 2.** Time series of detrended summertime precipitation over SC during S1 and S2.

**Table 1.** Wet and dry years picked out in S1 and S2 based on the detrended summertime mean precipitation being larger and smaller than 0, respectively.


**Figure 3.** Composite differences in summertime precipitation between wet years and dry years (unit: mm d<sup>−1</sup>; (**b**,**d**) for S1 and S2, respectively; dots indicate the regions exceeding the 95% confidence level) and the standard deviation of summertime precipitation (unit: mm d<sup>−1</sup>; (**a**,**c**) for S1 and S2, respectively).

#### *3.2. Atmospheric Circulation Anomalies Responsible for the Interdecadal Shift of Summertime Precipitation over SC*

To find out why there are large differences in the IVSP between S1 and S2, atmospheric circulation characteristics are analyzed in this section. Figure 4 shows composite differences of 200 and 500 hPa geopotential height as well as 200 hPa zonal wind between wet and dry years. The center of positive geopotential height anomalies occurs at 45°N in both S1 and S2, but the center in S2 is located to the east of that in S1. The negative geopotential height anomalies are smaller in S2 than in S1 (Figure 4a,b,d,e). Composites of the western Pacific subtropical high (WPSH) feature line (5880 gpm contour) in wet and dry years are displayed in Figure 4b,e, respectively. In S1, the westward extension of the WPSH is significant in dry years; in contrast, it retreats to the east in wet years (Figure 4b). In S2, the location and intensity of the WPSH show little difference between wet and dry years (Figure 4e). This suggests that the interannual variability of the WPSH significantly weakened in S2 compared to that in S1, which is similar to the variation in summertime precipitation. Consistent with geopotential height anomalies, the positive and negative anomaly centers of zonal wind at 200 hPa are located more eastward in S1 and more westward in S2. The intensity of the positive and negative anomaly centers of zonal wind is weaker in S2 than in S1 (Figure 4c,f).

The influence of atmospheric circulation anomalies on the IVSP is analyzed above from the perspective of large-scale circulation. To explore the impact of water vapor transport and local convection on the IVSP, composite differences between wet and dry years in moisture flux and its divergence in the middle and lower levels, OLR and omega in the middle and lower levels, and 850 hPa relative vorticity and horizontal wind are analyzed. Results are displayed in Figure 5. In S1, negative OLR and omega anomalies are centered over SC (Figure 5a); in S2, however, the centers of the negative OLR and omega anomalies are located more eastward and are weaker (Figure 5d). This result indicates that the intensity of convection is weaker in S2 than in S1. From the perspective of moisture transport, strong easterly anomalies transport abundant moisture from the Northwestern Pacific (NWP) to SC in S1, resulting in significant moisture convergence over SC (Figure 5b). In S2, however, no significant convergence or divergence of moisture flux anomalies can be found over SC (Figure 5e). Positive relative vorticity anomalies are centered over SC in S1 with strong easterly anomalies (Figure 5c). The meridionally distributed centers of positive and negative relative vorticity anomalies resemble the EAP teleconnection. Compared to that in S1, the EAP teleconnection moves eastward and weakens in S2 (Figure 5f). The changes in the location and intensity of the EAP teleconnection could be a key factor in the interdecadal shift of the IVSP over SC.

**Figure 4.** Composite differences of (**a**,**d**) 200 hPa geopotential height (unit: gpm), (**b**,**e**) 500 hPa geopotential height (unit: gpm), and (**c**,**f**) 200 hPa zonal wind (unit: m s<sup>−1</sup>) between wet years and dry years ((**a**–**c**) for S1 and (**d**–**f**) for S2; dots indicate the regions exceeding the 95% confidence level; red and black lines in (**b**,**e**) indicate composites of the WPSH feature line for wet and dry years, respectively).

**Figure 5.** Composite differences of (**a**,**d**) vertically integrated omega from 850 to 500 hPa (contours with an interval of 2, unit: kg (m s)<sup>−3</sup>) and OLR (shading, unit: W m<sup>−2</sup>); (**b**,**e**) vertically integrated moisture flux from 1000 to 500 hPa (vectors, unit: kg m<sup>−1</sup> s<sup>−1</sup>) and its divergence (shading, unit: 10<sup>−6</sup> kg m<sup>−2</sup> s<sup>−1</sup>); and (**c**,**f**) 850 hPa wind (vectors, unit: m s<sup>−1</sup>) and relative vorticity (shading, unit: 10<sup>−6</sup> s<sup>−1</sup>) between wet years and dry years ((**a**–**c**) for S1; (**d**–**f**) for S2; significant values exceeding the 95% confidence level are displayed; the dashed/solid lines suggest values larger/smaller than zero in (**a**,**d**)).

The composite differences of summertime precipitation between wet and dry years and the standard deviations in the corresponding years over SC show highly consistent spatial distribution characteristics (Figure 3). To inspect whether the composite differences in moisture transport and local convection between wet and dry years are similarly consistent with their standard deviations in the corresponding years, Figure 6 shows the standard deviations of moisture flux divergence, vertically integrated omega from 850 to 500 hPa, and OLR in S1 and S2, and the respective differences. Consistent with the results shown in Figure 5, moisture flux divergence, omega, and OLR have larger standard deviations over SC in S1 (Figure 6a–c) than in S2 (Figure 6d–f). Moreover, the differences in moisture flux divergence, omega, and OLR between S1 and S2 (Figure 6g–i) display a spatial distribution pattern similar to that of the standard deviation of summertime precipitation over SC in S1.

**Figure 6.** Standard deviations of vertically integrated moisture flux divergence from 1000 to 500 hPa ((**a**,**d**,**g**); unit: 10<sup>−6</sup> kg m<sup>−2</sup> s<sup>−1</sup>), vertically integrated omega from 850 to 500 hPa ((**b**,**e**,**h**); unit: kg (m s)<sup>−3</sup>), and OLR ((**c**,**f**,**i**); unit: W m<sup>−2</sup>) in S1 (**a**–**c**) and S2 (**d**–**f**), and the respective difference between S2 and S1 (**g**–**i**).

#### *3.3. Weakened Impacts of the EAP Teleconnection on Summertime Precipitation over SC since the Mid-2000s*

As noted in Section 3.2, the EAP teleconnection is of vital importance for the interdecadal shift of the IVSP over SC. This section explores the possible mechanisms of the interdecadal shift in the impact of the EAP teleconnection on summertime precipitation over SC around the mid-2000s. The composite differences of OLR and 850 hPa relative vorticity between wet and dry years are shown in Figure 5, which can reflect some features of the EAP teleconnection. In S1, negative differences of OLR (Figure 5a) and positive differences of vorticity (Figure 5c) are centered over SC, while the positive OLR difference centers and negative vorticity difference centers are distributed to the north and south. In S2, the centers of negative OLR (Figure 5d) and positive vorticity (Figure 5f) anomalies both move eastward and their intensity weakens. These results suggest that the EAP teleconnection may have experienced an interdecadal weakening after the mid-2000s.

Since the EAP teleconnection is characterized by meridionally distributed centers of relative vorticity anomalies with alternating signs [36], the 200 and 850 hPa relative vorticity anomalies connected with the EAP teleconnection in wet years during S1 and S2 are shown in Figure 7. In S1, the centers of positive and negative relative vorticity anomalies are meridionally distributed along 105–120°E with the maximum positive center in lower levels located over SC (Figure 7b). In S2, the centers of positive and negative relative vorticity anomalies move eastward and are meridionally distributed along 115–130°E with the intensity weakened in lower levels (Figure 7d). Following the approach proposed by Huang to define the EAP index using 500 hPa geopotential height anomalies [52], the EAP index is defined in the present study using 850 hPa relative vorticity anomalies. The EAP index can be expressed as:

$$I_{\mathrm{EAP}} = -0.25\,\mathrm{Vor}'(R_1) + 0.5\,\mathrm{Vor}'(R_2) - 0.25\,\mathrm{Vor}'(R_3) \tag{1}$$

where *Vor*′(*R*<sub>1</sub>), *Vor*′(*R*<sub>2</sub>), and *Vor*′(*R*<sub>3</sub>) represent averaged 850 hPa relative vorticity anomalies over (30–36°N, 105–120°E), (16–29°N, 105–120°E), and (8–15°N, 105–120°E), respectively. The above three areas are denoted by the red boxes in Figure 7b,d. The EAP index is 1.187 in S1 and −0.002 in S2. The large difference in the EAP index between S1 and S2 suggests that the EAP teleconnection has weakened significantly since the mid-2000s. Furthermore, the weakening of the EAP teleconnection is partly attributed to the weakening of its intensity and partly attributed to its eastward shift.
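Equation (1) can be evaluated with a short Python sketch such as the one below. The helper names `box_mean` and `eap_index`, the representation of the field as (latitude, longitude, value) tuples, and the unweighted box average are assumptions for illustration; the study does not state whether area weighting is applied, and the sample anomalies are invented.

```python
def box_mean(points, lat_range, lon_range):
    """Unweighted mean of values at (lat, lon, value) points inside a box."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    inside = [
        value
        for lat, lon, value in points
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    ]
    return sum(inside) / len(inside)


def eap_index(points):
    """Evaluate Equation (1) from 850 hPa relative vorticity anomalies."""
    lon_range = (105.0, 120.0)
    r1 = box_mean(points, (30.0, 36.0), lon_range)  # northern box R1
    r2 = box_mean(points, (16.0, 29.0), lon_range)  # central box R2 (over SC)
    r3 = box_mean(points, (8.0, 15.0), lon_range)   # southern box R3
    return -0.25 * r1 + 0.5 * r2 - 0.25 * r3


# Hypothetical anomalies: positive over R2, negative over R1 and R3
points = [
    (32.5, 110.0, -1.0), (35.0, 115.0, -1.0),
    (20.0, 110.0, 1.0), (25.0, 115.0, 1.0),
    (10.0, 110.0, -1.0), (12.5, 115.0, -1.0),
    (45.0, 110.0, 9.0),  # outside all three boxes, ignored
]
index = eap_index(points)
print(index)
```

With this weighting, the index is largest when a positive vorticity anomaly over the central box is flanked by negative anomalies to the north and south, and it approaches zero when the tripole weakens or moves out of the fixed boxes.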

**Figure 7.** Composite anomalies of relative vorticity (shading, unit: 10<sup>−6</sup> s<sup>−1</sup>) and associated WAF (vectors, unit: m<sup>2</sup> s<sup>−2</sup>) at (**a**,**c**) 200 and (**b**,**d**) 850 hPa in wet years of S1 (**a**,**b**) and S2 (**c**,**d**) (significant values exceeding the 95% confidence level are displayed).

In order to illustrate the propagation of Rossby waves, the wave activity flux (WAF) is also composited. In S1, the WAF connected with the EAP teleconnection emanates from the tropical WP, propagates northward, and converges over SC and the adjacent sea to its east in the lower levels (Figure 7b). In the higher levels, the WAF of the EAP teleconnection predominantly propagates along the westerly jet stream at the middle to high latitudes (Figure 7a). In S2, the propagation paths of the WAF in the lower (Figure 7d) and higher levels (Figure 7c) are both similar to those in S1, but the intensity of the WAF is weaker than that in S1. A previous study indicated that the energy of the EAP teleconnection in the wet years of S1 predominantly originates from a Rossby-wave-type response [53].

Traditional studies regard the EAP teleconnection as a free Rossby wave train propagating from the subtropics to the mid-latitudes in the lower troposphere [29,31]. To explain the interdecadal variation of the EAP teleconnection more clearly from the perspective of energy, the composite differences in CK vertically integrated from 1000 to 100 hPa between wet years and dry years are displayed in Figure 8. Note that positive CK indicates that fluctuations develop more easily by extracting kinetic energy from the basic flow. In S1, positive CK is centered at (35–40°N, 150–160°E) and (15–20°N, 105–115°E) (Figure 8a). In S2, the positive CK centers shift to the east and are located at (35–45°N, 170–180°E) and (25–35°N, 125–135°E) (Figure 8b). Therefore, the eastward shift of CK in S2 contributes significantly to the eastward shift of the EAP teleconnection.
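For reference, a widely used form of the barotropic kinetic energy conversion is given below. It is an assumption that the definition adopted from [51] matches this form exactly. Here *u*′ and *v*′ denote the perturbation zonal and meridional winds, and the overbars denote the basic-flow winds:

$$CK = \frac{v'^2 - u'^2}{2}\left(\frac{\partial \bar{u}}{\partial x} - \frac{\partial \bar{v}}{\partial y}\right) - u'v'\left(\frac{\partial \bar{u}}{\partial y} + \frac{\partial \bar{v}}{\partial x}\right)$$

In this form, positive CK corresponds to kinetic energy being transferred from the basic flow to the perturbations, which agrees with the sign convention used in the text.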

**Figure 8.** Composite differences of the conversion of local kinetic energy (CK) vertically integrated from 1000 to 100 hPa between wet and dry years in (**a**) S1 and (**b**) S2 (dots indicate the regions exceeding 95% confidence level; black curves in (**a**,**b**) represent the EAP teleconnection).

It has been found that the eastward shift of the EAP teleconnection is mainly forced by the shift of the related SSTA [54,55]. Figure 9 shows the composite differences of SSTA (unit: K) between wet and dry years from the prior winter (DJF) to the subsequent summertime (JAS). In S1, the EAP teleconnection is connected with the transitional phase from La Niña to El Niño and from "North-warm-South-cold" to "North-cold-South-warm" in the Indian Ocean (IO), as well as a significant cooling phase over the western Pacific (WP) (Figure 9a–c). In S2, the EAP teleconnection is connected with a decaying El Niño Modoki [56] and a cooling phase in the IO, as well as a warming phase over the WP (Figure 9d–f).

**Figure 9.** Composite differences of SSTA (unit: K) between wet and dry years ((**a**,**d**)/(**b**,**e**)/(**c**,**f**) for pre-winter (DJF)/spring (MAM)/summer (JAS); (**a**–**c**)/(**d**–**f**) for S1/S2; dots indicate the regions exceeding the 95% confidence level).

The patterns of composite SSTA in S1 and S2 are significantly different, which is at least partly responsible for the interdecadal shift of the EAP teleconnection. Looking at the situation in the summertime during S2, it is clear that the cooling in the tropical central-eastern Pacific (CEP) and the warming in the WP (Figure 9f) are favorable for the development of convective activity over WP [57,58]. Strong convective activities in the WP strengthen the WPSH by affecting the Hadley cell due to the weakening of EAP teleconnection. Since the enhancement of the IVSP in S1 attracts more attention, the present study focuses on the mechanism for the strengthening of summertime precipitation in S1.

It is found that the CEP warming can induce anomalous convective activities in the NWP through a Rossby-wave-type response and simultaneously destabilize the atmosphere in lower levels [59–61]. To further explore the influence of the SSTA pattern on atmospheric circulation anomalies in S1, we analyze the composite differences in the Walker cell (Figure 10a) and Hadley cell (Figure 10b) between wet and dry years. In the summertime of S1, the updraft in the tropical CEP and downdraft in the tropical WP are significantly enhanced (Figure 10a) due to the warming in the CEP and the cooling in the WP (Figure 9c). This anomalous Walker cell induces an anomalous Hadley cell over the northern WP (Figure 10b). Descending branches appear in the WP around 10°S–10°N and in North China around 35°N. An ascending branch occurs near 25°N, which is located at the center of the largest IVSP area (Figure 3a,b). The updraft there suppresses the development of the WPSH (Figure 4b) and promotes more summertime precipitation in the wet years of S1. The Walker cell and Hadley cell anomalies accompanying the SSTA have opposite impacts on summertime precipitation in SC in the dry years of S1. These two distinctly different processes are responsible for the more significant IVSP in S1.

In a nutshell, the enhanced EAP teleconnection is a Rossby-wave-type response to the warming in the CEP and the cooling in the WP, which result in anomalous Walker Cell and Hadley Cell in the wet years of S1. The northward propagation of Rossby wave energy promotes convective activities and suppresses the WPSH in the NWP, providing an advantageous condition for the increase of summertime precipitation over SC.

**Figure 10.** Composite differences of (**a**) Walker cell (averaged over 10°S–10°N) and (**b**) Hadley cell (averaged over 100–120°E) between wet and dry years in S1 (shading regions suggest composite differences of vertical velocity (unit: 10<sup>−3</sup> m s<sup>−1</sup>) between wet and dry years; significant values exceeding the 95% confidence level are displayed).

#### **4. Conclusions**

Many studies have revealed that the factors influencing the interannual variation of the East Asian summertime climate have changed considerably on the interdecadal time scale in the past decades [38–42]. This study reveals that the IVSP over SC experienced a significant weakening after the mid-2000s. Taking 2004 as the shift point, the periods 1993–2004 and 2005–2020 are defined as S1 and S2, respectively. Using composite differences of various atmospheric elements between wet and dry years in S1 and S2, the reasons why the interdecadal shift of the IVSP over SC occurred are explored.

The composite differences of summertime precipitation between wet and dry years in S1 and S2 are compared. It is found that summertime precipitation over SC shows positive anomalies for both periods but presents distinctly different spatial distributions and intensities. In S1, the center of positive anomalies is located at (26°N, 113°E) and the intensity of the anomalies reaches 3.2–3.4 mm d<sup>−1</sup>. In S2, two centers of positive anomalies are distributed to the north and south of the positive center in S1. The two centers are respectively located at (29°N, 113°E) and (23°N, 115°E). The intensity of summertime precipitation anomalies in S2 is 1.2–1.8 mm d<sup>−1</sup> at the two positive centers, weaker than that in S1. Consistent with the composite differences of summertime precipitation, the center of the standard deviation is located at (26°N, 113°E) in S1 but it splits into two centers that shift to the north and south respectively in S2. The magnitude of standard deviation at the positive anomaly centers also decreases in S2 compared to that in S1, which agrees well with the interdecadal weakening of the IVSP over SC. Consistent with summertime precipitation, local moisture flux divergence, omega, and OLR all have larger standard deviations over SC in S1 than in S2.

To find out the reasons for the large differences in the IVSP between S1 and S2, atmospheric circulation characteristics are analyzed. From the perspective of large-scale circulation, the center of positive geopotential height anomalies is located at 45°N for both S1 and S2, yet the center in S2 is located further westward compared to that in S1. The local negative geopotential height anomalies weaken from S1 to S2. Moreover, the WPSH extends westward in dry years and retreats to the east in wet years during S1. The location and intensity of the WPSH show little difference between wet and dry years during S2. This suggests that the WPSH interannual variability significantly weakens in S2 compared to that in S1, which is similar to the variation in summertime precipitation. Consistent with geopotential height anomalies, the centers of positive and negative anomalies of zonal wind at 200 hPa are located relatively eastward in S1 and westward in S2. The intensity at the centers of positive and negative anomalies of zonal wind is weaker in S2 than in S1. From the perspective of moisture transport and local convection, negative OLR and omega anomalies are centered over SC in S1, while the centers of negative anomalies are located eastward and become weak in S2. This result indicates that the intensity of convection weakens from S1 to S2. Moreover, strong easterly anomalies transport large amounts of moisture from the Northwestern Pacific (NWP) in S1, resulting in significant moisture convergence over SC. However, no significant convergence or divergence of moisture flux anomalies can be found over SC in S2. Positive relative vorticity anomalies are centered over SC in S1 accompanied by strong easterly anomalies. The meridional centers of positive and negative relative vorticity anomalies resemble the EAP teleconnection, which moves eastward and becomes weak from S1 to S2.

Since the EAP teleconnection is of vital importance for the interdecadal shift of the IVSP over SC, the reasons for the interdecadal shift of the EAP teleconnection impact on summertime precipitation over SC around the mid-2000s are investigated. The weakening of the EAP teleconnection is partly attributed to the weakening of its intensity and partly attributed to its eastward shift. It was found that the eastward shift of EAP teleconnection is mainly forced by the shift of related SSTA [54,56]. In S1, the EAP teleconnection is connected with the transitional phase from La Niña to El Niño, from the "North-warm-South-cold" pattern to "North-cold-South-warm" pattern in the IO and a significant cooling phase in the WP. In S2, the EAP teleconnection is connected with the Central Pacific El Niño decaying and a cooling phase in the IO, as well as a warming phase over the western Pacific. The patterns of composite SSTA are significantly different between S1 and S2, which is at least partly responsible for the interdecadal shift of the EAP teleconnection. Looking at the situation in the summertime of S2, the cooling in the tropical CEP and the warming in the western Pacific are favorable for the development of convective activity over WP [57,58]. Strong convective activities in the WP strengthen the WPSH by affecting the Hadley cell due to the weakening of EAP teleconnection. In the summertime of S1, the enhanced EAP teleconnection is a Rossby-wave-type response to the warming in the CEP and the cooling in the WP, which results in anomalous Walker Cell and Hadley Cell in the wet years of S1. The northward propagation of Rossby wave energy promotes convective activities and suppresses the WPSH in the NWP, providing an advantageous condition for the increase of summertime precipitation over SC.

The present study concentrates on the interdecadal shift of the IVSP over SC and the possible mechanisms behind it. However, it mainly performs dynamic and statistical analyses on observational datasets. In future work, numerical experiments should be conducted to verify the reliability of the results. This study reveals that the interdecadal shift of the EAP teleconnection is significant around the mid-2000s, which is caused by the interdecadal shift of the SSTA patterns in the CEP and the western Pacific. However, there may exist multiple factors responsible for the interdecadal shift of the EAP teleconnection. It is found that the interdecadal change of the PDO phase in the late-2000s could result in the interdecadal shift of the EAP teleconnection and variations of summertime precipitation in eastern China [61]. Further studies are necessary to explore the complex feedback loops that affect summertime precipitation in China.

**Author Contributions:** Conceptualization, Y.H. (Yao Ha) and Y.Z.; methodology, Y.H. (Yao Ha); validation, Z.Z. and Y.Z.; formal analysis, Y.H. (Yijia Hu); investigation, W.L.; data curation, Y.H. (Yijia Hu); writing—original draft preparation, W.L.; writing—review and editing, Y.H. (Yao Ha); supervision, Z.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** This work is sponsored jointly by the National Natural Science Foundation of China (41975090); the Natural Science Foundation of Hunan Province, China (2022JJ20043); the Scientific Research Program of National University of Defense Technology (18/19-QNCXJ); and the Jiangsu Collaborative Innovation Center for Climate Change in Nanjing University.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Comparison of Land Use Land Cover Classifiers Using Different Satellite Imagery and Machine Learning Techniques**

**Sana Basheer 1,2, Xiuquan Wang 1,2,\*, Aitazaz A. Farooque 1,2, Rana Ali Nawaz 1,2, Kai Liu 3, Toyin Adekanmbi 1,2 and Suqi Liu 1,4**


**Abstract:** Accurate land use land cover (LULC) classification is vital for the sustainable management of natural resources and for understanding how the landscape is changing due to climate. For accurate and efficient LULC classification, high-quality datasets and robust classification methods are required. With the increasing availability of satellite data, geospatial analysis tools, and classification methods, it is essential to systematically assess the performance of different combinations of satellite data and classification methods to help select the best approach for LULC classification. Therefore, this study aims to evaluate the LULC classification performance of two commonly used platforms (i.e., ArcGIS Pro and Google Earth Engine) with different satellite datasets (i.e., Landsat, Sentinel, and Planet) through a case study for the city of Charlottetown in Canada. Specifically, three classifiers in ArcGIS Pro, including support vector machine (SVM), maximum likelihood (ML), and random forest/random tree (RF/RT), are utilized to develop LULC maps over the period of 2017–2021. In Google Earth Engine, four classifiers, including SVM, RF/RT, minimum distance (MD), and classification and regression tree (CART), are used to develop LULC maps for the same period. To identify the most efficient and accurate classifier, the overall accuracy and kappa coefficient for each classifier are calculated throughout the study period for all combinations of satellite data, classification platforms, and methods. Change detection is then conducted using the best classifier to quantify the LULC changes over the study period. Results show that the SVM classifier in both ArcGIS Pro and Google Earth Engine presents the best performance compared to other classifiers. In particular, the SVM in ArcGIS Pro shows an overall accuracy of 89% with Landsat, 91% with Sentinel, and 94% with Planet. Similarly, in Google Earth Engine, the SVM shows an accuracy of 87% with Landsat 8 and 92% with Sentinel 2. Furthermore, change detection results show that 13.80% and 14.10% of forest areas have been turned into bare land and urban class, respectively, and 3.90% of the land has been converted into the urban area from 2017 to 2021, suggesting intensive urbanization. The results of this study provide a scientific basis for selecting the remote sensing classifier and satellite imagery to develop accurate LULC maps.

**Keywords:** remote sensing; LULC classification; ArcGIS Pro; Google Earth Engine; machine learning; change detection

**Citation:** Basheer, S.; Wang, X.; Farooque, A.A.; Nawaz, R.A.; Liu, K.; Adekanmbi, T.; Liu, S. Comparison of Land Use Land Cover Classifiers Using Different Satellite Imagery and Machine Learning Techniques. *Remote Sens.* **2022**, *14*, 4978. https://doi.org/10.3390/rs14194978

Academic Editor: Magaly Koch

Received: 13 September 2022 Accepted: 3 October 2022 Published: 6 October 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Accurate information on land use land cover (LULC) can facilitate various research activities related to floods, droughts, migration, and climate change at several scales. The constant and precise investigation of LULC is a vital aspect of any region's sustainability and development. LULC changes (e.g., deforestation and urbanization) are one of the prominent drivers of climate change around the globe [1]. Furthermore, climate change has significant impacts on water balance [2,3], geomorphology [4], water quality and groundwater management [5–8], resources management and their impacts on humans and their surroundings [9], and land monitoring [10–12], all of which require detailed LULC maps as an essential input [13,14]. LULC maps can assist in identifying which types of land are suitable for agricultural practices and watershed management or useful for urban planning. The most frequent method for this purpose is mapping the LULC and analyzing change over time [15–18].

The severity of extreme events is mainly modulated by LULC changes. The heat waves of 2003 and 2010 in Western Europe and Russia prompted researchers to investigate extreme events by examining the LULC of the affected areas [19–21]. Therefore, it is crucial to identify the best methodology for analyzing LULC change in order to meet the growing demands of the population and to address issues related to agriculture management, natural resource management, and energy generation. Generating more accurate LULC maps for dense regions requires vast quantities of data. As a result, enormous storage capacities, significant computing power, and the ability to employ various strategies are required [22]. Classification of LULC is the foundation of research on LULC change detection. Traditional visual interpretation and mathematical statistics no longer meet the accuracy requirements of LULC classification [23]. Training samples, classifiers, and supplementary datasets are the most influential components in supervised LULC classification [24]. Modern state-of-the-art tools (e.g., ArcGIS Pro and Google Earth Engine) enable us to develop more accurate LULC maps using different LULC classification algorithms.

Analysis of different LULC categories using machine learning approaches can be performed in a geographic information system platform (e.g., ArcGIS Pro) or in Google Earth Engine using remote sensing datasets. A geographic information system supports spatial and temporal analysis related to LULC classification [25], which helps reveal trends in LULC change. Google Earth Engine is a platform that combines remote sensing data (i.e., satellite imagery from different sources) with high-performance computing services, making satellite imagery processing quicker and simpler [26–29]. Google Earth Engine hosts freely accessible satellite imagery from various sources, including Landsat 8, Sentinel 2, MODIS, and many other datasets. The platform can be accessed through client libraries for JavaScript and Python [30–34]. Google Earth Engine uses a MapReduce architecture for parallel processing, which divides huge volumes of data into smaller sets and processes them across several machines. The results of the individual components are then compiled into the output datasets.

Recently, state-of-the-art machine learning algorithms (i.e., random forest/random tree (RF/RT), classification and regression tree (CART), support vector machine (SVM), maximum likelihood (ML), and minimum distance (MD)) have captured researchers' attention [35,36]. Nowadays, most available research focuses on comparing these LULC classifiers using different machine learning models. Machine learning has the potential to deal with large historical and present-day datasets using different algorithms for LULC analysis. In the last few decades, many researchers have focused on LULC classification using different remote sensing data with machine learning techniques for different types of study areas and on different platforms [24,30,31]. However, only a few studies have used different platforms for LULC classification and mapping, with different LULC classifiers or approaches, to compare LULC maps derived from various multispectral satellite images. At the same time, the demand for accurate LULC data from satellite imagery around the globe is increasing. Therefore, it is critical to compare different classification algorithms and their effectiveness in the ArcGIS Pro and Google Earth Engine platforms. The objective of this study is to evaluate the performance of different algorithms or classifiers for LULC categorization utilizing Landsat 8 surface reflectance Tier 1, Sentinel 2 Level-1C, and Planet imagery data with 30 m, 10 m, and 3–5 m resolutions, respectively, on two different platforms (ArcGIS Pro and Google Earth Engine). Moreover, the accuracy of the classifiers, i.e., RF/RT, CART, SVM, ML, and MD [37–39], was determined for both platforms. Furthermore, the changes in LULC for the city of Charlottetown from 2017 to 2021 were analyzed using the most accurate classifier on the selected platform. The results of this study will provide insights for future studies by identifying the most suitable classifier and platform for remotely sensed images.

#### **2. Data and Methods**

#### *2.1. Study Area*

This study focuses on Charlottetown, the capital of Prince Edward Island (PEI), a province of Canada located in the Maritimes and commonly known as the Island (Figure 1). PEI has been facing the adverse effects of climate change in the past few decades [40]. In the past few years, the population of Charlottetown has increased significantly. As a result, the city is facing deforestation and urbanization. The study area has different land cover types (e.g., forest vegetation, water bodies, and urban areas), and it is changing due to human interference. Therefore, it is important to analyze the LULC changes with maximum accuracy using the best machine learning approach. Charlottetown has a total area of 44.29 km² [41]. It is located on the province's southern coast and has a population of 80,347. Charlottetown is situated on its namesake harbor, which was created by the convergence of three rivers on the island's south coast. The harbor opens onto the Northumberland Strait. The downtown area of Charlottetown consists of the city's five hundred ancient lots and the waterfront area facing the Hillsborough River. Initially, the downtown area was bordered by four villages (Parkdale, Spring Park, Brighton, and Sherwood). The districts on all sides of downtown except the south have been developed in recent decades with many commercial developments and an increase in residential areas, while the city's outlying areas are still primarily farmland.

**Figure 1.** The location of the study area (city of Charlottetown).

#### *2.2. Satellite Data*

This study employs three types of datasets to perform LULC classification. These datasets include low (Landsat 8), medium (Sentinel 2), and high (Planet) resolution satellite images. All visible and infrared bands (excluding the thermal infrared band) were used for the LULC analysis [42].


#### *2.3. Methodological Framework for LULC Classification*

Currently, geospatial big data are attracting considerable attention globally. ArcGIS Pro and Google Earth Engine are among the most popular platforms for large data processing and LULC classification. Several approaches have been established and employed to measure LULC changes. However, there is currently no research that uses both platforms and includes different input datasets to examine LULC changes through time in Charlottetown, PEI. In this study, a framework was developed to compare and analyze the performance and viability of different classification methods in ArcGIS Pro and Google Earth Engine, using different datasets. Table 1 shows the major LULC class scheme for the study area. We use five major classes (i.e., water bodies, urban, forest, bare land, and vegetation) that represent the overall land cover of the city of Charlottetown. Figure 2 shows the methodological framework for LULC classification using different classifiers (see Section 2.4) in ArcGIS Pro and Google Earth Engine using Landsat 8, Sentinel 2, and Planet imagery from 2017 to 2021. Image processing in ArcGIS Pro is entirely independent of that in Google Earth Engine. Some data processing factors can affect the outputs, including the algorithms for image selection, cloud masking, and filling missing pixels. Overall accuracy and kappa coefficient were calculated to evaluate the performance of each classifier with different input datasets. Maps and graphs were used to display the results of the analysis.



**Figure 2.** Methodology of LULC classification using (**a**) ArcGIS Pro and (**b**) Google Earth Engine.

#### *2.4. Classification Methods*

To perform a pixel-based supervised classification, a set of training samples specific to each year was obtained. Every training sample pixel was assigned to a LULC class on the basis of supplementary data, such as Google Street View and orthophoto images from each year of the study [46]. We use SVM, ML, and RT classifiers in ArcGIS Pro and CART, SVM, RF, and MD classifiers in the Google Earth Engine platform. In this study, the following classifiers were used for LULC classification:


• Maximum likelihood classifier (ML). The maximum likelihood classifier is a supervised classification method that describes every band by a normal distribution and is based on Bayes' theorem [53].
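The idea behind the maximum likelihood classifier can be illustrated with a minimal Python sketch. It is not the ArcGIS Pro implementation: it assumes independent bands (one variance per band instead of a full covariance matrix) and equal prior probabilities for all classes, and the function names and two-band reflectance values are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_ml(samples, labels):
    """Estimate a per-class mean and variance for every band from training pixels."""
    by_class = defaultdict(list)
    for pixel, label in zip(samples, labels):
        by_class[label].append(pixel)

    model = {}
    for label, pixels in by_class.items():
        stats = []
        for band in range(len(pixels[0])):
            values = [p[band] for p in pixels]
            mean = sum(values) / len(values)
            # A small floor keeps the variance strictly positive.
            var = max(sum((v - mean) ** 2 for v in values) / len(values), 1e-9)
            stats.append((mean, var))
        model[label] = stats
    return model


def log_likelihood(pixel, stats):
    """Sum of per-band Gaussian log-densities (bands assumed independent)."""
    total = 0.0
    for value, (mean, var) in zip(pixel, stats):
        total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
    return total


def classify_ml(pixel, model):
    """Assign the class with the highest likelihood (equal priors assumed)."""
    return max(model, key=lambda label: log_likelihood(pixel, model[label]))


# Hypothetical (red, near-infrared) reflectance values for two classes.
train_pixels = [(0.05, 0.40), (0.06, 0.45), (0.04, 0.42),   # forest
                (0.20, 0.22), (0.22, 0.25), (0.19, 0.21)]   # urban
train_labels = ["forest"] * 3 + ["urban"] * 3

model = fit_gaussian_ml(train_pixels, train_labels)
print(classify_ml((0.05, 0.43), model))
print(classify_ml((0.21, 0.23), model))
```

With equal priors, picking the class with the highest likelihood is the same as picking the class with the highest posterior probability under Bayes' theorem.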

#### *2.5. Data Processing*

In Google Earth Engine, LULC classification maps were generated using Landsat 8 and Sentinel 2 imagery with the SVM, RF, MD, and CART classifiers. Pixels contaminated by cloud cover were eliminated from all images using the cloud mask method provided in Google Earth Engine [11]. Temporal aggregation approaches such as mean, first, and median were employed to fill the resulting gaps in cloud-affected images.
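As an illustration of how temporal aggregation fills cloud gaps, the following plain-Python sketch builds a per-pixel median composite from several acquisition dates while skipping masked observations. It does not use the Google Earth Engine API; the function name, the nested-list image layout, and the sample values are assumptions made for this example.

```python
from statistics import median


def median_composite(image_stack, cloud_masks, nodata=None):
    """Per-pixel median over time, ignoring observations flagged as cloudy.

    image_stack: list of 2-D lists (one per acquisition date), all the same shape.
    cloud_masks: list of 2-D lists of booleans, True where the pixel is cloudy.
    """
    rows, cols = len(image_stack[0]), len(image_stack[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            clear = [img[r][c] for img, mask in zip(image_stack, cloud_masks)
                     if not mask[r][c]]
            # A pixel that is cloudy on every date stays empty.
            out_row.append(median(clear) if clear else nodata)
        composite.append(out_row)
    return composite


# Three hypothetical dates of a 1 x 4 single-band image.
stack = [[[0.10, 0.90, 0.30, 0.80]],
         [[0.12, 0.20, 0.95, 0.85]],
         [[0.11, 0.22, 0.98, 0.90]]]
masks = [[[False, True, False, True]],
         [[False, False, True, True]],
         [[False, False, True, True]]]

composite = median_composite(stack, masks)
print(composite)
```

The last pixel is cloudy on every date, so it remains unfilled; in practice a longer compositing window or another aggregation would be needed there.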

Using SVM, RT, and ML classifiers, LULC maps were created in ArcGIS Pro. Atmospherically and geometrically corrected images from the three sources (Landsat 8, Sentinel 2, and Planet) with the least or zero cloud cover were utilized as the primary input. The data were obtained from the USGS and Planet Scope sites. Images from the Planet Scope site were mosaicked to cover the whole study area in ArcGIS Pro. Then, all images from all three data sources were clipped to the city of Charlottetown boundary using ArcGIS Pro for LULC classification. Image processing in ArcGIS Pro is entirely independent of that in Google Earth Engine.

A total of 354 samples were used for LULC classification, of which 284 were used as training samples and 145 were used as testing samples. Google Earth Pro was used to visually evaluate high-resolution orthophoto images from 2017 to 2021, and the training and validation samples were generated with it. They were then loaded into ArcGIS Pro and Google Earth Engine as shapefiles to train the classifiers. As a rule [54], every classification class should have a minimum of 50 training samples. The training and validation data used for LULC classification were the same on both platforms (ArcGIS Pro and Google Earth Engine). LULC was divided into five broad classes: forest, urban, bare land, water bodies, and vegetation.

For the CART classifier, the optimal cross-validation factor was estimated to be ten based on the research of Kohavi [44] and was used as an input. For the RT classifier, a number of trees within the range of 50–100 has been shown to give greater classification accuracy and performance [55]; in the present investigation, 100 trees produced good results. For the SVM classifier, cost, kernel type, and gamma are the crucial parameters. The linear kernel is suitable for large datasets [56], and the gamma value is not needed for linear kernels. The cost parameter (C) sets the penalty for misclassified training data: a greater C value means fewer training misclassifications. For SVM classification, the C-SVC method with a cost parameter of 10 and a linear kernel was used. After LULC classification, the result of each classification for the study period was estimated, and a post-classification change detection analysis from 2017 to 2021 was conducted.
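The role of the cost parameter can be made explicit with the standard textbook form of the two-class C-SVC problem with a linear kernel. This formulation is not taken from the study itself, and the symbols $w$, $b$, $\xi_i$, $x_i$, and $y_i$ are introduced here only for illustration:

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i\left(w^{\top} x_i + b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0$$

Here $x_i$ is the band vector of training pixel $i$, $y_i \in \{-1, +1\}$ is its class label, and $\xi_i$ measures how far the pixel falls on the wrong side of the margin. A larger $C$ (such as the value of 10 used in this study) makes margin violations more costly, so the fitted boundary misclassifies fewer training pixels at the price of a narrower margin. No gamma appears because the linear kernel is simply the inner product $w^{\top} x_i$.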

#### *2.6. Accuracy Assessment*

Methods, procedures, time, and space affect classification accuracy [57–59]. Several studies [42,60] found slight to significant variations in the classification accuracy of LULC using different LULC classifiers. The efficacy of the various classifiers was evaluated based on their accuracy. The most frequently used indicator of classifier accuracy and efficiency is overall accuracy, which indicates the percentage of correctly classified testing data. The accuracy evaluation in this investigation reveals a modest discrepancy between the outputs of the classifiers used. The accuracy of a LULC classification varies not just by classifier but also by location and time, which might be caused by atmospheric, surface, and illumination variations [61]. A dataset was generated using stratified random sampling [62]. Stratified sampling divides the dataset into strata according to the characteristics of an attribute; samples are then selected randomly from each stratum, so that the samples within a stratum share a certain commonality. The main purpose of the stratified sampling method is to obtain sample data with high efficiency. It is generally believed that stratified sampling makes it easy to extract representative samples [63]. These sampling datasets were separated into training and validation sets: seventy percent of the entire dataset was used for training, while thirty percent was used for validation [64]. Auxiliary datasets such as Google Street View were used as a reference to collect the sampling datasets. ArcGIS Pro and Google Earth Engine offer different methods to check the accuracy of LULC classifiers, and some of these accuracy evaluations were used to determine the accuracy of each classified map from 2017 to 2021.
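A stratified 70/30 split of this kind can be sketched in a few lines of Python. The function name, the fixed random seed, and the figure of 50 samples per class are assumptions chosen for illustration; they are not the sample counts used in the study.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.7, seed=42):
    """Split (sample_id, class_label) pairs into training and validation sets,
    drawing the same fraction from every class (stratum)."""
    strata = defaultdict(list)
    for sample_id, label in samples:
        strata[label].append((sample_id, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, validation = [], []
    for label in sorted(strata):
        members = strata[label][:]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        validation.extend(members[cut:])
    return train, validation


# Hypothetical sample set: 50 labelled points for each of the five LULC classes.
classes = ["forest", "urban", "bare land", "water bodies", "vegetation"]
samples = [(f"{name}-{i}", name) for name in classes for i in range(50)]

train, validation = stratified_split(samples)
print(len(train), len(validation))  # 175 75
```

Because the split is applied within each class, every class keeps the same 70/30 proportion, which a purely random split over all samples would not guarantee.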

$$\text{Overall Accuracy} = \frac{T_{C.P}}{T_{S.P}} \times 100 \tag{1}$$

where $T_{C.P}$ is the total number of correctly classified pixels and $T_{S.P}$ is the total number of sample pixels.

$$\text{Kappa coefficient} = \frac{O.A - C.A}{1 - C.A} \tag{2}$$

where *O.A* is the overall accuracy and *C.A* is the chance agreement, both expressed as proportions. Equations (1) and (2) show how the overall accuracy and the kappa coefficient are calculated. The validation data gathered for each year were compared to each map to generate an error matrix. In addition, the kappa coefficient was determined to evaluate the accuracy of each classifier for every year of the study period in ArcGIS Pro and Google Earth Engine [65]. All ArcGIS Pro classification results were tested in ArcGIS Pro, and all Google Earth Engine classification results were tested in Google Earth Engine, using the same measures (overall accuracy and kappa coefficient). A higher kappa coefficient indicates better agreement between the classified raster and the reference data [66]; e.g., a value of 0.8 to 1.0 indicates a highly reliable LULC classification.
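Equations (1) and (2) can be evaluated directly from an error matrix, as in the short Python sketch below. The chance agreement is computed with the usual Cohen's kappa definition (sum of products of row and column totals divided by the squared sample count), which the text does not spell out and is therefore an assumption here; the 3 x 3 matrix is hypothetical.

```python
def overall_accuracy(matrix):
    """Equation (1): correctly classified samples over all samples, in percent."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


def kappa_coefficient(matrix):
    """Equation (2), with overall accuracy and chance agreement as proportions."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement: expected proportion of matches if labels were assigned at random.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - chance) / (1.0 - chance)


# Hypothetical error matrix: rows are reference classes, columns are mapped classes.
error_matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]

print(overall_accuracy(error_matrix))             # 86.0
print(round(kappa_coefficient(error_matrix), 2))  # 0.79
```

In this example 129 of 150 samples lie on the diagonal, giving 86% overall accuracy, while the chance agreement of one third lowers the kappa coefficient to 0.79.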

#### *2.7. Change Detection*

Numerous techniques exist to identify LULC change from different imagery datasets [67], but this process is not always convenient or straightforward. Comparing datasets obtained from different satellites on multiple dates is a relatively straightforward yet effective approach to change detection in the LULC of any area. Some techniques based on this methodology [68,69] calculate descriptive statistics for the vegetation index difference between two time points, and a threshold on the spectral difference is used to distinguish between pixels with and without change. In this study, after the classification and accuracy assessment of all three datasets, we chose the dataset and the classifier with the highest overall accuracy and kappa coefficient to analyze the change in LULC using ArcGIS Pro. The area change in every class throughout the study period was then calculated.
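The threshold-based technique mentioned above can be sketched as follows. This is an illustration of the cited approach, not the post-classification comparison used in this study; the choice of NDVI as the vegetation index, the threshold of 0.2, and the reflectance values are all assumptions.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel."""
    denominator = nir + red
    return 0.0 if denominator == 0 else (nir - red) / denominator


def changed_pixels(pixels_t1, pixels_t2, threshold=0.2):
    """Flag pixels whose absolute NDVI difference between two dates exceeds the threshold."""
    flags = []
    for (red1, nir1), (red2, nir2) in zip(pixels_t1, pixels_t2):
        difference = ndvi(red2, nir2) - ndvi(red1, nir1)
        flags.append(abs(difference) > threshold)
    return flags


# Hypothetical (red, near-infrared) reflectance for three pixels on two dates.
date_1 = [(0.05, 0.45), (0.20, 0.25), (0.06, 0.40)]
date_2 = [(0.06, 0.44), (0.21, 0.24), (0.22, 0.26)]

print(changed_pixels(date_1, date_2))  # [False, False, True]
```

Only the third pixel, whose NDVI drops from about 0.74 to about 0.08, is flagged as changed; the appropriate threshold depends on the sensor and the scene and would need to be calibrated.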

#### **3. Results and Discussion**

#### *3.1. LULC Classification Maps*

#### 3.1.1. LULC Classification of Landsat 8 Imagery in ArcGIS Pro

Figure 3 shows that, in the LULC classification of Landsat 8 imagery, most of the forest area was misclassified as urban or bare land in 2020 and 2021 by the RT classifier. The vegetation class was misclassified as bare land and as water bodies in 2018 by RT. For the SVM classifier, forest and vegetation were misclassified as urban in 2020 to some extent. Vegetation was also misclassified as bare land or forest in 2018 and 2019. For 2020, SVM classified the image well except for bare land and some urban areas. For the Landsat 8 images from all five years, the SVM classifier performed well in comparison to the other two classifiers in terms of overall accuracy. For 2017 and 2019, the ML classifier misclassified vegetation as forest. This may be because vegetation and forest have nearly identical reflectance [65].

**Figure 3.** LULC classification maps of Landsat 8 images using SVM, RT, and ML classifiers for the years 2017 to 2021 in ArcGIS Pro.

#### 3.1.2. LULC Classification of Sentinel 2 Imagery in ArcGIS Pro

Figure 4 shows the LULC classification maps of Sentinel 2 imagery using SVM, RT, and ML classifiers for the years 2017 to 2021. The classification of Sentinel 2 imagery using the SVM classifier was more accurate than with the other two classifiers [11]. For SVM, forest and vegetation were misclassified as urban in 2017 and 2018. For 2021, SVM classified the input imagery well except for the forest and urban classes. The RT classifier misclassified urban and bare land as forest and vegetation in 2020 and 2021. Forest was misclassified as urban and, to a lesser extent, as water bodies in 2021 by ML. Vegetation was misclassified as bare land to some extent, or as forest, in 2018 and 2019 using the ML classifier. For 2017 and 2020, the ML classifier misclassified vegetation as bare land and forest, respectively.

**Figure 4.** LULC classification maps of Sentinel 2 images using SVM, RT, and ML classifiers for the years 2017 to 2021 in ArcGIS Pro.

#### 3.1.3. LULC Classification of Planet Imagery in ArcGIS Pro

Results show that most LULC classes in the study area were well classified from Planet imagery with the SVM classifier (Figure 5). For SVM, vegetation was slightly misclassified as bare land and urban in 2020 and 2021. For 2017, the RT classifier misclassified bare land as vegetation. Forest was misclassified as vegetation and, to some extent, as urban in 2020 and 2021, respectively. For the Planet images throughout the study period, the ML classifier misclassified forest and urban areas as bare land in some places.

**Figure 5.** LULC classification maps of Planet images using SVM, RT, and ML classifiers for the years 2017 to 2021 in ArcGIS Pro.

#### 3.1.4. LULC Classification of Landsat 8 Imagery in Google Earth Engine

Figure 6 shows the LULC classification maps using Landsat 8 imagery. Most of the forest area was misclassified as vegetation by the SVM and CART classifiers in 2020 and 2021. For the SVM classifier, bare land was misclassified as vegetation and urban for 2017 and 2021. For 2017, the RF classifier misclassified bare land as vegetation. In 2021, vegetation was misclassified as bare land in some areas. Urban was misclassified as forest in 2017, 2018, and 2019 by the MD classifier. Throughout the study period, the CART classifier misclassified forest as urban. Overall, the SVM classifier performs better than the other three classifiers using Landsat 8 imagery in Google Earth Engine.

#### 3.1.5. LULC Classification of Sentinel 2 Imagery in Google Earth Engine

Figure 7 shows the classification output of Sentinel 2 imagery, in which the SVM classifier performs well compared to the CART, MD, and RF classifiers. For SVM, forest was misclassified as vegetation and bare land for 2019 and 2020, respectively. For 2017, the RF classifier misclassified bare land and forest as urban. In 2020, vegetation was misclassified as water bodies in some areas. Urban was misclassified as forest for all five years by the MD classifier [51]. Throughout the study period, the CART classifier misclassified most classes as urban [70]. Overall, the SVM classifier performs better than the other three classifiers, as urban and vegetation areas are well classified in Google Earth Engine using Sentinel 2 imagery.

**Figure 6.** LULC classification maps of Landsat 8 images using SVM, RF, MD, and CART classifiers for the years 2017 to 2021 in Google Earth Engine.

**Figure 7.** LULC classification maps of Sentinel 2 images using SVM, RF, MD, and CART classifiers for the years 2017 to 2021 in Google Earth Engine.

#### *3.2. Accuracy Assessment*

In this research, to check the accuracy of each classifier, we used a validation dataset that was different from the training dataset. Using stratified random sampling, we distributed these points over the study area [64] to ensure that all LULC classes were accurately and uniformly represented. After LULC classification using all classifiers, overall accuracy and kappa coefficient were calculated to check the accuracy of the classified LULC maps in ArcGIS Pro and Google Earth Engine. Tables 2 and 3 show the overall accuracy of the different classifiers in ArcGIS Pro and Google Earth Engine using Landsat 8, Sentinel 2, and Planet imagery. Table 2 shows that the overall accuracy of the SVM classifier with Planet imagery was higher than that of the other classifiers. The SVM classifier achieves its highest overall accuracy of 96% in 2017 and 2019 using Planet imagery in ArcGIS Pro. For the Sentinel 2 dataset, the highest overall accuracy was 87%, in 2018. The lowest overall accuracy for Sentinel 2 imagery (74%) was obtained in 2019 using the RT classifier. Landsat 8 shows the lowest overall accuracy for all three classifiers in ArcGIS Pro throughout the study period compared to the other two datasets. Using Landsat 8 imagery, the SVM classifier shows a higher overall accuracy throughout the study period, with the highest value of 85% in 2021.


**Table 2.** Kappa coefficient and overall accuracy of Landsat 8, Sentinel 2, and Planet imagery for SVM, RT, and ML classifiers using ArcGIS Pro.

**Table 3.** Kappa coefficient and overall accuracy of Landsat 8, Sentinel 2 imagery for SVM, RF, MD, and CART classifiers.


Using Google Earth Engine, the SVM classifier achieved the highest overall accuracy of 92% and 90% with Sentinel 2 and Landsat 8 imagery, respectively, in 2019 compared with the other three classifiers [71]. In 2021, the MD classifier showed the lowest overall accuracy, 87%, with Sentinel 2 imagery. Throughout the study period, the SVM classifier performs well compared to the other classifiers with Sentinel 2 data. Using Landsat 8 data, the SVM classifier shows the highest overall accuracy of 90% in 2019 and 2021. The lowest overall accuracy of 77% was observed for the MD classifier in 2018. Throughout the study period, the MD classifier showed the lowest overall accuracy with both datasets in Google Earth Engine.

Figure 8a shows the kappa coefficient values for all tested classifiers using three different types of imagery (Landsat 8, Sentinel 2, and Planet) in ArcGIS Pro. The kappa coefficient values in Figure 8a show that Planet imagery gives the highest accuracy with the SVM classifier compared to the other two datasets throughout the study period. Figure 8b shows the kappa coefficient values for all tested classifiers using Landsat 8 and Sentinel 2 imagery in Google Earth Engine. The Sentinel 2 dataset shows a higher kappa coefficient than the Landsat 8 dataset from 2017 to 2021 with the SVM classifier. Results show that SVM performs well compared to CART and the other classifiers on both platforms. Similar results are reported in the literature [47,72]. The lowest kappa coefficient was observed for Landsat 8 data using the RF classifier.

**Figure 8.** Kappa coefficient (**a**) using SVM, ML, and RT classifiers in ArcGIS Pro, (**b**) using SVM, MD, RF, and CART classifiers in Google Earth Engine.

The average overall accuracy for the SVM, ML, and RT classifiers was 94%, 91%, and 89%, respectively, using Planet imagery. Using Sentinel 2 imagery, the average overall accuracy for the SVM, ML, and RT classifiers was 85%, 82%, and 82.2%, respectively, and using Landsat imagery it was 83%, 81%, and 77%, respectively, in ArcGIS Pro. In ArcGIS Pro, the average kappa coefficient from 2017 to 2021 was 0.92 for the SVM classifier using Planet imagery (Table 4), which was higher than for the other classifiers. The lowest average kappa coefficient, 0.70, was observed for the RT classifier using Landsat 8 data. In Google Earth Engine, the average value of overall accuracy for the SVM classifier was 0.83 with Sentinel 2 (Table 5), which was higher than for the other classifiers with the Sentinel 2 dataset. Using Landsat 8 data, the highest average overall accuracy and kappa coefficient, 83% and 0.79, were obtained with the SVM classifier. The MD and RF classifiers show similar values of average overall accuracy and kappa coefficient in Google Earth Engine.


**Table 4.** Average overall accuracy and average kappa coefficient of Landsat 8, Sentinel 2, and Planet imagery for SVM, RT, and ML classifiers using ArcGIS Pro.

**Table 5.** Average overall accuracy and average kappa coefficient of Landsat 8, Sentinel 2 imagery for SVM, RF, MD, and CART classifiers using Google Earth Engine.


#### *3.3. Change Detection*

There are a variety of approaches for detecting LULC changes using different satellite image data [68], but this process is not always simple. Comparing remotely sensed data obtained on multiple dates is a simple yet efficient method for change detection [73]. Figure 9 shows the significant changes in all LULC classes in the city of Charlottetown from 2017 to 2021. A significant change was observed in the forest and vegetation classes, which were converted into other LULC classes during the study period. Bare land areas were also converted to urban areas due to rapid urbanization in the study area.

**Figure 9.** Change in LULC over the study area from 2017 to 2021.

The LULC changes can also be determined by using a transition matrix, which explains the change in every LULC class over a specific period [74]. This matrix describes the changed and unchanged area of each LULC class [75]. The change in every class from 2017 to 2021 can be observed in the transition matrix in Table 6, in percentage and km². The bare land class changed to the urban class with an increase of 38.81%, showing an urban area expansion from bare land, forest, and vegetation areas, as reported in the literature. At the same time, another significant change was observed in the vegetation class, which had a decrease of 32.24% in the bare land class and 1.63% in the urban class. Overall, 3.9% of the total study area was converted to urban area from 2017 to 2021, which shows rapid urbanization in the study area. In addition, 21.3% of forest area changed into vegetation. Forest area was also converted into bare land (13.80%), due to growth in the lumber industry, and into urban (14.10%), whereas the urban and water bodies classes show only minor changes.
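A transition matrix of this kind is a cross-tabulation of two classified maps. The minimal Python sketch below shows the idea on flattened label lists; the function name and the ten-pixel maps are hypothetical and do not reproduce the values in Table 6.

```python
from collections import Counter


def transition_matrix(map_start, map_end, classes):
    """Cross-tabulate two classified maps of equal size.

    Rows are classes at the start date, columns are classes at the end date;
    each cell holds the percentage of the starting class that ended up there.
    """
    counts = Counter(zip(map_start, map_end))
    matrix = {}
    for source in classes:
        source_total = sum(counts[(source, target)] for target in classes)
        matrix[source] = {
            target: (100.0 * counts[(source, target)] / source_total
                     if source_total else 0.0)
            for target in classes
        }
    return matrix


# Hypothetical flattened maps of ten pixels at two dates.
classes = ["forest", "bare land", "urban"]
map_2017 = ["forest"] * 4 + ["bare land"] * 4 + ["urban"] * 2
map_2021 = ["forest", "forest", "bare land", "urban",
            "bare land", "bare land", "urban", "urban",
            "urban", "urban"]

matrix = transition_matrix(map_2017, map_2021, classes)
for source in classes:
    print(source, matrix[source])
```

Diagonal cells give the share of each class that stayed unchanged, and off-diagonal cells give the share converted to another class. Multiplying the underlying pixel counts by the pixel area would give the same matrix in km².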

**Table 6.** Transition matrix for area change in different LULC classes from 2017 to 2021 in km².


#### **4. Conclusions**

The climate around the globe has been modified by human activities through LULC changes. LULC changes (e.g., deforestation and urbanization) have played an important role in climate change since the pre-industrial era [76]. Many studies have been conducted to measure LULC changes. However, there is no study, to the best of the authors' knowledge, that compares the performance of the different approaches available on different platforms. In this study, the performance of different classifiers was observed on two platforms (ArcGIS Pro and Google Earth Engine) using different datasets, i.e., Landsat 8, Sentinel 2, and Planet imagery. This study aimed to identify the best-performing classifier with different input data. To check the accuracy of the classifiers, an accuracy assessment was performed for every classifier using the error matrix and kappa coefficient for all three types of satellite imagery from 2017 to 2021. The SVM showed higher accuracy than the other classifiers on both platforms. The SVM classifier achieves the highest overall accuracy using Planet imagery in ArcGIS Pro and using Sentinel 2 imagery in Google Earth Engine. Owing to its high resolution, Planet data perform well, with an average overall accuracy of 94% compared to 85% for Sentinel 2 and 83% for Landsat 8 in ArcGIS Pro. In Google Earth Engine, the SVM classifier performs well with Sentinel 2 imagery, with an average overall accuracy of 92% compared to 87% for Landsat 8 imagery. For this type of study area, which includes different LULC types (e.g., forest, vegetation, urban, and water bodies), the SVM classifier shows promising results [11].

Change detection analysis shows deforestation and rapid urbanization, as 13.80% of forest area was converted to bare land and 38.81% of bare land was converted to urban area throughout the study period. In addition, some classes may be categorized using expert knowledge and supplementary data [46]. The choice of the most accurate and suitable classifier can also be affected by the region of interest and by the number and quality of training samples. Similarly, identical reflectance of different classes in satellite imagery was another limitation for LULC classification. Some pixels in the LULC maps were assigned to classes different from those in the training data because they correspond to different spectral responses. These LULC classifications may be modified based on the study area, data, etc. Furthermore, many studies suggest that different classifiers show different results in different climatic and geographic situations. Therefore, there is a clear need to further explore LULC classification accuracy by adding more detailed LULC classes and more training samples with high-resolution input datasets.

**Author Contributions:** Conceptualization, X.W.; Data curation, S.B.; Formal analysis, R.A.N.; Investigation, A.A.F. and K.L.; Project administration, X.W.; Supervision, X.W.; Validation, S.B. and T.A.; Visualization, A.A.F., R.A.N. and S.L.; Writing—original draft, S.B.; Writing—review and editing, R.A.N., K.L. and T.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Natural Science and Engineering Research Council of Canada, the New Frontiers in Research Fund, Atlantic Canada Opportunities Agency, and Agriculture and Agri-Food Canada.

**Data Availability Statement:** Publicly available data sets were used in this study. These can be found at: https://earthexplorer.usgs.gov/ (Landsat 8 and Sentinel-2 datasets) and https://www.planet.com/explorer/ (Planet datasets).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Climate Sensitivity of the Arid Scrublands on the Tibetan Plateau Mediated by Plant Nutrient Traits and Soil Nutrient Availability**

**Ben Chen 1,2,†, Hui Chen 1,†, Meng Li 3, Sebastian Fiedler 4, Mihai Ciprian Mărgărint 5, Arkadiusz Nowak 6,7, Karsten Wesche 8,9,10, Britta Tietjen 11,12 and Jianshuang Wu 2,5,\***


#### **Highlights:**

#### **What are the main findings?**


#### **What is the implication of the main finding?**


**Abstract:** Climate models predict the further intensification of global warming in the future. Drylands, as one of the most fragile ecosystems, are vulnerable to changes in temperature, precipitation, and drought extremes. However, it is still unclear how plant traits interact with soil properties to regulate drylands' responses to seasonal and interannual climate change. The vegetation sensitivity index (VSI) of desert scrubs in the Qaidam Basin (NE Tibetan Plateau) was assessed by summarizing the relative contributions of temperature (SGST), precipitation (SGSP), and drought (temperature vegetation dryness index, STVDI) to the dynamics of the normalized difference vegetation index (NDVI) during plant growing months yearly from 2000 to 2015. Nutrient contents, including carbon, nitrogen, phosphorus, and potassium in topsoils and leaves of plants, were measured for seven types of desert scrub communities at 22 sites in the summer of 2016. Multiple linear and structural equation models were used to reveal how leaf and soil nutrient regimes affect desert scrubs' sensitivity to climate variability. The results showed that total soil nitrogen (STN) and leaf carbon content (LC), respectively, explained 25.9% and 17.0% of the VSI variance across different scrub communities. Structural equation modeling (SEM) revealed that STN and total soil potassium (STK) mediated desert scrubs' VSI indirectly via SGST (with standardized path strength of −0.35 and +0.32, respectively) while LC indirectly via SGST and SGSP (with standardized path strength of −0.31 and −0.19, respectively). Neither soil nor leaf nutrient contents alone could explain the VSI variance across different sites, except for the indirect influences of STN and STK via STVDI (−0.18 and 0.16, respectively). Overall, this study disentangled the relative importance of plant nutrient traits and soil nutrient availability in mediating the climatic sensitivity of desert scrubs in the Tibetan Plateau. Integrating soil nutrient availability with plant functional traits is recommended to better understand the mechanisms behind dryland dynamics under global climate change.

**Keywords:** climate change; dryland ecosystem; leaf nutrient traits; Qaidam Basin; soil nutrient availability; vegetation sensitivity index

**Citation:** Chen, B.; Chen, H.; Li, M.; Fiedler, S.; Mărgărint, M.C.; Nowak, A.; Wesche, K.; Tietjen, B.; Wu, J. Climate Sensitivity of the Arid Scrublands on the Tibetan Plateau Mediated by Plant Nutrient Traits and Soil Nutrient Availability. *Remote Sens.* **2022**, *14*, 4601. https://doi.org/10.3390/rs14184601

Academic Editor: Xander Wang

Received: 25 July 2022 Accepted: 6 September 2022 Published: 15 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Drylands cover approximately 41% of the Earth's land surface [1] and support more than 38% of the global population [2], although they are sparsely vegetated and with limited productivity. Recently, drylands have been degraded due to ongoing climate change and intensifying human disturbance [3,4]. For example, increased droughts mainly caused by warming are predicted to accelerate the degradation of the Mediterranean drylands under human overexploitation [5]. More than 75% of drylands in developing countries are also likely to expand due to further global warming [6]. Consequently, poverty alleviation will become more challenging in arid undeveloped countries [7]. Therefore, a better understanding of how drylands respond to changes in climate and other potential regulators is urgently needed.

Climate change, including warming, shifting precipitation regimes, and intensifying droughts, drives dryland dynamics across different spatial scales. Warming can limit plant growth and survival by increasing evapotranspiration and intensifying water deficit in soils [8,9]. Experimental studies have reported that increased precipitation can improve nutrient availability and facilitate plant growth, development, and reproduction in drylands by enhancing moisture content in topsoils, and vice versa [10,11]. Meanwhile, remote-sensing-related technologies help monitor large-scale vegetation dynamics [12]. However, current studies on ecosystem responses to climate change are mainly based on average climate states and usually ignore the effects of climate variability and extremes [13].

Recently, Seddon et al. [14] proposed a framework to assess the vegetation sensitivity index (VSI) to short-term variability in temperature, precipitation, and radiation (cloud mask) from a global perspective. This algorithm has been increasingly applied to reveal the relative importance of different climatic variables in driving the dynamics of vegetation coverage, productivity, and phenology within and across various ecosystems [15–18]. For example, Li et al. [15] assessed the variability of VSI along an altitude gradient in alpine grasslands on the Tibetan Plateau and found that vegetation in higher-altitude regions is more sensitive to climate variability than that at lower altitudes. In another study, Yuan et al. [18] found that vegetation in Central Asia was more sensitive to climate variability in spring than in the other three seasons. These studies have mapped the differential spatiotemporal patterns of alpine and arid vegetation in response to climate variability and greatly enhanced our understanding of the vegetation–climate relationship. However, they failed to provide mechanistic explanations for these patterns from the perspective of plant and soil nutrient properties. Many in situ manipulative experiments have reported that the vegetation response to climate change can be largely influenced by plant functional traits and soil nutrient availability. For example, in a climate simulation study of alpine grasslands, Henn et al. [19] found that alpine plants can adapt to climate change by shaping leaf traits. In a recent meta-analysis, soil nitrogen availability was found to be a determinant of terrestrial ecosystem productivity under changing climates [20]. However, the findings from local in situ experiments are difficult to apply directly to large-scale ecosystem management. Therefore, it is necessary to combine remote sensing technologies and field observations to gain insights into the mechanisms of how vegetation adapts to changing climate via plant and soil nutritional properties.

Plants can respond specifically and actively to environmental changes rather than only passively withstand external stresses [21]. According to the community functional ecology theories [22–24], plant seed germination, seedling growth, and individual development and survival along environmental gradients are mainly regulated by functional traits at species and community levels [25–30]. Plant individuals can modify their water requirements by reducing leaf size, stomatal conductance, and photosynthetic rates in response to droughts [3,31,32]. At the community level, plants can alter the C:N:P stoichiometry among organs above- and belowground to offset physical inhibitions caused by warming or cooling [33,34], reduced moisture [35–37], and changed nutrient availability [38]. For example, Jiang et al. [39] found that warming could substantially enhance N and P mineralization and consequently improve nutrient provision to plants in the Arctic tundra. Recently, Delgado-Baquerizo et al. [40] and Jiao et al. [41] also found that increased aridity, mainly due to warming, could reduce the concentrations of C and N but increase that of P in drylands. Fiedler et al. [42] also found that climate change could indirectly affect the trade-offs and synergies among different ecosystem functions by affecting soil nutrients and plant traits. These studies indicated that ecosystem responses to climate change are complicated and depend on abiotic and biotic factors [43]. Network analyses that include plant functional traits and physical environments together may improve our prediction of ecosystem functional changes across space and over time [44–46]. However, our understanding of how plant functional traits interact with soil nutrients in high-elevation drylands in response to climate variability is still limited.

In this study, we aimed to improve our understanding of how plant nutrient traits interact with low soil nutrient availability to regulate the sensitivity of desert scrubs in the Qaidam Basin, northeastern Qinghai-Tibetan Plateau, in response to climate variability. Specifically, we have (i) explored the patterns and trends of climate change and desert scrub coverage in the last decade, (ii) examined the differences in climatic sensitivity and its components for different desert scrub communities, and finally, (iii) investigated the networking paths of interactions between plant nutrient traits and soil nutrient availability to the components of desert scrubs' sensitivity to climate variability across the entire Qaidam Basin.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The Qaidam Basin, with elevations ranging from 2640 m to 6000 m, is surrounded by the Kunlun, Qilian, and Altun Mountains and located in the northeastern Qinghai-Tibetan Plateau (90°16′ E–99°16′ E, 35°00′ N–39°20′ N; Figure 1). The area is characterized by an alpine arid continental climate, where the mean annual temperature (MAT) is generally below 5 °C and mean annual precipitation (MAP) is less than 200 mm [47]. The air relative humidity is between 30% and 40% throughout the year, while the average yearly sunshine duration can reach 3000 h in the Qaidam Basin. The annual average evaporation in the basin is between 1900 mm and 2600 mm [48]. Soil nutrients and moisture availability are low in desert scrublands in this area [49]. The vegetation has co-evolved with harsh physical conditions and is dominated by scrubs and semi-scrubs, such as *Kalidium foliatum*, *Salsola abrotanoides*, and *Ephedra sinica* [50]. Desert plants sprout in May and defoliate in early October; therefore, this period was usually defined as the plant growing season in recent studies [51].

**Figure 1.** Vegetation (**A**), elevation and major meteorological stations (**B**), climate (**C**), and site locations (black dots) in the Qaidam Basin, Qinghai-Tibetan Plateau. In Panel C, GST and GSP refer to the average temperature and sum precipitation during the plant growing season, respectively, from May to October during 2000–2015. TVDI means the temperature vegetation dryness index.

#### *2.2. Remote Sensing Data Collection and Processing*

Choosing a suitable vegetation index is key to accurately examining and assessing vegetation dynamics under climate change. Recently, Zhang et al. [50] evaluated the correlation between different vegetation indices and vegetation cover observed with UAV in the Qaidam Basin. They found that the Normalized Difference Vegetation Index (NDVI) performed much better in assessing vegetation cover than other indices, such as the ratio vegetation index (RVI), difference vegetation index (DVI), modified vegetation index (MVI), modified soil-adjusted vegetation index (MSAVI), and normalized difference greenness index (NDGI). Therefore, in this study, the NDVI from the Moderate Resolution Imaging Spectroradiometer (MODIS) MOD13Q1 product was used as a proxy for the vegetation productivity of desert scrubs. We downloaded NDVI data with a temporal resolution of 16 days and a spatial resolution of 250 m for the Qaidam Basin for the period from 2000 to 2015 (https://ladsweb.modaps.eosdis.nasa.gov/, accessed on 1 October 2021). Because NDVI is sensitive to soil background information, we used the Savitzky–Golay filter [52] to smooth the NDVI time series and eliminate the contamination caused by clouds, snow, and ice cover, following Li et al. [15]. Pixels whose peak NDVI value was less than 0.05 were classified as extremely sparse vegetation or barren land. Those pixels were manually excluded from further analyses, following the recent study of Yuan et al. [18] in Central Asia. Finally, the peak monthly NDVI generated by maximum value composites [53] was used to describe the sensitivity of desert scrub vegetation in the Qaidam Basin, Qinghai-Tibetan Plateau, and its components, to the variability of different climate variables.
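The compositing and masking steps described above can be illustrated with a minimal Python sketch. It is not the authors' processing chain: the function names, the tuple layout, and the sample NDVI values are hypothetical, and the Savitzky–Golay smoothing step is omitted. Only the maximum value composite and the 0.05 peak-NDVI threshold come from the text.

```python
from collections import defaultdict


def monthly_mvc(observations):
    """Maximum value composite: keep the highest NDVI seen in each (year, month)."""
    composite = defaultdict(lambda: float("-inf"))
    for year, month, ndvi in observations:
        key = (year, month)
        composite[key] = max(composite[key], ndvi)
    return dict(composite)


def is_sparse(composite, threshold=0.05):
    """Flag a pixel whose peak NDVI stays below the threshold (0.05 in the text)."""
    return max(composite.values()) < threshold


# Hypothetical 16-day NDVI samples for one pixel: (year, month, NDVI).
pixel = [(2000, 5, 0.06), (2000, 5, 0.08), (2000, 6, 0.11), (2000, 6, 0.09)]
mvc = monthly_mvc(pixel)
keep_pixel = not is_sparse(mvc)
```

In this sketch a pixel is dropped only when its peak across all composites falls below the threshold, which is one plausible reading of "peak NDVI value".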

#### *2.3. Weather Data Collection and Processing*

The monthly data of temperature and precipitation of the main meteorological stations within the Qaidam Basin (Figure 1B) yearly from May to September were downloaded and mapped for the period of 2000 to 2015. Seasonal and inter-annual fluctuations in temperature and precipitation, as well as their extreme values, are clearly shown in the supplementary Figure S1. It makes sense that the weather data from 2000–2015 are sufficient to reflect the climate fluctuations in the region. Therefore, we downloaded monthly temperature and precipitation of 200 meteorological stations located within and around the Qinghai-Tibetan Plateau from the China Meteorological Data Service Centre (http://data.cma.cn/en, accessed on 5 October 2021) for the plant growing seasons (yearly from May to October) between 2000 and 2015. First, the average temperature and sum precipitation during the plant growing seasons (termed GST and GSP, respectively) were calculated for each station. In addition to monthly weather records, we interpolated GSP and GST into rasters with ANUSPLIN (version 4.3; Australian National University, Australia) [54] at a spatial resolution of 250 m to match the remote sensing data (see details below). A digital elevation model (DEM) with a 30 m resolution was used as the covariate, following Li et al. [55], who also confirmed that the interpolated climate rasters matched the field observations well across the northern Tibetan Plateau.

In this study, we used the temperature vegetation dryness index (TVDI) to explore the vegetation–drought relationship, as recommended by Sandholt et al. [56], because it combines vegetation greenness (MODIS MOD13A2) and land surface temperature (MOD11A2). TVDI ranges from 0 to 1, with higher values indicating drier soils. We downloaded monthly TVDI data from the National Earth System Science Data Center (http://www.geodata.cn, accessed on 12 October 2021) and resampled it to match the spatial-temporal resolutions of climate and NDVI data. Finally, we used the Mann–Kendall (MK) test [57,58] to examine the trends of GST, GSP, NDVI, and TVDI at the pixel scale between 2000 and 2015 (see more details in Figures S2–S5). The MK test outputs Z-values and slopes, with positive and negative slope values representing increasing and decreasing trends, respectively. A trend with |Z| > 1.96 is significant at α = 0.05.
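As an illustration of the trend test described above, the following Python sketch computes the Mann–Kendall Z statistic for a single pixel's yearly series. Several points are assumptions rather than details given in the text: the slope is taken to be Sen's slope (the median of pairwise slopes), no correction for tied values is applied, and the 16-value NDVI series is hypothetical.

```python
import math


def mann_kendall(series):
    """Return (Z, Sen's slope) for an evenly spaced series, without tie correction."""
    n = len(series)
    s = 0
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)   # sign of each pairwise difference
            slopes.append(diff / (j - i))  # pairwise slope per time step
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    slopes.sort()
    mid = len(slopes) // 2
    if len(slopes) % 2:
        sen = slopes[mid]
    else:
        sen = 0.5 * (slopes[mid - 1] + slopes[mid])
    return z, sen


# Hypothetical growing-season NDVI for one pixel, 2000-2015 (16 values).
ndvi = [0.10 + 0.002 * k for k in range(16)]
z, slope = mann_kendall(ndvi)
significant = abs(z) > 1.96  # threshold for alpha = 0.05, as in the text
```

Applying the same function to every pixel of the GST, GSP, NDVI, and TVDI stacks would give trend maps of the kind referenced in Figures S2–S5.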

#### *2.4. Climatic Sensitivity Index of Desert-Scrubs*

In this study, we used NDVI as a proxy for desert-scrub coverage to explore its sensitivity to changes in GST, GSP, and TVDI. First, we transformed the time-series NDVI and climate variables into z-score anomalies using each variable's mean and standard deviation (Equation (1)).

$$Z = \frac{X - \overline{X}}{\sigma} \tag{1}$$

where *Z* is the standard score, *X* is the raw data, $\overline{X}$ is the mean value of the data, and *σ* is the standard deviation of the data. Then, we used a principal component regression (PCR) to determine the relative importance of each climate variable in driving the monthly NDVI dynamic at the pixel scale. Principal components with significant relationships to climatic variables (*p* < 0.1) were retained for the subsequent VSI calculations.
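Equation (1) can be sketched in a few lines of Python. The sample standard deviation (n − 1 denominator) is used here on the assumption that the standardization matches R's `scale` function, which the statistical analysis section mentions. The function name and the temperature values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a series as in Equation (1): Z = (X - mean) / sd."""
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1), assumed to match R's scale()
    return [(v - m) / sd for v in values]


# Hypothetical growing-season temperatures (degrees C) for one pixel.
gst = [9.0, 10.0, 11.0]
anomalies = z_scores(gst)
```

Because each variable is rescaled to zero mean and unit variance, NDVI, GST, GSP, and TVDI become directly comparable in the principal component regression.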

The climate weights (*Wi*) were obtained by multiplying the loading scores of each variable by the corresponding PCR coefficients. Here, the weight of each climate variable was also rescaled to be between 0 and 1. Then we detrended each time-series variable and extracted its weight variance. The residuals of the mean-variance relationship between NDVI and climate variables were fitted using a linear-quadratic model, following Seddon et al. [14]. The residuals were normalized between 0 and 100. We took the log10-transformed ratios between NDVI variability and climate variables as the sensitivity metrics (*Si*). The NDVI lagged by one month was used as the fourth variable to explore the time-lag effect of vegetation response to climate variability, following Seddon et al. [14].

$$\text{VSI} = \sum_{i=1}^{3} W_i \times S_i \tag{2}$$

where VSI is the sum of the product of the weights of each climate variable (*Wi*) and its sensitivity metrics (*Si*). We rescaled the VSI between 0 and 100 (unitless), and the larger the value, the more sensitive to climate variability. We constructed a 500 m buffer centered on each sample point. Thus, it makes sense that the specific VSI values of the pixels in the buffer area can serve as repeats for multiple comparisons among different scrub communities. Moreover, we calculated the VSI average of the pixels in the buffer area for each site and used it in multiple linear regressions and structural equation modeling across sites within the study area.
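A small worked example of Equation (2) may help. The Python sketch below sums the weight-sensitivity products for one pixel and then rescales a set of pixel values to 0–100. The weights, sensitivities, and pixel values are hypothetical, and min-max rescaling is an assumption, since the text does not state how the rescaling was done.

```python
def vsi(weights, sensitivities):
    """Equation (2): sum of climate weight (W_i) times sensitivity metric (S_i)."""
    if len(weights) != len(sensitivities):
        raise ValueError("weights and sensitivities must have the same length")
    return sum(w * s for w, s in zip(weights, sensitivities))


def rescale_0_100(values):
    """Min-max rescaling to 0-100 (assumed; the text only says 'rescaled')."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]


# Hypothetical weights (0-1) and sensitivities (0-100) for GST, GSP, and TVDI.
pixel_vsi = vsi([0.5, 0.3, 0.2], [40.0, 20.0, 10.0])
scaled = rescale_0_100([10.0, 20.0, 30.0])
```

Averaging such pixel values within the 500 m buffer around a sample point would then give the site-level VSI used in the regressions.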

#### *2.5. Plant Nutrient Traits and Soil Nutrient Availability*

According to the 1:1,000,000 China Vegetation Atlas, which is available at the Resource and Environmental Science Data Center, Chinese Academy of Sciences (http://www.resdc. cn/data.aspx?DATAID=122, accessed on 1 May 2016), our field observations in 2016 at 22 sites covered seven main types of desert scrubs within the Qaidam Basin (Table 1). Plant nutrient traits and soil nutrient properties were measured to examine their potential regulating effects on the desert scrubs' sensitivity to climate variability.

We laid out five 5 m × 5 m quadrats evenly along the diagonal of a relatively homogeneous 250 m × 250 m area at each site, where the plant community was intact and undisturbed by human activities and desert scrub plants accounted for more than 90% of the total vegetation cover. Healthy leaves were collected from at least five mature individuals of each scrub/semi-scrub species in each quadrat. Plant leaves were stored in a separate bag for each species during the field campaign and then dried at 70 °C for 48 h in the lab to a constant weight. Before chemical analyses, all dried leaves were ground into a fine powder, passed through a 0.15 mm sieve, and kept in brown glass bottles. Leaf total P (LTP, g kg−1) and K (LTK, g kg−1) were determined by the Mo-Sb colorimetric method and a flame spectrophotometer, respectively [59].

We randomly sampled three profiles of the 0–50 cm depth at each site, about half a kilogram for each sample. Soil samples were air-dried and sieved through a 100-mesh sieve to remove root fractions and gravel. Then, soils were ground into fine powders and stored in brown glass bottles before chemical analyses [60]. Total soil and leaf nitrogen (STN and LTN, g kg−1) were determined by the Kjeldahl method. Total soil and leaf phosphorus (STP and LTP, g kg−1) were measured with the Mo-Sb anti-spectrophotometry method, while total soil and leaf potassium (STK and LTK, g kg−1) were measured with the flame photometry method. Soil organic carbon (SOC, g kg−1) and leaf carbon (LC, g kg−1) were determined by the potassium dichromate oxidation–ferrous sulfate titrimetric method.


**Table 1.** Locations, the most dominant species and their relative coverage (DSC), climate conditions, soil properties, and plant nutrient traits at each site sampled in 2016 across the Qaidam Basin. GST and GSP refer to the average temperature and precipitation during the plant growing season. Soil properties included the contents of organic carbon (SOC), total nitrogen (STN), total phosphorus (STP), and total potassium (STK) of topsoils at the 0–50 cm depth. Plant nutrient traits were the community weighted means (CWMs) of leaf carbon (LC), total nitrogen (LTN), total phosphorus (LTP), and total potassium (LTK) in the leaves of all plants at each sampling site.


#### *2.6. Plant Functional Trait Diversity*

Our plant nutrient traits referred to the contents of N, P, K, and C in leaves (LTN, LTP, LTK, and LC) for all species sampled (Table 1). We calculated community-weighted means (CWMs) to describe the regulating effects of plant nutrient trait diversity on desert scrubs' sensitivity to climate variability (see Equations (2) and (3)). CWMs are essential for understanding community reorganization in response to environmental filtering [61] and are widely recommended and used in plant functional ecology research [62].

$$CWM_j = \sum_{i=1}^{n} P_{ij} T_{ij} \tag{3}$$

where *Pij* is the relative cover of species *i* in sample site *j*; *Tij* is the mean of trait values of species *i* in sample site *j*; *CWMj* is the community-weighted mean of traits of each species in sample site *j*.
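Equation (3) can be illustrated with a short Python sketch. The species labels, cover fractions, and leaf nitrogen values below are hypothetical. Dividing by the total cover so that the weights sum to 1 is an added safeguard, not a step stated in the text.

```python
def community_weighted_mean(relative_cover, trait_values):
    """Equation (3): sum over species of relative cover times mean trait value."""
    total = sum(relative_cover.values())
    return sum(
        relative_cover[species] / total * trait_values[species]
        for species in relative_cover
    )


# Hypothetical site with two species: relative cover and leaf N (g/kg).
cover = {"species_a": 0.6, "species_b": 0.4}
leaf_n = {"species_a": 20.0, "species_b": 10.0}
cwm_ltn = community_weighted_mean(cover, leaf_n)
```

The dominant species contributes most to the community value, which is why CWMs reflect shifts in community composition as well as in trait values.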

#### *2.7. Statistical Analyses*

First, with the Mann–Kendall test [57,58], we examined the trends of GST, GSP, TVDI, and NDVI between 2000 and 2015 at the pixel scale. Moreover, we extracted the yearly value and trends of GST, GSP, TVDI, and NDVI based on each site's geographical coordinate records at the pixels where we performed field surveys. Thus, we compared the medians of climate and vegetation variables and their trends among the seven types of desert scrubs (see more details in Figures S2–S5).

Second, we calculated the VSI of 2000–2015 based on principal component regressions and used a red–green–blue (RGB) composition map to show the relative contribution of GST, GSP, and TVDI to the climatic sensitivity of desert scrubs at the pixel scale. We also extracted the VSI and its components to each site and examined the difference in them at the community level with the Wilcoxon test [63].

Next, we divided the data into three groups: VSI as the response variable and leaf traits (LTN, LTP, LTK, LC) and soil attributes (STN, STP, STK, SOC) as potential explanatory variables. We examined the normality of the data containing response and predictor variables and normalized these data in R using the scale function (see Equation (1)). Moreover, we examined the correlation between the response and predictor variables using the corrplot package (version 0.92) [64] in R. A multivariate linear model was applied to investigate the main effects of leaf and soil nutrient attributes on the variation of VSI. In this step, we also assessed the multicollinearity of each factor based on the variance inflation factor (VIF) [65]. Thus, the relative importance of each explanatory variable can be disentangled as the fraction of the variance in VSI explained in the best-fitted model.

Finally, structural equation modeling (SEM) [66] was used to explore how vegetation traits and soil nutrients drive changes in VSI using the lavaan package (version 0.6.1) [67]. Stepwise backward selection was used to remove the least significant term until the best-fitted model was identified. All statistical analyses were performed in R (version 4.1.1; R Core Team, Austria) [68], and maps were generated with ArcGIS (version 10.2; Environmental Systems Research Institute, USA) [69].

#### **3. Results**

#### *3.1. Vegetation Sensitivity Index and Its Contributors*

The overall VSI of desert scrubs was low (15.0) and distributed unevenly across the Qaidam Basin. The VSI was greater than 30 at only 1.18% of pixels, scattered near the foothills and riverbanks (Figures 1A and 2A, Table 2). The relative contribution of each climatic variable also varied heterogeneously across space. Specifically, in the western and northeastern areas, vegetation was more sensitive to changes in GSP. However, the desert scrubs in the southern and northern areas were mainly affected by temperature dynamics. In the eastern part, desert scrubs were jointly controlled by GSP and TVDI (Figure 2B). Moreover, the desert scrubs in the central region were driven by GST and TVDI together.

**Figure 2.** Vegetation Sensitivity Index (VSI, (**A**)) and its contributions (**B**) from the temperature and precipitation during the plant growing season (GST and GSP, respectively, in red and blue) and temperature vegetation dryness index (TVDI, in green).

**Table 2.** Vegetation Sensitivity Index (VSI) among different desert scrub communities across the Qaidam Basin.


Communities dominated by *Haloxylon ammodendron* and *Tamarix chinensis* were more sensitive to climate variability than others (Figure 3A). The climatic weights vary among desert-scrub types. Communities dominated by *Tamarix chinensis* were more sensitive to GST than others (Figure 3B), while communities dominated by *Haloxylon ammodendron* were more vulnerable to GSP (Figure 3C). In response to changes in TVDI, communities dominated by *Tamarix chinensis* were slightly more vulnerable than those dominated by *Haloxylon ammodendron*. However, both communities were more sensitive than other desert scrub communities across the Qaidam Basin (Figure 3D).

**Figure 3.** Comparisons of VSI (**A**) and its climatic weights among desert scrub communities. Panels (**B**–**D**) displayed the weights of mean temperature and sum precipitation during the plant growing season (noted as SGST and SGSP, respectively) as well as the temperature vegetation dryness index (STVDI). Different lowercase letters indicate significant differences among species.

#### *3.2. Effects of Soil and Plant Properties on VSI*

We found that VSI strongly correlated with the weights of GST (SGST, r = 0.81) and droughts (STVDI, r = 0.65, Figure S6). SGST was negatively correlated with STN (r = −0.53) and LTN (r = −0.51). The weight of GSP (SGSP) was negatively correlated with LC (r = −0.49). Moreover, SOC had strong correlations with SGST (r = −0.53), STK (r = 0.74), LTK (r = 0.59), and STN (r = −0.48), respectively. Across different dominant species of the 22 sites surveyed in this study, we only found that SGST declined linearly with increasing STN and LTN (Figures 4 and 5) while SGSP declined linearly with increasing STP and LC (Figures 4 and 5).

**Figure 4.** Trends of vegetation sensitivity index and its contributions with soil nutrients.

**Figure 5.** Trends of vegetation sensitivity index and its contributions with leaf nutrients.

However, we also found differential responses (linear, U-shaped, and unimodal) for different top-dominant species along the soil and foliar nutrient gradients (Figures S7 and S8). When examining the relationships of sensitivity components with soil nutrients, we only found that the SGST of *Kalidium foliatum* and STVDI of *Sympegma regelii* had unimodal relationships with SOC and STP, respectively, at a marginal significance level (*p* < 0.1, Figure S7). When examining the relationships of sensitivity components with foliar nutrient traits, we found a significant unimodal relationship between STVDI and LC for *Sympegma regelii* (*p* < 0.05, Figure S8), a marginally positive linear relationship between SGSP and LTK for *Kalidium foliatum* (*p* < 0.1, Figure S8), a significant unimodal relationship between SGSP and LTN for *Kalidium foliatum*, and a marginally positive linear relationship between SGST and LTP for *Krascheninnikovia ceratoides* (*p* < 0.1, Figure S8).

STN and LC explained 42.9% of the variance in VSI across different desert scrubs (*p* < 0.05, Table 3). SGST was significantly affected by STN and LC, which accounted for 49.0% of the variance in SGST (*p* < 0.01, Table 3). In addition to STN and LC, LTN and STK affected SGST marginally, respectively, to explain 9.11% and 7.62% of the variance in SGST (*p* = 0.053 and 0.073, respectively, Table 3). SGSP was affected by LC and STP marginally, with 14.0% and 18.9% of its variance explained by LC and STP (*p* < 0.1, Table 3). Neither soil nutrients nor plant leaf traits influenced STVDI in this study (*p* > 0.05, Table 3).

**Table 3.** Main effects of soil nutrient availability and plant nutrient traits on vegetation sensitivity index (VSI) of desert scrubs across the Qaidam Basin. Abbreviations are the same as in Table 1. d.f., degrees of freedom; F, variance ratio; *p*, significance level; η2, Eta squared, the percentage of sum of squares explained. Differences significant at the 0.01 and 0.05 levels are labeled with \*\* and \*, respectively.


#### *3.3. Causal Network of Plant Traits and Soil Nutrients to VSI under Climate Variability*

Structural equation modeling revealed the causal networking paths among plant and soil nutrient traits and climate weights to the sensitivity of desert scrubs across the Qaidam Basin (Figure 6). STN, LC, and LTN had direct adverse effects on SGST. In contrast, STK had a significant positive effect on SGST. LC had direct adverse effects on SGSP, accounting for about 24% of its total variance. STN had direct adverse effects on STVDI, while STK had a significant positive effect on STVDI. However, STN and STK accounted for only 16% of the variance in STVDI.

**Figure 6.** Structural equation modeling reveals the direct and indirect effects of leaf traits and soil properties on VSI and its components of desert scrubs across the Qaidam Basin. Notes are the same as in Table 1. The orange and dark-green lines represent the positive and negative influences, respectively. Differences significant at the 0.001 and 0.01 levels are indicated with \*\*\* and \*\*, respectively.

#### **4. Discussion**

*4.1. Climate Sensitivity among Different Desert Scrubs*

In this study, we mapped the climate sensitivity of desert scrubs in the Qaidam Basin using principal component regressions (Figure 2). We found that the climatic sensitivity of desert scrubs in this study was not as high as expected. However, this finding is consistent with Seddon et al. [14] and Yuan et al. [18], who found that arid deserts across the globe and in Central Asia, respectively, were not very sensitive to climate dynamics. This phenomenon may have resulted from the strong memory effect of desert plants (Figure S9). The vegetation memory effect generally refers to the fact that previous environmental conditions can strongly influence the current ecosystem [70]. Ogle et al. [71] quantified ecological memory in vegetation and ecosystem processes in arid and semi-arid regions and found that when the vegetation's memory effects were considered, 18–28% more of the variance in a given response variable could be explained. Seddon et al. [14] also found such a strong memory effect in global drylands' sensitivity to climate change. Moreover, the low climatic sensitivity of desert scrubs in the Qaidam Basin was also likely due to the relatively slow warming during the study period (Figure S2), at a rate of only 0.06 °C per decade. This is in line with Easterling and Wehner [72], who indicated that there would be a 10- to 20-year warming hiatus at the beginning of the 21st century. Overall, the stable climatic conditions may be another explanation for the low climate sensitivity of desert scrubs in the Qaidam Basin.

Although the desert scrubs were not as sensitive to climate change as expected, we also found high species-dependency for the seven community types involved in this study. Specifically, we found that scrub communities with tall plant individuals, such as *Tamarix chinensis* communities and *Haloxylon ammodendron* communities, were more sensitive to climate change (Figure 3). Similarly, Zhu et al. [73] found that northern China's trees and tall shrubs are more vulnerable to climate change. This is because taller plants have predictably wider water-conducting conduits, so they are more vulnerable to conduction-blocking embolisms [74]. Therefore, the response of different plant communities to climate change is influenced by the functional traits evolved to adapt to long-term environmental changes. However, vegetation also reacts and adjusts positively to environmental stresses [21]. Environment-vegetation interactions have shaped scrub communities with different tolerances, and the response of these scrub communities to climate variability varies widely in the Qaidam Basin. The environmental filtering hypothesis predicts that, excluding biotic interactions, only the species that have already evolved specific functional or phenotypic traits for survival, such as the tolerance or avoidance of extreme water deficits by dryland plants, can persist [21,75]. This might be the reason for the differences in the relative weight of temperature (SGST), precipitation (SGSP), and drought (STVDI) found among the top-dominant species (Figure 3).

#### *4.2. Desert-Scrubs' VSI Correlated with Soil and Leaf Nutrients*

Warming is one of the important drivers for community structural and functional changes in drylands. Here, we found that STN, STK, LTN, and LC were closely related to the SGST of vegetation (Table 3) and together explained more than 65% of the variance in SGST (Table 3). This might be because warming can enhance soil microorganism activities, promote SOM decomposition and N mineralization, and thereby increase the supply and availability of soil nutrients [76–78]. In addition to STN and STK, warming can also enhance the activity and supply of K ions in soil solution [79].

In addition to soil nutrients, we also found that SGST generally declined with increasing LTN and LC (Figure 5), indicating that a high photosynthetic rate with high LTN and LC can mitigate the negative effects of warming. LTN and LC are usually used to describe the photosynthetic activity under manipulated stresses in experimental studies [80,81]. Our findings were partly consistent with Li et al. [15], who reported that high-elevation grasslands have greater SGST than low-elevation grasslands on the Tibetan Plateau. Dryland plants can increase leaf photosynthetic enzymes by enhancing LTN to promote photosynthetic rate and water use efficiency [82–84]. Moreover, our findings confirmed that the relationships of the climate sensitivity components with soil and foliar nutrients varied among different top-dominant species (Figures S7 and S8), likely due to species niche overlapping and differentiation.

Water availability is the main limiting factor for desert vegetation, and we found that STP and LC were the main factors affecting vegetation SGSP, explaining 18.8% and 13.96% of the variation in SGSP, respectively. P is an important element that affects vegetation growth, development, and metabolism; however, it readily combines with Ca<sup>2+</sup> in arid soils and is then difficult for plants to take up. Pulsed precipitation in drylands can influence P transport, transformation, and availability through biochemical processes that control organophosphorus mineralization [85,86]. This might be the reason for the declining SGSP with increasing STP (Figure 4) in our study.

LC is closely related to the photosynthesis of vegetation. Under drought conditions, vegetation reduces its water loss by decreasing leaf area and lowering the light saturation point, which impairs ATP synthesis and reduces carboxylase activity, resulting in impaired leaf photosynthesis and thus reduced LC fixation [87]. Pulsed precipitation can briefly mitigate drought conditions and provide good moisture conditions for photosynthesis, thus increasing LC accumulation. This is why we also found that SGSP declined with increasing LC content across different species (Figure S8). In addition, we found that the LC of the seven arid scrub species was concentrated in the 300–400 g kg<sup>−1</sup> interval, which may imply that the coexisting species have evolved overlapping niches to survive successfully in the Qaidam Basin, Qinghai-Tibetan Plateau.

It is worth noting that neither soil nor plant nutritional properties alone can well explain the variance in STVDI, which describes the availability of soil moisture, in other words, the combined effects of temperature and precipitation. In this study, we found that STVDI could be associated with STN and STK at the same time (Figure 6). Both STN and STK could explain 16% of the variance in STVDI across different species and sites. Soil water deficit caused by the decoupling of temperature and precipitation can limit soil microbial activity and slow down soil N mineralization [88]. Moreover, a drying environment with high temperatures and limited precipitation can also exacerbate the volatilization of gaseous N and lead to high evaporative demand [89], which in turn reduces the availability of STN [90]. Dry conditions can enhance soil inorganic N content [91]; however, this N is hardly taken up by plants due to diffusion limitations [92]. The uptake of potassium by vegetation is dependent on soil moisture [93]; soil water deficit reduces soil potassium release capacity and availability [94] and inhibits soil potassium mobility, limiting potassium uptake by vegetation.

#### *4.3. The Networks of Direct and Indirect Environmental Influences on VSI*

The response of vegetation to climate change is characterized by the integrated effect of multiple factors and processes, forming complex and specific climate-species relationships [95,96]. In this study, we found that soil and foliar nutrient properties affect the scrub communities' climate sensitivity by regulating different climate sensitivity components. Moreover, such regulating effects depend on the spatial scale, that is, on whether they are examined across sites or for a given species. Finally, we found that the effect of soil nutrient effectiveness on VSI was greater than that of leaf nutrient effectiveness. The path strengths of STK and STN on SGST were greater than those of LC and LTN (Figure 6), and the explanatory power of STK and STN on the variance in SGST (37.52%) was greater than that of LC and LTN (28.17%, Table 3). Soil nutrients provide indispensable nutrients for the growth and development of vegetation, and the decoupling of soil nutrient cycles due to climate warming may negatively affect dryland ecosystem services [40]. We should pay attention to the role of soil nutrients in drylands to maintain ecosystem stability. Therefore, in the future, the dependent and combined effects of biotic and abiotic factors on vegetation's sensitivity and vulnerability should be well examined and disentangled.

#### *4.4. Limitations and Future Work*

In this study, we integrated remote sensing data and field measurements to investigate the mechanisms behind desert scrub's sensitivity to climate change in the Qaidam Basin, Qinghai-Tibetan Plateau. However, there are still several uncertainties in this study. First, there might be a scaling-mismatch problem between remote sensing data and field measurements. NDVI reflects information on all vegetation in the whole region, whereas, limited by budgets, only dominant desert scrub plants were sampled and measured for functional traits. Herbaceous plants were not fully considered in field observations. This may affect the accuracy of the CWMs of plant functional traits and lead to some bias in the interpretation of community climate sensitivity. In further studies, both scrubs and herbaceous plants should be well considered because they have different strategies to adapt to changes in physical circumstances.

In addition, this study only considered several soil nutrients and failed to take into account the effects of microorganisms and trace elements in soils. A recent review suggests that soil microbes dominate soil life activities by mediating nutrient cycling, decomposing organic matter, inhibiting plant diseases, influencing soil structure, and maintaining vegetation productivity, are major drivers of soil ecosystem change, and have game-changing potential in restoring soil function [97]. Moreover, soil trace elements are directly involved in the metabolic activities of organisms and influence the growth and development of vegetation, and have received increasing attention globally [98]. Therefore, future studies should also integrate the effects of multiple factors to obtain a more comprehensive understanding of the vegetation–climate relationship.

#### **5. Conclusions**

This study assessed and disentangled the sensitivity of desert scrubs to climate variability and its components with time series of NDVI and weather data at a large scale across the Qaidam Basin. The regulatory mechanisms of plant and soil functional/nutritional traits behind the desert scrub's climate sensitivity were investigated and compared among multiple communities dominated by different plant species. The results confirmed that the sensitivity of vegetation to temperature change was mainly regulated by the contents of leaf carbon and nitrogen as well as soil nitrogen and potassium. In contrast, the sensitivity of vegetation to precipitation change was regulated by the contents of leaf carbon and soil total nitrogen. Neither soil nutrients nor plant foliar traits alone can well explain scrubs' sensitivity to droughts. Plant foliar traits and soil nutrient properties indirectly regulate the different components of vegetation's climate sensitivity. Due to harsh physical circumstances in the Qaidam Basin and limited available funds, a scaling-mismatch problem might still exist between remote-sensing data and field measurements in this study. Nevertheless, this study combined soil and plant functional traits to provide a new perspective for investigating the mechanisms behind desert vegetation dynamics under climate change when assessed with large-scale remote-sensing data. These findings also highlight the necessity of tailoring management and conservation strategies to the different factors that regulate the vegetation–climate relationship among different plant communities.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs14184601/s1.

**Author Contributions:** J.W. and H.C. conceived ideas and designed field protocols. B.C., J.W. and M.L. collected data. B.C. and H.C. performed chemical analyses. J.W. and B.C. analyzed data, prepared figures, and led the writing. M.L., S.F., M.C.M., A.N., K.W., B.T. and H.C. interpreted the results. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was jointly supported by the Second Tibetan Plateau Scientific Expedition and Research (STEP, 2019QZKK1002), the Innovation Talent Exchange of Foreign Expert Program under the Belt and Road Initiative (DL2021056001L), the Key Project of the Hebei Normal University (L2021Z05) and the National Natural Sciences Foundation of China (41877448 and 40971118).

**Acknowledgments:** J.W. was funded with a two-year scholarship from January 2017 to April 2019 by the Alexander von Humboldt Foundation in Germany. J.W. has been supported by the Young Talent Scientist Program of the Chinese Academy of Agricultural Sciences since December 2019. S.F. was funded by the Deutsche Forschungsgemeinschaft (DFG, project number 192626868–SFB 990/2-3). We appreciate the valuable comments and suggestions given by Ben Niu, Zhipeng Wang, and Yun Jäschke on the first draft of this manuscript.

**Conflicts of Interest:** The authors declare that they have no known competing financial interests or personal relationships that could have influenced the work reported in this paper.

#### **References**


## *Article* **Satellite Fog Detection at Dawn and Dusk Based on the Deep Learning Algorithm under Terrain-Restriction**

**Yinze Ran <sup>1</sup>, Huiyun Ma <sup>1</sup>, Zengwei Liu <sup>1</sup>, Xiaojing Wu <sup>2</sup>, Yanan Li <sup>1</sup> and Huihui Feng <sup>1,\*</sup>**


**\*** Correspondence: hhfeng@csu.edu.cn

**Abstract:** Fog generally forms at dawn and dusk and exerts serious impacts on public traffic and human health. Terrain strongly affects fog formation, which provides a useful clue for fog detection from satellite observations. With the aid of Advanced Himawari-8 Imager data (H8/AHI), this study develops a deep learning algorithm for fog detection at dawn and dusk under terrain-restriction and an enhanced channel domain attention mechanism (DDF-Net). The DDF-Net is based on the traditional U-Net model, with digital elevation model (DEM) data acting as auxiliary information to separate fog from low stratus. Furthermore, the squeeze-and-excitation network (SE-Net) is integrated to optimize the information extraction and eliminate the influence of solar zenith angles (SZA) on the spectral characteristics over a large region. Results show acceptable accuracy of the DDF-Net. The overall probability of detection (POD) is 84.0% at dawn and 83.7% at dusk. In addition, the terrain-restriction strategy improves the results at the edges of foggy regions and reduces the false alarm rate (FAR) for low stratus. The accuracy is expected to improve when training at a seasonal or monthly scale rather than at a longer temporal scale. The results of our study help to improve the accuracy of fog detection, which could further support relevant traffic planning or healthy travel.

**Keywords:** H8/AHI; fog detection at dawn and dusk; DEM; U-Net; SE-Net

#### **1. Introduction**

Fog refers to the suspension of microscopic water droplets in the atmosphere [1]. As during the night, solar radiation is weak at dawn and dusk, resulting in a relatively high probability of fog formation because of the low surface air temperature and high vapor saturation [2]. Dense fog seriously reduces horizontal visibility, which adversely affects public traffic and human health (particularly during rush hours at dawn and dusk) [3]. In addition, anthropogenically generated chemicals dissolve in fog water, which can strongly worsen air pollution [4]. Therefore, fog detection at dawn and dusk is crucial to effectively support traffic planning and to provide information and bulletins for reducing risks to human health [5].

**Citation:** Ran, Y.; Ma, H.; Liu, Z.; Wu, X.; Li, Y.; Feng, H. Satellite Fog Detection at Dawn and Dusk Based on the Deep Learning Algorithm under Terrain-Restriction. *Remote Sens.* **2022**, *14*, 4328. https://doi.org/10.3390/rs14174328

Academic Editor: Xander Wang

Received: 30 June 2022 Accepted: 29 August 2022 Published: 1 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Traditional terrestrial fog detection mainly relies on meteorological observation stations. However, it is difficult to apply on a large scale due to the limited spatial and temporal resolutions. With its high temporal resolution and spatially continuous observations, satellite remote sensing shows great potential for fog detection on a large scale [6]. Physically, satellite fog detection algorithms rely on the fact that the emissivity of opaque water clouds is lower in the mid-infrared (MIR) band than in the thermal infrared (TIR) band [7]. Based on this theory, Eyre et al. [8] used the 3.7 μm and 11 μm brightness temperature (BT) of the Advanced Very High-Resolution Radiometer (AVHRR) to identify nighttime fog and low stratus. Considering the large difference in the dual-channel brightness temperature difference (BTD) between fog, surface and clouds, the BTD is widely used in nighttime fog detection [9–14]. However, the method is only suitable for fog detection at nighttime, for which it is difficult to isolate the fog and low stratus (FLS) simultaneously [15]. The MIR band is affected by the solar radiation reflected by the target in the daytime, which causes the dual-channel differential separation detection threshold to vary with the solar zenith angle (SZA) [2]. At dawn and dusk, the MIR reflection is relatively weak due to the high SZA. In regions near the terminator, the BTD of fog and that of the surface are similar or consistent, resulting in large uncertainty in fog detection. Gurka et al. [16] analyzed the fog dissipation process and its correlation with the thickness of the fog layer based on the visible light (VIS) band of the Synchronous Meteorological Satellite (SMS-1). Rao et al. [17] discussed the feasibility of identifying FLS in the VIS band of 0.64 μm. At present, daytime fog detection algorithms are mainly based on thresholds of the spectral and textural differences between cloud, fog, and surface in the 0.67 μm, 3.7 μm, and 11 μm bands [18,19]. However, the MIR and VIS bands are affected by the high SZA at dawn and dusk, and the difference in spectral characteristics between fog and surface is weak, leaving low confidence in fog detection at dawn and dusk. Using the difference of BTD and reflectivity at 0.67 μm, Yoo et al. [5] proposed an FLS detection algorithm at dawn based on the Communication, Ocean and Meteorological Satellite (COMS) and Fengyun-2D (FY-2D) satellites. However, it is difficult to monitor fog areas in real time with this method due to the low temporal resolution of the polar-orbiting satellites. Yang et al. [20] used the geostationary Fengyun-4A (FY-4A) and Himawari-8 (H8) satellites, which have a higher temporal resolution, to retrieve a probability index of FLS over Japan and its surroundings at dawn, which provides a methodological reference for fog detection at dawn and dusk.

The algorithms mentioned above mainly rely on the spectral difference between fog and other ground objects. Surface environments (particularly the terrain) also strongly affect fog formation and detection [21], but they are not considered in most previous algorithms. In general, fog forms near the ground, while clouds form at high altitudes [22]. Furthermore, the edge of fog is restricted by terrain [13], which results in unique spatial variation, vertical structure, and optical properties. Adding terrain information to the fog detection algorithm is therefore expected to further improve the accuracy of fog detection. For example, Shang et al. [22] combined DEM with traditional cloud/haze detection indicators to improve the accuracy of cloud, haze, and clear-sky recognition. However, the terrain features of fog classification are the same as the edge elevation of objects, while the spectral features of remote sensing images are mainly at pixel scale. Therefore, it is difficult to fuse the two features in traditional fog detection methods.

The deep learning approach maps input data to detection results (here, the detection of fog) through a series of data transformations (layers), replacing the complex multi-stage process with a simple, end-to-end learning model. By using a series of nonlinear functions, the deep learning model describes and classifies the characteristics of the targets [23], which provides a potentially efficient way for fog detection at dawn or dusk. Based on the U-Net model, Jacob et al. [24] proposed an RS-Net cloud detection algorithm that effectively improved the efficiency of cloud detection. Wu et al. [25] proposed a geographic information-driven network (GeoInfo-Net) for cloud and snow detection. In addition to remote sensing data, the network encodes auxiliary data sets of elevation, latitude, and longitude for network training, which helps to improve the accuracy significantly. However, fog detection is sensitive to the remote sensing bands under different terrain conditions and SZA, resulting in varying weight values of each band during deep learning. Meanwhile, fog is a low-probability event, so the model training strategy needs to be adjusted to speed up the convergence of model training and avoid overfitting. These problems leave great uncertainty for the application of deep learning in satellite fog detection, but they can be mitigated by improving the application of the deep learning model. For example, the sensitivity of fog detection to different SZA and terrain conditions can be effectively alleviated by introducing the Squeeze-and-Excitation Networks (SE-Net), which automatically obtain the weights of the contributions of the parameters [26]. The uneven distribution of training data samples can be addressed through batch normalization of the input data. Additionally, the training process can be optimized by designing independent adaptive learning rates for different parameters and by using efficient optimization algorithms [27]. However, the relevant research is still rare, and further investigation is warranted.

This study aims to develop a deep learning-based algorithm for satellite fog detection under the terrain-restriction. The DEM data is added into the information input layer to distinguish fog and low cloud. Furthermore, the Squeeze-and-Excitation Networks (SE-Net) is integrated to optimize the model's information extraction to eliminate the influence of different SZA over a large region. Finally, a deep learning fog detection algorithm under terrain constraints and enhanced channel domain attention mechanism (DDF-Net) is developed for fog detection at dawn and dusk. This study is organized as follows: Section 2 describes the study area, data, details regarding pre-processing methodologies, and the realization of the algorithm. Section 3 includes the results of quantitative and qualitative validation. Finally, the performance and limitation of the DDF-Net are concluded and discussed in Section 4.

#### **2. Materials and Methods**

#### *2.1. Study Area and Data Sources*

Figure 1 shows an overview of the study area, which is located in northern China (104°30′ E~136°00′ E and 33°00′ N~47°30′ N). The western part of the study area includes the Inner Mongolia Plateau and Loess Plateau, with an average altitude of more than 1500 m. The eastern part of the study area is located on the relatively flat third step of China's terrain, where the average altitude drops to 500–1000 m. The study area lies east of the Heihe-Tengchong Line and has a developed economy and a large population. The climate is characterized by a rich source of water vapor from the Western Pacific, which helps to form fog on more than 30 days per year [28]. The frequent fog at dawn and dusk exerts a serious impact on local traffic, especially aviation and highways, as well as on economic development and people's physical and mental health [29].

**Figure 1.** The study area.

(1) Satellite data

We adopt the Advanced Himawari-8 Imager (AHI) data with a temporal resolution of 10 min for the investigation. The data set has 16 channels ranging from 0.47 μm to 13.3 μm; details are given in ref. [30]. Following previous studies [22,31–33], we select bands 3, 5, 6, 7, 11, 13, and 14 for model training and validation. AHI data on foggy days from November to December during 2015~2017 are selected to build 3122 fog mask images (size: 256 × 256), which are then divided into 70% for training and 30% for validation.
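The 70%/30% division described above can be sketched as follows. This is only an illustration: the text does not say how the images were assigned to each subset, so the random shuffle, the fixed seed, and the function name `split_indices` are assumptions.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation sets.

    The random shuffle and the seed are assumptions made for illustration;
    the study only states the 70%/30% proportions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 3122 fog mask images, as stated in the text
train_idx, val_idx = split_indices(3122)
print(len(train_idx), len(val_idx))
```

With 3122 images, a 70% share truncated to a whole number gives 2185 training images and leaves 937 for validation.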

#### (2) Ground observation data

The ground observation data, available from the ground observation stations of the China Meteorological Administration (CMA), were used to execute and validate the algorithm. The data include observations 8 times per day: 02:00 local time (LT), 05:00 LT, 08:00 LT, 11:00 LT, 14:00 LT, 17:00 LT, 20:00 LT, and 23:00 LT. Among them, the data at 08:00 LT and 17:00 LT are used to evaluate the accuracy of the algorithm. Under most criteria, conditions are marked as foggy when the visibility is less than 1 km. In this study, we followed these criteria and classified fog as strong, dense and hazy for visibility ranges of 0~200, 200~500 and 500~1000 m, respectively. Meanwhile, the CMA promulgated the national standard Grade of fog forecast (GB/T 27964-2011) [34], which defines haze as horizontal visibility ranging from 1.0 to 10.0 km. Furthermore, it is less harmful to traffic when the visibility is greater than 3 km. Therefore, we defined haze as visibility ranging from 500 to 3000 m.
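As a rough illustration of the visibility classes above, the following sketch assigns a station observation to a class. It uses the 500 to 3000 m haze range that the paragraph settles on; the handling of the boundary values, the "clear" label, and the function name `classify_visibility` are assumptions, not part of the study.

```python
def classify_visibility(visibility_m):
    """Map horizontal visibility (in metres) to the classes described in the text.

    Boundary handling (which class owns exactly 200, 500 or 3000 m) and the
    "clear" label for larger visibilities are assumptions for illustration.
    """
    if visibility_m < 0:
        raise ValueError("visibility must be non-negative")
    if visibility_m < 200:
        return "strong fog"
    if visibility_m < 500:
        return "dense fog"
    if visibility_m <= 3000:
        return "haze"
    return "clear"


for v in (150, 350, 800, 2500, 5000):
    print(v, classify_visibility(v))
```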

#### (3) Terrain data

The terrain data used in this study came from the SRTM1-DEM data with a spatial resolution of 30 m released by the United States Geological Survey (USGS). To match the spatial resolution of the H8/AHI data, the DEM data was resampled to 2 km.

#### *2.2. Deep Learning Algorithm*

Deep learning realizes the mapping of input data to target results through a series of data transformations (layers) (Figure 2). The basic principle is to output the image as a target result according to the mapping function *f* of the initial network model [23]. The model compares the prediction with the truth and then adjusts the parameters and weights iteratively to obtain the optimal prediction result. The components of the deep learning model generally include the mapping function *f*, convolutional layers (CL) and pooling layers (PL) [23]. Specifically, *f* defines the functional relationship between the input *x* and the target mask classification map *y*; CL uses a specified number of convolution kernels in each layer to extract the feature information of each pixel of *x*. The shallow layers capture the textural and spectral features that are used as the basis for classification, and the deeper layers represent higher-level semantics, i.e., the feature information obtained after several convolutions (feature extraction) [35]. The PL down-samples the channel feature information by dividing the input image into several rectangular areas and taking the maximum value of each sub-area. In this way, it continuously reduces the spatial size of the data, which reduces the number of parameters and improves the computation speed [23]. Furthermore, the PL can ensure that the network continuously extracts and utilizes the information through the multi-layer structure of biological neurons. The activation function (ReLU) is located between the CL and the PL and is used to achieve nonlinear transformation. The sigmoid function maps the output of the last CL to values between 0 and 1, and the classification of each pixel of the input *x* is then realized by setting an appropriate threshold.

In the execution of the deep learning model, the target probability is generated according to the last CL and the sigmoid function. A pixel is defined as a target pixel when the probability is greater than a certain threshold. The deviation between the actual fog coverage and the predicted target mask is calculated using a binary cross-entropy loss function, and the parameter gradient of the mapping function *f* is calculated and updated by the backpropagation (BP) algorithm [36] until the optimal parameters of the model are obtained. Specifically, the actual fog coverage is generated by the combination of visual interpretation and ground observation, which is used as the label data for training or validating the algorithm.
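The sigmoid output, the probability threshold and the binary cross-entropy loss described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the threshold of 0.5, the clipping constant `eps`, and the toy logits and labels are assumptions.

```python
import math


def sigmoid(z):
    """Map a raw network output (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Toy example: four pixels, with label 1 = fog and 0 = non-fog (assumed values)
logits = [2.0, -1.0, 0.5, -3.0]
labels = [1, 0, 1, 0]

probs = [sigmoid(z) for z in logits]
mask = [1 if p > 0.5 else 0 for p in probs]  # 0.5 is an assumed threshold
loss = binary_cross_entropy(labels, probs)
print(mask, loss)
```

In training, the gradient of this loss with respect to the network parameters is what backpropagation computes; the sketch only evaluates the loss itself.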

**Figure 2.** Schematic diagram of the deep learning model. ReLU is used to achieve non-linear transformation between the CL and PL layers.

#### *2.3. The SE-Net Module*

The output of the CL does not consider the dependencies among channels. SE-Net allows the network to selectively enhance features and suppress useless ones, so we adopt this approach in our study. The SE-Net module contains one global average pooling operation (*GP*), two small fully connected layers (*FC*), one sigmoid function and one channel-weighting operation; details are given in ref. [26]. Specifically, the global pooling (*GP*) refers to the average of all the pixels of the feature map on each channel. The *FC* part includes two sub-modules, *FC*<sub>1</sub> and *FC*<sub>2</sub>. *FC*<sub>1</sub> consists of fully connected layers with *C*/16 filters, which compute weighted sums and reduce the number of parameters to improve the calculation efficiency. The activation function (ReLU) is located between *FC*<sub>1</sub> and *FC*<sub>2</sub> and is used to achieve nonlinear transformation. *FC*<sub>2</sub> consists of fully connected layers with *C* filters, which ensures that the number of outputs equals the number of channels. Finally, the sigmoid function normalizes the learned weights to between 0 and 1 for dimensionless processing of different features.

Execution steps of the SE-Net module are as follows:

(1) Obtaining the global information from the feature channels. The information of the DEM-AHI fog image at dawn and dusk is extracted through the CL formed by *C* convolution kernels, and an image *X* with *C* feature channels is formed. *C*<sub>*p*</sub> is calculated by GP in each feature channel of the image *X* (Equation (1)):

$$C_p = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} x_p(i, j) \tag{1}$$

where *C*<sub>*p*</sub> is the global information of the feature channel *p* after GP, *W* and *H* are the length and width of the image *X*, respectively, and *x*<sub>*p*</sub>(*i*,*j*) represents the pixel value of *X* at point (*i*,*j*) in channel *p*.

(2) Calculating the correlation between the feature channels. According to the calculated correlation results, the weight of a feature channel that improves the accuracy is increased, while the weight of a suppressed or ineffective feature channel is decreased (Equation (2)):

$$F_{P\_F_1} = \mathrm{ReLU}\left(FC_1\left(C_p\right)\right) \tag{2}$$

where *F\_P\_F*<sub>1</sub> is the feature of output channel *p* after *FC*<sub>1</sub> and the ReLU activation function, *FC*<sub>1</sub> is the first fully connected layer [23,26], and the size of *F\_P\_F*<sub>1</sub> is *C*/16 × 1 × 1.

(3) Calculating the weight of each feature channel. The sigmoid function is used to calculate the weight factor *W*<sub>*p*</sub> of the channel *p* (Equation (3)):

$$W_p = \mathrm{Sigmoid}\left(FC_2\left(F_{P\_F_1}\right)\right) \tag{3}$$

where *FC*<sub>2</sub> is the second fully connected layer, and *W*<sub>*p*</sub> represents the weight factor of channel *p* after applying *FC*<sub>2</sub> and the sigmoid function.

(4) Attention learning of channels. The weighted attention feature *x*′<sub>*p*</sub> is acquired through the weight factor *W*<sub>*p*</sub> and the pixel value *x*<sub>*p*</sub> of *X* at channel *p* (Equation (4)):

$$x'_p = W_p \cdot x_p \tag{4}$$

where *x*′<sub>*p*</sub> represents the attention feature of channel *p* learned by SE-Net.
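The four steps above can be traced with a small framework-free sketch that follows Equations (1) to (4) directly. It is an illustration under stated assumptions, not the authors' code: the fully connected layers are written without bias terms, the weight matrices `w1` and `w2` are made-up values, and the toy example reduces two channels to one hidden unit instead of using the *C*/16 reduction.

```python
import math


def se_block(x, w1, w2):
    """Channel attention following Equations (1)-(4).

    x  : list of C channels, each an H x W nested list of pixel values
    w1 : weights of FC1, shape (C_reduced, C); bias omitted (assumption)
    w2 : weights of FC2, shape (C, C_reduced); bias omitted (assumption)
    Returns the rescaled channels and the per-channel weights.
    """
    # Eq. (1): global average pooling, one value per channel
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Eq. (2): first fully connected layer followed by ReLU
    hidden = [max(0.0, sum(w * c for w, c in zip(row, pooled))) for row in w1]
    # Eq. (3): second fully connected layer followed by sigmoid
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Eq. (4): rescale every pixel of channel p by its weight
    scaled = [[[wp * v for v in row] for row in ch] for wp, ch in zip(weights, x)]
    return scaled, weights


# Toy input: C = 2 channels of size 2 x 2 (hypothetical values)
x = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
w1 = [[1.0, 1.0]]      # FC1: 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # FC2: 1 hidden unit -> 2 channel weights

scaled, weights = se_block(x, w1, w2)
print(weights)
```

In this toy case the first channel receives the larger weight, so its pixels are attenuated less than those of the second channel, which is the selective enhancement and suppression the module is meant to provide.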

#### *2.4. Deep Learning-Based Algorithm of Fog Detection under Terrain Restriction*

The spatio-spectral characteristics of fog, medium/high clouds and surfaces provide the physical basis for the fog detection algorithm, while the weak solar radiation severely hinders the separation of fog from the surface and clouds at dawn and dusk. The U-Net model has proven to be robust and effective because of its excellent performance and transferability in image segmentation [24,37], so it is used as the basic model in our algorithm. Furthermore, terrain strongly affects the formation and movement of fog [21], which should be taken into consideration in the fog detection algorithm. To overcome the challenges above, this study develops a deep learning-based algorithm for satellite fog detection at dawn and dusk under terrain constraints. The improvement of our method lies in making the algorithm adaptive to different SZA and terrain conditions by introducing a channel attention mechanism and adding DEM data to the remote sensing images. Specifically, our method extracts fog from the AHI-DEM datasets by integrating the SE-Net module in the CL of the U-Net model. Figure 3 depicts the flowchart of the algorithm. The inputs include the ground observation data, the H8/AHI satellite data (bands 3, 5, 6, 7, 11, 13, and 14) and the terrain data (DEM), while the outputs are the spatial coverage of detected fog at different times.

The main structure of the DDF-Net is shown in Figure 4. The input side includes 5 CL, 4 PL and 3 SE-Net modules. The CL extract the feature information of fog, cloud and surface at dawn or dusk using different numbers of 3 × 3 convolution kernels. The BN operation normalizes the input data, and the ReLU function increases the model's ability to fit nonlinear relationships. Each PL keeps the maximum value of every 4 pixels to reduce the dimension of the feature information and down-sample the feature map, which reduces the number of parameters and accelerates training. The output side includes 4 CL and 4 up-sampling layers. Each up-sampling layer restores the dimension of the feature map by duplicating each pixel into 4 pixels, which leaves the up-sampled feature map with less local information. To compensate, the skip connections fuse the feature maps to establish contextual feature connections and restore local information, which helps the model obtain semantic and spatial context information [37]. Composed of these multi-layer structures, the DDF-Net model can approximate highly complex nonlinear functions. This is beneficial to our research because the state of fog differs considerably under different terrain and SZA conditions.
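The down-sampling and up-sampling operations described above can be illustrated with a minimal pure-Python sketch. It assumes that "the maximum value of the 4 pixels" means a non-overlapping 2 × 2 window and that duplication copies each pixel into a 2 × 2 block; the function names and the toy feature map are hypothetical and are not taken from the DDF-Net code.

```python
def max_pool_2x2(fmap):
    """Down-sample a 2-D feature map by keeping the maximum of each 2x2 block."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


def upsample_2x(fmap):
    """Up-sample a 2-D feature map by duplicating every pixel into a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # duplicate along columns
        out.append(wide)
        out.append(list(wide))                     # duplicate along rows
    return out


# Toy 4x4 feature map (hypothetical values).
fmap = [[1, 2, 5, 6],
        [3, 4, 7, 8],
        [9, 1, 2, 3],
        [1, 1, 4, 0]]
pooled = max_pool_2x2(fmap)     # 4x4 -> 2x2
restored = upsample_2x(pooled)  # 2x2 -> 4x4, but the local detail is not recovered
```

The restored map has the original size but only one distinct value per 2 × 2 block, which is the loss of local information that the skip connections are meant to compensate.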

**Figure 3.** Flow chart of the methodology.

**Figure 4.** The DDF-Net's architecture. The top/bottom numbers of the rectangle represent the channels of the feature map, and the right numbers are the size of the feature map.

#### 2.4.1. Building the H8/AHI-DEM-Fm Fog Mask Dataset

Based on the H8/AHI data, this study first builds the training dataset (Figure 5). The DEM data are added to the AHI data to refine the model's detection accuracy at the edges of foggy areas. Fog pixels are labeled according to the spectral, textural and movement characteristics of fog, surface and cloud.

**Figure 5.** The process of making the H8/AHI-DEM-Fm dataset.

(1) Building the AHI-DEM dataset at dawn and dusk.

Remote sensing images covering dawn and dusk are selected according to the SZA. Band clipping and sea-land mask clipping are then performed. The DEM is combined with the AHI data as an additional channel to construct the AHI-DEM dataset covering dawn and dusk.

(2) Building the fog mask dataset Fm.

The fog pixels marked on the AHI-DEM data are used to build the fog mask data Fm. Pixels are marked as daytime, nighttime or twilight fog using the textural and spectral difference (TSD), the brightness temperature difference (BTD) and visual interpretation (VI), respectively. TSD refers to the differences in texture and spectrum among daytime fog, clouds, and surfaces. At nighttime, the BTD of fog is lower than that of surfaces and medium/high clouds. The criterion of VI is that fog is spatially and temporally consistent within adjacent 30 min intervals: the labeling of the image at the previous moment is used to determine whether there are fog pixels at the current moment. VI is mainly used to identify fog whose spectral information is similar to or consistent with that of the surface at dawn and dusk. In addition, the ground observation data are used to strictly calibrate the Fm.

(3) Building the H8/AHI-DEM-Fm fog mask dataset.

AHI-DEM and Fm are cropped simultaneously into 256 × 256 pixel sub-images with 30% overlap to improve the training efficiency of the model and its ability to capture edge information. The H8/AHI-DEM-Fm dataset is the final product of this process.
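The cropping step can be sketched as follows. This is only an illustration under stated assumptions: the stride is taken as 70% of the tile size rounded to the nearest pixel, the last tile is shifted back so that it ends at the image border, and the scene size of 1000 × 1200 pixels is hypothetical. None of these details are given in the text.

```python
def tile_starts(length, tile=256, overlap=0.3):
    """Start offsets of tiles along one axis for a given fractional overlap."""
    if length < tile:
        raise ValueError("image is smaller than one tile")
    stride = max(1, int(round(tile * (1.0 - overlap))))
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        # Assumption: shift the last tile back so that it reaches the border.
        starts.append(length - tile)
    return starts


def tile_corners(height, width, tile=256, overlap=0.3):
    """Upper-left (row, col) corners of all sub-images of an image."""
    return [(r, c)
            for r in tile_starts(height, tile, overlap)
            for c in tile_starts(width, tile, overlap)]


# Hypothetical scene of 1000 x 1200 pixels.
corners = tile_corners(1000, 1200)
```

Applying the same list of corners to both the AHI-DEM stack and the Fm mask keeps every sub-image aligned with its label.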

#### 2.4.2. Integration of the SE-Net Module

SZA strongly affects the spectral characteristics of fog, even within the same remote sensing band. For example, the VIS band is highly effective in low-SZA regions (because fog is more reflective than the surface and less reflective than clouds), whereas it is ineffective in nighttime regions. The influence of each band on fog detection under different SZA should therefore be taken into account, which is very important for large-scale fog detection at dawn and dusk. Meanwhile, terrain strongly affects fog formation and development and is thus one of the key parameters of fog detection [21]. The SE-Net module can automatically learn, in a data-driven manner, the weight of each band for the fog detection task under different SZA and terrain conditions, which gives it great potential for fog detection under different SZA.

Assuming that *X* is the feature map with *c* channels output by the ReLU function of a certain layer in the encoder, *X* can be expressed as:

$$X = \{\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_c\} \tag{5}$$

Through training, SE-Net automatically learns the weights *W* of the feature maps on each channel:

$$W = \{w\_1, w\_2, \dots, w\_c\} \tag{6}$$

Finally, the feature map *X′* output by SE-Net can be expressed as:

$$X' = X \cdot W = \{\mathbf{x}\_1 \times w\_1, \mathbf{x}\_2 \times w\_2, \dots, \mathbf{x}\_c \times w\_c\} \tag{7}$$

The SE-Net module is embedded after the first three ReLU functions (rather than after every ReLU in the encoder). This improves the extraction of fog information under different terrain and SZA conditions while avoiding the introduction of too many parameters, which would slow down model training.
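Equations (5)–(7) can be illustrated with a small pure-Python sketch of a squeeze-and-excitation block. It assumes the standard SE layout (global average pooling, a first fully connected layer with ReLU, a second fully connected layer with Sigmoid, then channel-wise scaling) and omits bias terms; the function name `se_block`, the tiny feature map and the untrained weights `w1` and `w2` are hypothetical.

```python
import math


def se_block(x, w1, w2):
    """Re-weight the channels of x (a list of 2-D maps) by learned attention."""
    # Squeeze: global average of every channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, first fully connected layer followed by ReLU.
    h = [max(0.0, sum(a * v for a, v in zip(row, z))) for row in w1]
    # Second fully connected layer followed by Sigmoid: one weight per channel.
    w = [1.0 / (1.0 + math.exp(-sum(a * v for a, v in zip(row, h))))
         for row in w2]
    # Scale: multiply every pixel of channel p by its weight w_p (Equation (7)).
    scaled = [[[w_p * v for v in line] for line in ch] for ch, w_p in zip(x, w)]
    return scaled, w


# Two 2x2 channels and hypothetical, untrained weights.
x = [[[1.0, 1.0], [1.0, 1.0]],
     [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]      # FC1: 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # FC2: 1 hidden unit -> 2 channel weights
x_scaled, weights = se_block(x, w1, w2)
```

In a trained network the two weight matrices are learned, so the per-channel weights reflect how informative each input band is for the scene at hand.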

#### 2.4.3. Adjusting the Training Strategy of the Model

To optimize the network model, the Adam gradient descent algorithm (Adam) [27,38,39] and BN [40] are introduced to adjust the training strategy. Adam speeds up the training process by automatically determining the direction in which the parameters are optimized [27,38,39]. BN is embedded after each CL to normalize the input data and thus mitigate sudden changes in the distribution of a given batch of training data (Figure 4). In addition, BN improves the training speed and shortens the training time [38]. The configuration parameters (software, graphics card, programming framework, etc.) are shown in Table 1.

**Table 1.** Training platform and parameter configuration information.


(1) With the initial parameters of the model set (Table 1), the encoder extracts the feature map of the input image. Batch size is the number of images input to the network in each batch during model training. Epoch is the maximum number of rounds of model iteration. *α<sub>initial</sub>* is the initial learning rate. Decaying Epoch indicates that the model adjusts the learning rate dynamically according to Equation (8) once the number of trained epochs reaches Epoch minus Decaying Epoch.

$$\alpha = \alpha\_{Last-round} \times \left(\frac{\text{Decaying Epoch}}{\text{Epoch}}\right)^{0.9} \tag{8}$$

(2) The decoder combines high-level semantic and low-level spatial information to complete the initial classification of pixels. The model finally outputs a binary result of 0 (non-fog) or 1 (fog). The error between the predicted result and the actual fog coverage is then calculated (Equations (9)–(11)). BP assigns the error to each parameter, and Adam updates the parameters using the learning rate provided by the model.

(3) Steps (1) and (2) are iterated until the indicators no longer change significantly and fluctuate around a stable value.

(4) New remote sensing data are input to the trained DDF-Net model to detect fog over the whole study area. The performance of the algorithm is finally evaluated against ground observations.
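The learning-rate adjustment in step (1) can be sketched as below. The sketch follows one literal reading of Equation (8): the rate stays at its initial value until Epoch minus Decaying Epoch rounds have been trained and is then multiplied by the constant factor (Decaying Epoch / Epoch)^0.9 in every later round. This reading, the function name and the numerical settings are assumptions, since the values of Table 1 are not reproduced here.

```python
def learning_rates(alpha_initial, epochs, decaying_epoch):
    """Learning rate of every epoch under a literal reading of Equation (8)."""
    factor = (decaying_epoch / epochs) ** 0.9
    start_decay = epochs - decaying_epoch  # assumption: decay starts here
    rates, alpha = [], alpha_initial
    for epoch in range(epochs):
        if epoch >= start_decay:
            alpha = alpha * factor         # alpha = alpha_last_round * factor
        rates.append(alpha)
    return rates


# Hypothetical settings, not the values used by the authors.
rates = learning_rates(alpha_initial=1e-3, epochs=50, decaying_epoch=10)
```

Under these settings the rate is constant for the first 40 epochs and shrinks geometrically over the last 10.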

#### *2.5. Evaluation Metrics*

This study adopts several metrics to evaluate the accuracy of the trained model and the detection performance of our algorithm, which are described as follows:

#### (1) Model Evaluation Metrics

We use standard semantic segmentation metrics [33,37], namely Accuracy (*Acc*), F1-Score (*F1*) and Intersection-over-Union (*IoU*), to evaluate the accuracy of the trained model. Each indicator is defined as follows:

$$Acc = \frac{N\_{correct}}{N\_{total}} \tag{9}$$

$$F1 = \frac{2p \cdot r}{p + r} \tag{10}$$

$$IoU = \frac{N\_{correct}}{N\_{ground-truth} \cup N\_{false}} \tag{11}$$

where *Acc* is the ratio of the number of correctly classified pixels (*N<sub>correct</sub>*) to the total number of pixels in the scene (*N<sub>total</sub>*); *p* is the proportion of *N<sub>correct</sub>* to the sum of *N<sub>correct</sub>* and the number of falsely classified pixels (*N<sub>false</sub>*); and *r* is the proportion of *N<sub>correct</sub>* to the total number of pixels labeled as fog in the corresponding image label (*N<sub>ground-truth</sub>*). *IoU* is the ratio of the fog pixels correctly predicted by the model to the sum of the labeled fog pixels (*N<sub>ground-truth</sub>*) and *N<sub>false</sub>*. All indicators range from 0 to 1; the larger the *IoU*, the higher the segmentation accuracy.
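A minimal sketch of Equations (9)–(11) for flattened binary masks is given below. It rests on an interpretation that the text leaves open: for *Acc*, the correct pixels include both fog and non-fog, whereas for *p*, *r* and *IoU* they are the correctly detected fog pixels, and *N<sub>false</sub>* is the number of non-fog pixels predicted as fog. The function name and the toy masks are hypothetical.

```python
def segmentation_metrics(pred, label):
    """Acc, F1 and IoU for flat binary masks (1 = fog, 0 = non-fog)."""
    pairs = list(zip(pred, label))
    tp = sum(1 for p, l in pairs if p == 1 and l == 1)  # correctly detected fog
    fp = sum(1 for p, l in pairs if p == 1 and l == 0)  # N_false
    n_truth = sum(label)                                # N_ground-truth
    acc = sum(1 for p, l in pairs if p == l) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / n_truth if n_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (n_truth + fp) if n_truth + fp else 0.0
    return acc, f1, iou


# Toy prediction and label (hypothetical values).
pred = [1, 1, 1, 0, 0, 0, 1, 0]
label = [1, 1, 0, 0, 0, 1, 1, 0]
acc, f1, iou = segmentation_metrics(pred, label)
```

The guards against empty denominators return 0 for scenes without any predicted or labeled fog, which is a convention chosen for this sketch only.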

#### (2) Quantitative evaluation metrics of fog detection accuracy

The probability of detection (*POD*), false alarm ratio (*FAR*) and critical success index (*CSI*) [11] are used for the validation of fog detection, which are described as follows:

$$POD = \frac{N\_H}{N\_H + N\_M} \,\tag{12}$$

$$FAR = \frac{N\_F}{N\_H + N\_F} \tag{13}$$

$$CSI = \frac{N\_H}{N\_H + N\_M + N\_F} \,\tag{14}$$

where *N<sub>H</sub>*, *N<sub>M</sub>* and *N<sub>F</sub>* are the numbers of hit, missed and falsely detected fog pixels, respectively. All indicators range from 0 to 1, with high *POD* and *CSI* and low *FAR* indicating good performance of the algorithm.
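Equations (12)–(14) translate directly into code. The sketch below is illustrative only: the function name and the counts in the example are hypothetical and are not taken from Tables 2–5.

```python
def detection_scores(hits, misses, false_alarms):
    """POD, FAR and CSI from the numbers of hits, misses and false alarms."""
    pod = hits / (hits + misses) if hits + misses else 0.0
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else 0.0
    total = hits + misses + false_alarms
    csi = hits / total if total else 0.0
    return pod, far, csi


# Hypothetical counts: 84 hits, 16 misses and 16 false alarms.
pod, far, csi = detection_scores(84, 16, 16)
```

Because *CSI* penalizes both misses and false alarms, it is never larger than *POD* or than 1 − *FAR*.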

#### **3. Results**

Fog is detected with our algorithm at dawn (7:00–8:30) and dusk (15:30–17:00) on 18 November 2016 over northern China (Figures 6 and 7); the ground observation sites at 8:00 and 17:00 are shown in Figures 6c and 7d, respectively. The detection results show that the fog area increases from 7:00 to 7:30 and gradually decreases from 7:30 to 8:30. Comparison with the position of the terminator line shows that the fog gradually dissipated after sunrise. Large areas of fog are detected in the central and western parts of the study area. Furthermore, the satellite detection results capture the formation and dissipation of fog and are highly consistent with the ground observations. Figure 6a shows the fog in its formation stage. The fog area was fragmented because it was interspersed with low cloud, resulting in incomplete fog detection results. Figure 6b shows that a relatively stable fog zone had formed, with fog covering the largest region before sunrise. As the terminator line continues to move westwards, the surface temperature in the eastern portion of the figure increases rapidly due to solar radiation, which causes the fog to dissipate gradually (Figure 6c,d). The ground observation data show high spatial consistency with the satellite fog detection results (Figure 6c). The rectangular box in Figure 6c lies in the Chinese topographic transition zone (Figure 1), where the boundary of the foggy area is well defined thanks to the use of terrain data. Missed detections are primarily located in the eastern portion of the figure (circular box in Figure 6c) and probably result from the influence of low clouds. Because of the characteristics of passive sensor images, it is relatively difficult for the algorithm to detect fog beneath clouds.

**Figure 6.** The fog detection results at (**a**) 7:00, (**b**) 7:30, (**c**) 8:00 and (**d**) 8:30 on 18 November 2016. The triangle, star, circle, and cross symbols represent haze, dense fog, strong fog, and non-fog stations, respectively. The square symbol indicates that there are no observation data because of the limited monitoring frequency. The blue area is the fog detected by our algorithm. All times are in Beijing Time (8 h ahead of Coordinated Universal Time (UTC)), and the green line is the terminator line.

Figure 7 shows the fog detection results at dusk from 15:30 to 17:00. The ground observation data show large areas of haze at 17:00 and high spatial consistency with the satellite fog detection results (Figure 7d), which supports the performance of our fog detection approach. The satellite detection results show that the fog area remains relatively stable from 15:30 to 17:00 (Figure 7a–d). A false detection occurred to the left of the terminator line at 16:30 (circular box in Figure 7c), in an area characterized by high topographic undulations and low cloud coverage. Overall, our algorithm successfully detected more than 80% of the fog in the study area.

We validate the algorithm with 10 fog cases randomly selected from 18 November 2016 to 3 January 2017. The statistical evaluation shows that the overall *POD*, *FAR* and *CSI* of the algorithm at dawn are 84.0%, 16.4% and 72.0%, respectively (Table 2). The highest *FAR* and lowest *CSI* occurred on 3 January 2017. The main reason is the low clouds over the Inner Mongolia Autonomous Region and the Northeast Plain, which are difficult to separate using infrared remote sensing at nighttime. Table 3 shows the accuracy at dusk, with mean *POD*, *FAR* and *CSI* values of 83.7%, 15.8% and 72.6%, respectively. Overall, our algorithm demonstrates the broad potential of deep learning for fog detection at dawn and dusk. We pooled all fog categories to validate the algorithm. Evaluating the accuracy for each fog type separately would be very useful; however, the observation points are too few to be divided into so many sub-datasets, so a small disturbance in the ground samples would introduce great uncertainty into the validation. We will continue to collect ground observations to validate the performance of our algorithm in future research.

**Figure 7.** The fog detection results at (**a**) 15:30, (**b**) 16:00, (**c**) 16:30 and (**d**) 17:00 on 18 November 2016. The triangle, star, circle, and cross symbols represent haze, dense fog, strong fog, and non-fog stations, respectively. The square symbol indicates that there are no observation data because of the limited monitoring frequency. The blue area is the fog detected by our algorithm. All times are in Beijing Time (8 h ahead of Coordinated Universal Time (UTC)), and the green line is the terminator line.

**Table 2.** Fog detection accuracy at 08:00 (BT).



**Table 3.** Fog detection accuracy at 17:00 (BT).

To evaluate the seasonal adaptability of the algorithm, eight dawn fog cases and five dusk fog cases were randomly selected from February to October 2017 (fog rarely occurs at dusk from April to July). Ground observations at 08:00 (February to March) and 05:00 (May to September) are adopted for the evaluation at dawn and dusk. The results are shown in Tables 4 and 5. Overall, the *POD*, *FAR* and *CSI* are 66.8%, 45.1% and 43.6% at dawn, and 68.0%, 32.3% and 49.8% at dusk, which indicates acceptable accuracy across seasons. On the other hand, the detection accuracy in summer and autumn was significantly influenced by cloudiness. In summer, the algorithm separates the surface from fog and cloud robustly because of their apparent spectral and textural differences; however, low clouds (liquid water clouds) are more likely to be misclassified as fog. The *FAR* also increased in autumn because of frequent cloud cover.

**Table 4.** Fog detection accuracy at dawn over different seasons.



**Table 5.** Fog detection accuracy at dusk over different seasons.

#### **4. Discussion**

We further compare the performance of our algorithm with that of previous studies. Specifically, the fog detection results of the U-Net and the DDF-Net are compared at 08:00 on 18 November and 12 December 2016, and at 17:00 on 18 November and 11 December 2016. The comparisons are shown in Figures 8 and 9. The ground observations indicate that more than 85% of the fog area was successfully detected in the study area (Figure 8c,d and Figure 9a,b). Specifically, both algorithms detected fog more accurately at dawn (Figure 8c,d) and haze more accurately at dusk (Figure 9a,b). On the other hand, both algorithms missed detections because of medium and low cloud contamination (circular boxes in Figure 8a,b and Figure 9c,d). The false detections at 8:00 (rectangular box in Figure 8c,d) are mainly due to the interference of low liquid water clouds. Compared with the U-Net, the DDF-Net achieves higher detection accuracy. First, the circular box in Figure 8a,b shows that the fog edges detected by the U-Net are slightly coarse, with a small proportion of low stratus misclassified as fog, whereas the edges detected by the DDF-Net are more refined and the boundary of the foggy area is clearer. Second, the U-Net mistakenly detects low cloud as fog in the daytime area with high SZA (rectangular box in Figure 8a,c), while the area of misclassified medium/high clouds is smaller in the DDF-Net results (rectangular box in Figure 8b,d).

**Figure 8.** The fog detection results of U-Net and DDF-Net at 08:00 (18 November 2016 (**a**,**b**) and 12 December 2016 (**c**,**d**)). The triangle, star, circle, and cross symbols represent haze, dense fog, strong fog, and non-fog stations, respectively. The green area is the fog detected by U-Net. The blue area is the fog detected by DDF-Net. All times are in Beijing Time (8 h ahead of Coordinated Universal Time (UTC)).

**Figure 9.** The fog detection results of U-Net and DDF-Net at 17:00 (18 November 2016 (**a**,**b**) and 11 December 2016 (**c**,**d**)). The triangle, star, circle, and cross symbols represent haze, dense fog, strong fog, and non-fog stations, respectively. The green area is the fog detected by U-Net. The blue area is the fog detected by DDF-Net. All times are in Beijing Time (8 h ahead of Coordinated Universal Time (UTC)).

#### **5. Conclusions**

In this study, we developed a novel algorithm for fog detection at dawn and dusk, which is based on the U-Net model and incorporates a channel attention mechanism under terrain restriction. Several conclusions can be drawn from the execution and validation of the algorithm:


Overall, this study provides a new deep learning-based approach for fog detection at dawn and dusk. Our algorithm shows great advantages and potential for overcoming the limitation of minor spectral differences in fog detection at dawn and dusk, which could significantly improve satellite applications in climate forecasting. Its main limitation is the low detection performance during fog formation and dissipation in mountainous areas with highly complex terrain. Two issues will be addressed in our future research. First, other geographical parameters (e.g., latitude and surface temperature) also affect fog formation and should be incorporated into the algorithm to improve the accuracy of satellite fog detection. Second, a public fog dataset for each season will be developed and published to support the management of traffic and public health.

**Author Contributions:** Conceptualization, Y.R.; methodology, Y.R. and H.F.; software, Y.R.; validation, Y.R., H.M. and Z.L.; formal analysis, Y.R.; investigation, X.W. and Y.L.; resources, X.W.; data curation, X.W.; writing—original draft preparation, Y.R.; writing—review and editing, H.F. and H.M.; visualization, Y.R. and Z.L.; supervision, H.M. and H.F.; project administration, H.M.; funding acquisition, H.M. and H.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China, grant numbers 42071334 and 42071378.

**Data Availability Statement:** The Himawari-8 data supporting the findings of this study are openly available at https://www.eorc.jaxa.jp/ptree/ (accessed on 17 April 2020). The ground observation data can be accessed at https://data.cma.cn/ (accessed on 20 April 2020). The SRTM1-DEM data can be accessed at https://www.usgs.gov/ (accessed on 5 May 2020).

**Acknowledgments:** The authors are grateful to JMA for making the Himawari-8 data available. We are also grateful to the National Satellite Meteorological Center for providing access to the meteorological station visibility data used in our research.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Projection of Future Extreme Precipitation in China Based on the CMIP6 from a Machine Learning Perspective**

**Yilin Yan 1, Hao Wang 1,2,\*, Guoping Li 3,4, Jin Xia 5, Fei Ge 4, Qiangyu Zeng 1, Xinyue Ren <sup>1</sup> and Linyin Tan <sup>1</sup>**

	- <sup>3</sup> Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing 210044, China
	- <sup>4</sup> School of Atmospheric Sciences/Plateau Atmosphere and Environment Key Laboratory of Sichuan Province/Joint Laboratory of Climate and Environment Change, Chengdu University of Information Technology, Chengdu 610225, China
	- <sup>5</sup> Shiyan City Meteorological Bureau, Shiyan 442000, China
	- **\*** Correspondence: wh@cuit.edu.cn; Tel.: +86-028-8596-6501

**Abstract:** In recent years, China has suffered from frequent extreme precipitation events, and predicting their future trends has become an essential part of current research on this issue. Because of the inevitable uncertainties associated with individual models for climate prediction, this study uses a machine learning approach to integrate and fit multiple models. Evaluation with several metrics shows that this approach provides better results than the traditional ensemble median method. The correlation coefficients with the actual observations improved from about 0.8 to 0.9, while those of the precipitation amount (PRCPTOT), very heavy precipitation days (R20mm), and extreme precipitation intensity (SDII95) reached 0.95. On this basis, the precipitation simulations under the medium-forcing Shared Socioeconomic Pathway scenario (SSP2-4.5) from 27 coupled models in the Coupled Model Intercomparison Project Phase 6 (CMIP6) were used to explore potential changes in future extreme precipitation events in China and to calculate the distribution and trends of the PRCPTOT, extreme precipitation amount (R95pTOT), maximum consecutive 5-day precipitation (Rx5day), precipitation intensity (SDII), SDII95, and R20mm for the early 21st century (2023–2050), mid-21st century (2051–2075), and late 21st century (2076–2100). The results show that the most significant increase in extreme precipitation indices is expected to occur by the end of the century, with the R95pTOT, Rx5day, and SDII95 increasing by 13.73%, 9.43%, and 9.34%, respectively, relative to the base period. The remaining three precipitation indices, the PRCPTOT, SDII, and R20mm, also show increases of 8.77%, 6.84%, and 4.02%, respectively. Additionally, there were apparent differences in the spatial variation of extreme precipitation. There were significant increasing trends of extreme precipitation indices in central China and northeast China in the three periods; the total annual precipitation showed an increasing trend in central and northern China and a decreasing trend in western and south China. An increasing trend of annual precipitation intensity was mainly concentrated in central China and south China, and the annual precipitation frequency showed a larger increasing trend in the early part of this century. In general, all the indices showed an overall increasing trend in the future period, with the PRCPTOT, Rx5day, and SDII95 showing the most significant overall increasing trends.

**Keywords:** machine learning; extreme precipitation indices; climate change; CMIP6; China

**Citation:** Yan, Y.; Wang, H.; Li, G.; Xia, J.; Ge, F.; Zeng, Q.; Ren, X.; Tan, L. Projection of Future Extreme Precipitation in China Based on the CMIP6 from a Machine Learning Perspective. *Remote Sens.* **2022**, *14*, 4033. https://doi.org/10.3390/rs14164033

Academic Editor: Xander Wang

Received: 14 July 2022 Accepted: 14 August 2022 Published: 18 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

In recent years, the impacts of extreme weather events such as continuous heat waves, extreme precipitation events and droughts [1,2] have become more prominent and more widespread [3]. Extreme precipitation, with its severe direct impact and related secondary disasters such as floods, landslides, and debris flows [4], also poses a great challenge to people's lives and to economic development [5,6]. The Sixth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC6), led by the United Nations, notes that, in the context of widespread global warming, continued warming has intensified water cycle processes, resulting in an increase in atmospheric water content, which, in turn, will enhance precipitation and increase the chance of intense precipitation events over most of the globe [7,8]. As temperatures continue to rise, extreme precipitation events will become larger in scale and more frequent, raising concerns about their changing features [9,10].

China has diverse climatic conditions and complex geography, and is subject to a perennial monsoon climate with frequent extreme precipitation events [11,12]. From the late 1970s to the present, economic losses due to extreme precipitation events in China have increased nearly tenfold. Historically, there have been multiple floods caused by extreme precipitation: the 1998 Yangtze River basin-wide mega-flood, caused by a multi-day extreme precipitation event [13,14]; the 2012 Beijing rainstorm, the heaviest rainfall in nearly 61 years, which resulted in massive property damage and casualties; the 2012 Sichuan flood, which resulted in loss of life; and the August 2020 Sichuan flood, a rare event in China's climate history. However, current studies on extreme precipitation have focused mostly on small regional scales, such as the precipitation distribution in urban and watershed areas [15] and trends or changes in extreme precipitation in typical areas [16,17], while there are fewer studies at large regional scales.

Since the 1990s, the Coupled Model Intercomparison Project (CMIP) has promoted simulations with hundreds of climate models from dozens of institutes around the world in order to improve the simulation capability of climate models and build more comprehensive scientific knowledge of weather processes. Most past work on the assessment and prediction of temperature and precipitation in China was carried out with CMIP Phase 5 (CMIP5) [18]. Jiang et al. [19] analyzed extreme precipitation using the CMIP5. The simulations show a wet bias in northwestern and northern China, accompanied by a dry tendency in southeastern China. Future projections from the CMIP5 show that extreme precipitation may increase faster globally than total precipitation on rainy days [20]. In addition, China is expected to experience more frequent and more destructive regional heavy precipitation events [19,21]. As model simulation performance has improved, the research focus has gradually shifted to the new-generation CMIP6 models [22]. The experimental output of the CMIP6 models contains parameterization settings for more complex dynamic climate processes [23] and a wider range of emission scenarios designed to improve prediction accuracy [24]. Chinese scholars have paid particular attention to the performance of CMIP6 in China and to how China's climate will change under the new CMIP6 scenarios (i.e., Shared Socioeconomic Pathways, SSPs). Recent studies have also begun using CMIP6 data to evaluate and predict the East Asian monsoon climate [25]. These studies have shown that the CMIP6 models are compatible with the CMIP5 models, that the simulation of precipitation and climate extremes has generally improved [26,27], and that there are more models and larger simulation ensembles.

Current predictions of China's climate mainly rely on multi-model ensemble approaches, with some studies identifying the best individual models and others using the multi-model ensemble median [14]. Although these approaches improve prediction accuracy to some extent, they are still affected by the limitations of individual models. The fields of artificial intelligence (AI) and machine learning (ML) are currently evolving, allowing meteorological researchers to integrate ML into related fields. The biggest advantage of ML methods is that more useful information can be extracted from multi-model data to achieve a closer match with real observations. Numerous studies have shown that ML can be applied effectively to weather forecasting, long-term future climate prediction, and the filling of missing climate elements, and significant progress has been made [28]. Because ML has greater advantages in solving nonlinear and high-dimensional problems, it allows the dynamics and physical processes present in the climate, among other reliable information, to be extracted [28,29].

Based on this, we used ML to integrate the simulation results of multiple models in the new-generation Coupled Model Intercomparison Project Phase 6 (CMIP6), establishing the nonlinear relationship between them with real observational data as the target. To better demonstrate the effect of this integration, we compared ML with the ensemble median to explore whether ML can effectively improve the ability to simulate observed data, and then established future climate prediction results on this basis. The remainder of this study is organized as follows: Section 2 introduces the research area, research data, evaluation indices, and research methods. Section 3 presents the main findings of current and future climate prediction methodology assessments. Finally, Section 4 summarizes and discusses the spatial distribution and temporal variation trend of extreme precipitation in China in the future.

#### **2. Datasets and Methodology**

#### *2.1. Study Area*

China is located in eastern Asia, with significant differences between winter and summer precipitation, most of which is concentrated in summer [30,31]. China's geographical environment is varied, with regional differences in topography. Owing to the East Asian monsoon, eastern China receives abundant summer rainfall, whereas northwest China is mainly influenced by a dry westerly climate. Because of China's vast size, complex topography, strong monsoon characteristics [32], and many different climate types, precipitation varies among regions, decreasing from southeast to northwest [33].

In order to further study the future trends of precipitation and to quantify the regional differences, we divided China into eight climatic zones [34]: the western arid (semiarid) zone (A), the Qinghai–Tibet Plateau (B), the eastern arid zone (C), southwest China (D), northeast China (E), north China (F), central China (G), and south China (H). The specific divisions are shown in Figure 1.

**Figure 1.** Research area and climate zoning: (A) western arid (semiarid) zone, (B) Qinghai–Tibet Plateau, (C) eastern arid zone, (D) southwest China, (E) northeast China, (F) north China, (G) central China, (H) south China.

#### *2.2. Data*

The daily precipitation observations used in this study are from CN05.1, a gridded dataset provided by the China Meteorological Administration. The dataset was constructed by Wu et al. [35] from 2416 national meteorological stations archived at the National Meteorological Information Center and covers 1961 to 2014. Daily precipitation at a spatial resolution of 0.25° × 0.25° was obtained by interpolating with thin-plate smoothing splines (ANUSPLIN) and the angular distance weighting (ADW) method, respectively, and superimposing the results. These data have been used extensively in climate change studies in China [36,37]. The dataset served as the "true value" for training and validating the established model.

We used daily precipitation simulations from 27 CMIP6 global climate models. The numerical experiments available in CMIP6 consist of two main parts: (1) the historical climate simulation experiment (Historical); and (2) 23 Model Intercomparison Projects (MIPs), which include future projection experiments. Both provide a variety of meteorological variables, including precipitation. The former is driven by external forcing derived from a large number of observations, including ground-based and remote sensing observations, and serves as a reference benchmark experiment, mainly to assess a model's ability to simulate climate change. Among the many MIPs, we selected the Scenario Model Intercomparison Project (ScenarioMIP), newly included in CMIP6, whose experimental results were used as the input data for the future projection part of this study. We used historical simulations for 1961–2014 and future projections for 2023–2100 under SSP2-4.5, one of the Shared Socioeconomic Pathway scenarios. SSP2-4.5 is the update of the RCP4.5 scenario; it represents a medium socio-economic development path and a medium forcing scenario with medium land use and aerosol pathways [38,39], and can reflect development under normal conditions. Before the analysis, all the data were unified to a 1° × 1° resolution using a bilinear interpolation scheme [19].
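The regridding step can be illustrated with a minimal sketch of bilinear interpolation on a regular latitude–longitude grid. The function name `bilinear_regrid` and the toy 2 × 2 field are hypothetical; the sketch assumes ascending coordinates, target points inside the source domain, and no missing values, and it is not the regridding tool used by the authors.

```python
def bilinear_regrid(field, src_lats, src_lons, dst_lats, dst_lons):
    """Bilinearly interpolate a 2-D field (rows = lat, columns = lon) onto a target grid.

    Assumes regular, ascending source coordinates and target points inside
    the source domain; missing values and longitude wrap-around are ignored.
    """
    def bracket(coords, value):
        # Find i with coords[i] <= value <= coords[i + 1] and the weight of the upper node.
        for i in range(len(coords) - 1):
            if coords[i] <= value <= coords[i + 1]:
                return i, (value - coords[i]) / (coords[i + 1] - coords[i])
        raise ValueError(f"{value} lies outside the source grid")

    out = []
    for lat in dst_lats:
        i, wy = bracket(src_lats, lat)
        row = []
        for lon in dst_lons:
            j, wx = bracket(src_lons, lon)
            # Interpolate along longitude on the two bracketing latitude rows...
            lower = (1 - wx) * field[i][j] + wx * field[i][j + 1]
            upper = (1 - wx) * field[i + 1][j] + wx * field[i + 1][j + 1]
            # ...then along latitude between those two intermediate values.
            row.append((1 - wy) * lower + wy * upper)
        out.append(row)
    return out


# Toy example (hypothetical values): interpolate a 2 x 2 field to its centre point.
src = [[0.0, 2.0],
       [4.0, 6.0]]
centre = bilinear_regrid(src, [0.0, 2.0], [0.0, 2.0], [1.0], [1.0])
```

At the centre of a cell all four weights are equal, so the interpolated value is simply the mean of the four surrounding grid values.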

#### *2.3. Methods*

#### 2.3.1. Climate Indices

The six precipitation indices used in the study were defined by the Expert Team on Climate Change Detection and Indices (ETCCDI, Table 1) [40]. They were used to quantify the characteristics of future precipitation [33,41]. The metrics included the following precipitation indices: total precipitation on rainy days (daily precipitation >1 mm) (PRCPTOT), annual extreme precipitation total above the 95th percentile threshold (R95pTOT), and 5-day maximum precipitation (Rx5day); and the following precipitation intensity and frequency indices: precipitation intensity (SDII), extreme precipitation intensity above the 95th percentile threshold (SDII95), and the number of heavy rainfall days with daily precipitation >20 mm (R20mm). These indices are widely used to identify and monitor climate extremes [42]. In this study, we first calculated the indices for all models and observations and then, on this basis, performed the machine learning multi-model integration and calculated the multi-model ensemble median.
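As an illustration of how the six indices could be derived from one year of daily data, the following sketch computes them for a short toy series. The function name `precipitation_indices`, the toy values, and the precomputed 95th-percentile threshold `p95_mm` are assumptions made for this example. The wet-day and heavy-rain thresholds follow the common ETCCDI convention (≥1 mm and ≥20 mm), which may differ at the boundary from the strict inequalities quoted above.

```python
def precipitation_indices(daily_mm, p95_mm, wet_day_mm=1.0, heavy_mm=20.0):
    """Six annual indices from one year of daily precipitation (mm/day).

    p95_mm is the 95th-percentile wet-day threshold, assumed to have been
    computed beforehand from a reference period.
    """
    wet = [p for p in daily_mm if p >= wet_day_mm]    # wet days
    very_wet = [p for p in daily_mm if p > p95_mm]    # days above the 95th percentile
    prcptot = sum(wet)
    r95ptot = sum(very_wet)
    # Largest total over any 5 consecutive days (needs at least 5 days of data).
    rx5day = max(sum(daily_mm[i:i + 5]) for i in range(len(daily_mm) - 4))
    return {
        "PRCPTOT": prcptot,
        "R95pTOT": r95ptot,
        "Rx5day": rx5day,
        "SDII": prcptot / len(wet) if wet else 0.0,
        "SDII95": r95ptot / len(very_wet) if very_wet else 0.0,
        "R20mm": sum(1 for p in daily_mm if p >= heavy_mm),
    }


# Toy series (hypothetical values, mm/day) with an assumed threshold of 24 mm.
daily = [0.0, 5.0, 30.0, 0.5, 12.0, 0.0, 25.0, 2.0, 0.0, 0.0]
indices = precipitation_indices(daily, p95_mm=24.0)
```

In practice the function would be applied to each year and each grid point of the observations and of every model before any integration step.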

**Table 1.** Precipitation indices and definitions (recommended by the ETCCDI).




#### 2.3.2. Artificial Neural Networks

The back propagation (BP) neural network is a multilayer feedforward artificial neural network trained by error back propagation. Using the gradient descent method, the algorithm searches for the minimum of the squared error between the actual output and the target value, continually adjusting the thresholds and connection weights of the network so that the adjusted network produces the desired output for each set of inputs [43].

The operation of the BP neural network consists of three main parts: the forward propagation process from the input layer via the hidden layer to the output layer; the backward correction process from the output layer to the input layer, which is determined by the error between the predicted output of the network and the actual observed data; and the training process, in which the forward and backward processes alternate. The BP neural network topology includes an input layer, a hidden layer, and an output layer; x1, x2, ... , xm are the input data; y1, y2, ... , yn are the output data; and w denotes the network weights, as shown in Figure 2.

**Figure 2.** BP neural network topology diagram.
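The alternation of forward propagation and backward correction described above can be sketched with a very small network. This is a minimal illustration rather than the authors' configuration: the single hidden layer of four tanh units, the learning rate, the number of epochs, the function name `train_bp`, and the toy data (two hypothetical "model" inputs and a synthetic "observation" target) are all assumptions.

```python
import math
import random


def train_bp(samples, targets, n_hidden=4, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer network by error back propagation.

    Returns (predict, history); history holds the mean squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Connection weights and thresholds (biases), initialised with small random values.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(w1, b1)]
        return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2

    history = []
    for _ in range(epochs):
        sq_err = 0.0
        for x, t in zip(samples, targets):
            hidden, y = forward(x)            # forward propagation
            err = y - t                       # output error
            sq_err += err * err
            # Backward correction: gradient descent on 0.5 * err**2.
            for j in range(n_hidden):
                grad_hidden = err * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_hidden * x[i]
                b1[j] -= lr * grad_hidden
            b2 -= lr * err
        history.append(sq_err / len(samples))

    return (lambda x: forward(x)[1]), history


# Toy data (hypothetical): two model outputs per sample and a synthetic target.
samples = [[0.1, 0.9], [0.4, 0.6], [0.5, 0.5], [0.8, 0.3], [0.9, 0.7], [0.2, 0.2]]
targets = [0.7 * a + 0.3 * b for a, b in samples]
predict, history = train_bp(samples, targets)
```

The `history` list makes the training process visible: the mean squared error should fall as the weights and thresholds are adjusted epoch by epoch.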

#### 2.3.3. Multi-Model Integration

The overall flow chart of the multi-model integration method is shown in Figure 3. The model data and observed data for the historical period (1961–2014) were divided into two parts: the training set, covering 1961–1998 (38 years), and the test set, covering 1999–2014 (16 years). The test set mainly served to assess whether the results after multi-model integration were closer to the observed data. In this study, we integrated the six indices separately, with the overall goal of combining the strengths of the individual models' simulation capabilities in different regions to the greatest extent possible, so as to obtain the best simulation of the actual observed data over China. The scheme was eventually applied to future periods to achieve the best precipitation projection. A large number of previous works have used the multi-model ensemble median for future precipitation predictions [26,44]. The CMIP6 multi-model ensemble median has generally improved the simulation of precipitation and precipitation extremes trends on a global scale compared to its predecessor (CMIP5) [27], so it was used as a reference for the machine learning fitting results.

**Figure 3.** ML model flow chart.
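The data split and the reference ensemble median described in this subsection can be sketched as follows. The helper names `split_by_year` and `ensemble_median` and the three-model toy values are hypothetical; only the year ranges come from the text.

```python
from statistics import median


def split_by_year(years, last_training_year=1998):
    """Split a list of years into training and test periods."""
    train = [y for y in years if y <= last_training_year]
    test = [y for y in years if y > last_training_year]
    return train, test


def ensemble_median(model_values):
    """Median across models at each time step; model_values[m][t]."""
    return [median(step) for step in zip(*model_values)]


years = list(range(1961, 2015))                 # historical period 1961-2014
train_years, test_years = split_by_year(years)  # 1961-1998 and 1999-2014

# Toy index values (hypothetical) from three models at three time steps.
toy_models = [[1.0, 5.0, 3.0],
              [2.0, 4.0, 9.0],
              [10.0, 0.0, 6.0]]
toy_median = ensemble_median(toy_models)
```

Because the median is taken independently at each time step (and, in a gridded setting, at each grid point), a single outlying model has little influence on the reference series.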

#### 2.3.4. Evaluation Method

In this study, the Taylor diagram was used to evaluate the simulation results of the 27 CMIP6 climate models, the multi-model ensemble median, and the ML multi-model integration, so that the consistency between the spatial distribution of each index and the observed data, before and after processing, can be compared more intuitively. A Taylor diagram presents multiple evaluation metrics at the same time, indicating how well a model matches the observed data in terms of standard deviation, central root mean square error (RMSE), and correlation coefficient [45].

The standard deviation of the observation data and the model is calculated as follows:

$$
\sigma_{\text{obs}} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( X_{\text{obsn}} - \overline{X_{\text{obs}}} \right)^2} \tag{1}
$$

$$
\sigma_{model} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( X_n - \overline{X} \right)^2} \tag{2}
$$

where $\overline{X_{obs}}$ and $\overline{X}$ are the averages of the observed data and the model data, respectively.

The central root mean square error, *RMSE*, is defined as:

$$RMSE = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left[ \left( X_n - \overline{X} \right) - \left( X_{obsn} - \overline{X_{obs}} \right) \right]^2} \tag{3}$$

The correlation coefficient *r* of the observed data and the model data is defined as:

$$r = \left[ \frac{1}{N} \sum_{n=1}^{N} \left( X_n - \overline{X} \right) \left( X_{obsn} - \overline{X_{obs}} \right) \right] / \left( \sigma_{model} \sigma_{obs} \right) \tag{4}$$

The key to the Taylor diagram is that the standard deviation, correlation coefficient, and central root mean square error of the simulated and observed data satisfy the following relationships:

$$RMSE^2 = \sigma_{obs}^2 + \sigma_{model}^2 - 2\sigma_{obs}\sigma_{model}r \tag{5}$$
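The three Taylor statistics and the relationship in Equation (5) can be checked numerically with a short sketch. The function name `taylor_statistics` and the toy series are hypothetical; the correlation is computed as the covariance of the anomalies divided by the product of the two standard deviations.

```python
import math


def taylor_statistics(model, obs):
    """Return (sigma_model, sigma_obs, central RMSE, correlation r)."""
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    dm = [x - mean_m for x in model]   # model anomalies
    do = [x - mean_o for x in obs]     # observation anomalies
    sigma_m = math.sqrt(sum(d * d for d in dm) / n)
    sigma_o = math.sqrt(sum(d * d for d in do) / n)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(dm, do)) / n)
    r = (sum(a * b for a, b in zip(dm, do)) / n) / (sigma_m * sigma_o)
    return sigma_m, sigma_o, crmse, r


# Toy series (hypothetical values).
model = [2.0, 4.0, 5.0, 7.0]
obs = [1.0, 3.0, 6.0, 8.0]
sigma_m, sigma_o, crmse, r = taylor_statistics(model, obs)

# Difference between the two sides of Equation (5); it vanishes up to rounding.
residual = crmse ** 2 - (sigma_o ** 2 + sigma_m ** 2 - 2 * sigma_o * sigma_m * r)
```

This law-of-cosines relationship is what allows the standard deviation, the correlation, and the central RMSE to be displayed together as a single point in a two-dimensional diagram.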

Because of the large uncertainty in future precipitation predictions, we also applied the relative root mean square error (*RMSE'*) for a more accurate quantitative assessment of the individual models, the ML-integrated model, and the multi-model ensemble median, to re-validate their agreement with the observed data [46,47].

The *RMSE'* used to assess the climate simulation capabilities of the CMIP6 models is defined as follows [48]: first, the RMSE of each model relative to the observed data is calculated; second, the *RMSE'* of each model is computed from these values as in Equation (6):

$$RMSE' = \frac{RMSE - RMSE_{Median}}{RMSE_{Median}} \tag{6}$$

where $RMSE_{Median}$ is the median of the RMSE values of all the models. In general, a negative (positive) value of *RMSE'* indicates that the model performs better (worse) than half (50%) of all the models.
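A small numerical example may help in reading Equation (6). The model names and RMSE values below are hypothetical, and `relative_rmse` is an illustrative helper rather than code from the study.

```python
from statistics import median


def relative_rmse(rmse_by_model):
    """Relative RMSE (RMSE') of each model with respect to the ensemble median RMSE."""
    med = median(rmse_by_model.values())
    return {name: (value - med) / med for name, value in rmse_by_model.items()}


# Hypothetical RMSE values for three models.
rmse = {"model_a": 0.8, "model_b": 1.0, "model_c": 1.5}
rel = relative_rmse(rmse)
```

Here the median RMSE is 1.0, so `model_a` receives a negative value (better than the median model) and `model_c` a positive value (worse than the median model).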

#### **3. Results and Discussion**

#### *3.1. Evaluations*

We calculated PRCPTOT, R95pTOT, Rx5day, SDII, SDII95, and R20mm at each grid point for the observations, the 27 models, the multi-model ensemble median, and the ML output over the validation period (1999–2014). On this basis, the central root mean square error (RMSE), the spatial correlation coefficient (R), and the ratio of the standard deviation (STD) of each dataset to that of the observations were calculated and summarized in Taylor diagrams (Figure 4). According to the calculation principles of the three statistics, the higher the spatial correlation coefficient, the closer the standard deviation ratio to 1, and the closer the central root mean square error to 0, the better the simulation (that is, the smaller the distance between the simulated data point and the observation point in the Taylor diagram). The Taylor diagrams show great differences among the 27 models. For the three precipitation indices, PRCPTOT, R95pTOT, and Rx5day, most of the models had correlation coefficients between 0.6 and 0.85, while the RMSE values were mostly between 0.5 and 1.0. The correlation coefficient of the precipitation frequency index R20mm was between 0.3 and 0.85, while the RMSE values were between 0.75 and 1.5. The two intensity indices, SDII and SDII95, performed better, with correlation coefficients between 0.7 and 0.9 and RMSE values between 0.5 and 1.0. The multi-model ensemble median performed slightly better than the 27 individual models on all the indices. The ML performed much better than both the 27 models and the multi-model ensemble median, which reflects the ability of machine learning methods to better integrate the advantages of different models.
A further analysis showed that, for ML, the correlation coefficients of PRCPTOT, R20mm, and SDII95 all reached about 0.95, while those of R95pTOT, Rx5day, and SDII were slightly lower, at 0.86, 0.85, and 0.92, respectively. The central root mean square errors were all around 0.5, and the standard deviation ratio was between 0.75 and 1.0 for all the evaluations. The best performance was obtained for PRCPTOT, R20mm, and SDII95. For every index, ML had the smallest distance to the observation point, indicating that its simulation is closest to the observed values.

**Figure 4.** Taylor diagram of PRCPTOT (**a**), R95pTOT (**b**), Rx5day (**c**), SDII (**d**), SDII95 (**e**), and R20mm (**f**). In each panel, the azimuthal angle corresponds to the spatial correlation coefficient; the black arcs represent the standard deviation ratio; the green arcs represent the central root mean square error; and the purple dot represents the observed data.

Figure 5 visualizes the distribution of the *RMSE'* of each model in simulating the extreme precipitation indices relative to the observations. A larger value indicates a worse simulation performance, while a lower value indicates a better simulation performance.

**Figure 5.** Heat map of the relative RMSE (RMSE') of the precipitation extremes indices simulated by the CMIP6 models versus observed data for the period 1999–2014. The colors range from red to blue. The deeper the blue, the better the simulation ability. The deeper the red, the worse the simulation ability.

The results show that the ability of the models to simulate each precipitation index varies greatly. EC-Earth3, EC-Earth3-Veg-LR, and EC-Earth3-Veg performed better, with mainly negative RMSE' values for the different indices, indicating that these three models can simulate the study area well, whereas CAS-FGOALS-g3, CMCC-ESM2, and CMCC-CM2-SR5 performed poorly. EC-Earth3, EC-Earth3-Veg-LR, and EC-Earth3-Veg have also been found to better simulate the interannual variability of extreme precipitation events in the Asian region [49]. Since the models exhibited large uncertainties across different metrics, this reaffirms the importance of model integration through machine learning when studying climate simulations and predictions. The last two columns of Figure 5 show the ensemble median approach and the machine learning model fit, respectively. For each precipitation index, the multi-model ensemble median performs better overall than the 27 individual models, but the ML in the last column performs better still, far outperforming all the models, including the multi-model ensemble median, on the six extreme precipitation metrics. The results also showed that, for all indices, the RMSE' of ML was the lowest and better than that of any individual model, which largely eliminates the structural uncertainty of the models; thus, the ML model can be considered reasonably well established for predictive simulations of future climate change.

Based on the above evaluation results, namely that ML fitting performs significantly better than any single model, we evaluated the spatial simulation capability of ML for each of the six precipitation indices to better understand the regional climate bias of the model. Figure 6 shows the distribution of the deviation and relative deviation of the ML fitted values from the observed precipitation indices, where a relative deviation greater than 100% indicates that the ML value is more than double the observed value. From the figure, we can see that the six precipitation indices were underestimated in most areas of central and southern China and the Qinghai–Tibet Plateau, and overestimated in some areas of the western and eastern arid regions, which may be due to the limitations of the climate models themselves and their simulation of complex terrain [50]. Some scholars have previously found that the CMIP6 model output for western China had low agreement with observed precipitation [51], and the use of ML is thus effective in reducing the error in this region. Spatially, ML was found to overestimate the three extreme precipitation indices by over 50% in southern Xinjiang and parts of Inner Mongolia; to underestimate them by 10%–40% in parts of Tibet and southwest, central, and southern China; and to underestimate them by over 40% in parts of Qinghai and Xinjiang. ML performs relatively well for the two precipitation intensity indices: for SDII, the relative deviation did not exceed 40%, and for SDII95, deviations of more than 40% were mainly found in parts of Xinjiang. ML overestimated the frequency index R20mm by more than 40% in areas of Xinjiang, Tibet, Qinghai, and Inner Mongolia, and underestimated it by more than 40% in other areas of Xinjiang and Tibet. In general, the ML simulation is relatively accurate for most parts of China, and the largest deviations were mainly around Xinjiang and the Qinghai–Tibet Plateau, where the resolution of the climate models is relatively low and it is difficult to fully reproduce the local circulation over complex terrain. In addition, these regions usually lack climate stations, which affects the gridded precipitation data derived from station observations and makes it hard to accurately simulate precipitation features in complex terrain [52]. Furthermore, the simulated precipitation was only slightly underestimated (<10%) in the vast majority of south China compared to the observed data; comparable results were obtained by Sun et al. [53], and this is also consistent with the reported underestimation of extreme precipitation in southern China [54].
However, compared to the multi-model ensemble median used in previous studies, the overall distribution was approximately the same, but the simulation was greatly improved [55]. The results of this study also differ from some previous studies, partly because this study compares the ML-integrated climate model output with gridded precipitation data rather than with the station data typically used in the past. Most regions in China were found to have better performance with the ML model. The uncertainty was smallest in northwest China, likely because it is a particularly arid region; it was also small in south China, while the uncertainty in northeast China was greater than that in northwest China.
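The deviation and relative deviation mapped in Figure 6 are not written out explicitly in the text. A plausible formulation, stated here as an assumption rather than as the authors' definition, with $X_{ML}$ the ML fitted value and $X_{obs}$ the observed value of an index at a grid point, is:

$$
\text{Deviation} = X_{ML} - X_{obs}, \qquad \text{Relative deviation} = \frac{X_{ML} - X_{obs}}{X_{obs}} \times 100\%
$$

Under this assumption, a relative deviation of 100% corresponds to an ML value exactly twice the observed value, and a relative deviation of −40% to an ML value that is 60% of the observed value.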

#### *3.2. Projected Changes*

#### 3.2.1. Future Changes in Spatial Distribution and Boxplot

Because the model projections cover a long time range, we divided the future projection period into three periods to facilitate the analysis: 2023–2050 (early 21st century), 2051–2075 (mid-21st century), and 2076–2100 (late 21st century). Precipitation observations from 1990 to 2014 were used as the reference (base) period, against which we analyzed the distribution and change trends of the six precipitation indices in the eight climate zones of China in the different future periods.

Figures 7 and 8 show the spatial distribution of the relative changes and the boxplots of the absolute changes of the three precipitation indices (PRCPTOT, R95pTOT, and Rx5day) under the SSP2-4.5 scenario. In general, precipitation showed an increasing trend over this century, and each precipitation index also increased with time. Relative to the base period, the changes in the three indices are largest at the end of the 21st century, with change rates of 5.69%, 7.47%, and 7.89%, respectively. Therefore, in the context of future climate change, precipitation in China as a whole is not expected to increase sharply. Nevertheless, 30.85% to 37.68% of the study area is expected to experience a significant increase (>15%) in R95pTOT by the end of the century, mainly in northern China and the southern part of the western arid zone, while the maximum rate of change will reach 40% in some parts of northern China. The largest absolute changes also occur in north China, with weaker increases in northeast China and southwest China. For example, R95pTOT is expected to increase by an average of about 100 mm in north China by the end of the century, while the increase in the southern part of the western arid zone is expected to be about 1 mm. However, the relative rates of change are inversely distributed, with larger percentage changes occurring in the southern part of the western arid zone as well as in the eastern arid zone. This is because the total precipitation in arid and semi-arid regions is quite low, so small changes in precipitation can cause large relative differences in these regions [56]. In contrast, in central China and southern China, even though rainfall is abundant and the absolute variability is large in some areas, the relative variability is not significant. In general, the boxplots of absolute change and the relative change distributions of PRCPTOT, R95pTOT, and Rx5day are similar, while the relative change rates of PRCPTOT and Rx5day are more significant in south China than those of R95pTOT.
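The contrast between absolute and relative change in humid and arid regions can be made concrete with a short calculation. All numbers below are hypothetical and chosen only to illustrate the effect; `change_from_base` is an illustrative helper.

```python
def change_from_base(future_mean, base_mean):
    """Absolute change and relative change (%) of an index with respect to the base period."""
    absolute = future_mean - base_mean
    relative = 100.0 * absolute / base_mean
    return absolute, relative


# Hypothetical humid grid cell: large absolute change, modest relative change.
humid_abs, humid_rel = change_from_base(future_mean=1100.0, base_mean=1000.0)

# Hypothetical arid grid cell: tiny absolute change, larger relative change.
arid_abs, arid_rel = change_from_base(future_mean=6.0, base_mean=5.0)
```

In this toy case the humid cell gains 100 mm but only 10%, whereas the arid cell gains 1 mm but 20%, which mirrors the inverse spatial patterns of absolute and relative change described above.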

**Figure 6.** Absolute and relative deviations (%) of ML from observations for PRCPTOT (**a**,**d**), R95pTOT (**b**,**e**), Rx5day (**c**,**f**), SDII (**g**,**j**), SDII95 (**h**,**k**), and R20mm (**i**,**l**) for the period 1999–2014.

**Figure 7.** Relative change maps (%) of PRCPTOT (**a1**–**a3**), R95pTOT (**b1**–**b3**), Rx5day (**c1**–**c3**), SDII (**d1**–**d3**), and SDII95 (**e1**–**e3**) and absolute change maps of R20mm (**f1**–**f3**) for the three periods of 2023–2100 compared to the base period of 1990–2014.

**Figure 8.** Boxplots of absolute changes from the base period.

Figure 7(d1–e3) and Figure 8d,e show the two precipitation intensity indices SDII and SDII95. There were some differences in their distributions, but the rate of change increased gradually over the three periods, as did the amount of precipitation. By the end of the century, SDII and SDII95 are expected to change by about 3.78% and 7.30%, respectively, compared to the base period, with a more pronounced increasing trend in the extreme index SDII95. Precipitation intensity shows significant variation (>15%) over about 14.9–17.5% of China during the forecast period, and the areas with small changes (−5–5%) accounted for 37.06% of the total area, indicating that the increase in extreme precipitation intensity is concentrated in local areas of eastern China, with a maximum change rate of about 30%, while the remaining areas are not expected to change much. As shown in Figure 8d,e, the regions contributing most to these values were central China and north China. The largest absolute changes were found in central China, for which the SDII and SDII95 changes were 3.521 mm/day and 13.31 mm/day, respectively, followed by southwest China and south China, while neither the relative change rates nor the absolute changes were obvious in the western arid zone and the Qinghai–Tibet Plateau, which receive little precipitation year-round.

Figure 7(f1–f3) and Figure 8f show the spatial distribution and boxplots of the annual number of heavy rainfall days (R20mm). Since torrential rain is uncommon in western China, with a very low probability of occurrence, even small changes result in high relative rates of change there and would distort the analysis of other regional characteristics. After weighing these factors, we chose to show the distribution of absolute changes. R20mm mainly showed a continuous increase in north China and central China, and most regions showed an increase compared with the base period. In contrast, parts of south China experienced a significant decrease in the middle of the 21st century, with the number of rainstorm days even falling below that of the base period before increasing again at the end of the century.

In conclusion, all the indices were found to increase with time during the forecast period. By the end of the 21st century, PRCPTOT is expected to increase steadily in most parts of the country, with changes in heavy precipitation concentrated in north China and northeast China. In particular, the three extreme precipitation indices, R95pTOT, Rx5day, and SDII95, all showed more prominent changes. Additionally, the projected percentage increase in northern China was higher than that in southern China, with the largest increases mainly in north China and northeast China. The increase of PRCPTOT in north China is greater than that in south China, but this does not change the overall pattern, in which precipitation in southern China remains higher than in northern China [57]. Because of the perennially dry climate in the northern and northwestern regions, the largest relative increase is expected in the western arid zone, a phenomenon consistent with CMIP5 projections [58].

#### 3.2.2. Future Trend Distribution

Figure 9 shows the spatial distribution of the annual trends of the six indices in the early, middle, and late future periods. The significance of the trends of all the indices was assessed with the Mann–Kendall (MK) test at the 95% confidence level.


**Figure 9.** Spatial distribution of trend changes in each phase for PRCPTOT (**a1**–**a3**), R95pTOT (**b1**–**b3**), Rx5day (**c1**–**c3**), SDII (**d1**–**d3**), SDII95 (**e1**–**e3**), and R20mm (**f1**–**f3**). The columns from left to right are the early, middle, and late future periods, respectively. The gray dots mark the grid points where the trend passed the MK significance test at the 95% confidence level.
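The MK significance screening of grid-point trends can be illustrated with a simplified sketch of the Mann–Kendall test. The text does not describe the implementation used in the study, so the version below is an assumption: it is two-sided, uses the normal approximation with a continuity correction, omits the correction for tied values, and takes 1.96 as the critical value for the 95% confidence level. The function name `mann_kendall` and the toy series are hypothetical.

```python
import math


def mann_kendall(series, z_crit=1.96):
    """Simplified two-sided Mann-Kendall trend test (no tie correction).

    Returns (S, Z, significant), where S is the MK statistic and Z its
    standardised value under the normal approximation.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)   # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z, abs(z) > z_crit


# Toy series (hypothetical): a steadily increasing index over ten years.
rising = [float(v) for v in range(1, 11)]
s_up, z_up, sig_up = mann_kendall(rising)
```

Applied grid point by grid point, a test of this kind yields the mask of significant trends that is overlaid on trend maps.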

As shown in Figure 9(a1–c3) for the trend distribution of the three precipitation indices PRCPTOT, R95pTOT, and Rx5day, the trends at most grid points nationwide were significant, except in south China and some parts of central China. For PRCPTOT, the largest trend in most regions occurred in the early 21st century, except for a few western regions and south China. Southwest China and north China reached trends of 0.512 mm/year and 0.491 mm/year in the early period, respectively; the trend slowed in the middle and late periods but remained upward. South China showed a decreasing trend of −0.219 mm/year in the early 21st century and then a significant increasing trend in some areas in the middle and late periods, but overall there was still a slight decline of 0.056 mm/year. Finally, most of the western regions of China showed an upward trend in all periods, most notably the Qinghai–Tibet Plateau, where the overall trend reached 0.354 mm/year. The two extreme precipitation indices, R95pTOT and Rx5day, differed from PRCPTOT in that the upward trend was most significant in the middle or late periods across China. In the mid-century, R95pTOT in central China reached the national maximum trend of 0.803 mm/year, while Rx5day in north China reached its maximum trend of 0.116 mm/year; northeast China and southwest China followed, with trends less pronounced than in the abovementioned regions but still strongly upward. R95pTOT in south China showed a brief upward trend in the middle period, but the overall trend did not fluctuate much.

The precipitation intensity indices SDII and SDII95 (Figure 9(d1–e3)) have different spatial trend distributions. The trend of SDII was more drastic in all periods, but the number of grid points passing the 95% confidence test was much larger for SDII95 than for SDII. Except for south China, most areas are expected to show an upward trend in general, especially north China, where the change is relatively obvious and the two indices reached peaks of 0.018 mm/day/year and 0.014 mm/day/year, respectively. In contrast, SDII in north China, the eastern arid zone, and parts of western China showed almost no significant fluctuations. SDII in southwest China showed a slight downward trend in the middle and late 21st century, but SDII95 showed a continuous upward trend. The trend of the R20mm frequency is shown in Figure 9(f1–f3). Considering the grid points whose trends passed the MK test at the 95% confidence level, the maximum increasing trend occurs in southwest China in all three periods, while the maximum decreasing trend is concentrated in parts of central China and south China. The area with an increasing trend in southwest China expanded in the middle and late 21st century, and only a few areas in the region showed a decreasing trend in the late period. The same phenomenon was also observed in north China, where most areas remained unchanged or decreased at the beginning and most areas with increasing trends appeared at the end, which indicates that R20mm in southwest China and north China will increase significantly with time, and precipitation in these regions will also increase.


In general, the trend distribution of these indices is not consistent, and the value of PRCPTOT is expected to continue to rise in most regions of China in the future, but a decreasing trend will be observed in some parts of south China.

#### 3.2.3. Future Interannual Variation

Figure 10 shows the evolution of the six precipitation indices over the forecast period of 2023–2100 relative to the reference period of 1990–2014. Table 2 summarizes the trend statistics of the indices in Figure 10. For the grid point statistics, we kept only the grid points whose trends passed the MK test at the 95% confidence level.

**Table 2.** Trend statistics of the precipitation indices shown in Figure 10.


**Figure 10.** Annual average changes in six national precipitation indices: PRCPTOT (**a**), R95pTOT (**b**), Rx5day (**c**), SDII (**d**), SDII95 (**e**), and R20mm (**f**), all calculated from the base period 1990–2014. The colored dotted lines represent the trend for the three future periods, and the gray lines represent the overall trend.

Compared with the baseline period, PRCPTOT, R95pTOT, Rx5day, SDII, SDII95, and R20mm all show an overall increasing trend in 2023–2100, and the rate of change of most indices was found to increase with time in the middle and late parts of this century.

Figure 10a–c show the temporal evolution of the three precipitation indices. PRCPTOT (Figure 10a) showed the flattest trend of the three, reaching a maximum trend of 0.17%/year in the early 21st century and then maintaining a relatively stable trend of around 0.06%/year. Over the whole study period, the average trend of PRCPTOT relative to the base period was 0.11%/year. As shown in Figure 10b, the rate of change of R95pTOT showed a clear upward trend in 2023–2100 and increased gradually over the period, reaching an average trend of 0.39%/year at the end of this century. In Figure 10c, Rx5day increases steadily, and its rate of change is relatively stable over the future period, with an overall average trend of 0.182%/year.

Figure 10d,e show the temporal changes of the two intensity indices. SDII and SDII95 are both expected to rise in this century, with SDII showing the more pronounced trend. As shown in Figure 10d, the average trend of SDII reached a maximum of 0.20%/year in the middle of the century and decreased towards the end of the century, while remaining positive, with an average trend of 0.11%/year over the whole study period. The change of SDII95 was the most gentle, showing almost no obvious trend, with an average of 0.09%/year, indicating that China's future extreme precipitation intensity will not change much as a whole; as noted above, SDII changes mostly occur in local areas. Figure 10f shows the evolution of the annual change rate of heavy rainfall frequency (R20mm). R20mm continued to increase over the three periods of the 21st century; the frequency of extreme precipitation maintained an increasing trend that gradually strengthened, reaching 0.177%/year in the late 21st century.
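Trends expressed in %/year, such as those quoted above, can be obtained by fitting a straight line to the annual relative changes. The text does not state which slope estimator was used, so the ordinary least-squares fit below is an assumption (Sen's slope is a common alternative alongside the MK test). The function name `linear_trend` and the synthetic series are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units of values per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Synthetic relative changes (%) that rise by exactly 0.1 %/year from 2023.
years = list(range(2023, 2033))
relative_change = [0.1 * (y - 2023) for y in years]
slope = linear_trend(years, relative_change)
```

Fitting the same function separately to 2023–2050, 2051–2075, and 2076–2100 would give per-period trends comparable in form to the colored dotted lines in Figure 10, under the stated assumption about the estimator.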

#### **4. Discussion**

The total amount and intensity of heavy rainfall is expected to increase across the country in the future, which also implies that continuous flooding will be a major problem in China's development. Future changes in precipitation in China are mainly influenced by temperature in addition to the general humidification of the atmosphere, and many studies have shown that precipitation in China is highly susceptible to climate warming, and rising temperatures lead to changes in precipitation levels in China [59]. On the other hand, potential changes in future large-scale monsoon circulation systems also play a key role in the frequency and intensity of precipitation in China, with some studies suggesting a strong influence of the East Asian summer circulation in eastern China and the westerly circulation in northwestern China and the northern Qinghai–Tibet Plateau. Precipitation in eastern China is more frequent in summer and is easily influenced by the East Asian summer winds, leading to changes in future precipitation and spatial distribution [60,61]. Studies have also shown that the expected future intensification of the East Asian summer winds [29,62] may lead to larger increases in precipitation in north China and smaller increases (or even decreases) in southeastern China, especially in the middle and lower reaches of the Yangtze River Basin, and may also affect the timing of precipitation [63]. In addition, the winter northwest wind will also strengthen in the future [64,65], which is also the cause of the projected increase in precipitation in the northwest territories. The above has focused on two factors, namely, thermal and dynamic factors, to explore some possibilities regarding future increases in precipitation intensity.

We used ML to integrate the CMIP6 multi-model, effectively reducing its uncertainty and improving its prediction ability, and revealing the future temporal and spatial changes of precipitation in China, which will allow us to improve our knowledge of the precipitation evolution in China, and the world, in the future. In addition, in the future, we will explore more advanced machine learning techniques to retrieve more information from multiple perspectives and large amounts of data, thereby improving the reliability of future predictions. Additionally, the significant increases in severe extreme precipitation events may have a strong impact on human society and ecosystems. We will also further explore the contribution and impact of extreme precipitation events on future precipitation. The results of this study bring important guidance to the development and design of future remote sensing equipment, as well as guiding directions for the planning and layout of future multi-source observation networks.

#### **5. Conclusions**

In this study, we used a machine learning approach to integrate the CMIP6 multimodels to better capture the nonlinear and complex relationships among climate models compared with the traditional ensemble median approach and achieved a more accurate model-based prediction of future precipitation. On this basis, the temporal changes and spatial distribution of indices of eight climate zones in China during three periods of the 21st century, from 2023 to 2100, were analyzed. The study findings are as follows:


**Author Contributions:** Conceptualization, Y.Y., H.W., G.L., and J.X.; methodology, Y.Y., H.W., and F.G.; software, Y.Y., H.W., and X.R.; formal analysis, Y.Y., H.W., and L.T.; writing—original draft preparation, Y.Y.; writing—review and editing, Y.Y., H.W., and Q.Z.; visualization, Y.Y., H.W., and J.X. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the National Natural Science Foundation of China (42175002), the Project of the Sichuan Department of Science and Technology (2022YFS0541), the Key Laboratory of Atmospheric Sounding Program of China Meteorological Administration (2021KLAS02M), the National Key R&D Program of China (2018YFC1506104), Special Funds for the Central Government to Guide Local Technological Development (2020ZYD051), and Application Basic Research of Sichuan Department of Science and Technology (2019YJ0316).

**Data Availability Statement:** Publicly available datasets were analyzed in this study. The data are from the Coupled Model Intercomparison Project Phase 6 (CMIP6) under the SSP2-4.5 scenario and can be found at https://esgf-data.dkrz.de/projects/cmip6-dkrz/ (accessed on 1 October 2021).

**Acknowledgments:** The authors sincerely thank the Coupled Model Intercomparison Project for providing the data used in this paper, and also thank the reviewers for their constructive comments and editorial suggestions, which have greatly contributed to the quality of the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Temperature Contributes More than Precipitation to Runoff in the High Mountains of Northwest China**

**Mengtian Fan <sup>1,</sup>\*, Jianhua Xu <sup>2</sup>, Yaning Chen <sup>3</sup>, Meihui Fan <sup>4</sup>, Wenzheng Yu <sup>1</sup> and Weihong Li <sup>3</sup>**


**Abstract:** In alpine areas in Northwest China, such as the Tianshan Mountains, the lack of climate data (because of scarce meteorological stations) makes it difficult to assess the impact of climate change on runoff. The main contribution of this study was to develop an integrated method to assess the impact of climate change on runoff in data-scarce high mountains. Based on reanalysis products, this study firstly downscaled climate data using machine learning algorithms, then developed a Batch Gradient Descent Linear Regression to calculate the contributions of temperature and precipitation to runoff. Applying this method to six mountainous basins originating from the Tianshan Mountains, we found that climate changes in high mountains are more significant than in lowlands. In high mountains, the runoff changes are mainly affected by temperature, whereas in lowlands, precipitation contributes more than temperature to runoff. The contributions of precipitation and temperature to runoff changes were 20% and 80%, respectively, in the Kumarik River. The insights gained in this study can guide other studies on climate and hydrology in high mountain basins.

**Keywords:** runoff changes; climate downscaling; data-scarce mountainous basins; quantitative assessment

#### **1. Introduction**

Global warming has exacerbated the uncertainty of runoff in mountainous rivers [1,2]. Mountainous rivers provide water for people and support lowland industries and agriculture [3–5]. However, in the alpine areas of Northwest China, such as the Tianshan mountains, limited climate data is available (because of the scarcity of meteorological stations), which makes it difficult to calculate the contribution of climate change to runoff [6,7]. New methods need to be developed in order to address this knowledge gap [8].

In recent years, various methods have been applied to calculate the contribution of climate change to runoff, including correlation analysis [9], sensitivity analysis [10], nonparametric Mann–Kendall tests [11], water–energy balance equations [12,13], two-parameter climate elasticity [14], weight connections [15], multiple linear regression (MLR) [16], Budyko curves [17], intelligent water drop algorithms [18], and hydrological models, such as VIC [19] and SWAT [20,21]. Using weight connections and artificial neural networks, Wang et al. [15] calculated the contributions of precipitation and temperature to runoff, which were 48% and 52% in the Toshkan River and 36% and 64% in the Kumarik River. Climate and streamflow processes have non-linear characteristics [22]. Improved, complete ensemble empirical mode decomposition, with adaptive noise (ICEEMDAN), can effectively decompose signals [23], and can be applied to analyze the impacts of climate change on runoff at different scales [14]. However, existing methods are mostly based on observed data [24], so they cannot be easily applied to data-scarce mountainous basins.

**Citation:** Fan, M.; Xu, J.; Chen, Y.; Fan, M.; Yu, W.; Li, W. Temperature Contributes More than Precipitation to Runoff in the High Mountains of Northwest China. *Remote Sens.* **2022**, *14*, 4015. https://doi.org/10.3390/rs14164015

Academic Editor: Xander Wang

Received: 19 July 2022 Accepted: 16 August 2022 Published: 18 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Precipitation and temperature are the main variables affecting runoff changes in mountainous basins [25,26]. Various studies have shown that precipitation changes have led to runoff changes in many rivers around the world [27,28]. Increasing temperatures can accelerate the melting of snow and glaciers [29], providing large amounts of runoff to mountainous rivers [30]. For data-scarce areas, reanalysis products provide spatially distributed climate data [31]. The ERA5 precipitation (ERAP) and ERA-interim temperature (ERAT) show climate change at the regional scale and are strongly correlated with empirical observations in China [32–36]. However, when applied to basins, the reanalysis products need to be downscaled in order to improve spatial resolution [37,38].

The purpose of this study is to develop a method to calculate the contributions of temperature and precipitation to runoff in data-scarce mountainous rivers. Selecting six mountainous basins originating from the Tianshan Mountains, we first downscaled temperature and precipitation using machine learning algorithms, and we then developed a Batch Gradient Descent Linear Regression (BGDLR) model to calculate the contributions of temperature and precipitation to runoff. This research can provide references for hydrological forecasts in mountainous areas.

#### **2. Materials and Methods**

#### *2.1. Study Area*

To calculate the contributions of temperature and precipitation to runoff changes in mountainous rivers, we selected as study areas the Manas River basin (MRB) (84°96 E–86°31 E; 43°07 N–43°98 N) and Urumqi River basin (URB) (86°80 E–87°29 E; 43°02 N–43°38 N) originating from the northern slope of the Tianshan Mountains, and Kashgar River basin (KaRB) (83°00 E–85°00 E; 43°21 N–44°06 N), Toxkan River basin (TRB) (75°54 E–78°62 E; 40°28 N–41°50 N), Kumaric River basin (KuRB) (78°07 E–80°33 E; 41°39 N–42°48 N), and Kaidu River basin (KRB) (82°57 E–86°06 E; 42°06 N–43°21 N), originating from the southern slope of the Tianshan Mountains (Figure 1). There is only one meteorological station in the KRB and TRB, and no meteorological station in the KuRB, KaRB, MRB, or URB (Figure 1). Therefore, the above basins are typical data-scarce mountainous areas.

The Kashgar River is the main tributary of the Yili River, with a mean elevation of 3100 m [39]. In KaRB, the terrain is open to the west, and the westerly airflow brings abundant water vapor [40]. Among the rivers on the northern slope of the Tianshan Mountains, the Manas River has the most abundant runoff [41]. The Aksu River and KRB provide more than 80% of runoff for the Tarim River in Northwestern China [42]. The Toshkan River and Kumarik River are the main sources of the Aksu River [43]. The mean elevations are 3737 m and 3550 m, respectively, and the basin areas in the KuRB and TRB are 13,557 km² and 18,835 km², respectively. The Kaidu River originates in the Sarming Mountains, with an average elevation of 2990 m and an area of 18,727 km². Snow and glaciers are widely distributed in the upper parts of the mountainous basins and are an important supply of runoff [44]. There is a hydrological station in the mountain pass of each basin, the observed data of which can reflect runoff changes.

#### *2.2. Datasets*

The data used in this study included observed data and reanalysis data. We first downscaled temperature and precipitation based on reanalysis products and verified the accuracy of the downscaled results by observed data from meteorological stations. Then, based on the downscaled climate data and observed runoff, we calculated the contributions of temperature and precipitation to runoff change.

**Figure 1.** The study area. (**a**) Tianshan Mountains, (**b**) Toxkan River basin, (**c**) Kashgar River basin, (**d**) Manas River basin, (**e**) Urumqi River basin, (**f**) Kumaric River basin, and (**g**) Kaidu River basin.

The reanalysis products include ERAP, ERAT, and the Digital Elevation Model (DEM). The United States Geological Survey provided the DEM (http://srtm.csi.cgiar.org, accessed on 4 July 2022), with a resolution of 90 m × 90 m. The ERAT and ERAP are the third- and fifth-generation products of ECMWF, respectively. The spatial resolution of ERAT is 0.125° × 0.125°, and that of ERAP is 0.25° × 0.25° (https://apps.ecmwf.int/datasets/data/, accessed on 4 July 2022). The period of ERAT is from January 1979 to August 2019, and that of ERAP is from January 1979 to December 2020. To analyze the mechanism of climate change driving runoff, this study used a long-term time series dataset of snow depth in China, with a time resolution of one day and a spatial resolution of 25 km. The data were provided by the National Glacier-Permafrost Desert Science Data Center (http://www.ncdc.ac.cn/portal/, accessed on 10 July 2022), from 1979 to 2016 [45].

To verify the accuracy of downscaled results, we used the observed monthly temperature and precipitation from 17 meteorological stations located near the six basins (http://data.cma.cn/, accessed on 17 June 2022). The data were downloaded from the National Meteorological Information Center, from 1979 to 2020. The monthly runoff was provided by the Hydrological Bureau of Xinjiang Uygur Autonomous Region. The runoff data for URB, MRB, and KaRB are from 1980 to 1987 and 2006 to 2011, and for KRB, TRB, and KuRB are from 1979 to 2015.

#### *2.3. Methods*

To calculate the contribution of climate change to runoff in data-scarce mountainous rivers, we integrated climate downscaling, the Mann–Kendall test, ICEEMDAN, and BGDLR. We first downscaled temperature and precipitation using machine learning algorithms, then developed a BGDLR model to calculate the contributions of temperature and precipitation to runoff change.

#### 2.3.1. Climate Downscaling

Topography and geographical location are the main factors affecting precipitation and temperature distribution in mountainous areas [46,47]. Introducing terrain and geographic location to the downscaling of precipitation and temperature is useful [48]. This research fitted nonlinear models for temperature and precipitation, then trained the models using a gradient descent algorithm [49]. The downscaling models can be expressed as follows:

$$\mathbf{T} = \mathbf{F}\_1 \begin{pmatrix} \mathbf{A}, \mathbf{B}, \mathbf{C}, \mathbf{D}, \mathbf{E} \end{pmatrix} + \Delta \mathbf{T}, \tag{1}$$

$$\mathbf{P} = \mathbf{F}\_2 \left( \mathbf{A}, \mathbf{B}, \mathbf{C}, \mathbf{D}, \mathbf{E} \right) + \Delta \mathbf{P}, \tag{2}$$

where ΔT and ΔP are residuals; A, B, C, D, and E are elevation, latitude, longitude, aspect, and slope, respectively. The steps build on Fan et al. [50,51].

#### 2.3.2. Climate and Hydrological Process Analysis

The Mann–Kendall test [52,53] is an effective tool for analyzing trends in time series [22,54]. We applied this method to explore the changes in climate and runoff. In addition, we used Sen's slope [55] to verify trends in the Mann–Kendall test. For the steps of the Mann–Kendall test and Sen's slope, please see Wang et al. [15].
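As an illustration, the two statistics can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this study: the function names and the sample series are hypothetical, and the variance formula omits the correction for tied values.

```python
import math
from statistics import median


def mann_kendall(series):
    """Return (S, Z) of the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def sens_slope(series):
    """Return Sen's slope: the median of all pairwise slopes."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)


# Hypothetical annual mean temperatures (not data from this study)
temps = [1.0, 1.2, 1.1, 1.5, 1.4, 1.8, 1.7, 2.0, 2.1, 2.3]
s, z = mann_kendall(temps)
slope = sens_slope(temps)
print(f"S = {s}, Z = {z:.2f}, Sen's slope = {slope:.3f} per time step")
```

A value of |Z| above 1.96 would indicate a significant trend at the 0.05 level under the normal approximation, and the sign of Sen's slope gives the direction of that trend.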

Runoff and climate have non-linear changes [56,57]. ICEEMDAN can effectively decompose signals [58–60]. In this research, ICEEMDAN was used to extract the multiscale changes in runoff and climate. For the steps of ICEEMDAN, please see Ali and Prasad [61].

#### 2.3.3. Contributions of Climate Change to Runoff

Precipitation and temperature are the main climatic variables affecting runoff changes in mountainous basins. Affected by factors such as geographic environment and altitude, the contribution of climate change to runoff varies in different basins and seasons. Linear regressions can clarify the relationship between dependent and independent variables [62,63]; the regression may be applied to calculate the contributions of independent variables to a dependent variable [64]. In this study, we developed a multi-linear model of runoff with temperature and precipitation and solved the model using a batch gradient descent (BGD) algorithm [49]. Before applying this method, normalization of temperature, precipitation, and runoff was performed with the equation as follows:

$$\mathbf{X\_{is}} = (\mathbf{x\_i} - \mathbf{x\_{min}}) / (\mathbf{x\_{max}} - \mathbf{x\_{min}}),\tag{3}$$

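where x<sub>i</sub> is the i-th value of the sequential data, x<sub>min</sub> and x<sub>max</sub> are the minimum and maximum of the series, and X<sub>is</sub> is the normalized variable.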
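Equation (3) can be sketched in a few lines of Python. The function name and the runoff values below are hypothetical and serve only to illustrate the scaling to the range [0, 1].

```python
def min_max_normalize(values):
    """Scale a series to [0, 1] following Equation (3)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical monthly runoff values (not data from this study)
runoff = [120.0, 80.0, 200.0, 160.0]
normalized = min_max_normalize(runoff)
print(normalized)
```

Because temperature, precipitation, and runoff are all mapped to the same range, the regression coefficients fitted afterwards can be compared directly.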

To avoid collinearity in the model, multicollinearity among the explanatory variables was evaluated using the tolerance and the variance inflation factor (VIF) [65]. We also used partial correlation analysis to remove the effect between explanatory variables. According to Hair et al. [65], a tolerance of >0.1 and a VIF of <10 indicate that there is no collinearity between independent variables. The resulting beta coefficients (partial regression coefficients) for the explanatory variables represent the independent contributions of each explanatory variable [66]. The regression model was as follows:

$$\mathbf{R} = \mathbf{c}\_1 \mathbf{T} + \mathbf{c}\_2 \mathbf{P} + \mathbf{a},\tag{4}$$

where R, T, and P are normalized runoff, temperature, and precipitation, respectively; c<sub>1</sub> is the regression coefficient of T; c<sub>2</sub> is the regression coefficient of P; and a is the regression constant.

In the model, the cost function for regression is:

$$\mathbf{J}(\boldsymbol{\theta}) = \frac{1}{2\mathbf{m}} \sum\_{i=1}^{\mathbf{m}} \left( \mathbf{h}\_{\boldsymbol{\theta}} \left( \mathbf{x}^{(i)} \right) - \mathbf{y}^{(i)} \right)^{2},\tag{5}$$

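where m is the number of samples; x<sup>(i)</sup> represents the characteristics of the i-th sample; y<sup>(i)</sup> is the target; and h<sub>θ</sub>(x<sup>(i)</sup>) is the value predicted by the regression model with parameters θ.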

Then, we used the BGD algorithm to train the dataset to minimize J(θ) and find the optimal solution of θ. The gradient of J(θ) with respect to each parameter θ<sub>j</sub> is as follows:

$$\frac{\partial}{\partial \theta\_{\mathbf{j}}} \mathbf{J}(\theta) = \frac{1}{\mathbf{m}} \sum\_{i=1}^{\mathbf{m}} \left( \mathbf{h}\_{\theta} \left( \mathbf{x}^{(i)} \right) - \mathbf{y}^{(i)} \right) \mathbf{x}\_{\mathbf{j}}^{(i)} \tag{6}$$

The correction function for θ is:

$$\theta\_{\mathbf{j}} = \theta\_{\mathbf{j}} - \alpha \frac{1}{\mathbf{m}} \sum\_{i=1}^{\mathbf{m}} \left( \mathbf{h}\_{\theta} \left( \mathbf{x}^{(i)} \right) - \mathbf{y}^{(i)} \right) \mathbf{x}\_{\mathbf{j}}^{(i)},\tag{7}$$

where α is the learning rate.
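The training procedure in Equations (4)–(7) can be sketched as follows. This is an illustrative sketch rather than the code used in this study: the function name, the learning rate, the number of iterations, and the synthetic series are all assumptions chosen so that the example converges.

```python
def fit_bgd(T, P, R, alpha=0.5, n_iter=5000):
    """Fit R ~ c1*T + c2*P + a by batch gradient descent (Equations (4)-(7))."""
    m = len(R)
    c1 = c2 = a = 0.0
    for _ in range(n_iter):
        # Prediction error of every sample, using the current parameters
        errors = [c1 * t + c2 * p + a - r for t, p, r in zip(T, P, R)]
        # Gradient of the cost J with respect to each parameter (Equation (6))
        g_c1 = sum(e * t for e, t in zip(errors, T)) / m
        g_c2 = sum(e * p for e, p in zip(errors, P)) / m
        g_a = sum(errors) / m
        # Simultaneous update of all parameters (Equation (7))
        c1 -= alpha * g_c1
        c2 -= alpha * g_c2
        a -= alpha * g_a
    return c1, c2, a


# Synthetic series in [0, 1] built from known coefficients (assumed values)
T = [0.0, 0.2, 0.5, 0.7, 1.0, 0.3]
P = [0.9, 0.1, 0.4, 1.0, 0.0, 0.6]
R = [0.6 * t + 0.3 * p + 0.1 for t, p in zip(T, P)]

c1, c2, a = fit_bgd(T, P, R)
print(f"c1 = {c1:.4f}, c2 = {c2:.4f}, a = {a:.4f}")
```

Each iteration uses all m samples to compute the gradient, which is what distinguishes batch gradient descent from its stochastic and mini-batch variants. On this noise-free synthetic series, the fitted parameters should approach the coefficients used to generate it.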

Based on the regression coefficients, the contributions of precipitation and temperature to runoff are calculated as:

$$\eta\_1 = |\mathbf{c}\_1| / (|\mathbf{c}\_1| + |\mathbf{c}\_2|), \tag{8}$$

$$\eta\_2 = |\mathbf{c}\_2| / (|\mathbf{c}\_1| + |\mathbf{c}\_2|),\tag{9}$$

where η<sub>1</sub> is the contribution of temperature to runoff and η<sub>2</sub> is the contribution of precipitation to runoff.
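The two ratios can be computed directly from the fitted coefficients, as in the following sketch. The function name and the coefficient values are hypothetical and are not results of this study.

```python
def contributions(c1, c2):
    """Return (eta1, eta2): relative contributions of T and P (Equations (8)-(9))."""
    total = abs(c1) + abs(c2)
    if total == 0:
        raise ValueError("both coefficients are zero")
    return abs(c1) / total, abs(c2) / total


# Hypothetical regression coefficients for temperature and precipitation
eta_t, eta_p = contributions(0.6, -0.2)
print(f"temperature: {eta_t:.0%}, precipitation: {eta_p:.0%}")
```

Because absolute values are used, a negative coefficient still counts toward the contribution of its variable, and the two ratios always sum to 1.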

#### **3. Results**

#### *3.1. Accuracy of Downscaled Climate Data*

After climate downscaling, we obtained spatially distributed precipitation and temperature in the six basins. The resolution of the downscaled data is 90 m × 90 m. The observations from 17 meteorological stations distributed in and near the basins were used to verify the accuracy of the downscaling results. Tables S1 and S2 indicate the robust performance of the developed method. At the 17 meteorological stations, the slope between the downscaled and observed data was close to 1, and the Nash–Sutcliffe efficiency (NSE) was higher than 0.5. At most stations, the mean absolute error (MAE) and root mean square error (RMSE) between downscaled temperature and observations were <3 °C (Table S1), and those between downscaled precipitation and observations were <10 mm (Table S2). At 11 stations, the NSE between downscaled and observed temperature was >0.9 (Table S1). The downscaled data compensate for the scarce observations and reveal climate change in mountainous basins.
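For reference, the three accuracy measures used above (Nash–Sutcliffe efficiency, mean absolute error, and root mean square error) can be sketched with their standard definitions. The function names and the sample values are hypothetical and are not taken from Tables S1 and S2.

```python
import math


def mae(obs, sim):
    """Mean absolute error between observations and downscaled values."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error between observations and downscaled values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


# Hypothetical monthly temperatures at one station (degrees Celsius)
observed = [2.0, 4.0, 6.0, 8.0]
downscaled = [3.0, 3.0, 7.0, 7.0]
print(mae(observed, downscaled), rmse(observed, downscaled), nse(observed, downscaled))
```

An NSE above 0.5 means that the downscaled series explains more than half of the variance of the observations around their mean.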

#### *3.2. Climate Change*

The mean and slope of downscaled grid data showed the characteristics of temperature and precipitation change. According to Figure 2, the temperature is <0 °C and annual precipitation is <600 mm in the six basins. In valleys and plains, the annual precipitation is <300 mm, and the temperature is >0 °C. In mountainous areas, the annual precipitation is >300 mm, and the temperature is <0 °C.

**Figure 2.** (**a**–**d**), spatial patterns and trends of precipitation and temperature from 1979 to 2020 in KRB; (**e**–**h**) KuRB; (**i**–**l**) KaRB; (**m**–**p**) TRB; (**q**–**t**) MRB; and (**u**–**x**) URB; (**a**,**e**,**i**,**m**,**q**,**u**) are annual average temperature; (**b**,**f**,**j**,**n**,**r**,**v**) are temperature trend; (**c**,**g**,**k**,**o**,**s**,**w**) are annual precipitation; (**d**,**h**,**l**,**p**,**t**,**x**) are precipitation trend.

There are significant differences in the temperature and precipitation changes in different basins (Figure 2). In the past 40 years, KRB has experienced wetting, with a humidification rate of 9 mm/10a (Figure 2b,d). At the same time, KaRB became increasingly dry, at a rate of 40 mm/10a (Figure 2l). Unlike the above basins, TRB, KuRB, MRB, and URB have experienced warming and wetting. In TRB and KuRB, the rate of warming and humidification gradually slowed from west to east. In TRB, the temperature increased by 0.29 °C/10a in the west and 0.20 °C/10a in the east (Figure 2n); the precipitation increased by 50 mm/10a in the west and 7.42 mm/10a in the east (Figure 2p). In KuRB, the temperature increased by 0.21 °C/10a in the west and 0.18 °C/10a in the east (Figure 2f); the precipitation increased by 30 mm/10a in the west and 27 mm/10a in the east (Figure 2h). On the whole, compared with valleys and plains, the high mountains have more dramatic changes in temperature and precipitation.

#### *3.3. Impact of Climatic Variables on Runoff*

#### 3.3.1. Correlation of Runoff with Temperature and Precipitation

Figure 3 displays the correlation coefficients of runoff with climate on a monthly scale. In KaRB, the runoff has a stronger correlation with temperature, whereas in MRB, URB, KuRB, and KRB, the runoff has a stronger correlation with precipitation (Figure 3). Studies have shown that the correlations between runoff and climate vary in different seasons. Therefore, on a monthly scale, temperature and precipitation are the main driving factors for the runoff changes in mountainous watersheds. The difference is that, in KaRB, the temperature has a greater effect on runoff than precipitation, whereas in MRB, URB, KuRB, and KRB, precipitation dominates the monthly runoff changes.

**Figure 3.** Correlation coefficients between climate and runoff on a monthly scale. \* indicates significance at the 0.01 level.

The seasonal correlation coefficients between runoff and climate are shown in Table 1. In URB, a positive correlation between runoff and precipitation is only shown in summer, which indicates that the runoff in URB is mainly supplied by summer precipitation. In MRB, the runoff has significant positive correlations with both temperature and precipitation in summer, indicating that the runoff is mainly replenished by summer precipitation and glacier meltwater. In KaRB and KRB, a significant positive correlation is shown between runoff and precipitation in spring and summer, so the runoff in KaRB and KRB is mainly replenished by spring and summer precipitation. In TRB, the correlation is positive between precipitation and runoff in spring and autumn, indicating that the runoff in TRB is mainly replenished by spring and autumn precipitation. In KuRB, the correlation between runoff and temperature is significant and positive in spring, summer, and autumn, which indicates that the snowmelt water in spring and autumn, and melted ice water in summer, are the main replenishments for the runoff in KuRB (Table 1). There are obvious differences in the correlations between runoff and climate in different seasons. Overall, the changes in spring and autumn temperature, and summer precipitation, have an important impact on runoff.
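A correlation coefficient of the kind discussed above can be sketched as follows, assuming the Pearson coefficient is used (the type of coefficient is not stated in this section). The function names and the sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical summer precipitation and runoff for four years
precip = [1.0, 2.0, 3.0, 4.0]
runoff = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(precip, runoff)
t = t_statistic(r, len(precip))
print(f"r = {r:.2f}, t = {t:.3f}")
```

The t statistic would then be compared with the critical value of Student's t distribution at the chosen significance level to decide whether a coefficient is marked as significant.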


**Table 1.** Seasonal correlation coefficients between climate and runoff.

Table note: \* indicates significance at the 0.05 level and \*\* indicates significance at the 0.01 level.

ICEEMDAN was used to explore the multi-scale variations of climate and runoff. Table S3 shows that the runoff has similar cycles to temperature and precipitation. In MRB, URB, and KaRB, five intrinsic mode functions (IMFs) and one residual component (RES) were obtained after the decomposition of runoff, temperature, and precipitation, whereas in the other basins, six IMFs and one RES were obtained. We reconstructed the precipitation, temperature, and runoff on inter-seasonal, inter-annual, and inter-decadal scales [67], then calculated the correlation coefficients between runoff and climate on different scales (Table 2). In MRB, URB, and KaRB, the inter-decadal variations of runoff are not shown because of the short period of runoff data. On the seasonal scale, the runoff has a stronger correlation with temperature in KaRB and KuRB, and a stronger correlation with precipitation in the other basins; on the inter-annual scale, the correlation between runoff and temperature is stronger than that between runoff and precipitation; on the inter-decadal scale, the runoff has a stronger correlation with temperature in TRB and KRB, and a stronger correlation with precipitation in KuRB (Table 2). It is worth noting that the runoff and temperature in KRB are significantly negatively correlated on the inter-decadal scale, and the specific mechanism(s) need to be further studied.


**Table 2.** Multi-scale correlation coefficients between runoff and climate.

Table note: \* indicates significance at the 0.05 level.

#### 3.3.2. Contributions of Climate Change to Runoff

Table S4 shows the collinearity tests for the regression models. In the six basins, the tolerances were all >0.8 and the variance inflation factors (VIF) were all <2, indicating that the models did not have collinearity. According to partial regression coefficients, the contributions of temperature and precipitation to runoff can be calculated (Figure 4). Statistical results indicate significant differences in the contribution ratios among different basins. In KuRB, the temperature dominates annual runoff changes, whereas precipitation contributes more than temperature to runoff changes in other basins. Among the six watersheds, the KuRB has the highest elevation, and glaciers and permanent snow are widely distributed. Therefore, in high mountainous areas, the temperature contributed more than precipitation to runoff changes.
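Regarding the collinearity test reported above (Table S4), the two diagnostics take a simple form when a regression has only two explanatory variables, as in Equation (4). The following relation is a sketch under that assumption, where r denotes the correlation coefficient between temperature and precipitation (a symbol introduced here for illustration, not a value reported in the source):

$$\mathrm{Tolerance} = 1 - r^{2}, \qquad \mathrm{VIF} = \frac{1}{\mathrm{Tolerance}} = \frac{1}{1 - r^{2}}$$

Under this assumption, a tolerance >0.8 corresponds to r² < 0.2 (|r| below about 0.45) and to a VIF below 1.25, which is consistent with the reported VIF values of <2.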

Across the seasons, there are obvious differences in the contribution of temperature and precipitation change to runoff (Figure 4). Generally, summer runoff accounts for the highest proportion of annual runoff in alpine basins [68]. Therefore, the relative contribution of temperature and precipitation to summer runoff changes can reflect the replenishment of runoff. In MRB, the contribution of precipitation change to runoff is 97% in spring, whereas the temperature change has a higher contribution than precipitation to runoff in summer and autumn, indicating that the glacier and snow melt water in summer is the main replenishment for runoff in MRB, followed by spring and summer precipitation. In URB, in spring, the contribution of temperature changes to runoff is higher than precipitation, indicating that glacier and snowmelt water are important replenishments for runoff; in summer, the runoff is mainly affected by precipitation change, and in autumn and winter, temperature and precipitation contribute equally to runoff. In KaRB, the relative contribution of precipitation changes to runoff is 72%, 88%, and 99% in spring, summer, and winter, respectively. Located in the Ili River Valley, KaRB is affected by warm and humid water vapor from the Atlantic Ocean; therefore, the runoff changes in KaRB are mainly affected by precipitation.

**Figure 4.** Contributions of precipitation and temperature change to runoff.

In TRB, precipitation contributed more than temperature to runoff in spring, summer, and autumn, whereas temperature has a higher contribution than precipitation to runoff in winter. This indicates that summer precipitation is the main replenishment of runoff in TRB, followed by spring and autumn precipitation. Different from other basins, in KuRB the temperature changes have a higher contribution than precipitation to runoff; therefore, glacier and snow melt water in summer is the main recharge of the runoff. In KRB, the contribution of precipitation changes to runoff is higher than temperature in each season, indicating that summer precipitation is the most important replenishment for runoff in KRB. In MRB and KuRB, glacier and snow melt water in summer is the main recharge of runoff, and runoff is mainly affected by temperature changes, whereas in other watersheds, summer precipitation is the main recharge of runoff, and runoff is mainly affected by precipitation changes.

#### **4. Discussion**

#### *4.1. Climate Downscaling*

Almost all previous studies have calculated the contribution of climate change to runoff using observed precipitation and temperature [11,16]. Generally, observed data only reflect climate change at immediate locations [51]. For data-scarce areas, reanalysis products provide spatially distributed temperature and precipitation [31,32], but their low resolution obscures the climate heterogeneity within basins [69]. Based on climate downscaling, this research obtained high-resolution temperature and precipitation data. We extracted the downscaled temperature (September 2017) and compared it with ERAT and observations at the corresponding time (Figure 5). There is only one meteorological station in KRB and TRB, and there is no meteorological station in KuRB, KaRB, MRB, or URB. Compared with observations and ERAT, downscaled data more accurately showed temperature changes in data-scarce mountain basins (Figure 5).

**Figure 5.** (**a**–**c**), comparison of downscaled temperature (90 m × 90 m), ERAT (0.125° × 0.125°), and observations in the KRB; (**d**–**f**) KuRB; (**g**–**i**) KaRB; (**j**–**l**) TRB; (**m**–**o**) MRB; and (**p**–**r**) URB; (**a**,**d**,**g**,**j**,**m**,**p**) are downscaled temperature; (**b**,**e**,**h**,**k**,**n**,**q**) are ERAT; (**c**,**f**,**i**,**l**,**o**,**r**) are observations.

#### *4.2. Climate and Runoff Processes in Mountainous Rivers*

In high mountainous watersheds, runoff is mainly supplied by precipitation and glacial snowmelt water, among which precipitation mainly supplies runoff in the form of rainfall and snowfall [44,70]. Above the snow line, snowfall accumulates as snowpack and glacier ice, which melt as temperatures increase, thereby replenishing runoff [71]. Studies have shown that glacier meltwater accounts for about 20% to 40% of total runoff in the Tianshan Mountains [72]. In the context of climate warming, the precipitation form in mountainous areas has changed, which has caused changes in the runoff process [73,74]. In rivers dominated by snowmelt runoff, a decrease in snowfall rates will lead to a shift toward rainfall [75], thereby altering the seasonal distribution of runoff and leading to earlier flood peaks [76]. Studies have shown that the contributions of precipitation and temperature to runoff were 36% and 64% in KuRB, 52% and 48% in TRB [70], and 56% and 44% in KRB [15], which is consistent with our results.

To further understand the mechanisms of climate change driving runoff in the Tianshan Mountains, selecting KRB as an example, this section first compared the changes of temperature, precipitation, and runoff, then analyzed glacier and snow changes. Figure 6a shows that runoff has an annual distribution consistent with temperature and precipitation. In summer, temperature is the highest, precipitation is abundant, and runoff is the most abundant. Interannual variation shows that runoff is consistent with the changes in temperature and precipitation, and runoff increases (decreases) when the temperature and precipitation increase (decrease) (Figure 6).

**Figure 6.** Changes in the temperature, precipitation, and runoff in KRB from 1979 to 2015: (**a**) annual distribution of temperature, precipitation, and runoff; (**b**) changes in precipitation and runoff; and (**c**) changes in temperature and runoff.

In the context of global warming, glacier and snow areas in KRB have changed. Bai [77] extracted the glacier area of KRB from glacier inventory data for two periods. Compared with the first statistical period (1956–1983), the glacier area in the second statistical period (2005–2010) decreased by 45.27% (Figure 7a). Warming is not only causing glaciers to retreat, but also accelerating snowmelt. In the past few decades, the snow depth in KRB has decreased, and the downward trends were most obvious in the central and eastern regions. Relative to January 1980, snow depth in the northeastern mountains decreased by 10 cm (Figure 7b). Previous studies have shown that the correlation coefficients between temperature and snow cover in KRB are −0.81, −0.48, −0.80, and −0.82 in spring, summer, autumn, and winter, respectively. The above results confirm that temperature has an important impact on runoff change in high mountainous areas.

**Figure 7.** Glacier and snow changes in KRB: (**a**) changes in the glacier areas; and (**b**) changes in snow depth (January 2015 vs. January 1980). The data in (**a**) refers to Bai [75].

This research indicates that high mountains have more sensitive responses than plains to global climate changes. In mountainous areas, glaciers and snow are widely distributed, and plant diversity is abundant [78]. In valleys and plains, topography, latitude, and ecosystem stability may cause a buffering effect [79], with slower increases in temperature and precipitation than in mountains. The increase of precipitation is faster in mountainous areas than in plains, which is consistent with results from other studies [80]. In mountainous areas, increasing temperature accelerated the melting of snow and glaciers, leading to increased water vapor. Moreover, low saturated water vapor pressure in mountainous areas is conducive to the formation of precipitation [81].

Driven by climate change, runoff decreased in the KaRB in the past 40 years, which was mainly caused by decreasing precipitation. The KaRB is located near the Ili River Valley, and the decrease in precipitation may be related to the North Atlantic Drift [82]. At the same time, runoff increased in MRB, URB, KuRB, TRB, and KRB, but there are significant differences in the increase rate among different basins, and previous studies [70] have supported this.

#### *4.3. Limitations*

This study simulated high-resolution temperature and precipitation by downscaling reanalysis products. For alpine mountain areas, reanalysis products provide spatially distributed climate data. However, the reanalysis products deviate from observed data, and their accuracy needs to be further improved. In addition, vegetation affects the climate in mountainous areas, especially precipitation [83]. Due to the short time series and low spatial resolution of existing vegetation data, this study only introduced geographical and terrain factors in climate downscaling. In future research, the accuracy of downscaling results can be further improved if vegetation data with high temporal and spatial resolution are obtained.

This study mainly analyzed the impact on runoff of the two main climate variables, temperature and precipitation. In high mountainous basins, glacier melt and snowmelt are the main supplies of runoff. In future research, it is necessary to quantify the contributions of glaciers and snow to runoff change in order to understand more clearly the mechanisms by which climate change drives runoff. In addition, this study did not analyze the impact of alpine permafrost thawing on runoff change.

This study used beta coefficients (partial regression coefficients) of regression models to calculate the contributions of temperature and precipitation to runoff change. Bias in the precipitation and temperature data will lead to uncertainty in the calculated contribution ratios. In addition, the contribution to runoff of the regression constant, which is a component of the regression model, was not calculated in this study. In future research, we will improve the model to enhance its applicability.
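To make the idea of beta-coefficient contributions concrete, the following is a minimal sketch, not the authors' implementation. It assumes a multiple linear regression on z-scored variables and assumes that each driver's contribution is the share of its absolute beta coefficient in the sum of absolute beta coefficients; the function name `beta_contributions` and the synthetic data are hypothetical.

```python
import numpy as np

def beta_contributions(temp, precip, runoff):
    """Standardized (beta) regression coefficients and assumed contribution shares."""
    # z-score each series so that the coefficients are comparable across units
    def z(v):
        v = np.asarray(v, dtype=float)
        return (v - v.mean()) / v.std(ddof=1)

    X = np.column_stack([z(temp), z(precip)])
    y = z(runoff)
    # Least-squares fit without intercept (standardized variables have zero mean)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Assumed definition: share of each |beta| in the sum of |beta|
    share = np.abs(beta) / np.abs(beta).sum()
    return beta, share

# Synthetic illustration only (not the study's data)
rng = np.random.default_rng(0)
temp = rng.normal(size=200)
precip = rng.normal(size=200)
runoff = 0.8 * temp + 0.2 * precip + rng.normal(scale=0.05, size=200)
beta, share = beta_contributions(temp, precip, runoff)
```

Because the variables are standardized, the fit has no constant term, which illustrates why the regression constant does not enter a contribution ratio of this kind.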

#### **5. Conclusions**

Based on reanalysis products, this study developed an integrated method to calculate the contributions of climate change to runoff in data-scarce high mountains. Applying this method to six mountainous basins originating from the high Tianshan Mountains, this study found that, in high mountains, runoff changes are mainly affected by temperature, whereas in lowlands, precipitation contributes more than temperature to runoff changes. The contributions of precipitation and temperature change to runoff were 20% and 80%, respectively, in the Kumarik River. This study also found that climate changes are more significant in high mountains than in lowlands.

The present study lays the groundwork for future research using runoff simulations, which is of great significance to hydrological forecasting and water resource management in mountainous basins. This study highlights the impact of glaciers and snow on mountainous runoff. Thus, introducing the distribution of glaciers and snow into a hydrological modeling framework could better characterize runoff. Based on future climate scenarios, water management planning should be oriented to generate new strategies to cope with possible future changes in the strength of seasonality, and other variables.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs14164015/s1, Table S1: accuracy of the downscaled temperature at 17 meteorological stations; Table S2: accuracy of the downscaled precipitation at 17 meteorological stations; Table S3: cycles (month) of the runoff, temperature, and precipitation in six basins; Table S4: collinearity tests for regression models.

**Author Contributions:** Conceptualization, M.F. (Mengtian Fan) and J.X.; methodology, software, validation, formal analysis, investigation, and data curation, M.F. (Mengtian Fan); writing—original draft preparation, J.X. and M.F. (Meihui Fan); writing—review and editing, Y.C., W.Y. and M.F. (Meihui Fan); visualization and supervision, Y.C. and W.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is supported by the National Natural Science Foundation of China (Grant No. 42130512, 41871025, U20A2098) and the special fund for the introduction of talents in Nanjing University of Information Science and Technology (Grant No. 1521582201013).

**Data Availability Statement:** The data used in this study are available from Mengtian Fan (mtfan@nuist.edu.cn).

**Acknowledgments:** The authors gratefully acknowledge the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2019431) and the State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Quantifying the Trends and Variations in the Frost-Free Period and the Number of Frost Days across China under Climate Change Using ERA5-Land Reanalysis Dataset**

**Hongyuan Li <sup>1,2</sup>, Guohua Liu <sup>3</sup>, Chuntan Han <sup>1,2</sup>, Yong Yang <sup>1</sup> and Rensheng Chen <sup>1,4,</sup>\***


**Abstract:** Understanding the spatio-temporal variations in the frost-free period (FFP) and the number of frost days (FD) helps to reduce the harmful effects of climate change on agricultural production and to enhance agricultural adaptation. However, the spatio-temporal variations in FFP and FD and their response to climate change remain unclear across China. To investigate the impact of climate change on FFP and FD, the trends and variations in FFP and FD across China from 1950 to 2020 were quantified using ERA5-Land, a reanalysis dataset with high spatial and temporal resolution. The results showed that ERA5-Land has good applicability in quantifying the trends and variations in FFP and FD across China under climate change. The spatial distribution of multi-year average FFP and FD across China showed significant latitudinal zonality and altitude dependence, i.e., FFP decreased with increasing latitude and altitude, while FD increased with increasing latitude and altitude. As a result of climate warming across China, the FFP showed an increasing trend of 1.25 d/10a, with a maximum regional rate of increase of 6.2 d/10a, while the FD showed a decreasing trend of −1.41 d/10a, with the largest regional decrease being −6.7 d/10a. Among the five major climate zones in China, the subtropical monsoon climate zone (SUMZ) had the greatest rate of increase in FFP (1.73 d/10a), while the temperate monsoon climate zone (TMCZ) had the greatest rate of decrease in FD (−1.72 d/10a). In addition, the coefficient of variation (Cv) of FFP showed greater variability at higher altitudes, while the Cv of FD showed greater variability at lower latitudes in southern China. Without considering the adaptation of crops to temperature, the general increase in FFP and the general decrease in FD were both beneficial to agricultural production, promoting a longer growing period and reducing frost damage to crops. This study provides a comprehensive understanding of the trends and variations in FFP and FD under climate change, which is of great scientific significance for the adjustment of the agricultural production layout to adapt to climate change in China.

**Keywords:** frost-free period; frost days; ERA5-Land; climate change; agricultural production

#### **1. Introduction**

The frost-free period (FFP) is the period of the year after the last frost day (LFD) and before the first frost day (FFD), during which no hoarfrost occurs [1–3]. The FFP is closely related to the growing period of crops, as a long FFP is associated with a long growing period of crops, and the length of the FFP is a very important heat index in agriculture [1,2]. The number of frost days (FD) is the total number of days in a year that frost occurs [4,5].

**Citation:** Li, H.; Liu, G.; Han, C.; Yang, Y.; Chen, R. Quantifying the Trends and Variations in the Frost-Free Period and the Number of Frost Days across China under Climate Change Using ERA5-Land Reanalysis Dataset. *Remote Sens.* **2022**, *14*, 2400. https://doi.org/10.3390/rs14102400

Academic Editor: Xander Wang

Received: 13 April 2022 Accepted: 15 May 2022 Published: 17 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Frost is a relatively common extreme agro-meteorological disaster that occurs in winter and spring [6–8]. It is primarily caused either by a sharp drop in temperature to below 0 °C within a short time as a cold wave moves southward, or by strong radiative cooling at ground level on the night after the weather changes from cloudy to clear-sky conditions following a cold wave [6,9–12].

Climate change has become a common concern for scientists and the public, with the global surface temperature in 2011–2020 being 1.09 °C higher than before the Industrial Revolution [13]. The trend of surface temperature in China over the last hundred years is basically the same as the global one, but the warming in the last 50 years is more pronounced than the global average [14]. Under climate warming, the frequency of cold events and the dates of frost have changed significantly in many regions of the world [15–17]. Climate warming has resulted in an earlier LFD, a later FFD, an increase in FFP and a decrease in FD [7,18–23], and variations in FFP and FD have had a significant positive effect on agricultural production [16,17,24–29]. These changes have affected not only seasonal planting plans, but also the growth and development of crops [2,18,30–33]. Consequently, determining the trends and variations in FFP and FD under climate change is important for the planning and management of agricultural production, such as agricultural zoning planning, agricultural production restructuring, optimization of agricultural production structures and regional layout [2,18,30,31,34].

Research on the trends and variations in FFP and FD is generally based on surface and air temperatures observed by ground-based meteorological stations [1,22,24,35]. Although station-based research is relatively effective, it has certain disadvantages for quantifying the trends and variations in FFP and FD [22,29]. This is due to the uneven distribution and limited number of meteorological stations in China, especially in the vast regions of northwest China and the Qinghai–Tibet Plateau, where stations are very sparse [36–38]. Even where these studies extrapolated station results in space using algorithms such as inverse distance interpolation, the interpolation creates large spatial uncertainties in regions with sparse station distribution and complex topography [39]. The uneven distribution and limited number of meteorological stations, together with complex topography, result in less accurate assessments of many climatic and meteorological conditions [38,40–42], including FFP and FD.

With the development of satellite remote sensing and data assimilation techniques, as well as land surface and atmospheric models, the spatial and temporal resolution of reanalysis datasets has gradually increased, and these datasets are widely used for analyzing meteorological and hydrological processes [43–48]. A reanalysis dataset is the result of reprocessing and analyzing historical meteorological observations to reproduce past atmospheric conditions using well-established numerical prediction models and assimilation analysis [49,50]. Reanalysis datasets are generally gridded and estimate meteorological elements at a higher spatial and temporal resolution than meteorological stations; they are particularly useful for studying meteorological and hydrological processes in regions where meteorological stations are sparse or unavailable [43,51,52].

Although many studies have investigated the FFP and FD in single or multiple provinces [9,18,22,29,35,53–55], the spatial and temporal variations in FFP and FD, as well as their response to climate change remain unclear across China. China covers a vast region and a variety of climate zones [56,57], and the geographical distribution of FFP and FD is zonal with evident spatial differences [6,10,22]. The layout of agricultural production is closely related to climatic conditions [58], and FFP and FD are also influenced by climatic conditions. Therefore, based on the quantification of FFP and FD at a national level, it is more beneficial to understand the relationship between FFP and FD and agricultural production by further quantifying the trends and variations in FFP and FD in different climate zones.

In the current study, the daily minimum skin temperature (*Ts*) and daily minimum 2 m temperature (*Ta*) from the ERA5-Land reanalysis dataset [50,59,60] were selected as data sources for quantifying the trends and variations in FFP and FD across China, and the data sources were validated using observed daily minimum ground surface temperature (GST) and daily minimum air temperature (TEM) from the Daily Meteorological Dataset of basic meteorological elements of China National Surface Weather Stations (DMDC) (V3.0) [61]. The main objectives of this study are (1) to analyze the spatial pattern of FFP and FD across China, (2) to quantify the trends and variations in FFP and FD across China during 1950~2020 and (3) to explore the effect of variations in FFP and FD on agricultural production in China.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The vast size of China, the long distances between east and west, north and south, and the very different topographical and climatic characteristics of the various provinces and regions have resulted in a great difference in FFP and FD across China. In general, China can be divided into five main climate zones: the alpine climate zone (ALCZ), the temperate continental climate zone (TECZ), the temperate monsoon climate zone (TMCZ), the subtropical monsoon climate zone (SUMZ) and the tropical monsoon climate zone (TRMZ) [57] (Figure 1). There is consistency in the trends and changes in FFP and FD for each climate zone, while there is wide variability in the trends and variations in FFP and FD between climate zones. In addition, China contains 34 provinces, and the trends and variations in FFP and FD for each province are highly variable. Therefore, an integrated quantitative analysis of trends and variations in FFP and FD from a national perspective and in different climatic zones is of great scientific significance to adjust agricultural production to climate change. In this study, observed daily minimum surface and air temperatures from 822 national surface weather stations (including 620 national basic synoptic stations and 202 national reference climatological stations) (Figure 1) with longer time series were selected to evaluate the applicability of ERA5-Land reanalysis dataset.

**Figure 1.** China's five main climate zones: the alpine climate zone (ALCZ), the temperate continental climate zone (TECZ), the temperate monsoon climate zone (TMCZ), the subtropical monsoon climate zone (SUMZ) and the tropical monsoon climate zone (TRMZ), and the distribution of 822 national surface weather stations in China.

#### *2.2. Data Collection*

#### 2.2.1. Reanalysis Dataset of ERA5-Land

In this study, the daily minimum skin temperature (*Ts*) and daily minimum 2 m temperature (*Ta*) from the ERA5-Land reanalysis dataset were selected as the data source for quantifying the trends and variations of FFP and FD across China. ERA5-Land is a reanalysis dataset providing a consistent view of the evolution of land variables over several decades at an enhanced resolution, and it has been produced by replaying the land component of the ECMWF ERA5 climate reanalysis [50,59,60]. ERA5-Land has a high spatial resolution of 0.1 degree (approximately 10 km), a temporal resolution of one hour and a temporal coverage from 1950 to the present. Because the temporal resolution of skin and air temperatures from ERA5-Land is one hour, the daily minimum *Ts* and daily minimum *Ta* selected in this study are the minimum values of the 24 skin temperatures and 24 air temperatures in a day, respectively. Although the original dataset covers 1950 to the present, we selected 1950 to 2020 as the study period. The high spatial and temporal resolution of ERA5-Land makes this dataset very useful for all kinds of land surface applications [50], such as FFP and FD quantification.
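The reduction from hourly values to a daily minimum can be sketched as follows. This is an illustrative NumPy example under stated assumptions: the helper `daily_minimum` is hypothetical, the series is assumed to start at the first hour of a day, and the text does not state whether days are delimited in UTC or local time.

```python
import numpy as np

def daily_minimum(hourly, hours_per_day=24):
    """Reduce an hourly series (time on the first axis) to daily minima."""
    hourly = np.asarray(hourly, dtype=float)
    n_days = hourly.shape[0] // hours_per_day
    # Drop any incomplete trailing day, then group the hours by day
    trimmed = hourly[: n_days * hours_per_day]
    by_day = trimmed.reshape(n_days, hours_per_day, *hourly.shape[1:])
    return by_day.min(axis=1)

# Two synthetic days of hourly temperatures (degrees C) at a single grid cell
hours = np.arange(48)
temps = 5.0 + 6.0 * np.sin(2 * np.pi * (hours - 9) / 24)
tmin = daily_minimum(temps)
```

The same function also works on gridded input with shape (time, lat, lon), since only the first axis is regrouped.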

#### 2.2.2. Dataset for ERA5-Land Evaluation

For the evaluation of daily minimum *Ts* and *Ta* from ERA5-Land, the observed daily minimum ground surface temperature (GST) and daily minimum air temperature (TEM) were collected from 822 national surface weather stations in China (Figure 1) for the 1951–2015 period; these data were provided by the China Meteorological Administration. The observations are stored in the Daily Meteorological Dataset of basic meteorological elements of China National Surface Weather Stations (DMDC) (V3.0), which includes the daily minimum GST and daily minimum TEM [61]. Because the weather stations were built at different times, the time series of the observations is not consistent across stations. Only a few stations have complete time series from 1951 to 2015; most have shorter series within this period. In particular, the time series of some stations on the Qinghai–Tibet Plateau do not exceed 30 years. Therefore, the daily minimum *Ts* and *Ta* from ERA5-Land were validated against the corresponding daily minimum GST and TEM, respectively.

#### *2.3. Methods*

#### 2.3.1. Quantitative Criteria

The FFP is the period between the LFD and the FFD, but there is no uniform definition of the LFD and FFD [22,53]. Generally, the last day in the first half of the year on which the minimum surface temperature is lower than 0 °C is defined as the LFD, the first such day in the second half of the year is defined as the FFD, and the period between the LFD and the FFD is defined as the FFP [53,62]. Therefore, the daily minimum *Ts* from ERA5-Land was used to calculate the LFD and FFD, and thus the FFP, in this study.

The FD differs from the FFP: the FFP is a continuous period, whereas the FD is the total number of days in a year on which hoarfrost occurs. The FD is one of the 27 core extreme climate change indices revisited by the joint CCl/CLIVAR/JCOMM Expert Team (ET) on Climate Change Detection and Indices (ETCCDI) [4,5], and it is defined as the annual count of days when the daily minimum temperature is lower than 0 °C. In view of this definition, the daily minimum *Ta* from ERA5-Land was used to calculate the FD in this study.
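The criteria above can be illustrated with a short sketch for a single year at a single grid cell. It is not the authors' code: the function `frost_indices`, the 0-based day indexing, the mid-year split at day index 182 and the handling of years without frost are all assumptions made for illustration.

```python
import numpy as np

def frost_indices(tmin_surface, tmin_air, threshold=0.0, split_day=182):
    """Return (LFD, FFD, FFP, FD) for one year of daily minimum temperatures.

    tmin_surface: daily minimum skin temperature (used for LFD, FFD, FFP)
    tmin_air:     daily minimum 2 m temperature (used for FD)
    Day indices are 0-based; split_day separates the two halves of the year.
    """
    ts = np.asarray(tmin_surface, dtype=float)
    ta = np.asarray(tmin_air, dtype=float)
    frost = ts < threshold
    first_half = np.flatnonzero(frost[:split_day])
    second_half = np.flatnonzero(frost[split_day:]) + split_day
    # Assumption: with no frost in a half-year, the bound falls outside the year
    lfd = first_half[-1] if first_half.size else -1
    ffd = second_half[0] if second_half.size else ts.size
    ffp = ffd - lfd - 1  # number of days strictly between LFD and FFD
    fd = np.count_nonzero(ta < threshold)  # annual count of frost days
    return int(lfd), int(ffd), int(ffp), int(fd)

# Synthetic year: frost on days 0-99 and 300-364, air 1 degree warmer than surface
ts = np.full(365, 5.0)
ts[:100] = -2.0
ts[300:] = -3.0
ta = ts + 1.0
lfd, ffd, ffp, fd = frost_indices(ts, ta)
```

Note that the FFP counts a single continuous window, while the FD counts every day below the threshold, so the two indices need not sum to the length of the year.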

#### 2.3.2. Data Evaluation

Statistical indices were used for quantitative analysis of the performance of ERA5-Land in estimating the daily minimum *Ts* and daily minimum *Ta*, and four statistical indices were used in this study as follows:

$$R^2 = \left[\frac{\text{cov}(R, O)}{\sigma_R \sigma_O}\right]^2\tag{1}$$

$$NSE = 1 - \frac{\sum_{i=1}^{n} \left(R_i - O_i\right)^2}{\sum_{i=1}^{n} \left(O_i - \overline{O}\right)^2} \tag{2}$$

$$MBE = \frac{1}{n} \sum_{i=1}^{n} \left( R_i - O_i \right) \tag{3}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( R_i - O_i \right)^2} \tag{4}$$

where *R*<sup>2</sup>, *NSE*, *MBE* and *RMSE* are the coefficient of determination, Nash–Sutcliffe efficiency coefficient, mean bias error and root mean square error, respectively. *R<sub>i</sub>* and *O<sub>i</sub>* represent the estimated (reanalysis) and observed data at time *i*, respectively, and *Ō* is the mean of the observed data. *n* is the total number of time steps. cov(*R*,*O*) is the covariance of the estimated (reanalysis) and observed data, and *σ<sub>R</sub>* and *σ<sub>O</sub>* are the standard deviations of the estimated (reanalysis) and observed data, respectively.
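As a worked illustration of Equations (1)–(4), the sketch below computes the four indices with NumPy for a pair of series. The function name `evaluation_indices` and the synthetic values are hypothetical and are not taken from the study.

```python
import numpy as np

def evaluation_indices(r, o):
    """Compute R2, NSE, MBE and RMSE for estimated (r) and observed (o) data."""
    r = np.asarray(r, dtype=float)
    o = np.asarray(o, dtype=float)
    r2 = np.corrcoef(r, o)[0, 1] ** 2                                # Eq. (1)
    nse = 1.0 - np.sum((r - o) ** 2) / np.sum((o - o.mean()) ** 2)   # Eq. (2)
    mbe = np.mean(r - o)                                             # Eq. (3)
    rmse = np.sqrt(np.mean((r - o) ** 2))                            # Eq. (4)
    return {"R2": r2, "NSE": nse, "MBE": mbe, "RMSE": rmse}

# Synthetic example: the estimate has a constant cold bias of 1 degree C
obs = np.array([-3.0, -1.0, 0.0, 2.0, 4.0])
rea = obs - 1.0
stats = evaluation_indices(rea, obs)
```

In this example *R*<sup>2</sup> equals 1 even though every estimate is 1 °C too cold, which is why a bias measure such as *MBE* is reported alongside the correlation-based index.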

#### 2.3.3. Trend Analysis

In this study, the non-parametric Mann–Kendall (MK) test [63–66] was used to analyze the trend and significance level of the *Ts*, *Ta*, FFP and FD.

$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \text{sgn}(x_j - x_i) \tag{5}$$

$$\text{sgn}(x_j - x_i) = \begin{cases} 1 & (x_j - x_i) > 0 \\ 0 & (x_j - x_i) = 0 \\ -1 & (x_j - x_i) < 0 \end{cases} \tag{6}$$

$$\text{var}(S) = \frac{n(n-1)(2n+5) - \sum_{k=1}^{m} t_k(t_k - 1)(2t_k + 5)}{18} \tag{7}$$

$$Z_c = \begin{cases} (S-1) / \sqrt{\text{var}(S)} & S > 0 \\ 0 & S = 0 \\ (S+1) / \sqrt{\text{var}(S)} & S < 0 \end{cases} \tag{8}$$

where *S* is the test statistic, *n* is the length of the dataset, *x<sub>i</sub>* and *x<sub>j</sub>* are the sequential data values at times *i* and *j*, *m* is the number of tied groups and *t<sub>k</sub>* is the number of data values in the *k*th tied group; a tied group is a set of sample data having the same value. *Z<sub>c</sub>* is the standardized test statistic, and positive and negative values of *Z<sub>c</sub>* indicate increasing and decreasing trends, respectively. If |*Z<sub>c</sub>*| > *Z*<sub>1−α/2</sub>, the trend is statistically significant; otherwise, the trend is not statistically significant. Trends were tested at a specified significance level *α*, and *α* = 0.05 (95% confidence level) was applied in this study.
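Equations (5)–(8) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `mann_kendall` is hypothetical, and the significance decision is expressed through the equivalent two-sided p-value of the standard normal distribution instead of a lookup of the critical value.

```python
import math
import numpy as np

def mann_kendall(x, alpha=0.05):
    """Mann-Kendall trend test; returns (S, Zc, p-value, significant)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Eq. (5)-(6): sum of signs over all pairs i < j
    s = 0.0
    for i in range(n - 1):
        s += np.sign(x[i + 1:] - x[i]).sum()
    # Eq. (7): variance of S with the correction for tied groups
    _, counts = np.unique(x, return_counts=True)
    ties = sum(int(t) * (int(t) - 1) * (2 * int(t) + 5) for t in counts if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18.0
    # Eq. (8): continuity-corrected standardized statistic
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value; p < alpha is equivalent to |Zc| > Z_(1 - alpha/2)
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p, p < alpha

# Synthetic strictly increasing series of length 10
s, z, p, significant = mann_kendall(np.arange(10.0))
```

For gridded data the same test would be applied independently to the annual series of each grid cell.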

For the measurement of the trend in variation, the Sen's slope method [65–67] was used to analyze the slope of the variation, and the slope is expressed as follows:

$$\beta = \text{Median}\left(\frac{x_j - x_i}{j - i}\right), \quad 1 \le i < j \le n \tag{9}$$

where *β* is the slope of the data variation; a positive *β* denotes an increasing trend, while a negative *β* denotes a decreasing trend.
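A compact sketch of Sen's slope is given below, taking the median of the pairwise slopes over all pairs with *i* < *j*. The function name `sens_slope` and the sample series are hypothetical; for an annual series, multiplying the result by 10 gives a rate per decade in the same form as the d/10a values reported in this study.

```python
import numpy as np

def sens_slope(x):
    """Sen's slope: median of (x_j - x_i) / (j - i) over all pairs i < j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)  # index pairs with i < j
    return float(np.median((x[j] - x[i]) / (j - i)))

# Synthetic annual series (e.g., FFP in days); values are illustrative only
series = np.array([100.0, 101.0, 103.0, 102.0, 105.0, 106.0])
slope_per_year = sens_slope(series)
slope_per_decade = 10.0 * slope_per_year
```

Because the estimator is a median, a single anomalous year changes only a minority of the pairwise slopes and therefore has little effect on the result.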

#### **3. Results**

#### *3.1. Performance of ERA5-Land*

The daily minimum GST from DMDC (V3.0) was used to validate the daily minimum *Ts* from ERA5-Land, and the daily minimum TEM from DMDC (V3.0) was used to validate the daily minimum *Ta* from ERA5-Land. Figures 2 and 3 show the statistical indices between the daily minimum *Ts* and *Ta* from ERA5-Land and the daily minimum GST and TEM from DMDC (V3.0) at 822 national surface weather stations in China, respectively.

**Figure 2.** Statistical indices of the daily minimum *Ts* from ERA5-Land against observed daily minimum GST from DMDC (V3.0) at 822 national surface weather stations in China. (**a**) *R*<sup>2</sup>, coefficient of determination; (**b**) *NSE*, Nash–Sutcliffe efficiency coefficient; (**c**) *MBE*, mean bias error; (**d**) *RMSE*, root mean square error.

As can be seen from Figures 2a and 3a, the correlation between the ERA5-Land simulations and the station observations of daily minimum surface and air temperatures is good, with an *R*<sup>2</sup> greater than 0.6 at all 822 national surface weather stations across China. Overall, the performance of ERA5-Land in simulating daily minimum *Ts* and *Ta* is better in East China than in West China and better in flat regions than in high-altitude regions, and its performance for daily minimum *Ts* is better than that for daily minimum *Ta*. Meanwhile, Figures 2 and 3 show a significant underestimation of daily minimum *Ts* and *Ta* by ERA5-Land. This underestimation shows a regional pattern and is mainly concentrated in regions with high altitude and complex terrain, such as the southeastern Qinghai–Tibet Plateau. The regional underestimation of daily minimum *Ts* and *Ta* from ERA5-Land may have a negative impact on the quantification of local trends and variations in FFP and FD.

**Figure 3.** Statistical indices of the daily minimum *Ta* from ERA5-Land against observed daily minimum TEM from DMDC (V3.0) at 822 national surface weather stations in China. (**a**) *R*<sup>2</sup>, coefficient of determination; (**b**) *NSE*, Nash–Sutcliffe efficiency coefficient; (**c**) *MBE*, mean bias error; (**d**) *RMSE*, root mean square error.

#### *3.2. Spatial and Temporal Variation of Annual Mean Minimum Ts and Ta*

#### 3.2.1. Annual Mean Minimum *Ts*

To quantify the trends and variations in daily minimum *Ts* and *Ta* across China from 1950 to 2020, statistics and trend tests were conducted on the annual mean minimum *Ts* and *Ta*, as shown in Figure 4. Figure 4a shows that the spatial distribution of the annual mean minimum *Ts* has significant latitudinal zonality and altitudinal dependence. The annual mean minimum *Ts* decreases with increasing altitude in high-altitude regions such as the Qinghai–Tibet Plateau, the Tianshan Mountains and the Altai Mountains, while it decreases with increasing latitude in East China and other regions.

The MK trend test (Figure 4b) shows that the annual mean minimum *Ts* has a significant increasing trend across most of China, although the degree of increase varies greatly between regions. The exceptions are the borderlands in West China, the eastern Qinghai–Tibet Plateau and the Yunnan–Kweichow Plateau, where there is no significant increasing or decreasing trend. As can be seen from Figure 4c, Northeast China, northern Xinjiang, North China and the central and southeastern Qinghai–Tibet Plateau were the regions with the largest rates of increase in annual mean minimum *Ts* in China. In particular, the rate of increase in annual mean minimum *Ts* in Northeast China exceeded 0.2 °C/10a, with most regions experiencing increases of between 0.2 °C/10a and 0.53 °C/10a, indicating that Northeast China was the region with the most significant increase in annual mean minimum *Ts* in China. In addition, Figure 4c also shows that individual regions, such as the eastern and southern margins of the Tianshan Mountains, parts of the eastern Qinghai–Tibet Plateau and parts of northwest Yunnan Province, showed decreasing trends in annual mean minimum *Ts*, but for the country as a whole, the annual mean minimum *Ts* was generally increasing rapidly.

**Figure 4.** Spatial distribution of annual mean minimum *Ts* and *Ta* across China during 1950~2020. (**a**) Annual mean minimum *Ts*, (**b**) trends of annual mean minimum *Ts* based on the Mann–Kendall trend test (95% confidence level), (**c**) Sen's slope of the annual mean minimum *Ts*, (**d**) annual mean minimum *Ta*, (**e**) trends of annual mean minimum *Ta* based on the Mann–Kendall trend test (95% confidence level), (**f**) Sen's slope of the annual mean minimum *Ta*.

#### 3.2.2. Annual Mean Minimum *Ta*

Similar to the spatial distribution pattern of the multi-year average annual mean minimum *Ts*, the spatial distribution of the annual mean minimum *Ta* also has obvious latitudinal zonality and altitudinal dependence (Figure 4d). Except for high-altitude regions such as the Qinghai–Tibet Plateau, the Tianshan Mountains and the Altai Mountains, the annual mean minimum *Ta* in East China decreases with increasing latitude, and the Tarim Basin is the region with the highest annual mean minimum *Ta* in Northwest China due to its relatively closed geographical environment and geological conditions.

Except for the periphery of the Qinghai–Tibet Plateau and parts of the central Tarim Basin, where there is no significant trend in annual mean minimum *Ta*, the rest of China shows a significant increasing trend (Figure 4e), although the degree of increase varies considerably from place to place. As can be seen from Figure 4f, Northeast China, northern Xinjiang, North China and the eastern and central Qinghai–Tibet Plateau were the regions with the greatest rate of increase in annual mean minimum *Ta* across China. This was especially true of Northeast China, where the rate of increase in annual mean minimum *Ta* was greater than 0.2 °C/10a, with most regions experiencing increases ranging from 0.2 °C/10a to 0.53 °C/10a, indicating that Northeast China was the region with the most significant increase in annual mean minimum *Ta* across China. Although a few regions in Sichuan Province and the Turpan Basin showed a decreasing trend in annual mean minimum *Ta*, nationally, the annual mean minimum *Ta* was generally increasing rapidly. In addition, as can be seen from Figure 4c,f, there was considerable spatial consistency in the variability of the annual mean minimum *Ts* and *Ta* across China.

#### *3.3. Spatial and Temporal Variation of FFP*

#### 3.3.1. Spatial Distribution

Figure 5a shows the spatial distribution of the multi-year average FFP across China from 1950 to 2020, which was characterized by obvious latitudinal zonality and altitudinal dependence. The high-altitude mountainous regions of the Qinghai–Tibet Plateau, the Qilian Mountains, the Tianshan Mountains and the Altai Mountains were the regions with the smallest FFP. In the northern Tibetan Plateau and the Kunlun Mountains in the northwestern part of the Qinghai–Tibet Plateau, FFP was even less than 10 days, so there was almost no frost-free period throughout the year. Despite the high altitude of the Tsaidam Basin in the northeastern Qinghai–Tibet Plateau, its relatively enclosed geography has resulted in a much higher FFP in the basin than in the rest of the Qinghai–Tibet Plateau, making it an exceptional region on the plateau. Except for the high-altitude regions, FFP in East and Northwest China largely decreased with increasing latitude. Provinces in the tropical monsoon climate zone, such as Yunnan, Guangxi, Guangdong, Taiwan and Hainan, have very high FFP, all greater than 350 days. The Sichuan Basin was also a region with very high FFP, with most regions exceeding 300 days. Overall, the spatial distribution of FFP was consistent with the spatial variation of multi-year average surface temperatures in China.
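How FFP and FD values of this kind might be derived from a daily minimum temperature series for one grid cell and one year can be sketched as follows. This is a minimal illustration and not the study's implementation. It assumes that a frost day is a day whose daily minimum temperature is at or below 0 °C and that FFP is the longest unbroken run of frost-free days in the year. The function names, the default threshold and the synthetic series are hypothetical, and the definitions given in the methods section take precedence where they differ.

```python
def frost_days(tmin, threshold=0.0):
    """Count days whose daily minimum temperature is at or below threshold.

    Assumption: a frost day is defined by tmin <= threshold (0 degC here).
    """
    return sum(1 for t in tmin if t <= threshold)


def frost_free_period(tmin, threshold=0.0):
    """Length of the longest run of consecutive days with tmin above threshold.

    Assumption: FFP is taken as the longest unbroken frost-free run.
    """
    longest = current = 0
    for t in tmin:
        # Extend the current run on a frost-free day, otherwise reset it.
        current = current + 1 if t > threshold else 0
        longest = max(longest, current)
    return longest


# Hypothetical year: a cold start, a warm season, then a cold end.
tmin_year = [-5.0] * 100 + [5.0] * 200 + [-5.0] * 65
print("FD  =", frost_days(tmin_year), "days")
print("FFP =", frost_free_period(tmin_year), "days")
```

Applying both functions to every grid cell and year would give the annual fields from which multi-year averages such as those in Figure 5a are formed.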

**Figure 5.** FFP across China during 1950~2020. (**a**) Spatial distribution of multi-year average FFP; (**b**) standard deviation of FFP; (**c**) trends of FFP based on the Mann–Kendall trend test; (**d**) Sen's slope of FFP.

The standard deviation of FFP from 1950 to 2020 (Figure 5b) shows that the interannual variability of FFP differed significantly across China. The eastern and southern parts of the Qinghai–Tibet Plateau, the regions around the Tsaidam Basin and the Sichuan Basin, and the southern edge of the subtropical monsoon climate zone were regions of high interannual variability in FFP. In the eastern and southern parts of the Qinghai–Tibet Plateau in particular, the standard deviation of FFP was greater than 20 days, which indirectly indicates that the annual mean surface temperature in these regions was more variable. The standard deviation of FFP in other regions of China varied little and did not show a clear latitudinal zonality.

#### 3.3.2. Temporal Variations

Interannual trends in FFP across China from 1950 to 2020 were quantified at the 95% confidence level based on the MK trend test and Sen's slope method, as shown in Figure 5c,d. As can be seen from Figure 5c, there were significant regional differences in the interannual trends of FFP across China. Overall, North China, Northeast China and Zhejiang Province were the regions where the significant increase in FFP was more concentrated. The outer Tarim Basin, the Junggar Basin and the Turpan Basin were also regions with a relatively high concentration of significant increases in FFP. In addition, the FFP in some regions of the Qinghai–Tibet Plateau and South China also showed a significant increasing trend, but did not show a clear regional pattern. Only a very few regions showed a significant decreasing trend in FFP, such as parts of southern Sichuan Province and parts of northern Yunnan Province. A comparison of Figures 5c and 4b shows that the interannual trends in FFP and the annual mean minimum *Ts* across China were not entirely consistent.
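The two statistics named above can be sketched for a single grid cell as follows. This is a minimal, standard-library illustration of the Mann–Kendall test (normal approximation with tie correction) and Sen's slope. The function names and the synthetic FFP series are hypothetical, and the study's own software may differ in details such as the handling of ties or autocorrelation.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(values):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test."""
    n = len(values)
    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = sum((b > a) - (b < a) for a, b in combinations(values, 2))
    # Tie correction: each group of t equal values reduces the variance of S.
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in tie_counts.values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p


def sen_slope(years, values):
    """Median of all pairwise slopes (Sen's slope), in units per year."""
    slopes = [(values[j] - values[i]) / (years[j] - years[i])
              for i, j in combinations(range(len(values)), 2)]
    return median(slopes)


# Hypothetical FFP series: a 1.25 d/10a trend plus a deterministic wobble.
years = list(range(1950, 2021))
ffp = [200 + 0.125 * (y - 1950) + 3 * math.sin(y) for y in years]
s, z, p = mann_kendall(ffp)
print(f"S={s}, Z={z:.2f}, p={p:.4f}, significant at 95%: {p < 0.05}")
print(f"Sen's slope = {10 * sen_slope(years, ffp):.2f} d/10a")
```

A cell would be mapped as significantly increasing or decreasing when `p < 0.05`, with the sign of `Z` giving the direction, and the Sen's slope multiplied by 10 gives a rate in d/10a comparable to the values reported in Figure 5d.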

The decadal rates of variation in FFP (Figure 5d) reveal large differences in the degree of interannual variability in FFP across China. Most regions of China showed increasing trends in FFP, with rates of variation ranging from 0 to 6.2 d/10a, except for parts of western Inner Mongolia, northern Gansu Province, the Kunlun Mountains, the central and eastern Qinghai–Tibet Plateau and northern Yunnan Province. Overall, the rapid increase in FFP was the main feature of the variations in FFP across China during 1950~2020.

#### *3.4. Spatial and Temporal Variation of FD*

#### 3.4.1. Spatial Distribution

Similar to the spatial distribution of FFP, the spatial distribution of the multi-year average FD across China from 1950 to 2020 also showed a clear latitudinal zonality and altitudinal dependence (Figure 6a). The FD was greatest in the high-altitude mountainous regions of the Qinghai–Tibet Plateau, the Qilian Mountains, the Tianshan Mountains and the Altai Mountains, especially in the Kunlun Mountains in the northwestern part of the Qinghai–Tibet Plateau, where the FD was even greater than 350 days. Despite its high altitude, the relatively enclosed geography of the Tsaidam Basin results in much lower FD in the basin than in other parts of the Tibetan Plateau, making the Tsaidam Basin a relatively special region on the Qinghai–Tibet Plateau. Apart from the high-altitude mountains, the FD in East China and Northwest China generally increased with increasing latitude. Provinces in the tropical monsoon climate zone, such as Yunnan, Guangxi, Guangdong, Taiwan and Hainan, have very low FD, all less than 10 days. The Sichuan Basin was also a region with very few FD, with most regions having less than 10 days. Overall, the spatial distribution of FD was consistent with the spatial variation of the annual mean minimum air temperature and was the opposite of the spatial distribution of FFP in China.

The standard deviation of FD from 1950 to 2020 (Figure 6b) shows that there were significant differences in the spatial variation of FD across China. The central Qinghai–Tibet Plateau, the region around the Tsaidam Basin and the region around the Sichuan Basin were regions with large interannual variability in FD, which indirectly indicates greater variability in air temperature in these regions. The standard deviation of FD in the rest of China varied little and, like that of FFP, showed no clear latitudinal zonality.

**Figure 6.** FD across China during 1950~2020. (**a**) Spatial distribution of multi-year average annual FD; (**b**) standard deviation of FD; (**c**) trends of FD based on the Mann–Kendall trend test; (**d**) Sen's slope of FD.

#### 3.4.2. Temporal Variations

Interannual trends in FD across China from 1950 to 2020 were quantified at the 95% confidence level based on the MK trend test and Sen's slope method, as shown in Figure 6c,d. As can be seen from Figure 6c, the trends of FD across China showed obvious regional differences. Overall, the FD in the peripheral alpine regions of the Qinghai–Tibet Plateau, Sichuan Basin, most of Yunnan Province, southern Guangxi Province, southern Guangdong Province, Taiwan Province and Hainan Province did not show a significant decreasing or increasing trend from 1950 to 2020. Except for the above-mentioned regions, almost all other regions in China showed a significant decreasing trend in FD, and only a very few regions showed a significant increasing trend in FD, indicating that the trends in FD across China were dominated by significant decreasing trends, which was consistent with the warming of air temperature across China in the past decades.

Despite the clear regionalization of FD trends, the degree of increase or decrease in FD varied widely across China. Except for some alpine regions on the periphery of the Qinghai–Tibet Plateau, where an increasing trend in FD is seen, most regions in China showed a decreasing trend in FD, with rates of variation in FD ranging from 0 to −6.7 d/10a. The central and western parts of the Qinghai–Tibet Plateau, the middle and lower reaches of the Yangtze River, the Huaihe River Basin, northern Inner Mongolia and northern Xinjiang were the regions with the largest FD reduction rates.

#### *3.5. Regional Interannual Trends and Variations in FFP and FD*

The interannual trends and variations in FFP and FD for different climate zones were calculated by averaging the gridded annual FFP and annual FD within each climate zone. Figure 7 shows the interannual trends and variations in FFP and FD across China and different climate zones from 1950 to 2020. As can be seen from Figure 7a, the annual FFP of China and of the different climate zones showed a significant increasing trend; the increase in FFP was largest in the SUMZ and smallest in the TRMZ. The national average rate of increase in FFP was 1.25 d/10a, and the rate of increase in FFP in the SUMZ was 1.73 d/10a. In order, the increase in annual FFP was SUMZ > TEMZ > China > TECZ > ALCZ > TRMZ. Although the annual FFP showed an increasing trend across China and different climate zones, the increase was phased. Figure 7a shows that FFP decreased for both China and the different climate zones from 1950 until the 1970s and 1980s, and then maintained a continuous, fluctuating, increasing trend to the present. The phased variations in annual FFP across China were consistent with the phased variations in surface temperature across China, and also with the apparent warming of China from the 1970s and 1980s onwards [68,69].
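The zonal averaging and the slope *k* reported in Figure 7 can be illustrated with a short sketch. It assumes an unweighted mean over the grid cells of each zone and an ordinary least-squares slope for *k*. Both are assumptions made for illustration (on a regular latitude–longitude grid an area-weighted mean may be preferable), and the cell identifiers, values and function names are hypothetical.

```python
from statistics import mean


def zonal_mean_series(grid_series, zone_of_cell, zone):
    """Average the annual series of all grid cells that belong to `zone`."""
    members = [series for cell, series in grid_series.items()
               if zone_of_cell[cell] == zone]
    # zip(*members) groups the values of all member cells year by year.
    return [mean(vals) for vals in zip(*members)]


def linear_slope(years, values):
    """Ordinary least-squares slope of values against years (per year)."""
    y_bar, v_bar = mean(years), mean(values)
    num = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den


# Hypothetical annual FFP (days) for three grid cells in two climate zones.
years = [2000, 2001, 2002, 2003]
grid_series = {
    "cell_1": [100, 101, 102, 103],
    "cell_2": [200, 201, 202, 203],
    "cell_3": [300, 300, 301, 301],
}
zone_of_cell = {"cell_1": "SUMZ", "cell_2": "SUMZ", "cell_3": "TEMZ"}

sumz = zonal_mean_series(grid_series, zone_of_cell, "SUMZ")
k_per_decade = 10 * linear_slope(years, sumz)
print("SUMZ mean series:", sumz)
print(f"k = {k_per_decade:.2f} d/10a")
```

Multiplying the per-year slope by 10 expresses *k* in d/10a, the unit used for the rates quoted in this section.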

**Figure 7.** Interannual trends and variations of FFP and FD for China and different climate zones in China during 1950~2020. *k* represents the degree of variation in annual FFP and FD, respectively, i.e., the slope of the data variation. (**a1**–**a6**) Annual FFP; (**b1**–**b6**) Annual FD.

Contrary to the trends in annual FFP, the annual FD showed a clear decreasing trend across China and in all climate zones except the TRMZ, and the TEMZ was the climate zone with the greatest decrease in FD. Despite large interannual fluctuations, the FD of the TRMZ did not show a significant decreasing trend from 1950 to 2020. The national average rate of decrease in FD was −1.41 d/10a, and the rate of decrease in FD in the TEMZ was −1.72 d/10a. In order, the degree of decrease in annual FD was TEMZ > TECZ > SUMZ > China > ALCZ > TRMZ. Similarly, although there was a general decreasing trend in FD across China and in different climate zones, the decrease was phased. Figure 7b shows that FD increased for both China and the different climate zones from 1950 until the 1970s and 1980s, and then maintained a continuous, fluctuating, decreasing trend to the present. The phased variations in FD across China were consistent with the phased variations in air temperature across China, and also with the marked warming of China from the 1970s and 1980s onwards.

Among the different climate zones in China, the SUMZ was the zone with the most significant increase in FFP, while the TEMZ was the zone with the most significant decrease in FD, indicating that the increase in FFP and the decrease in FD were not spatially consistent across China.

#### *3.6. Regional Variation Differences in FFP and FD*

To explore how strongly the interannual variations of FFP and FD differ across China, the coefficient of variation (Cv) of FFP and FD was calculated. As can be seen from Figure 8, there were significant regional differences in the Cv of FFP and FD across China. Except for the Tarim Basin, high-altitude regions such as the Qinghai–Tibet Plateau, the Tianshan Mountains and the Altai Mountains were the regions with the largest Cv of FFP, with Cv ranging from 0.3 to 8.4. In particular, the Cv of FFP was even greater than 1 in the Kunlun Mountains in the northwestern part of the Qinghai–Tibet Plateau, while the Cv of FFP in the rest of China varied little, ranging from 0 to 0.3. Similarly, the Cv of FD across China also had obvious regional differences, with the Sichuan Basin, South China and southern Tibet being the regions with the largest Cv of FD, ranging from 0.3 to 8.4. In particular, the regions near the Tropic of Cancer had the largest Cv of FD, with values greater than 1.

**Figure 8.** Comparison of the coefficient of variation (Cv) of FFP and FD across China from 1950~2020. (**a**) Cv of FFP; (**b**) Cv of FD.

The obvious regional differences in the Cv of FFP and FD across China indicate that the higher the altitude and the lower the temperature, the greater the interannual variability of FFP, while the higher the temperature, the greater the interannual variability of FD. In addition, since there was little FD in the tropical monsoon climate zone in China, the Cv of FD in this region can be ignored.
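Because Cv is the standard deviation divided by the mean, it becomes large wherever the mean is close to zero, which is one reason the largest values appear where FFP or FD is very small. The sketch below illustrates this with two hypothetical FFP series that share the same standard deviation. The use of the population standard deviation and the function name are assumptions, not details taken from the study.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Cv = standard deviation / mean (population standard deviation assumed)."""
    m = mean(values)
    if m == 0:
        return float("nan")  # Cv is undefined when the mean is zero
    return pstdev(values) / m


# Hypothetical annual FFP (days): same spread, very different means.
ffp_high_mountain = [0, 10, 0, 10]    # mean of 5 days
ffp_lowland = [295, 305, 295, 305]    # mean of 300 days

print("Cv, high mountain:", coefficient_of_variation(ffp_high_mountain))
print("Cv, lowland:      ", round(coefficient_of_variation(ffp_lowland), 4))
```

Both series have a standard deviation of 5 days, yet the Cv differs by a factor of 60, so Cv maps are best read together with the maps of the multi-year averages.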

Although the large regional variability exhibited by FFP and FD is related to regional climatic characteristics, the variability in FFP and FD differs from the latitudinal zonality and altitudinal dependence of climate. In particular, the spatial characteristics of FD variability are hardly latitudinally zoned. The large regional variability of FFP and FD may have important implications for regional agricultural production. For example, in the subtropical monsoon region of southern China, an increase in FD may damage spring crops, which in turn may reduce crop yields. Therefore, to a certain extent, large regional variability of FFP and FD may indicate unstable regional climatic characteristics and regions where agrometeorological disasters occur more frequently.

#### **4. Discussion**

#### *4.1. Uncertainty of ERA5-Land on the Quantification of FFP and FD*

As can be seen from Figures 2a and 3a, the daily minimum *Ts* and *Ta* from ERA5-Land correlate reasonably well with the daily minimum GST and TEM from DMDC (V3.0). However, as can also be seen from Figures 2b–d and 3b–d, there is a clear regional underestimation of the daily minimum *Ts* and *Ta* from ERA5-Land. Based on the four evaluation indices, *R*<sup>2</sup>, *NSE*, *MBE* and *RMSE*, ERA5-Land is generally more applicable to the simulation of surface and air temperatures in East China than in West China and the Qinghai–Tibet Plateau, more applicable in flatter regions than in regions with high altitude and complex terrain, and more applicable in regions with dense weather stations than in regions with sparse weather stations. In particular, the southeast Qinghai–Tibet Plateau is the region where ERA5-Land underestimates daily minimum *Ts* and *Ta* most significantly (Figures 2 and 3).
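A series can correlate well with observations and still be biased, which is why correlation-type and error-type indices are read together. The sketch below computes the four indices under common textbook definitions, which are assumed here: *R*<sup>2</sup> as the squared Pearson correlation, *NSE* as the Nash–Sutcliffe efficiency, *MBE* as the mean of simulated minus observed values (so a negative value means underestimation) and *RMSE* as the root mean square error. The function name and the sample values are hypothetical.

```python
import math
from statistics import mean


def evaluation_indices(obs, sim):
    """Return R2, NSE, MBE and RMSE for paired observed and simulated values."""
    o_bar, s_bar = mean(obs), mean(sim)
    errors = [s - o for o, s in zip(obs, sim)]  # simulated minus observed
    ss_obs = sum((o - o_bar) ** 2 for o in obs)
    ss_sim = sum((s - s_bar) ** 2 for s in sim)
    cov = sum((o - o_bar) * (s - s_bar) for o, s in zip(obs, sim))
    return {
        "R2": cov ** 2 / (ss_obs * ss_sim),
        "NSE": 1.0 - sum(e ** 2 for e in errors) / ss_obs,
        "MBE": mean(errors),
        "RMSE": math.sqrt(mean([e ** 2 for e in errors])),
    }


# Hypothetical daily minimum temperatures (degC): the simulated values
# follow the observations exactly but are 1 degC too cold.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [0.0, 1.0, 2.0, 3.0]
for name, value in evaluation_indices(observed, simulated).items():
    print(f"{name}: {value:.2f}")
```

In this constructed case *R*<sup>2</sup> is 1 while *MBE* is −1 °C and *NSE* is only 0.2, mirroring the situation described above in which a good correlation coexists with a systematic cold bias.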

Studies have shown that the applicability of reanalysis datasets is poor in regions with sparse meteorological stations and complex terrain, such as Northwest China and the Qinghai–Tibet Plateau [43,70–73]. A reanalysis dataset is an important product of numerical modelling, obtained by assimilating quality-controlled observations (including ground, sounding, satellite, radar, buoy, aircraft, ship and other observations) into global or regional numerical model calculations [74,75]. Therefore, the accuracy of a reanalysis dataset in simulating land variables is reduced when ground-based or airborne observations are scarce. In addition, given the complex terrain of the southeastern Qinghai–Tibet Plateau, the terrain height of the reanalysis dataset may differ from the actual altitude, leading to a deviation in the surface and air temperatures between the reanalysis and the in situ observed data [43]. In regions with complex terrain, meteorological stations are generally built at lower elevations in flatter river valleys [36], while the terrain around the stations is highly undulating, with average elevations much higher than those of the stations. Because of the temperature lapse rate, the average surface and air temperatures over the larger surrounding region are therefore much lower than those at the stations [43].

Given the scarcity of meteorological stations at high altitudes on the southeastern Qinghai–Tibet Plateau, the underestimation of daily minimum surface and air temperatures in a reanalysis dataset such as ERA5-Land is inevitable. This reduces the simulation accuracy of the ERA5-Land reanalysis dataset there, which in turn reduces the accuracy with which FFP and FD can be quantified in this region. However, high-altitude mountainous regions such as the southeastern Qinghai–Tibet Plateau are not major crop production regions in China, so the uncertainty in the trends and variations in FFP and FD in this region based on ERA5-Land does not affect the accuracy of quantifying the trends and variations in FFP and FD in China as a whole.
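The size of the elevation effect can be illustrated with a one-line calculation. The sketch assumes a constant lapse rate of 6.5 °C per km, a commonly used standard-atmosphere value that is not taken from this study, and hypothetical elevations for a grid cell and a valley station. Actual near-surface lapse rates vary with season, time of day and terrain.

```python
def elevation_temperature_offset(grid_elev_m, station_elev_m,
                                 lapse_rate_c_per_km=6.5):
    """Temperature difference (grid minus station, degC) due to elevation alone.

    Assumption: temperature falls linearly with height at a constant lapse rate.
    """
    return -lapse_rate_c_per_km * (grid_elev_m - station_elev_m) / 1000.0


# Hypothetical case: the grid-cell mean elevation is 1000 m above the station.
offset = elevation_temperature_offset(grid_elev_m=4000.0, station_elev_m=3000.0)
print(f"Expected grid-minus-station offset: {offset:.1f} degC")
```

Under these assumptions a grid cell whose mean elevation lies 1000 m above a valley station would be about 6.5 °C colder than the station, which is the same sign as the underestimation reported for complex terrain.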

#### *4.2. Effect of Variations in FFP and FD on Agriculture Production*

As an important characteristic quantity of agro-climatic heat conditions, the FFP plays a crucial role in agricultural production, especially in North China [31]. Generally, the longer the FFP, the longer the growth period of crops and the more favorable the conditions for agricultural production [30,31]. Under climate warming, the increase in FFP leads to an increase in heat resources, which can prolong the growth period of crops or increase agricultural biological yields per unit area [22,31,32,76]. Frost is a meteorological disaster that can cause serious damage to crop growth [34,77]. The decrease in FD reduces the risk of frost damage to crops and is extremely beneficial to agricultural production [34,55]. The decrease in FD in North China has also led to a further expansion of wheat cultivation northwards [78,79]. In summary, the increase in FFP and the decrease in FD have had three main effects on agricultural production: firstly, the increase in FFP has contributed to a longer crop growing period; secondly, the decrease in FD has reduced the risk of extreme frost damage to crops; and thirdly, the increase in FFP and the decrease in FD have contributed to an adjustment in the layout of agricultural production to adapt to climate change.

However, the increase in FFP and the decrease in FD across China are the result of climate warming [16], which is not entirely positive for agricultural production. Generally, crops are divided into thermophilic crops such as rice and maize and chimonophilous crops such as wheat and barley [80–82]; climate warming is beneficial to the growth of thermophilic crops but detrimental to the growth of chimonophilous crops [81–84]. Studies have shown that climate warming reduces the yields of chimonophilous crops such as barley on the Qinghai–Tibet Plateau, as it leads to a greater difference in saturated water vapor pressure, which in turn reduces the stomatal conductance of plant leaves, reduces photosynthesis and ultimately lowers crop yields [85]. Climate warming has led to a longer growing season for thermophilic crops, but also to a shorter growing season for chimonophilous crops [81,82]. Therefore, when planning the layout of agricultural production, it is necessary to expand the cultivation of thermophilic crops in accordance with the increase in FFP and the decrease in FD, while shifting the cultivation of chimonophilous crops to regions with lower temperatures, thus adjusting the layout of agricultural production more scientifically in order to better adapt to climate change.

#### **5. Conclusions**

To investigate the trends and variations in FFP and FD across China under climate change, this study carried out quantitative statistics and trend tests of the FFP and FD across China from 1950 to 2020 based on the ERA5-Land reanalysis dataset. Although ERA5-Land significantly underestimates daily minimum *Ts* and *Ta* in regions with high altitude and complex terrain, such as the southeastern Qinghai–Tibet Plateau, from a national perspective ERA5-Land has good applicability for quantifying the trends and variations in FFP and FD.

The study revealed that both the annual mean minimum *Ts* and *Ta* showed a significant increasing trend from 1950 to 2020, with particularly pronounced warming in Northeast China, North China and northern Xinjiang, which has a lasting impact on the variations in FFP and FD. The quantitative statistics and trend tests showed that both FFP and FD had significant latitudinal zonality and altitudinal dependence, i.e., FFP decreased with increasing latitude and altitude, while FD increased with increasing latitude and altitude. At the 95% confidence level, FFP generally showed an increasing trend and FD generally showed a decreasing trend across China, but there were large spatial differences between regions, suggesting that the increase in FFP and the decrease in FD were not entirely consistent in space.

For China as a whole, FFP showed a significant increasing trend with an average rate of 1.25 d/10a and a maximum rate in individual regions of 6.2 d/10a, while FD showed a significant decreasing trend with an average rate of −1.41 d/10a and a maximum rate of decrease in individual regions of −6.7 d/10a. Among the five major climate zones in China, the SUMZ was the region with the largest increase in FFP, with a rate of 1.73 d/10a, while the TEMZ was the region with the largest decrease in FD, with a rate of −1.72 d/10a. The variations in FFP and FD across China and in different climate zones were phased, which was consistent with the phased variations of surface and air temperatures across China over the past 70 years.

There is no denying that the variations in FFP and FD across China have a positive impact on agricultural production, but the adaptability of different crop species to climate change should also be considered when adjusting the layout of agricultural production to make it better and more scientifically adapted to climate change.

**Author Contributions:** Conceptualization, H.L. and C.H.; methodology, H.L. and Y.Y.; software, H.L. and G.L.; investigation, H.L., G.L. and C.H.; data curation, C.H. and R.C.; writing—original draft preparation, H.L.; writing—review and editing, G.L., C.H., Y.Y. and R.C.; funding acquisition, C.H., R.C. and Y.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Sciences Foundation of China (41971041), the Joint Research Project of Three-River Headwaters National Park, Chinese Academy of Sciences and the People's Government of Qinghai Province (LHZX-2020-11), the Sciences and Technology Plan Project of Gansu Province (21JR7RA056) and the Open Research Fund of the National Cryosphere Desert Data Center (2021kf09).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The ERA5-Land reanalysis dataset from the ECMWF used in this study can be accessed online (https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land?tab=overview/ (accessed on 1 March 2022)). The Daily Meteorological Dataset of basic meteorological elements of China National Surface Weather Stations provided by the China Meteorological Administration can be accessed online (http://data.cma.cn/ (accessed on 1 March 2022)).

**Acknowledgments:** The authors would like to thank the European Centre for Medium-Range Weather Forecasts (ECMWF) (https://cds.climate.copernicus.eu/ (accessed on 1 March 2022)) for providing the ERA5-Land data, and the authors would also like to thank all team members in Qilian Alpine Ecology and Hydrology Research Station, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Flood Management, Characterization and Vulnerability Analysis Using an Integrated RS-GIS and 2D Hydrodynamic Modelling Approach: The Case of Deg Nullah, Pakistan**

**Ijaz Ahmad <sup>1</sup>, Xiuquan Wang <sup>2,</sup>\*, Muhammad Waseem <sup>1</sup>, Muhammad Zaman <sup>3</sup>, Farhan Aziz <sup>2</sup>, Rana Zain Nabi Khan <sup>1</sup> and Muhammad Ashraf <sup>4,5</sup>**


**Abstract:** One-dimensional (1D) hydraulic models have been extensively used to conduct flood simulations for investigating flood depth and extent maps. However, the 1D models cannot simulate many other flood characteristics, such as flood velocity, duration, arrival time and recession time, when the flow is not restricted within the channel. These flood characteristics cannot be disregarded as they play an important role in developing flood mitigation and evacuation strategies. This study formulates a two-dimensional (2D) hydrodynamic model combined with a remote sensing (RS) and geographic information system (GIS) approach to generate additional flood characteristic maps that cannot be produced with 1D models. The model was applied to the transboundary river Deg Nullah in Pakistan to simulate an extreme flood event experienced in 2014. The flood extent images from the moderate resolution imaging spectroradiometer (MODIS) and observed flood extents were used to evaluate the model performance. Moreover, an entropy distance-based approach was proposed to facilitate the integrated multivariate flood vulnerability classification. The simulated 2D flood modeling results showed good agreement with the flood extents registered by MODIS and the observed ones. The northwest parts of Deg Nullah near Seowal, Dullam Kahalwan and Zafarwal were the most vulnerable areas due to high flood depths and prolonged flooding duration. High flood velocities, short flood arrival times, and prolonged flood duration and recession times were observed in the upper reach of Deg Nullah, making it the most susceptible, critical and vulnerable region to flooding events.

**Keywords:** HEC-RAS 2D model; flood characterization; flood vulnerability; flood hazard maps; Deg Nullah

**Citation:** Ahmad, I.; Wang, X.; Waseem, M.; Zaman, M.; Aziz, F.; Khan, R.Z.N.; Ashraf, M. Flood Management, Characterization and Vulnerability Analysis Using an Integrated RS-GIS and 2D Hydrodynamic Modelling Approach: The Case of Deg Nullah, Pakistan. *Remote Sens.* **2022**, *14*, 2138. https://doi.org/10.3390/rs14092138

Academic Editor: Luca Brocca

Received: 24 February 2022 Accepted: 27 April 2022 Published: 29 April 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Floods are considered the most devastating hazard all over the world. Floods claim more lives than any other hydrometeorological disaster [1] and negatively affect the socioeconomic development of many countries [2]. Moreover, the frequency and intensity of flood events are increasing in many regions around the world, making flooding a major concern for the sustainable development of vulnerable countries [3]. The main reason behind these unprecedented flood events is the rise in temperature and precipitation since the 1950s due to anthropogenic activities and increased rates of greenhouse gas emission [4]. Moreover, population growth and urbanization have increased impervious areas and, through reduced infiltration, runoff [5]. Flood events are becoming more frequent and severe in Pakistan due to population growth and increasing anthropogenic activities in the floodplains [6].

Pakistan faces almost one major flood event every three years [7] and drought-like conditions every six years [8], which is one of the main challenges to the country's sustainable economic development. Pakistan faced the worst flood event of its history in 2010 [9]. Scientists from the World Climate Research Program and World Meteorological Organization (WMO) stated that climate change was one of the main reasons behind this unprecedented sequence of weather events in Pakistan [10]. Moreover, a United Nations (UN) based scientific body stated that hot extremes, heat waves and heavy precipitation events are expected to be more frequent and intense in the future in this country [11]. WMO also stated that the flood event of 2010 fit the sequence predicted by climate experts and matched the projections of more frequent and intense climatic events due to global warming [12].

In recent years, remote sensing (RS) and geographic information systems (GIS) have played an important role in presenting and validating flood risk and hazard mapping results [13,14]. RS and GIS are considered effective geospatial tools for preparing flood characteristic maps and assessing flood impacts on environmental, social and economic aspects [15]. These geospatial tools have been extensively used to map the results of simulation models of past flood events, such as flood inundation extents and flood levels, and to identify the critical facilities at risk [16,17]. These tools also support the development of flood forecasting and early warning systems in flood-prone areas, which are helpful in flood hazard management and preparedness planning [18,19]. RS is an important tool for acquiring the data required to formulate and validate flood model results, to assess flood risk and to identify the infrastructure at risk (e.g., health and education facilities, transportation infrastructure, urban and commercial areas, etc.). GIS, in turn, helps in developing automated flood models and flood risk investigations (identification of urban areas, settlements and agricultural lands exposed to different stages of floods), generating flood inundation maps and conducting other statistical analyses that may be useful in developing flood disaster management strategies at various levels of flood events [20]. For instance, the analytical hierarchy process (AHP) method and GIS were utilized to create the flood hazard assessment map in the burned and urban areas of central Greece [21].

Assessment of flood risk and management practices is important in identifying the flood hazards, flood-prone areas and future mitigation strategies [22]. Understanding the characteristics of floods is essential in developing flood management action plans and analyzing the impact of various proposed management measures. In situ observations of flood events are the simplest way to interpret the flood risk [23]. However, in situ observations of floods are not available in many cases in developing countries. Moreover, flood analysis based on these observations is limited to specific flood events. Therefore, another way to conduct flood risk modeling studies is to use RS data [24,25]. However, flood events usually occur under cloudy conditions, limiting the application of remotely sensed data in the visible and infrared spectrum due to visibility issues [26]. Therefore, these observation maps cannot be used directly to predict future flood events and to analyze the efficacy of various structural measures in developing flood mitigation strategies. Numerical models are considered more suitable for the simulation and prediction of flood events. Moreover, in many previous studies, numerical models have been efficiently applied to simulate flood events and investigate the flood hazards to develop management strategies [27–29].

Both one-dimensional (1D) and two-dimensional (2D) numerical hydrodynamic models combined with RS and GIS approaches have been extensively used in numerous studies for simulating flood events and risk assessment [4,25,30]. Although 1D flood models can be used for simulating the water levels and discharges when the flow is restricted within the channels [31,32], these models present many limitations when applied to overflow conditions. Moreover, the methodologies used by 1D flood models are limited to generating flood hazard and depth maps. Flood depth maps are considered the primary input in evaluating flood impacts. However, additional flood characteristic maps, i.e., flood velocities, flood arrival times, flood duration and flood recession times, cannot be disregarded, as they play an important role in the detailed design and analysis stages of flood mitigation studies [20]. For instance, high flood velocities with a longer flood duration can damage flood mitigation structures and result in erosion and water pollution, which may cause environmental degradation and economic and life losses [33]. Awadallah et al. [34] used LiDAR datasets in the HEC-RAS 2D model to compare flood inundation maps based on bathymetric and topographic terrain models in Norway. Therefore, the use of 2D models becomes indispensable to address the above-mentioned issues.

In previous studies, several researchers have used 2D hydrodynamic models to assess flood risks and develop flood mitigation strategies. For instance, Santillan et al. [20] applied an integrated RS, GIS and 2D hydrodynamic modeling approach to generate flood characteristic maps, i.e., flood velocity, arrival time, flood duration and recession time, for improved flood disaster management in the Philippines. Quiroga et al. [31] used a 2D numerical model to perform an unsteady flow analysis and compared the model results with satellite imagery in the floodplains of the Bolivian Amazonia. In another study, an integrated 1D/2D hydrodynamic model was applied to generate flood hazard maps and velocities in the floodplains of the Musi river basin of Indonesia [35]. Bhandari et al. [4] employed a 2D hydrodynamic modeling approach to perform unsteady flow analysis and flood inundation mapping to outline the flood-prone areas in the lower reaches of the Brazos river basin, USA. Elkhrachy [36] used remote sensing images as independent variables in the HEC-RAS 2D simulation model to estimate the depths after the flash flood event of 24–26 April 2018 in New Cairo, Egypt. Moreover, since the release of the HEC-RAS 2D model, 2D hydrodynamic modeling approaches combined with RS and GIS have been applied in different regions of the world to investigate flood events and to develop mitigation strategies [37–42].

Even with extensive flood simulations using hydrodynamic models to generate flood hazard and depth maps, it is almost impossible for decision-makers to develop flood mitigation and management strategies without analyzing the vulnerability of the different elements at risk in the floodplains [43]. The concept of vulnerability has changed drastically across disaster management studies depending on their objectives. Consequently, several attempts have been made to define the term vulnerability. For instance, Cutter [44] defined vulnerability as a hazard that includes natural risks and the human response to these risks. However, Mitchell [45] stated that vulnerability is the degree of loss to a given element at risk (or set of elements) resulting from a given hazard at a given severity level. Moreover, Merz et al. [46] and Salami et al. [47] stated that vulnerability is a function of the elements at risk, exposure and susceptibility. The vulnerability of different elements (e.g., road networks, bridges, urban areas/settlements, etc.) at risk in the floodplains largely depends on water depth, flood velocities and duration of flooding [48]. Santillan et al. [20] categorized the levels of flood hazard for residential buildings into three classes based on maximum depths, i.e., low (<0.5 m depth), medium (0.5–1.5 m depth) and high (>1.5 m depth). However, most of the previous studies considered only flood extents and water depth, whereas flood variables such as flow velocities, flood duration and recession times were discussed in only a few studies [31,49]. For instance, Balijepalli and Oppong [50] suggested that flood velocities of less than 2–3 m/s do not affect the physical infrastructure (e.g., road networks, bridges and urban areas/settlements) in floodplains. However, to the best of the authors' knowledge, no study has been conducted that performs a multivariate flood vulnerability analysis by using a 2D hydrodynamic flood modeling approach. Moreover, no flood research has been carried out in Pakistan using an integrated RS-GIS and 2D hydrodynamic modeling approach together with vulnerability analysis to generate flood characteristics other than depth for developing flood management strategies.

Therefore, this study aims to simulate an extreme flood event with a return period of 200 years in Deg Nullah, Pakistan, using an integrated RS-GIS and 2D hydrodynamic modeling approach to generate additional flood characteristic maps, i.e., flood velocities, arrival times, duration and recession times. The study area is subject to frequent floods, which disrupt socio-economic activities. Moreover, a flood vulnerability analysis was performed to investigate the sensitivity of roads, railway tracks and urban areas/settlements at risk to different flood variables such as flood depths, velocities, arrival times, duration and recession times in the study area. The results of the present study are expected to help develop improved flood management strategies.

#### **2. Study Area**

Deg Nullah, a natural drain in the Rachna Doab of Punjab province, was analyzed to simulate the 2014 major flood event. Two streams, Divak and Basenter, join in Indian-administered Kashmir to form Deg Nullah before it enters Pakistan near the Lehri check post, located in the northeast of Narowal district. Deg Nullah is a straight twisted channel with wide and shallow cross-sections and uneven slopes. Deg Nullah often overflows during the monsoon season, as its banks are not high enough to pass large discharges. The study reach extends 75 km from Kingra-more to the Bambanwala-Ravi-Bedian (BRB) canal, as shown in Figure 1. Deg Nullah mostly remains dry but carries a large volume of flow during the monsoon season. The catchment of Deg Nullah receives most of its precipitation in summer (July to September) [51]. This heavy precipitation results from the combined effects of the monsoon precipitation regimes from the Arabian Sea and the Bay of Bengal, making the region meteorologically complex. Precipitation over the catchment of Deg Nullah causes high flow peaks during the monsoon season, and the deteriorated condition of the shallow banks intensifies the flooding problem and affects road networks, urban areas, settlements, agricultural lands, etc.

**Figure 1.** Location of the study area.

In 2014, Deg Nullah experienced one of the worst flood events in its history, which caused severe damage to the surrounding areas. A peak flood of 2050 m<sup>3</sup>/s with a return period of 200 years was observed in Deg Nullah on 6 September 2014 [52]. Water overflowed the banks of the Nullah and submerged Zafarwal city, Qila Ahmedabad town and 55 villages around Deg Nullah. This flood event displaced a large population from urban areas/settlements, damaged road networks through submergence and harmed standing crops over a large area. Moreover, more than 25 health centers and 104 government schools were submerged [53]. Residents of this area blamed the Deg Nullah authorities for not providing accurate information regarding the flood events. Therefore, accurate evaluations of the flood extents, depth, velocity, arrival and recession times are essential for developing improved flood management strategies in this region.

#### **3. Materials and Methods**

#### *3.1. Numerical Model*

This study used a 2D hydrodynamic model (HEC-RASv5) due to its various advantages over 1D models. For instance, 1D flood models are considered suitable when the flow is restricted within the channel. However, overflow conditions limit the applicability of 1D flood models in simulating flood characteristics and defining the flood flow directions. It is not always realistic to predefine the flow paths or directions in the floodplains, e.g., in flat areas with large variations in water levels. Moreover, 1D models do not provide any information regarding the velocity distribution when the water leaves the main channel for the floodplains. These limitations can be efficiently addressed by 2D hydrodynamic flood modeling. 2D flood models can simulate additional flood characteristics such as flood arrival times, flood velocities, flood duration and flood recession times [54]. Moreover, remotely sensed datasets have been extensively used with the HEC-RAS 2D model to simulate, validate and forecast flood events in different regions of the world [16,20,34,36].

Therefore, the 2D hydrodynamic model HEC-RAS 2D, developed by the USACE, was used to simulate the selected flood events in the Deg Nullah catchment area. This model solves either the full 2D Saint-Venant equations or the 2D diffusive equations. The full 2D Saint-Venant equations are given below:

$$\frac{\partial \zeta}{\partial t} + \frac{\partial p}{\partial x} + \frac{\partial q}{\partial y} = 0 \tag{1}$$

$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial x}\left(\frac{p^2}{h}\right) + \frac{\partial}{\partial y}\left(\frac{pq}{h}\right) = -\frac{n^2 p g\sqrt{p^2 + q^2}}{h^2} - gh\frac{\partial \zeta}{\partial x} + pf + \frac{\partial}{\rho\,\partial x}\left(h\tau\_{xx}\right) + \frac{\partial}{\rho\,\partial y}\left(h\tau\_{xy}\right) \tag{2}$$

$$\frac{\partial q}{\partial t} + \frac{\partial}{\partial y}\left(\frac{q^2}{h}\right) + \frac{\partial}{\partial x}\left(\frac{pq}{h}\right) = -\frac{n^2 q g\sqrt{p^2 + q^2}}{h^2} - gh\frac{\partial \zeta}{\partial y} + qf + \frac{\partial}{\rho\,\partial y}\left(h\tau\_{yy}\right) + \frac{\partial}{\rho\,\partial x}\left(h\tau\_{xy}\right) \tag{3}$$

where *ζ* represents the surface elevation (m); *p* and *q* represent the specific flows in the *x* and *y* directions (m<sup>2</sup>/s); *h* is the water depth (m); *g* is the acceleration due to gravity (m/s<sup>2</sup>); *n* is the Manning roughness coefficient; *ρ* is the density of water (kg/m<sup>3</sup>); *τxx*, *τyy* and *τxy* are the components of the effective shear stress; and *f* is the Coriolis parameter (s<sup>−1</sup>) [31].

#### *3.2. Overall View of the Research Approach*

Numerical simulations of physical processes are considered efficient through an iterative development of the model. The accuracy of the numerical models depends upon the quality of the available data and hydrologic understanding of the problem being investigated. A brief description of the study approach being adopted to achieve the objectives is presented in the flowchart (Figure 2) and described in the subsequent paragraphs.

**Figure 2.** Overall summary of steps of methodology in the form of a flowchart.

The flow hydrographs of Deg Nullah at Kingra bridge station were collected from the Punjab Irrigation Department for the monsoon periods of 2014–2017 (Figure 3). The hydrograph data have a time interval of one day. Among the collected hydrographs, the 2014 flood was the most severe and was selected for flood analysis. The flow hydrograph at Kingra bridge was used as the upstream boundary condition in the model.

The Shuttle Radar Topography Mission (SRTM) data are considered efficient in modeling hydrological and hydraulic processes because of their free availability, homogeneity and consistency [55]. Therefore, the present study explores hydrological modeling with the SRTM digital elevation model (DEM). A DEM with a resolution of 12.5 × 12.5 m was used. River cross-sections are considered a major input in both 1D and 2D hydraulic modeling. The DEM was therefore integrated with the observed cross-sections collected from the Punjab Irrigation Department at 53 locations across the Deg Nullah (Figure 4), which improved the accuracy of the 2D hydrodynamic flood model in analyzing the water flow within the channel and floodplains (overflow conditions) [20]. After that, the study area was clipped from the DEM for further analysis. The DEM was then converted to a triangular irregular network (TIN) surface model whose surface does not deviate from the input raster. The DEM of the study area, along with the river cross-sections, is presented in Figure 4.

HEC-GeoRAS is a GIS extension designed to process geospatial data, allowing users to create an HEC-RAS import file containing geometric data. The results of the water surface profiles can be interpreted to examine the flood depths and extents. In this study, preprocessing was conducted in HEC-GeoRAS to generate the model files. Afterward, the HEC-GeoRAS file consisting of geometric data was imported to HEC-RAS to perform the hydraulic computations.

**Figure 3.** Observed flow hydrographs at Kingra bridge during monsoon periods of 2014–2017.

The HEC-RAS model can be used either as an integrated 1D–2D model or as a fully 2D model to simulate the main rivers and floodplains. Although an integrated 1D–2D model might be faster than the fully 2D modeling engine, it requires the connections at overflow locations between the 1D and 2D modules to be defined. In this study, a fully hydrodynamic 2D model was used, as the overflow locations were unknown. The computational model domain (2D flow area) was approximately 888 km<sup>2</sup>. This area was discretized into cells of 100 × 100 m. The model was run from 1 July 2014 to 7 October 2014 at a time step of 1 min, whereas output results were generated at 1-h intervals. Moreover, two types of boundary conditions were used in the 2D hydraulic model, i.e., the inflow hydrograph at Kingra bridge of Deg Nullah as the upstream boundary condition and the average riverbed slope as the downstream boundary condition.
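As a rough check on the size of this setup, the figures quoted above imply the counts computed below. This is a back-of-envelope sketch only: it assumes a perfectly regular 100 m grid covering exactly 888 km<sup>2</sup> and a run from 00:00 on 1 July to 00:00 on 7 October 2014. The variable names are illustrative, and the actual HEC-RAS mesh and simulation window may differ slightly.

```python
from datetime import datetime

# Values quoted in the text
domain_area_km2 = 888.0          # 2D flow area
cell_size_m = 100.0              # edge length of a square cell
start = datetime(2014, 7, 1)     # assumption: run starts at 00:00
end = datetime(2014, 10, 7)      # assumption: run ends at 00:00

# Number of cells on an idealized regular grid
cell_area_km2 = (cell_size_m / 1000.0) ** 2
n_cells = round(domain_area_km2 / cell_area_km2)

# Number of computational steps (1 min) and of stored outputs (1 h)
run_minutes = int((end - start).total_seconds() // 60)
n_time_steps = run_minutes
n_outputs = run_minutes // 60

print(n_cells, n_time_steps, n_outputs)
```

Under these assumptions the run spans 98 days, i.e., roughly 88,800 cells, 141,120 computational steps and 2352 hourly output maps.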

To compare the model-generated flood map with the observed flood extents (Table 1), MODIS images were downloaded from https://modis.gsfc.nasa.gov/data/ (accessed on 28 March 2019) for the monsoon period of 2014. A total of 32 images were downloaded, and the one showing the maximum flood extent was selected for the comparison of results. In addition, two field visits were conducted on 30 April 2019 and 15 May 2019 to collect flood extent data at a few locations for verifying the model-simulated flood maps, to ground-truth settlements and riverbed elevations using GPS, and to interview locals and personnel from the irrigation department.

**Table 1.** Observed flood extent locations.


**Figure 4.** Observed river cross-sections, location of Deg Nullah and DEM of the study area.

#### *3.3. Flood Vulnerability Analysis*

In this study, first, we digitized all the road networks, railway tracks and urban areas/settlements within the study area by using the satellite imageries (Google Earth Pro) and further verified their locations during the field visits, as shown in Figure 5. Secondly, various flood characteristic maps, i.e., flood depths, velocities, arrival times, duration and recession times, were superimposed on the digitized maps of the road networks and urban areas/settlements to investigate the flood risks at different locations in the study area. The details regarding the flood hazard classification based on flood depth, velocity, arrival time and duration are presented in Table 2 [31,50,56,57]. Moreover, the details regarding important urban areas (cities, villages, etc.) and road networks in the study area are presented in Table 3.

**Table 2.** Flood hazard classification based on maximum depth, velocity, duration and arrival time [26,44,51,52].



**Table 3.** Details of important locations (cities, villages, etc.) and road networks in the study area.

**Figure 5.** Digitized locations of the road networks and urban areas/settlements.

#### *3.4. Multivariate Flood Vulnerability Classification*

Once the standard values of the flood hazard level, i.e., very low, low, medium, high and extreme, of different flood variables, i.e., depth, velocity, arrival time and duration, were selected along with all the related information (Table 2), the entropy distance-based multivariate flood vulnerability classification was performed by using the following approach.

Let *Sij* be the standard value of hazard level *i* of the *j*th flood variable, and let *Pjk* be the value of the *j*th flood variable at location *k*.

Here, *i* = *i*th hazard level = 1, 2, ... , 5; *j* = *j*th flood variable = 1, 2, ... , 4 and *k* = *k*th location = 1, 2, ... , 383.

Normalization of *Sij* and *Pjk* was performed to avoid any dimension-related issues by using Equations (4) and (5), respectively.

$$N\_{ij} = \frac{\text{abs}\left(S\_{ij}\right)}{\sum\_{i=1}^{n=5} \text{abs}\left(S\_{ij}\right)}\tag{4}$$

$$M\_{jk} = \frac{\operatorname{abs}\left(P\_{jk}\right)}{\sum\_{k=1}^{n=383} \operatorname{abs}\left(P\_{jk}\right)}\tag{5}$$

After that, the differences between the normalized values and the entropy-based weights were calculated by using the following equations:

$$d\_{jk} = M\_{jk} - N\_{ij} \tag{6}$$

$$w\_{j} = \frac{e\_{j}}{\sum\_{j=1}^{4} e\_{j}} \tag{7}$$

in which

$$e\_j = 1 - \left[\frac{-1}{\ln(N)} \sum\_{k=1}^{n=383} M\_{jk} \ln\left(M\_{jk}\right)\right] \tag{8}$$

Finally, the entropy-based distance was calculated for each hazard level *i* by using the following equation:

$$D\_k = \sqrt{\sum\_{j=1}^{4} w\_j \cdot \left(d\_{jk}\right)^2} \tag{9}$$

where *wj* is the entropy-based weight assigned to the *j*th flood variable and *Dk* is the weighted Euclidean distance between the standard values of a flood hazard level and the values obtained at location *k*. The hazard level that yields the minimum entropy-based distance at a location is taken as the hazard level of that location.
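The classification steps in Equations (4)–(9) can be sketched in a few lines of NumPy. This is an illustrative reading of the equations, not the authors' implementation: the function name `classify_hazard` and the toy numbers are hypothetical, the constant inside the logarithm of Equation (8) is assumed to be the number of locations, terms with a zero normalized value are assumed to contribute nothing to the entropy, and the distance is evaluated separately for every hazard level before taking the minimum.

```python
import numpy as np

def classify_hazard(S, P):
    """Entropy-distance classification (sketch of Eqs. 4-9).

    S: standard values, shape (n_levels, n_vars).
    P: observed values, shape (n_vars, n_locations).
    Returns (0-based hazard level per location, distances, weights).
    """
    S = np.abs(np.asarray(S, dtype=float))
    P = np.abs(np.asarray(P, dtype=float))
    N = S / S.sum(axis=0, keepdims=True)      # Eq. (4): normalize over levels
    M = P / P.sum(axis=1, keepdims=True)      # Eq. (5): normalize over locations

    n_loc = P.shape[1]                        # assumption: constant in ln() of Eq. (8)
    safe_M = np.where(M > 0, M, 1.0)          # assumption: 0 * ln(0) counted as 0
    entropy = -(M * np.log(safe_M)).sum(axis=1) / np.log(n_loc)
    e = 1.0 - entropy                         # Eq. (8)
    w = e / e.sum()                           # Eq. (7)

    d = M[None, :, :] - N[:, :, None]         # Eq. (6), one slice per hazard level
    D = np.sqrt((w[None, :, None] * d ** 2).sum(axis=1))  # Eq. (9)
    return D.argmin(axis=0), D, w

# Hypothetical toy data: 5 hazard levels, 1 flood variable, 3 locations
toy_S = [[1.0], [2.0], [3.0], [4.0], [10.0]]
toy_P = [[1.0, 3.0, 6.0]]
toy_levels, toy_D, toy_w = classify_hazard(toy_S, toy_P)
print(toy_levels)
```

The returned index is zero-based, so index 0 would correspond to the lowest hazard class and index 4 to the highest. The sketch does not guard against the degenerate case in which every variable is uniform across locations, where all entropy terms vanish and the weights are undefined.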

#### **4. Results**

#### *4.1. Model Performance Evaluation*

The performance of the HEC-RAS 2D model was evaluated by comparing the model-simulated flood extents with the satellite image registered by MODIS on 6 September 2014. Moreover, two field visits were conducted to identify the flood extents at three locations in the study area, as shown in Figure 6 and summarized in Table 1. The image from 6 September 2014 was selected for calibration because it showed the maximum flood extent. The flood extents registered by MODIS and simulated by the HEC-RASv5 model, along with the observed flood extents, are presented in Figure 6. It is clear from Figure 6 that the model-simulated flood extents were consistent with the MODIS-registered flood extents and the observed ones. The simulated flood extent was 6% smaller than the flood extent obtained from the satellite image, which is considered acceptable given the complexity of the system.
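A relative difference in flooded area such as the 6% figure above can be computed from two wet/dry masks. The sketch below is illustrative only: the function name and the toy masks are hypothetical, and it assumes that the simulated and satellite-derived extents have already been resampled to a common grid with a known cell area, a step that is not shown here.

```python
import numpy as np

def extent_difference_percent(simulated_wet, observed_wet, cell_area_km2=0.01):
    """Relative difference (%) of simulated vs. observed flooded area.

    Both inputs are boolean arrays on the same grid (assumption);
    0.01 km2 corresponds to a 100 m x 100 m cell.
    """
    sim_area = simulated_wet.sum() * cell_area_km2
    obs_area = observed_wet.sum() * cell_area_km2
    return 100.0 * (sim_area - obs_area) / obs_area

# Hypothetical toy masks: 100 observed wet cells, 94 simulated wet cells
observed = np.ones((10, 10), dtype=bool)
simulated = observed.copy()
simulated.flat[:6] = False
difference = extent_difference_percent(simulated, observed)
print(round(difference, 2))
```

A negative value means that the model floods a smaller area than the reference image. An area ratio of this kind says nothing about where the two extents disagree, so overlap-based scores may be a useful complement.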

**Figure 6.** Comparison of model results with MODIS extents and observed flood marks.

#### *4.2. Flood Characteristic Maps*

After successfully comparing the model results with the satellite imagery and observed flood extents, the HEC-RAS model was run to generate the flood characteristic maps of flood depths, velocities, arrival times, duration and recession times. These flood characteristic maps are briefly discussed in the subsequent sections.

#### 4.2.1. Maximum Flood Depths

The maximum flood depth map was generated using the 2D hydraulic model, as shown in Figure 7. This map was generated by considering the maximum depth of water in each cell during the simulation period, regardless of the time when this depth was reached. It was observed that the majority of the flooded areas north of the Deg Nullah have water depths of less than 1 m, whereas the lower parts of the Deg Nullah have depths ranging from 0.5 m to 2 m. The most extreme flood depths were observed in the northeast part of the study area. However, these cells represent a small portion of the area compared to the rest of the wet cells. This map of flood depths can be useful in flood preparedness strategies and in developing evacuation plans when a flood of the same magnitude is expected to occur. This map can also help identify the areas that need to be alerted or evacuated before future flood events. Moreover, the map of the maximum flood depths (Figure 7) can also be evaluated based on the various flood hazard levels. For instance, the upstream reaches of the Deg Nullah experienced more extreme flood hazards than the lower reaches, as the floodwater overflowing from the channel spread over vast areas.
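The per-cell maximum described above amounts to a reduction over the time axis of the stored depth outputs. The following sketch uses a hypothetical toy array in place of real model results, and it assumes that dry cells are stored with a depth of zero.

```python
import numpy as np

def max_depth_map(depth):
    """Per-cell maximum of a (time, rows, cols) depth array, same units as input."""
    return np.max(depth, axis=0)

# Hypothetical toy series: 3 hourly outputs on a 1 x 2 grid (depths in m)
toy_depth = np.array([
    [[0.0, 0.2]],
    [[1.2, 0.5]],
    [[0.4, 0.1]],
])
toy_max = max_depth_map(toy_depth)

# Share of cells whose maximum depth stays below 1 m
share_below_1m = np.mean(toy_max < 1.0)
print(toy_max, share_below_1m)
```

Because the maximum is taken independently in every cell, neighboring cells in such a map may reach their peak depth at different times.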

**Figure 7.** Map of the flood based on the maximum flood depths.

#### 4.2.2. Maximum Flood Velocities

In addition to flood extent and depth maps, the HEC-RAS 2D model can generate velocity maps that depict the flood velocity at various locations in the study area, as shown in Figure 8. The map of flood velocities indicates how fast the floodwater will reach a particular location and thereby helps in developing flood mitigation strategies and evacuation planning. In most of the areas above the Deg Nullah, flood velocities vary between 0 and 1 m/s, whereas in the western parts of the study area, the flood velocities were less than 0.5 m/s. Therefore, these areas are not expected to face severe flood risks due to flood velocities. However, flood velocities in the south of Deg Nullah vary from 0.5 to 1.5 m/s. At the upper reaches of Deg Nullah near Kingra bridge, flood velocities of more than 1.5 m/s were also observed. Such high flood velocities cause more water to overflow from this point. Moreover, the areas with flood velocities of about 1 m/s or higher may pose additional hazards, and evacuation will be more problematic due to the flash nature of flooding. These maps of flood velocities indicate the level of harm to which a community is exposed; therefore, they play an important role in developing flood risk management strategies.

**Figure 8.** Map of the maximum flood velocities.

#### 4.2.3. Flood Arrival Times

The maps of flood arrival times represent the model-computed time, measured from a specified simulation time, at which the water depth at a location reaches a specified inundation depth [54]. In this study, the flood arrival time map shows the time, in hours, at which each location reaches its maximum inundation depth (Figure 9). Most of the areas in the upper reaches of Deg Nullah reached their maximum inundation depths within the first fourteen (14) hours, whereas the lower reaches of the study area became flooded more than 28 h after the initiation of the flood event. This implies that the areas in the upper reaches of the Deg Nullah are more vulnerable to floods and require a swifter evacuation response to flooding. These maps of flood arrival times are helpful in developing flood management and evacuation strategies.
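Both readings of arrival time mentioned above, the time at which a threshold depth is first reached and the time at which the maximum depth occurs, can be sketched from a stack of depth outputs. The function names, the threshold of 0.1 m and the toy data are hypothetical; the sketch assumes outputs at a fixed interval counted from the start of the simulation and is not the HEC-RAS implementation.

```python
import numpy as np

def arrival_time_hours(depth, threshold_m, dt_hours=1.0):
    """Hours until depth first reaches threshold_m; NaN where it never does."""
    reached = depth >= threshold_m
    first = reached.argmax(axis=0).astype(float)   # index of first True per cell
    first[~reached.any(axis=0)] = np.nan           # cells that never reach it
    return first * dt_hours

def time_of_max_depth_hours(depth, dt_hours=1.0):
    """Hours until each cell attains its maximum depth."""
    return depth.argmax(axis=0) * dt_hours

# Hypothetical toy series: 4 hourly outputs on a 1 x 2 grid (depths in m)
toy_depth = np.array([
    [[0.0, 0.00]],
    [[0.2, 0.00]],
    [[0.6, 0.05]],
    [[0.3, 0.00]],
])
toy_arrival = arrival_time_hours(toy_depth, threshold_m=0.1)
toy_peak_time = time_of_max_depth_hours(toy_depth)
print(toy_arrival, toy_peak_time)
```

In the toy data, the second cell never reaches the threshold and is therefore marked as not flooded, although it still has a well-defined time of maximum depth. The two definitions can therefore produce quite different maps.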

**Figure 9.** Map showing flood arrival times from the start of the simulation period at different locations of the study area.

#### 4.2.4. Flood Duration

Figure 10 shows the flood duration map for the study area. This map helps to understand the flood propagation in the areas that remain flooded for an extended duration. Such maps are considered very useful in estimating the time required for a flood-affected community to return to their respective areas and in assessing the flood damage to crops, road networks and other critical facilities located in the area. Figure 10 shows that most of the areas in the upper reaches remained flooded for more than 80 h. However, flood durations decreased in the lower reaches of Deg Nullah.
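A flood duration map of this kind can be approximated by counting the stored outputs in which a cell is wet. The sketch below is a simplified illustration with a hypothetical wet/dry threshold of 0.1 m and toy data; it counts every wet output, so repeated wetting and drying of a cell is lumped into a single total.

```python
import numpy as np

def flood_duration_hours(depth, threshold_m, dt_hours=1.0):
    """Total hours during which depth is at or above threshold_m in each cell."""
    return (depth >= threshold_m).sum(axis=0) * dt_hours

# Hypothetical toy series: 4 hourly outputs on a 1 x 2 grid (depths in m)
toy_depth = np.array([
    [[0.0, 0.00]],
    [[0.2, 0.00]],
    [[0.6, 0.05]],
    [[0.3, 0.00]],
])
toy_duration = flood_duration_hours(toy_depth, threshold_m=0.1)
print(toy_duration)
```

The resolution of such a map is limited by the output interval, which is 1 h for the simulation described in this study.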

**Figure 10.** Map of flood duration at different locations of the study area.

#### 4.2.5. Flood Recession Times

The map of flood recession times is presented in Figure 11. This map represents the number of hours that the floodwater required to recede from different locations in the study area. The map showed that about 99 h were required for the inundation generated by the 2014 flood event to recede completely from the flood-affected area. However, this duration decreased at locations far from the mainstream of Deg Nullah. This map has the same importance as the flood duration map in developing flood mitigation and management strategies.
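One possible way to derive a recession time per cell is to measure the time between the peak depth and the first later output in which the cell is dry again. The definition used below is an assumption made for illustration and may differ from the one applied by HEC-RAS or by the authors; the function name, the threshold of 0.1 m and the toy data are hypothetical.

```python
import numpy as np

def recession_time_hours(depth, threshold_m, dt_hours=1.0):
    """Hours from peak depth until depth first drops below threshold_m.

    Returns NaN for cells that never flood or never dry out within the run.
    """
    n_t = depth.shape[0]
    t_peak = depth.argmax(axis=0)                          # peak index per cell
    t_idx = np.arange(n_t).reshape((-1,) + (1,) * (depth.ndim - 1))
    dry_after_peak = (depth < threshold_m) & (t_idx > t_peak)
    first_dry = dry_after_peak.argmax(axis=0).astype(float)
    recession = (first_dry - t_peak) * dt_hours
    undefined = ~dry_after_peak.any(axis=0) | (depth.max(axis=0) < threshold_m)
    recession[undefined] = np.nan
    return recession

# Hypothetical toy series: 5 hourly outputs on a 1 x 3 grid (depths in m)
toy_depth = np.array([
    [[0.00, 0.00, 0.0]],
    [[0.20, 0.00, 0.2]],
    [[0.60, 0.05, 0.6]],
    [[0.30, 0.00, 0.5]],
    [[0.05, 0.00, 0.4]],
])
toy_recession = recession_time_hours(toy_depth, threshold_m=0.1)
print(toy_recession)
```

In the toy data, the first cell dries two hours after its peak, the second never floods and the third is still wet at the end of the run, so only the first cell receives a finite recession time.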

**Figure 11.** Map of flood recession times at different locations in the study area.

#### *4.3. Flood Vulnerability Analysis*

Physical infrastructure (e.g., road networks, railway tracks and urban areas/settlements) in the floodplains was considered to determine its vulnerability to flooding. The road networks and urban areas/settlements were digitized using satellite imagery (Google Earth Pro) and later verified during the field visits. A total of 381 polygons of urban areas/settlements were marked in the study area. These polygons cover a total area of 47 km<sup>2</sup>, which is about 6% of the entire study area. The total length of the road networks connecting these urban areas/settlements and other major cities in the study area was about 175 km. The spatial flood characteristic maps, such as flood depths, velocities, arrival time, duration and recession time generated from the model for the flood event of September 2014, were superimposed on the maps of urban areas/settlements and road networks for vulnerability analysis. The results of the flood vulnerabilities of urban areas/settlements and road networks for the various flood characteristic maps are presented in Figure 12. The flood vulnerability map is self-explanatory; however, the flood hazard levels related to maximum water depths and velocities around the urban areas/settlements and road networks are explained in the Discussion. The flood hazard levels based on maximum depths are presented in Table 2.

**Figure 12.** Multivariate flood hazard classification map along with the urban areas/settlements and road networks.

#### **5. Discussion**

Flooding events can significantly impact economic conditions and even threaten human life. These events are considered the most disastrous among the hydro-meteorological hazards [58]. Hence, investigation of these events is highly important in developing mitigation strategies to reduce their negative impacts on human life and infrastructure. Flood simulation through hydraulic models has been extensively used for this purpose. The present research used the HEC-RAS 2D model to simulate an extreme flood event observed in the transboundary Deg Nullah, which often experiences floods during the monsoon season that result in economic and human life loss. Before the model was applied to generate flood characteristic maps, such as flood depth, velocity, arrival time, duration and recession time, the model-simulated flood extents were compared with the extents registered by MODIS and later verified during the field visits. The model-simulated flood showed good performance when compared with a satellite image of the flood event [31]. The flood extent area simulated by the model was about 6% less than that registered by MODIS. Moreover, during the field visits, maximum flooding extents were verified by observations and discussions with locals at three different locations (Figure 6). Previous studies [27,59,60], in which the model-simulated flood areas were also found to be smaller than those of the satellite imagery, support the acceptability of the present results. This reduction in the model-simulated areas compared to the satellite imagery may be due to the coarser DEM resolution [61] and to the limitations of the HEC-RAS sub-grid configuration in producing a continuous inundation pattern, which restricts the calculation of flood extent and the distribution of local water depth values [61]. Therefore, further studies may be carried out using a high-resolution DEM for improved model performance.

The maps of various flood characteristics such as flood depths, velocities, arrival times, duration and recession times were prepared to evaluate their distribution in the study area for an extreme flood event of 2050 m<sup>3</sup>/s [52]. The flood depth map allows areas exposed to different hazard levels to be identified. In most of the areas located in the northern part of Deg Nullah, the flood depths are less than 1 m, whereas in its southern parts the depth varies between 0.5–2 m, thereby posing low and low to medium threats, respectively [56]. However, the results have demonstrated that most of the settlements are constructed in flood-prone areas, which necessitates proactive planning and the selection of proper construction rules for the prevention and mitigation of the consequences of flood hazards. Over most of the flooded area, the water velocity was less than 0.1 m/s. However, water velocity in the south of Deg Nullah varies between 1–1.5 m/s, whereas at Kingra bridge it exceeds 1.5 m/s (Figure 8). The evacuation process may become difficult in areas with floodwater velocities of more than 1 m/s [57]. Flood arrival times at particular locations are one of the major variables, along with velocity, depth, safe locations, etc., used to prepare evacuation strategies [62]. The floodwater in the upper reaches of Deg Nullah attains the maximum depth within the first 14 h after the overflow condition, whereas in the lower parts of the study area it takes more than 28 h; the hazard levels are therefore classified as low to medium [47]. The northern part of the study area is the most exposed to the flood; it showed a larger flood extent, longer flood duration and deeper water depth. The flood that threatens the Seowal and Dullam-Kahalwan towns originates from the Kingra bridge, where the safe carrying capacity of the channel is only 280 m<sup>3</sup>/s [53].
The flood from the north gets close to Seowal hours after the channel begins to overflow, while it takes one day for the flood to reach the southern parts of the study area. Moreover, the flood in the northern region is deeper than that in the south, and the north begins to flood before the south. Therefore, flood hazard levels are higher in the northern region than in the south, and this region requires efficient flood mitigation strategies. The study showed the applicability and the value of the 2D capabilities of the new HEC-RAS for flood studies.

The flood vulnerability of settlements and road networks to maximum water depths is presented in Figure 12. The vast majority of road networks are vulnerable to medium to high flood risks, which may reduce the accessibility of certain parts of the area; in the extreme, these parts may remain completely cut off from the rest until the affected links are restored [21]. Therefore, the vulnerability of road networks, along with the urban areas and settlements, to flood events was investigated by superimposing their digitized maps over the flood characteristic maps. The settlements in the upper reaches experienced medium to extreme flood hazard levels based on the water depths. However, these hazards decreased gradually as the floodwater arrived in the lower reaches of Deg Nullah. Moreover, Balijepalli and Oppong [50] stated that the physical infrastructure (roads, bridges and buildings) is not subjected to severe damage for velocities below 2–3 m/s. In most areas, flood velocities are lower than 3 m/s, and only a few locations in the upper reaches of Deg Nullah experience flood velocities of more than 3 m/s, as shown in Figure 8. However, the vulnerability of different infrastructures needs to be evaluated by considering the flood depths and velocities together. Therefore, a multivariate flood vulnerability classification was performed to assess the combined effect of different flood variables (i.e., flood depth, velocity, arrival time, duration and recession times) at various locations in the study area. Most of the study area lies within the medium hazard levels; however, the upper and middle reaches near the channel pose high to extreme threats (Figure 12). Therefore, detailed studies related to the level of extreme events and the vulnerability of the exposed elements at risk (i.e., urban areas/settlements) will help propose proper protection measures [63].
Moreover, appropriate mitigation or hazard reduction approaches can be more effectively designed and applied. Furthermore, awareness of the unsafe regions related to flood hazards might be helpful in emergency preparedness planning.

Therefore, scientists, engineers, stakeholders, planners and decision-makers may utilize the proposed approach in forthcoming spatial planning projects [64,65]. Additionally, the local authorities may use the produced map to guide the adoption of measures and strategies aimed at flood hazard mitigation and post-flood management. The present study was limited to simulating the flood event of 2014 with a return period of 200 years. Further studies may be carried out for floods of different return periods to develop flood mitigation strategies. Moreover, sediment and riverbed morphology may be included in the simulation for detailed investigation in future studies.

#### **6. Conclusions**

In this study, the extreme flood event of 6 September 2014 in Deg Nullah was simulated using a 2D HEC-RAS hydrodynamic model. The simulated flood extents showed good agreement with the flood extents registered by satellite images (MODIS) and with the field-observed flood extents. Moreover, flood characteristics, including depth, velocity, arrival time, duration and recession time, were generated by the HEC-RAS 2D model. The simulated flood extent area was 6% smaller than the flood extent obtained from the satellite image, which is considered acceptable.

The results of this study showed that the northwest parts of Deg Nullah near Seowal (ID-2), Dullam Kahalwan (ID-3) and Zafarwal (ID-12) were the most hazardous due to high flood depths and longer flood duration. Therefore, these areas are considered the most critical for evacuation. The flood velocities were found to be less than 1.5 m/s at most locations in the study area, thereby posing no serious threat. However, in the upper reaches of Deg Nullah, floodwater velocities greater than 2 m/s were observed. Moreover, flood duration in this region was the longest in the study area. Therefore, high flood depths and velocities, longer flood duration and recession times and shorter flood arrival times make it the most critical and sensitive region to future flood events.

The present study analyzes the flood vulnerability of residential areas and road networks in the floodplains of Deg Nullah. Based on the hydrodynamic flood modeling results, most of the residential areas and transportation infrastructure currently lie in medium to high flood risk zones and are likely to be exposed to future flood hazards of the same or greater magnitude. In particular, the areas lying in the upper reaches of Deg Nullah were exposed to severe flood hazards due to high flood depths and velocities, faster flood arrival times, and longer flood duration and recession times. In contrast, most of the urban areas/settlements in the middle and lower reaches of Deg Nullah were categorized in the range of low to medium hazard levels, and the population may be considered safe inside their homes. Similarly, the road networks in the upper reaches were subjected to high flood depths and velocities, which may influence evacuation. The study showed that the HEC-RAS 2D model is an effective tool for studying and assessing the risk of flood events by coupling it with MODIS and other RS data in areas with little or no post-flood event information available. Future applications of the HEC-RAS 2D model with a high-resolution DEM and other observed high floodwater marks may help analyze possible flood management strategies.

**Author Contributions:** Conceptualization, I.A. and R.Z.N.K.; data curation, M.W.; formal analysis, M.A.; funding acquisition, I.A.; investigation, M.W.; methodology, I.A., M.Z. and F.A.; project administration, I.A.; software, M.A., X.W., F.A. and R.Z.N.K.; supervision, X.W.; validation, M.Z.; visualization, M.Z.; writing—original draft, I.A. and F.A.; writing—review and editing, X.W. and M.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Higher Education Commission of Pakistan (grant: SRGP-1593).

**Data Availability Statement:** Some or all data, models or code that support the findings of this study are available from the corresponding author upon reasonable request.

**Acknowledgments:** This research was funded by the Higher Education Commission of Pakistan (grant: SRGP-1593). The authors also thank the Punjab Irrigation Department for providing the data required for this research without any cost.

**Conflicts of Interest:** The authors declare that they have no conflict of interest in this work.

#### **References**


## *Article* **Resolution-Sensitive Added Value Analysis of CORDEX-CORE RegCM4-7 Past Seasonal Precipitation Simulations over Africa Using Satellite-Based Observational Products**

**Gnim Tchalim Gnitou <sup>1</sup>, Guirong Tan <sup>2,</sup>\*, Yan Hongming <sup>3</sup>, Isaac Kwesi Nooni <sup>1,4</sup> and Kenny Thiam Choy Lim Kam Sian <sup>1</sup>**


**Abstract:** This study adopts a two-way approach to the Added Value (AV) analysis of CORDEX-CORE RegCM4-7 seasonal precipitation simulations over Africa, aiming to quantify potential improvements introduced by the downscaling approach at high and low resolution, using satellite-based observational products. The results show that RegCM4-7 does add value to its driving Global Climate Models (GCMs) with a positive Added Value Coverage (AVC) ranging between 20 and 60% at high resolution, depending on the season and the boundary conditions. At low resolution, the results indicate an increase in the positive AVC by up to 20% compared to the high-resolution results, with a decrease of up to 8% in instances where an increase is not observed. Typical climate zones such as West Africa, Central Africa, and Southern East Africa, where improvements by Regional Climate Models (RCMs) are expected due to strong dependence on mesoscale and fine-scale features, show positive AVC greater than 20%, regardless of the season and the driving GCM. These findings provide more evidence for the hypothesis that the RCMs' AV is influenced by their internal physics rather than being the product of a mere disaggregation of large-scale features provided by GCMs. Although the results show some dependence on the driving GCMs relating to their equilibrium climate sensitivity, the findings at low resolutions similar to the native GCM resolutions make the influence of internal physics more important. The findings also highlight the potential of the CORDEX-CORE RegCM4-7 precipitation simulations for bridging the quality and resolution gap between coarse GCMs and high-resolution remote sensing datasets.
Even if further post-processing activities, such as bias correction, may still be needed to remove persistent biases at high resolution, using upscaled RCMs as an alternative to GCMs for large-scale precipitation studies over Africa can be insightful if the AV and other performance statistics are satisfactory for the intended application.

**Keywords:** regional climate models; global climate models; precipitation; Africa; added value; satellite-based observations

#### **1. Introduction**

The distillation of regional climate information is crucial for anticipating the potential threats of regional-to-local climate change and formulating actionable adaptation and mitigation plans [1]. Carrying out these activities with primary climate models, known as Global Climate Models (GCMs), has been prohibitive due to the relatively coarse resolutions at which they are produced and the subsequent computational burden that could arise from increasing such resolutions. Moreover, the regional nature of the decision-making expected from these activities demands that fine-scale atmospheric phenomena and the heterogeneity of surface properties, which are not explicitly resolved by GCMs, be accounted for [2].

**Citation:** Gnitou, G.T.; Tan, G.; Hongming, Y.; Nooni, I.K.; Lim Kam Sian, K.T.C. Resolution-Sensitive Added Value Analysis of CORDEX-CORE RegCM4-7 Past Seasonal Precipitation Simulations over Africa Using Satellite-Based Observational Products. *Remote Sens.* **2022**, *14*, 2102. https://doi.org/10.3390/rs14092102

Academic Editors: Xander Wang and Beatriz M. Funatsu

Received: 2 March 2022 Accepted: 25 April 2022 Published: 27 April 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

To address this central issue, limited-area nested Regional Climate Models (RCMs) are used as a tool to downscale large-scale boundary conditions from GCMs or reanalysis data. The RCMs are expected to serve as a "magnifying glass" that reveals fine-scale details which remain hidden because the GCMs cannot be run at the desired high resolution [3]. This one-way nesting approach to dynamical downscaling has been the backbone of three decades' worth of RCM research and development, with substantial applications and use cases for various scientific problems worldwide [4,5]. Although limited-area two-way nesting models have been introduced to capture the feedback from the regional to the global scale, which one-way models do not consider, their results have remained inconclusive and sometimes difficult to obtain due to their computationally demanding nature [4].

The wide adoption of one-way nesting methods for the production of RCMs also contributed to a paradigm shift in climate model evaluation and validation methods. Unlike traditional validation methods, where the skill of the climate model output is directly assessed by comparison with observations, the one-way nesting method has triggered the need to quantify the improvement brought by the downscaling process, using what are known as Added Value (AV) methods [6]. Since then, the AV concept has undergone many refinements and relaxations of its preconditions to accommodate various validation scenarios. These scenarios range from a conjectural and subjective expectation of an AV to a more objective quantification based on observations [7].

Sticking with the principal aim of RCMs, which is to produce reliable high-resolution data for regional to local decision-making, AV studies remained mostly observational, with a one-way perspective of quantifying improvements by downscaling models from the coarse resolution GCMs to the high-resolution RCM outputs. Consequently, less interest was shown in a second type of observational AV, which could complementarily quantify in a two-way manner the AV feedback from a regional to a global scale [6]. Such a secondary metric could constitute a reliable way of quantifying the resolution-sensitivity of the AV results and disentangling results due to the RCMs' internal physics from the ones due to a mere disaggregation of the boundary condition [8].

The availability of RCM data in the public domain has served as a good playground for exploring the AV by one-way nested high-resolution RCMs [9]. This was made possible thanks to the World Climate Research Programme (WCRP) under the Coordinated Regional Downscaling Experiment (CORDEX) [10–12], which made available a series of reanalysis and GCM-driven RCM outputs at approximately 50 km resolution in its first phase, while giving the highest priority to the African continent. Recently, the second phase, called the Common Regional Experiment Framework (CORDEX-CORE) [13–16], which aims at guaranteeing the availability of a homogeneous set of simulations over all the regions of the world, was launched. Its first experiments made available a set of simulations at an unprecedented resolution of 25 km, thus reaching the scale of common satellite-based observational products.

For Africa, which is a priority region in the context of CORDEX, the production of highly resolved simulations has not been followed by an increase in the resolution of local or global ground-based observational datasets. This situation is detrimental to observational AV studies. In practice, one has to rely on satellite-based observational datasets produced at similar resolutions, keeping in mind that they might also have some biases. Moreover, RCM AV studies over Africa [17–21] were conducted under the one-way paradigm, except for the recent work by Dosio et al. [19], where a two-way approach is adopted in the context of precipitation climate change projections. These studies show that the evaluated RCMs add value to their boundary conditions over Africa. Still, the extent to which such an AV could be sensitive to the resolution is usually not analyzed.

Beyond the valuable information that could be obtained using a two-way approach to AV analysis over Africa, the use of satellite-based observational products as a reference for such analysis offers the opportunity to explore the potential of dynamical downscaling to bridge the resolution and quality gap between GCMs and observational products based on remote sensing technology. These newly emerging needs are highly relevant for both the climate modelling and the remote sensing communities, especially for the precipitation variable over Africa, where a consensual and unified characterization is needed [22–24].

In this context, the AV metrics can represent reliable quantitative metrics for estimating how the downscaling approach improves the driving GCMs towards reproducing observed features by remotely sensed datasets. Such applications of the AV analysis could be valuable for climate data distillation [1] and for exploring the possibility of using highly resolved RCMs as proxies for satellite-based observations in climate projections, where observations are not available. Additionally, the AV analysis could be instrumental in choosing postprocessing methods such as bias correction if the RCMs' performances are inadequate for the intended application [21,25].

In this study, we propose an analysis of the CORDEX-CORE RegCM4-7 past precipitation simulations over Africa, with a perspective on the contribution of the resolution to AV results. The AV by the RCM simulations over the driving GCMs is computed and analyzed at fine- and large-scale resolutions to represent an improvement at a regional and global scale and to further understand AV sensitivity to resolution. The study results are also used to distinguish the role of the resolution from the role of the physics parameterizations used for the downscaling experiment of CORDEX-CORE RegCM4-7 over Africa and quantify its contribution to bridging the gap between GCMs and satellite-based observational precipitation products.

#### **2. Study Area, Data, and Methods**

#### *2.1. Study Area*

The African continent was designated as the highest priority region in the context of the CORDEX framework, owing to its vulnerability to global warming and its deficit of infrastructural resources needed to carry out climate projection modeling activities [13]. Africa is a key domain among the 9 out of 14 continental CORDEX domains considered for the resolution-doubling CORDEX-CORE simulations [16], and an undeniable testbed region for the typical improvements expected from highly resolved RCMs, especially for fine scale-dependent variables such as precipitation [7,9,13]. This is particularly true concerning Africa's complex topographic structure, and its unique and homogeneous climate zones such as the Sahara (SAH), West Africa (WAF), Central Africa (CAF), Northern East Africa (NEAF), Southern East Africa (SEAF), Eastern South Africa (ESAF) and Western South Africa (WSAF) (see Figure 1).

Moreover, Africa is a hotspot for large-scale precipitation patterns, which are still not adequately reproduced by state-of-the-art GCMs [26,27]. Although the AV by RCMs is usually expected at a finer scale, the extent to which such improvements can cumulatively enhance large-scale precipitation patterns and represent a large-scale AV at GCM resolutions is an open question [6,7,28]. Addressing such a question for Africa is critical, especially from the model users' perspective, given the increasing availability of various climate models over the continent and the possible risk of data misuse [1]. Additionally, such a distinction between AVs at large and fine scales can be useful in disentangling the role of resolution from that of the RCMs' internal physics and in revealing how sensitive the AV is to resolution.

Last but not least, the production of precipitation estimates complementing existing ones, such as satellite-based products, reanalysis data and traditional climate model data, has been recommended as a prerequisite toward unifying and further understanding rainfall over Africa [29]. Such alternatives to rainfall station observation data are becoming unavoidable given the serious decline in the very few stations that have been operating for the past few decades [22]. The presence of an AV by CORDEX-CORE RegCM4-7 could be instrumental, not only to address the need for a consensus on rainfall over Africa, but also for a process-based understanding of decades of satellite-based climate data records available over the continent [24,30]. Another opportunity for the African modeling community could be potential collaboration with the remote sensing community in order to leverage the vast amount of literature available on data processing [31] and use it to enhance the quality of RCMs.

**Figure 1.** Topographical features of the African domain and its key climate zones.

#### *2.2. Data and Method*

The resolution-sensitive analysis of the AV by RCM outputs proposed in the present study is mainly based on the CORDEX-CORE RegCM4-7 past precipitation simulations over the African domain. The RegCM4-7 RCM [32] is developed by the Abdus Salam International Centre for Theoretical Physics (ICTP), located in Trieste, Italy. As part of the CORDEX-CORE's first experiment, RegCM4-7 was used to downscale the ERA-Interim (ERA-INT) reanalysis data [33] for evaluation runs and three GCMs of the Coupled Model Intercomparison Project, phase 5 (CMIP5) [34] for the historical runs. The HadGEM2-ES [35], MPI-ESM-MR [36], and NorESM1-M [37] GCMs, corresponding to high, medium and low equilibrium climate sensitivity, respectively, were used for the historical runs to capture the sensitivity range of the CMIP5 ensemble given the computational limitations related to downscaling a larger ensemble of simulations [38]. The AV sensitivity analysis is carried out using the 25 km resolution satellite-based product of the Climate Hazards Group InfraRed Precipitation with Station data version 2 (CHIRPS-2.0) [39] to represent fine-scale observations and its upscaled version at 250 km resolution to represent large-scale observations. The 250 km resolution used for the upscaling process is taken from the Global Precipitation Climatology Project monthly data version 2.3 (GPCPv2.3) [40,41], which is widely used for large-scale climate analysis. Although not directly presented in the results, GPCPv2.3 is used as a supplement to show potential uncertainty implications that may prevail at low resolutions in the context of resolution-sensitive AV studies. Further details about the different datasets used are provided in Table 1.
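To illustrate what moving from the 25 km grid to a 250 km grid involves, the following minimal sketch aggregates a fine grid into coarse cells by unweighted block averaging. It is an illustration under stated assumptions, not the regridding procedure used in this study, which the text does not specify: the function name `block_mean` is hypothetical, the aggregation factor (10 for 25 km to 250 km) is inferred from the two resolutions, and latitude-dependent cell-area weighting and missing values are ignored.

```python
def block_mean(grid, factor):
    """Upscale a 2D grid (list of rows) by averaging factor x factor blocks.

    Assumption: unweighted mean of complete blocks; no area weighting
    and no handling of missing values.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for i in range(0, n_rows, factor):
        coarse_row = []
        for j in range(0, n_cols, factor):
            # Collect every fine cell that falls inside the coarse cell.
            block = [grid[a][b]
                     for a in range(i, i + factor)
                     for b in range(j, j + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Toy 4 x 4 field aggregated with factor 2 (factor 10 would map 25 km to 250 km).
fine = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0]]
coarse = block_mean(fine, 2)
```

In practice, a conservative or area-weighted remapping tool would typically be preferred for precipitation, since simple block means do not account for varying grid-cell areas.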



Both the climate simulations and observations datasets are acquired from 1981 to 2005. The driving GCMs datasets and the RegCM4-7 outputs are first interpolated at CHIRPS' 25 km grid resolution for high-resolution analysis and then upscaled to 250 km resolution to match CHIRPS' upscaled grids for large-scale analysis. The analysis focuses on the seasonal mean bias pattern of the GCM-driven RegCM4-7 precipitation outputs and their consistency with observations and structural biases from ERA-INT driven simulations, considering both high resolution (25 km) and coarse resolution (250 km). The analysis is carried out for the December–January–February (DJF), the March–April–May (MAM), the June–July–August (JJA), and the September–October–November (SON) seasons. The potential improvement of the RegCM4-7 outputs as compared to the driving GCMs at higher and coarser resolution is also analyzed, using the *AV* metric proposed by Dosio et al. [17], for which the formula is given as follows:

$$AV = \frac{\left(X_{GCM} - X_{OBS}\right)^2 - \left(X_{RCM} - X_{OBS}\right)^2}{\max\left(\left(X_{GCM} - X_{OBS}\right)^2,\ \left(X_{RCM} - X_{OBS}\right)^2\right)}\tag{1}$$

where *X<sub>GCM</sub>*, *X<sub>RCM</sub>* and *X<sub>OBS</sub>* represent, respectively, the GCM, the RCM and the observation's statistics for which the *AV* is evaluated. The *AV* values vary between −1 and 1 to capture possible improvement or degradation of the RCM over the GCM.
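As a worked illustration of Equation (1), the sketch below evaluates the *AV* for a single grid cell from the GCM, RCM and observed values of a statistic (for example, a seasonal mean in mm/day). The function name `added_value` and the sample numbers are hypothetical, and returning 0 when both squared errors vanish is an assumption made here to avoid a division by zero; the source does not state how that case is treated.

```python
def added_value(x_gcm, x_rcm, x_obs):
    """Added Value of Eq. (1) for one grid cell; the result lies in [-1, 1]."""
    err_gcm = (x_gcm - x_obs) ** 2  # squared error of the driving GCM
    err_rcm = (x_rcm - x_obs) ** 2  # squared error of the downscaled RCM
    denom = max(err_gcm, err_rcm)
    if denom == 0.0:
        # Assumption: both models match the observation exactly,
        # so no value is added or lost.
        return 0.0
    return (err_gcm - err_rcm) / denom


# Hypothetical values: the RCM is closer to the observation, so AV > 0.
av_improved = added_value(x_gcm=5.0, x_rcm=3.0, x_obs=2.0)
# Swapping the roles gives the mirrored, negative AV.
av_degraded = added_value(x_gcm=3.0, x_rcm=5.0, x_obs=2.0)
```

With these numbers the squared errors are 9 and 1, so the *AV* is 8/9 in the first case and −8/9 in the second. Normalizing by the larger of the two squared errors is what bounds the metric between −1 and 1.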

The seasonal precipitation bias patterns of the GCMs and RegCM4-7 outputs and the subsequent *AV* outcomes are aggregated using the different climate zones over Africa (see Figure 1). Specifically, the Added Value Coverage (*AVC*), representing the percentage of grid cells showing a positive, negative, or non-significant *AV* is computed for each climate zone. The *AVC* provides a general figure of the improvement by the RegCM4-7 RCM, and allows adequate comparison among different zones, seasons and resolutions. The *AVC* formula is given as follows:

$$AVC_{pos/neg/ns} = \frac{N_{pos/neg/ns}}{N_{tot}} \times 100\tag{2}$$

where *AVC<sub>pos/neg/ns</sub>* represents the positive, negative, or non-significant *AV* coverage, *N<sub>pos/neg/ns</sub>* the number of pixels with a positive, negative, or non-significant *AV*, and *N<sub>tot</sub>* the total number of pixels over the region considered.

Following previous use of the *AVC* [20,21], we use a threshold of 0.1 for significant positive *AV* and −0.1 for significant negative *AV*. This means that any pixel with an *AV* > 0.1 is considered a pixel with significant positive *AV*, and any pixel with an *AV* < −0.1 is considered a pixel with significant negative *AV*. Any pixel with an *AV* between −0.1 and 0.1 is considered non-significant.
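The classification and Equation (2) can be sketched as follows for a list of per-pixel *AV* values belonging to one climate zone. The function name `added_value_coverage` and the sample values are hypothetical, and the sketch assumes that masked or missing pixels have already been removed from the input.

```python
def added_value_coverage(av_values, threshold=0.1):
    """Percentage of pixels with positive, negative and non-significant AV.

    A pixel is positive if AV > threshold, negative if AV < -threshold,
    and non-significant otherwise (Eq. (2) with the 0.1 threshold).
    """
    n_tot = len(av_values)
    if n_tot == 0:
        raise ValueError("at least one pixel is required")
    n_pos = sum(1 for av in av_values if av > threshold)
    n_neg = sum(1 for av in av_values if av < -threshold)
    n_ns = n_tot - n_pos - n_neg  # pixels with AV in [-threshold, threshold]
    return {
        "pos": 100.0 * n_pos / n_tot,
        "neg": 100.0 * n_neg / n_tot,
        "ns": 100.0 * n_ns / n_tot,
    }


# Hypothetical AV values for eight pixels of a single zone.
sample_av = [0.5, 0.2, -0.3, 0.05, -0.1, 0.1, 0.9, -0.8]
coverage = added_value_coverage(sample_av)
```

For this sample, three pixels exceed 0.1, two fall below −0.1 and three lie in between (including the two sitting exactly on the thresholds), giving coverages of 37.5%, 25% and 37.5%. The three percentages always sum to 100 by construction.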

#### **3. Results**

#### *3.1. Evaluation Results for DJF Season*

Figures 2 and S1 depict the DJF season's mean bias results for the driving GCMs and their RegCM4-7 dynamically downscaled outputs at high resolution (25 km) and coarse resolution (250 km), using CHIRPS observations. Rain-abundant areas such as the southern CAF, SEAF, northern WSAF, and ESAF show similar patterns at high and coarse resolutions (Figures 2a and S1a). The driving GCMs (Figures 2c–f and S1c–f) tend to show wet biases of 1 to 10 mm/day over WSAF, ESAF, and parts of CAF, with pronounced quantities of more than 6 mm/day in NorESM1-M (Figures 2e and S1e). Over northern CAF and NEAF, a dominance of dry biases ranging from 0 to −4 mm/day is observed with HadGEM2-ES (Figures 2c and S1c) and MPI-ESM-MR (Figures 2d and S1d), while wet biases of 0 to 8 mm/day are depicted by NorESM1-M (Figures 2e and S1e).

In accordance with the RegCM4-7 evaluation runs driven by ERA-INT (Figures 2b and S1b), the dynamically downscaled outputs (Figures 2g–j and S1g–j) show persisting wet biases of about 1 to 8 mm/day, both at high and coarse resolution over Southern Africa (SAF), resulting in degraded AV for most simulations, such as the HadGEM2-ES (Figures 2k and S1k) and MPI-ESM-MR (Figures 2l and S1l) driven RegCM4-7 outputs. The presence of generally wet biases in the dynamically downscaled outputs compared to mainly dry biases in the driving GCMs is confirmed by the spatially averaged bias results reported in Table 2.

**Figure 2.** Performance of African precipitation in DJF season at high resolution compared to (**a**) CHIRPS, for (**b**) RegCM4-7 evaluation run driven by ERA-INT, (**c**) driving GCM HadGEM2-ES, (**d**) driving GCM MPI-ESM-MR, (**e**) driving GCM NorESM1-M, (**f**) ensemble mean of the driving GCMs, (**g**) RegCM4-7 historical run driven by HadGEM2-ES, (**h**) RegCM4-7 historical run driven by MPI-ESM-MR, (**i**) RegCM4-7 historical run driven by NorESM1-M, (**j**) RegCM4-7 historical runs' ensemble mean, (**k**) Added Value by RegCM4-7 to HadGEM2-ES, (**l**) Added Value by RegCM4-7 to MPI-ESM-MR, (**m**) Added Value by RegCM4-7 to NorESM1-M, (**n**) Added Value by RegCM4-7 to the ensemble mean of the driving GCMs.

**Figure 3.** Performance of African precipitation in MAM season at high resolution compared to (**a**) CHIRPS, for (**b**) RegCM4-7 evaluation run driven by ERA-INT, (**c**) driving GCM HadGEM2-ES, (**d**) driving GCM MPI-ESM-MR, (**e**) driving GCM NorESM1-M, (**f**) ensemble mean of the driving GCMs, (**g**) RegCM4-7 historical run driven by HadGEM2-ES, (**h**) RegCM4-7 historical run driven by MPI-ESM-MR, (**i**) RegCM4-7 historical run driven by NorESM1-M, (**j**) RegCM4-7 historical runs' ensemble mean, (**k**) Added Value by RegCM4-7 to HadGEM2-ES, (**l**) Added Value by RegCM4-7 to MPI-ESM-MR, (**m**) Added Value by RegCM4-7 to NorESM1-M, (**n**) Added Value by RegCM4-7 to the ensemble mean of the driving GCMs.

Over CAF, SEAF, and NEAF, reduced intensity of the dry biases by HadGEM2-ES and MPI-ESM-MR is observed in the downscaled outputs, with a sign shift from wet biases to slightly dry biases in NorESM1-M downscaled output. Consequently, a positive AV is observed for all simulations and their ensemble mean (Figures 2k–n and S1k–n) over CAF. The positive AV is extended to other dry areas such as WAF, NEAF and SAH. The relatively similar AV patterns at high and low resolutions are further confirmed by the AVC results (see Table 3), which report an increase/decrease of roughly 8%. Other differences are also observed in the error amplitude, which tends to show a systematic reduction from high- to low-resolution (see Table 2).




**Table 3.** Added Value Coverage results of RegCM4-7 simulations over continental Africa at high resolution (HR) and low resolution (LR).

#### *3.2. Evaluation Results for MAM Season*

In the MAM season, the performances of the RegCM4-7 outputs and their boundary forcing at downscaled and upscaled resolutions are shown in Figures 3 and S2. At high and low resolution, CHIRPS (Figures 3a and S2a) show similar rain belt expansion patterns over both WAF and NEAF in addition to rain-abundant areas such as CAF.

An underestimation of rainfall quantities of 0 to −4 mm/day is mostly found over NEAF and SEAF in all the driving GCMs (Figures 3c–f and S2c–f) and their dynamically downscaled outputs (Figures 3g–j and S2g–j). These biases tend to be identical except for the MPI-ESM-MR (Figures 3h and S2h) driven RegCM4-7 output, which shows a reduction and a subsequent positive AV (Figures 3m and S2m). Although the biases in the driving GCMs (Figures 3c–f and S2c–f) over WAF and CAF show distinct patterns depending on the model used, their dynamically downscaled simulations (Figures 3g–j and S2g–j) share common features that are highly similar to the evaluation runs driven by ERA-INT (Figures 3b and S2b).

An intensification of HadGEM2-ES's (Figures 3c and S2c) slightly wet biases (0–1 mm/day) in its downscaled output (Figures 3g and S2g) is observed over SAF, leading to negative AV (Figures 3k and S2k). A general dominance of dry biases for the driving GCMs and slightly wet biases for RegCM4-7 outputs are observed as reported in Table 2, which is similar to the DJF season results. The positive AVC results in Table 3 show an increase from high- to low-resolution results for MPI-ESM-MR, NorESM1-M and the ensemble mean downscaled outputs. At the same time, the HadGEM2-ES-based RegCM4-7 simulations report a decrease. Compared to the DJF season results, the results during MAM season are still within the 8% increase/decrease range, except for the downscaled NorESM1-M results, which show an increase of nearly 15%. The systematic reduction in the averaged error from high to low resolution remains the same as in DJF, even if the error amplitudes are lower (see Table 2).

#### *3.3. Evaluation Results for JJA Season*

In the JJA season, the mean biases of RegCM4-7 simulations and their driving GCMs at high and low resolutions are summarized in Figures 4 and S3. The results show dry and wet bias signals in all the climate simulations along the rain belt depicted over WAF, northern CAF and NEAF by CHIRPS (Figures 4a and S3a) at both resolutions. Dry biases of about 0–4 mm/day are observed over CAF and southern SAH, and wet biases of about 1–8 mm/day are observed along the remaining part of the rain belt over NEAF, SEAF and ESAF, in both the driving GCMs (Figures 4c–f and S3c–f) and the RegCM4-7 downscaled simulations (Figures 4g–j and S3g–j).

**Figure 4.** Performance of African precipitation in JJA season at high resolution compared to (**a**) CHIRPS, for (**b**) RegCM4-7 evaluation run driven by ERA-INT, (**c**) driving GCM HadGEM2-ES, (**d**) driving GCM MPI-ESM-MR, (**e**) driving GCM NorESM1-M, (**f**) ensemble mean of the driving GCMs, (**g**) RegCM4-7 historical run driven by HadGEM2-ES, (**h**) RegCM4-7 historical run driven by MPI-ESM-MR, (**i**) RegCM4-7 historical run driven by NorESM1-M, (**j**) RegCM4-7 historical runs' ensemble mean, (**k**) Added Value by RegCM4-7 to HadGEM2-ES, (**l**) Added Value by RegCM4-7 to MPI-ESM-MR, (**m**) Added Value by RegCM4-7 to NorESM1-M, (**n**) Added Value by RegCM4-7 to the ensemble mean of the driving GCMs.

The spatially averaged bias results (Table 2) show a systematic reduction in the RegCM4-7 averaged bias, regardless of the driving GCM, with the ensemble mean reporting improvement over some parts of CAF for all the GCM-based downscaled outputs (Figures 4g–j and S3g–j), compared to individual simulations. A positive AV is observed over parts of WAF for RegCM4-7 simulations driven by HadGEM2-ES and MPI-ESM-MR, while NorESM1-M-driven RegCM4-7 simulation depicts a positive AV over SAH, NEAF, SEAF and parts of ESAF and WSAF.

At high resolution, RegCM4-7 reports a positive AVC of 33.94% for the simulation driven by HadGEM2-ES, 47.33% for the simulation driven by MPI-ESM-MR and 59.98% for the NorESM1-M-driven simulation. From high to low resolution, the positive AVC increases by roughly 12 to 20% for all RegCM4-7 simulations, except the one driven by NorESM1-M, which decreases by 0.89% (Table 3). The error amplitude reduction from high to low resolution is also observed in the JJA season, as shown in Table 2.

#### *3.4. Evaluation Results for SON Season*

The results of the evaluation and historical runs of RegCM4-7 and their driving GCMs for the SON season over Africa at high and low resolution are given in Figures 5 and S4. High-resolution and upscaled CHIRPS observations (Figures 5a and S4a) show a retreat of the monsoonal belt toward coastal WAF and CAF, with light rainfall amounts over NEAF.

The driving GCMs (Figures 5c–f and S4c–f) exhibit wet biases ranging from 1 to 8 mm/day, mostly over CAF, with an extension to NEAF, SEAF and SAF in NorESM1-M results. Dry biases of −4 to 0 mm/day are observed, especially over WAF in the HadGEM2-ES results. The historical runs of RegCM4-7 (Figures 5g–j and S4g–j) depict wet biases, which tend to represent substantial reduction compared to the driving GCM results (Figures 5c–f and S4c–f) over parts of CAF and some parts of NEAF and SEAF. However, the downscaled simulations also show a second type of wet biases over most parts of WAF, ESAF and WSAF, particularly in the HadGEM2-ES- and MPI-ESM-MR-driven RegCM4-7 results, which tend to degrade the driving GCMs' results.

The spatially averaged bias results from Table 2 show a dominance of wet and relatively high biases in the RegCM4-7 simulations compared to dry and relatively low biases for the driving GCMs, when HadGEM2-ES and MPI-ESM-MR are considered. For NorESM1-M-based results, a systematic reduction in the driving GCM dry biases in the downscaled outputs is observed. These findings are reflected in the AV results (Figures 5k–n and S4k–n), with positive AV pixels observed mostly over CAF and SEAF for all RegCM4-7 simulations and their ensemble mean, and a higher positive AVC for NorESM1-M at both high and low resolution. The positive AVC of the simulations (see Table 3) indicates an increasing tendency from high- to low-resolution results of, at most, 11%, except MPI-ESM-MR, which demonstrates a dynamically downscaled output with a 22.71% positive AVC at high resolution and a 22.53% positive AVC at low resolution. Similar to the previous seasons' results, the SON season spatially averaged results (see Table 2) indicate a decrease in the error amplitude from high to low resolution.

**Figure 5.** Performance of African precipitation in SON season at high resolution compared to (**a**) CHIRPS, for (**b**) RegCM4-7 evaluation run driven by ERA-INT, (**c**) driving GCM HadGEM2-ES, (**d**) driving GCM MPI-ESM-MR, (**e**) driving GCM NorESM1-M, (**f**) ensemble mean of the driving GCMs, (**g**) RegCM4-7 historical run driven by HadGEM2-ES, (**h**) RegCM4-7 historical run driven by MPI-ESM-MR, (**i**) RegCM4-7 historical run driven by NorESM1-M, (**j**) RegCM4-7 historical runs' ensemble mean, (**k**) Added Value by RegCM4-7 to HadGEM2-ES, (**l**) Added Value by RegCM4-7 to MPI-ESM-MR, (**m**) Added Value by RegCM4-7 to NorESM1-M, (**n**) Added Value by RegCM4-7 to the ensemble mean of the driving GCMs.

#### *3.5. Unified Season, Sub-Area and Resolution-Based Results*

The seasonal performances of the RegCM4-7 historical runs over continental Africa indicate a clear similarity pattern between the results at low and high resolution, even if a wide range of differences in terms of AVC and error amplitude are reported in Tables 2 and 3. Due to the heterogeneous nature of the results over continental Africa, a partition of the overall AVC findings by climate zone and season is further presented in Figure 6. Overall, the AVC results show various outcomes based on the season, sub-area, and driving GCM. The NorESM1-M-driven RegCM4-7 output tends to show the highest positive AVC for all sub-regions and seasons, with few exceptions such as WAF in DJF (Figure 6e) and JJA (Figure 6g), and NEAF in MAM (Figure 6n).

**Figure 6.** Added Value Coverage results for all seasons over (**a**–**d**) SAH region, (**e**–**h**) WAF region, (**i**–**l**) CAF region, (**m**–**p**) NEAF region, (**q**–**t**) SEAF region, (**u**–**x**) WSAF region, and (**y**–**ab**) ESAF region. "\_HR" and "\_LR" refer to high-resolution and low-resolution results, respectively, using CHIRPS observational data.

The positive AVC changes from high to low resolution are mostly in line with the overall seasonal results from Table 3. CAF (Figure 6i–l), NEAF (Figure 6m–p) and SEAF (Figure 6q–t) represent areas with the most consistent positive AVC (mostly >50%), regardless of the driving GCMs, especially in DJF and MAM seasons. RegCM4-7 ensemble mean provides an acceptable performance tradeoff but can be less satisfactory in terms of positive AVC in some specific regions and seasons. Typical examples are the SAH and WSAF regions in SON season (Figure 6d,x), where the positive AVC is less than 10%. In general, sub-regions such as WAF (Figure 6e–h), CAF (Figure 6i–l), and SEAF (Figure 6q–t) show a positive AVC greater than 20%, regardless of the season and driving GCMs.

#### **4. Discussion**

The potential applicability of climate models in different climate studies is highly constrained by model resolution. Beyond the ability of RCMs to integrate fine-scale features, their relatively high resolution compared to GCMs often influences data user preferences [7,9]. The expectation of an AV by RCMs at high resolution owing to their fundamental design choices constitutes another incentive for RCM data use. AV issues have been central to the past few decades of research and development of RCMs [4,5]. Still, their discussions were tailored using the one-way paradigm, where the information flow from the driving GCMs to the RCM is prioritized.

Although sufficient to prove the presence of improvement in a statistical sense, this perspective of the AV gives fewer insights into the attribution of such improvements [7]. Moreover, GCMs are still useful for large-scale studies, but using RCMs as alternatives is still an active research question, especially for data-scarce parts of the world such as Africa. The methodological choices of the present study were mainly motivated by these mentioned issues and the need to provide valuable information towards a better understanding of the resolution-sensitivity of AV by RCMs over Africa, and their potential to bridge resolution and quality gaps between the GCMs and the high-resolution satellite-based products.

The results mainly indicate AV by RegCM4-7 simulations at their native resolution, with a typical dependence on the driving GCM, the season, and the sub-area. Similar results have been highlighted by Gnitou et al. [21] in the context of other CORDEX-CORE precipitation simulations. Moreover, other challenges related to CORDEX-CORE precipitation simulations over Africa, such as the persistence of dry and wet biases at noticeably high amplitudes along the seasonal rain belt, were also found in the RegCM4-7 results. The historical run biases have shown clear consistency with the evaluation run driven by ERA-INT, thus suggesting that the RCM internal model physics might have more influence than the driving GCMs, particularly along the seasonal rain belt. Reasons for such deficiencies may be similar to previously known ones, including missing or misrepresented processes and regional model transferability issues [42,43].

Although the influence of RegCM4-7 internal model physics appears to be consistent across the dynamically downscaled outputs, typical dependencies on the driving GCMs are found in terms of AV. For instance, the NorESM1-M dynamically downscaled output shows a significantly higher positive AVC than the outputs driven by the other GCMs. These performances may be due to the high, medium and low equilibrium climate sensitivity criteria under which the driving GCMs were chosen [16,38]. The typically low equilibrium climate sensitivity of the NorESM1-M driving GCM may explain the higher positive AVC performances. Since the low sensitivity relates to relatively low performances and high error, the downscaled NorESM1-M results reinforce early conclusions by Diaconescu and Laprise [28] that RCMs can bring substantial error reduction when the driving lateral boundary condition contains errors.

Beyond considerations related to the AV by regional climate models seen from a one-way perspective, adopting a second type of AV based on upscaled results from high resolution yielded supplementary results and insights. At low resolution, a typical increase in the positive AVC is observed with some exceptions. These exceptions are related to cases with a decrease in the positive AVC from high to low resolution. In the worst case, however, these exceptions represent a reduction of 8% and therefore indicate that RegCM4-7 simulations could be used for large-scale precipitation applications over Africa when sufficient AV is observed or considered enough for the intended application. These results make the hypothesis of an influential RCM contribution to the AV more plausible than the idea of an AV due to a mere disaggregation of the driving GCMs. Previous findings by Gnitou et al. [21] and Sørland et al. [8] led to similar conclusions.

The resolution-sensitivity results also revealed that the improvement of the positive AVC from high to low resolution occurs in a climate zone where consistent and uniformly distributed spatial positive AV patterns are observed at high resolution. This general tendency is also true for areas with large negative AVC. Therefore, the AV at fine scale probably represents a precondition to the expectation of improved large-scale features by RCM outputs. In the context of climate change projection over Africa, some studies [19,44] documented such major differences between GCMs and RCMs known under the concept of potential or conjectural AV [6,7] since observations are not available to confirm it. The findings from the present study bring more arguments to the possibility that at least part of such GCM-RCM differences is due to salient positive AV of the RCMs that could not, however, be measured in the context of future projections.

The influential impact of the RCMs' internal physics as compared to the driving GCMs is further confirmed by the regional results, which feature WAF, CAF and SEAF for all seasons, and NEAF for the DJF and MAM seasons, as the best performing sub-regions with a positive AVC greater than 20%. For instance, the WAF and CAF sub-regions are known for mesoscale activities and land-atmosphere interactions, and the SEAF and NEAF for their topographic influence on rainfall quantities [19]. The significant positive AVC over these sub-areas represents additional evidence for the hypothesis that the observed AV originates from mesoscale and fine-scale features resolved by RCMs [2,4].

Despite the encouraging results obtained at low resolution and their dependence on the AV at high resolution considering CHIRPS data, the results of the present study should still be treated with caution regarding observational uncertainties. Although the availability of CHIRPS data produced at a resolution similar to CORDEX-CORE simulations motivated the present resolution-sensitivity study, alternative datasets widely used at low resolutions may provide different estimates compared to CHIRPS upscaled results and therefore represent a source of uncertainty for AV results at such resolutions. For instance, as shown in Table S1, the AVC values at low resolution using GPCP data show some differences compared to upscaled CHIRPS estimates used in this study. The upscaling of CHIRPS to the GPCP grid for the present study is motivated by these reasons. This implies that more due diligence is needed from users in their data choices to avoid misuses, as suggested by Giorgi [1].
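To make the idea of upscaling a fine grid to a coarse one concrete, the sketch below averages non-overlapping blocks of a fine grid. It is only an illustration under simplifying assumptions: the remapping method actually used for CHIRPS is not stated in this section, the integer coarsening factor is hypothetical, and the equal-weight mean ignores the latitude dependence of cell areas.

```python
def block_mean(grid, factor):
    """Coarsen a 2-D grid (list of rows) by averaging factor x factor blocks.

    Missing cells are given as None and skipped; a block with no valid
    cell yields None. Grid dimensions must be multiples of `factor`.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid shape must be a multiple of factor")
    coarse = []
    for i in range(0, n_rows, factor):
        row = []
        for j in range(0, n_cols, factor):
            # Collect the valid fine cells that fall inside this coarse cell.
            cells = [grid[a][b]
                     for a in range(i, i + factor)
                     for b in range(j, j + factor)
                     if grid[a][b] is not None]
            row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(row)
    return coarse


# Toy 2 x 4 precipitation field (mm/day) with one missing cell.
fine = [
    [1.0, 3.0, 0.0, None],
    [5.0, 7.0, 2.0, 4.0],
]
coarse = block_mean(fine, 2)
print(coarse)
```

In practice an area-weighted or conservative remapping would usually be preferred for precipitation, which is one reason different low-resolution products can yield different AVC values.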

Overall, the AV results by the CORDEX-CORE RegCM4-7 precipitation simulations and their highly likely attribution to internal RCM physics open a wide range of opportunities for both the climate modeling and the satellite remote sensing communities. For the former community, these results will be instrumental for improving RCM simulations, and therefore, continuously bridging the resolution and quality gap between traditional climate models and high-resolution satellite-based precipitation products. For the latter, these results expand the spectrum of remote sensing data applications, emphasizing climate change applications and opening new avenues toward a process-based understanding of remotely sensed earth observations [30,31].

#### **5. Conclusions**

This study evaluates the AV by CORDEX-CORE RegCM4-7 seasonal precipitation simulations over Africa, using both high- and low-resolution satellite-based observations, with the aim of quantifying large- and fine-scale improvements by the downscaled outputs in comparison to their driving GCMs. The study yielded substantial results, which can be summarized as follows:


In comparison with the 60 years of research and development of earth observation satellites [31], it is fair to say that the present study's results represent a remarkable achievement of the RCM community, which is only 30 years old [4,5]. For instance, the AV results from this study bring new pieces of evidence on the ability of dynamical downscaling to bridge the resolution and quality gap between coarse GCMs and high-resolution satellite-based precipitation products. For Africa, the opportunities ahead are enormous in the context of unifying multi-source precipitation estimates and undertaking process-based climate assessments and projections, especially for applications where high-resolution data is needed [24].

Overall, because of the persistent biases observed in the present study [21,25], RegCM4-7 outputs will still need further processing, such as bias correction that leverages available historical high-resolution satellite-based products, for more plausible future projection analysis until new developments and improvements become available. This is also true for alternative tools such as Empirical Statistical Downscaling (ESD) [45] and convection-permitting simulations [46], which constitute the next step toward further understanding regional climates. The second component of the CORDEX phase II framework, named Flagship Pilot Studies (FPSs), will provide a coordinated setting for testing these emerging tools [13]. Upcoming studies looking at Vulnerability, Impacts and Adaptation (VIA) assessments are to be expected in the near future to explore other CORDEX-CORE data applications.
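As an illustration of what a simple bias correction against a historical satellite-based product could look like, the sketch below applies multiplicative linear scaling to precipitation. This is one of the simplest options and is not a method prescribed by the study; the function names and all values are hypothetical.

```python
def scaling_factor(model_hist, obs_hist):
    """Ratio of observed to simulated mean precipitation over the historical period."""
    model_mean = sum(model_hist) / len(model_hist)
    obs_mean = sum(obs_hist) / len(obs_hist)
    if model_mean == 0:
        raise ValueError("simulated mean is zero; scaling factor is undefined")
    return obs_mean / model_mean


def apply_scaling(model_values, factor):
    """Rescale simulated precipitation so its historical mean matches the observations."""
    return [v * factor for v in model_values]


# Toy seasonal precipitation (mm/day) at one pixel; the values are made up.
obs_hist = [2.0, 4.0, 3.0]      # e.g., satellite-based estimates
model_hist = [4.0, 8.0, 6.0]    # model over the same historical years
model_future = [5.0, 10.0]      # model projection to be corrected

factor = scaling_factor(model_hist, obs_hist)
corrected = apply_scaling(model_future, factor)
print(factor, corrected)
```

Linear scaling only adjusts the mean. Distribution-based methods such as quantile mapping also adjust variability, at the cost of needing longer observational records.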

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs14092102/s1. Figures S1–S4: Performance of African precipitation at low resolution in the DJF (Figure S1), MAM (Figure S2), JJA (Figure S3) and SON (Figure S4) seasons, compared to (a) CHIRPS, for (b) RegCM4-7 evaluation run driven by ERA-INT, (c) driving GCM HadGEM2-ES, (d) driving GCM MPI-ESM-MR, (e) driving GCM NorESM1-M, (f) ensemble mean of the driving GCMs, (g) RegCM4-7 historical run driven by HadGEM2-ES, (h) RegCM4-7 historical run driven by MPI-ESM-MR, (i) RegCM4-7 historical run driven by NorESM1-M, (j) RegCM4-7 historical runs' ensemble mean, (k) Added Value by RegCM4-7 to HadGEM2-ES, (l) Added Value by RegCM4-7 to MPI-ESM-MR, (m) Added Value by RegCM4-7 to NorESM1-M, (n) Added Value by RegCM4-7 to the ensemble mean of the driving GCMs. Table S1: Added Value Coverage results of RegCM4-7 simulations over continental Africa at low resolution for CHIRPS data (LR\_C) and GPCP data (LR\_G).

**Author Contributions:** Conceptualization, G.T.G., Y.H. and G.T.; methodology, G.T.G., Y.H. and G.T.; software, G.T.G., K.T.C.L.K.S., Y.H. and G.T.; validation, Y.H. and G.T.; formal analysis, G.T.G. and K.T.C.L.K.S.; investigation, G.T.G. and Y.H.; resources, Y.H. and G.T.; data curation, G.T.G. and K.T.C.L.K.S.; writing—original draft preparation, G.T.G.; writing—review and editing, I.K.N., K.T.C.L.K.S. and Y.H; visualization, G.T.G., K.T.C.L.K.S. and I.K.N.; supervision, Y.H. and G.T.; project administration, Y.H. and G.T.; funding acquisition, Y.H. and G.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** The National Natural Science Foundation of China, Grant Nos: U1902209 and the National Natural Science Foundation of Yunnan Province (202201AS070069) supported this work, through the Climate Center of Yunnan Meteorological Bureau, with GST registration number 12530000431270053A.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data will be provided upon request from the corresponding author.

**Acknowledgments:** We acknowledge the producers of the CMIP5 and CORDEX-CORE precipitation datasets from the Earth System Grid Federation (ESGF), the GPCP dataset from the University of Maryland, and the CHIRPS dataset from the University of California Santa Barbara Climate Hazards Group. These datasets were used in accordance with their respective terms and conditions.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Dynamical Downscaling of Temperature Variations over the Canadian Prairie Provinces under Climate Change**

**Xiong Zhou 1, Guohe Huang 1,2,\*, Yongping Li 1, Qianguo Lin 3, Denghua Yan <sup>4</sup> and Xiaojia He <sup>5</sup>**

	- Beijing Normal University, Beijing 100875, China; xzhou@bnu.edu.cn (X.Z.); yongping.li@iseis.org (Y.L.)
	- <sup>2</sup> Faculty of Engineering and Applied Science, University of Regina, Regina, SK S4S 0A2, Canada

**Abstract:** In this study, variations of daily mean, maximum, and minimum temperature (expressed as *Tmean*, *Tmax*, and *Tmin*) over the Canadian Prairie Provinces were dynamically downscaled through regional climate simulations. How the regional climate would respond to global warming was subsequently revealed. Specifically, the Regional Climate Model (RegCM) was used to downscale the boundary conditions of Geophysical Fluid Dynamics Laboratory Earth System Model Version 2M (GFDL-ESM2M) over the Prairie Provinces. Daily temperatures (i.e., *Tmean*, *Tmax*, and *Tmin*) were subsequently extracted from the historical and future climate simulations. Temperature variations in the two future periods (i.e., 2036 to 2065 and 2066 to 2095) were then investigated relative to the baseline period (i.e., 1985 to 2004). The spatial distributions of temperatures were analyzed to reveal the regional impacts of global warming on the provinces. The results indicated that the projected changes in the annual averages of daily temperatures would be amplified from the southwest in the Rocky Mountain area to the northeast in the prairie region. It was also suggested that the projected temperature averages would be significantly intensified under RCP8.5. The projected temperature variations could provide scientific bases for adaptation and mitigation initiatives in multiple sectors, such as the agricultural and economic sectors, over the Canadian Prairies.

**Keywords:** dynamical downscaling; projected variations; Canadian Prairies; global warming

#### **1. Introduction**

Climate warming is one of the most significant challenges currently facing the globe and humanity. Consequently, assessment of climate change impacts is needed to support adaptation and mitigation strategies [1–9]. Such impacts are commonly investigated based on future climate projections simulated by global climate models (GCMs) under multiple scenarios [10–13]. However, the mechanisms and processes of climate change at regional scales cannot be comprehensively reflected by these coarse-resolution projections [14–18]. Therefore, the development of fine-resolution climate projections is needed to explore the climate change impacts within a regional context.

Climate projections of GCMs have been primarily employed in previous studies to examine climate variations over the Canadian region in response to global warming [19–30]. More recently, several studies [31–34] have attempted to dynamically downscale GCMs over the Canadian Prairies based on regional climate models (RCMs). For example, PaiMazumder et al. [33] projected changes in short- and long-term drought characteristics over the Canadian Prairies by using an ensemble of ten Canadian RCM (CRCM) simulations. These previous studies have demonstrated that RCMs have advantages in reflecting the climatological processes at local scales as compared to GCMs [9,35]. For instance, Zhou et al. [9] developed a dynamical-coupled downscaling approach and demonstrated its advantages in capturing the regional details.

**Citation:** Zhou, X.; Huang, G.; Li, Y.; Lin, Q.; Yan, D.; He, X. Dynamical Downscaling of Temperature Variations over the Canadian Prairie Provinces under Climate Change. *Remote Sens.* **2021**, *13*, 4350. https://doi.org/10.3390/rs13214350

Academic Editor: Jorge Vazquez

Received: 8 September 2021 Accepted: 26 October 2021 Published: 29 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

RCMs can resolve more detailed features such as mountain ranges, coastal zones, and soil properties, while remaining consistent with the physical mechanisms in GCMs. As a consequence, dynamical downscaling can provide climate variables at a fine spatio-temporal resolution in order to support a better understanding of climate change impacts within the global warming context [36]. However, there have been few reports of dynamical downscaling of daily temperatures over the Canadian Prairie Provinces at a high spatial resolution (e.g., 25 km) under the representative concentration pathways (RCPs).

In addition, the Canadian Prairie Provinces are affected by the Rocky Mountains and undulating prairies. Such complex topography has a significant impact on the local climate conditions in this region. The observations in the Rocky Mountain region are sparse due to the complex topography. Regional climate downscaling can provide gridded climatic information with a fine spatio-temporal resolution, supporting mitigation of and adaptation to the severe impacts of climatic changes. The development of fine-resolution climate projections is thus essential to comprehensively reveal the potential impacts of climate change on the Prairie Provinces.

Therefore, the objective of this study is to examine how the regional climate over the Canadian Prairie Provinces will respond to global warming through the development of climate projections based on an RCM. In detail, the Regional Climate Model (RegCM) is used to downscale the boundary conditions of Geophysical Fluid Dynamics Laboratory Earth System Model Version 2M (GFDL-ESM2M). Daily mean, maximum, and minimum temperature are subsequently extracted from the historical and future climate simulations. Temperature variations in the two future periods (i.e., 2036 to 2065 and 2066 to 2095) are then investigated relative to the baseline period (i.e., 1985 to 2004). The spatial temperature distributions are analyzed to reveal the impacts of global warming on the Canadian Prairie Provinces. It is expected that the projected temperature variations will provide scientific bases for adaptation and mitigation initiatives in multiple sectors, such as the agricultural and economic sectors.

#### **2. Model Setup, Study Area, and Data Sets**

In this study, physically-based climate downscaling is based on version 4.6.0 of RegCM, which is developed by the International Center for Theoretical Physics [37–40]. The RegCM4.6 model is employed to dynamically downscale the GFDL-ESM2M simulations [41–44], which are derived from the historical (1950–2005) and future (2006–2099) experiments under RCPs [35]. Specifically, the intermediate (i.e., RCP4.5) and heavy (i.e., RCP8.5) emission scenarios are chosen to reveal the range of possible future climate variations. These emission scenarios diverge from each other mainly after the year 2050. More specific information on the scenarios is provided in previous studies [9,35].

The study area covers the Canadian Prairie Provinces, which include the Provinces of Manitoba, Saskatchewan, and Alberta. As shown in Figure 1, the topography in the study area is varied and complicated. The highest and lowest elevations are 3434 and 1 m, respectively. The Prairie Provinces have a total surface area of over 1,960,000 km<sup>2</sup>, accounting for 19.6% of the entire area of Canada [9]. Moreover, the climatic patterns of the Arctic and the Rocky Mountains have significant impacts on the climate of the Canadian Prairie Provinces [35,45].

A spatial experiment domain of 108 longitude × 128 latitude grid points with a horizontal resolution of 0.22° × 0.22° (i.e., approximately 25 km) is set up to cover the Prairie Provinces [9]. The land surface scheme is version 4.5 of the Community Land Model (CLM4.5), while the multiple-phase cloud microphysics scheme is selected as the moisture scheme in RegCM4.6 [46]. The lateral boundary condition scheme is the exponential relaxation technique [47,48]. The Holtslag PBL [49] is chosen as the boundary layer scheme and the cumulus convective scheme is the Emanuel scheme [50].
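The statement that 0.22° corresponds to approximately 25 km can be checked with a short arc-length calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km and treats the spacing as nominal; the model's actual map projection is not described in this section, so the domain extents it prints are rough estimates only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def degrees_to_km(deg):
    """Great-circle arc length in km subtended by an angle in degrees."""
    return math.radians(deg) * EARTH_RADIUS_KM


spacing_km = degrees_to_km(0.22)   # nominal grid spacing
extent_x_km = 108 * spacing_km     # nominal extent along the 108-point axis
extent_y_km = 128 * spacing_km     # nominal extent along the 128-point axis

print(round(spacing_km, 1), round(extent_x_km), round(extent_y_km))
```

The spacing comes out slightly below 25 km, which is consistent with the "approximately 25 km" label conventionally attached to 0.22° grids.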

**Figure 1.** Topography of the study area.

The annual average of daily maximum temperature over the Canadian Prairies in the period of 1960 to 2005 ranged from −4.0 to 12.6 °C, whereas daily minimum temperature varied from −12.2 to 0.6 °C. From 1960 to 2005, the maximum and minimum temperature increased by 1.5 and 2.3 °C, respectively. The spatial average of daily mean temperature for the period of 1960 to 2005 fluctuated from 3.7 to 7.3 °C, and has increased by 1.6 °C since 1895 [9].

Intensified greenhouse-gas impacts and aggravated land-cover changes have contributed to increases in annual averages of daily maximum and minimum temperature [51]. It is expected that such increases would be amplified in the future, which can significantly affect the agriculture sector in the Canadian Prairies. Therefore, a higher spatial resolution of climate projections is needed to reveal the impacts of climate changes on the Canadian Prairies, and thus facilitate proper mitigative and adaptive strategies.

The historical climate observations over the Canadian Prairies were derived from a 10 km gridded climate dataset, which was produced by the National Land and Water Information Service (NLWIS), Agriculture and Agri-Food Canada [52]. In this study, daily maximum and minimum temperatures in the period of 1985 to 2004 were extracted to analyze the annual averages within the study area. The elevation of the Canadian Prairie Provinces with a horizontal resolution of 30 arc-seconds was acquired from the Canadian Digital Elevation Data, which is developed by the Canadian Forestry Service, Ontario region [53].

The RegCM4.6 model is used to dynamically downscale daily mean, maximum, and minimum temperature (expressed as *Tmean*, *Tmax*, and *Tmin*) over the Canadian Prairies. Annual averages of the daily temperatures for the future periods of 2036 to 2065 and 2066 to 2095 are calculated. The changes in temperature relative to the baseline period (i.e., 1985 to 2004) are then analyzed. In particular, the changes in temperature (i.e., *Vmean*, *Vmax*, and *Vmin*) are calculated as follows:

$$V_{\mathrm{mean}} = T_{\mathrm{mean}}^{f} - T_{\mathrm{mean}}^{h} \tag{1}$$

$$V_{\mathrm{max}} = T_{\mathrm{max}}^{f} - T_{\mathrm{max}}^{h} \tag{2}$$

$$V_{\mathrm{min}} = T_{\mathrm{min}}^{f} - T_{\mathrm{min}}^{h} \tag{3}$$

where $T_{\mathrm{mean}}^{f}$, $T_{\mathrm{max}}^{f}$, and $T_{\mathrm{min}}^{f}$ represent the annual averages of daily mean, maximum, and minimum temperature in the future periods, respectively, whereas $T_{\mathrm{mean}}^{h}$, $T_{\mathrm{max}}^{h}$, and $T_{\mathrm{min}}^{h}$ indicate the annual averages of daily mean, maximum, and minimum temperature for the historical period (i.e., 1985 to 2004). Detailed steps for projecting temperature variations through the RegCM4.6 model are summarized in Figure 2.
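Equations (1)–(3) amount to a cell-by-cell difference between two period averages. A minimal sketch is given below, assuming that annual-average fields are already available as small nested lists (one 2-D field per year). The function names and the toy values are illustrative and are not taken from the study's processing chain.

```python
def period_mean(annual_fields):
    """Average a list of 2-D fields (one per year) cell by cell."""
    n_years = len(annual_fields)
    n_rows, n_cols = len(annual_fields[0]), len(annual_fields[0][0])
    return [[sum(f[i][j] for f in annual_fields) / n_years
             for j in range(n_cols)]
            for i in range(n_rows)]


def temperature_change(future_mean, historical_mean):
    """Cell-wise V = T^f - T^h, following the form of Equations (1)-(3)."""
    return [[tf - th for tf, th in zip(row_f, row_h)]
            for row_f, row_h in zip(future_mean, historical_mean)]


# Toy example: two years per period on a 1 x 2 grid (degrees Celsius, made up).
historical = [[[-1.0, 2.0]], [[-3.0, 4.0]]]
future = [[[0.0, 4.0]], [[-1.0, 5.0]]]

v = temperature_change(period_mean(future), period_mean(historical))
print(v)
```

The same two functions would be applied separately to the mean, maximum, and minimum temperature fields to obtain *Vmean*, *Vmax*, and *Vmin*.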

**Figure 2.** Methodological framework for projecting temperature variations.

#### **3. Projected Variations of Temperature**

Previous studies have illustrated that RCMs capture regional terrestrial and atmospheric processes better than GCMs [9,35]. The performance of RCMs in reproducing the historical temperature observations has been evaluated in previous studies [9,35,45,54–60]. Therefore, the development of fine-resolution temperature projections is the focus of this study. To comprehensively reveal how future temperature would change over the Canadian Prairies, temperature variations (i.e., *Vmean*, *Vmax*, and *Vmin*) in the two future periods of 2036 to 2065 and 2066 to 2095 under RCPs are analyzed with respect to the baseline period of 1985 to 2004.

Figure 3 presents the projected variations of annual averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2036 to 2065 under RCP4.5 relative to the baseline period. Columns one and two show the spatial distributions of the averages of daily temperatures for the historical and future periods, respectively, whereas column three shows the calculated variations (i.e., *Vmean*, *Vmax*, and *Vmin*); rows one to three present *Tmean*, *Tmax*, and *Tmin*, respectively. For *Tmean*, the simulated spatial average and standard deviation over the entire area for the historical period are −0.4 and 2.1 °C, respectively (Figure 3a). The annual average shows a decreasing pattern from south to north. The spatial distribution from the future experiment in the period of 2036 to 2065 under RCP4.5 (Figure 3b) is close to the historical one (Figure 3a), with clearly higher values. The result also indicates that the spatial average of *Tmean* over the provinces is projected to be 1.0 °C in the period of 2036 to 2065, where the minimum and maximum values in the map reach −7.3 °C in the southwest and 5.8 °C in the southeast, respectively. However, a smaller standard deviation of 1.9 °C is projected for the period of 2036 to 2065, which indicates that the spatial variation would be reduced due to increases in the negative values under climate change. As shown in Figure 3c, the average of *Tmean* increases over the entire Prairie Provinces, with a northeastward-intensified pattern of projected changes. In addition, the RegCM model simulates the largest increase at 56.60° N and 96.73° W in the northeast of Manitoba, and the smallest one at 51.47° N and 112.66° W in the southwest of Alberta.
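The summary statistics quoted in this section (spatial average, standard deviation, and the locations of the largest and smallest changes) can be derived from a gridded field as sketched below. The sketch assumes an unweighted average over grid cells, a population standard deviation, and 1-D latitude and longitude axes; whether the study used these conventions is not stated, and the coordinates and values are made up.

```python
from statistics import fmean, pstdev


def spatial_summary(field):
    """Unweighted mean and population standard deviation over all grid cells."""
    values = [v for row in field for v in row]
    return fmean(values), pstdev(values)


def locate_extreme(field, lats, lons, largest=True):
    """Return (value, lat, lon) of the largest or smallest cell in the field."""
    pick = max if largest else min
    value, i, j = pick((v, i, j)
                       for i, row in enumerate(field)
                       for j, v in enumerate(row))
    return value, lats[i], lons[j]


# Toy 2 x 2 field of temperature changes (degrees Celsius); values are made up.
change = [[1.0, 1.2],
          [1.6, 2.0]]
lats = [50.0, 56.0]        # degrees north, one per row
lons = [-112.0, -97.0]     # degrees east (negative = west), one per column

mean, std = spatial_summary(change)
print(mean, std)
print(locate_extreme(change, lats, lons, largest=True))
print(locate_extreme(change, lats, lons, largest=False))
```

On a projected regional grid, latitude and longitude are typically stored as 2-D arrays, so the lookup would index those arrays with the same `(i, j)` pair instead of two 1-D lists.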

**Figure 3.** Projected variations of annual averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2036 to 2065 under RCP4.5. (**a**,**d**,**g**) show the historical distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**b**,**e**,**h**) show the future distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**c**,**f**,**i**) show the projected variations of *Vmean*, *Vmax*, and *Vmin*, respectively.

Compared with *Tmean*, most of the annual averages of *Tmax* are simulated to be above 0 °C, with values varying from −5.7 to 8.2 °C. The results derived from the RegCM model also show that *Tmax* has a spatial average of 4.3 °C and a standard deviation of 2.1 °C over the entire area for the historical period of 1985 to 2004 (Figure 3d). For the projected average of *Tmax*, the future (Figure 3e) and the historical (Figure 3d) experiments generally have a similar gradient pattern, but amplified values are revealed in the future period of 2036 to 2065. In particular, a spatial average of 5.6 °C over the Canadian Prairies is projected, with a standard deviation of 2.1 °C in the future period under RCP4.5. Figure 3f presents the projected variations of annual averages of *Tmax* for the future period relative to the baseline period. It can be observed that the entire Prairie Provinces would experience increases in *Tmax*, with a pattern that intensifies from southwest to northeast. The annual averages of *Tmax* would increase by 0.6 to 2.1 °C, while the most significant increase is found at the grid cell at 58.13° N and 94.91° W in the northeastern part.

In contrast, *Tmin* over the Prairie Provinces is almost entirely below zero degrees, with a spatial average of −4.6 °C and a standard deviation of 2.4 °C (Figure 3g). This also indicates that *Tmin* presents greater spatial variations than *Tmax*. The spatial patterns of the simulated historical and projected future averages of *Tmin* match each other quite well (Figure 3g,h), but the future values are much greater. A spatial average of −3.1 °C over the Canadian Prairies is projected, with a standard deviation of 2.2 °C in the period of 2036 to 2065 under RCP4.5. The greatest and smallest averages are projected to be 3.4 °C in the southeast and −11.0 °C in the Rocky Mountain area, respectively. However, the smaller spatial variations in the future might be caused by increased values of negative *Tmin* under climate change. The projected variations of annual averages of *Tmin* for the future period are presented in Figure 3i. Similar to *Tmax* during the future period, the increases in the annual averages of *Tmin* range from 0.8 to 2.3 °C. It can also be observed that there is a gradient that intensifies from southwest to northeast. Similarly, the most significant increase is located at 56.86° N and 88.97° W in the northeast, whereas the smallest one is at 51.47° N and 112.66° W in the southwest.

Moreover, Figure 4 depicts the annual averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2066 to 2095 under RCP4.5, as well as their changes with respect to the baseline. As in Figure 3, the spatial distributions of projected averages of daily temperatures for the historical and future periods are shown in columns one and two, respectively. Column three depicts the computed changes in the projected values relative to the baseline period. In addition, *Tmean*, *Tmax*, and *Tmin* are presented in rows one to three, respectively. In general, the spatial distribution patterns of *Tmean*, *Tmax*, and *Tmin* are similar for the historical and future periods. Similar to the findings for the period of 2036 to 2065, the spatial averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2066 to 2095 increase by 1.7, 1.6, and 1.8 °C, respectively (Figure 4c,f,i). However, such increases are projected to be slightly greater than those in the period of 2036 to 2065 under RCP4.5. Moreover, the projected changes in the annual averages of *Tmean*, *Tmax*, and *Tmin* would be amplified from the southwest in the Rocky Mountain area to the northeast in the prairie region.

In order to further analyze the projected variations of *Tmean*, *Tmax*, and *Tmin*, the spatial distributions of their annual averages over the entire domain for the periods of 2036 to 2065 and 2066 to 2095 under RCP4.5 are compared to those for the baseline period (Figure 5). In general, the distributions of *Tmean*, *Tmax*, and *Tmin* keep a similar shape across the three periods. However, intensification of daily temperatures in the future is evident from the changes in mean values, which is consistent with the previous findings (Figures 3 and 4). It can be observed that there is no significant difference in the projected changes between the period of 2036 to 2065 and the period of 2066 to 2095 in terms of the mean and standard deviation. For example, the mean and standard deviation of projections for *Tmin* in the period of 2036 to 2065 are projected to be −2.8 and 2.3 °C, respectively, while during the period of 2066 to 2095, the mean of projections is −3.1 °C with a standard deviation of 2.2 °C.
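The comparison of periods relies on two summary statistics of each gridded field (the spatial mean and the spatial standard deviation) and on a cell-by-cell difference between the future and baseline fields. The following is a minimal sketch of that arithmetic, not the authors' processing code. It assumes each field is a flat list of grid-cell values with `None` for missing cells, and it uses unweighted statistics because the text does not state whether area weighting was applied. The function names and the sample values are hypothetical.

```python
import statistics


def spatial_stats(field):
    """Return (mean, population standard deviation) over valid grid cells."""
    valid = [v for v in field if v is not None]
    return statistics.fmean(valid), statistics.pstdev(valid)


def projected_change(future, baseline):
    """Cell-by-cell change (future minus baseline); None where either is missing."""
    return [
        None if f is None or b is None else f - b
        for f, b in zip(future, baseline)
    ]


# Hypothetical annual-average Tmin values (in deg C) on a tiny four-cell grid.
baseline = [-6.0, -4.5, None, -3.0]
future = [-4.0, -3.0, None, -2.0]

change = projected_change(future, baseline)
mean_change, sd_change = spatial_stats(change)
print(f"mean change = {mean_change:.2f}, spatial sd of change = {sd_change:.2f}")
```

With a common mask of valid cells, the mean of the change field equals the difference between the two spatial means, which is why a shift in the mean can be read directly from the period-wise distributions.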

Figure 6 displays the projected annual averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2036 to 2065 under RCP8.5. In the same way, the panels in columns one to three present the simulated averages of daily temperatures for the historical period, those for the future period, and their differences, respectively. The panels in rows one to three show *Tmean*, *Tmax*, and *Tmin*, respectively. Overall, the results show that the projections in the period of 2036 to 2065 under RCP8.5 are similar to those under RCP4.5. Moreover, there is an apparent warming pattern during the period of 2036 to 2065 under RCP8.5. For example, the RegCM model simulates spatial averages of −4.6 and −2.5 °C for *Tmin* during the periods of 1985 to 2004 and 2036 to 2065, respectively. The largest average over the map during the future period of 2036 to 2065 is located at 50.20° N and 98.04° W in the southeast, whereas the smallest one is at 52.19° N and 117.24° W in the southwest (Figure 6b).

Moreover, significant increases under RCP8.5 are observed in the calculated differences between two periods. There is a similar intensified pattern from southwest to northeast, which is consistent with the results under RCP4.5.

**Figure 4.** Projected variations of annual averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2066 to 2095 under RCP4.5. (**a**,**d**,**g**) show the historical distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**b**,**e**,**h**) show the future distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**c**,**f**,**i**) show the projected variations of *Vmean*, *Vmax*, and *Vmin*, respectively.

To further analyze the temperature changes, Figure 7 presents the projected averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2066 to 2095 under RCP8.5. Similar to the pattern during the period of 2036 to 2065 under RCP8.5, the increase in the averages weakens from the northeastern prairie region to the southwestern Rocky Mountain area. However, the projected increases during the period of 2066 to 2095 under RCP4.5 and RCP8.5 differ significantly from each other. For example, the maximum increases in annual averages of daily maximum temperature during the period of 2066 to 2095 under RCP4.5 and RCP8.5 are 2.5 and 4.5 °C, respectively. Such increases are amplified under RCP8.5 because of its heavier GHG emissions.

The spatial distributions of annual averages over the entire Prairie Provinces for the two future periods under RCP8.5 are compared with those for the baseline period (Figure 8). The distributions of *Tmean*, *Tmax*, and *Tmin* generally maintain the same shape across the three periods, which is consistent with the results under RCP4.5. Nevertheless, the projected averages of daily temperatures are significantly intensified, as seen from the shifted mean values. For example, the mean value of *Tmax* in the period of 2066 to 2095 is projected to be 6.0 °C under RCP4.5, whereas it rises to 7.8 °C under RCP8.5.

**Figure 5.** Spatial distribution of annual averages for the historical and future periods under RCP4.5. (**a**) *Tmean*, (**b**) *Tmax*, and (**c**) *Tmin*.

**Figure 6.** Projected variations of annual averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2036 to 2065 under RCP8.5. (**a**,**d**,**g**) show the historical distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**b**,**e**,**h**) show the future distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**c**,**f**,**i**) show the projected variations of *Vmean*, *Vmax*, and *Vmin*, respectively.

**Figure 7.** Projected variations of annual averages of *Tmean*, *Tmax*, and *Tmin* for the period of 2066 to 2095 under RCP8.5. (**a**,**d**,**g**) show the historical distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**b**,**e**,**h**) show the future distributions of *Tmean*, *Tmax*, and *Tmin*, respectively; (**c**,**f**,**i**) show the projected variations of *Vmean*, *Vmax*, and *Vmin*, respectively.

**Figure 8.** Spatial distribution of annual averages for the historical and future periods under RCP8.5. (**a**) *Tmean*, (**b**) *Tmax*, and (**c**) *Tmin*.

#### **4. Projected Trends of Temperature**

In order to further explore the impacts of climate change on the Prairie Provinces, the Mann–Kendall (MK) test [61,62] and the non-parametric Theil–Sen (TS) estimator [63,64] were applied to reveal the temporal trends in annual averages of *Tmean*, *Tmax*, and *Tmin* at multiple stations under the RCPs. In this study, sixteen spatially distributed stations are chosen; more details can be found in Zhou et al. [9]. The *p*-value of the MK test statistic and the magnitude of the TS estimator are examined to demonstrate how the Prairie Provinces are affected within the context of global warming.
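The two statistics named above can be sketched in a few lines. The code below implements the classical two-sided Mann–Kendall test (with the standard tie correction and a normal approximation for the *p*-value) and the Theil–Sen slope as the median of all pairwise slopes. It is an illustrative sketch only: the text does not say which software or which MK variant (for example, with or without a serial-correlation correction) was used, and the function names and the synthetic series are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(series):
    """Classical two-sided Mann-Kendall test; returns (S, Z, p_value)."""
    n = len(series)
    # S counts concordant minus discordant pairs (later value vs. earlier value).
    s = sum((b > a) - (b < a) for a, b in combinations(series, 2))
    # Variance of S under the null hypothesis, corrected for tied values.
    tie_counts = {}
    for value in series:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in tie_counts.values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p_value


def theil_sen_slope(times, values):
    """Median of the slopes between all pairs of points."""
    slopes = [
        (values[j] - values[i]) / (times[j] - times[i])
        for i, j in combinations(range(len(values)), 2)
        if times[j] != times[i]
    ]
    return median(slopes)


# Synthetic annual series for one hypothetical station (deg C), 2036 to 2065.
years = list(range(2036, 2066))
tmean = [-1.0 + 0.03 * (y - 2036) + 0.2 * math.sin(y) for y in years]

s, z, p = mann_kendall(tmean)
slope = theil_sen_slope(years, tmean)
alpha = 0.05
print(f"S={s}, Z={z:.2f}, p={p:.4f}, trend={slope:.3f} degC/year, "
      f"significant={p < alpha}")
```

A trend is reported as significant when the *p*-value falls below the chosen significance level (*α* = 0.05 in this study), and the Theil–Sen slope gives its magnitude in °C/year.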

Figure 9 shows the time series and temporal trends of annual averages of *Tmean* at the 16 stations for the two future periods under the RCPs. It can be observed that the *p*-values of the MK test statistics at the majority of stations for the future periods under RCP8.5 are less than the significance level (*α* = 0.05), indicating that there are significant trends in the annual averages of *Tmean*. In contrast, *Tmean* at all the stations in the period of 2066 to 2095 under RCP4.5 shows insignificant trends, ranging from −0.029 to 0.020 °C/year. This might be attributable to the stabilization of population growth, land-use conversions, and greenhouse gas (GHG) emissions under RCP4.5 after the year 2050.

**Figure 9.** Temporal trends in annual averages of *Tmean* at the 16 stations. The *p*-value and the trend are derived from the MK test and the Theil–Sen estimator, respectively. The straight line represents the linear trend derived from the Theil–Sen estimator. (**a**) Edmonton, (**b**) Calgary, (**c**) Fort McMurray, (**d**) Red Deer, (**e**) Medicine Hat, (**f**) Grande Prairie, (**g**) Key Lake, (**h**) Saskatoon, (**i**) Regina, (**j**) Waseca, (**k**) Pelly, (**l**) Winnipeg, (**m**) Gillam, (**n**) Thompson, (**o**) Brandon, and (**p**) The Pas.

In addition, temporal trends are investigated for annual averages of *Tmax* and *Tmin* for the future periods under the RCPs, which are presented in Figures 10 and 11, respectively. Similar warming trends are detected for the annual averages at all the stations under RCP8.5. However, the magnitude of positive trends in *Tmean* for most stations is generally smaller than that in *Tmin* but larger than that in *Tmax*. This is also consistent with the results of projected variations, where larger increases are reported in *Tmin*. It is also evident that the trends in *Tmax* and *Tmin* at all the stations in the period of 2066 to 2095 under RCP4.5 are insignificant, as seen from the *p*-values of the MK test statistics.

**Figure 10.** Temporal trends in annual averages of *Tmax* at the 16 stations. The *p*-value and the trend are derived from the MK test and the Theil–Sen estimator, respectively. The straight line represents the linear trend derived from the Theil–Sen estimator. (**a**) Edmonton, (**b**) Calgary, (**c**) Fort McMurray, (**d**) Red Deer, (**e**) Medicine Hat, (**f**) Grande Prairie, (**g**) Key Lake, (**h**) Saskatoon, (**i**) Regina, (**j**) Waseca, (**k**) Pelly, (**l**) Winnipeg, (**m**) Gillam, (**n**) Thompson, (**o**) Brandon, and (**p**) The Pas.

**Figure 11.** Temporal trends in annual averages of *Tmin* at the 16 stations. The *p*-value and the trend are derived from the MK test and the Theil–Sen estimator, respectively. The straight line represents the linear trend derived from the Theil–Sen estimator. (**a**) Edmonton, (**b**) Calgary, (**c**) Fort McMurray, (**d**) Red Deer, (**e**) Medicine Hat, (**f**) Grande Prairie, (**g**) Key Lake, (**h**) Saskatoon, (**i**) Regina, (**j**) Waseca, (**k**) Pelly, (**l**) Winnipeg, (**m**) Gillam, (**n**) Thompson, (**o**) Brandon, and (**p**) The Pas.

#### **5. Discussion**

In this study, the Canadian Prairies show a larger increase in the annual averages of *Tmin* than in those of *Tmax* (Figure 12). This also implies that projected increases in annual averages of *Tmax* (Figure 12b) make a smaller contribution than those of *Tmin* (Figure 12c) to the increase in annual averages of *Tmean* (Figure 12a). The downscaling experiments also reveal that the trends in *Tmean*, *Tmax*, and *Tmin* at all the stations in the 2080s under RCP4.5 are insignificant. This might be attributable to the stabilization of population growth, land-use conversions, and greenhouse gas (GHG) emissions under RCP4.5 after the year 2050. Moreover, we found that greater increases in annual averages of daily temperatures are projected under RCP8.5 than under RCP4.5, indicating that a higher emission scenario will lead to a more significant increase in temperatures due to greenhouse gas effects.

The projected results of temperature variations reveal that increases in the annual averages of daily temperatures would be intensified from the southwest in the Rocky Mountain area to the northeast in the prairie region. This is most likely due to the dynamic effects of a strong positive ice-snow albedo feedback, amplifying climate change as a result of decreasing ice-snow cover in the prairie region. The contribution of ice-snow albedo feedback to larger increases in the annual averages of *Tmin* in the northeast region would be more significant than *Tmax* due to a similar amplification effect. This is also consistent with the previous studies, which indicate that the high-latitude ice-snow albedo feedback is a primary element in projected increases in temperatures under the global warming scenarios [65–67].

**Figure 12.** Spatial averages of projected changes for the future periods under RCPs. (**a**) *Tmean*, (**b**) *Tmax*, and (**c**) *Tmin*.

#### **6. Conclusions**

In this study, dynamically downscaled variations of daily mean, maximum, and minimum temperature over the Canadian Prairies were developed through regional climate simulations. How the regional climate will change in response to global warming in the future was thus analyzed. Specifically, RegCM was used to downscale the boundary conditions of GFDL-ESM2M. Daily temperatures were subsequently extracted from the historical and future simulations. Temperature variations in the future periods (i.e., 2036 to 2065 and 2066 to 2095) were then investigated relative to the baseline period (i.e., 1985 to 2004). The spatial temperature distributions were analyzed to reveal the regional impacts of global warming on the Prairie Provinces.

The results suggested that the spatial distribution from the future experiment in the period of 2036 to 2065 under RCP4.5 is close to the historical one, but with a clearly increased magnitude. Moreover, there is no significant difference in the projected changes in temperatures between the period of 2036 to 2065 and the period of 2066 to 2095 in terms of their mean and standard deviation. The results further indicated that the projected changes would be amplified from the southwest in the Rocky Mountain area to the northeast in the prairie region. It was also suggested that the projected changes would be significantly intensified under RCP8.5. The projected variations of daily mean, minimum, and maximum temperature could provide scientific bases for adaptation and mitigation initiatives in multiple sectors, such as the agricultural and economic sectors.

Significant increases in temperatures over the Canadian Prairies have been revealed through a comprehensive analysis of the projected variations. Such increases can provide scientific bases for identifying appropriate adaptation and mitigation initiatives in multiple sectors. This study is limited in that the RegCM model is driven by only one GCM, which makes it difficult to reflect the uncertainties of boundary conditions. However, the boundary data provided by GCM outputs could significantly affect the projected temperature variations. Therefore, future research efforts can be extended to investigate the full potential ranges of temperature increases through dynamical downscaling of multiple GCMs.

**Author Contributions:** Conceptualization, X.Z.; methodology, X.Z.; software, X.Z.; validation, X.Z.; formal analysis, X.Z.; investigation, X.Z.; resources, X.Z., G.H. and Y.L.; data curation, X.Z., Q.L., D.Y. and X.H.; writing—original draft preparation, X.Z.; writing—review and editing, X.Z; visualization, X.Z.; supervision, G.H. and Y.L.; funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is supported by the Natural Science Foundation of China (51779008, U2040212) and the Fundamental Research Funds for the Central Universities.

**Acknowledgments:** We are very grateful for the helpful inputs and suggestions from the editor and anonymous reviewers.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Long-Term Projection of Water Cycle Changes over China Using RegCM**

#### **Chen Lu <sup>1</sup>, Guohe Huang <sup>1,</sup>\*, Guoqing Wang <sup>2</sup>, Jianyun Zhang <sup>2</sup>, Xiuquan Wang <sup>3</sup> and Tangnyu Song <sup>1</sup>**


**Abstract:** The global water cycle is becoming more intense in a warming climate, leading to extreme rainstorms and floods. In addition, the delicate balance of precipitation, evapotranspiration, and runoff affects the variations in soil moisture, which is of vital importance to agriculture. A systematic examination of climate change impacts on these variables may help provide scientific foundations for the design of relevant adaptation and mitigation measures. In this study, long-term variations in the water cycle over China are explored using the Regional Climate Model system (RegCM) developed by the International Centre for Theoretical Physics. Model performance is validated through comparing the simulation results with remote sensing data and gridded observations. The results show that RegCM can reasonably capture the spatial and seasonal variations in three dominant variables for the water cycle (i.e., precipitation, evapotranspiration, and runoff). Long-term projections of these three variables are developed by driving RegCM with boundary conditions of the Geophysical Fluid Dynamics Laboratory Earth System Model under the Representative Concentration Pathways (RCPs). The results show that increased annual average precipitation and evapotranspiration can be found in most parts of the domain, while a smaller part of the domain is projected with increased runoff. Statistically significant increasing trends (at a significance level of 0.05) can be detected for annual precipitation and evapotranspiration, which are 0.02 and 0.01 mm/day per decade, respectively, under RCP4.5 and are both 0.03 mm/day per decade under RCP8.5. There is no significant trend in future annual runoff anomalies. The variations in the three variables mainly occur in the wet season, in which precipitation and evapotranspiration increase and runoff decreases. The projected changes in precipitation minus evapotranspiration are larger than those in runoff, implying a possible decrease in soil moisture.

**Keywords:** climate change; China; water cycle; precipitation; evapotranspiration; runoff; projection

#### **1. Introduction**

**Citation:** Lu, C.; Huang, G.; Wang, G.; Zhang, J.; Wang, X.; Song, T. Long-Term Projection of Water Cycle Changes over China Using RegCM. *Remote Sens.* **2021**, *13*, 3832. https://doi.org/10.3390/rs13193832

Academic Editor: Magaly Koch

Received: 29 August 2021 Accepted: 23 September 2021 Published: 25 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The global water cycle is becoming more intense in a warming climate; increases in precipitation, evapotranspiration, and runoff can be widely observed over the world [1,2]. The resulting extreme rainstorms and heavy runoff can themselves lead to losses of life and damage to infrastructure (such as urban drainage systems), not to mention that they may also cause flood events with more devastating consequences [3]. On the other hand, the delicate balance of the three variables also deserves attention, as, according to the surface water budget equation *dS*/*dt* = *P* − *E* − *R* (where *S* denotes the subsurface storage of water substances, *P* is precipitation, *E* is evapotranspiration, and *R* is runoff) [2,4], they are closely associated with variations in soil moisture. In cases where meteorological drought occurs (i.e., long-term rainfall deficit), the interactions among the water cycle components could affect how the meteorological drought is propagated to hydrological drought (i.e., long-term runoff deficit) and agricultural drought (i.e., long-term soil moisture deficit), which can have significant influences on regional water availability and agriculture [5]. In addition, variations in the water cycle can have profound long-term environmental [6–8] and ecological [9,10] implications. In China, disasters caused by variations in water cycle components have been recorded. For example, a series of flood events that occurred in the Yangtze River Basin in 2020, in which the inflow to the Three Gorges Dam once reached 75,000 m<sup>3</sup>/s during the fifth flood, caused the death/disappearance of 219 people and an economic loss of 178.96 billion yuan by 13 August [11,12]. In order to avoid such losses, studies of future variations in the water cycle are needed to help provide scientific foundations for the design of relevant adaptation and mitigation measures.
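The surface water budget equation *dS*/*dt* = *P* − *E* − *R* quoted in this paragraph can be made concrete with a small numerical sketch. The code below only illustrates the arithmetic of the budget. The function name is hypothetical, and the input values are invented for illustration rather than taken from the study.

```python
def storage_tendency(precip, evap, runoff):
    """Surface water budget residual dS/dt = P - E - R (all terms in mm/day)."""
    return precip - evap - runoff


# Hypothetical domain-mean values in mm/day, chosen only for illustration.
baseline = {"precip": 2.0, "evap": 1.2, "runoff": 0.7}
future = {"precip": 2.3, "evap": 1.4, "runoff": 0.7}

ds_dt_baseline = storage_tendency(**baseline)
ds_dt_future = storage_tendency(**future)

# Change in (P - E) and change in R between the two periods.
delta_p_minus_e = (future["precip"] - future["evap"]) - (
    baseline["precip"] - baseline["evap"]
)
delta_runoff = future["runoff"] - baseline["runoff"]

# By construction, the storage tendency shifts by delta(P - E) - delta(R).
delta_storage_tendency = ds_dt_future - ds_dt_baseline
print(f"d(P-E) = {delta_p_minus_e:.2f}, dR = {delta_runoff:.2f}, "
      f"d(dS/dt) = {delta_storage_tendency:.2f} mm/day")
```

The point of the sketch is that a change in subsurface storage cannot be read from any single component; it depends on how the changes in *P*, *E*, and *R* combine.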

Previous attempts have been made to project future changes in individual water cycle components. For example, Li et al. [13] applied a self-organizing map to study the future changes in summer precipitation in eastern China using global climate model (GCM) data of the Coupled Model Intercomparison Project Phase 5 (CMIP5). Dong et al. [14] examined the climate change impacts on reference evapotranspiration in Xinjiang, China based on CMIP5 GCM projections. Yan et al. [15] projected future runoff in the Yellow River Basin using CMIP5 GCM data through global Bayesian model averaging. Fewer studies can be found focusing on more than one water-cycle-related variable. One example is the study of Gu et al. [16] on future precipitation and runoff changes and the resultant drought events over China using GCM and hydrological model ensembles. In another study conducted by Zhang et al. [2], future changes in the three water cycle components over global land monsoon regions were systematically examined using the CMIP5 GCM ensemble.

In general, most of the previous studies that simultaneously involve the projection of more than one water cycle component are based on GCM data. However, precipitation, evapotranspiration, and runoff are closely related to land surface processes that are largely dependent on topographic and land-use information, which is not adequately depicted in GCMs due to their coarse resolution [17,18]. Climate downscaling, including statistical and dynamical approaches, is commonly undertaken to derive regional- and/or local-scale information [18]. While both techniques have been demonstrated to be capable of correcting biases in GCM outputs, studies have shown that statistical simulations can inherit unphysical signals from the driving data, and thus fail to generate physically consistent regional climate projections as regional climate models (RCMs) used in the dynamical approach do [19]. The applicability of RCMs to the simulation of temperature and precipitation over China has been tested by various previous studies, which suggest that RCMs can refine large-scale climate information in complex terrains, correct certain biases in the boundary conditions, and provide results that correspond better with the observations [17,20]. Among all RCMs, RegCM has been noted to perform well over China [21,22]. For example, Pan et al. [23] conducted climate projections over Northwest China, and suggested that RegCM can reproduce the spatial patterns of temperature, precipitation, and climate extremes over this region. Gao [24] simulated climate over China using RegCM and WRF, and noticed that RegCM shows better skills in simulating the temperature and precipitation pattern and magnitude in dry regions. Jiang et al. [25] compared the performance of RegCM and PRECIS in modeling precipitation over China, and indicated that the former outperforms the latter in capturing annual precipitation and wet days in eastern China.

Therefore, as an extension of previous studies, the objective of this study is to examine, through the application of an RCM, climate change impacts on the water cycle components over China. Specifically, regional climate simulations over China are undertaken using RegCM. Model performance is evaluated via the comparison of the simulated climate against the remote sensing and gridded observation data. Then, high-resolution long-term projections of precipitation, evapotranspiration, and runoff under two emission scenarios are developed, and the changes and trends of these variables are subsequently evaluated. The results of this research could be beneficial to the forecast and control of flood and drought events in China.

#### **2. Methodology and Data**

Dynamical downscaling of water cycle components over China is developed using RegCM, which is a regional climate model developed by the International Centre for Theoretical Physics [26]. The Community Land Model version 4.5 (CLM4.5) coupling is enabled in RegCM simulations to provide an improved description of land surface processes (e.g., carbon cycle, vegetation dynamics, and river routing) [27,28]. The detailed representation of water vapor fluxes for both non-vegetated and vegetated surfaces in CLM is expected to help with the simulation of evapotranspiration. In addition, CLM is embedded with SIMTOP (simple TOPMODEL-based runoff model) [29], which can take into account the influence of topographic information in runoff generation. The simulated total runoff is then routed to the active ocean or marginal seas through a river transport model [30]. More details about the RegCM parameterization scheme configuration can be found in Lu et al. [22].

Two rounds of RegCM hindcast simulations are conducted for model validation purposes; one of them is driven by the ERA-Interim reanalysis data developed by the European Centre for Medium-Range Weather Forecasts [31], which provide the realistic historical climate over China; the other one is driven by the historical climate scenario of the Earth System Model developed by the Geophysical Fluid Dynamics Laboratory (GFDL) [32,33], which is used to provide the baseline for projections. The baseline period is 1986 to 2005.

The RegCM performance is validated through comparisons of the annual and seasonal averages of model results with those of the observations, remote sensing data, and reconstructed data. The months included in each season are as follows: December (of the previous year), January, and February for winter; March, April, and May for spring; June, July, and August for summer; September, October, and November for autumn. Spatial correlation is employed as a quantitative metric to reflect the similarity between the annual and seasonal averaged observation/remote sensing data/reconstructed data and the simulation results.
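The two ingredients of this validation step, seasonal grouping with December assigned to the following winter and a spatial correlation between two fields, can be sketched as follows. This is an illustrative sketch rather than the authors' code: it assumes both fields have already been placed on the same grid and flattened to lists with `None` for missing cells, it computes an unweighted Pearson correlation (the text does not state whether area weights were used), and all names are hypothetical.

```python
import math

# Month-to-season mapping as described in the text.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def season_key(year, month):
    """Return (season_year, season); December counts toward the next winter."""
    season_year = year + 1 if month == 12 else year
    return season_year, SEASON_OF_MONTH[month]


def spatial_correlation(obs, sim):
    """Unweighted Pearson correlation over cells valid in both fields."""
    pairs = [(o, s) for o, s in zip(obs, sim) if o is not None and s is not None]
    n = len(pairs)
    mean_o = sum(o for o, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in pairs)
    var_o = sum((o - mean_o) ** 2 for o, _ in pairs)
    var_s = sum((s - mean_s) ** 2 for _, s in pairs)
    # Note: a spatially constant field has zero variance and no defined correlation.
    return cov / math.sqrt(var_o * var_s)


# Hypothetical annual-mean fields on a five-cell grid (one cell missing in obs).
observed = [1.2, 3.4, None, 0.5, 2.8]
simulated = [1.0, 3.0, 2.2, 0.9, 2.5]
print(season_key(1985, 12), round(spatial_correlation(observed, simulated), 3))
```

A value close to 1 indicates that the simulated field reproduces the observed spatial pattern, although it says nothing about a uniform bias, which is why the text also discusses over- and underestimation separately.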

In this study, version 4 of the high-resolution gridded observation dataset from the Climate Research Unit (CRU) [34] is used for verifying the model-generated temperature and precipitation. This dataset is generated through the interpolation of extensive networks of gauge station observations into a 0.5° regular grid [34], and is widely applied for the calibration/validation of global and regional climate models [35].

For evapotranspiration, two sets of remote sensing data are employed, specifically, version 6 of the Moderate Resolution Imaging Spectroradiometer (MODIS) terrestrial evapotranspiration product (MOD16A2GF v006) [36], and the latest version of the Global Land Evaporation Amsterdam Model (GLEAM, v3.5b) [37,38]. The MOD16A2GF dataset is created from remotely sensed data products of MODIS (e.g., land cover and albedo) based on the Penman–Monteith equation [39]. On the other hand, GLEAM assembles various satellite-based observations (e.g., radiation from CERES, precipitation from TMPA and MSWEP, air temperature from AIRS, soil moisture from SMOS and ESA CCI SM) and derives global evaporation variables with the Priestley and Taylor model [37]. Both datasets were demonstrated to be able to reasonably represent the actual evapotranspiration over China [40,41]. The evapotranspiration from MOD16A2GF and the actual evaporation from GLEAM are used for validating the RegCM-generated evapotranspiration. It is worth noting that the start years of these two remote sensing datasets are 2000 and 2003, respectively; therefore, RegCM results averaged over the same periods (i.e., 2000 to 2005 and 2003 to 2005) are used for comparison. To assist the validation for evapotranspiration over the entire baseline period, station-based observed tank evaporation from the National Meteorological Information Center of China (NIMC) is also used (data available at data.cma.cn; accessed on 9 April 2019). The locations of the stations are shown in Figure S1 in the Supplementary Materials.
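For readers unfamiliar with the Priestley and Taylor model mentioned above, the textbook form of the equation estimates the latent heat flux as a coefficient times the energy-weighted fraction of available energy. The sketch below shows that textbook form only. It is not GLEAM's implementation, whose full processing chain is not described in the source and is not reproduced here. The coefficient 1.26, the psychrometric constant of 0.066 kPa/°C (roughly valid near sea level), and the FAO-56 style expression for the slope of the saturation vapor pressure curve are assumptions introduced for illustration, as are the function names.

```python
import math


def saturation_vapor_pressure_slope(t_celsius):
    """Slope of the saturation vapor pressure curve (kPa per deg C), FAO-56 form."""
    es = 0.6108 * math.exp(17.27 * t_celsius / (t_celsius + 237.3))  # kPa
    return 4098.0 * es / (t_celsius + 237.3) ** 2


def priestley_taylor(net_radiation, ground_heat_flux, t_celsius,
                     alpha=1.26, gamma=0.066):
    """Latent heat flux, in the same units as the radiation inputs (e.g., W/m^2).

    alpha is the Priestley-Taylor coefficient and gamma the psychrometric
    constant in kPa per deg C; both defaults are illustrative assumptions.
    """
    delta = saturation_vapor_pressure_slope(t_celsius)
    return alpha * delta / (delta + gamma) * (net_radiation - ground_heat_flux)


# Hypothetical inputs: 150 W/m^2 net radiation, 10 W/m^2 ground heat flux.
for temperature in (5.0, 15.0, 25.0):
    flux = priestley_taylor(150.0, 10.0, temperature)
    print(f"T = {temperature:4.1f} degC -> latent heat flux = {flux:6.1f} W/m^2")
```

For the same available energy, the estimated flux grows with air temperature because the slope of the saturation vapor pressure curve steepens as the air warms.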

For runoff, the validation was undertaken through the comparison of the model outputs with the Global Runoff Reconstruction (GRUN), which is constructed through machine learning techniques based on runoff and meteorological observations [42]. This dataset is widely used in weather and climate research and is shown to have a reasonable performance over China [43–45].

For future climate, GFDL projections of two representative concentration pathways (RCPs) are employed, namely RCP4.5 and RCP8.5 for intermediate and heavy emissions, respectively. Simulations are conducted for the entire twenty-first century. Three time-slices are considered in result analysis: 2020 to 2039 (or 2030s), 2040 to 2069 (or 2050s), and 2070 to 2099 (or 2080s); time averages and trends are calculated with respect to these periods. In addition, since the water cycle components are closely related to atmospheric moisture contents, and the saturation vapor pressure is related to the temperature following the Clausius–Clapeyron equation [46], different warming periods are also considered in this study. The warming periods are defined as twenty-year periods in which the domain average temperature increases by 1, 1.5, 2, 3, and 4 °C compared with the baseline average. Table 1 lists the respective periods for each warming level under the two emission scenarios. The numerical values of future trends are obtained based on Sen's slope estimator, and their statistical significance is examined by Mann–Kendall tests [47–50].
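One way to locate such warming periods is to slide a twenty-year window along the annual domain-average temperature series and pick the earliest window whose mean exceeds the baseline average by the target level. The sketch below follows that rule. The "earliest window" criterion is an assumption, since the text does not spell out how the periods in Table 1 were selected, and the function name and the synthetic linear warming series are hypothetical.

```python
def first_warming_period(years, temps, baseline_mean, level, window=20):
    """Return (start_year, end_year) of the earliest window reaching the level.

    The window mean minus the baseline mean must be at least `level` (deg C).
    Returns None if no window in the series reaches that warming level.
    """
    for start in range(len(temps) - window + 1):
        window_mean = sum(temps[start:start + window]) / window
        if window_mean - baseline_mean >= level:
            return years[start], years[start + window - 1]
    return None


# Synthetic domain-average series: 0.04 deg C/year warming from a 6.0 deg C baseline.
baseline_mean = 6.0
years = list(range(2006, 2100))
temps = [baseline_mean + 0.04 * (y - 2006) for y in years]

for level in (1.0, 1.5, 2.0, 3.0, 4.0):
    print(level, first_warming_period(years, temps, baseline_mean, level))
```

Under a weaker scenario the higher warming levels may never be reached within the simulated century, in which case the function returns `None` for those levels.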

**Table 1.** Future periods in which warming over China reaches 1, 1.5, 2, 3, and 4 °C under RCP4.5 and RCP8.5.


#### **3. Results**

#### *3.1. Model Validation*

Validation results for near-surface temperature, precipitation, evapotranspiration, and runoff are shown in Figures 1–4, respectively. The columns of each figure are for different datasets (i.e., CRU, MODIS, GLEAM, GRUN, RegCM driven by ERA-Interim, and RegCM driven by GFDL); the rows are for annual and seasonal averages. It can be observed that RegCM can reasonably reproduce the spatial distribution and seasonal variations of temperature over China. As shown by the CRU observations (Figure 1a), above-zero baseline-average temperature can be found in most parts of the domain, except for the Tibetan Plateau and a small part of northeastern China. This spatial feature is realistically generated in the two sets of RegCM results, although with underestimations of various degrees (Figure 1b,c). Such underestimations, as discussed by Lu et al. [22], are partly caused by the model setup and partly due to the driving GCM data. Temperature over China demonstrates clear seasonality, i.e., hot summer, cold winter, and mild spring and autumn. RegCM is able to capture the seasonal variations, although underestimations can still be observed. The spatial correlation between the observed and modeled temperature can be found in Table S1 in the Supplementary Materials. The high correlation (ranging between 0.94 and 0.98) indicates that RegCM performs well in simulating temperature. In addition, the results of RegCM driven by GFDL have higher correlations with the observations than those of the raw GCM data (please refer to Table S2 in the Supplementary Materials), which suggests that RegCM is able to correct some biases in GFDL's temperature results.

**Figure 1.** Observed and RegCM-simulated near-surface temperature over China. The annual and seasonal averages are calculated with respect to the baseline period of 1986 to 2005.

The performance of RegCM with respect to precipitation is less satisfactory than for temperature. Although it is able to generate the observed wet-in-the-southeast and dry-in-the-northwest precipitation pattern, underestimations can be found over the southeastern part of the domain and overestimations over the northwestern part (Figure 2a–c). This over- and underestimation pair was shown to be related to the simulation bias in vapor pressure by Lu et al. [22], who further argued that the bias in vapor pressure could be associated with the bias in temperature. There is also an apparent dry bias near the Sichuan Basin and a wet bias near the southeastern edge of the Tibetan Plateau, which could be related to the configuration of the cumulus convective scheme [22]. The spatial correlation (Table S1) is lower for precipitation than for temperature, which is consistent with the above results. The seasonal precipitation over China shows clear monsoon features (more precipitation in summer and less in winter), which is shared by the two sets of RegCM results.

**Figure 2.** Observed and RegCM-simulated precipitation over China. The annual and seasonal averages are calculated with respect to the baseline period of 1986 to 2005.

**Figure 3.** Remotely sensed and RegCM-simulated evapotranspiration over China. The annual and seasonal averages are calculated with respect to the period of 2000 to 2005 for MODIS and 2003 to 2005 for GLEAM.

The spatial pattern of actual evapotranspiration from MODIS and GLEAM (Figure 3a,d) shows more regional details than the observed precipitation pattern from CRU, which is in part due to the higher resolution of remote sensing data than that of the gridded observation, and in part due to evapotranspiration's closer relationship with the geophysical characteristics of the domain than precipitation. The two sets of remote sensing data show similar annual patterns; subtle differences exist sporadically over the domain, which could be explained by the different skills of the two datasets over different land-use types [51]. RegCM shows better skills in simulating evapotranspiration than precipitation (Figure 3b,c,e,f), although minor overestimations can be noticed. As shown in Table S1, for annual average evapotranspiration, the spatial correlations between RegCM results and GLEAM are higher than 0.8, and those for MODIS are higher than 0.6, indicating a reasonable performance for RegCM in evapotranspiration. The spatial correlation between RegCM results and the observed tank evaporation from NIMC is 0.50 and 0.42, respectively, for the two rounds of hindcast simulations. The lower correlation between RegCM and NIMC could be related to the difference between tank evaporation and actual evapotranspiration. In terms of the seasonal variations, RegCM demonstrates overestimations in spring and winter, and underestimations in summer and autumn.

**Figure 4.** Reconstructed and RegCM-simulated runoff over China. The annual and seasonal averages are calculated with respect to the baseline period of 1986 to 2005.

The RegCM-generated spatial patterns for runoff show considerably larger biases than for other variables. Apparent overestimations can be spotted near the southeastern corner of the Tibetan Plateau, which is likely to be related to the wet bias in precipitation that occurs at the same location. Slight underestimations in runoff can be found over the entire Tibetan Plateau, where overestimations in evapotranspiration can be identified; the latter is likely to be the cause of the former. The spatial correlations between the annual average runoff of the two sets of RegCM results and GRUN are 0.67 and 0.66, respectively (Table S1). From the seasonal perspective, the GRUN reconstructed runoff is high in summer and low in winter. This monsoon feature is well captured by RegCM. As shown by the spatial correlation, for runoff, RegCM shows better skills in summer/autumn (ranges between 0.65 and 0.69) than in winter/spring (between 0.37 and 0.53).

The RegCM-simulated domain average annual cycles for precipitation, evapotranspiration, and runoff are also examined (shown in Figure 5). The two sets of RegCM results present similar features (please refer to Figure S2 in the Supplementary Materials for a direct comparison of the two sets of simulation results). All three variables show a peak in their annual cycles during the monsoon period (May to September), which is consistent with previous results. On domain average, a considerable part of the precipitation is balanced by evapotranspiration, and a smaller portion is attributed to the runoff. The difference between the simulated precipitation and evapotranspiration is also plotted in Figure 5. As indicated by the surface water budget equation, the amount of precipitation that is not balanced by the other two variables contributes to the moisture storage in soil. It can be observed that the difference between precipitation and evapotranspiration is larger than runoff in early spring, and this relationship reverses in autumn, which indicates water storage in spring and dissipation in autumn.
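The surface water budget referred to above is not written out in the text. A commonly used form, given here as an assumption with *S* denoting terrestrial (soil) water storage, is

$$
\frac{\partial S}{\partial t} = P - E - R ,
$$

where *P*, *E*, and *R* denote precipitation, evapotranspiration, and runoff. Under this form, *P* − *E* > *R* implies $\partial S/\partial t > 0$ (water is stored, as in early spring), whereas *P* − *E* < *R* implies $\partial S/\partial t < 0$ (stored water is released, as in autumn).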

**Figure 5.** Simulated annual cycles for precipitation, evapotranspiration, runoff, and the difference between precipitation and evapotranspiration in the baseline period.

#### *3.2. Future Changes in Precipitation, Evapotranspiration, and Runoff*

#### 3.2.1. Precipitation

Having reasonable skills in reproducing the historical climate over China, RegCM is subsequently used to project future changes in water cycle components. The projected changes in precipitation over China in three future periods and under two emission scenarios are shown in Figure 6. The left three columns are for RCP4.5 and the right three for RCP8.5; for each scenario, the three columns, respectively, indicate 2030s, 2050s, and 2080s. The rows in the figure are the annual and seasonal averages. On annual average, increases in precipitation can be found in most parts of the domain. Precipitation changes of the two RCPs show certain similarities. For example, in the 2080s, precipitation increases of larger than 0.3 mm/day are expected in parts of the Tibetan Plateau, Yellow River Basin, Haihe River Basins, Yangtze Plain, and southern coastal hilly regions under both scenarios. In general, the area experiencing increased precipitation is larger under RCP8.5, especially in the Tibetan Plateau, where more pronounced increases (over 0.9 mm/day in the southeastern corner) can be observed as well.

**Figure 6.** Projected changes in precipitation over China.

Precipitation changes demonstrate apparent seasonal variations. In winter, precipitation decreases of over 0.3 mm/day can be found in parts of the Yunnan–Guizhou Plateau in the 2050s and 2080s under RCP4.5 and in the 2030s and 2050s under RCP8.5. In the 2080s under RCP8.5, more severe changes over larger areas can be noticed; parts of the Pearl River Basin are projected with precipitation decreases of over 0.3 mm/day and parts of the Yunnan–Guizhou Plateau of over 0.6 mm/day. Summer precipitation exhibits similar patterns of changes in the 2050s under the two scenarios, where decreases of over 0.3 mm/day are to be found in the middle and lower reaches of the Yangtze River Basin, and increases of over 0.9 mm/day are expected near the Hengduan Mountains located in the southeastern corner of the Tibetan Plateau. In the 2080s, the spatial distributions of summer precipitation change under the two scenarios are quite different. Under RCP4.5, precipitation decreases of over 0.3 mm/day are likely to occur in the very north of northeastern China, parts of the Yangtze Plain, and parts of the Pearl River Basin, while increases of over 0.9 mm/day are projected in the Hengduan Mountains and the southeastern coastal hilly regions. In comparison, under RCP8.5, decreases in summer precipitation mainly occur in the middle and lower reaches of the Yangtze River Basin (over 0.3 mm/day), while increases of over 0.9 mm/day are to be found in the southern parts of the Tibetan Plateau, Hengduan Mountains, and parts of the Haihe River Basin. The changes in summer precipitation may be related to variations in its major moisture transport branches, which are the transportations by the Indian monsoon, Southeast Asian monsoon, and midlatitude westerlies, as shown by Simmonds [52]. 
The increases in summer precipitation in southeastern China and decreases in central south and southeastern China under both scenarios could indicate an enhanced Indian summer monsoon and a subsided Southeast Asian summer monsoon. In spring and autumn, precipitation increases are to be seen in the Yangtze Plain (can reach over 1.8 mm/day) and the Yellow River Basin (over 0.9 mm/day), respectively.

The annual series of domain average precipitation under both RCPs are shown in Figure 7. For both scenarios, precipitation demonstrates an evident increasing trend (although a decreasing trend can be observed between 2050 and 2060 under RCP4.5). The Mann–Kendall test confirms the statistical significance of the trends at an *α* level of 0.05 (when the period of 2010 to 2100 is considered as a whole). The magnitudes of trends, given by Sen's slope estimator, are 0.02 and 0.03 mm/day per decade under RCP4.5 and RCP8.5, respectively (as shown in Table 2). Some seasonal trends are also statistically significant; summer precipitation shows increasing trends of 0.02 and 0.05 mm/day per decade under the two scenarios, and autumn precipitation increases at a rate of 0.02 mm/day per decade under RCP8.5.
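The two statistics used for the trends can be sketched as follows. This is a minimal textbook version, assuming no correction for tied values or serial correlation; the variants used in [47–50] may differ. The function names and the toy series are hypothetical.

```python
import math
import numpy as np

def sens_slope(t, y):
    """Sen's slope: median of all pairwise slopes (y[j] - y[i]) / (t[j] - t[i]), i < j."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(y), k=1)
    return float(np.median((y[j] - y[i]) / (t[j] - t[i])))

def mann_kendall(y):
    """Two-sided Mann-Kendall trend test. Returns (S, Z, p).
    Uses the normal approximation without tie or autocorrelation corrections."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    i, j = np.triu_indices(n, k=1)
    s = float(np.sum(np.sign(y[j] - y[i])))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the standard normal
    return s, z, p

# Toy annual series for illustration only: a steady rise of 0.002 mm/day per year.
t = np.arange(2010, 2020)
y = 0.002 * (t - 2010)
slope_per_decade = 10 * sens_slope(t, y)  # convert per-year slope to per-decade
s, z, p = mann_kendall(y)
print(slope_per_decade, s, p < 0.05)
```

Because the slope is computed per year, multiplying by 10 gives the per-decade units used in Table 2.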


**Table 2.** Future trends (mm/day per decade) of domain average precipitation, evapotranspiration, and runoff (*P*, *E*, and *R* denote precipitation, evapotranspiration, and runoff, respectively).

Notes: Numbers in bold font indicate trends that are statistically significant at *α* = 0.05.

**Figure 7.** Projected domain average anomalies of precipitation, evapotranspiration, and runoff. The time series are smoothed with a 20-year moving average filter.

Precipitation changes (with respect to the baseline period) in the three future periods under both scenarios are listed in Table 3. Under RCP4.5, the annual average precipitation is projected to increase by 0.06, 0.08, and 0.16 mm/day, respectively, in the three future periods (statistically significant at an *α* level of 0.05). Under RCP8.5, precipitation change in the 2030s is not statistically significant, and the increases in the 2050s and 2080s are 0.12 and 0.2 mm/day, respectively. Spring and summer precipitation is expected to undergo larger increases than that in winter and autumn. These numbers are consistent with the changes in the annual cycles of precipitation as shown in Figure 8, in which large precipitation increases can be found from April to September in the 2050s and 2080s under both scenarios.


**Table 3.** Projected changes (mm/day) in precipitation, evapotranspiration, and runoff at different future periods (*P*, *E*, and *R* denote precipitation, evapotranspiration, and runoff, respectively).

Notes: Numbers in bold font indicate trends that are statistically significant at *α* = 0.05.

**Figure 8.** Projected changes in the annual cycles of precipitation, evapotranspiration, and runoff.

Precipitation changes with respect to different warming levels under the two scenarios are shown in Table 4. The magnitudes of changes under the same warming level are similar under different scenarios. For example, with a domain average warming of 2 °C, the annual average precipitation is likely to increase by 0.12 and 0.13 mm/day under RCP4.5 and RCP8.5, respectively. This phenomenon is reasonable because, according to the Clausius–Clapeyron equation, the increase in the water holding capacity of the atmosphere is the same given the same temperature increase. The change in precipitation is not necessarily the same since the actual amount of water vapor available can be different. At a warming level of 2 °C, spring and summer precipitation is also projected to increase by 0.15 and 0.17 mm/day under RCP4.5, and by 0.21 and 0.17 mm/day under RCP8.5. When domain average warming reaches 4 °C under RCP8.5, the annual, spring, and summer precipitation are projected to increase by 0.19, 0.24, and 0.27 mm/day, respectively.
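The Clausius–Clapeyron argument can be made concrete with a short worked estimate. The symbols and constants below are standard textbook values rather than quantities taken from this study. For saturation vapor pressure $e_s$ at temperature $T$,

$$
\frac{1}{e_s}\frac{\mathrm{d}e_s}{\mathrm{d}T} = \frac{L_v}{R_v T^{2}} ,
$$

where $L_v \approx 2.5 \times 10^{6}$ J kg$^{-1}$ is the latent heat of vaporization and $R_v \approx 461.5$ J kg$^{-1}$ K$^{-1}$ is the gas constant for water vapor. At $T = 288$ K,

$$
\frac{L_v}{R_v T^{2}} \approx \frac{2.5 \times 10^{6}}{461.5 \times 288^{2}} \approx 0.065\ \mathrm{K}^{-1} ,
$$

i.e., the water holding capacity rises by roughly 6.5% per degree of warming. The rate depends only weakly on $T$, so the same temperature increase implies approximately the same fractional increase in capacity under either scenario.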


**Table 4.** Projected changes (mm/day) in precipitation, evapotranspiration, and runoff at different warming levels (*P*, *E*, and *R* denote precipitation, evapotranspiration, and runoff, respectively).

Notes: Numbers in bold font indicate trends that are statistically significant at *α* = 0.05.

#### 3.2.2. Evapotranspiration

The projected changes in evapotranspiration over China are shown in Figure 9. Similar to the spatial pattern of precipitation changes, the annual average evapotranspiration is likely to increase over most parts of the domain. The area experiencing increased evapotranspiration is larger than that for precipitation, but the magnitude of the increase is smaller when compared with precipitation. Annual average evapotranspiration changes are within ±0.3 mm/day for most of the time under both scenarios, except that increases of over 0.3 mm/day are projected in the Hengduan Mountains, Yunnan–Guizhou Plateau, and southeastern coastal hilly regions in the 2080s under RCP8.5. Intra-annual variations can also be observed for evapotranspiration changes. The most pronounced changes are to occur in summer, in which evapotranspiration increases of over 0.3 mm/day can be found over the entire domain except parts in northern and northeastern China in the 2080s under RCP8.5, and increases of over 0.6 mm/day are to be seen in the Hengduan Mountains. In spring, evaporation increases of over 0.3 mm/day are projected in areas between the Yangtze River and Pearl River Basins in the 2080s under RCP4.5, and in most of the southern parts of the domain in the same period under RCP8.5. In autumn, evapotranspiration increases of over 0.3 mm/day can be found in the Hengduan Mountains. Evapotranspiration changes in winter are within ±0.3 mm/day for all future periods under both scenarios.

**Figure 9.** Projected changes in evapotranspiration over China.

The annual average evapotranspiration time series is shown in Figure 7; it fluctuates less than those of the other two variables. Evident increasing trends can be observed under both scenarios, the magnitudes of which are 0.01 and 0.03 mm/day per decade (statistically significant at an *α* level of 0.05), respectively (Table 2). All seasonal trends are statistically significant: 0.01, 0.02, 0.01, and 0.01 mm/day per decade, respectively, for winter, spring, summer, and autumn under RCP4.5 and 0.01, 0.03, 0.04, and 0.02 mm/day per decade under RCP8.5. Annual and seasonal evapotranspiration changes in the three future periods under both scenarios are all statistically significant, except for winter in the 2030s under RCP4.5 (Table 3). Under RCP4.5, annual average evapotranspiration is projected to increase by 0.06, 0.1, and 0.12 mm/day, respectively, for the 2030s, 2050s, and 2080s, while, under RCP8.5, the increases are 0.08, 0.14, and 0.22 mm/day. Such increases in evapotranspiration can also be observed in Figure 8, in which evapotranspiration changes are always above zero and larger increases can be found in the monsoon months. At a domain average warming of 2 °C, annual average evapotranspiration is projected to increase by 0.12 and 0.14 mm/day under the two scenarios (Table 4), which are close to the amount of precipitation increases. For the 4 °C warming period under RCP8.5, annual evapotranspiration is likely to increase by 0.21 mm/day, and seasonal evapotranspiration by 0.08, 0.23, 0.36, and 0.18 mm/day for winter, spring, summer, and autumn, respectively.

#### 3.2.3. Runoff

Figure 10 shows the projected changes in future runoff. In general, the annual and seasonal variations in runoff share a certain resemblance with those in precipitation, and only the area experiencing increased runoff is considerably smaller. Runoff changes in most parts of the domain are within ±0.3 mm/day in the three future periods under both scenarios. Under RCP4.5, runoff increases of over 0.3 mm/day can be found in the Yangtze Plain in the 2080s. Under RCP8.5, over 0.3 mm/day decreases in runoff are likely to occur in parts of the Yunnan–Guizhou Plateau in the 2050s and also in parts of the Yangtze River Basin in the 2080s. Seasonal changes in future runoff also demonstrate distinct characteristics. Among all seasons, runoff changes in summer are most severe. For example, under RCP4.5, runoff increases of over 1.2 mm/day are projected in the southeastern coastal hilly regions in the 2080s, which can be related to the increased precipitation in this area. Under RCP8.5, a considerable part of the Yangtze River Basin is projected to experience a runoff reduction of over 0.6 mm/day in the 2080s, which is likely to be caused by the simultaneous decrease in precipitation and increase in evapotranspiration. In winter, changes in runoff are within ±0.3 mm/day in most parts of the domain, with decreases in the central and southwestern parts and increases elsewhere. Spring runoff is projected to increase in the Yangtze Plain by at least over 0.3 mm/day in the future. In autumn, parts of the Yellow River and Haihe River Basins are to receive runoff increases of over 0.3 mm/day. The observed changes in runoff are closely related to the corresponding changes in precipitation.

As shown in Figure 7, the interannual variation in runoff is highly correlated with those in precipitation and precipitation minus evapotranspiration (hereafter as *P* − *E*). In addition, the changes in *P* − *E* are slightly larger than those in runoff, which are consistent with the observations of Zhang et al. [2]. They offered two possible explanations: problems with model spinup or water balance closure, and changes in terrestrial water storage as a result of global warming [2]. The latter case indicates a reduction in future soil moisture, which may bring a negative influence on agriculture. The time series of annual average runoff do not exhibit apparent trends, which is consistent with the lack of a significant trend as shown in Table 2. Previous studies also noticed that the annual runoff series does not demonstrate a trend in the historical period; for example, no statistically significant trend can be detected in the Huaihe River Basin, according to Yu et al. [53]. Significant weak trends of 0.01 mm/day per decade are projected in autumn under RCP4.5 and in winter and autumn under RCP8.5. In comparison with precipitation and evapotranspiration, there is no statistically significant change in annual average runoff except for the 2080s under RCP4.5 (0.05 mm/day). Statistically significant runoff reductions are projected in summer, which are −0.08 and −0.09 mm/day in the 2030s and 2050s under RCP4.5 and −0.12, −0.13, and −0.16 mm/day in the three future periods under RCP8.5. Such a runoff reduction in summer is also evident in Figure 8. These observations are partially consistent with those of Zhang et al. [2], who noticed that changes in precipitation, evapotranspiration, runoff, and *P* − *E* mainly occur in the wet season with simultaneous increases in all variables. In this study, the most pronounced changes also occur in the wet season. 
The projected reduction in runoff and *P* − *E* could be related to the difference in the domain selection between this and Zhang et al.'s studies. There is no significant change in annual average runoff under all warming levels and emission scenarios, except that an increase of 0.04 mm/day is projected at a warming level of 3 °C under RCP8.5. At a warming level of 2 °C, runoff changes of 0.02 and 0.08 mm/day are likely to occur in winter and autumn under RCP4.5, and 0.03, −0.11, and 0.08 mm/day in winter, summer, and autumn under RCP8.5. With a domain average warming of 4 °C, runoff in winter, summer, and autumn is to change by 0.03, −0.15, and 0.01 mm/day under RCP8.5.

**Figure 10.** Projected changes in runoff over China.

#### **4. Discussions and Conclusions**

In this study, the long-term variations in the water cycle over China are studied using RegCM. The performance of RegCM in terms of temperature, precipitation, evapotranspiration, and runoff is validated through comparisons of the model-generated results with gridded observations, remote sensing data, and reconstructed data. The results show that RegCM can reasonably capture the spatial and seasonal variations in these variables, although certain biases exist, such as a cold bias in the entire domain, a dry and wet bias pair in the southeastern and northwestern parts of the domain, some over- and underestimations of evapotranspiration, respectively, in winter/spring and summer/autumn, and some over- and underestimations of runoff near the Tibetan Plateau.

Long-term projections of precipitation, evapotranspiration, and runoff under two emission scenarios are then developed. The results show that increased annual average precipitation and evapotranspiration can be found in most parts of the domain, while a smaller part of the domain is projected with increased runoff.

For precipitation, the regions most affected by global warming are the Yangtze Plain, Yellow River and Haihe River Basins, and southeastern parts of the Tibetan Plateau, where over 0.3 mm/day increases are expected in the 2080s under both scenarios. The projected increase in precipitation in the Yellow River and Haihe River Basins can also be observed in a CMIP6 GCM ensemble according to Tian et al. [54]. It is worth noting that although the projected precipitation increase in the Tarim Basin is within 0.3 mm/day, the percentage change could be large considering its low annual average total precipitation, which is why several studies (e.g., [55,56]) identified it as an area vulnerable to climate change. Precipitation increase has been shown to be among the driving factors for the increased flood frequency in the Tarim River Basin since the 1980s [57]; the increase in future precipitation as projected by this and the previous studies may indicate increased flood risks in this area, which suggests the need for relevant flood prevention measures.

For evapotranspiration, areas experiencing evident increases are the Hengduan Mountains, Yunnan–Guizhou Plateau, and southeastern coastal hilly regions; the magnitude of change is 0.3 mm/day in the 2080s under RCP8.5. The apparent evapotranspiration increase in southeastern coastal hilly regions is also noted by Su et al. [58]. In terms of seasonal variations, summer and spring are likely to see larger increases in evapotranspiration; this feature is consistent with Ma et al.'s observation [59]. The developed projections in evapotranspiration can be used to evaluate climate change impacts on drought conditions through the calculation of evapotranspiration deficit; this index, compared with those that are based on precipitation and soil moisture, could more effectively reflect moisture deficiency in ecosystems [60].

For runoff, the regions most affected are the Yangtze Delta, Yunnan–Guizhou Plateau, and parts of the Yangtze River Basin, with increases of 0.3 mm/day for the first region in the 2080s under RCP4.5 and decreases of over 0.3 mm/day for the latter two regions in the 2080s under RCP8.5. The projected decrease in runoff in the middle reaches of the Yangtze River Basin is consistent with the results from Xing et al.'s study [61]. In addition, the runoff reduction in the Yunnan–Guizhou Plateau and the upper reaches of the Yangtze River Basin is also noticed by Zhai et al. in their ensemble projection of runoff [62]. Extreme high and low runoff are often related to flood and drought hazards [63], and the runoff projection developed in this study could help identify regions vulnerable to increased flood and drought risks, and thus support flood mitigation and water resource management [64].

In summary, future precipitation and evapotranspiration are likely to increase over China in the wet season, while runoff decreases. The projected changes in precipitation minus evapotranspiration are larger than those in runoff, implying a possible decrease in soil moisture. It is important that future variations in the water cycle components be considered when designing flood and drought mitigation measures.

Extensions of this study can be conducted with respect to the current limitations. For example, more sophisticated bias correction techniques can be applied to the model results. In addition, more GCMs can be used to drive RegCM so that an ensemble can be constructed for more robust projections.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/10.3390/rs13193832/s1, Figure S1: Location of NIMC weather stations. Note: Red dots indicate selected stations for evapotranspiration data. The NIMC tank evaporation data contain a large number of missing values; the stations are selected if the percentage of missing data is less than 50% for each season. Figure S2: Simulated annual cycles for precipitation, evapotranspiration, runoff, and the difference between precipitation and evapotranspiration in the baseline period. The R² values for the two sets of RegCM results are 0.88, 0.92, 0.83, and 0.82, respectively, for precipitation, evapotranspiration, runoff, and the difference between precipitation and evapotranspiration. Table S1: Spatial correlation between RegCM-generated results and observations, remote sensing data, and reconstructed data. Table S2: Spatial correlation between raw GFDL data and observations.

**Author Contributions:** Conceptualization, C.L., G.H. and X.W.; Data curation, C.L. and X.W.; Formal analysis, C.L.; Funding acquisition, G.H.; Investigation, C.L.; Methodology, C.L. and X.W.; Project administration, G.H.; Resources, C.L. and G.H.; Software, C.L. and X.W.; Supervision, G.H.; Validation, C.L., G.H., X.W., G.W., J.Z. and T.S.; Visualization, C.L.; Writing—original draft, C.L.; Writing—review and editing, C.L. and X.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Key Research and Development Plan (2016YFA0601502), Natural Sciences Foundation (U2040212), Canada Research Chair Program, and Natural Science and Engineering Research Council of Canada.

**Data Availability Statement:** The gridded observation data are obtained from the Climate Research Unit (https://crudata.uea.ac.uk/cru/data/hrg/; accessed on 18 March 2019). Remote sensing datasets are obtained from the Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/; accessed on 9 September 2021) and the Global Land Evaporation Amsterdam Model (https://www.gleam.eu/; accessed on 29 June 2021). The reconstructed runoff data are obtained from Ghiggi et al. [42]. The station-based observations are obtained from the National Meteorological Information Center of China (data.cma.cn; accessed on 9 April 2019). The reanalysis data used to drive RegCM are obtained from the European Centre for Medium-Range Weather Forecasts (https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era-interim; accessed on 22 November 2017). Global climate model data are obtained from the Geophysical Fluid Dynamics Laboratory and downloaded from the Earth System Grid Federation (https://esgf-node.llnl.gov/search/cmip5/; accessed on 4 April 2017).

**Acknowledgments:** We are very grateful for the helpful inputs from the editor and anonymous reviewers.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Communication* **Glacier Velocity Changes in the Himalayas in Relation to Ice Mass Balance**

**Yu Zhou 1, Jianlong Chen 1,2 and Xiao Cheng 3,\***


**Abstract:** Glacier evolution with time provides important information about climate variability. Here, we investigated glacier velocity changes in the Himalayas and analysed the patterns of glacier flow. We collected 220 scenes of Landsat-7 panchromatic images between 1999 and 2000, and Sentinel-2 panchromatic images between 2017 and 2018, to calculate surface velocities of 36,722 glaciers during these two periods. We then derived velocity changes between 1999 and 2018 for the early winter period, based on which we performed a detailed analysis of the motion of each individual glacier, and noted that the changes are spatially heterogeneous. Of all the glaciers, 32% have sped up, 24.5% have slowed down, and the remaining 43.5% have remained stable. The amplitude of glacier slowdown, as a result of glacier mass loss, is significantly larger than that of speedup. At regional scales, we found that glacier surface velocity in winter has uniformly decreased in the western part of the Himalayas between 1999 and 2018, while it has increased in the eastern part; this contrast may be associated with decadal changes in accumulation and/or melting under different climatic regimes. We also found that the overall trend of surface velocity exhibits seasonal variability: summer velocity changes are positively correlated with mass loss, i.e., velocity increases with increasing mass loss, whereas winter velocity changes show a negative correlation. Our study suggests that glacier velocity changes in the Himalayas are spatially and temporally heterogeneous, in agreement with studies that previously highlighted this trend, emphasising complex interactions between glacier dynamics and environmental forcing.

**Keywords:** glacier velocity changes; Himalayas; Landsat-7; Sentinel-2; ice mass balance

#### **1. Introduction**

Glaciers are sensitive to climate variability and are a major contributor to global sea level rise [1–7]. It is of great importance to understand glacier evolution with time because it provides direct evidence for climate change [2,8,9]. The Himalayas host the largest volume of glaciers outside the polar regions, which also contribute importantly to water resources for the Indus and Ganges basins [10–12]. Due to the difficult accessibility of high mountain areas, remote sensing has been a powerful tool for studying the Himalayan glaciers. Researchers have used satellite altimetry (e.g., [2,13,14]) and optical satellite stereo imagery (e.g., [7,15,16]) to quantify glacier mass balance in the Himalayas. Although the estimates derived from different techniques vary, they consistently show that the Himalayan glaciers are experiencing significant thinning and mass loss, thereby affecting ice fluxes and river discharge. The thinning rate is also suggested to have accelerated in the past 40 years, which is possibly driven by atmospheric warming and associated energy fluxes [15–17]. Recently, Dehecq et al. [9] investigated the response of glacier flow to mass changes at regional scales. They estimated time-series glacier velocities from 2000 to 2016, using Landsat-7 optical satellite images [18], and found that the variability in velocity changes within a large region can be explained solely by changes in ice thickness, i.e., ice mass balance [9]. Their study provides a novel way for estimating ice mass balance in the Himalayas as glacier velocity changes can be easily measured with satellite images.

**Citation:** Zhou, Y.; Chen, J.; Cheng, X. Glacier Velocity Changes in the Himalayas in Relation to Ice Mass Balance. *Remote Sens.* **2021**, *13*, 3825. https://doi.org/10.3390/rs13193825

Academic Editors: Yi Luo and Gareth Rees

Received: 20 August 2021 Accepted: 22 September 2021 Published: 24 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Glacier surface velocity in summer has been studied extensively, e.g., [9,18]. Velocity estimates in [9] show that glaciers in the Himalayas have experienced significant slowdown in the past two decades. However, the seasonal variability of regional glacier motion remains unclear. The aim of this study is to explore the long-term winter velocity and its changes. We first derived glacier velocities for two periods, early winter in 1999–2000 and 2017–2018, using Landsat-7 and Sentinel-2 image pairs, respectively. High-resolution satellites provide a powerful tool for monitoring ice and snow [19,20]. We chose Sentinel-2 over Landsat-7 for mapping the present glacier motion because it has been shown to have better geometric and radiometric quality [21]. By differencing the Landsat-7 and Sentinel-2 derived velocities, we map velocity changes over nearly two decades and, by combining these data with glacier mass balance, we show the complex patterns of glacier flow in the Himalayas.

#### **2. Study Area**

The Himalaya front (Figure 1) stretches over 3000 km from the west to the east, containing more than 36,000 glaciers of different sizes (Randolph Glacier Inventory, RGI 6.0). The topography increases rapidly across the front, from 200 m in the south to over 5000 m in the north, entering the Tibetan Plateau. Evolution of glaciers in different parts of the Himalayas is affected by different climatic regimes. In the western part, snow accumulation is controlled by westerly atmospheric circulations, so Hindu Kush, Spiti Lahaul and Karakoram receive most accumulation during winter. In the eastern part, the Indian summer monsoon dominates the accumulation in West Nepal, East Nepal, Bhutan and Nyainqentanglha [2,12,22–24]. The extreme topography creates additional complexity; precipitation at high-altitude regions has been suggested to be 2–10 times higher than that at low-altitude regions [25,26]. As a result, glaciers in the Himalayan front exhibit contrasting variabilities in evolution and mass balance [2,7,9]. Shaded relief of the Himalaya region generated from the 3 arc second Shuttle Radar Topography Mission (SRTM) DEM [27] is shown in Figure 1.

#### **3. Data and Methods**

In this study, we focused primarily on surface velocity changes of the Himalayan glaciers at decadal scales. Satellite optical images were used to generate glacier velocity maps at different times. We collected 40 pairs of Landsat-7 Level-1T data between 1999 and 2000 to calculate glacier velocity during this period. Each panchromatic Landsat-7 Level-1T image covers an area of 185 km × 170 km, with a spatial resolution of 15 m. A total of 70 panchromatic Sentinel-2A/B Level-1C image pairs were obtained to calculate glacier velocity between 2017 and 2018. Each Sentinel-2 Level-1C image has a footprint of 100 km × 100 km and a spatial resolution of 10 m. Dehecq et al. [9] analysed velocity changes in summer. We are interested in exploring whether glacier velocities exhibit seasonal variations, so all 110 pairs of Landsat-7 and Sentinel-2 images used in this study were acquired during winter, centred around December. We also collected glacier geometry data including length, area, slope and thickness from RGI 6.0 and [28], along with satellite-derived glacier elevation changes from [7], for a comprehensive analysis of glacier velocity changes and the possible driving factors.

We estimated glacier velocities by applying cross-correlation using the COSI-Corr software package. Optical correlation is implemented in the frequency domain with an accuracy of ∼1/10 of the input pixel size [29,30]. We used a correlation window of 64 pixels × 64 pixels as a first step, followed by 32 pixels × 32 pixels for a second run, with a step of 16 pixels × 16 pixels (160 m) for Sentinel-2A/B data and 10 pixels × 10 pixels (150 m) for Landsat-7 data. The resulting east-west and north-south components of the displacement were filtered using the non-local means algorithm [30]. The purpose of the filtering is to use the correlation to exclude snow-covered regions, where the correlation is relatively low. We then used RGI 6.0 to mask out the non-glacial areas and generated two annual velocity fields, for 1999–2000 and 2017–2018, for all 36,722 glaciers within the Himalayas, as shown in Figure 2.
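The idea of frequency-domain correlation can be illustrated with a minimal sketch. This is not the COSI-Corr implementation (which reaches sub-pixel accuracy); it is a plain integer-pixel phase correlation written with NumPy, and the function names, the synthetic 64 × 64 window and the `dt_years` parameter are assumptions introduced only for illustration.

```python
import numpy as np

def phase_correlation_shift(ref, moved):
    """Integer-pixel (row, col) shift that maps `ref` onto `moved`."""
    cross = np.fft.fft2(moved) * np.conj(np.fft.fft2(ref))
    cross /= np.abs(cross) + 1e-12           # keep only the phase information
    corr = np.fft.ifft2(cross).real          # correlation surface
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks beyond half the window size wrap around to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

def speed_m_per_yr(shift_px, pixel_size_m, dt_years):
    """Convert a pixel shift into a surface speed in m/yr."""
    rows, cols = shift_px
    return float(np.hypot(rows, cols)) * pixel_size_m / dt_years

rng = np.random.default_rng(0)
window = rng.random((64, 64))                     # stand-in for one correlation window
shifted = np.roll(window, (3, -5), axis=(0, 1))   # synthetic, known displacement
shift = phase_correlation_shift(window, shifted)
```

With a 10 m pixel (Sentinel-2) and an assumed one-year separation between the two images, `speed_m_per_yr(shift, 10.0, 1.0)` turns the recovered shift into a speed; the real time separation of each image pair would have to be used in practice.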

The velocity uncertainty was estimated from the overlapping areas between adjacent pairs. As shown in Figure 3, the Landsat-7 and Sentinel-2 derived velocities have a measurement error of −3.4 ± 11.6 m yr<sup>−1</sup> and 0.4 ± 4.6 m yr<sup>−1</sup>, respectively. Since all the image pairs are centred around December with little interannual fluctuation, the uncertainty depends largely on radiometric quality and image resolution, indicating that Sentinel-2 imagery has a better radiometric quality than Landsat-7 imagery [21], and higher resolution improves pixel matching [20], hence yielding more consistent results.

**Figure 1.** Glaciers are highlighted in light blue. The total number of glaciers in each subregion is labelled. The black box marks the location of Figure 2. We calculated the mean value of velocity changes of each subregion and compared our estimates to [9]. (**a**–**g**) show examples of velocity changes in subregions. See Table 1 for details.

To calculate velocity changes, we re-sampled the Landsat-7 and Sentinel-2 derived velocity maps onto the same grid with a spacing of 160 m, and then differenced the two velocities on a pixel basis (Figure 1). Unlike [9], who based their analysis of glacier velocity changes mainly on regions, we conducted our analysis on individual glaciers. For each glacier, we computed its velocity and the associated change by averaging all the pixel values that cover the glacier.

**Figure 2.** Deriving glacier surface velocity using optical correlation. The Sentinel-2 optical data were provided by the European Space Agency (ESA). See Figure 1 for location.

**Figure 3.** Velocity uncertainties. The Landsat-7 and Sentinel-2 derived velocities have a measurement error of −3.4 ± 11.6 m yr<sup>−1</sup> and 0.4 ± 4.6 m yr<sup>−1</sup>, respectively.

#### **4. Results**

A total of 36,722 glaciers were included in our study (from the RGI 6.0 database). As shown in Table 1, 43.5% of the glaciers had a stable velocity during 1999–2018 (the difference between Landsat-7 and Sentinel-2 velocities is no more than 3 m yr<sup>−1</sup>), 32.0% sped up (velocity changes > 3 m yr<sup>−1</sup>) and 24.5% slowed down (velocity changes < −3 m yr<sup>−1</sup>). Although speedup glaciers outnumber slowdown glaciers, the average amplitude of glacier speedup (6.3 m yr<sup>−1</sup>) is much smaller than that of slowdown (−12.3 m yr<sup>−1</sup>). At regional scale, more glaciers in the east of the Himalayas, i.e., West Nepal (33.4%), East Nepal (56.2%), Bhutan (34.1%) and Nyainqentanglha (41.3%), have experienced speedup than in the west, i.e., Hindu Kush (22.6%), Spiti Lahaul (19.6%) and Karakoram (17.3%). Karakoram has seen the largest proportion of glacier slowdown (31.5%), followed by West Nepal (29.1%), Hindu Kush (28.3%), Bhutan (22.1%), Nyainqentanglha (21.9%), East Nepal (19.6%) and Spiti Lahaul (18.7%). The greatest speedup (about 130 m yr<sup>−1</sup> in RGI60-14.04875) and slowdown (about −250 m yr<sup>−1</sup> in RGI60-14.0300) also occurred in Karakoram.
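The per-glacier bookkeeping described above (pixel-wise differencing, averaging over each glacier, then applying the ±3 m yr<sup>−1</sup> threshold) can be sketched as follows. The function names and the sample pixel values are hypothetical; only the threshold and the three class labels come from the text.

```python
from statistics import mean

def glacier_velocity_change(v_early, v_late):
    """Mean of the pixel-wise velocity differences (m/yr) over one glacier."""
    return mean(late - early for early, late in zip(v_early, v_late))

def classify_velocity_change(dv, threshold=3.0):
    """Label a velocity change (m/yr) as 'speedup', 'slowdown' or 'stable'."""
    if dv > threshold:
        return "speedup"
    if dv < -threshold:
        return "slowdown"
    return "stable"          # |dv| <= threshold

# Hypothetical co-located pixel velocities for one glacier (m/yr).
landsat7_pixels = [10.0, 12.0, 14.0]
sentinel2_pixels = [15.0, 18.0, 20.0]
label = classify_velocity_change(
    glacier_velocity_change(landsat7_pixels, sentinel2_pixels))
```

A change of exactly ±3 m yr<sup>−1</sup> is treated as stable here, matching the "no more than 3 m yr<sup>−1</sup>" wording.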

We analysed glacier surface velocity in combination with glacier geometry data including area, length, slope and thickness. We used Sentinel-2 derived velocities in the analysis, given the smaller measurement uncertainties. The results (Figure 4) showed that small glaciers (area < 20 km<sup>2</sup>), regardless of non-surge and surge types, exhibit complex flow velocities, which are poorly correlated with area (*R* = 0.13 for non-surge type and *R* = 0.22 for surge type), length (*R* = 0.12 and *R* = 0.21), slope (*R* = −0.08 and *R* = 0.02) and thickness (*R* = −0.01 and *R* = 0.09). Flow velocities of large (area ≥ 20 km<sup>2</sup>), non-surge-type glaciers show a positive correlation with both area (*R* = 0.69) and length (*R* = 0.78), i.e., the size of glaciers, suggesting faster motion with increasing sizes, as glacier flow laws would predict. Surge-type glaciers do not show an evident correlation with any of the factors. We also noted that glacier velocities appear to be independent of, or at least not linearly correlated with, both slope and thickness. Similarly, we analysed the relationship between glacier velocity changes and glacier size and thickness. As shown in Figure 4, velocity changes of both non-surge- and surge-type glaciers are completely independent of glacier geometry, indicated by the very low correlation (*R* ≈ 0).

**Table 1.** Statistics of glacier velocity changes in the Himalayas. Stable glaciers are defined as those for which the amplitude of the difference between Landsat-7 and Sentinel-2 velocities, i.e., the velocity change between 1999 and 2018, is ≤3 m yr<sup>−1</sup>. Velocity changes >3 m yr<sup>−1</sup> are regarded as speedup and <−3 m yr<sup>−1</sup> as slowdown. The overall changes were calculated by averaging all the glaciers within each subregion, in order to be comparable with the results in [9].


**Figure 4.** Glacier velocity and velocity changes versus area, length, slope and thickness. Dots and triangles stand for glaciers with an area smaller than 20 km<sup>2</sup> and over 20 km<sup>2</sup>, respectively. Non-surge- and surge-type glaciers are shown in blue and red, respectively. Lines show the linear regression of the corresponding group of data.

#### **5. Discussion**

#### *5.1. Regional Patterns of Surface Velocity Changes*

Dehecq et al. [9] calculated time-series velocity anomalies for 11 subregions in the Himalayas. An analysis of glacier changes at regional scales allows us to explore cryospheric responses to climatic forcing. Here, we followed their line of analysis and focused on the 7 subregions that stretch along the range front. In order to compare our estimates to theirs, we averaged the velocity differences of all glaciers in each subregion and calculated the rate of change (velocity differences divided by 1.8 decades). Velocity changes (Table 1) show that, on average, glaciers in Hindu Kush (−1.0 m yr<sup>−1</sup> decade<sup>−1</sup> for 4206 glaciers), Karakoram (−1.6 m yr<sup>−1</sup> decade<sup>−1</sup> for 12,822 glaciers), Spiti Lahaul (−0.4 m yr<sup>−1</sup> decade<sup>−1</sup> for 7796 glaciers) and West Nepal (−0.5 m yr<sup>−1</sup> decade<sup>−1</sup> for 3906 glaciers) have slowed down from 1999 to 2018, whereas a slight acceleration has occurred in the east: East Nepal (0.7 m yr<sup>−1</sup> decade<sup>−1</sup> for 3563 glaciers), Bhutan (0.2 m yr<sup>−1</sup> decade<sup>−1</sup> for 1547 glaciers) and Nyainqentanglha (0.2 m yr<sup>−1</sup> decade<sup>−1</sup> for 2882 glaciers). Our results differ from those of [9], who found that all subregions experienced a slowdown between 2000 and 2017, with a rate of change varying from −6.4 m yr<sup>−1</sup> decade<sup>−1</sup> in Nyainqentanglha to −1.4 m yr<sup>−1</sup> decade<sup>−1</sup> in Hindu Kush, except for Karakoram, which experienced a small speedup (0.8 m yr<sup>−1</sup> decade<sup>−1</sup>). The inconsistency likely results from seasonal changes in glacier flow. The images used in our study were collected in December, so our estimates are winter velocities, while the velocity estimates of [9] are centred around June. Past studies (e.g., [31–33]) have shown that, during summer, when melting occurs, glaciers flow much faster with stronger temporal and spatial variations. Velocity changes during winter (0∼1 m yr<sup>−1</sup> decade<sup>−1</sup>) are considerably smaller than the changes during summer (1∼6 m yr<sup>−1</sup> decade<sup>−1</sup>) (see Figure 1 and Table 1), possibly due to strong spatio-temporal variations in melting in the Himalayas. From the winter velocity changes, we also observed a contrast between the western (slowdown) and eastern (speedup) parts of the Himalayan front, indicating heterogeneous changes in accumulation and/or melting under different climatic regimes.

#### *5.2. Linking Surface Velocity Changes with Glacier Mass Balance*

To further investigate the driving factors of glacier velocity changes, we use the empirical power law relation between glacier surface velocity *V* and driving stress *τ* [34–37]:

$$V = A\tau^m \tag{1}$$

where *A* and *m* are positive constants, related to ice rheology, bed topography and flow mechanisms (ice deformation and basal sliding) [34,35]. *m* has been estimated to vary from 1 (flow over soft sediments, [38]) to 4 (high subglacial pressure, [9]) under different circumstances.

Taking the derivative of *V* with respect to *τ*, we have:

$$dV = Am\tau^{(m-1)}d\tau\tag{2}$$

Combining Equations (1) and (2), we have:

$$\frac{dV}{V} = m\frac{d\tau}{\tau} \tag{3}$$
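
For readers who want the intermediate step, Equation (3) follows from dividing Equation (2) by Equation (1); this only restates the algebra above and introduces no new assumptions:

$$\frac{dV}{V} = \frac{A m \tau^{(m-1)}\, d\tau}{A \tau^{m}} = m\frac{d\tau}{\tau}$$

In other words, for small perturbations, a 1% change in driving stress corresponds to a change of roughly *m*% in surface velocity, with *m* between 1 and 4 as quoted above.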

Assuming that changes in the driving stress (*dτ*) are induced by mass balance (*dM*) and ignoring other factors, as proposed by [9], i.e., *dτ* = *C*<sub>1</sub>*dM* + *C*<sub>2</sub>, we have:

$$\frac{dV}{V} = \frac{m}{\tau} C_1 dM + \frac{m}{\tau} C_2 \tag{4}$$

where *C*<sub>1</sub> and *C*<sub>2</sub> are assumed constant, relating the driving stress and mass balance. Dehecq et al. [9] analysed velocity changes and mass balance data and found that, at regional scales, summer velocity changes are positively correlated with mass balance: *dV*/*V* ∼ 1.25*dM* (see Figure 5).

To test whether velocity changes exhibit any seasonal variability, we applied Equation (4) to analyse the winter velocity estimates. We calculated the average values of velocity *V* and its change *dV* for each of the glaciers, based on which we determined the overall *dV* and *V* for each subregion. The measurements of glacier mass balance (*dM*) were taken from [7]. Although the estimates of glacier mass balance in Brun et al. [7] did not take seasonal variability into account, an earlier study [2] has shown that the long-term trend of *dM* is consistent between seasons (the amplitude differs slightly). Therefore, using the average long-term trend of *dM* should not affect the linear relationship between *dV* and *dM*. As shown in Figure 5, glacier velocity changes *dV*/*V* are negatively correlated with mass balance (*dV*/*V* ∼ −0.96*dM*), suggesting that ice mass loss promotes glacier motion in winter. This contrasts with the summer velocity changes reported by [9], which indicate that mass loss drives glacier slowdown. Such seasonal variability indicates different mechanisms of glacier mass loss in the Himalayas. Increased melt conditions in the early winter period, November and December, could enhance velocity, as noted by Bocchiola et al. [39] and Pelto et al. [40] in the eastern Himalaya. During summer, mass losses more likely result from increased ablation zone melting, given that precipitation has not been significant [41,42]. The result suggests that velocity changes in the Himalayas are driven by temperature rather than by accumulation processes.
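A relation such as *dV*/*V* ∼ −0.96*dM* is the slope of a straight-line fit of regional *dV*/*V* against *dM*. The sketch below shows an ordinary least-squares fit of that kind; the function name and the input values are hypothetical (they are generated from an assumed slope and intercept, not taken from Figure 5 or from [7]). In terms of Equation (4), the fitted slope plays the role of *mC*<sub>1</sub>/*τ* and the intercept that of *mC*<sub>2</sub>/*τ*.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical regional values, generated from an assumed linear relation
# purely to exercise the fit; they are not measurements.
dM = [-0.6, -0.4, -0.3, -0.2, 0.0]
dV_over_V = [-0.96 * m + 0.05 for m in dM]
slope, intercept = fit_line(dM, dV_over_V)
```

With real data, the scatter around the line would also need to be quantified, for instance through the correlation coefficient and confidence bands reported in Figure 5.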

**Figure 5.** Glacier surface velocity changes versus mass balance. Mass balance data were taken from [7]. Shaded bands indicate 68% confidence interval. Winter velocity changes (filled circles, in this study) are negatively correlated with mass balance, with a correlation of *R* = −0.87. Summer velocity changes (filled triangles, [9]) are positively correlated with mass balance.

#### **6. Conclusions**

In this study, we used Landsat-7 and Sentinel-2 optical imagery to investigate glacier surface velocity and the associated changes in the Himalayas. We analysed flow patterns of individual glaciers along the Himalayan mountain front and found that glacier velocity changes exhibit an evident heterogeneity at different spatial scales. Of all the 36,722 glaciers, 32% have experienced speedup, 24.5% have slowed down, and the remaining 43.5% have remained stable. At regional scales, the amplitude of velocity changes is significantly larger in summer [9] than in winter (this study). The decreasing velocities in winter between 1999 and 2018 in the western part of the Himalayas, in contrast to the increasing velocities in the eastern part, may be caused by changes in accumulation and/or melting under different climatic regimes. Accelerated flow in East Nepal may be related to melting conditions, as more melting has been observed there. We also observed that glacier velocity changes in winter are controlled by mass balance, as suggested by [9]; however, unlike summer velocity changes, which are positively correlated with mass balance, winter velocity changes show a negative correlation. Our study suggests that glacier velocity changes in the Himalayas are more spatially and temporally heterogeneous than previously thought, emphasising complex interactions between glacier dynamics and environmental forcing.

**Author Contributions:** Y.Z. and J.C. performed data processing and interpreted the results. Y.Z. wrote the original draft. J.C. and X.C. reviewed the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** The APC was funded by the Second Tibetan Scientific Expedition and Research Program (STEP) (2019QZKK0901).

**Data Availability Statement:** Landsat-7 images were downloaded from https://earthexplorer.usgs.gov/ (accessed on 23 January 2020); Sentinel-2 images were downloaded from https://scihub.esa.int/ (accessed on 23 January 2020). The Randolph Glacier Inventory (RGI 6.0) is freely available from https://www.gtn-g.ch/data\_catalogue\_rgi/ (accessed on 23 January 2020).

**Acknowledgments:** This work was supported by the Second Tibetan Plateau Scientific Expedition and Research Program (STEP) (2019QZKK0901) and the Innovation Group Project of Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (No. 311021008).

**Conflicts of Interest:** The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

#### **References**


## *Article* **The UVSQ-SAT/INSPIRESat-5 CubeSat Mission: First In-Orbit Measurements of the Earth's Outgoing Radiation**

**Mustapha Meftah <sup>1,</sup>\*, Thomas Boutéraon <sup>1</sup>, Christophe Dufour <sup>1</sup>, Alain Hauchecorne <sup>1</sup>, Philippe Keckhut <sup>1</sup>, Adrien Finance <sup>1,2</sup>, Slimane Bekki <sup>1</sup>, Sadok Abbaki <sup>1</sup>, Emmanuel Bertran <sup>1</sup>, Luc Damé <sup>1</sup>, Jean-Luc Engler <sup>1</sup>, Patrick Galopeau <sup>1</sup>, Pierre Gilbert <sup>1</sup>, Laurent Lapauw <sup>1</sup>, Alain Sarkissian <sup>1</sup>, André-Jean Vieau <sup>1</sup>, Patrick Lacroix <sup>1</sup>, Nicolas Caignard <sup>1</sup>, Xavier Arrateig <sup>1</sup>, Odile Hembise Fanton d'Andon <sup>2</sup>, Antoine Mangin <sup>2</sup>, Jean-Paul Carta <sup>3</sup>, Fabrice Boust <sup>4</sup>, Michel Mahé <sup>5</sup> and Christophe Mercier <sup>6</sup>**


**Abstract:** UltraViolet & infrared Sensors at high Quantum efficiency onboard a small SATellite (UVSQ-SAT) is a small satellite built to the CubeSat standard, whose development began in 2017 as one of the missions of the International Satellite Program in Research and Education (INSPIRE) consortium. UVSQ-SAT is an educational, technological and scientific pathfinder CubeSat mission dedicated to the observation of the Earth and the Sun. It was imagined, designed, produced and tested by LATMOS in collaboration with its academic and industrial partners and the French-speaking radio amateur community. About the size of a Rubik's Cube and weighing about 2 kg, the satellite was put into orbit in January 2021 by a SpaceX Falcon 9 launch vehicle. After briefly introducing the UVSQ-SAT mission, this paper presents the importance of measuring the Earth's radiation budget and its energy imbalance, and the scientific objectives related to its various components. Finally, the first in-orbit observations are shown (maps of the solar radiation reflected by the Earth and of the outgoing longwave radiation at the top of the atmosphere during February 2021). UVSQ-SAT is one of the few CubeSats worldwide with a scientific goal related to climate studies. It represents research in remote sensing technologies for climate observation and monitoring.

**Keywords:** climate observation and monitoring; earth radiation budget; nanosatellite; IPCC

#### **1. Introduction**

UVSQ-SAT is a nanosatellite designed to observe the Sun and the Earth, mainly for observing essential climate variables. It allows, among other things, measurements of the solar radiation reflected by the Earth, or outgoing shortwave radiation (OSR), and of the outgoing longwave radiation (OLR) at the top of the atmosphere (TOA). A detailed description of the UVSQ-SAT mission and its satellite is given by Meftah et al. (2020) [1]. The main technical and scientific goal is to implement an agile demonstrator capable of measuring the Earth's radiation balance at the TOA. The UVSQ-SAT mission will make it possible to validate several miniaturized technologies as a first step towards accurately measuring the Earth's energy imbalance (EEI) with a constellation of small satellites. Regarding the educational objectives, the UVSQ-SAT program seeks to promote the transmission and valorization of scientific and technical knowledge through innovation. Innovation in turn feeds the creation of value and enables technological breakthroughs and the associated risk-taking, the "NewSpace" approach, and technology transfer.

**Citation:** Meftah, M.; Boutéraon, T.; Dufour, C.; Hauchecorne, A.; Keckhut, P.; Finance, A.; Bekki, S.; Abbaki, S.; Bertran, E.; Damé, L.; et al. The UVSQ-SAT/INSPIRESat-5 CubeSat Mission: First In-Orbit Measurements of the Earth's Outgoing Radiation. *Remote Sens.* **2021**, *13*, 1449. https://doi.org/10.3390/rs13081449

Academic Editor: Xander Wang

Received: 24 March 2021 Accepted: 5 April 2021 Published: 8 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Developed from 2017 onwards, UVSQ-SAT is based on the CubeSat architecture [2,3], a satellite format defined in 1999 by the California Polytechnic State University and Stanford University (USA). UVSQ-SAT was developed and implemented by LATMOS and its INSPIRE international partners. The INSPIRE consortium was formed in 2015, with the objectives of developing small satellites to perform science missions, building a project-based learning curriculum for space science and engineering, and establishing a supporting network of ground stations. UVSQ-SAT/INSPIRESat-5 fully meets these objectives. A representative mockup of the UVSQ-SAT satellite is shown in Figure 1 (left).

**Figure 1.** (**Left**) Representative mockup of the UVSQ-SAT satellite. (**Right**) Transporter 1 mission (SpaceX) with 143 commercial and government satellites on-board, including UVSQ-SAT. Credits: SpaceX.

Before being launched into space, the UVSQ-SAT CubeSat underwent a test campaign from August to October 2020. The campaign involved several mechanical tests, electromagnetic compatibility (EMC) tests, calibration and performance tests, thermal vacuum and thermal balance tests, magnetic tests, and end-to-end tests. The satellite was fully tested in French facilities (Figure 2) at LATMOS, at the Plateforme d'Intégration et de Tests (PIT) of the Observatoire de Versailles Saint-Quentin-en-Yvelines (OVSQ), at the French space agency (CNES), and at the Office National d'Etudes et de Recherches Aérospatiales (ONERA). A successful mechanical testing campaign was mandatory to obtain launch clearance. It involved quasi-static testing (carried out with a static load of 15 g to simulate the acceleration of the launch vehicle), shock testing, and random vibration testing (test levels are shown in Figure 3, left). Random vibration testing was carried out to simulate the loads of the upper stages of the SpaceX Falcon 9 launcher and is the most severe test. Before and after each random vibration run on a given axis, a resonance search (Figure 3, right) was performed with a frequency sweep of low-level sinusoidal vibrations to characterize the main resonant modes of the UVSQ-SAT CubeSat (first eigenmode close to 418 Hz), to determine whether the test loads had changed its mechanical properties, and to reveal possible deficiencies. A thermal bakeout (+60 °C) was mandatory for outgassing volatile materials from the CubeSat. A thermal vacuum cycling test (temperature profile given in Figure 4) was also carried out to simulate the temperature conditions in orbit and to demonstrate that the CubeSat is able to survive without loss of integrity or functionality. After all performance and environmental tests had been completed successfully and frequency coordination had been confirmed, UVSQ-SAT was ready for launch.

**Figure 2.** UVSQ-SAT environmental tests and calibrations carried out during the year 2020. (**a**) Vibration. (**b**) EMC tests. (**c**) Calibration with a Xenon lamp and a black body. (**d**) Thermal vacuum and thermal balance tests. (**e**) Magnetic cleanliness verification of the satellite. (**f**) End to end tests with the UHF/VHF LATMOS ground-based station (telemetry at 437.020 MHz, telecommand at 145.905 MHz).

UVSQ-SAT was launched on 24 January 2021 by a Falcon 9 rocket as part of the "Transporter 1" mission (Figure 1, right). Since then, the satellite has been in a sun-synchronous orbit around the Earth at an altitude of about 533 km (515 km at perigee), with an inclination of 97.5010°, an eccentricity of 0.0014455, a right ascension of the ascending node of 128.4244°, and an argument of perigee of 94.1573°. The duration of an orbit is about 95.18 min. The selected orbit includes eclipses, which allow calibrations to be performed with observations towards deep space or towards the Earth only.
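As a rough consistency check on the quoted orbital period, Kepler's third law can be applied to the stated altitude. The sketch below assumes that 533 km is the mean altitude above a spherical Earth of radius 6371 km, which is a simplification introduced here and not stated in the text; the function name is likewise illustrative.

```python
import math

MU_EARTH = 398600.4418   # km^3/s^2, standard gravitational parameter of the Earth
R_EARTH = 6371.0         # km, mean Earth radius (assumed reference radius)

def orbital_period_min(mean_altitude_km):
    """Period (minutes) of a Keplerian orbit with the given mean altitude."""
    semi_major_axis = R_EARTH + mean_altitude_km
    return 2.0 * math.pi * math.sqrt(semi_major_axis ** 3 / MU_EARTH) / 60.0

period = orbital_period_min(533.0)
```

Under these assumptions the result is close to 95 min, in line with the quoted value; the small residual depends on the reference radius chosen and on how the mean altitude is defined.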

This manuscript presents the first UVSQ-SAT in-orbit measurements of the Earth's outgoing radiation. Section 2 presents the objectives of the UVSQ-SAT mission and the importance of measuring the Earth's radiation budget, together with the expected scientific requirements on the measurement of the Earth's energy imbalance. These requirements are given for the UVSQ-SAT demonstrator and for the future constellation (Terra-F), which must be disruptive. Section 3 presents the methods used to reconstruct the maps of the albedo and the OLR at the TOA, which apply to the observations made by the UVSQ-SAT instruments and to the model used. Section 4 shows the first observations made by the UVSQ-SAT mission and the preliminary results. Section 5 highlights the perspectives related to this study and, finally, Section 6 presents the conclusions.

**Figure 3.** (**Left**) Random vibration profile run in three orthogonal axes, referred to as X, Y, Z. (**Right**) Resonance search carried out on UVSQ-SAT before and after each random run (September 2020).

**Figure 4.** Profile of the UVSQ-SAT thermal vacuum cycling tests carried out in September 2020.

#### **2. Scientific Objectives and Requirements**

#### *2.1. Scientific Objectives*

One of the objectives of the UVSQ-SAT mission is the study of the Earth's radiation budget (ERB) and its energy imbalance [1]. Climate is largely determined by the Earth's energy balance at the top of the atmosphere, which regulates the overall energy content of the Earth system, i.e., the atmosphere and ocean (IPCC, 2013). The system is heated by the absorption of incoming solar energy (shortwave) and is cooled by terrestrial infrared emission (longwave) to space. Almost 30% of the incoming solar energy is not absorbed but reflected back to space. Human activities have led to increased levels of heat-trapping greenhouse gases (GHGs) in the atmosphere, which decreases the amount of terrestrial radiation that can escape and results in an Earth energy imbalance. As a result, the Earth system is heating up. The most obvious sign of this is the long-term increase in global surface temperatures, which increases outgoing terrestrial infrared emission and thus tends to restore the Earth's energy balance. However, as illustrated by the apparent slowdown in global surface warming observed between 1998 and 2012 ("climate hiatus"), global average surface temperature, with its large decadal internal variability, is certainly not the most appropriate indicator of the EEI and associated energy accumulation in the Earth system [4]. Since the EEI drives climate change, its continuous monitoring has been identified as a fundamental diagnostic for analyzing climate variability and anticipating future changes. Closing the Earth's energy balance is considered a key step to further improve our understanding of global climate change [5]. Ideally, the EEI should be measured continuously from space. To date, existing satellite instruments have not been able to measure solar and terrestrial radiative fluxes with sufficient accuracy to directly determine the absolute value of the EEI on their own, although measurements of relative changes in the EEI are much more reliable [6–8]. 
Note that determining EEI requires very accurate global radiative measurements because EEI results from the difference of several terms (incoming solar energy, reflected incoming solar energy, outgoing terrestrial infrared energy) that are two orders of magnitude higher than EEI. For this reason, satellite radiative flux data are often anchored in some way to ocean heat content data, as most of the excess energy does not accumulate at the surface or in the atmosphere, but is absorbed by the oceans [5,8]. Although ocean heat content is considered a better indicator of EEI, its global assessment remains challenging, mainly due to insufficient and inconsistent data coverage, and uncertainties in global scale extension [9,10]. Overall, this lack of accurate satellite measurements of EEI is a major problem because climate change is primarily a perturbation of the Earth's energy balance. In addition, satellite measurements of solar and terrestrial energy fluxes also represent critical constraints and benchmarks for the evaluation and improvement of climate models, including their representations of the radiative effects of clouds and aerosols (IPCC, 2013). Earth's radiation budget data are therefore a key source of information for distinguishing external from internal climate variability and for assessing the sensitivity of climate to different forcings (including volcanic eruptions, solar variability, and anthropogenic aerosols).

Determining the EEI is a complex problem. The first relatively detailed balances of the energy exchanges of our planet and its atmosphere were published in 1997 by Kiehl and Trenberth [11], and an update of the ERB was completed by Stephens et al. [12] in 2012. Figure 5 illustrates the flow of energy through Earth's atmosphere. Table 1 provides the main ERB contributions and the EEI during the 2000–2010 period. Figure 5 shows how solar radiation warms our planet, and how this energy becomes temporarily trapped as it flows away from Earth's surface as longwave infrared radiation. This energy trap produces the greenhouse effect, which represents the main driver of global warming [13].

EEI is the difference between the total solar irradiance (TSI) and the OSR combined with the OLR thermally emitted. At the TOA, the EEI was +0.58 ± 0.15 W m<sup>−2</sup> for the period 2005–2010 according to Hansen et al. (2011) [14]. It was estimated at +0.60 ± 0.40 W m<sup>−2</sup> between 2000 and 2010 by Stephens et al. (2012) [12]. A recently published study by Von Schuckmann et al. (2020) [15] shows that the EEI is currently estimated to be +0.87 ± 0.12 W m<sup>−2</sup> over the period 2010–2018 compared to +0.47 ± 0.10 W m<sup>−2</sup> over the period 1971–2018. This positive EEI describes the excess of heat in the Earth system, mainly due to human activity.
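The arithmetic behind this definition, and the reason it demands very accurate flux measurements, can be sketched as follows. On a global mean the incoming solar flux is TSI/4, because the sunlight intercepted by the Earth's disc is spread over the whole sphere. The flux values below are round, illustrative numbers chosen to give an imbalance of about 0.6 W m<sup>−2</sup>; they are not UVSQ-SAT or CERES measurements, and the function name is hypothetical.

```python
def earth_energy_imbalance(tsi, osr, olr):
    """Global-mean EEI (W/m^2) from TSI and global-mean OSR and OLR (W/m^2)."""
    incoming = tsi / 4.0     # disc-to-sphere geometry
    return incoming - osr - olr

# Illustrative round numbers only (W/m^2), not measurements.
TSI, OSR, OLR = 1361.0, 100.0, 239.65
eei = earth_energy_imbalance(TSI, OSR, OLR)

# Effect of a 1% calibration error on the OLR term alone.
olr_error = 0.01 * OLR
error_to_signal = olr_error / eei
```

With these numbers a 1% error on the OLR alone is several times larger than the EEI itself, which illustrates why the imbalance, being a small difference of much larger terms, is so hard to measure directly.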

**Figure 5.** Radiation budget diagram of the Earth's atmosphere from Stephens et al. (2012) [12]. ERB for the period 2000–2010, where all flows are expressed in Wm<sup>−2</sup>.



Figure 6 shows the evolution of the OLR and the EEI (12-month rolling averages) over the last four decades. Changes in sea surface temperature (SST) during different phases of the El Niño Southern Oscillation (ENSO) and in cloud properties are often reflected in changes in OLR. The strong variations in the EEI correspond to volcanic eruptions that deposit aerosols in the stratosphere, thus cooling Earth by reflecting sunlight back to space. In 1982, El Chichón ejected millions of metric tons of sulfur dioxide into the stratosphere. In 1991, the eruption of Mount Pinatubo caused the largest stratospheric disturbance since the Krakatoa eruption in 1883, lowering global temperatures and increasing ozone depletion. Thus, the global energy imbalance of the Earth is not constant over time. It depends on natural variability and on the complexity of the different climate forcings. The time series of measurements from the Clouds and the Earth's Radiant Energy System (CERES) instrument is compared to the ERA 5 model, which represents the fifth generation of atmospheric reanalyses of global climate from the European Centre for Medium-Range Weather Forecasts (ECMWF). These results highlight the impact of anthropogenic radiative forcing during a period in which the TSI has decreased since 1980 and in which solar cycle 24 (the 24th cycle since 1755) is much weaker than the previous two cycles. The various studies [12,14,15] conclusively show the dominant role of the human-created greenhouse effect in driving global climate change. The EEI represents the most critical measure of the state of the Earth's climate. It sets expectations for future climate change (taking into account the Earth's inertia).

**Figure 6.** Time evolution of the OLR and the EEI from CERES measurements and the ERA 5 model.

The EEI represents the best approach to characterize global warming compared to other commonly used parameters (CO2 concentration and Earth surface temperature). This highlights the importance of space-based missions that perform such measurements, including the UVSQ-SAT pathfinder mission associated with its possible small satellites constellation [1] named Terra-F. Further efforts to implement new space-based observing systems are more than necessary, as the EEI represents additional global warming that will occur without additional changes in radiative forcings. Direct measurements of changes in the EEI are of paramount importance in determining the rate of climate change on regional and global scales.

#### *2.2. Scientific Requirements and Uncertainties on EEI Measurements*

Accurately measuring the absolute value of the EEI at the TOA and its variability over time is still a scientific and technological challenge. The relevant scientific objective is to be able to detect any long-term global trend with high accuracy (stability of at least ±0.2 Wm<sup>−2</sup> per decade), but also to obtain fine spatial resolution (a few tens of km) and high temporal resolution (three hours, as for weather forecasting) with an accuracy of a few Wm<sup>−2</sup>. With current technologies, we do not know whether all these requirements are achievable with instruments on board satellites, but they are very important for better determining the changes.

To date, the best estimate of the EEI is obtained from the change in heat stored in the ocean, commonly referred to as ocean heat content (OHC). However, the absorption of heat by the ocean acts as a "buffer". This causes the rate of surface warming to slow down. The OHC measurements made by the Argo automated floats are excellent, but do not provide answers about the "short term" dynamics. To solve this issue, we need fine spatial and temporal resolutions, as previously stated. For this, we need a constellation of satellites with instruments that combine measurements with a narrow field of view (FOV) and a wide FOV. The key to solving this problem is the resolution of the diurnal variation, which depends on the physical parameterization of the models. The "long term" is associated with the convolution of quick processes and dynamic phenomena at longer temporal and spatial scales of the atmosphere-ocean system. Thus, the best strategy for measuring the EEI is to track the "long term" temporal evolution of the OHC, since most of the excess heat (89% over 1970–2018) is absorbed by the ocean [15]. These measurements must be combined with observations at the TOA using satellite-based instruments to provide "short term" information. Measurements with good spatio-temporal resolutions are crucial for advancing our understanding of climate change, as the radiative balance is driven in part by the radiative impacts of aerosols and clouds, which are highly spatially and temporally variable and are still relatively poorly quantified [16]. Currently, we do not know whether cloud-related changes will amplify, mitigate, or perhaps have only a small effect on the increase in the Earth's global temperature due to anthropogenic radiative forcings. Furthermore, we do not know the impact of radiative forcing related to anthropogenic aerosols that alter cloud properties. Global climate models have limitations with respect to these two issues. This highlights the importance of space-based missions to make accurate measurements with excellent spatio-temporal resolutions.

Observations are important to validate climate models and the magnitude of the energy imbalance at the TOA. Incident solar flux is an input to the model. The solar radiation reflected from the Earth (OSR) and the OLR are calculated at the model level and must be compared to those measured. Table 2 summarizes the expected science requirements for disruptive measurements of the energy imbalance components. In addition, some objectives of the UVSQ-SAT mission are presented. UVSQ-SAT is a science and technology demonstrator, one of whose objectives is to obtain an accurate short-term measurement of the EEI with an uncertainty that must be less than ±1.0 Wm<sup>−2</sup> over the course of a year. The expected spatial resolution of the UVSQ-SAT measurements is of the order of 1000 km. Going forward, the method of Gristey et al. (2017) [17] will be used to improve the accuracy of the measurements by means of a spherical harmonic analysis.

**Table 2.** Scientific requirements for the components of the Earth's radiation budget. High scientific relevance for a satellites constellation (Terra-F) and UVSQ-SAT expected performances.


#### **3. UVSQ-SAT Data Method and Map Reconstruction of the Variables (Observations and Model)**

*3.1. UVSQ-SAT Data Processing to Obtain Observation Time Series*

3.1.1. Instrumental Equations of the Earth's Radiative Sensors

The UVSQ-SAT scientific payload [1] consists of twelve miniaturized Earth's radiative sensors (ERS) based on thermopiles for monitoring the incoming solar radiation and the outgoing terrestrial radiation. Each face of the UVSQ-SAT satellite has two ERS sensors, which have different optical coatings (carbon nanotubes or optical solar reflector). The properties of the coatings were characterized in the lab (solar absorption, hemispherical emissivity, bidirectional reflectance distribution function (BRDF)). The principle of these ERS sensors with their associated coatings is to convert thermal energy into electrical energy. The output voltage (*V*) is passively induced by the thermopile and is proportional to the heat flux (Wm<sup>−2</sup>) through the sensor or, equivalently, to the temperature gradient across the thin-film substrate and to the number of thermocouple junction pairs. The output voltage is conditioned by an electronic unit, converted into analog-to-digital units (ADU), read by the UVSQ-SAT onboard computer (OBC) via a serial peripheral interface (SPI) bus, and stored by the OBC. Once data are retrieved, the ADU signals *S<sub>ADU</sub>* are converted into physical units to express the incident flux measurements Φ<sub>in</sub> on an ERS sensor using the transfer function (Equation (1)):

$$\Phi\_{\rm in} = \left(\frac{S\_{\rm ADU}(T\_{\rm s}, T\_{\rm b}) - V\_{\rm ADU}^{ref}(T\_{\rm b})}{N\_{\rm samp}} + C\_1(T\_{\rm s}, T\_{\rm b})\right) \times \frac{1}{G(T\_{\rm b})} \times \frac{1}{\rm Sens} + C\_2(T\_{\rm s}) \tag{1}$$

where *V<sup>ref</sup><sub>ADU</sub>* is a reference voltage, *N<sub>samp</sub>* is the number of samples for a measurement, *G*(*T<sub>b</sub>*) is the gain of the electronic unit, Sens is the sensitivity of the sensor (calibrated in the lab), and *C*<sub>1</sub> and *C*<sub>2</sub> are corrective offsets, which depend on the temperature of the sensor. In the case of thermopiles, *C*<sub>2</sub> = −*εσT<sub>s</sub>*<sup>4</sup>, where *ε* is the hemispherical emissivity of the thermopile coating, *σ* is the Stefan–Boltzmann constant, and *T<sub>s</sub>* is the ERS sensor temperature. *T<sub>b</sub>* represents the temperature of the sensors' electronic board.
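As an illustration, the transfer function of Equation (1) can be written as a short function. This is a minimal sketch rather than the mission's processing code: the function name, the argument names, and all numerical values are hypothetical, the temperature dependence of the gain and of *C*<sub>1</sub> is assumed to have been evaluated beforehand, and the sign of *C*<sub>2</sub> follows the text as written.

```python
SIGMA_SB = 5.670374419e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)


def ers_incident_flux(s_adu, v_ref_adu, n_samp, gain, sens, c1, emissivity, t_sensor):
    """Equation (1): convert a summed ADU signal into an incident flux (W m^-2).

    gain, sens and c1 are the already temperature-corrected values of
    G(Tb), Sens and C1(Ts, Tb); t_sensor is the sensor temperature in kelvin.
    """
    c2 = -emissivity * SIGMA_SB * t_sensor**4  # C2 as given for thermopiles
    return ((s_adu - v_ref_adu) / n_samp + c1) / (gain * sens) + c2


# Hypothetical values chosen only to exercise the formula
flux_no_offset = ers_incident_flux(
    s_adu=12000.0, v_ref_adu=2000.0, n_samp=10,
    gain=100.0, sens=0.01, c1=0.0, emissivity=0.0, t_sensor=300.0,
)
flux_with_c2 = ers_incident_flux(
    s_adu=12000.0, v_ref_adu=2000.0, n_samp=10,
    gain=100.0, sens=0.01, c1=0.0, emissivity=0.9, t_sensor=300.0,
)
print(flux_no_offset, flux_with_c2)
```

Keeping the calibration terms as explicit arguments makes it straightforward to substitute laboratory calibration curves for the constants used here.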

The total incident flux (Φ*in*) on each ERS sensor is the sum of the solar incident flux, the planetary incident flux (Φ*pin*), and the albedo incident flux (Φ*ain*). The incident solar flux is considered to be known (TSI is measured precisely elsewhere). By combining information from two ERS sensors on the same face (a sensor with carbon nanotubes (solar absorption close to 1) and a sensor with an optical solar reflector (solar absorption less than 0.1)), we get the albedo (*a*) of the planet and the outgoing longwave radiation (OLR) at a measurement point located in latitude and longitude on the world map (Equations (2) and (3) are given as an example for a satellite face observing at the nadir):

$$
\Phi p\_{\rm in} = \varepsilon \times \text{OLR} \times \left(\frac{R}{R + z\_{\rm sat}}\right)^2 \tag{2}
$$

$$\Phi a\_{in} = \alpha \times a \times \left(\frac{R}{R + z\_{sat}}\right)^2 \times \cos(\xi) \times \text{TSI} \times \left(\frac{1 \, au}{d\_s}\right)^2 \tag{3}$$

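where *R* is Earth's radius, *z<sub>sat</sub>* is the satellite altitude, *ε* is the hemispherical emissivity of the ERS coating, *α* is the ERS (coating) solar absorption, *a* is the albedo, *ξ* is the solar zenith angle, 1 *au* is one astronomical unit, and *d<sub>s</sub>* is the distance between UVSQ-SAT and the Sun.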

During satellite eclipse periods, the two ERS sensors on the same face measure only the OLR, which allows direct measurements and inter-calibration of the sensors.
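The two-sensor retrieval described above can be sketched as a small linear system. The sketch assumes a nadir-viewing face that receives no direct sunlight, so that each sensor sees only the planetary term of Equation (2) and the albedo term of Equation (3). The function names and the emissivity and absorption values in the example are hypothetical.

```python
import numpy as np

R_EARTH_KM = 6371.0


def forward_fluxes(olr, albedo, eps, alpha, z_sat_km, sza_rad, tsi=1361.0, d_au=1.0):
    """Equations (2) + (3): flux on each of the two sensors of a nadir face."""
    vf = (R_EARTH_KM / (R_EARTH_KM + z_sat_km)) ** 2
    solar = vf * np.cos(sza_rad) * tsi / d_au**2
    return np.array([e * olr * vf + a * albedo * solar for e, a in zip(eps, alpha)])


def retrieve_olr_albedo(fluxes, eps, alpha, z_sat_km, sza_rad, tsi=1361.0, d_au=1.0):
    """Invert the 2x2 system for (OLR, albedo) from two differently coated sensors."""
    vf = (R_EARTH_KM / (R_EARTH_KM + z_sat_km)) ** 2
    solar = vf * np.cos(sza_rad) * tsi / d_au**2
    system = np.array([[eps[0] * vf, alpha[0] * solar],
                       [eps[1] * vf, alpha[1] * solar]])
    olr, albedo = np.linalg.solve(system, np.asarray(fluxes, dtype=float))
    return float(olr), float(albedo)


# Hypothetical coating properties: (carbon nanotubes, optical solar reflector)
eps = (0.98, 0.80)    # hemispherical emissivities (assumed values)
alpha = (0.99, 0.08)  # solar absorptions (close to 1 and below 0.1)

measured = forward_fluxes(240.0, 0.30, eps, alpha, z_sat_km=540.0, sza_rad=0.5)
olr_hat, albedo_hat = retrieve_olr_albedo(measured, eps, alpha, z_sat_km=540.0, sza_rad=0.5)
print(olr_hat, albedo_hat)
```

The system is well conditioned only when the two coatings have clearly different ratios of solar absorption to emissivity, which is the reason for pairing a carbon nanotube sensor with an optical solar reflector sensor.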

#### 3.1.2. Instrumental Equations of the Optical Sensors Based on Photodiodes

The UVSQ-SAT scientific payload also has six photodiodes to measure the total solar irradiance and the outgoing shortwave radiation, and four additional photodiodes for observing the Sun (three measure the total solar irradiance and one measures the UV solar spectral irradiance in the Herzberg continuum). Each face of the satellite has at least one photodiode that observes external fluxes. A photodiode is a semiconductor p-n junction device that converts light into an electrical current. The resulting current is converted into ADU and stored by the UVSQ-SAT onboard computer. Once data are retrieved, the ADU signals *Sp<sub>ADU</sub>* are converted into physical units to express the measurements Φ<sub>pho</sub> using the transfer function (Equation (4)):

$$\Phi\_{\rm pho} = \left(\frac{Sp\_{ADU}(T\_{\rm pho}, T\_b) - V\_{ADU}^{ref}(T\_b)}{Np\_{\rm samp}} + Cp\_1(T\_{\rm pho}, T\_b)\right) \times \frac{1}{G\_{\rm pho}(T\_b)} \times \frac{1}{S\_{\rm pho}(T\_{\rm pho})} \tag{4}$$

where *Np<sub>samp</sub>* is the number of samples for a measurement, *G<sub>pho</sub>* is the gain of the electronic unit, *S<sub>pho</sub>*(*T<sub>pho</sub>*) is the responsivity of the photodiode (calibration done by the manufacturer), and *Cp*<sub>1</sub> is a corrective offset, which depends on the temperature of the photodiode. *T<sub>pho</sub>* is the satellite structure temperature close to the photodiode.

For the six photodiodes that measure the total solar irradiance and the outgoing shortwave radiation, the albedo of the Earth can be obtained using an equation similar to that proposed in Equation (3).

#### 3.1.3. Methodology for Obtaining UVSQ-SAT Attitude and Position Time Series

UVSQ-SAT has a new 3-axis accelerometer/gyroscope/compass (Teach' Wear (TW)) for providing satellite attitude estimation [1]. Using the TW observations, it is possible to achieve sufficient attitude estimation accuracy for a satellite with a regular Kalman filter algorithm. Another way is to use a new method based on a multilayer perceptron network to determine the UVSQ-SAT satellite attitude [18].

In addition, we have a good knowledge of the UVSQ-SAT CubeSat position (latitude, longitude, altitude) from the Two-Line Element sets (TLE) that are generally used to predict the position of a satellite. For each measurement carried out by the instruments, we have its location on the world map. Figure 7 shows an example of the ground track of the UVSQ-SAT satellite during February 2021, i.e., the path on the surface of the Earth directly below the satellite's trajectory. Knowledge of the satellite position makes it possible, in particular, to calculate the solar zenith angle (from latitude, longitude, solar declination, and Universal Time) and to rigorously locate the measurement area.
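The solar zenith angle calculation mentioned above can be sketched as follows. This is a simplified illustration that assumes Cooper's approximation for the solar declination and neglects the equation of time. The function names are hypothetical, and a high-accuracy ephemeris may be preferable for operational processing.

```python
import math


def solar_declination_rad(day_of_year):
    """Approximate solar declination (Cooper's formula), in radians."""
    return math.radians(23.45) * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)


def solar_zenith_angle_rad(lat_deg, lon_deg, day_of_year, ut_hours):
    """Solar zenith angle at a sub-satellite point (equation of time neglected)."""
    lat = math.radians(lat_deg)
    dec = solar_declination_rad(day_of_year)
    # Hour angle: 15 degrees per hour from local solar noon, east longitude positive
    hour_angle = math.radians(15.0 * (ut_hours - 12.0) + lon_deg)
    cos_sza = (math.sin(lat) * math.sin(dec)
               + math.cos(lat) * math.cos(dec) * math.cos(hour_angle))
    return math.acos(max(-1.0, min(1.0, cos_sza)))


# Near the March equinox (day 81), Sun overhead at the equator at 12:00 UT
sza_noon = solar_zenith_angle_rad(0.0, 0.0, 81, 12.0)
# Same place at 00:00 UT: the Sun is below the horizon
sza_midnight = solar_zenith_angle_rad(0.0, 0.0, 81, 0.0)
print(math.degrees(sza_noon), math.degrees(sza_midnight))
```

A zenith angle larger than 90° indicates that the observed point is on the night side, where only the OLR contributes to the measured flux.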

**Figure 7.** Location (latitude and longitude) of UVSQ-SAT observations (red colored dots on the world map) during the month of February 2021.

#### *3.2. Map Reconstruction Method from UVSQ-SAT Observation Time Series*

Time series of the albedo and the OLR at the TOA are obtained from the observations made by the dedicated sensors on-board UVSQ-SAT.

Physically, each measurement of these time series represents the integral of the signal of interest (OLR, albedo) and depends in practice on a large number of parameters among which the bidirectional reflectance distribution function of the Earth's surface, the opacity of the atmosphere, the spectral and angular sensitivities of the sensors and their FOV. In particular, concerning the opacity, it depends on factors such as aerosol composition, clouds, temperature or pressure.

In a first approach, we consider that each measurement results from the contribution of a Gaussian distribution *G*(*θ<sub>i,j</sub>*) of points located at the surface of the Earth. Let the surface of the Earth be modeled by a regular grid whose coordinates are expressed in latitude and longitude. Each pixel (*i*, *j*) of the grid is defined by its longitude *λ<sub>i</sub>* and latitude *φ<sub>j</sub>* coordinates and by its area *S<sub>ij</sub>* (Equation (5)):

$$S\_{ij} = R^2 \times \cos(\phi\_{j}) \, \Delta\lambda\_i \, \Delta\phi\_{j} \tag{5}$$

where Δ*λ<sub>i</sub>* and Δ*φ<sub>j</sub>* are the sizes, in radians, of the pixel at longitude *λ<sub>i</sub>* and latitude *φ<sub>j</sub>*.

The satellite has a location *λ<sub>sat</sub>*, *φ<sub>sat</sub>* and an altitude *z<sub>sat</sub>*. The angle *θ<sub>i,j</sub>*, measured at the center of the Earth, between the satellite (nadir direction) and the pixel (*i*, *j*) is given by Equation (6). The angle *α<sub>i,j</sub>*, measured at the satellite, between the nadir and the pixel (*i*, *j*) is given by Equation (7). The satellite elevation *β<sub>i,j</sub>* (complement of the zenith angle) seen by the pixel (*i*, *j*) is given by Equation (8).

$$\theta\_{i,j} = \arccos[\cos(\phi\_j)\cos(\phi\_{sat})\cos(\lambda\_i - \lambda\_{sat}) + \sin(\phi\_j)\sin(\phi\_{sat})] \tag{6}$$

$$\alpha\_{i,j} = \operatorname{atan}\left(\frac{R\sin(\theta\_{i,j})}{z\_{\text{sat}} + R\left(1 - \cos(\theta\_{i,j})\right)}\right) \tag{7}$$

$$
\beta\_{i,j} = \frac{\pi}{2} - \alpha\_{i,j} - \theta\_{i,j} \tag{8}
$$

The view angle Ω*i*,*<sup>j</sup>* under which the pixel (*i*, *j*) is seen by the satellite is given by Equations (9) and (10).

$$
\Omega\_{i,j} = S\_{ij} \, z^2 \cos(\alpha\_{i,j}) \sin(\beta\_{i,j}) \qquad \text{if} \qquad \beta\_{i,j} \ge 0 \tag{9}
$$

$$
\Omega\_{i,j} = 0 \qquad \qquad \qquad \qquad \text{if} \qquad \beta\_{i,j} < 0 \tag{10}
$$

Figure 8 represents the different angles between the satellite, the nadir, and the observation location. This latter location is seen by the satellite according to a Gaussian distribution *G*(*θi*,*j*), which is defined by Equation (11):

$$G(\theta\_{i,j}) = \exp\left(-\frac{\alpha\_{i,j}^2}{2\sigma^2}\right) \tag{11}$$

where *σ* is the standard deviation of the Gaussian. *σ* is related to the FOV via *σ* = FOV/2. For example, for an average altitude of 540 km and a FOV of 135°, which are the characteristics of UVSQ-SAT, the diameter of the ground track is about 2600 km.

**Figure 8.** Visualization of the satellite in orbit, characteristic angles and pixel seen by the satellite.

The observed flux by UVSQ-SAT *Fsat*(*λsat*, *φsat*) is given by (Equation (12)):

$$F\_{\rm sat}(\lambda\_{\rm sat}, \phi\_{\rm sat}) = \frac{\sum\_{i,j} \, \Omega\_{i,j} \, G\_{i,j} \, F\_{i,j}}{\sum\_{i,j} \, \Omega\_{i,j} \, G\_{i,j}} \tag{12}$$

Finally, to reconstruct the map of what was measured by UVSQ-SAT, i.e., *Fsat*(*λsat*, *φsat*), we calculate the functions Ω*i*,*j*, *Gi*,*<sup>j</sup>* for the satellite track to find the flux *Fi*,*<sup>j</sup>* associated with each pixel.
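The chain of Equations (5)–(12) can be sketched as a forward model that simulates what a wide-FOV sensor would measure above a gridded flux map. This is a minimal illustration under stated assumptions: the grid is regular in latitude and longitude, the Gaussian of Equation (11) is evaluated on the nadir angle *α<sub>i,j</sub>*, and the factor *z*<sup>2</sup> of Equation (9) is omitted because a constant factor cancels in the ratio of Equation (12). The function name and the test map are hypothetical.

```python
import numpy as np

R_EARTH_KM = 6371.0


def simulate_measurement(flux_map, lat_deg, lon_deg, sat_lat_deg, sat_lon_deg,
                         z_sat_km, fov_deg):
    """Gaussian-weighted flux seen by the satellite, Equations (5) to (12).

    flux_map has shape (n_lat, n_lon); lat_deg and lon_deg hold pixel centers.
    """
    lat_deg = np.asarray(lat_deg, dtype=float)
    lon_deg = np.asarray(lon_deg, dtype=float)
    lam, phi = np.meshgrid(np.radians(lon_deg), np.radians(lat_deg))
    d_lam = np.radians(abs(lon_deg[1] - lon_deg[0]))
    d_phi = np.radians(abs(lat_deg[1] - lat_deg[0]))
    area = R_EARTH_KM**2 * np.cos(phi) * d_lam * d_phi               # Eq. (5)

    phi_s, lam_s = np.radians(sat_lat_deg), np.radians(sat_lon_deg)
    cos_theta = (np.cos(phi) * np.cos(phi_s) * np.cos(lam - lam_s)
                 + np.sin(phi) * np.sin(phi_s))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))                 # Eq. (6)
    alpha = np.arctan2(R_EARTH_KM * np.sin(theta),
                       z_sat_km + R_EARTH_KM * (1.0 - np.cos(theta)))  # Eq. (7)
    beta = np.pi / 2 - alpha - theta                                 # Eq. (8)

    # Eqs. (9)-(10), without the constant factor that cancels in Eq. (12)
    omega = np.where(beta >= 0.0, area * np.cos(alpha) * np.sin(beta), 0.0)
    sigma = np.radians(fov_deg) / 2.0
    gauss = np.exp(-alpha**2 / (2.0 * sigma**2))                     # Eq. (11)

    weights = omega * gauss
    return float(np.sum(weights * flux_map) / np.sum(weights))       # Eq. (12)


# Hypothetical 1-degree test map: flux decreasing from equator to poles
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(-179.5, 180.0, 1.0)
test_map = 240.0 + 60.0 * np.cos(np.radians(lat))[:, None] * np.ones(lon.size)
value = simulate_measurement(test_map, lat, lon, 0.0, 0.0, 540.0, 135.0)
print(value)
```

Applying this function at every point of the satellite ground track yields a simulated time series that can be compared with the measured one, which is the basis of the model comparison described in the next section.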

#### *3.3. Reconstruction Method of the ERA 5 Maps to Compare with the UVSQ-SAT Maps*

To compare the UVSQ-SAT measurements with a model (ERA 5), it is necessary to have similar inputs, i.e., the same observation time series (temporal sampling) and a similar FOV (spatial sampling) at the TOA.

Figure 9 shows the monthly averaged maps of the albedo and the OLR in February 2021 (for the ERA 5 model). All data of the ERA 5 model were used to obtain these maps (temporal resolution of 1 h and a pixel resolution of 721 × 1440).

**Figure 9.** (**Top**) Albedo from ERA 5 during the month of February 2021. (**Bottom**) OLR from ERA 5 during the month of February 2021.

Albedo is computed as the ratio of the difference between the incident shortwave radiation and the net shortwave radiation to the incident shortwave radiation. Surfaces that reflect a lot of the light falling on them have a high albedo. Surfaces that do not reflect much light have a low albedo. Albedo values vary across the globe with latitude. At the poles, the albedo can be greater than 0.7 in some areas. This is a result of the lower solar angle at the poles, but also of the greater presence of fresh snow, ice, and smooth open water. In the tropics (23.5°N to 23.5°S), the albedo is between 0.1 and 0.4. OLR values also vary across the globe with latitude, and are primarily sensitive to near-surface and atmospheric temperatures, air humidity, and the presence of clouds, which are related to the intensity of convective activity as well as the latitude and altitude dependence of the variability. Low OLR values (<200 Wm<sup>−2</sup>) associated with deep atmospheric convection are found over the equatorial land masses, the Amazon basin, and the western equatorial Pacific. These maps highlight the fine detail of the observations (notably in the equatorial Atlantic Ocean). These details disappear when the FOV is increased, which degrades the spatial resolution. Considering the ground track of UVSQ-SAT in February 2021 (with the same spatio-temporal observations), it is possible to reconstruct new albedo and OLR maps from the ERA 5 model. The FOV (related to *σ*) also has a significant effect on the map reconstructions. Figure 10 shows the reconstructed map of the albedo for the ERA 5 model for *σ* = 1° and *σ* = 180°. Figure 11 shows the reconstructed map of the OLR for the ERA 5 model for *σ* = 1° and *σ* = 180°.
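The albedo definition given at the start of this paragraph can be sketched as follows. The function name and the input values are hypothetical. The masking of unlit pixels is an assumption of this sketch, introduced to avoid dividing by zero on the night side.

```python
import numpy as np


def toa_albedo(incident_sw, net_sw):
    """Albedo = (incident SW - net SW) / incident SW; NaN where there is no sunlight."""
    incident_sw = np.asarray(incident_sw, dtype=float)
    net_sw = np.asarray(net_sw, dtype=float)
    albedo = np.full(incident_sw.shape, np.nan)
    lit = incident_sw > 0.0  # night-side pixels have no defined albedo
    albedo[lit] = (incident_sw[lit] - net_sw[lit]) / incident_sw[lit]
    return albedo


# Hypothetical fluxes in W m^-2: a dark surface, a night pixel, a bright surface
albedo_values = toa_albedo([400.0, 0.0, 200.0], [280.0, 0.0, 60.0])
print(albedo_values)
```

When monthly maps are built, the unlit samples can then be ignored with a NaN-aware average such as `np.nanmean`.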

**Figure 10.** (**Top**) Albedo (ERA 5) during the month of February 2021 from the same observations as UVSQ-SAT (*σ* = 1°). (**Bottom**) Albedo (ERA 5) from the same observations as UVSQ-SAT (*σ* = 180°).

These reconstructed maps are slightly different from the ERA 5 monthly averaged data (Figure 9). The selected data (same spatio-temporal resolution as UVSQ-SAT) related to the ground track of the satellite (Figure 7) have an impact on the reconstructed maps (clearly visible for the albedo when increasing *σ*). Low OLR values in the equatorial Atlantic Ocean disappear (Figure 11, bottom) due to the lack of data and the low spatial resolution.

These simulations show the importance of making like-for-like comparisons between model and observations (same spatio-temporal resolution). For this reason, other methods are under investigation to improve the level of detail that can be restored in the albedo and OLR maps. In a second approach, it is planned to use the spherical harmonics method developed by Gristey et al. (2017) [17] to reconstruct the maps of the quantities of interest. The value measured by the satellite *F*<sub>sat</sub> is expressed as follows (Equation (13)):

$$F\_{\rm sat}(\lambda, \theta) = \sum\_{l=0}^{L} \sum\_{m=0}^{l} \left[ C\_{lm} \overline{Y}\_{lm}^{\rm C}(\theta, \lambda) + S\_{lm} \overline{Y}\_{lm}^{\rm S}(\theta, \lambda) \right] + \varepsilon \tag{13}$$

where *θ* and *λ* are the colatitude and longitude of the point considered, *Y<sup>C</sup><sub>lm</sub>* and *Y<sup>S</sup><sub>lm</sub>* are the spatially integrated spherical harmonics, *C<sub>lm</sub>* and *S<sub>lm</sub>* are the coefficients that we seek to determine, and *ε* is the error on the measurements. The coefficients *C<sub>lm</sub>* and *S<sub>lm</sub>* are then estimated by the least squares method.
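The least squares estimation of the coefficients in Equation (13) can be illustrated with a deliberately small example. This sketch truncates the expansion at *L* = 1 and uses unnormalized point-value harmonics instead of the spatially integrated harmonics of Gristey et al. (2017), so it only shows the structure of the fit. The function names and the synthetic field are hypothetical.

```python
import numpy as np


def design_matrix(colat, lon):
    """Real spherical harmonics up to degree L = 1 (unnormalized point values).

    Columns: Y00, Y10, Y11^C, Y11^S. The S_l0 terms vanish because sin(0) = 0.
    """
    return np.column_stack([
        np.ones_like(colat),
        np.cos(colat),
        np.sin(colat) * np.cos(lon),
        np.sin(colat) * np.sin(lon),
    ])


def fit_coefficients(colat, lon, f_sat):
    """Least squares estimate of (C00, C10, C11, S11) from sampled measurements."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(colat, lon), f_sat, rcond=None)
    return coeffs


# Synthetic, noise-free samples along a hypothetical set of observation points
rng = np.random.default_rng(0)
colat = rng.uniform(0.0, np.pi, 500)
lon = rng.uniform(-np.pi, np.pi, 500)
true_coeffs = np.array([240.0, 30.0, -10.0, 5.0])
f_sat = design_matrix(colat, lon) @ true_coeffs

estimated = fit_coefficients(colat, lon, f_sat)
print(estimated)
```

Once the coefficients are estimated, the field can be evaluated on any grid by multiplying the design matrix of that grid by the coefficient vector, which gives the reconstructed map.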

#### **4. UVSQ-SAT First Observations**

Since February 2021, UVSQ-SAT has been measuring, among other things, solar radiation reflected from the Earth and OLR. UVSQ-SAT does not have an active attitude control system. Therefore, it is equipped with several photodiodes and ERS sensors on all sides to perform scientific measurements. Six photodiodes measure both the TSI and the OSR. The albedo is obtained from these measurements by considering the TSI as known and measured with other space-based instruments. Twelve ERS sensors measure the TSI, the OSR and the OLR with a wide FOV over a complete hemisphere. The absorbed radiation generates temperature gradients in the ERS thermopile, which delivers a voltage proportional to the absorbed flux (Equation (1)). Six of the twelve ERS thermopiles are coated with carbon nanotubes in order to absorb all incident radiation from short to long wavelengths (absorption very close to 1). The BRDF of the carbon nanotube coating was measured with a goniophotometer. The optical coating shows a weak angular dependence. During eclipses, these detectors measure only the OLR. Outside eclipses, the carbon nanotube-based thermopiles measure the TSI, the OSR, and the OLR. The six other ERS thermopiles are coated with optical solar reflectors to measure mainly the OLR.

#### *4.1. UVSQ-SAT Time Series*

From the measurements made by the different sensors of UVSQ-SAT and using the instrumental equations described in Sections 3.1.1 and 3.1.2, we obtain the evolution of the albedo and the OLR. Figure 12a shows the evolution of the signal measured by the six photodiodes over time (example given for about two observation orbits). The eclipse periods are clearly identifiable by the absence of signal. Figure 12b shows a good agreement between the albedo calculated from the UVSQ-SAT observations and the albedo modeled from the ERA 5 data using different FOVs (referred to as *σ* values). In Figure 12c, the satellite temperatures evolve in accordance with the UVSQ-SAT thermal qualification (Figure 4). These temperatures and the signal measured by the ERS sensors make it possible to retrieve the OLR (Figure 12d), which is directly observed during eclipse periods. Once again, the measured OLR fluxes show a good agreement with those modeled from the ERA 5 data.

**Figure 12.** (**a**) Evolution of the measured flux by the six photodiodes during a given period. (**b**) Evolution of the albedo modeled from the ERA 5 data and for UVSQ-SAT observations. (**c**) Evolution of the satellite temperatures. (**d**) Evolution of the OLR.

The accumulation of OLR measurement points over time for each location on the Earth will provide averages to characterize the evolution of the OLR over time. It is important to have a new record of the OLR to provide new insights and to continue the historical records (Figure 6). A long-term continuous record is possible using a new disruptive space observational system based on the UVSQ-SAT pathfinder.

#### *4.2. UVSQ-SAT Maps Reconstruction*

The reconstruction of the UVSQ-SAT maps (albedo and OLR) is based on the UVSQ-SAT time series. Currently, the reconstruction of each map is based on a Gaussian function whose parameters are defined by the satellite altitude and its FOV. In addition, a method based on a deep learning (DL) algorithm has been developed to obtain the satellite attitude in order to better determine the Earth's albedo and OLR [18]. The satellite attitude can also be reconstructed from the measurements of an inertial measurement unit (IMU). UVSQ-SAT is equipped with a gyrometer to measure angular velocity, an accelerometer to measure gravity and linear acceleration, and a magnetometer to measure the direction and the intensity of the Earth's magnetic field. The objective is to combine these inertial and magnetic data in order to reconstruct the attitude of the satellite. Figure 13 shows the intensity of the Earth's magnetic field measured by UVSQ-SAT with its different sensors.

**Figure 13.** Earth's magnetic field intensity measured by UVSQ-SAT instruments. The calibrations are being validated to consolidate the absolute measurement bias.

Regarding the albedo and OLR, preliminary results have been obtained by applying the methods described in this study and in Meftah et al. (2020) [1]. Figure 14 shows a reconstructed map of the Earth's albedo (February 2021) obtained from UVSQ-SAT instruments. The albedo due to the atmosphere comes from Rayleigh backscattering in the short wavelength range and from clouds, which contribute about two thirds of the total albedo. The effect of the TSI at the TOA is stronger in the tropics. The high albedo at polar latitudes, due to the presence of snow and ice on the ground, has an important effect, and it is of interest to follow its evolution over time. There are also local differences due to the albedo of cloudy regions (intertropical convergence zone) or of the ground (Sahara). The albedo of the ground depends strongly on its nature: it can be very high for fresh snow (about 0.75) and low for vegetation (below 0.2). The albedo of the ocean remains very low, of the order of 0.1, and depends on the distribution of waves.

**Figure 14.** Earth's albedo at TOA measured by UVSQ-SAT instruments. The dark blue areas correspond to regions without data.

The OLR at the TOA (Figure 15) also has a latitude dependence. The high latitudes are cooler and emit less infrared radiation. The humid tropical regions are clearly visible. In tropical and equatorial regions, the weak radiation emerging at the TOA at long wavelengths is due to the presence of clouds at high altitudes. These clouds absorb the radiation emitted by the Earth's surface and, as they are cold, they emit only weak outgoing radiation to space.

**Figure 15.** OLR at the TOA measured by UVSQ-SAT instruments.

The UVSQ-SAT maps (Figures 14 and 15) look identical to the maps obtained with the ERA 5 model using the same observation time series (Figures 10 and 11). The UVSQ-SAT sensors used have wide fields of view, and the reconstructed maps are therefore associated with wide fields of view.

#### **5. Perspectives—Toward a Satellites Constellation for Climate Studies**

The objective of the UVSQ-SAT mission is to validate the principle of several miniaturized technologies at the highest level of technological maturity (TRL 9) in order to be able to accurately measure the EEI using a constellation (homogeneous and/or heterogeneous) of small satellites. It will be necessary to combine simple instruments (wide FOV and wide wavelength band) with more complex instruments (narrow FOV and using on-board calibration sources to follow the degradation of the instruments in orbit). The Radiometer Assessment using Vertically Aligned Nanotubes (RAVAN) 3U CubeSat mission [19] develops a similar approach to demonstrate technologies for the measurement of Earth's radiation budget in order to implement a satellite constellation.

Gristey et al. (2017) [17] showed how a constellation of satellites with broadband radiometers can provide both spatial and temporal information to accurately measure the ERB. Wiscombe and Chiu (2013) [20] proposed a constellation of space-based instruments, capable of overcoming sampling limitations and providing global and diurnal measurements of the Earth's outgoing radiation with accuracy. With advances in technology and miniaturization of instruments, it is possible to envision a constellation of small satellites capable of measuring outgoing and variable terrestrial radiation from the Earth. The "UVSQ-SAT" concept is fully integrated in this approach and shows a way.

With a single low-orbiting satellite (UVSQ-SAT), it is possible to obtain "satisfactory" mappings of the OSR and the OLR at the TOA only every month (with a loss of short time-scale information). Meftah et al. (2020) [1] show that a constellation of at least 50 satellites is necessary to have at least a daily mapping. Indeed, a constellation of at least 50 satellites is a prerequisite to capture spatio-temporal variations (every three hours and ideally a resolution of a few km). This would be fundamental data to validate climate models. These data are used to validate models and the magnitude of energy imbalances at the TOA. The model mesh size is about 100–200 km. The Baseline Surface Radiation Network (BSRN) and Atmospheric Radiation Measurements Program network (ARM) can be used on local and regional scale models. They are mainly used in the calibration and use of spatial data. These data are of primary importance to help validate spatial data and numerical model simulations.

UVSQ-SAT and its future constellation (Figure 16) address the same scientific theme as Far-infrared Outgoing Radiation Understanding and Monitoring (FORUM), but from a completely different angle. FORUM will be put into orbit in 2026. It will measure the Earth's infrared emission spectrum at high spectral resolution. Libera (named for the daughter of agricultural goddess Ceres in ancient Roman mythology) represents another important mission to measure the Earth's outgoing radiative energy. The Libera instrument will fly on National Oceanic and Atmospheric Administration (NOAA) operational Joint Polar Satellite System-3 (JPSS-3) satellite, which is scheduled to launch by December 2027.

**Figure 16.** UVSQ-SAT and its potential future satellites constellation named Terra-F.

#### **6. Conclusions**

The UVSQ-SAT mission was launched in January 2021. Since February 2021, the satellite has been observing the Earth and the Sun. The first results were obtained using the methods described by Meftah et al. (2020) [1]. These results are very encouraging and clearly show that small satellites represent a fast method to answer key scientific questions. In the era of NewSpace, i.e., an era benefiting from technological advances in miniaturization and reasonable costs to access space, this mission clearly shows that a constellation of small satellites dedicated to the measurement of the EEI is feasible.

At a time when the climate emergency is obvious and the signals of global warming of the Earth are largely above the level of natural variability, one of the greatest challenges remains the difficulty of obtaining adequate funding for research and especially for missions dedicated to the study of climate (EEI at the TOA, spectral solar irradiance, vertical temperature profile, ozone, aerosols, etc.). These missions must rely on the implementation of satellite constellations to ensure sufficient spatial and temporal coverage. Satellite constellations dedicated to the study of climate and science are now a matter of course [21]. The UVSQ-SAT prototype opens the way to a potential Terra-F constellation composed of 50 small satellites with the necessary performance to meet the scientific need.

Other solutions to measure the EEI are possible, such as two "traditional" satellites placed at the L1 and L2 Lagrange points to observe the illuminated and non-illuminated portions of our planet. The main goal is to measure the EEI at the TOA and its variability in time in order to detect any long-term global trend with a stability of at least ±0.2 W m<sup>−2</sup> per decade, and also to obtain spatial resolutions of a few tens of km and temporal resolutions of three hours with an accuracy of a few W m<sup>−2</sup>. These measurements must be performed over wide wavelength bands.

Today, it is becoming increasingly clear that small satellites are not only a fast way to answer key scientific questions but also contribute qualitatively to scientific research.

**Author Contributions:** Conceptualization, M.M. (Mustapha Meftah), T.B., C.D., A.H., P.K., A.F., S.B., S.A., E.B., L.D., J.-L.E., P.G. (Patrick Galopeau), P.G. (Pierre Gilbert), L.L., A.S., A.-J.V., P.L., X.A., O.H.F.d., A.M., J.-P.C., F.B., M.M. (Michel Mahé) and C.M.; Data curation, M.M. (Mustapha Meftah), T.B., C.D., A.H., A.F. and N.C.; Formal analysis, M.M. (Mustapha Meftah), T.B., C.D., A.H. and A.F.; Funding acquisition, M.M. (Mustapha Meftah); Investigation, M.M. (Mustapha Meftah), E.B., P.L. and N.C.; Methodology, M.M. (Mustapha Meftah), T.B., C.D., A.H., P.K., A.F., S.B., S.A., L.D., J.-L.E., P.G. (Patrick Galopeau), P.G. (Pierre Gilbert), L.L., A.S., A.-J.V. and X.A.; Project administration, M.M. (Mustapha Meftah); Resources, M.M. (Mustapha Meftah); Software, M.M. (Mustapha Meftah), T.B., C.D., A.H., P.K. and A.F.; Supervision, M.M. (Mustapha Meftah); Validation, M.M. (Mustapha Meftah), T.B. and C.D.; Visualization, M.M. (Mustapha Meftah), T.B., A.H. and A.F.; Writing—original draft, M.M. (Mustapha Meftah), T.B., C.D., A.H., P.K., A.F., S.B., S.A., E.B., L.D., J.-L.E., P.G. (Patrick Galopeau), P.G. (Pierre Gilbert), L.L., A.S., A.-J.V., P.L., N.C., X.A., O.H.F.d., A.M., J.-P.C., F.B., M.M. (Michel Mahé) and C.M.; Writing—review and editing, M.M. (Mustapha Meftah), T.B., C.D. and A.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was mainly funded by Centre National de la Recherche Scientifique (CNRS, France), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ, France), and Agence Nationale de la Recherche (ANR, France). This work was supported by the Programme National Soleil Terre (PNST, France) of CNRS/INSU (France) co-funded by Centre National d'Études Spatiales (CNES, France) and Commissariat à l'énergie atomique (CEA, France).

**Acknowledgments:** The UVSQ-SAT team acknowledges support from CNRS, UVSQ, the Sorbonne Université (SU, France), the Université Paris-Saclay (France), the Office National d'Etudes et de Recherches Aérospatiales (ONERA, France), CNES, the Laboratory for Atmospheric and Space Physics (LASP, USA), the National Central University (NCU, Taiwan), and the Nanyang Technological University (NTU, Singapore). The authors acknowledge the multiple and very fruitful discussions about the Earth's energy imbalance which took place during the conception and writing-up of the unsuccessful European Space Agency EArth enerGy imbalance ExploreR (EAGER) proposal. The authors thankfully acknowledge the Ministère de l'Enseignement supérieur, de la Recherche et de l'Innovation (MESRI, France) for their support. Finally, the authors would like to thank the referees for their valuable comments which helped to improve the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Communication* **Current Status and Variation since 1964 of the Glaciers around the Ebi Lake Basin in the Warming Climate**

**Lin Wang <sup>1</sup>, Changbin Bai <sup>1</sup> and Jing Ming <sup>2,</sup>\***


**\*** Correspondence: petermingjing@hotmail.com

**Abstract:** This work analyzed the spatial and temporal variations of the glaciers in the Ebi Lake basin from 1964 to 2019, based on the 1st and 2nd Chinese Glacier Inventories (CGI) and remote sensing data. It is believed to be the first long-term comprehensive remote sensing investigation of glacier change in this area, and it also diagnosed the response of the glaciers to the warming climate by analyzing digital elevation models and meteorological records. The results show that there are 988 glaciers in total in the basin, with a total area of 560 km<sup>2</sup> and an average area of 0.57 km<sup>2</sup> per glacier. The glaciers oriented north and northeast cover 205 km<sup>2</sup> (327 glaciers) and 180 km<sup>2</sup> (265 glaciers), respectively. The glaciers are categorized into eight classes by area: less than 0.1, 0.1–0.5, 0.5–1.0, 1.0–2.0, 2.0–5.0, 5.0–10.0, 10.0–20.0, and greater than 20.0 km<sup>2</sup>. Glaciers between 0.1 km<sup>2</sup> and 10.0 km<sup>2</sup> account for 509 km<sup>2</sup> or 91% of the total area, and, in particular, glaciers smaller than 0.5 km<sup>2</sup> account for 74% of the total number. The glacial area is concentrated at 3500–4000 m in altitude (512 km<sup>2</sup> or 91.4% of the total). The number of glaciers in the basin decreased by 116 (10.5%), and their area decreased by 263.29 km<sup>2</sup> (−4.79 km<sup>2</sup> a<sup>−1</sup>) or 32% (−0.58% a<sup>−1</sup>) from 1964 to 2019. Glaciers with an area between 2.0 km<sup>2</sup> and 5.0 km<sup>2</sup> showed the largest area decrease (−82.60 km<sup>2</sup> or −40.67% of their area, at −1.50 km<sup>2</sup> a<sup>−1</sup> or −0.74% a<sup>−1</sup>), and the largest decrease in number (126 glaciers) occurred among glaciers between 0.1 km<sup>2</sup> and 0.5 km<sup>2</sup>. The total ice storage in the basin decreased by 97.84–153.22 km<sup>3</sup> from 1964 to 2019, equivalent to 88.06–137.90 km<sup>3</sup> of water (taking 0.9 g cm<sup>−3</sup> as the ice mass density). The temperature in the basin increased at +0.37 °C decade<sup>−1</sup> and precipitation at +13.61 mm decade<sup>−1</sup> during the last fifty-five years. This analysis shows that the increase in precipitation in the basin was not sufficient to compensate for the mass loss of glaciers caused by the warming during the same period. The increase in temperature was the dominant factor, outweighing the precipitation mass supply, in controlling the retreat of the glaciers in the entire basin.

**Keywords:** Ebi Lake basin; decelerated-melting glaciers; climate change; climate response

#### **1. Introduction**

The cryosphere consists of snow, ice, and permafrost on and below the Earth's land and ocean surfaces, and it is one of the major components of the climate system [1]. The accelerated shrinkage of the cryosphere in the context of global warming and the subsequent impacts on the sustainability of the Anthroposphere have attracted unprecedented attention and raised deep concern, because the High-Asian mountain glaciers are crucial for buffering against drought [2] and protecting life from drought [3]. These mountain glaciers are an essential part of the global cryosphere and have generally shown varying degrees of continuous retreat in recent decades [4–6]. As projected by the Intergovernmental Panel on Climate Change (IPCC), even under the mild emission scenario (RCP4.5), the Asian glaciers would shrink by ~50% by the end of the century [7].

**Citation:** Wang, L.; Bai, C.; Ming, J. Current Status and Variation since 1964 of the Glaciers around the Ebi Lake Basin in the Warming Climate. *Remote Sens.* **2021**, *13*, 497. https://doi.org/10.3390/rs13030497

Academic Editor: Xander Wang Received: 29 December 2020 Accepted: 28 January 2021 Published: 30 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

China has the most developed mountain glaciers in the low and middle latitudes [8], and these glaciers are a vital water resource in arid northwest China and High Asia [9]. As global temperatures increase, China's glaciers are generally in negative mass balance and show a retreating and thinning trend [10–12]. The Xinjiang region is an arid and semi-arid area of China rich in mountain glaciers, which are concentrated in two main ranges, the Tianshan and Altai Mountains. The glaciers in the Tianshan Mountains experienced rapid mass loss during the second half of the 20th century (on average, from −24.6 mm w.e. a<sup>−1</sup> in 1957–1970 to −444.6 mm w.e. a<sup>−1</sup> in 1971–2009) [13], while in the Altai, over a quarter of mountain glaciers were projected to shrink under RCP4.5 by 2100 [14].

The Ebi Lake basin is located in the northwestern Xinjiang region, China (Figure 1). The growth of the irrigated area and population in the basin over the last 60 years has increased water consumption, heightened the tension between supply and demand, and caused ecological degradation [15–17]. The lake had shrunk by 50% from 1955 to 2013, as reported by NASA in 2014 [18]. Another study stated that the water area of the Ebi Lake showed a significantly decreasing trend from 2001 to 2016, although precipitation was increasing [19]. Therefore, it is essential to assess regional glacier change in a timely manner in order to monitor glacier water resources and assess their impact on the water supply in the basin. Such research has conceptual and practical significance for water security, especially for industrial and agricultural production and economic development in this ecologically vulnerable area.

**Figure 1.** Study area with the Ebi Lake basin (5Y74) and related river and lake basins, where Kuitun River, Sikeshu River, Jinghe River, Daheyanzi River, Sayram Lake, and Bortala River sub-basins are annotated with their CGI codes of 5Y741, 5Y742, 5Y743, 5Y744, 5Y745, and 5Y746, respectively.

In recent years, numerous studies have used topographic maps and remote sensing data of the Tianshan Mountains, regional watersheds, and typical reference glaciers and revealed that the regional glaciers are retreating and thinning [20–26]. So far, very few studies on the status of glacier change have been carried out in the Ebi Lake Basin. For example, Wang et al. used remote sensing data to partially reveal that around 450 glaciers in the Ebi Lake Basin, Tianshan, underwent significant mass loss from 1964 to 2004 [27]; Zhang et al. estimated the change of the Haxilegen 51 glacier in the basin from 1964 to 2006 and its response to climate [28]. To some extent, these studies explored the changes of an individual glacier or several glaciers in the basin. Still, a holistic, detailed, and up-to-date picture of these glaciers' change is lacking.

These glaciers changed dramatically in number, area, and ice volume from 1964 to 2009. According to the 1st CGI, there were 1104 glaciers with a total glacial area of 823 km<sup>2</sup> and an ice storage of 47.54 km<sup>3</sup> in 1964; according to the 2nd CGI, the number of glaciers had decreased to 1000, the glacier area to 598 km<sup>2</sup>, and the ice storage to 32.45 km<sup>3</sup> by 2009 [8]. However, the updated status of these glaciers and their association with regional climate change during the most recent decade (2009–2019) are little known, although this knowledge is crucial for policy-makers to take proper measures to adapt to climate change and mitigate its impact. Therefore, based on the first and second Chinese Glacier Inventories (CGIs) and the most recent glacier vector data released in 2019, this work assesses the glaciers' change in the Ebi Lake Basin over the last 55 years (1964–2019) more systematically and accurately. Furthermore, we discuss the response of glaciers in the basin to the warming climate and suggest measures for the rational and sustainable use of the regional water resource.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The study area, the Ebi Lake Basin (43°38′–45°52′N, 79°53′–85°02′E), is located in the hinterland of Eurasia (Figure 1). The basin lies at the northern foothills of the western Tianshan Mountains, in the southwest of the Junggar Basin. It is surrounded by mountains on three sides (north, west, and south), and to the east lies China's second largest desert, the Gurbantunggut Desert. The basin is in the northern temperate zone and has a markedly continental arid climate with mostly windy weather. The annual average temperature in the basin ranges between 6.6 and 7.8 °C, the annual precipitation between 116.0 and 169.2 mm, and the potential evapotranspiration between 1500 and 2000 mm [29,30]. The Ebi Lake is the largest lake in the basin and, at about 500 km<sup>2</sup>, the largest saltwater lake in Xinjiang. The plains in the basin receive little precipitation and generate little runoff; mountain precipitation and snowmelt water from the peripheral alpine glaciers are the main sources of river runoff. The Ebi Lake Basin (code 5Y74 in the CGIs) consists of six sub-basins: the Kuitun River Basin (5Y741), the Sikeshu River Basin (5Y742), the Jinghe River Basin (5Y743), the Daheyanzi River Basin (5Y744), the Sayram Lake Basin (5Y745), and the Bortala River Basin (5Y746) (Figure 1). The glaciers around the basin are typically continental glaciers developing along the valleys.

#### *2.2. Data*

#### 2.2.1. Remote Sensing Images

At present, the publicly accessible high spatial resolution remote sensing images mainly include


Here, we use Sentinel-2 MSI remote sensing images to obtain the most accurate interpretation of the glacier boundaries. The images have a revisit time of five days and were acquired from June to September of 2019, the melting season of the glaciers. We use the Google Earth Engine (GEE) platform, a cloud-based geospatial processing platform, to quickly and efficiently filter high-quality, cloud-free Sentinel-2 MSI images (following the work in [31]) and to avoid laborious data collection, storage, organization, and preprocessing.

#### 2.2.2. The 1st and 2nd CGIs Data

The first CGI data, completed in 2002, are mainly derived from aerial-photographed topographic maps of the 1960s, and the watershed of the Ebi Lake basin is covered by 13 sheets of 1:50,000 topographic maps. The second CGI data, released in 2014, integrate remote sensing data, topographic maps, and digital elevation model data. Both CGI datasets are available from the National Tibetan Plateau Third Pole Environment Data Center (http://westdc.westgis.ac.cn/).

#### 2.2.3. Digital Elevation Data

The digital elevation model is derived from the SRTM (Shuttle Radar Topography Mission), conducted jointly by NASA and the National Imagery and Mapping Agency (NIMA), USA. This work uses the revised version 4.1 data with a spatial resolution of 90 m [32]. This version of the SRTM data is provided by the CIAT (International Center for Tropical Agriculture) and uses a new interpolation algorithm. The nominal absolute elevation and planimetric accuracies of the data are ±16 m and ±20 m [33], respectively.

#### 2.2.4. Meteorology

The meteorological data were obtained from the China Meteorological Data Service (CMDS). The service provides a daily meteorological dataset collected from nearly 700 ground baseline stations across China. It can be accessed freely by registered educational and academic users at http://data.cma.cn/data/cdcdetail/dataCode/SURF\_CLI\_CHN\_MUL\_DAY\_V3.0.html. In this work, the monthly temperature and precipitation data were retrieved from the CMDS dataset recorded by the five meteorological stations in the lower Ebi Lake Basin during 1964–2017 (Table 1).

**Table 1.** The locations of the five meteorological stations in the Ebi Lake Basin.


#### *2.3. Methods*

#### 2.3.1. Confining Glacial Boundaries

Automatic interpretation methods applied to remotely sensed images are currently popular for retrieving glacier boundaries because they simplify and expedite the process [34]; however, the presence of snow, shadows, moraines, and water bodies makes it difficult to guarantee their accuracy. Visual interpretation obtains glacier-boundary information from remote sensing images based on existing glaciological knowledge, and it is the most credible method to retrieve glacier boundaries [35]. Considering the relatively small study area and the accuracy required, this study adopts visual interpretation of the latest remote sensing images to manually delineate glacier boundaries. With the ArcGIS version 10.5 software [36] and the first and second CGI cataloguing data, the glacier boundaries around the Ebi Lake Basin were vectorized and corrected by manual visual interpretation of Google Earth images, and the derived glacier boundaries were further amended with expert opinions.

#### 2.3.2. Calculating Glacier Area and Volume

Changes in glacier area can be reflected by the difference in glacier area between the two periods, indicated by the rate of change in the glacier area and the relative rate of change in glacier area, using the following equation,

$$AC = (A_1 - A_0) / (t_1 - t_0),\tag{1}$$

$$\text{and } AAC = (A_1 - A_0) / [A_0 \times (t_1 - t_0)] \times 100\%,\tag{2}$$

where *AC* is the rate of change of glacier area (km<sup>2</sup>/a), *AAC* is the relative rate of change of glacier area (%/a), *A* is glacier area (km<sup>2</sup>), *t* indicates the year, and the subscripts 0 and 1 indicate the starting and ending years for the specific glacial areas, respectively.
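Equations (1) and (2) can be illustrated with a minimal Python sketch. The function name `area_change_rates` is hypothetical, and the 1964 area used below is not quoted directly in the text: it is reconstructed here, as an assumption, by adding the reported net loss (263.29 km<sup>2</sup>) to the reported 2019 area (559.77 km<sup>2</sup>).

```python
def area_change_rates(a0_km2, a1_km2, t0_year, t1_year):
    """Return (AC, AAC) as defined in Equations (1) and (2).

    AC  is the absolute rate of area change in km2 per year.
    AAC is the relative rate of area change in percent per year.
    """
    years = t1_year - t0_year
    if years <= 0:
        raise ValueError("t1_year must be later than t0_year")
    if a0_km2 <= 0:
        raise ValueError("a0_km2 must be positive")
    ac = (a1_km2 - a0_km2) / years
    aac = (a1_km2 - a0_km2) / (a0_km2 * years) * 100.0
    return ac, aac


# Assumption: the 1964 area is reconstructed from two values reported in the text.
AREA_2019_KM2 = 559.77
AREA_1964_KM2 = AREA_2019_KM2 + 263.29

ac, aac = area_change_rates(AREA_1964_KM2, AREA_2019_KM2, 1964, 2019)
print(f"AC = {ac:.2f} km2/a, AAC = {aac:.2f} %/a")
```

With these inputs the sketch gives rates that round to the basin-wide values reported in this work (−4.79 km<sup>2</sup> a<sup>−1</sup> and −0.58% a<sup>−1</sup>).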

The glacier ice volume cannot be determined directly here, because there were no direct measurements of ice thickness for most glaciers. This research therefore used the volume–area empirical relationship by Gärtner-Roer et al. [37]:

$$V = c \times A^{e},\tag{3}$$

where *V* is glacier ice storage (km<sup>3</sup>), *A* is glacier area (km<sup>2</sup>), and *c* and *e* are the empirical coefficients by Gärtner-Roer et al. [37].
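As a rough sketch of how Equation (3) might be applied to an inventory, the Python snippet below evaluates the relationship per glacier and sums the results. The coefficient values and the three glacier areas are placeholders chosen only for illustration; the text does not list the coefficients it used, and the function names are hypothetical.

```python
def glacier_volume_km3(area_km2, c, e):
    """Volume-area scaling V = c * A**e for one glacier (Equation (3))."""
    if area_km2 < 0:
        raise ValueError("area_km2 must be non-negative")
    return c * area_km2 ** e


def total_ice_storage_km3(areas_km2, c, e):
    """Sum the scaled volumes of individual glaciers."""
    return sum(glacier_volume_km3(a, c, e) for a in areas_km2)


# Placeholder coefficients (assumption, not taken from the source).
C_EXAMPLE = 0.04
E_EXAMPLE = 1.35

# Hypothetical glacier areas in km2.
example_areas = [0.3, 0.57, 2.4]

per_glacier_total = total_ice_storage_km3(example_areas, C_EXAMPLE, E_EXAMPLE)
lumped = glacier_volume_km3(sum(example_areas), C_EXAMPLE, E_EXAMPLE)
print(f"summed per glacier: {per_glacier_total:.4f} km3")
print(f"applied to lumped area: {lumped:.4f} km3")
```

Because the exponent is larger than one, applying the relationship to a lumped total area gives a larger volume than summing per-glacier volumes, which is why the sketch scales each glacier separately.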

#### 2.3.3. Assessing the Uncertainty

Evaluating the accuracy of glacier boundaries obtained from remote sensing images is important but difficult. In processing remote sensing images, errors mainly come from glacier boundary extraction uncertainties and from image resolution and alignment errors. Glacier boundary extraction uncertainties can be reduced through field validation and glaciological experience, and the uncertainty arising from the remotely sensed images can be estimated with Equations (4) [38] and (5) [20], which are based on the formula for calculating the uncertainty of glacier length and area variation [39–41].

$$
U_T = \sqrt{\Sigma \lambda^2} + \sqrt{\Sigma \varepsilon^2} \tag{4}
$$

$$\text{and } U_A = 2U_T \sqrt{\Sigma \lambda^2} + \sqrt{\Sigma \varepsilon^2} \tag{5}$$

where *U<sub>T</sub>* is the uncertainty of glacier length, λ is the image resolution, ε is the alignment error between the satellite images and the boundary layers, and *U<sub>A</sub>* is the uncertainty of glacier area. In this study, λ equals 10 m for the Sentinel-2 images, and ε equals 1/4 pixel. The calculated length and area uncertainties in this study are ±12.5 m and ±0.00025 km<sup>2</sup>, respectively.
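The reported uncertainties can be reproduced with a short Python sketch that follows Equations (4) and (5) as printed. It assumes a single image, so each sum contains one term, and it takes ε as a quarter of the 10 m pixel (2.5 m); the function names are hypothetical.

```python
import math


def length_uncertainty_m(resolutions_m, alignment_errors_m):
    """Equation (4): root-sum-square of resolutions plus that of alignment errors."""
    res_term = math.sqrt(sum(r ** 2 for r in resolutions_m))
    align_term = math.sqrt(sum(e ** 2 for e in alignment_errors_m))
    return res_term + align_term


def area_uncertainty_m2(resolutions_m, alignment_errors_m):
    """Equation (5), evaluated term by term as printed in the text."""
    u_t = length_uncertainty_m(resolutions_m, alignment_errors_m)
    res_term = math.sqrt(sum(r ** 2 for r in resolutions_m))
    align_term = math.sqrt(sum(e ** 2 for e in alignment_errors_m))
    return 2.0 * u_t * res_term + align_term


# Assumption: one Sentinel-2 image, 10 m pixels, alignment error of 1/4 pixel.
PIXEL_M = 10.0
resolutions = [PIXEL_M]
alignment_errors = [PIXEL_M / 4.0]

u_t = length_uncertainty_m(resolutions, alignment_errors)
u_a_m2 = area_uncertainty_m2(resolutions, alignment_errors)
u_a_km2 = u_a_m2 / 1.0e6  # 1 km2 = 1e6 m2
print(f"U_T = {u_t:.1f} m, U_A = {u_a_m2:.1f} m2 = {u_a_km2:.5f} km2")
```

Under these assumptions the sketch returns 12.5 m for the length uncertainty and roughly 0.00025 km<sup>2</sup> for the area uncertainty, in line with the values stated above.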

#### **3. Results**

#### *3.1. Updated Status of Glaciers in the Ebi Lake Basin in 2019*

Overall, the number of glaciers in the Ebi Lake Basin investigated in 2019 was 988, with a total area of ~560 km<sup>2</sup> and an average area of 0.57 km<sup>2</sup> per glacier. The glaciers with an area between 1.0 and 5.0 km<sup>2</sup> account for the largest share (40%) of the total glacier area in the study area, followed by those with areas of <0.5 km<sup>2</sup> (21%), 5.0–10.0 km<sup>2</sup> (17%), 0.5–1.0 km<sup>2</sup> (16%), and >10.0 km<sup>2</sup> (6%). The Sikeshu River Basin (5Y742) has the most extensively developed glaciers among the six sub-basins in all size classes, while the glaciers in the Sayram Lake and Daheyanzi River basins are barely visible in Figure 1. The only two glaciers larger than 10 km<sup>2</sup> are both located in the Sikeshu River Basin (Figure 2a). By number, 74% of the glaciers have areas smaller than 0.5 km<sup>2</sup>, and they are mainly located in the Kuitun, Sikeshu, and Bortala River sub-basins. Larger glaciers are far fewer (Figure 2b). The glacial resources in the Ebi Lake basin are concentrated in four sub-basins (5Y741, 5Y742, 5Y743, and 5Y746), which hold 974 glaciers (98.6%) with an area of ~557 km<sup>2</sup> (99.6%).

**Figure 2.** The glacial areas (**a**) and numbers of glaciers (**b**) in each sub-basin categorized into <0.5, 0.5–1.0, 1.0–5.0, 5.0–10.0, and >10.0 km<sup>2</sup>, respectively, as of 2019. KT, SK, JH, DH, BT, and SY denote the Kuitun (5Y741), Sikeshu (5Y742), Jinghe (5Y743), Daheyanzi (5Y744), and Bortala (5Y746) River sub-basins and the Sayram Lake (5Y745) sub-basin, respectively. The percent shares of the glaciers in area and number are shown in the respective pie charts.

In the Ebi Lake Basin, around 400 km<sup>2</sup> of glacier area, accounting for 72% of the total, lies between 3500 and 4000 m in elevation; 11% and 17% of the area lie below 3500 m and between 4000 and 4500 m, respectively; and only 1% of the glacial area lies above 4500 m as of 2019 (Figure 3a). Most glaciers have north (75% in area and 65% in number) or east (15% in area and 22% in number) orientations (Figure 3b). This is consistent with the conditions under which mountain glaciers develop, given that north-facing slopes are exposed to colder air masses and east-facing slopes to richer water vapor sources in the Tianshan Mountains.

**Figure 3.** (**a**) The distribution of glacier areas in elevation, and (**b**) that of glacier areas and numbers in orientation in the Ebi Lake basin investigated in 2019.

#### *3.2. Changes of the Glaciers Relative to the 1st and 2nd CGIs*

Table 2 presents the glaciers in area and number investigated by the first and second CGIs (1964 and 2009, respectively) and by this work up to 2019. The glaciers in the Ebi Lake Basin have changed considerably since the first and second CGIs in both area and number. Almost all measures decreased from the first CGI investigation to this work in 2019, except that the glacier number increased from 281 in the second CGI to 285 in this work. The sole exception probably arose because some glacier branches separated as their area shrank.


**Table 2.** Glaciers in area and number in the six drainage sub-basins as per three inventories.

Note: The red underlined number indicates the increasing count.

In more detail, the number of glaciers in the Ebi Lake Basin decreased from 1104 to 988 (−2 a<sup>−1</sup> or −0.2% a<sup>−1</sup>), and the area decreased by 263 km<sup>2</sup> (−4.79 km<sup>2</sup> a<sup>−1</sup> or −0.58% a<sup>−1</sup>) during the period 1964–2019. The rate of glacier area decrease during 2009–2019 (−3.87 km<sup>2</sup> a<sup>−1</sup>) was slower than that during 1964–2009 (−4.99 km<sup>2</sup> a<sup>−1</sup>) (Figure 4). The shrinking rate of the glaciers in the Ebi Lake Basin thus slowed by 22% or 1.12 km<sup>2</sup> a<sup>−1</sup> in the last decade, compared with the period 1964–2009. Figure 5 shows some typical glacier boundaries in the basin defined by the second CGI and the 2019 investigation, respectively. Compared with the boundaries in 2009, the glacial area showed a general shrinkage in 2019.
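As a quick arithmetic check of the slowdown quoted above, using only the two period rates reported in the text (magnitudes in km<sup>2</sup> a<sup>−1</sup>):

$$\frac{4.99 - 3.87}{4.99} = \frac{1.12}{4.99} \approx 0.22$$

That is, the area loss rate during 2009–2019 was about 22% lower than during 1964–2009.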

**Figure 4.** The area changing rates of the glaciers in the Ebi Lake basin during 1964–2009, 2009–2019, and 1964–2019, respectively.

**Figure 5.** The land cover images with two typical ice bodies of the Ebi Lake Basin, defined by (**a**) the second CGI and (**b**) 2019 investigations (retrieved from the Sentinel-2 satellite with the Google Earth Engine in 2019), where the greenish colors indicate the ice.

#### **4. Discussion**

#### *4.1. Spatial and Volume Variations of Glaciers in the Ebi Lake Basin since the 1st and 2nd CGIs*

To understand the spatial variations of the Ebi Lake Basin's glaciers, we show their rates of change on a geographic map (Figure 6). The area and number of glaciers in each sub-basin of the Ebi Lake basin showed a decreasing trend from 1964 to 2019. The Bortala River Basin (5Y746) had the largest decrease in glacier area (−111.80 km<sup>2</sup>, or −2.03 km<sup>2</sup> a<sup>−1</sup>), followed by the Sikeshu River Basin (5Y742) (−101.90 km<sup>2</sup>, or −1.85 km<sup>2</sup> a<sup>−1</sup>) and the Kuitun River Basin (5Y741) (−94.62 km<sup>2</sup>, or −1.72 km<sup>2</sup> a<sup>−1</sup>); the largest numbers of glaciers disappeared in the Sikeshu River Basin (5Y742) and the Daheyanzi River Basin (5Y744), which lost 27 and 28 glaciers, respectively. In terms of the relative rate of glacier area change, the glaciers in the Daheyanzi River Basin (5Y744) were retreating fastest (−1.81% a<sup>−1</sup>), followed by the Sayram Lake Basin (5Y745) with a retreating rate of −1.23% a<sup>−1</sup>, and the glacier resources in these two sub-basins are on the verge of extinction. During 1964–2009, the vanished glaciers were spread from west to east, while they were concentrated in the west during 2009–2019.

This work used various methods to calculate the ice storages derived from different investigations and their variations in the basin (Table 3). The results show that the glacier ice storage in the basin decreased by 97.84–153.22 km<sup>3</sup> between 1964 and 2019; the water equivalent was 88.06–137.90 km<sup>3</sup>, taking 0.9 g cm<sup>−3</sup> as the ice density. The ice storage decreased by 39.18–41.23%, or −1.78 to −2.79 km<sup>3</sup> a<sup>−1</sup>. The reduction rate of ice storage is greater than that of glacier area, indicating that ice storage is more sensitive to regional warming. The glaciers in the basin are experiencing dramatic retreat and thinning.
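The conversion from ice volume to water equivalent and the mean annual rates can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, a water density of 1.0 g cm<sup>−3</sup> is assumed, and the ice-loss range and the 0.9 g cm<sup>−3</sup> ice density are the values given in the text.

```python
ICE_DENSITY_G_CM3 = 0.9    # value stated in the text
WATER_DENSITY_G_CM3 = 1.0  # assumption


def water_equivalent_km3(ice_volume_km3,
                         ice_density=ICE_DENSITY_G_CM3,
                         water_density=WATER_DENSITY_G_CM3):
    """Convert an ice volume to the volume of water with the same mass."""
    return ice_volume_km3 * ice_density / water_density


def mean_annual_change(total_change, start_year, end_year):
    """Average change per year over the period."""
    return total_change / (end_year - start_year)


# Ice-storage loss range reported for 1964-2019 (km3).
ice_loss_low, ice_loss_high = 97.84, 153.22

we_low = water_equivalent_km3(ice_loss_low)
we_high = water_equivalent_km3(ice_loss_high)
rate_low = mean_annual_change(-ice_loss_low, 1964, 2019)
rate_high = mean_annual_change(-ice_loss_high, 1964, 2019)

print(f"water equivalent: {we_low:.2f} to {we_high:.2f} km3")
print(f"mean rate: {rate_low:.2f} to {rate_high:.2f} km3/a")
```

The results round to the water-equivalent range and annual rates quoted in the paragraph above.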

**Figure 6.** Area changes of glaciers at drainage scale in the Ebi Lake Basin.



#### *4.2. Retreat of Glaciers in the Ebi Lake Basin Compared with the Greater Tianshan Region*

In recent years, high-resolution satellite image data have been widely used in the dynamic monitoring of glaciers, and it has become possible to study glacier changes in large regions. To place the change of the glaciers in the Ebi Lake Basin in the context of the greater Tianshan, this study compares the glacier changes here with those in other typical mountains and watersheds of the Tianshan (Table 4). The trend of glacier change in the Ebi Lake Basin is consistent with the trends in other regions, that is, smaller glaciers have less reduction. In general, the glacial retreat rates varied across the greater Tianshan region. The glaciers in the Ebi Lake basin retreated at a rate of 0.58% a<sup>−1</sup>, making the Ebi Lake Basin one of the fastest retreating areas in the Tianshan region. The glacier retreat rate is largest in the western section of the Tianshan, followed by the middle section, and smallest in the eastern section, and the glacier retreat rate on the northern slope is higher than on the southern slope. Regional climate variation (temperature and precipitation) and the individual glaciers' sizes are the main factors influencing glacier area in the Ebi Lake Basin. The number of glaciers in the region is dominated by those with an area smaller than 0.5 km<sup>2</sup> (730 glaciers, or 73.9% of the total number of glaciers).


**Table 4.** Comparison of the area change of the glaciers in the greater Tianshan range.

#### *4.3. Factors and Their Roles in the Change of Glaciers in the Ebi Lake Basin*

Precipitation and temperature are the main factors affecting glacier development. The interannual changes of these two factors together determine the nature, development, and evolution of glaciers. Temperature determines ablation, and precipitation affects accumulation. Therefore, studying the climate in the watershed is a prerequisite for projecting the glaciers' evolution. Based on the geography of the study area, this work selected five meteorological stations, i.e., Alashankou (51232), Bole (51238), Wenquan (51330), Jinghe (51334), and Wusu (51346), as the climatic reference for the study area (Figure 1).

The yearly average temperature and precipitation, and their anomalies at the five meteorological stations in winter and summer from 1964 to 2017, are shown in Figure 7. The temperature averaged over the five stations shows an increasing trend of 0.37 °C (10a)<sup>−1</sup> (significant at the 0.001 level) from 1964 through 2017, 2.6 times as high as the global mean rate, i.e., 0.14 °C (10a)<sup>−1</sup> from 1951 to 2012 [4]. The temperatures of the watershed in summer and winter both show increasing trends (0.51 °C (10a)<sup>−1</sup> in winter and 0.21 °C (10a)<sup>−1</sup> in summer), and the temperature increase in winter is 2.5 times as fast as that in summer. Similar to the temperature, the precipitation records at all five weather stations show an average increasing trend (13.61 mm (10a)<sup>−1</sup>, significant at the 0.001 level); both summer and winter precipitation in the watershed show increasing trends, greater in summer (4.78 mm (10a)<sup>−1</sup>) than in winter (2.63 mm (10a)<sup>−1</sup>). The increasing temperature and precipitation trends in the Ebi Lake Basin are consistent with the theory of a climate shift from warm-dry to warm-wet in northwest China [55]. The mass loss of global mountain glaciers induced by a one-degree increase in temperature requires a 25–35% increase in precipitation to compensate [56]. In the context of this combined hydrothermal climate change in the Ebi Lake Basin, precipitation has increased, but the rising temperature melts the glaciers, and the increased precipitation does not add enough mass to the glaciers to compensate for their mass loss. With an increasing mass deficit, glacier mass loss accelerates and mass balance lines rise, leading to widespread glacier retreat.
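Decadal trends such as those quoted above are commonly estimated by fitting a straight line to the annual series. The text does not state which fitting method was used, so the Python sketch below simply assumes ordinary least squares; the function name `decadal_trend` is hypothetical, and the input series is synthetic, built only to show that a known trend is recovered. It does not include a significance test.

```python
def decadal_trend(years, values):
    """Least-squares slope of values against years, expressed per decade."""
    n = len(years)
    if n != len(values) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return 10.0 * sxy / sxx  # slope per year times 10 years


# Synthetic annual mean temperatures (not observations): a 7.0 degC baseline
# in 1964 warming at exactly 0.037 degC per year.
years = list(range(1964, 2018))
temps = [7.0 + 0.037 * (year - 1964) for year in years]

trend = decadal_trend(years, temps)
print(f"trend = {trend:.2f} degC per decade")
```

Applied to real station records, the same function would be run on the annual, summer, and winter series separately to obtain the three temperature trends and, analogously, the precipitation trends.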

**Figure 7.** The interannual variations (**a**,**c**) and winter–summer anomalies (**b**,**d**) of temperature and precipitation in the Ebi Lake basin.

#### **5. Conclusions**

The number of glaciers in the Ebi Lake Basin in the 2019 ablation season was 988, with a total area of 559.77 km<sup>2</sup>. The glaciers in the basin are mainly individuals with an area between 0.1 and 10.0 km<sup>2</sup>, which sum to 509.05 km<sup>2</sup>, ~91% of the total glacial area in the basin. There are 730 glaciers smaller than 0.5 km<sup>2</sup>, accounting for 74% of the total number of glaciers in the basin. The glaciers are concentrated between 3500 m and 4000 m in elevation, with a total area of 512 km<sup>2</sup>, ~91% of the total.

During 1964–2019, the number and area of glaciers in the Ebi Lake Basin decreased by 116 (10.5%) and 263 km<sup>2</sup> (32%), respectively, the latter at a rate of −4.79 km<sup>2</sup> a<sup>−1</sup> or −0.58% a<sup>−1</sup>. Glaciers with an area between 2.0 and 5.0 km<sup>2</sup> showed the largest reduction, −82.60 km<sup>2</sup> (~41%), at a rate of −1.5 km<sup>2</sup> a<sup>−1</sup> or −0.74% a<sup>−1</sup>. All glaciers in the basin were in retreat from 1964 through 2019; those below 3200 m had disappeared by 2019, and those between 3500 m and 4000 m dominated the total glacial area. North- and northeast-oriented glaciers had the largest area and number. The glaciers in every sub-basin of the Ebi Lake Basin showed a decreasing trend from 1964 to 2019; notably, in the most recent decade (2009–2019) they showed a slower retreat than that indicated by the first and second CGIs. The Bortala River Basin (5Y746) had the largest decrease in glacier area (111.80 km<sup>2</sup>), at a rate of −2.03 km<sup>2</sup> a<sup>−1</sup>, followed by the Sikeshu River Basin (5Y742) (−101.90 km<sup>2</sup> or −1.85 km<sup>2</sup> a<sup>−1</sup>) and the Quitun River Basin (5Y741) (−94.62 km<sup>2</sup> or −1.72 km<sup>2</sup> a<sup>−1</sup>). The ice storage in the basin decreased by 97.84–153.22 km<sup>3</sup> during the last 55 years. The corresponding water equivalent was 88.06–137.90 km<sup>3</sup>, with a rate of −1.78 to −2.79 km<sup>3</sup> a<sup>−1</sup> or −0.71 to −0.75% a<sup>−1</sup>.
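The annual rates and the water-equivalent range quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not the authors' processing code: it assumes a 55-year span (1964–2019) and a constant ice-to-water conversion factor of 0.9, a value that is inferred here from the ratio of the quoted figures rather than stated in the text. The function and variable names are illustrative only.

```python
YEARS = 2019 - 1964  # 55-year span used in the text


def annual_rate(total_change: float, years: int = YEARS) -> float:
    """Mean annual rate implied by a total change over `years` years."""
    return total_change / years


# Glacier area changes (km^2) quoted in the conclusions.
area_changes_km2 = {
    "whole basin": -263.0,
    "2.0-5.0 km2 size class": -82.60,
    "Bortala (5Y746)": -111.80,
    "Sikeshu (5Y742)": -101.90,
    "Quitun (5Y741)": -94.62,
}
area_rates = {name: annual_rate(change) for name, change in area_changes_km2.items()}

# Ice volume loss range (km^3) and its water equivalent.
ICE_TO_WATER = 0.9  # assumed factor, inferred from the quoted figures
ice_loss_km3 = (97.84, 153.22)
water_equiv_km3 = tuple(v * ICE_TO_WATER for v in ice_loss_km3)
ice_rates = tuple(annual_rate(-v) for v in ice_loss_km3)

for name, rate in area_rates.items():
    print(f"{name}: {rate:.2f} km^2 per year")
print("water equivalent (km^3):", [round(w, 2) for w in water_equiv_km3])
print("ice loss rate (km^3 per year):", [round(r, 2) for r in ice_rates])
```

Under these assumptions the sub-basin rates and the water-equivalent range are reproduced, and the quoted volume rates correspond to the ice volume loss divided by 55 years. Differences in the last digit (for example, in the basin-wide area rate) can arise because the inputs are already rounded.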

The temperature in the basin increased by 0.36 °C (10a)<sup>−1</sup> during 1964–2017, much faster than the global mean, and the annual precipitation also showed an increasing trend of 12.06 mm (10a)<sup>−1</sup>. These trends are consistent with the climate shift from warm-dry to warm-wet in northwest China. Although precipitation has increased, the increase was not sufficient to compensate for the glacier mass loss caused by rising temperatures, leading to widespread glacier retreat in the basin. This work may be the first report on the status of the glaciers in the Ebi Lake Basin, and it reveals their most recent changes since the second China Glacier Investigation. The main findings may help policymakers take measures to mitigate the impact of climate change on the region.

**Author Contributions:** Conceptualization, L.W. and J.M.; methodology, L.W.; software, C.B. and J.M.; validation, L.W., C.B., and J.M.; formal analysis, C.B.; investigation, L.W.; resources, J.M.; data curation, L.W. and C.B.; writing—original draft preparation, L.W. and C.B.; writing—review and editing, J.M.; visualization, C.B.; supervision, J.M.; project administration, L.W.; funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is funded by the National Key Research and Development Program of China (grant number 2020YFF0304400), the Second Tibetan Plateau Scientific Expedition and Research (STEP) program (grant number 2019QZKK0201), the State Key Laboratory of Cryospheric Sciences (grant number SKLCS-ZZ-2020), and the Key Research Program of Frontier Sciences of Chinese Academy of Sciences (grant number QYZDB-SSW-SYS024).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Acknowledgments:** We thank the TPDC for providing the first and second CGI data; NASA, NIMA, and CIAT for providing the version 4.1 SRTM data; and the CMDS for providing the meteorology data. We also thank the two anonymous reviewers for their suggestions during the revision process.

**Conflicts of Interest:** The authors declare no conflict of interest.

