Inundation–Desiccation State Prediction for Salt Pans in the Western Pannonian Basin Using Remote Sensing, Groundwater, and Meteorological Data

Schauer, Henri; Schlaffer, Stefan; Bueechi, Emanuel; Dorigo, Wouter

doi:10.3390/rs15194659

Open AccessArticle

Inundation–Desiccation State Prediction for Salt Pans in the Western Pannonian Basin Using Remote Sensing, Groundwater, and Meteorological Data

¹

Climate and Environmental Remote Sensing Research Unit, Department of Geodesy and Geoinformation, Technische Universität Wien (TU Wien), Wiedner Hauptstraße 8-10, 1040 Vienna, Austria

²

Department for Climate Impact Research, GeoSphere Austria, Hohe Warte 38, 1190 Vienna, Austria

^*

Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(19), 4659; https://doi.org/10.3390/rs15194659

Submission received: 15 August 2023 / Revised: 15 September 2023 / Accepted: 18 September 2023 / Published: 22 September 2023

Download

Browse Figures

Versions Notes

Abstract

:

Salt pans are unique wetland ecosystems. In the Austrian Seewinkel region, salt pans are in an increasingly vulnerable state due to groundwater drainage and heightened climatic pressures. It is crucial to model how seasonal and long-term hydrological and climatological variations affect the salt pan dynamics in Seewinkel, yet a comprehensive understanding of the driving processes is lacking. The goal of this study is to develop random forest machine learning models driven by hydrological and meteorological data that allow us to predict in early spring (March) of each year the inundation state in the subsequent summer and fall. We utilize Earth observation data from Landsat 5 (L5), 8 (L8), and 9 (L9) to derive the time series of the inundation state for 34 salt pans for the period 1984–2022. Furthermore, we demonstrate that the groundwater level observed in March is the strongest predictor of the salt pan inundation state in summer and fall. Utilizing local groundwater data yields a Matthews correlation coefficient of 0.59. Models using globally available meteorological data, either instead of or in addition to groundwater data, provide comparable results. This allows the global transfer of the approach to comparable ecosystems where no in situ data are available.

Keywords:

salt pan; soda pan; wetlands; drought; Lake Neusiedl—Seewinkel National Park; water extent; Earth observation; Landsat; inundation state; random forest

1. Introduction

Salt pans are a special type of terrestrial wetlands, which are formed in relation to arid climates, topographic depressions, and salt-rich groundwater [1,2]. They can be defined as “(...) arid zone basins (...), subject to ephemeral surface water inundation of variable periodicity and extent” [2]. Saline lakes, such as salt pans, are of vital importance for biodiversity and water management [2,3]; however, at the global scale, their number is declining mainly due to direct human intervention in their hydrology and climate change [4,5]. Although global data on salt pans are missing [6], many case studies suggest a global trend toward salt pan degradation and decline [2,5,7,8,9,10]. These trends also apply to the salt pans in Seewinkel in eastern Austria [11,12], where key regional ecosystem functions are under threat. The lives of, among others, halophytes [13], amphibians, reptiles [14,15], and birds [16,17] depend on these wetlands. Halophytes, such as communities of Puccinellio-Salicornietea, require a high groundwater level facilitating capillary rise to ensure their water supply [13]. Birds, such as the kentish plover (Charadrius alexandrinus), use high water levels in spring (for hatching [18]) and summer [17], as do amphibians and reptiles [14,15]. In Central Europe, such ecosystems can only be found in the Pannonian Basin [19] due to the unique tectonic conditions in the region [20]. In recent years, processes such as eutrophication, paludification, siltation, overgrowth with vegetation, fragmentation, long-term drying, and in consequence, habitat loss, have accelerated [21,22]. These are largely connected to excessive groundwater drainage for land use change [21,23]. The potential impact of climate change on the salt pans in Seewinkel is not yet fully understood [23], although small, geographically isolated wetlands reportedly react rather quickly to meteorological forcing [24].

The salt pans in the Seewinkel region follow the salt pan cycle [1], in which the dry basin is the default, central, and recurrent moment, which is alternated by its opposite state: the varying presence of water [1,25]. In summer, high evaporation rates in combination with an interruption of groundwater supply tend to outweigh precipitation [14] leading to salt pan desiccation. Especially during late winter and early spring [21], low evaporation rates allow precipitation combined with an increased contribution of groundwater to fill the basins. Wind contributes to important ecosystem processes as it influences evaporation rates and drives the mixing of water when the salt pan is inundated [21]. It also strengthens capillary rise and blows out inorganic sediments from the salt pan basins during periods of desiccation [20]. Salt pans in poor hydrological conditions lose additional water by surface water infiltrating into deeper soil layers [14]. Thus, monitoring and predicting both the long-term and short-term variability of surface water occurrence in the Seewinkel salt pans is needed to assess ecosystem change and their resilience.

Wetland hydroperiod [26], a key characteristic and ecological indicator of intermittent wetlands, such as salt pans [19,27], can be characterized by means of water height (WH), water extent (WE), or water volume (WV) [2,28,29]. WH derived from in situ water level gauges offers the most reliable and temporally frequent source of information. However, water gauges provide merely vertical, locally tied measurements and are costly to install and maintain. Especially for salt pans, the water level gauge must be positioned at the deepest point due to increased drying towards the edges. In many regions of the world, long-term, automatic in situ measurements are not widely available, as is also the case for the Seewinkel region (web address: https://wasser.bgld.gv.at/hydrographie/die-seen and https://ehyd.gv.at/ (accessed on 14 August 2023)).

WE is especially suited for studying the inundation state of the salt pans due to their shallow topography so that small changes in water volume cause substantial changes in water surface area. WE can be reliably retrieved from Earth observation (EO) satellite data that provide global, freely available information of high spatial and sufficient temporal resolution [30,31]. Multispectral imagery has proven to be suitable for studying salt pans because of the high reflectivity of exposed salt surfaces and the absorption of infrared radiation by water surfaces [6]. Although commonly suffering from cloud cover, multispectral observations are less affected by wind than radar systems [32,33,34], which have been widely used to monitor wetlands [30,31,35,36]. Most studies use data from the moderate resolution imaging spectroradiometer (MODIS) [37] or a series of the Landsat missions, which together cover an observation period of nearly 50 years [38]. Examples of global satellite-derived WE products are the global surface water (GSW) product [39] and the dynamic surface water extent product [40]. Additionally, continental-scale products exist [41]. These large-scale products include data on salt pans (Figure 1); however, they are often inaccurate for small-size ecosystems, such as those encountered in Seewinkel [39]. Local case studies using remote sensing to derive WE and inundation states are numerous [42,43,44,45,46,47,48,49,50].

Several modeling approaches exist that link various drivers to salt pan hydrological properties. Traditional hydrological modeling [52,53,54] applied to wetlands depends on a certain quantity and quality of data for parameterization, which often hampers their spatial transferability to regions where these data are not available [55,56,57,58,59]. Stochastic modeling has long been recognized as a vital alternative to process-based modeling [60]. A number of studies have focused on summarizing past, present, and perspective machine learning (ML) methodology in estimating different hydrological variables [61,62,63,64,65,66,67,68,69,70,71], such as groundwater [65,66,67]. Conventional ML models, such as the random forest (RF) approach [72,73], have been the most commonly used concepts for modeling hydrological variables [70,71]. They encompass the advantage of being well explored, non-parametric, often robust estimators that, in many cases, offer extensive algorithmic options for model interpretation [70]. Hybrid models [74,75,76,77] and deep learning models [78] have only recently gained attention in hydrological research [77,78]. Hybrid models are meant to incorporate the advantages of traditional hydrological modeling and ML modeling [74,75,76,77]. Although fit for complex pattern recognition tasks, deep learning models typically require large amounts of data for model training [78,79] and are harder to interpret [80].

Advances in ML [70,71] have boosted the relevance of stochastic modeling for predicting lake WH [81,82,83,84,85]. Past research in modeling wetland inundation dynamics using ML methods is often restricted to using in situ measurements for identifying the presence of water [86,87]. Greater data availability provided by EO [39,88,89] has contributed to studies utilizing WE for modeling wetlands, although, to our knowledge, not for salt pans and in different temporal resolutions. The monthly inundation state of freshwater playas in the Great Plains of North America has been modeled on a large spatial scale using a monthly global water extent product based on Landsat [39] and climate and land cover data [90]. Inundation patterns in the Darling River Floodplain, Australia, were modeled using Landsat data and topography, meteorological, and hydrological data [91]. Satellite-derived WE (lake surface area) of Lake Gregory, Australia’s salt lake, has been modeled using ML with precipitation and temperature as predictors [92]. Quantification of wetland permanence of four water body permanence classes in the Prairie Pothole Region, although not carried out for salt pans, was executed by ref. [93], who, in addition to climate and land cover, introduced features based on topography to ML modeling. Various ML models were used for the mentioned studies. Ref. [90] used a long short-term memory neural network, ref. [91] used RF, ref. [92] applied a generalized group method of data handling, and ref. [93] used extreme gradient boosting techniques.

The goal of this study is to combine long-term EO and ML to build seasonal prediction models of the inundation state of salt pans in the Seewinkel region of Austria. Modeling of the salt pan inundation state contributes to a better understanding of the effect of climate variability and groundwater exploitation on salt pan health. First, we derive and evaluate a reference data set of the yearly inundation states from 1984 to 2022 of 34 salt pans based on the Landsat satellite archive [38,94]. Second, we use this new long-term inundation state data set as a target variable in combination with meteorological data from global reanalysis and in situ groundwater data to develop ML models to predict in early spring (end of March) if the salt pans fall dry during July, August, September, or October (JASO) of the same year. We develop models with and without the use of local in situ groundwater gauge data, which are precise but sparse at the global level. Therefore, we tested whether meteorological predictors in combination with EO-based estimates of WE are fit for use in regions with limited availability of in situ data. In total, we create three core models: a meteorology-based model, a groundwater-based model, and one combining meteorological and groundwater data. To identify the most important drivers, we apply concepts from explainable artificial intelligence [95].

2. Materials

2.1. Study Area

The study area is located in the Lake Neusiedl—Seewinkel National Park (Figure 1). Salt pans in Seewinkel are steppe wetlands [14] and mostly relicts of the Würm glaciation [96,97]. Their size ranges between 0.03 km

^{2}

and 1.5 km

^{2}

(based on the data collected in the scope of this study). The salt pans are highly heterogeneous in their hydrological and ecological condition [14,98]. Due to various causes, the number of salt pans in Seewinkel has decreased from 139 in the year 1855 to about 59 ecologically intact specimens in 2012 [14]. Historically, the landscape has been subject to strong economic utilization, most notably through the intensification and expansion of agricultural practices since the 1960s [96] and because of tourism [14]. A process referred to as “drying from beneath” [20] has been identified as the main driver behind “dying salt pans” [14], although diverging theories exist [97,99]. Due to human-induced groundwater drainage, the capillary rise is disrupted starting at a depth to groundwater of approximately 70 cm [21]. This finally results in the failure of the water retention capacity of the salt pans and desalinization [21]. Some salt pans tend to be naturally filled with water all year (type: ‘naturally perennial’), and some artificially hold water over the whole year (type: ‘artificially perennial’). Others tend to desiccate over continuous periods of time (type: ‘periodically filled’) [14].

The climate in Seewinkel can be classified as Dfb-climate (warm-summer humid continental; based on monthly ERA5-Land data from 1984 to 2022; Köppen-Geiger climate classification [100]). The total annual potential evaporation (

E_{p o t}

) is 602 mm (1978–2010; [101]), the total annual precipitation (P) is 556 mm (1971–2020; [102]), and the annual mean 2 m temperature (T) is 11.1 °C (1971–2020; [102]). P is highest from May to September and lowest from October to April (based on monthly ERA5-Land from 1984 to 2022; verified via ref. [103]). While there is research on the impact of climate change on Lake Neusiedl [101,102,104,105], it remains unclear whether climate change currently affects the salt pans in Seewinkel [23]. Most scenarios (Representative Concentration Pathway (RCP) 4.5 and RCP 8.5) for the Austrian state of Burgenland predict a significant increase in temperature for all seasons, whereas a slight increase in mean annual precipitation is expected together with a significant increase (+33%) in winter precipitation in the distant future (RCP 8.5) [106]. Based on climate scenarios, it is estimated that the return period of moderate and extreme droughts will decrease in lowland Austria over the course of the 21st century [107], suggesting greater pressure on vulnerable wetland ecosystems in the future.

2.2. Multispectral Satellite Imagery

The salt pan inundation state was mapped using multispectral data from the Landsat 5, 8, and 9 satellites acquired during the time period from 1984 to 2022 [94,108].

Landsat 5 operated between 1984 and 2013. Its thematic mapper (TM) provided data in six bands in the visible, near-infrared (NIR), and shortwave infrared (SWIR) portions of the electromagnetic spectrum. Landsat 8 was launched in 2013, and Landsat 9 was launched in 2021. Their operational land imager (OLI) acquires data in nine bands, of which six bands are used here to be consistent with the information from the TM. Landsat data have a 30 m spatial resolution with a revisit time of approximately 16 days [38]. We accessed the Level 2, Collection 2, surface reflectance data for Landsat 5 TM (DOI: 10.5066/P9IAXOVV) and Landsat 8–9 OLI (DOI: 10.5066/P9OGBGM6) using Google Earth Engine (GEE) [109]. Level 2 data also include masks for clouds and cloud shadows, which were applied to the data. The WE retrieval was limited to scenes acquired between April and October of each year, as drying commonly occurs during these months and to limit the potential influence of ice cover on classification accuracy. Furthermore, scenes that contain more than 30% cloud cover according to the metadata were excluded from the classification. This resulted in a total of 286 images acquired between May 1984 and October 2022 that were used as the basis for the wetland extent classification. In 2012, there is a gap in the Landsat time series, as the Landsat 5 data for the region were only available until the end of 2011, and Landsat 8 was launched in 2013 [108].

In addition to the Landsat data, the Sentinel-2 data with similar spectral characteristics and a slightly higher spatial resolution from 10 m to 20 m was used for validation of the retrieved WEs. Seven Sentinel-2 scenes acquired within one day before or after the respective Landsat scene were used for this purpose. In addition, a cloud-free scene acquired on 31 May 2018 was used for manually delineating salt pan polygons.

2.3. Predictor Data

2.3.1. Predictor Selection

The selected features relate to the main drivers of salt pan variability, i.e., groundwater, precipitation, temperature, and evaporation (Table 1). They were narrowed down from a larger set of potential drivers based on the literature on wetlands and salt pan modeling in general, as well as the salt pans in Seewinkel, in particular [1]. For example, topography, a feature proposed by ref. [93], was excluded as a predictor since the variation between salt pan basins is likely to be minimal (based on a local digital elevation model (DEM) (web address: https://geodaten.bgld.gv.at/de/downloads/hoehenmodelle-orthofotos.html (accessed on 14 August 2023))). Information on anthropogenic drivers in sufficient resolution, e.g., well extraction amounts or channel discharge, was not available [110].

WE in spring is expected to have a major impact on inundation state estimates in summer [21]. However, we did not include WE as a predictor for the following reasons: First, its inclusion potentially results in overshadowing other predictors. Although possibly improving model performance, this would limit model interpretability and ecosystem understanding. Second, natural spring WE is the result of the underlying processes steered by groundwater, precipitation, temperature, and evaporation. Hence, WE information is implicitly included in the nine predictors used in this study.

To account for interannual variability in hydrological and meteorological conditions, we applied various integration periods to the selected predictors. However, the overall focus is on integration over 12 months as this period covers the entire time since the last prediction was made. Other integration periods span 6 months, whereas the SGI is a continuous variable, as indicated in Table 1. To detect trends in the features, we applied the Mann-Kendall test [111] to the nine predictors. We abstained from feature selection methods [112] as we wanted to obtain information on feature importance and partial dependencies on all introduced predictors. Importantly, our sample size is expected to be large enough with respect to the maximum number of features (nine features for the combined model), hence decreasing the likelihood of overfitting [113].

Table 1. The nine predictors used for modeling inundation state are divided into meteorology and hydrology. Furthermore, the relation to the salt pan cycle is explained, and additional information is provided.

Field	Variable [Unit]	Derived Predictor [Unit] (Abbrev.)	Integ. Period	Relation to Salt Pan Cycle	Data Source
Hydrology	Groundwater [m.a.s.l.]	Anomalies [unitless] (GW Anom.)	12 m.	Groundwater is of key importance for salt pan water abundance in Seewinkel [21] Short-term and especially long-term groundwater depletion leads to salt pan degradation [114]	Austrian eHyd portal
		SGI [unitless] (SGI)	Cont.	The Standardized Groundwater Index can serve as a robust estimation of groundwater drought [115,116] Groundwater drought in March influences the salt pan water extent in spring and therefore the inundation state in summer
		Level ratio [unitless] (GW level ratio)	Oct./March (6 m.)	Fall-winter groundwater level ratio is closely connected to regional precipitation during that time [117] The level ratio stands in relation to salt pan water extent in spring [14]
Meteorology	Temperature [°C]	Anomalies [unitless] (T Anom.)	12 m.	Higher temperature increases water temperature [21,118,119,120] Higher summer temperature increases evaporation and therefore the number of drying events [14] Higher temperatures in winter decrease spring water extent [20]	ERA5-Land [121], DOI: 10.24381/cds.68d2bb30
	Temperature [°C]	Numb. of days above 25 °C [days]	12 m.	The number of days above 25 °C is connected to heatwaves and extensive evaporation [122]
	Evaporation [mm]	Anomalies [unitless] (Epot Anom.)	12 m.	Evaporation leads to salt pan concentration and desiccation [1,123]
	Precipitation [mm]	Anomalies [unitless] (P Anom.)	12 m.	Precipitation leads to salt pan filling [1] Precipitation leads to eluviation of the saliferous horizon [20] Precipitation as observed over a 12-month period is related to hydrological drought [124]
		SPI 6 [unitless]	6 m.	Standardized Precipitation Index 6 is connected to medium-term precipitation patterns and agricultural drought [125,126]
		SPI 24 [unitless]	24 m.	Standardized Precipitation Index 24 is connected to long-term precipitation patterns and hydrological/socioeconomic drought [124,125,126,127]

2.3.2. Groundwater Level

Seewinkel is equipped with 84 groundwater gauges located in relative proximity to the salt pans. Of these, only six provide continuous observations throughout the time span covered by the Landsat observations and could thus be used for our study (i.e., stations 306043, 319418, 316174, 305755, 305813, and 319426). The point-based data are provided natively as monthly means through eHyd (web address: https://ehyd.gv.at/ (accessed on 14 August 2023)). First, we derived the mean monthly groundwater level of the six stations. We then calculated the groundwater anomalies to exclude long-term climatology. This was completed by subtracting the mean seasonal component from the original time series [128,129]:

A_{t} = D_{t} - C_{t},

(1)

where

A_{t}

is the anomaly at time t,

D_{t}

is the monthly averaged values at time t, and

C_{t}

is the long-term seasonal climatology.

C_{t}

was calculated by averaging groundwater levels per month for the entire reference period from 1984 to 2022. Finally, we calculated the 12-month average of the time series between April (previous year) and March (current year) to derive presummer season groundwater anomalies for each year. All of these steps were executed with the Python packages numpy [130] and pandas [131]. We decided not to apply further detrending to the anomalies to inform the model about long-term environmental changes (e.g., introduced by human management).

Other predictors that were derived are the Standardized Groundwater Index (SGI) in March of each year and the October/March groundwater level ratio [115]. The SGI is the only predictor with a continuous accumulation period and is based on a non-parametric normal scores transform of the groundwater level data for each calendar month [115]. It represents information on the groundwater level, not as an average over 12 months, but as derived in March. The SGI was also calculated based on the monthly groundwater values averaged across all six stations. The calculation was executed through the Python package pastas [132]. The groundwater level ratio was calculated as the ratio of the groundwater level in March (the time of prediction and typically the time of year of the groundwater level maximum [21]) divided by the level in October of the previous year (approx. the lowest level [21]). It serves as a proxy for groundwater recharge during winter.

2.3.3. ERA5-Land Meteorology

The ERA5-Land reanalysis [121] is aimed at land applications. It has the following main characteristics: the data from ERA5-Land offers a high spatial (ca. 9 km × 9 km) and temporal resolution, global availability, and temporal coverage dating back to the 1950s, thus, overlapping with Landsat retrieval periods [133,134,135,136]. Three variables were used: the total precipitation P, the potential evaporation

E_{p o t}

, and the 2 m temperature T. The

E_{p o t}

in ERA5-Land is higher than the Pannonian average of 600–800 mm per year [137] as the

E_{p o t}

is often overestimated because of representing open water evaporation [121]. We used the ERA5-Land monthly averaged data from 1950 to the present (DOI: 10.24381/cds.68d2bb30) for the calculation of nearly all meteorological predictors. Only the number of days in a year with a maximum temperature above 25 °C was derived from hourly 2 m temperature data based on the ERA5-Land hourly data (DOI: 10.24381/cds.e2161bac).

First, the seven pixels covering the study area were combined by spatial averaging for each variable. Preprocessing of the ERA5-Land variables was performed in accordance with Equation (1) and the Python packages numpy [130] and pandas [131]. Subsequently, a 12-month average was performed in the case of T and a 12-month summation in the cases of P and

E_{p o t}

. For the same reason mentioned in Section 2.3.2, detrending was not performed for these three features either. In addition, we used the ERA5-Land monthly P to compute the drought indicator standardized precipitation index (SPI) over 6 months and 24 months (SPI; [125,138]) using the R package SPEI [124]. The SPI represents the precipitation conditions of a predefined time period in relation to the respective normal values. It builds on the calculation of a normal distribution as in reference [138]. Furthermore, the number of days in a year with a maximum temperature above 25 °C was derived.

3. Methods

3.1. Salt Pan Mapping

3.1.1. Derivation of Water Extent Time Series

The WE mapping was based on the unique spectral properties of water, vegetation, and salt pans. While water absorbs most of the incoming radiation in the near-infrared spectrum [139], dry salt pans reflect more radiation than inundated areas throughout the visible and near-infrared wavelengths [6]. Visible, NIR, and SWIR bands were used as input data for the classification. Additionally, the normalized difference vegetation index (NDVI), the normalized difference water index (NDWI) [140], and the modified normalized difference water index (MNDWI) [141] were computed as complementing the spectral bands with vegetation indices has been shown to have a positive impact on water classification [39]. To classify inundated and non-inundated pixels, RF models [72] were trained using data sampled from the permanently inundated nearby Lake Neusiedl and permanently non-inundated areas from various land cover classes that are common in the region, such as agriculture and grassland. For each month that was selected for monitoring due to the reasons detailed in Section 2.2 (April to October), a separate model was trained to take the seasonality in reflectance values into account.

As the locations of the salt pans are known a priori, we used them as the baseline for the extent monitoring. The baseline extent of the salt pans was defined by manual delineation of 34 inundated and dry salt pans and water bodies visible in a cloud-free image acquired by Sentinel-2 on 31 May 2018. The rationale behind choosing a year late in the study period for the baseline inventory of wetlands was the focus of this study to model the inundation/drying behavior of currently existing water bodies. A year early in the study period likely would have resulted in a higher number of wetlands for monitoring; however, many of them would not have shown water during a large part of the study period. The salt pans were either covered by water or by salt crusts, which made them easy to distinguish visually from their vegetated surroundings. The remaining salt pans could not be clearly distinguished visually, so it was assumed that reliable monitoring would not be feasible based on the Landsat data. The baseline salt pan data set was then used to derive a time series of WE for each salt pan from the classified Landsat data. For each time step, connected water pixels were first clustered, and, for each salt pan contained in the baseline data set, the clusters intersecting its baseline extent were used for determining its water area. In the case that the baseline extent of a salt pan was covered by the cloud/cloud-shadow mask ≥ 10%, the WE of the respective salt pan was set to a missing value. The classification process resulted in 34 WE time series, one for each salt pan, thus covering more than half of all functioning salt pans in the region [14]. Each time series comprises between 199 (Oberer Stinkersee) and 245 (Albersee) valid data points derived from the 286 Landsat images. The number of valid data points per salt pan varies due to the influence of the cloud/cloud-shadow masking.

The retrieved WEs were validated for seven dates using reference sample points that were manually labeled as ’water’ and ’non-water’ samples using the Sentinel-2 data acquired within one day before or after the respective Landsat scene. As wetlands make up a relatively small portion of the study area, a stratified approach for the sampling was selected, i.e., the same number of water and non-water samples was drawn. For this purpose, each of the Sentinel-2 scenes was preliminarily classified using a threshold of NDWI

> 0

, and then a morphological opening (a kernel size of

3 \times 3

pixels) was applied to the resulting binary raster to remove isolated pixels. This preliminary classification was then used to sample 50 reference points for each class, i.e., a number of 100 reference pixels was available for each of the seven validation dates. These samples were then assigned a preliminary label according to the NDWI-derived class, which was then manually corrected if a visual inspection of the original Sentinel-2 imagery showed discrepancies. For each validation date, a confusion matrix was computed between the reference points and our Landsat-based WE classification. The overall classification accuracy (OA), true positive rate (TPR), and kappa coefficient are reported as performance metrics for the WE classification. Accuracies were computed for each validation date because the conditions, particularly in cloud cover, can change significantly between different image acquisitions, which is reflected in the obtained accuracy measures.

3.1.2. Derivation of JASO Inundation State

To derive the model target variable, an inundation state, that is ‘desiccated’ (0) or ‘inundated’ (1), was assigned to each year. This was completed based on the WE time series in JASO described above for nearly every year (1984–2022) and for each of the 34 salt pans individually. Due to the WE data gaps caused by cloud cover during the year 2002 (only acquisitions on 13 June 2002 and on 20 June 2002), this year could not be used for inundation state modeling. The only acquisition for the year 1999 in JASO, on 25 September 1999, did not result in WE data for Kiesgrube and St. Martins Therme 2 due to cloud masking. We decided to manually insert two states ’inundated’ into the time series for the two salt pans after a visual inspection of the image, as otherwise the entire year would have been discarded. The year 2012 is missing in the inundation state time series due to the reasons described in Section 3.1.1.

In the case of a desiccation event, that is, a WE of zero, in any of the four JASO months for each year, the year was tagged as ‘desiccated’. Correspondingly, in the event that no desiccation occurred, meaning a non-zero WE was present during the entire JASO period, the year was tagged as ‘inundated’. Hence, a yearly and binary target space was formed. This resulted in a total of 1258 data points (37 years times 34 salt pans).

We decided to display the inundation state for each salt pan and year to increase the understanding of the model target. This was completed with the Python package matplotlib [142], as well as with all other data visualizations. Additionally, the month of the first desiccation event per year between April and October was visualized based on the original WE time series for spotting inundation events outside the JASO period. The first desiccation event per year is, furthermore, of key ecological importance [14,21].

3.2. Inundation State Prediction

3.2.1. Exploratory Data Analysis: Separability and Correlation Analysis

As a first step, we analyzed the feature space to unveil underlying distributions, feature relations, possible non-linearities in modeling, data complexity, and class separability [143]. This was completed by visual inspection of histograms per predictor and scatter plots between all predictor pairs in combination with class-based coloring using a seaborn pair plot [144]. The complexity analysis was performed for two exemplary salt pans: Lange Lacke, which is one of the largest salt pans, and Unterer Stinkersee, which is known for its close connection to groundwater [14]. We abstained from a complete analysis for all salt pans via the maximum Fisher’s discriminant ratio or other complexity metrics [145] due to compactness and the ability of the RF to detect multi-dimensional patterns.

Furthermore, we performed a correlation analysis to gain an understanding of the temporal agreement between the predictive features. For this purpose, both Pearson’s correlation coefficient r and Spearman’s rank correlation coefficient

ρ

were calculated between the nine features with the scipy stats Python package [146].

3.2.2. Random Forests

RF models [72] are commonly used in classification and regression problems in remote sensing [147,148] and hydrology [69,73]. Because of the assumed strong relationship between the targets (i.e., the inundation state of different salt pans), we used the scikit-learn multi-output option (web address: https://scikit-learn.org/stable/modules/tree.html#tree-multioutput (accessed on 14 August 2023)), which is expected to improve generalization accuracy by estimating the inundation state of multiple salt pans in a single model [149,150]. This assumption is based on detailed information on the hydrology of the individual salt pans [14,20]. We applied hyperparameter optimization (also referred to as hyperparameter tuning) to reduce overfitting [151]. All concepts applied here are based on algorithms of the respective scikit-learn packages [152].

3.2.3. Model Setup

The inundation state in summer/fall serves as the target variable in our models. Using data until the end of March of each year, we predict whether a salt pan dries out (’desiccated’) or remains ’inundated’ during JASO of the same year. This binary classification scheme has already been used as the basis for modeling WE in a number of studies [87,90,91]. The simplicity of the inundation state in summer/fall, meaning its low temporal resolution, its low number of classes, and its low degree of mathematical abstraction (Section 3.1.2), in combination with our predictor setup and the RF algorithm, leads to relatively good model performance and model interpretability. The inundation state in summer/fall comprises a number of advantages: it is relatively robust considering the inhomogeneous number of acquisitions per year; it closes the lack of preceding research by introducing a classification task with a low number of classes; it addresses important hydrological and ecological issues, as many plants and animals in Seewinkel rely on water abundance within the salt pans during summer/fall [14,17,18]; it is of interest to decision-makers that can use our models to enable efficient water management by, e.g., steering artificial inundation; and it further helps us to identify the most useful predictors quantifying refilling during winter/spring months, as it is aimed at the core characteristic of salt pans [1], namely the yearly inundation state in JASO. The main limitations of the target chosen in this study are its coarse temporal resolution and the omittance of spring inundation/desiccation events that are of high ecological importance [14,17].

Although the prediction is only made once per year, lead times vary between three and six months as the drying events accounted for by our model can happen in any month from July to October. Despite desiccation sometimes also occurring before July, we aimed for a setup that enables forecasting. This is, on the one hand, more challenging because we include longer lead times, but, on the other hand, more valuable for policymakers and stakeholders.

To gain a thorough understanding of the predictors of salt pan desiccation while ensuring global model transferability, we developed four RF models. The model GROUNDWATER only uses in situ groundwater information, the model METEOROLOGY uses only meteorological data, and the COMBINED model uses both groundwater and meteorological predictors (Table 1; sometimes referred to as main models). In addition, we developed the RANDOM model, based on a single predictor randomly sampled from a uniform distribution to create a baseline for testing model performance by involving chance.

For each of the four models, we followed the same training, validation, and test splitting [153,154]. We use the definition of ref. [155], who defines the validation set via the use of hyperparameter tuning and the test set via the use of a final and independent evaluation of the model. The model splits are presented in Figure 2. We followed an approximate overall 70% (training), 10% (validation), and 20% (testing) split. For hyperparameter tuning (described in Section 3.2.5), we applied a six-fold stratified cross-validation (CV) within the 80% model validation set that recurrently applied the approximate 70%/10% split. This roughly translates into an 85%/15% split relative to the entire model validation set (Figure 2).

3.2.4. Model Testing

Inside model testing (∼20% of the entire data set), we made use of a leave-one-out cross-validation (LOOCV) [156,157]. This is meant to improve the prediction skill as always all other folds, except for the current single test fold, were used for model training as part of a recurrent 97%/3% split with respect to the entire data set (Figure 2). Hence, seven LOOCV runs were executed to test, each time, one year (at this point the independent year) of the overall seven independent test years. This independent test set (Test 7 in Figure 2) comprises the years 1985, 1991, 1997, 2004, 2010, 2017, and 2022. The years were chosen at roughly similar intervals across the entire temporal domain to ensure a balanced distribution of folds over time while maintaining the class balance of the entire data set.

While independent model testing uses ∼20% of the data (seven test years), we aimed to understand the year-wise model performance from 1984 to 2022. Hence, we reintroduced the folds used for model validation to model testing as part of a dependent test set. Here, the training data (∼80%, 30 additional years) were reused from model validation in the scope of the overall LOOCV (Test 30 in Figure 2). The year-wise information on metrics additionally benefits model understanding, as the target data exhibits skewed fold-wise class distributions that affect fold-wise model performance. Although the reintroduction enables the calculation of feature importance and partial dependencies for all folds, we abstained from such analysis due to small year-wise sample sizes. The reintroduction of the years used for model validation inherits the predisposition that these folds have been part of the hyperparameter tuning. Therefore, the metrics for the folds of the dependent test set might be inflated as the hyperparameters were adjusted to exactly this set (model validation set).

As we did not want to exclude fold estimation for the initial training set, we used LOOCV instead of nested CV [158], which is more commonly used for time series. The metrics for some salt pans may also be heightened due to a large target autocorrelation as we apply an LOOCV scheme [158], although the effect is expected to be minimal due to the random characteristics (sample bagging) of the RF algorithm [72].

The estimations within the LOOCV scheme were carried out in the scope of 30 runs per main model. These repetitions aim to address the stochastic variability (randomness within the sampling and selection of features) of the RF approach [72]. The results of the LOOCV were divided into training and test scores for the four model setups and two test sets (independent test set and dependent test set) and averaged over all runs. We calculated the standard deviation (SD) for model variability quantification but did not include it in the results, as it was generally low (max. SD of the Matthews correlation coefficient (MCC; Section 3.2.6) of 0.02 for the dependent test set (model GROUNDWATER); max. SD of the MCC of 0.03 for the independent test set (model RANDOM)).

For more insight, we calculated the average fold-wise LOOCV performance over all 30 model runs for the three main models (not the RANDOM model; Section 4.2.2). This provides information on the development of the test metrics over time. Additionally, random model realizations for the three main models were chosen to gain an in-depth understanding of model test set behavior (Section 4.2.3). This allows for studying the salt pan-wise inundation prediction, meaning the outcome as true positive (TP), false positive (FP), true negative (TN), and false negative (FN), for every year. The results for the RANDOM model were based on 200 repetitions to account for the random nature.

3.2.5. Model Validation

For model validation, hyperparameter tuning was applied using GridSearchCV [159]. The final hyperparameters used for the three models can be seen in Table 2. We used the hyperparameters from the COMBINED model for the calculation of the RANDOM model, as they offer the most robust solution against overfitting due to the larger number of features compared to the GROUNDWATER model and METEOROLOGY model.

Throughout the parameter-tuning process, GridSearchCV was performed with a six-fold stratified CV that uses an approx. 85%/15% split within the model validation set, as described in Section 3.2.4. Hence, for each validation run, five years were introduced as the actual validation set. The hyperparameter tuning was based on the definition of a range of values deemed sensible by the literature [151]. Individual hyperparameters were varied within the predefined range and compared with performance differences of the training and validation sets. This gave a rough indication of and lever against overfitting. Subsequently, a further, more restricted range of hyperparameters was declared that allowed for training–test score differences of a maximum of 25%. Once more applying GridSearchCV, this range of values (Table 2) was tested using the MCC as the indicator. This resulted in the final set of hyperparameters also displayed in Table 2.

The hyperparameters were chosen using a different CV scheme and, hence, different model splits compared to the LOOCV scheme of the main models. The results obtained in the scope of the hyperparameter tuning were slightly worse compared to the test scores. In combination with the removal of the seven test folds, independent model testing is ensured, as hyperparameters adjusted to a certain split are prevented. The hyperparameters for all models (Table 2) are less complex compared to the default scikit-learn ones that are designed to fit a maximum of use cases [160]. Generally, the decision trees have similar complexity between the models. The exact number of trees (40, 100, or 300) is not essential for model performance, as was verified both by varying the number of trees and by the CV results provided by GridSearchCV.

3.2.6. Evaluation Metrics

Global confusion matrices with an MCC, F1-Score, and OA were calculated for the training sets, test sets (Test 7 and Test 30), and validation sets as the main indicators of the classification performance. Since we have a slight class imbalance, but both classes are equally important, we performed macro-averaging to compute the F1-score to put equal weight on both classes [161]. As the F1-Score does not consider TNs, and the overall accuracy is vulnerable to a skewed class distribution [162], the main metric considered in this study is the MCC. It is a robust metric regardless of class imbalances [163]. In the case of the main models, the metrics, in addition to being averaged over the 30 model runs (200 for the RANDOM model), were either averaged over all folds and salt pans based on the LOOCV scheme [156,157] or averaged over all salt pans for a single fold. To gain a salt pan-wise understanding of the model performance, the metrics were averaged across all years for the random model realizations described in Section 3.2.4 for the eight salt pans that exhibit a balanced class distribution. It was not possible to include the results for many of the salt pans due to the model’s tendency to predict a single class in the case of a highly skewed class distribution per salt pan. We applied the formulation of the MCC used in scikit-learn [152]:

MCC = \frac{T N \times T P - F N \times F P}{\sqrt{(T P + F P) (T P + F N) (T N + F P) (T N + F N)}}

(2)

3.2.7. Feature Importance

For model interpretability, we calculated the feature importance by the mean decrease in impurity (MDI, also Gini index) [164] and partial dependencies [165,166]. The results for the MDI were averaged separately for each of the three main models over all CV folds and model runs. Since the predictors used in this study are all time series that exhibit high cardinality, the use of feature importance based on the MDI is justified [72,167,168]. Partial dependency plots (PDP) and individual conditional expectation (ICE) plots [165,166] were calculated for each salt pan individually using the training set for the test fold 1984, as this includes the most recent information. We interpret the PDP or ICE curve as the probability of predicting ’inundated’ or ’desiccated’ given different predictor values, as can be performed for a binary classification [169]. The PDP and ICE plots were based on the COMBINED model’s run, which is also displayed in Section 4.2.3. Salt pans with an especially large partial dependency, spread with regard for the respective predictor, were manually chosen and visualized, in addition to the PDP, by the ICE plots [166]. The spread was calculated by subtracting the minimum partial dependence value from the maximum partial dependence value for each salt pan and predictor combination. For Heidlacke, Hottergrube, and Gsigsee, no partial dependencies could be calculated as only the state ’desiccated’ is present. Both algorithms are based on the respective scikit-learn packages [152].

4. Results

4.1. Salt Pan Mapping

For most validation dates, moderate to high accuracy and kappa values could be achieved (Table 3). The WE tends to be underestimated, and reference samples along the edges of the WE are often not contained in the Landsat-derived WE.

A total of 60% of the complete inundation state data set (754 combinations of years and salt pans) are classified as ‘desiccated’, and 40% (504 events) are classified as ‘inundated’ (Figure 3a). For individual salt pans, the class distribution is highly heterogeneous, with a class imbalance of up to 100% in the case of Hottergrube, Heidlacke, and Gsigsee. For some salt pans, e.g., Lange Lacke, Unterer Stinkersee, or Herrnsee, extensive years of drought are needed to force drying. Unterer Stinkersee (type: ‘naturally perennial’) falls dry less frequently compared to Lange Lacke (type: ‘periodically filled’). Salt pans that regularly fall dry are, e.g., Hochstätten, Fuchslochlacke 1 and 2, Huldenlacke, or Oberer Stinkersee. For Darscholacke, Zicksee, Kiesgrube, and Badesee Apetlon (type: ‘artificially perennial’), an inundated state can be observed for nearly all years. Numerous salt pans fell dry six years in a row from 2016 to 2022, the longest (nearly common) desiccation period since 1984. In many cases, the year-wise distribution is similarly skewed as in recent years. The ‘inundated’ state commonly occurs clustered in time and over multiple pans. For example, the three periods in (and around) the years 1987, 1997, 2010, and 2015 are dominated by the state ‘inundated’, while around the years 1984, 1992, 2003, and 2007, and all years since 2016, the class ‘desiccated’ prevails.

As Figure 3b shows, in some years, (early) desiccation is prevalent; for others, inundation is present throughout the year (assuming the presence of water in winter). Since 2016, desiccation occurs earlier in the year compared to earlier periods. A total of 41 desiccation events outside the JASO period can be found when the ‘inundated’ state is prevalent.

4.2. Inundation State Prediction

4.2.1. Exploratory Data Analysis

We detected no trend over the study period for five features (GW Anom., GW lvl. ratio, SGI, P Anom., and SPI 6) and an increasing trend for the other four features (T Anom., Number of days above 25 ° C, E Anom., and SPI 24). As expected, the features that are connected to an increase in WE, e.g., groundwater and precipitation anomalies, have positive correlations with each other (Figure 4). The same applies to the features that are assumed to be connected to a decrease in WE, i.e., temperature, evaporation, and number of days above 25 ° C. Consequently, correlations between predictors associated with water gain and water loss, respectively, are negative. The relationships between features that build on integration periods other than 12 months (e.g., SPI 6) are less clear.

The histograms in Figure 4 suggest a better separability between the desiccation and inundation states for Lange Lacke (a) compared to Unterer Stinkersee (b). This is indeed confirmed by the inundation state classes in the scatter plots, which show more distinct clusters in the case of Lange Lacke. The plots suggest that groundwater-based features, especially SGI, are the most promising predictors for desiccation forecasting. They also reveal that no two-dimensional predictor combination can lead to perfect class separability on its own and that multiple features should be included in the models.

4.2.2. Average Prediction Skill

On average over all 37 folds provided by the LOOCV and 30 model runs, the three models have moderate skill with an MCC of approximately 0.6 (GROUNDWATER: 0.6, METEOROLOGY: 0.59, COMBINED: 0.6). For the independent test sets (seven test folds), a 0.24 performance increase with respect to the RANDOM model is obtained on average for the three models. Generally, the test metrics as averaged over all folds (described above) and the dependent folds (Table 4) confirm the abilities of GROUNDWATER in modeling the inundation state. The independent test set performance metrics of METEOROLOGY exceed those of GROUNDWATER. The COMBINED model does not show any increase in performance with respect to METEOROLOGY or GROUNDWATER. Averaged over all independent test folds and 30 model runs together, the MCC of METEOROLOGY is 0.09 (0.07) higher than that of COMBINED (GROUNDWATER).

Differences between the confusion matrices, as averaged over 30 model runs, are minimal between the three models (below 1% in regard to the entire sample; Figure 5). All three models struggle more with the correct estimation of state ‘inundated’ compared to ‘desiccated’ (relation of FPs and FNs to the total number of state ‘inundated’ and ‘desiccated’, respectively). GROUNDWATER exhibits more skill in the classification of TNs (a surplus of nine TNs) compared to the METEOROLOGY model. COMBINED manages to achieve a surplus of one TP and nine TNs compared to METEOROLOGY. As discussed in Section 4.2.1, large salt pan inundation state variability results in extensive year-wise heterogeneity in the class distribution and model performance. As indicated by Figure 5, the performance between the folds is heterogeneous. The models perform better, and with less variability in terms of the MCC, for the independent test folds (GW 7, METEO 7, and COM 7) compared to the 30 dependent test folds (GW 30, METEO 30, and COM 30). Outliers excluded, the tested folds exhibit a pronounced dynamic over the years. Beginning in 2004, the estimates tend to improve. Here, the MCC does not fall below 0.5, neither for the seven test folds nor for the additional 30 test folds. In the years 2006 and 2007, the GROUNDWATER model performed much better for the 30 dependent test folds compared to the other two models. A very different picture is observed pre-2004: all three models struggle to achieve scores from above 0.5 to 0.6. Especially 1992 presents itself as challenging for the RF models.

All GW 7, METEO 7, and COM 7 (Figure 5) perform better compared to GW 30, METEO 30, and COM 30. With this in mind, the difference in metrics between the two sets, as displayed in Table 4 can be integrated more clearly.

4.2.3. Detailed Analysis of Single LOOCV Model Runs

All three models successfully predict the diverse interannual dynamics of the inundation state per salt pan (Figure 6). TPs and TNs are numerous, especially in the years with average conditions. Overall, an accurate estimation in times of pronounced dry and especially wet periods appears more challenging. For instance, the models lack adequate prediction for the dry periods in 1986, 1992, 2003, 2007, 2011, and 2016, as well as for the wet periods in 1987 and around 1996 (1997), 2010, and 2015.

For 18 salt pans with a dominant state, the events are correctly predicted in favor of the majority class (a co-occurrence of light blue and dark red and of light red and dark blue, respectively, in Figure 6). In these cases, the estimates between the three main models (i.e., GROUNDWATER, METEOROLOGY, and COMBINED) do not differ, e.g., Badesee Apetlon. Differences between the models exist for 16 salt pans but are only marginal for Standlacke, Ochsenbrunnlacke, Wörtenlacken 1, Sechsmahdlacke, Fuchslochlacke 1, Herrnsee, Birnbaumlacke, and Albersee. The eight cases with varying results between the models are Zicklacke, Katschitzlacke, Fuchslochlacke 3, Oberer Stinkersee, Mittlerer Stinkersee, Wörtenlacken 2, Neubruchlacke, and Lange Lacke. These salt pans feature a more balanced underlying class distribution with varying class succession. Here, the models demonstrate their flexibility, i.e., the ability to estimate different states for subsequent years. When there is a shift from periods of ’inundated’ to ’desiccated’, or vice-versa, the models do not correctly predict wet and dry states for many salt pans. For instance, correct estimates for wet conditions in 1994 and for dry conditions in 2016 emerge one year later after a dry period (around 1993) or wet period (around 2015) (Figure 6). The strength of model GROUNDWATER lies in its ability to more precisely identify wet, and, in particular, dry episodes compared to model METEOROLOGY (the identification can be salt pan-specific). Examples here include Lange Lacke and Wörtenlacken 2 around 1991 and Neubruchlacke, Katschitzlacke, and Zicklacke after 2004. The METEOROLOGY model struggles to achieve this, often failing to correctly estimate for a number of years in a row (e.g., Neubruchlacke around 2006 and Zicklacke around 2014). However, for many salt pans, the model performed better in 1986 and 1988. In a number of additional instances, the METEOROLOGY model manages to outperform the other two. This translates into the correct estimation of single anomalous events before and after extremely wet or dry conditions. Examples include Ochsenbrunnlacke, Mittlerer Stinkersee, Fuchslochlacke 1, Fuchslochlacke 3, and Katschitzlacke in 2016 (as outcome TP) or Lange Lacke and Wörtenlacken 2 in 1994 (as outcome TN). The COMBINED model performs similarly to the GROUNDWATER model, though it surpasses its performance in a few cases (e.g., Mittlerer Stinkersee in 1999 and 2000). In these cases, the model is able to integrate meteorological and groundwater-based information in a productive manner.

The MCCs for the eight salt pans that exhibit a more balanced class distribution differ between the three models and between the salt pans (Table 5). On average, the GROUNDWATER model performs best with a moderate MCC of 0.44 and the METEOROLOGY model performs the worst (0.29). The COMBINED model attains an MCC of 0.37. For Fuchslochlacke 3 and Mittlerer Stinkersee, the METEOROLOGY model performs better than the GROUNDWATER and COMBINED models, whereas the opposite is true for the other six salt pans. Compared to the overall model performance of 0.6, the models perform worse for these eight salt pans with an average MCC of 0.37.

4.2.4. Feature Importance

In the GROUNDWATER model, GW anomalies and the SGI have comparable importance (around 0.35, respectively) while this is lower for the GW level ratio (Figure 7). In the METEOROLOGY model, SPI 6 has little importance while the predictors derived from temperature have highest importance. SPI 24, although substantially correlated with GW anomalies and the SGI (

r = 0.64, r = 0.53

, respectively; Section 4.2.1), which are important in the GROUNDWATER model, has a lower importance than the temperature predictors. These observations are in contrast with COMBINED, where the SGI has the highest importance by far. With a mean importance of 0.47, it is much larger than the other predictors, most of which lie at a maximum of ∼0.05. Only groundwater and temperature anomalies have an importance

> \sim

0.05.

4.2.5. Partial Dependency

The potential significance of groundwater-based predictors to estimate the inundation state of Lange Lacke (Section 4.2.1) translates into a pronounced evolution of partial dependency against the SGI (Figure 8, steel-gray dashed) in the COMBINED model (also for Wörtenlacken 2). SPI 24 has a comparable impact on Unterer Stinkersee (light blue dash-dotted), although to a lesser extent. It is striking that the other predictors do not, or only slightly, affect the prediction skills in both cases. Statistically, when involving the entire population of 34 salt pans, the variability in the inundation states of nine salt pans can be explained mainly by meteorological predictors, whereas, in 22 cases, groundwater-based features perform best (Table A1, Appendix A). In some cases, there is only little variability of the partial dependency when plotted against any of the predictors. This is also indicated by Table A1, though it is not further regarded here due to compactness. We find that, except for the

E_{p o t}

and SPI 6, every predictor exhibits a strong interaction with the dynamics of at least one salt pan (Table A1; Figure 8 in dotted red). Groundwater-based features exhibit more pronounced curves that span a wider partial dependency range compared to the other predictors.

In general, the partial dependencies agree with the underlying physical process. For instance, higher temperature anomalies (Kirchsee) or the number of days above 25 °C (Katschitzlacke) contribute to a prediction probability in favor of the ‘desiccated’ inundation state. This rule is not true for all salt pans, although this dependence commonly applies. The PDP reveals all important probability thresholds for inundation state prediction. For example, at

S G I = 0

, the class attribution probability switches from the ‘desiccated’ state to the ‘inundated’ state for Lange Lacke. At around 60 days above 25 °C in the previous 12 months, the estimates change from ‘inundated’ to ‘desiccated’ for Katschitzlacke. Similar inferences for nearly all salt pans can be made. These are summarized in Appendix A.

5. Discussion

5.1. Assumptions

The modeling framework has been built on several assumptions. First, the targets (i.e., inundation dynamics for different salt pans) are correlated. Second, the accumulation periods (particularly, the dominant 12-month period) hold explanatory power in regard to the inundation state in summer. Third, the salt pans are in an environmental condition that is good enough to allow them to react to the natural drivers applied in this study. In other words, the salt pans need to be at least sufficiently well connected to the salt pan cycle to respond to the groundwater-based and meteorological predictors. Fourth, we assume that the climatology always leads to the prediction of a drying from spring toward summer. Hence, the models cannot predict the ’inundated’ state based on dry conditions in spring. This is unless the salt pan-wise class distribution is skewed towards the ’inundated’ state, causing the models to always predict the ’inundated’ state (outcome TN). In other words, if the situation during the lead time deviates much from the climatology, the models will not capture many of the effects on the salt pan inundation state. This is a disadvantage in years when drivers strongly change and may lead to misclassifications.

In total, desiccation in spring (in April to June; here used as a proxy for dry conditions) in combination with the ’inundated’ state in JASO occurred for 41 events (Figure 3a) and resulted in 20 TNs and 21 FPs for the GROUNDWATER model. The year 2008 accumulated a notable number of fourteen FP outcomes. This circumstance also reveals that not all annual desiccation events were captured. Taking into account the desiccation events from the beginning of April to the end of October would result in a class distribution of 63%/ 37%.

5.2. Predictors

The results of the EDA (only for Lange Lacke; Figure 4a), feature importance (Figure 7), and calculation of partial dependencies (Figure 8) support the assumption of a close connection between salt pans and groundwater [14,21]. We suspect that the rather long time steps of the model and the respective long-term predictor setup support the forcing of the slow-reacting features evolving around groundwater as a key predictor. It is to be determined whether the high importance of groundwater is actually due to the contribution of groundwater to salt pan water status directly or more generally to water abundance, i.e., drought conditions, in the region. The outstandingly high feature importance of SGI is presumptively connected to its continuous nature, rather than relying on artificially thresholded integration periods [115].

Still, the METEOROLOGY model achieved similar scores compared to the GROUNDWATER model. Both were able to capture many of the interannual differences in the inundation state. We managed to find meteorological predictors that are of importance for spring salt pan water abundance, which is essential for the salt pan inundation state in JASO. The importance of meteorological predictors could stem from their temporal autocorrelation from one year to the next [170,171]. For the meteorological predictors, a single (or more) not included month(s) from the previous year could make a change in spring water abundance.

We find that, other than the continuous SGI, time periods of 12 months or more work best for predicting salt pan inundation state. Such predictors exhibited large feature importance within their model setups. It is up to further research to determine whether the 12-monthly anomaly mean is the most appropriate integration period. This argument is particularly relevant as shifting climate patterns influence groundwater recharge. The SPI 6 and the GW level ratio relate to a similar time period (6 months). This period does not seem to be particularly relevant, as both predictors were comparatively insignificant in all three models (also when disregarding the SGI). Temperature-based predictors were most important in the METEOROLOGY model despite exhibiting low correlations with the SGI.

For some combinations of salt pans and predictors, the PDP (Figure 8) exhibited sigmoid curves with a wide spread. The PDPs for Lange Lacke and Unterer Stinkersee showed a clear progression against the SGI and SPI 24, respectively. The SPI 24 is closely related to groundwater drought as suggested in the literature [125] and by the correlation analysis (

r_{S P I 24, S G I} = 0.53

and

ρ_{S P I 24, S G I} = 0.55

; Section 4.2.1). Therefore, our results can confirm the observation made by ref. [14] that both salt pans are closely connected to groundwater. This is even more true for Wörtenlacken 2, which is reported to have an atypically strong connection to groundwater, even greater than that of Lange Lacke [14]. Similar inferences can be made for all other salt pans (Appendix A). Additionally, the probability thresholds for the SGI were similar in the case of Lange Lacke and Wörtenlacken 2 (Figure 8). However, such an analysis is prone to misinterpretations as partial dependency behavior can vary depending on the model setup and the underlying training data. For example, the hydrology of Katschitzlacke is reportedly similar to Lange Lacke [14], whereas our results indicate a closer connection to the predictor number of days above 25 °C.

A drawback associated with the input data is their low spatial resolution. The argument is particularly valid for P anomalies, since groundwater level,

E_{p o t}

anomalies, and T anomalies vary less in space and time [172,173]. Here, future models could improve the (spatial) representation of precipitation. An understanding of the inundation state in JASO would require seasonal forecasts of hydrological and meteorological variables. Meteorological predictors that focus on depicting changing precipitation, evaporation, and temperature patterns in the region due to climate change should additionally prove beneficial [24].

Features evolving due to the human impact on the ecosystem, such as the (e.g., monthly) amount of groundwater extraction from wells and discharge into drainage canals, were not used, as, to our knowledge, no such information is available in the region. However, the use of this information could potentially enhance the knowledge to be gained from the models, especially if such information was available at the subregional scale or for each salt pan.

5.3. Target

Our results confirm that the EO-based inundation state is a useful target variable for ML-based modeling. Data from the Landsat mission has been shown to form a useful basis for quantifying interannual dynamics in surface water dynamics [39,54,174]. Although in some years, the impact of cloud cover was high, the summer/fall inundation status could be retrieved for all salt pans over the entire study period except for the years 2002 and 2012. The variation in this target variable roughly showed similar dynamics to some of the variables considered in other studies on a larger area, e.g., SPEI3 [103] or long-term precipitation [102]. The year 2015 represents an exception, as it is referred to as drought year in ref. [103] but appears rather wet in our analysis. This might be because of the rather wet conditions in fall-winter 2014, which is also visible in ref. [103].

Uncertainties in the salt pan time series are expected to be larger for smaller salt pans, which have a larger relative proportion of mixed pixels with bordering land (Section 4.1). However, this argument turns out to be secondary since the accuracy of the model target is dependent on the exact recognition of desiccation and not on the precise sensing of the true WE. Higher resolution remote sensing products, such as Sentinel-2 imagery, could reduce the error connected to spotting desiccation inside the ’last’ pixels. Such data would need to be used in combination with, e.g., the Landsat archive, to build the models on extensive time series. In addition to using satellite data with higher resolutions, we propose the use of alternative target variables to avoid the skewed salt pan-wise (and year-wise) class imbalances. The time of the first desiccation (Figure 3) would constitute an interesting target variable [14,21].

5.4. Model Error

Although in this study we were able to predict the salt pan inundation state in Seewinkel with only moderate accuracy, the average performance of the three independent test sets indicates a gain of 0.24 compared to the RANDOM model. We regard the average score between the models of 0.6 as acceptable only insofar as the assumed reasons for the observed model error are numerous and, depending on the salt pan and year, heavy-weighing. Therefore, the model error can be, approximately, explained. Increasing the model performance based on the issues described, in detail, below is largely limited by data uncertainty and data availability. The failure of the model to make correct predictions if the meteorological conditions deviate much from the climatology, the artificial inundation, and the uncertain hydrological condition, meaning surface water possibly infiltrating into deeper layers, explain the results and provide starting points for future improvements to the model. In general, we consider the model setup performant and stable.

Many years exhibit highly skewed class distributions, especially since 2016. This influences the metrics since different years are connected to varying degrees of difficulty in correct estimation. The total skill of the three models is very similar. The indirect setup of the model, which means the prediction of the inundation state in summer via the water balance at the end of March, can be considered a major contributor to this outcome. Salt pans with a more balanced class distribution are more challenging to correctly estimate for the three models (Table 5). On average, the GROUNDWATER model performed best in predicting these eight salt pans (Zicklacke, Katschitzlacke, Fuchslochlacke 3, Oberer Stinkersee, Mittlerer Stinkersee, Wörtenlacken 2, Neubruchlacke, and Lange Lacke), although interpreting these results proved difficult due to the widely varying hydrological conditions of the salt pans [14]. Section 4.2.2 stresses the importance of the underlying physical conditions on the fold-wise performance. As already discussed in Section 5.1, moderate success mainly lies in the struggle to estimate extreme dry, and, especially, wet conditions in summer (e.g., drought around 1992, 2003, and 2016, and wet periods around especially 1996 (1997) and 2010).

As stated in Section 4.2.3, estimates were worse for years in which the inundation state shifted to the alternative state. The misclassifications are probably due to some salt pans reacting faster to hydrometeorological changes than others. Hence, for some salt pans, the environmental conditions of the previous months and year(s) have a stronger influence on the prediction of the current year than for others. Furthermore, the misclassifications may be partly due to the fact that the input features are coarse resolution (i.e., do not differ between Lacken) and partly to the model trying to get a best fit over all the years.

Although we did not apply feature selection [112] to reduce the number of features used in this study, we were able to inhibit overfitting in model testing. This was completed by trimming the decision trees used in the four RF models in the scope of the hyperparameter optimization. This built on our model design, which enables independent model testing and, practically, on closely monitoring training–test differences throughout this study.

In addition to changing climate patterns, a process referred to as “drying from beneath” [20] challenges the water-holding capacity of the salt pans. Depending on the ecological state of the salt pans, this mechanism can directly influence WE and, therefore, the inundation state. We suppose that the worse the ecological health of the salt pan, the higher the negative impact on model performance. However, it is not possible to characterize this ecological state based on our models and using the available input data. Due to the skewed class distribution, the assumption that our predictor selection works better for more natural/ecologically healthy salt pans could not be answered inside this model setup. The disregarded large human influence on the water cycle [21] constitutes an additional source of error.

All models are subject to a division between the periods before and after 2004 (Figure 5). This pattern cannot be found in the target variable (Figure 3). Additional research is needed to clearly connect climate change and the phenomenon of “dying salt pans” to these observations. As artificially inundated salt pans were introduced into the modeling, year-wise estimates could additionally have been affected due to misguided thresholding.

5.5. Model Transferability

The model based on meteorological predictors can be transferred to any other salt pan ecosystem worldwide in combination with the use of high-resolution remote sensing imagery, such as that provided by Landsat. In general, globally available predictor data in sufficient temporal and spatial resolution with respect to the studied ecosystem are needed, at best in combination with uncertainty quantification. This can be ensured by choosing an adequate spatial resolution of the predictors with regard to the catchment size. Here, ERA5-Land offers a good starting point with its 9 km × 9 km spatial resolution. The EO data should have a suitable temporal and spatial resolution to capture the dynamics of the studied ecosystem. For example, it is not possible to retrieve the water extent information of ecosystems of a smaller size than the Landsat resolution of 30 m × 30 m. Another important constraint is that this approach will likely not be suitable in the case of water bodies whose water extent shows a low sensitivity with respect to water volume, i.e., with steep bathymetry in which a drop in the water level will not lead to a proportional decrease in the water area.

6. Conclusions

As salt pans in Seewinkel are increasingly vulnerable ecosystems in often poor hydrological conditions, we aimed at improving ecosystem understanding and, finally, decision-making by predicting the salt pan inundation state in summer and fall with ML models.

Our models stress the importance of groundwater for the estimation of the inundation state in summer/fall. This solidifies the general notion represented in the literature [14,21] and calls for sustainable groundwater management in the region to ensure the conservation of this ecosystem. We stress that the use of the SGI [115] as a predictor is promising. The model based on meteorological predictors can be transferred to any other salt pan ecosystem worldwide in combination with the use of high-resolution remote sensing imagery, such as the Landsat archive. METEOROLOGY achieved an MCC of 0.66 compared to GROUNDWATER with 0.59 and COMBINED with 0.57, with respect to the independent test set. We identified the most likely sources of error, namely the struggle to estimate the inundation state correctly in the case of extreme environmental conditions developing after March, human intervention into the water cycle by artificially inundating the salt pans, and surface water loss due to the possible infiltration into deeper layers due to a failure of the water retention capacity [20]. Furthermore, we highlight the potential of the concept of partial dependency [166] to understand threshold-dependent ecosystems, such as salt pans in the Seewinkel region.

To our knowledge, the results represent the first data-driven prediction and understanding of salt pan dynamics in the Seewinkel region. We identified the main drivers and potential improvements for future model development. In this context, the use of more advanced ML algorithms could prove beneficial.

Furthermore, the possibility of transferring the METEOROLOGY model to other salt pan ecosystems in combination with EO data makes this study particularly valuable. We propose the application of our models to salt pans of larger sizes and ones that are less influenced by humans and in a better ecological condition. This could improve both performance and interpretability.

The possibility of predicting the salt pan inundation state in summer/fall is of potential importance to decision-makers in conservation and tourism [14,17]. A better understanding of salt pans can contribute to preserving this unique geographic space in the Pannonian Basin.

Author Contributions

Conceptualization, S.S. (Stefan Schlaffer) and W.D.; methodology, S.S., H.S. (Henri Schauer), E.B. (Emanuel Bueechi) and W.D.; software, H.S. and S.S.; validation, S.S.; formal analysis, H.S.; investigation, H.S. and S.S; resources, H.S. and W.D.; data curation, S.S. and H.S.; writing—original draft preparation, H.S.; writing—review and editing, H.S., S.S., E.B. and W.D.; visualization, H.S.; supervision, S.S., E.B. and W.D.; project administration, S.S. and W.D.; funding acquisition, S.S. and W.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK, Austria), the Federal Ministry of Education, Science and Research (BMBWF, Austria), Klima- und Energiefonds (KLIEN, Austria), and Land Oberösterreich in the framework of the StartClim2021/22 Programme, grant number StartClim2021.G, by Interreg Central Europe and the European Union in the framework of the project Clim4Cast (study about forecasting drought, heatwave, and fire weather in central Europe - https://www.interreg-central.eu/projects/clim4cast/) grant number CE0100059., and by Open Access Funding by TU Wien Bibliothek.

Data Availability Statement

Data and code will be made available on request.

Acknowledgments

The authors thank Harald Grabenhofer, Michael Kuttner, and Arno Cinamon of Lake Neusiedl—Seewinkel National Park (web address: https://www.nationalparkneusiedlersee.at/de/naturraum/ (accessed on 14 August 2023)) for discussion and providing useful information on the salt pans in Seewinkel.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of this study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.

Appendix A

Table A1. Most important predictors for each salt pan according to PDP displayed together with the largest spread (in parentheses). Furthermore, the threshold for predicting a certain class for the most important predictor is indicated.

	Predictor with Largest PDP Spread	Threshold for Predictor
Badesee Apetlon	GW Anomal (0.18)	−0.09
Lange Lacke	SGI (0.51)	−0.09
Neubruchlacke	GW Anomal (0.43)	0.1
Kiesgrube	GW Anomal (0.12)	−0.12
Standlacke	GW Anomal (0.25)	0.14
Ochsenbrunnlacke	GW Anomal (0.25)	0.1
Gsigsee	GW Anomal (0.0)	−0.45
Wörtenlacken 2	SGI (0.61)	−0.03
Kirchsee	T Anomal (0.21)	−0.62
Wörtenlacken 1	GW Anomal (0.49)	−0.09
Mittlerer Stinkersee	SPI 24 (0.28)	0.09
Huldenlacke	T Anomal (0.13)	−0.2
Kleine Neubruchlacke	SPI 24 (0.21)	0.29
Heidlacke	GW Anomal (0.0)	−0.45
Unterer Stinkersee	SPI 24 (0.17)	−0.41
Sechsmahdlacke	GW Anomal (0.2)	0.14
Martenhofenlacke	SGI (0.22)	0.09
Oberer Stinkersee	SPI 24 (0.55)	0.29
Kuhbrunnlacke	GW Anomal (0.08)	0.1
Hottergrube	GW Anomal (0.0)	−0.45
Fuchslochlacke 3	SGI (0.32)	0.09
St. Martins Therme 2	SPI 24 (0.23)	−0.47
Fuchslochlacke 2	SGI (0.12)	0.09
St. Martins Therme 1	GW level ratio (0.15)	1.0
Fuchslochlacke 1	GW Anomal (0.31)	0.16
Hochstätten	P Anomal (0.16)	42.27
Herrnsee	SGI (0.33)	−0.03
Birnbaumlacke	SPI 24 (0.17)	0.73
Katschitzlacke	# Days ab. 25 °C (0.33)	64.0
Albersee	GW Anomal (0.21)	0.14
unbekannt	GW Anomal (0.08)	0.06
Zicksee	GW level ratio (0.09 )	1.0
Darscholacke	GW Anomal (0.12)	−0.13
Zicklacke	SGI (0.59)	−0.03

References

Lowenstein, T.K.; Hardie, L.A. Criteria for the recognition of salt-pan evaporites. Sedimentology 1985, 32, 627–644. [Google Scholar] [CrossRef]
Shaw, P.; Bryant, R. Pans, Playas and Salt Lakes. In Arid Zone Geomorphology; John Wiley and Sons, Ltd: Hoboken, NJ, USA, 2011; Chapter 15; pp. 373–401. [Google Scholar] [CrossRef]
Leemans, R.; De Groot, R. Millennium Ecosystem Assessment: Ecosystems and Human Well-Being: A Framework for Assessment; Island Press: Washington, DC, USA, 2003. [Google Scholar]
Wurtsbaugh, W.A.; Miller, C.; Null, S.E.; DeRose, R.J.; Wilcock, P.; Hahnenberger, M.; Howe, F.; Moore, J. Decline of the world’s saline lakes. Nat. Geosci. 2017, 10, 816–821. [Google Scholar] [CrossRef]
Williams, W.D. Lakes in Arid Environments. In The Lakes Handbook; John Wiley & Sons, Ltd: Oxford, UK, 2004; Chapter 8; pp. 200–240. [Google Scholar] [CrossRef]
Safaee, S.; Wang, J. Towards global mapping of salt pans and salt playas using Landsat imagery: A case study of western United States. Int. J. Remote Sens. 2020, 41, 8693–8716. [Google Scholar] [CrossRef]
Wasserman, J.; Adams, J.B.; Lemley, D.A. Investigating the potential for saltpan restoration for the provision of multiple ecosystem services. Afr. J. Aquat. Sci. 2022, 47, 436–446. [Google Scholar] [CrossRef]
Nayak, S.; Pandeya, A.; Gupta, M.; Trivedi, C.; Prasad, K.; Kadri, S. Application of Satellite Data for Monitoring Degradation of Tidal Wetlands of the Gulf of Kachchh, Western India. In Space and Humanity; Napolitano, L., Ed.; Pergamon: Oxford, UK, 1989; pp. 171–178. [Google Scholar] [CrossRef]
Picado, A.; Dias, J.M.; Fortunato, A.B. Effect of flooding the salt pans in the Ria de Aveiro. J. Coast. Res. 2009, 2, 1395–1399. [Google Scholar]
Da Silva, M.F.; Albuquerque, H.; Martins, F.; Buron, G. Salt pans: An Indissociable Natural and Cultural Heritage – A Comparative Study Between Aveiro, Portugal and Guérande, France. In Tourism Planning and Development in Western Europe; CABI, Ltd.: Hoboken, NJ, USA, 2022; pp. 65–79. [Google Scholar] [CrossRef]
Herzig, A. Soda pans–jewels of the Nationalpark Neusiedler See-Seewinkel An opinion based on limnological data. Acta ZooBot Austria 2020, 157, 81–114. [Google Scholar]
Boros, E.; Ecsedi, Z.; Oláh, J.; Szegedi, R.; Dunn, J. Ecology and Management of Soda Pans in the Carpathian Basin; Hortobágy Environmental Association: Hortobágy, Hungary, 2013. [Google Scholar]
Albert, R.; Werner, V.; Popp, M.; Fischer, M.; Niklfeld, H. Botanische Kostbarkeiten vor unserer Haustür Die Salzpflanzen im Gebiet des Neusiedler Sees, Burgenland (Österreich). Acta ZooBot Austria 2020, 157, 115–143. [Google Scholar]
Krachler, R.; Korner, I.; Dvorak, M.; Milazowszky, N.; Rabitsch, W.; Werba, F.; Zulka, P.; Kirschner, A. Die Salzlacken des Seewinkels: Erhebung des Aktuellen Oekologischen Zustandes Sowie Entwicklung Individueller LacKenerhaltungskonzepte Fuer Die Salzlacken des Seewinkels (2008–2011); Technical Report; Naturschutzbund Burgenland: Eisenstadt, Austria, 2012. [Google Scholar]
Cabela, A.; Grillitsch, H.; Tiedemann, F. Atlas zur Verbreitung und Ökologie der Amphibien und Reptilien in Österreich: Auswertung der Herpetofaunistischen Datenbank der Herpetologischen Sammlung des Naturhistorischen Museums in Wien; Technical Report; Umweltbundesamt: Wien, Austria, 2001. [Google Scholar]
Dvorak, M.; Landmann, A.; Teufelbauer, N.; Wichmann, G.; Berg, H.M.; Probst, R. The conservation status of the breeding birds of Austria: Red List (5th version) and Birds of Conservation Concern (1st version). Egretta 2017, 55, 6–42. [Google Scholar]
Dvorak, M.; Laber, J.; Ranner, A.; Pellinger, A.; Tatai, S.; Hadarics, T.; Dorogman, C.; Lang, A. Artenliste der Avifauna des Neusiedler See—Gebiets; Technical Report; Nationalpark Neusiedler See—Seewinkel Fertő–Hanság Nemzeti Park BirdLife Österreich: Neusiedl, Österreich, 2020. [Google Scholar]
Dvorak, M.; Bieringer, G.; Braun, B.; Grüll, A.; Karner-Ranner, E.; Kohler, B.; Korner, I.; Laber, J.; Nemeth, E.; Rauer, G.; et al. Population size, distribution and population trends of threatened and ecologically important bird species in the National Park Neusiedler See—Seewinkel: Results from the years 2001–2015. Egretta 2016, 54, 4–86. [Google Scholar]
Boros, E.; V.-Balogh, K.; Vörös, L.; Horváth, Z. Multiple extreme environmental conditions of intermittent soda pans in the Carpathian Basin (Central Europe). Limnologica 2017, 62, 38–46. [Google Scholar] [CrossRef]
Krachler, R.; Krachler, R.; Milleret, E.; Wesner, W. Limnochemische Untersuchungen zur aktuellen Situation der Salzlacken im burgenlaendischen Seewinkel. Burg. Heim. 2000, 62, 3–49. [Google Scholar]
Zimmermann-Timm, H.; Teubner, K. Folgen der Grundwassersenkung am Beispiel des Neusiedlers See Seewinkel (Burgenland, Österreich). In Warnsignal Klima: Boden und Landnutzung; Lozán, J.L.S.-W., Breckle, H.G., Kasang, D., Eds.; Wissenschaftliche Auswertungen in Kooperation mit GEO: Oxford, UK, 2021; pp. 142–149. [Google Scholar]
Horvath, Z.; Ptacnik, R.; Vad, C.; Chase, J. Habitat loss over six decades accelerates regional and local biodiversity loss via changing landscape connectance. Ecol. Lett. 2019, 22, 1019–1027. [Google Scholar] [CrossRef] [PubMed]
Mitter, H.; Schmid, E. Informing groundwater policies in semi-arid agricultural production regions under stochastic climate scenario impacts. Ecol. Econ. 2021, 180, 106908. [Google Scholar] [CrossRef]
Vanderhoof, M.K.; Lane, C.R.; McManus, M.G.; Alexander, L.C.; Christensen, J.R. Wetlands inform how climate extremes influence surface water expansion and contraction. Hydrol. Earth Syst. Sci. 2018, 22, 1851–1873. [Google Scholar] [CrossRef] [PubMed]
Sharma, L.; Naik, R.; Pandey, P. A Focus on Reaggregation of Playa Wetlandscapes in the Face of Global Ecological Disconnectivity. In Advances in Remote Sensing for Natural Resource Monitoring; John Wiley & Sons: Hoboken, NJ, USA, 2021; pp. 366–387. [Google Scholar] [CrossRef]
Williams, D.D. The Biology of Temporary Waters; Oxford University Press: Oxford, UK, 2006. [Google Scholar]
Nhiwatiwa, T.; Dalu, T. Seasonal variation in pans in relation to limno-chemistry, size, hydroperiod, and river connectivity in a semi-arid subtropical region. Phys. Chem. Earth, Parts A/B/C 2017, 97, 37–45. [Google Scholar] [CrossRef]
Turak, E.; Harrison, I.; Dudgeon, D.; Abell, R.; Bush, A.; Darwall, W.; Finlayson, C.M.; Ferrier, S.; Freyhof, J.; Hermoso, V.; et al. Essential Biodiversity Variables for measuring change in global freshwater biodiversity. Biol. Conserv. 2017, 213, 272–279. [Google Scholar] [CrossRef]
Foti, R.; del Jesus, M.; Rinaldo, A.; Rodriguez-Iturbe, I. Hydroperiod regime controls the organization of plant species in wetlands. Proc. Natl. Acad. Sci. USA 2012, 109, 19596–19600. [Google Scholar] [CrossRef]
Hess, L.L.; Melack, J.M.; Novo, E.M.; Barbosa, C.C.; Gastil, M. Dual-season mapping of wetland inundation and vegetation for the central Amazon basin. Remote Sens. Environ. 2003, 87, 404–428. [Google Scholar] [CrossRef]
Reschke, J.; Bartsch, A.; Schlaffer, S.; Schepaschenko, D. Capability of C-Band SAR for Operational Wetland Monitoring at High Latitudes. Remote Sens. 2012, 4, 2923–2943. [Google Scholar] [CrossRef]
Krzepek, K.; Schmidt, J.; Iwaszczuk, D. Fusion of SAR and Multi-spectral Time Series for Determination of Water Table Depth and Lake Area in Peatlands. PFG-Photogramm. Remote Sens. Geoinf. Sci. 2022, 90, 561–575. [Google Scholar] [CrossRef]
Kseňak, Ľ.; Pukanská, K.; Bartoš, K.; Blišťan, P. Assessment of the Usability of SAR and Optical Satellite Data for Monitoring Spatio-Temporal Changes in Surface Water: Bodrog River Case Study. Water 2022, 14, 299. [Google Scholar] [CrossRef]
Bartsch, A.; Trofaier, A.M.; Hayman, G.; Sabel, D.; Schlaffer, S.; Clark, D.B.; Blyth, E. Detection of open water dynamics with ENVISAT ASAR in support of land surface modelling at high latitudes. Biogeosciences 2012, 9, 703–714. [Google Scholar] [CrossRef]
Schlaffer, S.; Chini, M.; Dorigo, W.; Plank, S. Monitoring surface water dynamics in the Prairie Pothole Region of North Dakota using dual-polarised Sentinel-1 synthetic aperture radar (SAR) time series. Hydrol. Earth Syst. Sci. 2022, 26, 841–860. [Google Scholar] [CrossRef]
Prigent, C.; Papa, F.; Aires, F.; Rossow, W.B.; Matthews, E. Global inundation dynamics inferred from multiple satellite observations, 1993–2000. J. Geophys. Res. Atmos. 2007, 112, D12. [Google Scholar] [CrossRef]
Justice, C.; Vermote, E.; Townshend, J.; Defries, R.; Roy, D.; Hall, D.; Salomonson, V.; Privette, J.; Riggs, G.; Strahler, A.; et al. The Moderate Resolution Imaging Spectroradiometer (MODIS): Land remote sensing for global change research. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1228–1249. [Google Scholar] [CrossRef]
Wulder, M.A.; Loveland, T.R.; Roy, D.P.; Crawford, C.J.; Masek, J.G.; Woodcock, C.E.; Allen, R.G.; Anderson, M.C.; Belward, A.S.; Cohen, W.B.; et al. Current status of Landsat program, science, and applications. Remote Sens. Environ. 2019, 225, 127–147. [Google Scholar] [CrossRef]
Pekel, J.F.; Cottam, A.; Gorelick, N.; Belward, A. High-resolution mapping of global surface water and its long-term changes. Nature 2016, 540, 418–422. [Google Scholar] [CrossRef]
Jones, J. Improved Automated Detection of Subpixel-Scale Inundation—Revised Dynamic Surface Water Extent (DSWE) Partial Surface Water Tests. Remote Sens. 2019, 11, 374. [Google Scholar] [CrossRef]
Sheng, Y.; Song, C.; Wang, J.; Lyons, E.A.; Knox, B.R.; Cox, J.S.; Gao, F. Representative lake water extent mapping at continental scales using multi-temporal Landsat-8 imagery. Remote Sens. Environ. 2016, 185, 129–141. [Google Scholar] [CrossRef]
Heintzman, L.; Starr, S.; Mulligan, K.; Barbato, L.; McIntyre, N. Using Satellite Imagery to Examine the Relationship between Surface-Water Dynamics of the Salt Lakes of Western Texas and Ogallala Aquifer Depletion. Wetlands 2017, 37, 1055–1065. [Google Scholar] [CrossRef]
Lefebvre, G.; Davranche, A.; Willm, L.; Campagna, J.; Redmond, L.; Merle, C.; Guelmami, A.; Poulin, B. Introducing WIW for Detecting the Presence of Water in Wetlands with Landsat and Sentinel Satellites. Remote Sens. 2019, 11, 2210. [Google Scholar] [CrossRef]
Castañeda, C.; Herrero, J.; Auxiliadora Casterad, M. Landsat monitoring of playa-lakes in the Spanish Monegros desert. J. Arid Environ. 2005, 63, 497–516. [Google Scholar] [CrossRef]
Bowen, B.B.; Kipnis, E.L.; Raming, L.W. Temporal dynamics of flooding, evaporation, and desiccation cycles and observations of salt crust area change at the Bonneville Salt Flats, Utah. Geomorphology 2017, 299, 1–11. [Google Scholar] [CrossRef]
Chiloane, C.; Dube, T.; Shoko, C. Monitoring and assessment of the seasonal and inter-annual pan inundation dynamics in the Kgalagadi Transfrontier Park, Southern Africa. Phys. Chem. Earth Parts A/B/C 2020, 118–119, 102905. [Google Scholar] [CrossRef]
Bryant, R.; Rainey, M. Investigation of flood inundation on playas within the Zone of Chotts, using a time-series of AVHRR. Remote Sens. Environ. 2002, 82, 360–375. [Google Scholar] [CrossRef]
Chew, C.; Small, E. Estimating inundation extent using CYGNSS data: A conceptual modeling study. Remote Sens. Environ. 2020, 246, 111869. [Google Scholar] [CrossRef]
Li, H.; Mao, D.; Li, X.; Wang, Z.; Wang, C. Monitoring 40-Year Lake Area Changes of the Qaidam Basin, Tibetan Plateau, Using Landsat Time Series. Remote Sens. 2019, 11, 343. [Google Scholar] [CrossRef]
Schlaffer, S.; Chini, M.; Dettmering, D.; Wagner, W. Mapping Wetlands in Zambia Using Seasonal Backscatter Signatures Derived from ENVISAT ASAR Time Series. Remote Sens. 2016, 8, 402. [Google Scholar] [CrossRef]
Nationalpark Neusiedler See - Seewinkel. Salt pan basins, Unpublished own data. 2022.
Jajarmizadeh, M. A Review on Theoretical Consideration and Types of Models in Hydrology. J. Environ. Sci. Technol. 2012, 5, 249–261. [Google Scholar] [CrossRef]
Liu, Z.; Wang, Y.; Xu, Z.; Duan, Q. Conceptual Hydrological Models. In Handbook of Hydrometeorological Ensemble Forecasting; Duan, Q., Pappenberger, F., Thielen, J., Wood, A., Cloke, H.L., Schaake, J.C., Eds.; Springer Berlin Heidelberg: Berlin/Heidelberg, Germany, 2017; pp. 1–23. [Google Scholar] [CrossRef]
Liu, G.; Schwartz, F.W. An integrated observational and model-based analysis of the hydrologic response of prairie pothole systems to variability in climate. Water Resour. Res. 2011, 47, W02504. [Google Scholar] [CrossRef]
Blöschl, G.; Sivapalan, M. Scale issues in hydrological modelling: A review. Hydrol. Process. 1995, 9, 251–290. [Google Scholar] [CrossRef]
Patil, S.D.; Stieglitz, M. Comparing spatial and temporal transferability of hydrological model parameters. J. Hydrol. 2015, 525, 409–417. [Google Scholar] [CrossRef]
Chouaib, W.; Alila, Y.; Caldwell, P.V. Parameter transferability within homogeneous regions and comparisons with predictions from a priori parameters in the eastern United States. J. Hydrol. 2018, 560, 24–38. [Google Scholar] [CrossRef]
Li, C.Z.; Zhang, L.; Wang, H.; Zhang, Y.Q.; Yu, F.L.; Yan, D.H. The transferability of hydrological models under nonstationary climatic conditions. Hydrol. Earth Syst. Sci. 2012, 16, 1239–1254. [Google Scholar] [CrossRef]
Yang, W.; Xia, R.; Chen, H.; Wang, M.; Xu, C.Y. The impact of calibration conditions on the transferability of conceptual hydrological models under stationary and nonstationary climatic conditions. J. Hydrol. 2022, 613, 128310. [Google Scholar] [CrossRef]
Farmer, W.H.; Vogel, R.M. On the deterministic and stochastic use of hydrologic models. Water Resour. Res. 2016, 52, 5619–5633. [Google Scholar] [CrossRef]
Shen, C.; Chen, X.; Laloy, E. Editorial: Broadening the Use of Machine Learning in Hydrology. Front. Water 2021, 3, 681023. [Google Scholar] [CrossRef]
Nearing, G.S.; Kratzert, F.; Sampson, A.K.; Pelissier, C.S.; Klotz, D.; Frame, J.M.; Prieto, C.; Gupta, H.V. What Role Does Hydrological Science Play in the Age of Machine Learning? Water Resour. Res. 2021, 57, e2020WR028091. [Google Scholar] [CrossRef]
Lange, H.; Sippel, S. Machine Learning Applications in Hydrology. For.-Water Interact. 2020, 240, 233–257. [Google Scholar] [CrossRef]
Mosaffa, H.; Sadeghi, M.; Mallakpour, I.; Naghdyzadegan Jahromi, M.; Pourghasemi, H.R. Chapter 43—Application of machine learning algorithms in hydrology. In Computers in Earth and Environmental Sciences; Pourghasemi, H.R., Ed.; Elsevier: Amsterdam, The Netherlands, 2022; pp. 585–591. [Google Scholar] [CrossRef]
Osman, A.I.A.; Ahmed, A.N.; Huang, Y.F.; Kumar, P.; Birima, A.H.; Sherif, M.; Sefelnasr, A.; Ebraheemand, A.A.; El-Shafie, A. Past, present and perspective methodology for groundwater modeling-based machine learning approaches. Arch. Comput. Methods Eng. 2022, 29, 3843–3859. [Google Scholar] [CrossRef]
Afrifa, S.; Zhang, T.; Appiahene, P.; Varadarajan, V. Mathematical and Machine Learning Models for Groundwater Level Changes: A Systematic Review and Bibliographic Analysis. Fut. Internet 2022, 14, 259. [Google Scholar] [CrossRef]
Ahmadi, A.; Olyaei, M.; Heydari, Z.; Emami, M.; Zeynolabedin, A.; Ghomlaghi, A.; Daccache, A.; Fogg, G.E.; Sadegh, M. Groundwater Level Modeling with Machine Learning: A Systematic Review and Meta-Analysis. Water 2022, 14, 949. [Google Scholar] [CrossRef]
Ardabili, S.; Mosavi, A.; Dehghani, M.; Várkonyi-Kóczy, A.R. Deep learning and machine learning in hydrological processes climate change and earth systems a systematic review. In Proceedings of the Engineering for Sustainable Future: Selected papers of the 18th International Conference on Global Research and Education Inter-Academia–2019, Budapest, Hungary, 4–7 September 2019; Springer: Berlin/Heidelberg, Germany, 2020; pp. 52–62. [Google Scholar] [CrossRef]
Zounemat-Kermani, M.; Batelaan, O.; Fadaee, M.; Hinkelmann, R. Ensemble machine learning paradigms in hydrology: A review. J. Hydrol. 2021, 598, 126266. [Google Scholar] [CrossRef]
Xu, T.; Liang, F. Machine learning for hydrologic sciences: An introductory overview. WIREs Water 2021, 8, e1533. [Google Scholar] [CrossRef]
Papacharalampous, G.; Tyralis, H. A review of machine learning concepts and methods for addressing challenges in probabilistic hydrological post-processing and forecasting. Front. Water 2022, 4, 961954. [Google Scholar] [CrossRef]
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Tyralis, H.; Papacharalampous, G.; Langousis, A. A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History in Water Resources. Water 2019, 11, 910. [Google Scholar] [CrossRef]
Kraft, B.; Jung, M.; Körner, M.; Koirala, S.; Reichstein, M. Towards hybrid modeling of the global hydrological cycle. Hydrol. Earth Syst. Sci. 2022, 26, 1579–1614. [Google Scholar] [CrossRef]
Fahimi, F.; Yaseen, Z.M.; El-shafie, A. Application of soft computing based hybrid models in hydrological variables modeling: A comprehensive review. Theor. Appl. Climatol. 2017, 128, 875–903. [Google Scholar] [CrossRef]
Nourani, V.; Hosseini Baghanam, A.; Adamowski, J.; Kisi, O. Applications of hybrid wavelet–Artificial Intelligence models in hydrology: A review. J. Hydrol. 2014, 514, 358–377. [Google Scholar] [CrossRef]
Khandelwal, A.; Xu, S.; Li, X.; Jia, X.; Stienbach, M.; Duffy, C.; Nieber, J.; Kumar, V. Physics Guided Machine Learning Methods for Hydrology. arXiv 2020, arXiv:2012.02854. [Google Scholar]
Sit, M.; Demiray, B.Z.; Xiang, Z.; Ewing, G.J.; Sermet, Y.; Demir, I. A comprehensive review of deep learning applications in hydrology and water resources. Water Sci. Technol. 2020, 82, 2635–2670. [Google Scholar] [CrossRef] [PubMed]
LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
Chakraborty, S.; Tomsett, R.; Raghavendra, R.; Harborne, D.; Alzantot, M.; Cerutti, F.; Srivastava, M.; Preece, A.; Julier, S.; Rao, R.M.; et al. Interpretability of deep learning models: A survey of results. In Proceedings of the 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), San Francisco, CA, USA, 4–8 August 2017; pp. 1–6. [Google Scholar] [CrossRef]
Wee, W.J.; Zaini, N.B.; Ahmed, A.N.; El-Shafie, A. A review of models for water level forecasting based on machine learning. Earth Sci. Inform. 2021, 14, 1707–1728. [Google Scholar] [CrossRef]
Hussaini, A.; Mahmud, M.R.; Wee, K.T.K.; Abubakar, A.G. A Review of Water Level Fluctuation Models and Modelling Initiatives. J. Comput. Theor. Nanosci. 2020, 17, 645–653. [Google Scholar] [CrossRef]
Zhu, S.; Lu, H.; Ptak, M.; Dai, J.; Ji, Q. Lake water-level fluctuation forecasting using machine learning models: A systematic review. Environ. Sci. Pollut. Res. 2020, 27, 44807–44819. [Google Scholar] [CrossRef] [PubMed]
Li, B.; Yang, G.; Wan, R.; Dai, X.; Zhang, Y. Comparison of random forests and other statistical methods for the prediction of lake water level: A case study of the Poyang Lake in China. Hydrol. Res. 2016, 47, 69–83. [Google Scholar] [CrossRef]
Choi, C.; Kim, J.; Han, H.; Han, D.; Kim, H.S. Development of Water Level Prediction Models Using Machine Learning in Wetlands: A Case Study of Upo Wetland in South Korea. Water 2020, 12, 93. [Google Scholar] [CrossRef]
Cartwright, J.; Morelli, T.L.; Grant, E.H.C. Identifying climate-resistant vernal pools: Hydrologic refugia for amphibian reproduction under droughts and climate change. Ecohydrology 2022, 15, e2354. [Google Scholar] [CrossRef]
Riley, J.W.; Stillwell, C.C. Predicting Inundation Dynamics and Hydroperiods of Small, Isolated Wetlands Using a Machine Learning Approach. Wetlands 2023, 43, 63. [Google Scholar] [CrossRef]
Dorigo, W.; Dietrich, S.; Aires, F.; Brocca, L.; Carter, S.; Cretaux, J.F.; Dunkerley, D.; Enomoto, H.; Forsberg, R.; Güntner, A.; et al. Closing the Water Cycle from Observations across Scales: Where Do We Stand? Bull. Am. Meteorol. Soc. 2021, 102, E1897–E1935. [Google Scholar] [CrossRef]
Sogno, P.; Klein, I.; Kuenzer, C. Remote Sensing of Surface Water Dynamics in the Context of Global Change—A Review. Remote Sens. 2022, 14, 2475. [Google Scholar] [CrossRef]
Solvik, K.; Bartuszevige, A.M.; Bogaerts, M.; Joseph, M.B. Predicting Playa Inundation Using a Long Short-Term Memory Neural Network. Water Resour. Res. 2021, 57, e2020WR029009. [Google Scholar] [CrossRef]
Shaeri Karimi, S.; Saintilan, N.; Wen, L.; Valavi, R. Application of Machine Learning to Model Wetland Inundation Patterns Across a Large Semiarid Floodplain. Water Resour. Res. 2019, 55, 8765–8778. [Google Scholar] [CrossRef]
Soltani, K.; Amiri, A.; Zeynoddin, M.; Ebtehaj, I.; Gharabaghi, B.; Bonakdari, H. Forecasting monthly fluctuations of lake surface areas using remote sensing techniques and novel Machine Learning Methods. Theor. Appl. Climatol. 2020, 143, 713–735. [Google Scholar] [CrossRef]
Daniel, J.; Rooney, R.C.; Robinson, D.T. Climate, land cover and topography: Essential ingredients in predicting wetland permanence. Biogeosciences 2022, 19, 1547–1570. [Google Scholar] [CrossRef]
Wulder, M.A.; White, J.C.; Loveland, T.R.; Woodcock, C.E.; Belward, A.S.; Cohen, W.B.; Fosnight, E.A.; Shaw, J.; Masek, J.G.; Roy, D.P. The global Landsat archive: Status, consolidation, and direction. Remote Sens. Environ. 2016, 185, 271–283. [Google Scholar] [CrossRef]
Gunning, D.; Stefik, M.; Choi, J.; Miller, T.; Stumpf, S.; Yang, G.Z. XAI—Explainable artificial intelligence. Sci. Robot. 2019, 4, eaay7120. [Google Scholar] [CrossRef]
Draganits, E.; Weißl, M.; Zámolyi, A.; Doneus, M. Lake Neusiedl area: A particular lakescape at the boundary between Alps and Pannonian Basin. In Landscapes and Landforms of Austria; Springer: Berlin/Heidelberg, Germany, 2022; pp. 207–222. [Google Scholar]
Häusler, H. Erläuterungen zu den Blättern 79 Neusiedl am See 80 Ungarisch-Altenburg 109 Pamhagen. Geol. Kt. Der Repub. ÖSterreich 2007, 1, 145–157. [Google Scholar]
Kirschner, A.; Krachler, R.; Krachler, R.; Korner, I. Renaturierung Ausgewählter Salzlacken des Burgenländischen Seewinkels; Technical Report; Naturschutzbund Burgenland: Eisenstadt, Austria, 2007. [Google Scholar]
Häusler, H. Wie gelangte das Salz in die Salzlacken?–Zum Stand der Forschung über die Entstehung des “Salzführenden Horizontes” und der Salzböden des Seewinkels. Acta ZooBot Austria 2020, 157, 145–157. [Google Scholar]
Köppen, W. The thermal zones of the Earth according to the duration of hot, moderate and cold periods and to the impact of heat on the organic world. Meteorol. Z. 2011, 20, 351–360. [Google Scholar] [CrossRef] [PubMed]
Soja, G.; Züger, J.; Knoflacher, M.; Kinner, P.; Soja, A.M. Climate impacts on water balance of a shallow steppe lake in Eastern Austria (Lake Neusiedl). J. Hydrol. 2013, 480, 115–124. [Google Scholar] [CrossRef]
Hackl, P.; Ledolter, J. A Statistical Analysis of the Water Levels at Lake Neusiedl. Austrian J. Stat. 2023, 52, 87–100. [Google Scholar] [CrossRef]
Crocetti, L.; Forkel, M.; Fischer, M.; Jurecka, F.; Grlj, A.; Salentinig, A.; Trnka, M.; Anderson, M.; Ng, W.T.; Kokalj, Ž.; et al. Earth Observation for agricultural drought monitoring in the Pannonian Basin (southeastern Europe): Current state and future directions. Reg. Environ. Chang. 2020, 20, 123. [Google Scholar] [CrossRef]
Tolotti, M.; Guella, G.; Herzig, A.; Rodeghiero, M.; Rose, N.L.; Soja, G.; Zechmeister, T.; Yang, H.; Teubner, K. Assessing the ecological vulnerability of the shallow steppe Lake Neusiedl (Austria-Hungary) to climate-driven hydrological changes using a palaeolimnological approach. J. Great Lakes Res. 2021, 47, 1327–1344. [Google Scholar] [CrossRef]
Eitzinger, J.; Kubu, G.; Formayer, H.; Haas, P.; Gerersdorfer, T.; Kromp-Kolb, H. Auswirkungen einer Klimaänderung auf den Wasserhaushalt des Neusiedlersees; Technical Report; Endbericht im Auftrag der Burgenländischen Landesregierung vom: Land Burgenland, Austrian, 2005. [Google Scholar]
Burgenland, L. Klimaszenarien für das Bundesland Burgenland BIS 2100. Technical report; Land Burgenland: Eisenstadt, Austria, 2016. [Google Scholar]
Haslinger, K.; Schöner, W.; Abermann, J.; Laaha, G.; Andre, K.; Olefs, M.; Koch, R. Apparent contradiction in the projected climatic water balance for Austria: Wetter conditions on average versus higher probability of meteorological droughts. Nat. Hazards Earth Syst. Sci. 2023, 23, 2749–2768. [Google Scholar] [CrossRef]
Wulder, M.A.; Roy, D.P.; Radeloff, V.C.; Loveland, T.R.; Anderson, M.C.; Johnson, D.M.; Healey, S.; Zhu, Z.; Scambos, T.A.; Pahlevan, N.; et al. Fifty years of Landsat science and impacts. Remote Sens. Environ. 2022, 280, 113195. [Google Scholar] [CrossRef]
Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27. [Google Scholar] [CrossRef]
RH. Natonalpark Neusiedler See–Seewinkel Bericht des Rechnungshofes. Technical Report 29, Rechnungshofs Österreich. 2020.
Hamed, K.H.; Rao, A.R. A modified Mann-Kendall trend test for autocorrelated data. J. Hydrol. 1998, 204, 182–196. [Google Scholar] [CrossRef]
Jović, A.; Brkić, K.; Bogunović, N. A review of feature selection methods with applications. In Proceedings of the 2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 25–29 May 2015; pp. 1200–1205. [Google Scholar] [CrossRef]
Sima, C.; Dougherty, E.R. What should be expected from feature selection in small-sample settings. Bioinformatics 2006, 22, 2430–2436. [Google Scholar] [CrossRef]
Krachler, R. Renaturierung von Sodalacken im burgenländischen Seewinkel. PLUS LUCIS 2007, 1-2, 23–27. [Google Scholar]
Bloomfield, J.P.; Marchant, B.P. Analysis of groundwater drought building on the standardised precipitation index approach. Hydrol. Earth Syst. Sci. 2013, 17, 4769–4787. [Google Scholar] [CrossRef]
Turkeltaub, T.; Bel, G. The effects of rain and evapotranspiration statistics on groundwater recharge estimations for semi-arid environments. Hydrol. Earth Syst. Sci. 2023, 27, 289–302. [Google Scholar] [CrossRef]
Hughes, A.; Mansour, M.; Ward, R.; Kieboom, N.; Allen, S.; Seccombe, D.; Charlton, M.; Prudhomme, C. The impact of climate change on groundwater recharge: National-scale assessment for the British mainland. J. Hydrol. 2021, 598, 126336. [Google Scholar] [CrossRef]
Oroud, I. Temperature and evaporation dynamics of saline solutions. J. Hydrol. 1999, 226, 1–10. [Google Scholar] [CrossRef]
Oroud, I. Dynamics of evaporation from saline water bodies. J. Geophys. Res. Atmos. 2001, 106, 4695–4701. [Google Scholar] [CrossRef]
Liu, T.; Harman, C.J.; Kipnis, E.L.; Bowen, B.B. Modeling Episodic Ephemeral Brine Lake Evaporation and Salt Crystallization on the Bonneville Salt Flats, Utah. In Proceedings of the AGU Fall Meeting Abstracts, New Orleans, LA, USA, 11–15 December 2017; Volume 2017, p. EP11C-1568. [Google Scholar]
Muñoz Sabater, J.; Dutra, E.; Agustí-Panareda, A.; Albergel, C.; Arduini, G.; Balsamo, G.; Boussetta, S.; Choulga, M.; Harrigan, S.; Hersbach, H.; et al. ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data 2021, 13, 4349–4383. [Google Scholar] [CrossRef]
WMO. WMO Guidelines on the Calculation of Climate Normals; Technical Report; WMO: Geneva, Switzerland, 2017. [Google Scholar]
Bloch, M.; Farkas, L.; Spiegler, K. Solar Evaporation of Salt Brines. Ind. Eng. Chem. 1951, 43, 1544–1553. [Google Scholar] [CrossRef]
Vicente-Serrano, S.M.; Beguería, S.; López-Moreno, J.I. A Multiscalar Drought Index Sensitive to Global Warming: The Standardized Precipitation Evapotranspiration Index. J. Clim. 2010, 23, 1696–1718. [Google Scholar] [CrossRef]
McKee, T.B.; Doesken, N.J.; Kleist, J. The relationship of drought frequency and duration to time scales. In Proceedings of the 8th Conference on Applied Climatology, Anaheim, CA, USA, 17–22 January 1993; Volume 17, pp. 179–183. [Google Scholar]
Kumar, R.; Musuuza, J.L.; Van Loon, A.F.; Teuling, A.J.; Barthel, R.; Ten Broek, J.; Mai, J.; Samaniego, L.; Attinger, S. Multiscale evaluation of the Standardized Precipitation Index as a groundwater drought indicator. Hydrol. Earth Syst. Sci. 2016, 20, 1117–1131. [Google Scholar] [CrossRef]
Secci, D.; Tanda, M.G.; D’Oria, M.; Todaro, V.; Fagandini, C. Impacts of climate change on groundwater droughts by means of standardized indices and regional climate models. J. Hydrol. 2021, 603, 127154. [Google Scholar] [CrossRef]
Bueechi, E.; Fischer, M.; Crocetti, L.; Trnka, M.; Grlj, A.; Zappa, L.; Dorigo, W. Crop yield anomaly forecasting in the Pannonian basin using gradient boosting and its performance in years of severe drought. Agric. For. Meteorol. 2023, 340, 109596. [Google Scholar] [CrossRef]
Papagiannopoulou, C.; Miralles, D.G.; Decubber, S.; Demuzere, M.; Verhoest, N.E.C.; Dorigo, W.A.; Waegeman, W. A non-linear Granger-causality framework to investigate climate–vegetation dynamics. Geosci. Model Dev. 2017, 10, 1945–1960. [Google Scholar] [CrossRef]
Harris, C.R.; Millman, K.J.; Van Der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362. [Google Scholar] [CrossRef] [PubMed]
Pandas Development Team, T. Pandas-Dev/Pandas: Pandas. 2023. Available online: https://zenodo.org/record/8092754 (accessed on 10 August 2023).
Collenteur, R.A.; Bakker, M.; Caljé, R.; Klop, S.A.; Schaars, F. Pastas: Open Source Software for the Analysis of Groundwater Time Series. Groundwater 2019, 57, 877–885. [Google Scholar] [CrossRef]
Bandhauer, M.; Isotta, F.; Lakatos, M.; Lussana, C.; Båserud, L.; Izsák, B.; Szentes, O.; Tveito, O.E.; Frei, C. Evaluation of daily precipitation analyses in E-OBS (v19.0e) and ERA5 by comparison to regional high-resolution datasets in European regions. Int. J. Climatol. 2022, 42, 727–747. [Google Scholar] [CrossRef]
Lavers, D.A.; Simmons, A.; Vamborg, F.; Rodwell, M.J. An evaluation of ERA5 precipitation for climate monitoring. Q. J. R. Meteorol. Soc. 2022, 148, 3152–3165. [Google Scholar] [CrossRef]
Hersbach, H.; Bell, B.; Berrisford, P.; Hirahara, S.; Horányi, A.; Muñoz-Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Schepers, D.; et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020, 146, 1999–2049. [Google Scholar] [CrossRef]
Bell, B.; Hersbach, H.; Simmons, A.; Berrisford, P.; Dahlgren, P.; Horányi, A.; Muñoz-Sabater, J.; Nicolas, J.; Radu, R.; Schepers, D.; et al. The ERA5 global reanalysis: Preliminary extension to 1950. Q. J. R. Meteorol. Soc. 2021, 147, 4186–4227. [Google Scholar] [CrossRef]
Nistor, M.M.; Cheval, S.; Gualtieri, A.F.; Dumitrescu, A.; Boţan, V.E.; Berni, A.; Hognogi, G.; Irimuş, I.A.; Porumb-Ghiurco, C.G. Crop evapotranspiration assessment under climate change in the Pannonian basin during 1991–2050. Meteorol. Appl. 2017, 24, 84–91. [Google Scholar] [CrossRef]
Cheval, S. The standardized precipitation index–an overview. Rom. J. Meteorol 2015, 12, 17–64. [Google Scholar]
Jensen, J. Remote Sensing of the Environment: An Earth Resource Perspective; Prentice Hall Series in Geographic Information Science; Pearson Prentice Hall: Upper Saddle River, NJ, USA, 2007. [Google Scholar]
McFeeters, S.K. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432. [Google Scholar] [CrossRef]
Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033. [Google Scholar] [CrossRef]
Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95. [Google Scholar] [CrossRef]
Morgenthaler, S. Exploratory data analysis. WIREs Comput. Stat. 2009, 1, 33–44. [Google Scholar] [CrossRef]
Waskom, M.L. seaborn: Statistical data visualization. J. Open Source Softw. 2021, 6, 3021. [Google Scholar] [CrossRef]
Lorena, A.; Garcia, L.P.; Lehmann, J.; de Souto, M.; Ho, T. How Complex Is Your Classification Problem?: A Survey on Measuring Classification Complexity. ACM Comput. Surv. 2019, 52, 1–34. [Google Scholar] [CrossRef]
Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nat. Methods 2020, 17, 261–272. [Google Scholar] [CrossRef]
Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31. [Google Scholar] [CrossRef]
Dorigo, W.; Lucieer, A.; Podobnikar, T.; Čarni, A. Mapping invasive Fallopia japonica by combined spectral, spatial, and temporal analysis of digital orthophotos. Int. J. Appl. Earth Obs. Geoinf. 2012, 19, 185–195. [Google Scholar] [CrossRef]
Linusson, H. Multi-Output Random Forests; University of Borås/School of Business and IT: Borås, Swedish, 2013. [Google Scholar]
Faddoul, J.B.; Chidlovskii, B.; Gilleron, R.; Torre, F. Learning Multiple Tasks with Boosted Decision Trees. In Proceedings of the 2012th European Conference on Machine Learning and Knowledge Discovery in Databases—Volume Part I, Bristol, UK, 24–28 September 2012; Springer: Berlin/Heidelberg, Germany, 2012. ECMLPKDD’12. pp. 681–696. [Google Scholar]
Probst, P.; Wright, M.N.; Boulesteix, A.L. Hyperparameters and tuning strategies for random forest. WIREs Data Min. Knowl. Discov. 2019, 9, e1301. [Google Scholar] [CrossRef]
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
Joseph, V.R. Optimal ratio for data splitting. Stat. Anal. Data Mining: ASA Data Sci. J. 2022, 15, 531–538. [Google Scholar] [CrossRef]
Xu, Y.; Goodacre, R. On splitting training and validation set: A comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning. J. Anal. Test. 2018, 2, 249–262. [Google Scholar] [CrossRef] [PubMed]
Ripley, B.D. Pattern Recognition and Neural Networks; Cambridge University Press: Cambridge, UK, 2007. [Google Scholar]
Fukunaga, K.; Hummels, D. Leave-one-out procedures for nonparametric error estimates. IEEE Trans. Pattern Anal. Mach. Intell. 1989, 11, 421–423. [Google Scholar] [CrossRef]
Wong, T.T. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognit. 2015, 48, 2839–2846. [Google Scholar] [CrossRef]
Cerqueira, V.; Torgo, L.; Mozetič, I. Evaluating time series forecasting models: An empirical study on performance estimation methods. Mach. Learn. 2020, 109, 1997–2028. [Google Scholar] [CrossRef]
Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
Buitinck, L.; Louppe, G.; Blondel, M.; Pedregosa, F.; Mueller, A.; Grisel, O.; Niculae, V.; Prettenhofer, P.; Gramfort, A.; Grobler, J.; et al. API design for machine learning software: Experiences from the scikit-learn project. arXiv 2013. [Google Scholar] [CrossRef]
Opitz, J.; Burst, S. Macro f1 and macro f1. arXiv 2019. [Google Scholar] [CrossRef]
Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6. [Google Scholar] [CrossRef] [PubMed]
Zhu, Q. On the performance of Matthews correlation coefficient (MCC) for imbalanced dataset. Pattern Recognit. Lett. 2020, 136, 71–80. [Google Scholar] [CrossRef]
Rebala, G.; Ravi, A.; Churiwala, S. An Introduction to Machine Learning; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar] [CrossRef]
Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
Goldstein, A.; Kapelner, A.; Bleich, J.; Pitkin, E. Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation. J. Comput. Graph. Stat. 2015, 24, 44–65. [Google Scholar] [CrossRef]
Altmann, A.; Toloşi, L.; Sander, O.; Lengauer, T. Permutation importance: A corrected feature importance measure. Bioinformatics 2010, 26, 1340–1347. [Google Scholar] [CrossRef]
Saarela, M.; Jauhiainen, S. Comparison of feature importance measures as explanations for classification models. SN Appl. Sci. 2021, 3, 1–12. [Google Scholar] [CrossRef]
Molnar, C. Interpretable Machine Learning; Lulu.com: Morrisville, NC, USA, 2020. [Google Scholar]
Sun, F.; Roderick, M.L.; Farquhar, G.D. Rainfall statistics, stationarity, and climate change. Proc. Natl. Acad. Sci. USA 2018, 115, 2305–2310. [Google Scholar] [CrossRef]
Cancelliere, A.; Salas, J.D. Drought probabilities and return period for annual streamflows series. J. Hydrol. 2010, 391, 77–89. [Google Scholar] [CrossRef]
Tallaksen, L.M.; Hisdal, H.; Lanen, H.A.V. Space–time modelling of catchment scale drought characteristics. J. Hydrol. 2009, 375, 363–372. [Google Scholar] [CrossRef]
Miralles, D.G.; De Jeu, R.A.M.; Gash, J.H.; Holmes, T.R.H.; Dolman, A.J. Magnitude and variability of land evaporation and its components at the global scale. Hydrol. Earth Syst. Sci. 2011, 15, 967–981. [Google Scholar] [CrossRef]
Vanderhoof, M.K.; Alexander, L.C.; Todd, M.J. Temporal and spatial patterns of wetland extent influence variability of surface water connectivity in the Prairie Pothole Region, United States. Landsc. Ecol. 2015, 31, 805–824. [Google Scholar] [CrossRef]

Figure 1. Location of Seewinkel in Eastern Austria. Outlines of salt pan basins were provided by Lake Neusiedl—Seewinkel National Park administration [51]. Additionally, the groundwater stations used in the study are marked. The salt pans are colored based on the water occurrence product of the Global Surface Water (GSW) data set [39]. The basemap stems from basemap.at (web address: https://basemap.at/ (accessed on 14 August 2023)). The coordinate reference system is the MGI/Austria GK M34 (EPSG:31259; web address: https://epsg.io/31259 (accessed on 14 August 2023)). The map inset shows the location of the study area (marked with an X) within Austria.

Figure 2. Model splits separated into training sets (blue), validation sets (green), and test sets (pink). The years that correspond to each fold are indicated. Additionally, the cross-validation (CV) schemes are marked on the left side, and the arrows represent the introduction of the seven independent test folds to the leave-one-out cross-validation (LOOCV) scheme.

Figure 3. (a) Binary classification into ‘desiccated’ state (light grey) and ‘inundated’ state (dark blue) for each salt pan and for each year. The years 2002 and 2012 are missing due to data gaps. (b) First month of each year in which the salt pans desiccate. Only months between April and October are shown.

Figure 4. Histograms and scatterplots for Lange Lacke (a) and Unterer Stinkersee (b) for all nine predictors with coloring in the respective classes. Correlation coefficients (Pearson’s r and Spearman’s

ρ

) are additionally displayed on blue background.

Figure 4. Histograms and scatterplots for Lange Lacke (a) and Unterer Stinkersee (b) for all nine predictors with coloring in the respective classes. Correlation coefficients (Pearson’s r and Spearman’s

ρ

) are additionally displayed on blue background.

Figure 5. Fold-wise average Matthews correlation coefficient (MCC) over 30 model run for every model (GW—GROUNDWATER, METEO—METEOROLOGY, COM—COMBINED) and split. The averaged confusion matrices over 30 models run for all 37 folds are additionally displayed for all three models. The folds from independent test set are marked with (T).

Figure 6. Confusion matrix outcomes for all years (part of this study) from 1984 to 2022 for the three main models based on the results for the dependent and independent test sets. For each salt pan (3 rows), the top-most row represents the GROUNDWATER model, the middle row represents the METEOROLOGY model, and the last row represents the COMBINED model. The underlying confusion matrix is rolled out for all 34 salt pans as a function of time. Again, the folds from the independent test set are marked with (T).

Figure 7. Feature importance calculated as average across all folds and 30 model runs for the three main models.

Figure 8. The Partial Dependency Plots (PDP) are displayed for each predictor inside the COMBINED model for three selected salt pans: Lange Lacke, Unterer Stinkersee, and, additionally, a salt pan with a pronounced partial dependency dynamic against the respective predictor. For these salt pans, the individual conditional expectation (ICE) plots are also included in red with reduced line width.

Table 2. Hyperparameters chosen for the three model setups: GROUNDWATER, METEOROLOGY, and COMBINED. The numbers in parentheses relate to the average test set and training set performance given together with the standard deviation (SD; again, in parentheses). Furthermore, the entire selection of tested parameters, as well as the default parameters as proposed by scikit-learn [152], are indicated.

Hyperparameter	GROUNDWATER (0.6 (0.15)/0.65 (0.03))	METEOROLOGY (0.53 (0.12)/0.58 (0.02))	COMBINED (0.58 (0.14)/0.65 (0.03))	Tested Range by GridSearchCV	Default
n_estimators	40	40	40	40, 100, 300	100
max_feature	log2	log2	all	sqrt, log2, all	sqrt
max_depth	2	4	2	1, 2, 3, 4	ultd.
min_samples_leaf	5	3	7	3, 5, 7, 9, 10	1
min_samples_split	10	17	10	6, 10, 13, 17	2
max_leaf_nodes	5	2	7	2, 3, 5, 7	ultd.

Table 3. True positive ratio (TPR), overall accuracy (OA), and kappa values derived from confusion matrices between Landsat-based water maps and reference data based on Sentinel-2.

Date Landsat	Date Reference	TPR	OA	Kappa
28 May 2017	28 May 2017	0.90	0.95	0.90
29 April 2018	28 April 2018	0.50	0.75	0.50
12 June 2019	12 June 2019	0.53	0.81	0.58
8 August 2020	8 August 2020	0.88	0.94	0.88
9 September 2020	9 September 2020	0.86	0.93	0.86
22 July 2022	21 July 2022	0.82	0.91	0.82
17 October 2022	17 October 2022	0.61	0.81	0.62

Table 4. Average performance of different model setups inside LOOCV scheme separated for testing the seven independent test folds (1985, 1991, 1997, 2004, 2010, 2017, and 2022), and the thirty dependent test folds that have already been part of the validation set. Results are averaged over 30 model runs. As the SD was, in all cases, below 0.03, we disregarded this information for each metric.

Model/Score	GROUND-WATER 7	METEO-ROLOGY 7	COMB-INED 7	RAN-DOM 7	GROUND-WATER 30	METEO-ROLOGY 30	COMB-INED 30
$M C C$ -Test	0.59	0.66	0.57	0.36	0.61	0.57	0.61
$M C C$ -Train	0.68	0.66	0.68	0.56	0.68	0.66	0.68
F1-Macro Test	0.79	0.83	0.78	0.68	0.80	0.79	0.81
F1-Macro Train	0.84	0.83	0.84	0.78	0.84	0.83	0.84
Accuracy Test	0.80	0.83	0.79	0.7	0.81	0.80	0.82
Accuracy Train	0.85	0.83	0.85	0.79	0.85	0.84	0.85

Table 5. Salt pan-wise MCC (as averaged over all folds) for the three single model realizations (GW–GROUNDWATER, METEO–METEOROLOGY, COM–COMBINED) displayed in Figure 6 for the eight salt pans that exhibit a balanced class distribution.

Average MCC	Zicklacke	Katschitz-Lacke	Fuchsloch-Lacke 3	Oberer Stinkersee	Mittlerer Stinkersee	Wörten-Lacken 2	Neubruch-Lacke	Lange Lacke	Mean
Model GW	0.46	0.51	0.13	0.5	0.15	0.6	0.57	0.6	0.44
Model METEO	0.2	0.34	0.37	0.51	0.49	0.13	0.19	0.11	0.29
Model COM	0.46	0.35	0.15	0.4	0.28	0.44	0.4	0.44	0.37

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Schauer, H.; Schlaffer, S.; Bueechi, E.; Dorigo, W. Inundation–Desiccation State Prediction for Salt Pans in the Western Pannonian Basin Using Remote Sensing, Groundwater, and Meteorological Data. Remote Sens. 2023, 15, 4659. https://doi.org/10.3390/rs15194659

AMA Style

Schauer H, Schlaffer S, Bueechi E, Dorigo W. Inundation–Desiccation State Prediction for Salt Pans in the Western Pannonian Basin Using Remote Sensing, Groundwater, and Meteorological Data. Remote Sensing. 2023; 15(19):4659. https://doi.org/10.3390/rs15194659

Chicago/Turabian Style

Schauer, Henri, Stefan Schlaffer, Emanuel Bueechi, and Wouter Dorigo. 2023. "Inundation–Desiccation State Prediction for Salt Pans in the Western Pannonian Basin Using Remote Sensing, Groundwater, and Meteorological Data" Remote Sensing 15, no. 19: 4659. https://doi.org/10.3390/rs15194659

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Inundation–Desiccation State Prediction for Salt Pans in the Western Pannonian Basin Using Remote Sensing, Groundwater, and Meteorological Data

Abstract

1. Introduction

2. Materials

2.1. Study Area

2.2. Multispectral Satellite Imagery

2.3. Predictor Data

2.3.1. Predictor Selection

2.3.2. Groundwater Level

2.3.3. ERA5-Land Meteorology

3. Methods

3.1. Salt Pan Mapping

3.1.1. Derivation of Water Extent Time Series

3.1.2. Derivation of JASO Inundation State

3.2. Inundation State Prediction

3.2.1. Exploratory Data Analysis: Separability and Correlation Analysis

3.2.2. Random Forests

3.2.3. Model Setup

3.2.4. Model Testing

3.2.5. Model Validation

3.2.6. Evaluation Metrics

3.2.7. Feature Importance

4. Results

4.1. Salt Pan Mapping

4.2. Inundation State Prediction

4.2.1. Exploratory Data Analysis

4.2.2. Average Prediction Skill

4.2.3. Detailed Analysis of Single LOOCV Model Runs

4.2.4. Feature Importance

4.2.5. Partial Dependency

5. Discussion

5.1. Assumptions

5.2. Predictors

5.3. Target

5.4. Model Error

5.5. Model Transferability

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI