**1. Introduction**

Gross primary productivity (GPP) is the cumulative sum of organisms produced by plants absorbing CO2 during photosynthesis [1,2]. It drives the seasonal and annual variations in atmospheric CO2 concentration, which reflect the production capacity of terrestrial ecosystems under natural conditions [2,3], and is an important indicator for assessing ecosystem health and climate change [4]. Therefore, accurate prediction of GPP is crucial for ecosystem function evaluation and carbon balance research [2].

The common methods for quantifying and predicting GPP are based on processing observational data and process-based model simulation [2,5]. Observational data include

**Citation:** Yang, Q.; Nie, N.; Wang, Y.; Wu, X.; Liu, W.; Ren, X.; Wang, Z.; Wan, M.; Cao, R. Spatial–Temporal Correlation Considering Environmental Factor Fusion for Estimating Gross Primary Productivity in Tibetan Grasslands. *Appl. Sci.* **2023**, *13*, 6290. https:// doi.org/10.3390/app13106290

Academic Editor: Nathan J Moore

Received: 25 April 2023 Revised: 14 May 2023 Accepted: 18 May 2023 Published: 21 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

data obtained using the eddy covariance (EC) technique, in which GPP values are obtained by calculating net ecosystem exchange (NEE) from vertical turbulent transport in the atmosphere under meteorological conditions [5,6], and satellite data, which are commonly used for GPP estimation due to their stability and sustainability, such as the MODIS GPP standard product, the VPM model, and the EC-LUE model [2,7–9]. However, satellite GPP products cannot fully guarantee the reliability of data [10], which affects the accuracy of the prediction data and introduces uncertainties to related research. Process-based models mainly investigate and simulate ecological processes occurring in plants and have extensive theoretical foundations in related fields. However, process-based models have complex structures and often simulate ideal ecological processes that deviate from actual conditions [2,11], which affects model accuracy. In addition, plant organisms involve complex and nonlinear biological and chemical mechanisms [2,12,13], which pose a great challenge for process-based models to simulate these mechanisms.

Currently, artificial intelligence (AI) algorithms have been widely applied in various fields because they can fit complex nonlinear mapping relationships between predictive and driving factors without requiring as many complicated prior assumptions as traditional models do [14]. Commonly applied machine learning models include Random Forest (RF), Support Vector Machine (SVM), and neural networks such as Long-Short Term Memory (LSTM). Tramontana et al. [15], Ichii et al. [16], Wang et al. [17], and other researchers used AI methods for tree species classification and carbon flux prediction, demonstrating the potential of AI in ecology. In recent years, Zhang et al. [18], Yuan et al. [19], Yu et al. [20], Sarkar et al. [4], and others used RF, Convolutional Neural Network (CNN), and Deep Belief Network (DBN) to predict GPP and achieved good results.

In this work, we constructed a model with spatial–temporal correlation while considering environmental factor fusion based on the GeoMAN model, a network with a multi-level attention mechanism developed by Liang et al. [21]. We trained and parameterized the algorithm with observational data on GPP from various sites and environmental driving data to extract nonlinear mapping relationships between GPP and multiple environmental factors. We designed a series of case studies to assess the performance of this deep learning model, which was based on the attention mechanism, by examining the impacts of distance, vegetation, and environmental factors on the prediction results across various flux sites. Compared with the previous applications of AI models in cross-disciplinary fields, our method not only fully utilizes the high precision of AI in prediction but also considers the prior knowledge within the relevant field to ensure that the results are both more accurate and consistent with domain knowledge.

#### **2. Materials and Methods**

## *2.1. Study Area*

The flux sites used in this study were distributed in the Tibetan Plateau region, which is located in the alpine climate zone and has the climatic characteristics of long sunshine hours, intense sunlight, low temperatures, and scant rainfall. The regional average elevation exceeds 4000 m, the annual average temperature ranges from −5.75 to 2.57 ◦C, and the annual average amount of precipitation is 200–600 mm. Alpine grassland covers more than 60% of the surface area of this region [14,22], and it is a distinctive grassland ecosystem within all alpine areas in the world [23,24].

According to the *Atlas of Grassland Resources in China* (1:1,000,000) [25], alpine grasslands are subdivided into four sub-categories: alpine Kobresia meadow (KO), alpine shrub meadow (SH), alpine swamp meadow (SW), and alpine meadow steppe (AS) (Figure 1).

**Figure 1.** Geographical spread of alpine grasslands across China [14,23]. Black triangles represent the nine flux sites.

#### *2.2. Data*

2.2.1. Flux and Meteorological Data

The flux and meteorological data used in this study were collected from the China Terrestrial Ecosystem Flux Observation and Research Network (ChinaFLUX) [14,23,26], the Coordinated Observations and Integrated Research over Arid and Semi-arid China (COIRAS) [27], and the Heihe Watershed Allied Telemetry Experimental Research (HiWA-TER) [28], which were observed by nine flux stations distributed in the Tibetan Plateau region. The data spans from 2003 to 2014. These flux sites exemplify the broadest grassland ecosystem types, encompassing an extensive range of spatial, ecological, and weatherrelated circumstances [23].

Carbon flux data were processed using various methods, including triple coordinate rotation, Webb–Pearman–Leuning (WPL) correction, and outlier removal. The temporal resolution of the MODIS data was eight days, while that of temperature and photosynthetically active radiation was half an hour. The eddy covariance system was used to concurrently record these meteorological data, and any missing values were supplemented using the technique proposed by Schwalm et al. [29]. The data were then averaged and summed over eight days [14,23]. Finally, a total of 1421 site observation data points with a temporal resolution of eight days were obtained. The primary attributes of the nine flux sites in northern China's grasslands were shown in Table 1.


**Table 1.** Primary attributes of the nine flux sites in northern China's grasslands [14,23].

#### 2.2.2. Remote Sensing Data

In this research, the remote sensing data utilized comprised the following MODIS products: normalized difference vegetation index (NDVI), enhanced vegetation index (EVI) (MOD13A2) [14,30], and surface reflectance (MOD09A1) [14,31]. The spatial resolution of the NDVI and EVI products was 1000 m and the temporal resolution was 16 days, while the spatial resolution of the surface reflectance was 500 m and the temporal resolution was 8 days. In order to acquire data with consistent spatial and temporal resolution, the quality control and data completion approaches proposed by Ma et al. [32] and Xiao et al. [33] were applied. The surface reflectance data were used to calculate the land surface water index (LSWI) [34].
