1. Introduction
Global Navigation Satellite System (GNSS) signals are affected by atmospheric refraction and bending of the propagation path, resulting in a propagation delay that is one of the main error sources in GNSS positioning [1]. Because the troposphere is a non-dispersive medium, the tropospheric delay is frequency-independent and therefore cannot be mitigated by dual-frequency or multi-frequency combinations [
2,
3]. The tropospheric delay depends on the elevation angle of the observed satellite, and the Zenith Tropospheric Delay (ZTD) is usually mapped to the slant path at that elevation angle by a mapping function. ZTD can be divided into a hydrostatic and a non-hydrostatic component [4]. The former, the Zenith Hydrostatic Delay (ZHD), accounts for more than 90% of the total delay; the latter, the Zenith Wet Delay (ZWD), is caused by water vapor in the lower atmosphere and generally accounts for 10% or less of the total delay [
5]. Chen and Liu [
6] evaluated the modeling accuracy of 9 ZHD models and 18 ZWD models; the results showed that existing models reach sub-centimeter accuracy for ZHD, whereas the modeling accuracy for ZWD was poor, with errors of up to 10 cm.
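The relationship between the zenith components and the slant delay described above can be illustrated with a minimal sketch. It assumes the flat-atmosphere mapping 1/sin(E) for both components, which is a deliberate simplification and not one of the mapping functions used in practice; the function name and the zenith delay values are hypothetical.

```python
import math

def slant_tropo_delay(zhd_m, zwd_m, elevation_deg):
    """Map zenith delays (metres) to the slant path at a given elevation angle.

    Assumption: the same flat-atmosphere mapping 1/sin(E) is applied to the
    hydrostatic and wet components. Operational mapping functions differ per
    component and behave much better at low elevation angles.
    """
    if not 0.0 < elevation_deg <= 90.0:
        raise ValueError("elevation must be in (0, 90] degrees")
    mapping = 1.0 / math.sin(math.radians(elevation_deg))
    return mapping * (zhd_m + zwd_m)

# Hypothetical zenith delays: 2.30 m hydrostatic (92%) and 0.20 m wet (8%).
ztd = slant_tropo_delay(2.30, 0.20, 90.0)       # at zenith the slant delay equals ZTD
slant_30 = slant_tropo_delay(2.30, 0.20, 30.0)  # about twice the ZTD under 1/sin(E)
```

Even in this crude form, the sketch shows why a ZWD error of a few centimeters at zenith grows for low-elevation satellites.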
Due to the high variation in water vapor in the lower atmosphere in terms of time, space, and altitude, it is difficult to accurately calculate ZWD in general [
7,
8], which makes it more difficult to estimate ZWD during the GNSS positioning solution and thereby prolongs the convergence time of Ambiguity Resolution (AR) [
9]. Therefore, accurate modeling of ZWD is a crucial issue that holds great significance for monitoring the atmospheric water vapor content [
10]. Accurate a priori ZWD can significantly improve the performance of positioning and location-based services. Studies have shown that accurate a priori ZWD constraints can shorten the convergence time of Precise Point Positioning (PPP), especially in the up direction [
11,
12,
13]. Jiang et al. [
14] confirmed that the residual caused by large height differences can be reduced by applying tropospheric constraints, improving both the AR fixing rate and the coordinate accuracy. Generally, ZWD can be obtained in four ways: (1) Empirical models, such as the models proposed by the University of New Brunswick (UNB) [15] (including UNB, UNB3, and UNB3m) and the Global Pressure and Temperature (GPT) models [16] (including GPT, GPT2, GPT2w, and GPT3). (2) Meteorological parameter models, which can achieve centimeter-level ZWD correction accuracy from surface meteorological parameters. Common models include the Hopfield [17], Saastamoinen [18], Black [19], and Askne and Nordius (AN) [20] models. Among them, the AN model additionally requires the weighted mean temperature (Tm) and the parameter describing the exponential vertical decay of water vapor (λ) as inputs, so it can describe the vertical water vapor variation over the site and is considered the most accurate formula for ZWD estimation [21]. (3) Radiosonde measurements of the atmospheric profile over the site, from which ZWD is calculated by integrating the non-hydrostatic refractivity over each level. (4) Estimation of ZWD as an unknown parameter [11]. However, empirical models have difficulty accurately predicting short-term or non-trend fluctuations of ZWD in some specific regions (generally between the latitudes of 30° N and 30° S) [22], and the applicability of meteorological parameter models in specific regions is greatly reduced [21]. Although direct estimation as a parameter is effective, it may decrease the efficiency of the algorithm.
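To make the second and third approaches concrete, the sketch below computes ZWD once from surface meteorological parameters with the commonly quoted form of the Saastamoinen wet formula, and once by trapezoidal integration of the wet refractivity over a vertical profile. The refractivity constants, the function names, and the five-level profile are assumptions chosen for illustration; they are not taken from this paper.

```python
# Assumed refractivity constants (values commonly quoted in the literature).
K2_PRIME = 22.1   # K/hPa
K3 = 3.739e5      # K^2/hPa

def saastamoinen_zwd(temp_k, e_hpa):
    """Zenith wet delay (m) from surface temperature (K) and vapor pressure (hPa)."""
    return 0.002277 * (1255.0 / temp_k + 0.05) * e_hpa

def wet_refractivity(temp_k, e_hpa):
    """Non-hydrostatic refractivity (N-units) at a single level."""
    return K2_PRIME * e_hpa / temp_k + K3 * e_hpa / temp_k ** 2

def integrate_zwd(heights_m, temps_k, e_hpa):
    """Trapezoidal integral of wet refractivity over height; returns metres."""
    n_wet = [wet_refractivity(t, e) for t, e in zip(temps_k, e_hpa)]
    total = 0.0
    for i in range(len(heights_m) - 1):
        layer_thickness = heights_m[i + 1] - heights_m[i]
        total += 0.5 * (n_wet[i] + n_wet[i + 1]) * layer_thickness
    return 1e-6 * total  # N-units * m -> m

# Hypothetical humid profile (not real radiosonde data).
heights = [0.0, 1000.0, 2000.0, 4000.0, 8000.0]   # m
temps = [300.0, 293.5, 287.0, 274.0, 248.0]       # K
vapor = [25.0, 16.0, 10.0, 4.0, 0.5]              # hPa

zwd_surface = saastamoinen_zwd(temps[0], vapor[0])
zwd_profile = integrate_zwd(heights, temps, vapor)
```

The surface formula sees only the lowest level, whereas the integral responds to the whole vertical distribution of water vapor, which is why radiosonde integration is usually treated as the reference.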
In recent years, machine learning (deep learning) algorithms have achieved promising applications in atmospheric modeling thanks to their excellent abilities in information extraction, nonlinear feature modeling, and large data processing [
23,
24]. Yang et al. [
25] used the Back Propagation (BP) neural network to model the residuals of the Hopfield and Saastamoinen models, thereby improving the ZWD modeling accuracy of these meteorological parameter models. Li et al. [26] used three machine learning algorithms, namely the BP neural network, the Radial Basis Function (RBF) neural network, and the Least Squares Support Vector Machine (LSSVM), combined with the fifth generation of the European Centre for Medium-Range Weather Forecasts Reanalysis (ERA5) and GNSS data, to construct a ZTD model for North America in 2020. The experiments showed that the accuracy and stability of LSSVM and RBF were better than those of BP, but LSSVM and RBF could not be applied to large sample data. Based on 118 radiosonde sites in China and surrounding areas, Gao et al. [
27] established two
models using the Recurrent Neural Network (RNN) and the Long Short-Term Memory (LSTM) neural network. Comparison with the BP neural network confirmed that LSTM generalizes better than RNN on long sequences owing to its memory cell, and that the nonlinear fitting ability of LSTM is stronger than that of the BP neural network. Lu et al. [28] established a Precipitable Water Vapor (PWV) model on the West Coast of the United States by fusing Moderate-resolution Imaging Spectroradiometer (MODIS) and ERA5 data using the Convolutional Neural Network (CNN). Comparison with the Multilayer Perceptron (MLP) algorithm verified that CNN could extract more detailed spatial features from multivariate time series data. Osah et al. [29] constructed a regional ZTD model with a deep learning algorithm, based on the location and surface meteorological parameters (pressure, temperature, and water vapor pressure) of four International GNSS Service (IGS) stations in West Africa. Ding et al. [30] verified the effectiveness of multiple input parameters for constructing a ZWD model with a multilayer feedforward neural network.
Tropospheric delay is greatly affected by location, season, and other factors. Current research focuses on improving the performance of ZWD models in low-latitude tropical regions [
31]. However, most studies do not consider how well the algorithm's structural characteristics match the spatiotemporal characteristics of the atmosphere. Therefore, taking advantage of deep learning algorithms, this paper combines the local spatial feature extraction ability of CNN with the ability of LSTM to learn complex sequences with long-term dependencies for ZWD modeling. The proposed hybrid algorithm follows an encoder-decoder structure, in which CNN serves as the encoder and LSTM as the decoder. The input parameters of each epoch are encoded by the multilayer CNN, which identifies the local spatial features and compresses them into a one-dimensional vector carrying spatiotemporal information that is passed to the LSTM. The multilayer LSTM receives the sequence of spatiotemporal features, extracts the long-term temporal dependencies and inter-site spatial features, and decodes the output. ZWD modeling strategies are designed for different scenarios, and the influence of surface meteorological parameters on ZWD regression modeling is explored. The accuracy of the models established by the hybrid algorithm is evaluated at both modeled and non-modeled sites in South America for the year 2022, using Root Mean Square Error (RMSE) as the metric and comparing the proposed models with empirical and meteorological parameter models. The results indicate that the hybrid algorithm exhibits good spatiotemporal modeling ability in South America and significantly improves ZWD modeling accuracy compared to the empirical and meteorological parameter models.
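The encoder-decoder idea described above can be sketched in PyTorch as follows. This is only an illustrative layout under stated assumptions: the class name, the number of input features, the layer counts and sizes, and the use of the last time step for regression are all hypothetical and do not reproduce the configuration used in this paper.

```python
import torch
from torch import nn

class CnnLstmSketch(nn.Module):
    """Hypothetical CNN encoder + LSTM decoder for ZWD regression."""

    def __init__(self, n_features, conv_channels=16, hidden_size=32, lstm_layers=2):
        super().__init__()
        # Encoder: convolve over the input parameters of one epoch, then
        # flatten the feature maps into a one-dimensional vector.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(conv_channels, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Flatten(),
        )
        # Decoder: multilayer LSTM over the sequence of encoded epochs.
        self.decoder = nn.LSTM(
            input_size=conv_channels * n_features,
            hidden_size=hidden_size,
            num_layers=lstm_layers,
            batch_first=True,
        )
        self.head = nn.Linear(hidden_size, 1)  # one ZWD value per sequence

    def forward(self, x):
        # x has shape (batch, epochs, n_features)
        batch, steps, n_features = x.shape
        per_epoch = x.reshape(batch * steps, 1, n_features)
        encoded = self.encoder(per_epoch).reshape(batch, steps, -1)
        sequence_out, _ = self.decoder(encoded)
        return self.head(sequence_out[:, -1, :])  # regress from the last epoch

model = CnnLstmSketch(n_features=6)
dummy_batch = torch.randn(4, 10, 6)   # 4 sequences, 10 epochs, 6 assumed inputs
zwd_pred = model(dummy_batch)         # shape (4, 1)
```

Applying the convolution independently to each epoch keeps the encoder focused on relationships among the input parameters, while all temporal dependence is left to the LSTM.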
The structure of this paper is as follows. Section 2 describes the principle and structure of the proposed hybrid deep learning algorithm. Section 3 introduces the dataset employed for ZWD modeling and the determination of hyperparameters for the hybrid deep learning algorithm. Section 4 presents the experimental analysis, in which the spatiotemporal modeling accuracy and generalization ability of the different models are assessed against the numerical integration of radiosonde data. The conclusions are given in Section 5.
5. Conclusions
This paper introduces a novel spatiotemporal modeling algorithm, denoted as CNN-LSTM (CL), which integrates the CNN and LSTM algorithms and is designed to capture the spatiotemporal characteristics of tropospheric Zenith Wet Delay (ZWD) sequences. The proposed algorithm uses an encoding–decoding framework, in which CNN encodes the input parameters and extracts local spatial features, thereby constructing a sequence of spatiotemporal features. LSTM then captures the long-term spatiotemporal features and decodes the output to estimate ZWD. The algorithm thus recognizes and models both the spatial and the temporal characteristics of ZWD, which are crucial for accurate modeling. Three modeling strategies are explored, and the influence of surface meteorological parameters on ZWD modeling is investigated, based on correlation analysis and the requirements of different scenarios. The study uses a 7-year radiosonde dataset from South America covering 2015 to 2021, with data from 2022 serving as the validation set. To evaluate the accuracy and effectiveness of the proposed hybrid algorithm, the ZWD models established under the same modeling strategy are compared with those of the CNN and LSTM algorithms, as well as with traditional models such as the GPT3, Saastamoinen, and AN models.
The validation results at the 46 sites in South America for the year 2022 indicate that the overall accuracy of the CL-A model, established with the CL hybrid algorithm without surface meteorological parameters, is slightly better than that of the GPT3 model, while the accuracy improvement in the 15° N–15° S region reaches 12%. The CL-B model, which introduces surface temperature, solves the problem of poor applicability of the Saastamoinen model in the Amazon Rain Forest area in northern South America, improving the accuracy by 38% in the 15° N–15° S region. With the further introduction of surface water vapor pressure, the overall RMSE of ZWD estimated by the CL-C model is 3.60 cm, an improvement of 30% and 12% over the Saastamoinen model and the AN model, respectively. The CL-C model achieves approximately 44% and 19% improvements over the Saastamoinen and AN models in the 15° N–15° S region, and 17% and 7% improvements in the 15° S–30° S region, respectively. The CL-C model also effectively mitigates the adverse effects of monsoon climate regions. In addition, the proposed hybrid algorithm outperforms the ZWD models established using the CNN or LSTM algorithm alone.
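For reference, the RMSE and the percentage improvements quoted above follow the usual definitions, which can be sketched as below. The function names and the sample values are hypothetical and are not data from this study.

```python
import math

def rmse(predicted, reference):
    """Root Mean Square Error between two equally long sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_percent(rmse_baseline, rmse_model):
    """Relative RMSE reduction of a model with respect to a baseline, in percent."""
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline

# Hypothetical ZWD values in cm: radiosonde reference, a baseline model, a new model.
reference = [20.0, 25.0, 30.0]
baseline = [24.0, 22.0, 35.0]
candidate = [21.0, 24.0, 32.0]

rmse_baseline = rmse(baseline, reference)
rmse_candidate = rmse(candidate, reference)
gain = improvement_percent(rmse_baseline, rmse_candidate)
```

Because the improvement is relative to the baseline RMSE, the same absolute gain in centimeters yields a larger percentage against a more accurate baseline.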
To validate the performance of the proposed models at non-modeled sites in South America, data from an additional six sites for the year 2022 were selected for assessment. The results show that the ZWD models established using the CL hybrid algorithm exhibit strong generalization ability under the three strategies and are reliable across different latitudes and climatic regions. Under each strategy, they improve the accuracy of the corresponding empirical and meteorological parameter models, particularly in low latitudes and specific climate regions, and properly describe the spatiotemporal variation in ZWD.
The hybrid deep learning algorithm, using long-term historical data with multiple parameters as input, effectively captures the complex nonlinear relationships between input and output to establish ZWD models. The algorithm demonstrates reliable applicability and accuracy in South America, performing well at both modeled and non-modeled sites within the region. Because this paper takes South America as the research area, the modeling and generalization capabilities of the proposed models have been validated only within this region. Expanding the modeling region by incorporating data from stations worldwide is left for future work.