Deep Learning Model for Precipitation Nowcasting Based on Residual and Attention Mechanisms

Zhang, Zhan; Song, Qingping; Duan, Minzheng; Liu, Hailei; Huo, Juan; Han, Congzheng

doi:10.3390/rs17071123


by Zhan Zhang ¹,², Qingping Song ¹, Minzheng Duan ¹,³, Hailei Liu ², Juan Huo ¹,³,* and Congzheng Han ⁴

¹ Key Laboratory of Middle Atmospheric and Global Environment Observation, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China

² Key Laboratory of Atmospheric Sounding, Chengdu University of Information Technology, Chengdu 610225, China

³ University of Chinese Academy of Sciences, Beijing 101408, China

⁴ State Key Laboratory of Atmospheric Environment and Extreme Meteorology, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(7), 1123; https://doi.org/10.3390/rs17071123

Submission received: 24 January 2025 / Revised: 16 March 2025 / Accepted: 19 March 2025 / Published: 21 March 2025

(This article belongs to the Special Issue Precipitation, Flood and Earthquake Events Monitoring, Simulation, Analysis and Early Warning by Advanced Environmental Remote Sensing and AI)


Abstract:

Nowcasting is a critical technology for disaster prevention and mitigation, and the accuracy of radar echo extrapolation directly impacts forecasting performance. In most deep learning-based models, accurately predicting heavy precipitation remains a challenging task. Focusing on the region of China, this study proposes an improved model based on residual and attention mechanisms—RA-UNet—for precipitation nowcasting with a lead time of 3 h. The model introduces the residual neural network (ResNet) and the convolutional block attention module (CBAM) to integrate multi-scale features into the U-Net encoder–decoder architecture, enhancing its ability to capture the spatiotemporal evolution of precipitation systems. Meanwhile, depthwise separable convolutions are employed to replace conventional convolutions, significantly improving computational efficiency while preserving model performance. To evaluate the model’s performance, experiments were conducted using 6 min resolution radar echo data from China in 2024, with comparisons made against the optical flow (OF) method and the U-Net model. The experimental results show that RA-UNet demonstrates significant advantages in 3 h forecasting: its mean absolute error (MAE) is reduced by approximately 7%, the false alarm rate (FAR) decreases by about 20%, and it outperforms the comparison models in metrics such as the critical success index (CSI) and structural similarity index (SSIM). Notably, RA-UNet effectively mitigates intensity degradation in long-term forecasts, successfully predicting the trend of >40 dBZ strong echo cores in two typical cases and significantly improving the premature dissipation problem of precipitation fields. This study provides a new approach to refined forecasting of complex precipitation systems, and future work will combine multi-source data fusion with physical constraint mechanisms to further enhance precipitation event prediction capabilities.

Keywords: echo extrapolation; precipitation nowcasting; deep learning; RA-UNet

1. Introduction

Precipitation, a common meteorological phenomenon, plays a vital role in agricultural production and daily life. However, heavy precipitation can lead to natural disasters, such as floods, landslides, and mudslides, which pose significant threats to human life and property [1,2]. As a crucial component of weather forecasting, accurate precipitation prediction is essential for decision-making, risk management, and minimizing losses in both life and property [3,4]. Nowcasting refers to high-resolution predictions of precipitation over brief time periods, with particular emphasis on accuracy and timeliness, especially when responding to sudden weather events [5,6]. Despite advancements, real-time, large-scale, and high-resolution precipitation forecasting remains challenging due to the complexity and uncertainty of atmospheric dynamics [7]. Precipitation forecasting models can generally be divided into two categories: Numerical Weather Prediction (NWP) models [8,9,10] and extrapolation models based on radar echo and satellite observations [11,12]. NWP models forecast precipitation by solving fluid dynamics and thermodynamic equations under initial and boundary conditions. However, these models are often influenced by initial condition fields and require substantial integration time before initiating forecasts. Additionally, NWP models are computationally expensive and have limitations in providing small-scale predictions, resulting in suboptimal performance for nowcasting (0–2 h). In contrast, radar-based extrapolation models predict the shape, location, and intensity of precipitation fields for the next few hours, relying on the advection characteristics of current cloud movements. Although these extrapolation models have limitations in simulating dynamics and thermodynamics, their spatiotemporal resolution is generally higher than that of NWP models, particularly in nowcasting. 
The optical flow method, the predominant technique used for nowcasting [13], estimates the optical flow field between consecutive radar echo images and extrapolates the echoes along it. Despite its high computational efficiency, the method still struggles with accuracy because of errors in the estimated optical flow vectors and the accumulation of extrapolation errors, which hinders further improvements in forecast accuracy [14].

Therefore, researchers have increasingly turned to deep learning models, capitalizing on their robust data-processing capabilities. Deep learning models are capable of learning complex nonlinear relationships from large datasets, which demonstrate considerable promise for nowcasting [15,16,17]. A significant step forward in this regard was made by Shi et al. with the proposal of the convolutional long short-term memory (ConvLSTM) model [18]. This approach frames nowcasting as a spatiotemporal sequence prediction problem, integrating convolutional operations into long short-term memory (LSTM) networks, thereby enhancing the model’s ability to capture both temporal and spatial features in radar echo sequences. The empirical results indicate that the ConvLSTM model outperforms the optical flow method in forecasting accuracy. Consequently, numerous subsequent models have been developed based on ConvLSTM, progressively improving predictive performance. These include models such as TrajGRU, PredRNN, PredRNN++, and MIM [19,20,21,22]. However, these models primarily focus on structural improvements, and while they demonstrate enhanced predictive capabilities, they still face some inherent limitations. For instance, as a model’s complexity increases, the prediction outputs tend to become blurred and smoothed, which complicates the precise capture of detailed changes in precipitation fields. Furthermore, the growing size of these models necessitates significantly more memory and computational resources [23].

The Transformer is a deep learning model based on the self-attention mechanism, playing a crucial role in the field of natural language processing [24]. In the domain of short-term precipitation forecasting, Gao et al. proposed a fully Transformer-based model called Earthformer [25], which achieved significant improvements in forecast accuracy. However, this model demands high computational resources and memory. Chen et al. introduced a hybrid architecture, TransUNet [26], which integrates the local feature extraction capability of convolutional neural networks with the global context modeling of Transformers, forming a multi-scale feature fusion paradigm. Yang et al. further advanced this approach with the AA-TransUNet model [27], which employs an adaptive attention mechanism to reduce network parameters while maintaining high forecasting accuracy, providing valuable insights for model lightweighting.

Meanwhile, the rapid development of Generative Adversarial Networks (GAN) has introduced new approaches for spatiotemporal sequence prediction [28]. For instance, Zhan et al. proposed the GAN-LSTM model, leveraging the synergy between LSTMs and GAN to enhance feature extraction for spatiotemporal forecasting tasks [29]. Recent studies, such as DGMR and NowcastNet [30,31], have further developed a dual spatiotemporal discriminator architecture, using adversarial constraints to align forecasted precipitation distributions with real-world precipitation patterns. However, the inherent risks of mode collapse and training instability in GAN continue to hinder their practical application, leading researchers to explore diffusion models as a potential breakthrough [32,33]. Studies by Chen et al. and Hoogeboom et al. [34,35] suggest that while diffusion models can mitigate GAN-related shortcomings, their parameter sensitivity in high-dimensional spaces and the computational burden from iterative sampling still constrain their application to low-resolution precipitation forecasting scenarios.

U-Net [36], a deep learning architecture renowned for its strong image segmentation capabilities, offers several advantages, such as a simple structure and high computational efficiency. Consequently, numerous studies have explored its application in precipitation forecasting. Initially developed for medical image segmentation tasks, U-Net is characterized by a symmetric encoder–decoder structure with a U-shaped network. It extracts features via a contracting path and refines localization through an expanding path. This structure effectively preserves spatial information in radar echo images and, through skip connections, enables the fusion of features from different layers, thereby preventing the loss of radar echo details that commonly occurs in traditional convolutional neural networks at lower resolutions [37,38,39,40]. Researchers have progressively extended U-Net’s application to nowcasting, achieving notable successes, such as RainNet and SmaAt-UNet [41,42]. These models have notably enhanced the accuracy and robustness of precipitation forecasting by optimizing network structures and incorporating various feature extraction modules. However, despite its capability to preserve spatial information, U-Net often underperforms in handling precipitation intensity variations, rapidly moving precipitation systems, and extremely heavy precipitation events. This is evident in its difficulty in forecasting long-term precipitation and its tendency to underestimate high echo values. Thus, accurate and real-time nowcasting remains a formidable challenge [15,43,44].

In this study, we present RA-UNet, a deep learning model designed to enhance the accuracy and timeliness of nowcasting. Built on the U-Net architecture, RA-UNet integrates the residual neural network (ResNet) [45] and the convolutional block attention module (CBAM) [46], along with time-series residual convolution and attention modules. Additionally, depthwise separable convolution replaces traditional convolution to further optimize model efficiency. These innovations enable RA-UNet to more effectively capture temporal information from radar echo data, improving heavy precipitation forecast accuracy while reducing model complexity. The model is trained using radar reflectivity data from China, with a forecast horizon extending up to 3 h.

The paper is structured as follows. Section 2 introduces the dataset, model framework, and evaluation methods. Section 3 presents experimental results, including two case studies on precipitation forecasting. Section 4 concludes the study and discusses potential avenues for future research.

2. Materials and Methods

2.1. Radar Echo Reflectivity Dataset

The radar echo reflectivity dataset used in this study is sourced from the China Meteorological Administration (CMA), which operates the China New Generation Weather Radar (CINRAD) network. CINRAD comprises 217 Doppler weather radars deployed across mainland China, including 94 C-band and 123 S-band radars, providing extensive coverage of the country [47] (see Figure 1). These radar systems offer high spatiotemporal resolution reflectivity data, making them well-suited for real-time precipitation monitoring and forecasting. The dataset used in this study includes radar echo reflectivity mosaics from March to August 2024, totaling approximately 37,000 radar images with a temporal resolution of 6 min. Each original radar echo image has a resolution of 1349 × 1208 pixels, covering the area (73°E–135°E, 10°N–55°N).

To ensure data quality and standardize the inputs for model training, the radar images were preprocessed as follows. First, the RGB values corresponding to the different reflectivity levels were extracted from the color echo images. Second, pixel values were mapped to radar echo reflectivity using the “reflectivity color scale” provided in the radar images. Third, annotation information (e.g., city names, river names, and boundary lines) was removed. Fourth, nearest-neighbor interpolation was applied to fill the gaps left in the reflectivity matrix by the extraction, yielding a clean, gap-free reflectivity matrix. Because low-altitude objects (e.g., mountains, buildings, and trees) can produce false echoes, a local mean filter was then applied to denoise the radar images. Finally, the images were cropped and downsampled. Most of the left side of the radar echo images (covering western China) contains no valid reflectivity values, and including these regions would increase the training difficulty. Therefore, we first cropped the images to 1024 × 1024 pixels and then downsampled them to 512 × 512 pixels to conserve computational resources and match the model’s input requirements. After preprocessing, the dataset comprises 37,428 frames of 512 × 512 pixels. The temporal resolution remains at the original 6 min interval, while the spatial resolution is approximately 10 km, covering most regions of China (approximately 85°E–135°E, 15°N–55°N).
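The cropping and downsampling step can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say which 1024 × 1024 window was kept or which downsampling operator was used, so the sketch keeps the right-most (eastern) columns, starts at a hypothetical row offset `top`, and averages non-overlapping 2 × 2 blocks. The array layout (1208 rows by 1349 columns) and the function name `crop_and_downsample` are likewise assumptions.

```python
import numpy as np

def crop_and_downsample(frame, crop=1024, factor=2, top=0):
    """Crop the eastern part of a mosaic and downsample it by block averaging.

    Assumptions (not stated in the text): the crop keeps the right-most
    `crop` columns, starts at row `top`, and downsampling averages
    non-overlapping factor x factor blocks.
    """
    h, w = frame.shape
    if h < top + crop or w < crop:
        raise ValueError("frame is smaller than the requested crop")
    if crop % factor != 0:
        raise ValueError("crop size must be divisible by the factor")
    cropped = frame[top:top + crop, w - crop:]
    out = crop // factor
    # View the crop as (out, factor, out, factor) blocks and average each block.
    blocks = cropped.reshape(out, factor, out, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical mosaic: 1208 rows x 1349 columns of reflectivity values (dBZ).
frame = np.random.default_rng(0).uniform(0.0, 60.0, size=(1208, 1349))
small = crop_and_downsample(frame)
print(small.shape)  # (512, 512)
```

Block averaging is chosen here only because it is simple to verify; strided subsampling or interpolation would fit the description equally well.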

2.2. RA-UNet

This study proposes a nowcasting model framework based on residual and attention mechanisms, as illustrated in Figure 2. The model builds on the encoder–decoder architecture of U-Net, which can be viewed as a process of information compression and feature extraction, with skip connections providing efficient transfer of spatial information. The input size of the model is (10, H, W) and the output size is (30, H, W); at the 6 min temporal resolution, the model therefore uses the radar reflectivity images from the preceding hour to forecast the next 3 h.
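To make the input and output layout concrete, the following Python sketch slices a frame sequence into (10, H, W) inputs and (30, H, W) targets. The function name `make_samples` and the `stride` argument are assumptions for illustration; the text does not specify how samples were drawn from the sequence.

```python
import numpy as np

def make_samples(frames, n_in=10, n_out=30, stride=1):
    """Split a (T, H, W) sequence into (input, target) pairs.

    Each input holds n_in consecutive frames (60 min at 6 min spacing) and
    each target holds the following n_out frames (180 min).
    The stride between samples is a hypothetical choice.
    """
    total = n_in + n_out
    if frames.shape[0] < total:
        raise ValueError("sequence is shorter than one sample")
    inputs, targets = [], []
    for start in range(0, frames.shape[0] - total + 1, stride):
        inputs.append(frames[start:start + n_in])
        targets.append(frames[start + n_in:start + total])
    return np.stack(inputs), np.stack(targets)

# Toy sequence of 50 frames whose pixel values equal the frame index.
frames = np.arange(50, dtype=float)[:, None, None] * np.ones((1, 4, 4))
x, y = make_samples(frames)
print(x.shape, y.shape)  # (11, 10, 4, 4) (11, 30, 4, 4)
```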

In the encoder, max pooling (red arrows) and double convolution (blue arrows) are applied to halve the image size and double the number of feature maps, respectively. The encoder is followed by an equal number of decoder modules, consistent with the original U-Net architecture, utilizing four encoder–decoder blocks. The decoder consists of three main steps: first, bilinear upsampling (green arrows) is used to double the feature map size; second, the feature map generated by the decoder is concatenated with the corresponding encoder output via skip connections (gray arrows); and third, double convolution operations reduce the number of feature maps by half while preserving the feature map size. Skip connections allow the model to leverage multi-scale radar echo features, facilitating efficient spatial information transfer and improving the model’s ability to extract high-level features while retaining critical details, thus enhancing forecast accuracy. The final layer of the model consists of 1 × 1 convolution (purple arrows), producing a matrix of size (30, H, W), which represents the predicted output of the network.

To further enhance the model’s feature extraction capabilities, we incorporate a convolutional block attention module (CBAM, yellow arrows) between the encoder and decoder. Figure 3 illustrates the structure of the CBAM module. It applies attention mechanisms separately along the channel and spatial dimensions, enabling it to weight radar echo features at different scales and thereby emphasize the aspects most critical for precipitation forecasting. This module captures dynamic features of precipitation systems, such as movement speed, precipitation area, intensity, propagation, formation, and dissipation. Particularly in complex weather systems, CBAM helps the model focus on key precipitation areas, suppressing irrelevant features and thereby improving prediction accuracy.
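The two attention steps of CBAM can be illustrated with a minimal NumPy sketch. It follows the general CBAM recipe (channel attention from pooled descriptors passed through a shared two-layer perceptron, then spatial attention from channel-wise pooled maps passed through a convolution). The channel reduction, the 7 × 7 kernel, the omission of bias terms, and the random weights standing in for learned parameters are all assumptions; the configuration used in RA-UNet is not given in the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C_r, C); w2: (C, C_r). Returns weights (C, 1, 1)."""
    avg = x.mean(axis=(1, 2))          # average-pooled channel descriptor
    mx = x.max(axis=(1, 2))            # max-pooled channel descriptor
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)   # shared two-layer MLP
    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]

def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k). Returns weights (1, H, W)."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])   # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    # Zero-padded "same" cross-correlation, accumulated one kernel offset at a time.
    for i in range(k):
        for j in range(k):
            out += np.tensordot(kernel[:, i, j],
                                padded[:, i:i + h, j:j + w], axes=1)
    return sigmoid(out)[None]

def cbam(x, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    x = x * channel_attention(x, w1, w2)
    return x * spatial_attention(x, kernel)

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16, 16))        # hypothetical feature map
refined = cbam(features,
               rng.normal(size=(2, 8)),        # reduce 8 channels to 2
               rng.normal(size=(8, 2)),        # expand back to 8 channels
               rng.normal(size=(2, 7, 7)))     # 7 x 7 spatial kernel
print(refined.shape)  # (8, 16, 16)
```

Because both attention maps lie between 0 and 1, the module can only rescale features downward; emphasis on important regions is therefore relative to the suppressed ones.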

To reduce computational complexity and improve model efficiency, depthwise separable convolutions are employed in the network. Compared to traditional convolutions, depthwise separable convolutions lower computational demands and reduce the number of parameters while maintaining high performance. This enables the model to perform efficiently, meeting the real-time demands of precipitation forecasting.
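The parameter saving can be checked with simple arithmetic. The sketch below compares the weight counts of a standard convolution and a depthwise separable one (a per-channel k × k depthwise filter followed by a 1 × 1 pointwise convolution), ignoring bias terms. The channel counts 64 and 128 are hypothetical and are not taken from the RA-UNet configuration.

```python
def conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k=3):
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution mixing the channels
    return depthwise + pointwise

standard = conv_params(64, 128)
separable = separable_conv_params(64, 128)
print(standard, separable, round(standard / separable, 2))  # 73728 8768 8.41
```

For a 3 × 3 kernel the ratio approaches 9 as the number of output channels grows, which is why the saving is largest in the wide layers of the network.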

Additionally, to capture the temporal and spatial correlations within the radar echo sequence, we introduce a time-series residual convolution module (orange arrows). The module is implemented as follows: First, assume that the output of the encoder–decoder architecture is f and one of the intermediate features is m. An element-wise multiplication (m ⊙ f, as shown in Figure 2) is then performed to achieve a feature masking effect, which can be regarded as an “attention mechanism”. Next, a double convolution with a 4 × 4 kernel is applied to the encoder’s output for upsampling, primarily targeting feature extraction at the spatiotemporal scale. Let the output of this step be b. Finally, the features are combined using the formula (m ⊙ f + (1 − m) ⊙ b), as shown in Figure 2. This module incorporates residual learning, helping the model maintain effective transmission of echo information over multiple time steps while alleviating issues like gradient vanishing or explosion. By merging inputs from the current and previous time steps, the module captures dynamic changes in the precipitation system, such as movement speed, intensity variations, and local feature details. This temporal residual convolution design enhances the model’s ability to capture spatiotemporal features of radar echoes, particularly in regions with intense convective activity.
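The combination rule m ⊙ f + (1 − m) ⊙ b acts as a per-pixel blend between the two feature maps. The short sketch below illustrates only this final step; it assumes that the mask m takes values in [0, 1] (for example, after a sigmoid), which the text does not state explicitly, and the function name `gated_blend` is hypothetical.

```python
import numpy as np

def gated_blend(m, f, b):
    """Blend f and b element-wise with mask m (assumed to lie in [0, 1])."""
    return m * f + (1.0 - m) * b

m = np.array([[1.0, 0.0], [0.25, 0.5]])   # hypothetical mask values
f = np.full((2, 2), 4.0)                  # stand-in for the feature f
b = np.full((2, 2), 8.0)                  # stand-in for the feature b
print(gated_blend(m, f, b))               # [[4. 8.] [7. 6.]]
```

Where m is close to 1 the output follows f, and where m is close to 0 it follows b, so the mask decides pixel by pixel which branch dominates.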

2.3. Model Training

In this study, all models are trained under the same setup, with the specific parameters shown in Table 1.

The initial learning rate was set to 0.001, and the Adam optimizer with default parameters was used to optimize the model weights. The maximum number of training epochs was set to 100, with a batch size of 12. The dataset was divided into training, validation, and test sets in a ratio of 8:1:1, used for parameter learning, hyperparameter tuning, and final performance evaluation, respectively. The mean squared error was adopted as the loss function to measure the error between predicted and true values. To improve convergence and avoid overfitting, a learning rate decay strategy was implemented: the learning rate was reduced to 90% of its current value if the validation loss did not decrease for 3 consecutive epochs. In addition, early stopping was applied, halting training automatically if the validation loss showed no improvement over the past 10 epochs. Training was conducted on an NVIDIA RTX 4090 GPU with 24 GB of memory. To evaluate the practicality of the model, key computational metrics were collected: the model contains 27,292,672 parameters (approximately 27.3 million), with a single-frame prediction time of less than 4 s, a total extrapolation time of less than 2 min for the 3 h task (30 frames), and a peak GPU memory usage of 5.8 GB during inference.
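The interaction between learning rate decay and early stopping can be sketched without any deep learning framework. The function below replays a list of validation losses and applies the two rules as described: multiply the learning rate by 0.9 after 3 consecutive non-improving epochs, and stop after 10 epochs without improvement. The function name `simulate_schedule` and the choice to reset the decay counter after each reduction are assumptions; framework schedulers may count patience slightly differently.

```python
def simulate_schedule(val_losses, lr=1e-3, factor=0.9,
                      lr_patience=3, stop_patience=10):
    """Return (learning rate used in each epoch, index of the last epoch run)."""
    best = float("inf")
    since_best = 0    # epochs since the best validation loss (early stopping)
    since_decay = 0   # non-improving epochs since the last decay or improvement
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)                 # rate in effect during this epoch
        if loss < best:
            best = loss
            since_best = 0
            since_decay = 0
            continue
        since_best += 1
        since_decay += 1
        if since_decay >= lr_patience:
            lr *= factor               # reduce to 90% of the current value
            since_decay = 0
        if since_best >= stop_patience:
            return lrs, epoch          # early stopping triggered
    return lrs, len(val_losses) - 1

# Hypothetical validation curve: two improving epochs, then a plateau.
losses = [1.0, 0.9] + [0.95] * 12
lrs, last_epoch = simulate_schedule(losses)
print(last_epoch, len(lrs))  # 11 12
```

In this toy curve the rate is reduced three times (after epochs 4, 7, and 10) before early stopping ends training at epoch 11.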

2.4. Comparison Models

To facilitate a more effective comparison with the RA-UNet model proposed in this study, we selected two widely used radar echo extrapolation forecasting models: the Optical Flow (OF) model and the U-Net model. These models are commonly employed in radar echo extrapolation and nowcasting, providing a basis for comparison from both traditional and deep learning perspectives.

2.4.1. Optical Flow

The optical flow extrapolation algorithm essentially models the motion of cloud systems in the real world as a two-dimensional projection onto the camera plane and computes the motion velocity field of the target within the image (i.e., the optical flow field) [48]. It relies on two key assumptions: the grayscale invariance assumption, which posits that the pixel grayscale values of radar echoes remain unchanged between adjacent time steps, and the small motion assumption, which assumes that the movement of radar echo pixels along the time dimension is continuous and small. Let (x, y) denote the position of a pixel in the radar echo image and I(x, y, t) represent the pixel’s grayscale value at time t. At time t + dt, the pixel moves to position (x + dx, y + dy), in accordance with the grayscale invariance assumption.

$$I(x, y, t) = I(x + dx,\, y + dy,\, t + dt) \tag{1}$$

Based on the small motion assumption, a Taylor expansion of the right-hand side gives:

$$I(x + dx,\, y + dy,\, t + dt) = I(x, y, t) + \frac{\partial I}{\partial x}\,dx + \frac{\partial I}{\partial y}\,dy + \frac{\partial I}{\partial t}\,dt + \varepsilon \tag{2}$$

where $\varepsilon$ represents the second-order infinitesimal term, which can be neglected. Substituting Equation (1) into Equation (2) cancels I(x, y, t) on both sides; dividing the remaining terms by dt and letting u = dx/dt and v = dy/dt, we obtain:

$$I_{x} u + I_{y} v + I_{t} = 0 \tag{3}$$

where $I_{x} = \frac{\partial I}{\partial x}$, $I_{y} = \frac{\partial I}{\partial y}$, and $I_{t} = \frac{\partial I}{\partial t}$

represent the partial derivatives of the pixel intensity with respect to x, y, and t, respectively, and (u, v) is the sought optical flow field. There is only one constraint equation, while the number of unknowns is two, which makes it impossible to solve for u and v uniquely. Therefore, solving the optical flow field requires additional constraint conditions. In this study, we adopt a local optical flow approach (using the two frames preceding the reference forecast time as input) that computes the optical flow field through a window-matching method based on local region constraints. A detailed mathematical derivation of the algorithm is not provided here.
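One common way to supply the missing constraint is to assume that (u, v) is constant within a small window and to solve Equation (3) for all pixels of that window in the least-squares sense (the Lucas–Kanade approach). The sketch below illustrates this idea with NumPy; it is not necessarily the window-matching algorithm used in this study, and the function name, window choice, and synthetic test pattern are assumptions.

```python
import numpy as np

def lucas_kanade_window(frame0, frame1, rows, cols):
    """Least-squares (u, v) for one window, in pixels per frame.

    rows and cols are slices selecting the window; u is the displacement
    along the column (x) axis and v along the row (y) axis.
    """
    mean = 0.5 * (frame0 + frame1)
    i_y, i_x = np.gradient(mean)       # derivatives along rows (y), columns (x)
    i_t = frame1 - frame0              # temporal derivative for dt = 1 frame
    a = np.stack([i_x[rows, cols].ravel(), i_y[rows, cols].ravel()], axis=1)
    rhs = -i_t[rows, cols].ravel()     # I_x u + I_y v = -I_t
    (u, v), *_ = np.linalg.lstsq(a, rhs, rcond=None)
    return u, v

def pattern(x, y):
    """Smooth synthetic field standing in for a radar echo image."""
    return (np.sin(2 * np.pi * x / 40) + np.cos(2 * np.pi * y / 50)
            + 0.5 * np.sin(2 * np.pi * (x + y) / 60))

yy, xx = np.mgrid[0:64, 0:64].astype(float)
frame0 = pattern(xx, yy)
frame1 = pattern(xx - 1.0, yy - 0.5)   # field displaced by (u, v) = (1.0, 0.5)
u, v = lucas_kanade_window(frame0, frame1, slice(16, 48), slice(16, 48))
print(round(float(u), 1), round(float(v), 1))
```

The estimate is reliable only when the window contains gradients in more than one direction; in a window with a single edge orientation the least-squares system becomes ill-conditioned, which is the aperture problem implied by having one equation for two unknowns.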

2.4.2. U-Net

The second comparison model is U-Net, a widely adopted deep learning architecture in various computer vision tasks. U-Net consists of two primary components: the encoder and the decoder. The encoder progressively extracts high-level features from the input image through multiple convolutional and downsampling layers. The decoder, in turn, restores the image resolution through successive deconvolution and upsampling layers. A key feature of U-Net is the use of skip connections, which link corresponding layers of the encoder and decoder. These connections enable low-level features to be passed directly to the decoder, facilitating the capture of fine-grained details and enhancing the model’s predictive performance. The input configuration of U-Net is identical to that of RA-UNet, both using images captured 60 min prior to the reference forecast time as input.

2.5. Evaluation Metrics

To comprehensively assess the model’s performance, this study employs four widely used quantitative metrics in precipitation forecasting: the Mean Absolute Error (MAE), Critical Success Index (CSI), False Alarm Ratio (FAR), and Structural Similarity Index (SSIM). These metrics evaluate the model’s effectiveness in nowcasting from various aspects, including prediction accuracy, error magnitude, false alarm rate, and the preservation of spatial features:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{i} - x_{i} \right| \tag{4}$$

where $y_{i}$ is the true value of the radar echo reflectivity, $x_{i}$ is the value predicted by the model, and n is the total number of samples. A smaller MAE value indicates a smaller deviation between the model’s predicted values and the true values. This metric reflects the performance differences across different models in predicting precipitation intensity.
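As a small illustration of Equation (4), the sketch below computes the MAE separately for each forecast lead time, which is the form needed to plot the error against lead time. The (T, H, W) array layout and the function name `mae_per_lead_time` are assumptions.

```python
import numpy as np

def mae_per_lead_time(truth, pred):
    """truth, pred: (T, H, W) reflectivity in dBZ. Returns one MAE per frame."""
    return np.abs(truth - pred).mean(axis=(1, 2))

truth = np.zeros((3, 2, 2))
# Hypothetical forecast whose error grows by 1 dBZ per lead time.
pred = np.stack([np.full((2, 2), e) for e in (1.0, 2.0, 3.0)])
print(mae_per_lead_time(truth, pred))  # [1. 2. 3.]
```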

The calculation of the Critical Success Index (CSI) and False Alarm Ratio (FAR) relies on the concepts of hits, misses, and false alarms. Specifically, by establishing different radar echo reflectivity thresholds (corresponding to varying precipitation intensities), both the predicted and true values are converted into binary 0/1 matrices. In this study, we set three thresholds at 20 dBZ, 30 dBZ, and 40 dBZ, respectively. If the radar echo reflectivity value exceeds the set threshold, it is recorded as 1 (indicating precipitation occurrence); otherwise, it is recorded as 0 (indicating no precipitation). After binarization, if a pixel in both the predicted image and the true image is 1, it is recorded as a hit (indicating a successful prediction). If the pixel in the predicted image is 0 while it is 1 in the true image, it is recorded as a miss (indicating a false negative). If the pixel in the predicted image is 1 while it is 0 in the true image, it is recorded as a false alarm (indicating a false positive). These definitions are summarized in Table 2.

Thus, the calculation formulas for CSI and FAR are as follows:

$$\mathrm{CSI} = \frac{\mathrm{hits}}{\mathrm{hits} + \mathrm{misses} + \mathrm{false\ alarms}} \tag{5}$$

$$\mathrm{FAR} = \frac{\mathrm{false\ alarms}}{\mathrm{hits} + \mathrm{false\ alarms}} \tag{6}$$

The CSI quantifies the accuracy of the model’s predictions, with values ranging from 0 to 1. A higher CSI indicates greater prediction accuracy. The FAR, also ranging from 0 to 1, measures the model’s false alarm rate, where a lower value signifies a reduced false alarm rate and, consequently, higher prediction reliability.
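The binarization and counting procedure behind Equations (5) and (6) can be sketched as follows. The function name `csi_far` and the convention of returning NaN when a denominator is zero are assumptions; the strict "exceeds" comparison follows the description above.

```python
import numpy as np

def csi_far(truth, pred, threshold):
    """Return (CSI, FAR) for one reflectivity threshold in dBZ."""
    obs = truth > threshold            # observed event mask
    fc = pred > threshold              # forecast event mask
    hits = np.sum(obs & fc)
    misses = np.sum(obs & ~fc)
    false_alarms = np.sum(~obs & fc)
    csi_den = hits + misses + false_alarms
    far_den = hits + false_alarms
    # NaN marks cases where the score is undefined (no events at all).
    csi = hits / csi_den if csi_den > 0 else float("nan")
    far = false_alarms / far_den if far_den > 0 else float("nan")
    return float(csi), float(far)

truth = np.array([[45.0, 10.0], [35.0, 25.0]])   # toy observed field (dBZ)
pred = np.array([[42.0, 32.0], [18.0, 33.0]])    # toy forecast field (dBZ)
csi, far = csi_far(truth, pred, threshold=30.0)
print(csi, round(far, 3))  # 0.25 0.667
```

At the 30 dBZ threshold the toy fields contain one hit, one miss, and two false alarms, giving CSI = 1/4 and FAR = 2/3.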

Additionally, the SSIM between the true precipitation field and the predicted values was computed [49]. The formula for SSIM is as follows:

$$\mathrm{SSIM}(x, y) = \frac{(2 \mu_{x} \mu_{y} + C_{1})(2 \sigma_{xy} + C_{2})}{(\mu_{x}^{2} + \mu_{y}^{2} + C_{1})(\sigma_{x}^{2} + \sigma_{y}^{2} + C_{2})} \tag{7}$$

Here, x and y represent the true precipitation field and the predicted values, respectively; $\mu_{x}$ and $\mu_{y}$ are the means of x and y, $\sigma_{x}^{2}$ and $\sigma_{y}^{2}$ are their variances, and $\sigma_{xy}$ is the covariance between x and y. C₁ and C₂ are stability constants to avoid division by zero. The SSIM value is at most 1 and typically lies between 0 and 1 (negative values can arise, for example, when the two fields are negatively correlated); a higher value indicates better preservation of the precipitation field’s structure and spatial distribution by the model during prediction.
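Equation (7) can be evaluated directly from the field statistics, as in the sketch below. This is a simplification: it uses global means and variances over the whole field, whereas SSIM is often computed over local sliding windows and then averaged, and the text does not state which variant was used. The dynamic range of 70 dBZ is hypothetical, and the constants C₁ = (0.01 L)² and C₂ = (0.03 L)² are the commonly used defaults rather than values given in the text.

```python
import numpy as np

def global_ssim(x, y, data_range=70.0, k1=0.01, k2=0.03):
    """SSIM from whole-field statistics (no sliding window)."""
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

rng = np.random.default_rng(0)
field = rng.uniform(0.0, 60.0, size=(32, 32))    # toy reflectivity field (dBZ)
print(global_ssim(field, field))                 # identical fields score 1
print(global_ssim(field, field + 5.0) < 1.0)     # a constant bias lowers the score
```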

3. Results

3.1. The Predictive Performance on the Test Dataset

A comprehensive evaluation of the model’s predictive performance was conducted using a test dataset comprising radar echo reflectivity data from the China region for July 2024, which was not part of the training process. Figure 4 illustrates the variation in MAE and SSIM across different models with respect to forecast lead time. As forecast lead time increases, the MAE for all models gradually rises; however, RA-UNet consistently maintains a lower MAE, indicating superior performance in predicting precipitation intensity. In comparison, U-Net outperforms OF up to a 90 min forecast horizon, after which OF gradually surpasses U-Net, though the difference remains marginal. In terms of SSIM, RA-UNet consistently achieves a high value (greater than 0.83), suggesting that it better preserves the spatial structure and shape of the precipitation field, thereby demonstrating greater prediction consistency. U-Net ranks second in SSIM, capturing spatial features of the precipitation field to some extent, while OF consistently produces the lowest SSIM, reflecting its inability to preserve the spatial structure, instead merely shifting the precipitation field spatially. The combined analysis of MAE and SSIM clearly indicates that RA-UNet excels in both precipitation intensity prediction and spatial structure reproduction.

The CSI and FAR results for each model at various radar echo reflectivity thresholds (20 dBZ, 30 dBZ, and 40 dBZ) are presented in Figure 5. It is evident that, across different reflectivity thresholds, RA-UNet consistently outperforms both U-Net and OF in terms of CSI, particularly in short-term predictions. However, as the forecast lead time increases, the CSI for RA-UNet shows a gradual decline, indicating a weakening trend. This may be attributed to the increased complexity of the precipitation field’s echo structure as the forecast duration lengthens, which poses a greater challenge to the model’s predictive capabilities. Additionally, at the 40 dBZ threshold, U-Net’s CSI drops near zero after 90 min, suggesting its limited ability to predict heavy precipitation events. In contrast, RA-UNet, by incorporating temporal residual and attention modules, effectively enhances its prediction performance for heavy precipitation, maintaining a high CSI at the 40 dBZ threshold. This highlights RA-UNet’s significant advantage in handling heavy precipitation events.

Further analysis of the FAR results reveals that RA-UNet performs excellently at the 20 dBZ and 40 dBZ thresholds, with its FAR gradually increasing over time but at a relatively steady rate. This trend is consistent with its higher CSI values, indicating that RA-UNet effectively controls the false alarm rate while maintaining a high hit rate. At the 30 dBZ threshold, U-Net exhibits relatively strong performance in terms of FAR, consistently outperforming other models, but this advantage diminishes at the 40 dBZ threshold, where its FAR rapidly exceeds 0.7 after one hour of forecast time. In contrast, OF consistently shows poor FAR performance across all thresholds, reflecting its overall lower forecasting skill. Overall, RA-UNet demonstrates the best comprehensive performance in terms of both CSI and FAR.

In summary, RA-UNet demonstrates notable advantages in nowcasting, owing to the incorporation of temporal residual and attention mechanisms. These mechanisms effectively mitigate information loss and reduce the accumulation of errors, thereby enhancing the model’s ability to capture heavy precipitation events. In contrast, U-Net, which performs convolution operations solely in the spatial domain, exhibits limitations in long-term forecasting and the prediction of heavy precipitation events. Specifically, at the 40 dBZ threshold, its CSI value approaches zero after 90 min, and its FAR rapidly exceeds 0.7 once the forecast time surpasses one hour, highlighting its difficulty in managing the complex evolution of precipitation fields. Meanwhile, OF exhibits consistently weak performance across all metrics, particularly in precipitation intensity prediction and spatial structure reproduction. Its MAE, SSIM, and FAR remain suboptimal, indicating that its Lagrangian extrapolation-based approach is insufficient for capturing the rapid changes and intricate features of precipitation fields.

3.2. Case Analysis

3.2.1. Case 1

The first case study is a continuous heavy rainfall and hailstorm event that occurred in China (with a latitude and longitude range of approximately 100°E–128°E, 18°N–48°N, mainly concentrated in the Guangxi region) from 19 to 22 April 2024, in Coordinated Universal Time (UTC). Figure 6 presents the forecast results starting from the reference time T (in this case, T = 12:48:00 UTC), with the forecast lead time expressed in minutes (e.g., T + 30 min corresponds to 13:18:00 UTC). The first row displays the ground truth, while the subsequent rows show the predictions from the RA-UNet, OF, and U-Net models. From the ground truth (GT), it can be observed that the precipitation system exhibits significant spatiotemporal evolution characteristics: within the 0–3 h forecast lead time, the area of low-intensity echo precipitation shows a shrinking trend over time, while the heavy precipitation area (>30 dBZ) displays a less pronounced expansion trend. Notably, the strong echo cores (>40 dBZ) exhibit intensity enhancement features under a quasi-stationary state. This differential evolution of high- and low-intensity echo regions essentially reflects the nonlinear coupling between updrafts and precipitation particle growth processes within the mesoscale convective system.

The RA-UNet model effectively captured the spatiotemporal evolution characteristics of the precipitation field. While the model’s depiction of the precipitation field’s morphology was somewhat generalized, it accurately reproduced the spatial distribution and temporal dynamics. Specifically, RA-UNet excelled in identifying and predicting heavy precipitation regions (>40 dBZ), with predictions showing strong consistency with actual observations. However, the model exhibited limitations in predicting extremely strong echoes (>50 dBZ). Although it successfully located these extreme regions, it tended to underestimate their intensity, with an average underestimation of 5–8 dBZ. This issue likely arises from the smoothing effect on local extrema, a consequence of the inherent regularization constraints within the deep learning framework.
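One way to quantify an underestimation of this kind is the mean bias (prediction minus observation) restricted to pixels where the observed echo exceeds a threshold. The sketch below is a minimal illustration and not the authors' evaluation code: the function name `mean_bias_above` and the toy reflectivity values are hypothetical, chosen only to show the calculation.

```python
import numpy as np


def mean_bias_above(pred_dbz, obs_dbz, threshold_dbz=50.0):
    """Mean of (pred - obs) over pixels whose observed value exceeds the threshold."""
    pred = np.asarray(pred_dbz, dtype=float)
    obs = np.asarray(obs_dbz, dtype=float)
    mask = obs > threshold_dbz
    if not mask.any():
        return float("nan")  # no extreme echoes in this scene
    return float(np.mean(pred[mask] - obs[mask]))


# Toy fields in dBZ (made-up numbers, not data from the paper).
obs = np.array([[52.0, 55.0, 30.0], [20.0, 51.0, 45.0]])
pred = np.array([[46.0, 48.0, 31.0], [22.0, 45.0, 44.0]])

bias = mean_bias_above(pred, obs)
print(f"mean bias above 50 dBZ: {bias:.2f} dBZ")
```

A negative value indicates that the forecast is weaker than the observation in the extreme-echo region.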

In contrast, the OF method, while performing well in echo hit rates, suffered from significant false echo predictions. This limitation stems from its simplistic extrapolation mechanism, which relies on optical flow vectors between adjacent frames. This approach fails to effectively capture the birth and decay of precipitation systems, especially during rapid convection evolution. As a result, OF struggled to predict the development and maturation stages of newly formed convective cells, limiting its ability to forecast intense precipitation events.
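The limitation described above can be illustrated with a deliberately simplified Lagrangian persistence sketch. It assumes a single constant motion vector for the whole field, which is cruder than the OF scheme used in the study, and the function name `extrapolate` is hypothetical. The point is only that pure advection moves echoes without changing their intensity.

```python
import numpy as np


def extrapolate(last_frame, motion_yx, n_steps):
    """Advect the last observed frame by a constant per-step pixel displacement.

    np.roll wraps around the grid edges; that is acceptable here because the
    toy echo stays well inside the domain.
    """
    dy, dx = motion_yx
    return [
        np.roll(last_frame, shift=(dy * k, dx * k), axis=(0, 1))
        for k in range(1, n_steps + 1)
    ]


# A single 45 dBZ cell on an otherwise empty 8 x 8 grid (toy data).
frame = np.zeros((8, 8))
frame[2, 2] = 45.0

forecasts = extrapolate(frame, motion_yx=(0, 1), n_steps=3)

for k, field in enumerate(forecasts, start=1):
    row, col = np.unravel_index(np.argmax(field), field.shape)
    print(f"step {k}: peak {field.max():.0f} dBZ at (row={row}, col={col})")
```

The cell drifts one pixel per step, but its peak value never changes, so growth, decay, and newly formed cells cannot appear in such a forecast.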

The U-Net model performed the weakest, particularly in precipitation intensity prediction. U-Net exhibited a systematic bias, significantly underestimating the intensity of strong echo cores. Additionally, it suffered from severe blurring and smoothing, resulting in the loss of fine echo details.

We computed the MAE (see the bottom of Figure 6) to evaluate the prediction performance of reflectivity intensity at various forecast lead times. The results indicate that the OF and U-Net models perform poorly, with cumulative errors increasing as the forecast lead time extends. In contrast, RA-UNet achieves the lowest MAE across all lead times, and its error growth trend is significantly lower than that of the other models, demonstrating its ability to effectively capture the evolution of radar reflectivity intensity even at longer forecast lead times.
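A per-lead-time MAE curve of the kind shown at the bottom of Figure 6 can be computed as sketched below. The array layout `(lead_steps, height, width)` and the function name `mae_per_lead` are assumptions made for illustration; the toy values are not data from the study.

```python
import numpy as np


def mae_per_lead(pred_dbz, obs_dbz):
    """Return one MAE value (in dBZ) per forecast step.

    Both inputs are assumed to have shape (lead_steps, height, width).
    """
    pred = np.asarray(pred_dbz, dtype=float)
    obs = np.asarray(obs_dbz, dtype=float)
    return np.mean(np.abs(pred - obs), axis=(1, 2))


# Three lead steps on a 2 x 2 grid, with errors of 1, 2 and 4 dBZ (toy data).
obs = np.full((3, 2, 2), 30.0)
pred = np.stack([obs[0] + 1.0, obs[1] - 2.0, obs[2] + 4.0])

mae = mae_per_lead(pred, obs)
print(mae)
```

Plotting the returned vector against lead time gives the error-growth curve for one model.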

Compared with the OF and U-Net models, RA-UNet demonstrated a substantial improvement in capturing the evolution of the precipitation field. Specifically, RA-UNet better handled the formation and dissipation of precipitation systems, alleviating the “false generation” and “premature dissipation” problems common to the other two methods. This enhancement can be attributed to the attention mechanism and residual structure incorporated into RA-UNet, which enabled it to better capture the spatiotemporal correlations of precipitation systems.

3.2.2. Case 2

The second case occurred at 14:24 UTC on 25 May 2024, involving a large-scale precipitation event that affected much of China (with a latitude and longitude range of approximately 95°E–135°E, 18°N–52°N). The precipitation system exhibited a typical banded structure, extending from the northeast to the southwest across the Chinese mainland and forming a narrow precipitation belt. Figure 7 presents the forecast results for this case, with the first row displaying the GT and the subsequent rows showing predictions from various models (in this case, T = 14:24:00 UTC). The GT reveals that multiple scattered rainbands gradually merged over time, a phenomenon indicating distinct spatial organization within the precipitation system, likely driven by convergence effects in the atmospheric circulation. Initially, at T + 30 min, the precipitation bands are more scattered; however, as time progresses (from T + 60 min to T + 180 min), these scattered precipitation cores progressively converge, resulting in a more continuous and concentrated precipitation area.

Overall, both RA-UNet and U-Net capture the general shape of the precipitation area and the broad evolution of the precipitation system. RA-UNet excels in modeling the merging of precipitation bands, accurately predicting the intensity and spatial changes in precipitation cores. U-Net captures some aspects of the evolution but suffers from significant degradation: its predicted echoes dissipate prematurely and its precipitation intensity is too weak. The OF model appears to offer detailed predictions of local convective activity, but it primarily translates the precipitation field of the initial frame forward to the final lead time. This produces a large number of both “successful” and “invalid” predictions, which accounts for its high FAR on the test set. Over time, these errors accumulate, leading to poor long-term forecasting performance, particularly in capturing the evolution of the precipitation system. In contrast, RA-UNet significantly improves forecast accuracy, both improving precipitation intensity prediction and reducing premature echo dissipation. Compared to both OF and U-Net, RA-UNet is better at capturing the merging of core precipitation regions within the rainband and the finer details of convective precipitation. Notably, RA-UNet exhibits minimal performance decay at longer forecast lead times (>2 h), demonstrating strong generalization capabilities. However, similar to U-Net, RA-UNet underestimates the intensity of extremely strong echoes (>50 dBZ), a limitation likely due to the scarcity of extreme precipitation samples in the training data and the model’s inherent difficulty in handling highly nonlinear strong convective systems. In terms of MAE, although the three models exhibit similar performance, RA-UNet consistently maintains the lowest error.

In conclusion, RA-UNet outperforms other models in forecasting complex precipitation systems. Compared to OF, RA-UNet captures local features and the dynamic evolution of the precipitation field more effectively; compared to U-Net, it significantly mitigates degradation effects and misses. Nevertheless, due to the complexities involved in cyclone propagation processes and the rarity of extreme precipitation events, all models require further refinement in rainfall shape forecasting and echo intensity prediction.

4. Discussion

This study proposes a nowcasting model, RA-UNet, based on a deep learning architecture, designed for radar echo extrapolation and short-term precipitation prediction in China, with a forecast lead time of up to 3 h. We compared RA-UNet with two commonly used extrapolation methods (the OF method and the classical U-Net model) through comprehensive performance analysis and evaluation using typical precipitation cases. The experimental results demonstrate that RA-UNet outperforms OF and U-Net on multiple evaluation metrics (e.g., MAE, SSIM, and FAR), particularly excelling in forecasting heavy precipitation. Additionally, case studies reveal that RA-UNet can generate relatively accurate precipitation field contours over longer forecast periods while preserving the core characteristics of high-intensity radar echoes, whereas OF and U-Net exhibit limitations in these aspects. Notably, although RA-UNet significantly mitigates the smoothing effect and intensity degradation of heavy precipitation through the introduction of temporal residual modules and attention mechanisms, it still inevitably experiences some degree of error accumulation over longer forecast lead times. The experimental results indicate an underestimation in the dynamic range prediction of extreme strong echoes (e.g., >50 dBZ), which may be attributed to the insufficient number of extreme samples in the training data and limitations in modeling the nonlinear interactions within complex convective systems. This is a common challenge faced by deep learning models in precipitation forecasting. In future work, we will integrate adaptive weighting mechanisms to enhance the model’s capability in predicting the dynamic range of extreme echoes. In contrast, while OF exhibits less of a smoothing effect, its strict assumptions mean it struggles to handle the evolution of strong convective precipitation and complex precipitation systems. Meanwhile, U-Net, constrained by its limited receptive field and spatiotemporal feature extraction capabilities, performs poorly in modeling dynamic precipitation fields.
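The weighting idea mentioned as future work is not specified in this study. As a purely illustrative sketch, one simple (fixed rather than adaptive) variant is an MSE in which each pixel is weighted by the intensity class of the observed echo. The thresholds and weights below are hypothetical choices, not values proposed by the authors.

```python
import numpy as np


def weighted_mse(pred_dbz, obs_dbz,
                 thresholds=(30.0, 40.0, 50.0),
                 weights=(1.0, 2.0, 5.0, 10.0)):
    """MSE with per-pixel weights chosen from the observed intensity class.

    thresholds and weights are hypothetical; len(weights) must equal
    len(thresholds) + 1.
    """
    pred = np.asarray(pred_dbz, dtype=float)
    obs = np.asarray(obs_dbz, dtype=float)
    # np.digitize returns 0 below the first threshold, up to len(thresholds)
    # at or above the last one.
    pixel_weights = np.asarray(weights)[np.digitize(obs, thresholds)]
    return float(np.sum(pixel_weights * (pred - obs) ** 2) / np.sum(pixel_weights))


# One weak-echo pixel and one extreme-echo pixel (toy data).
obs = np.array([20.0, 55.0])
pred = np.array([22.0, 49.0])

loss = weighted_mse(pred, obs)
plain = float(np.mean((pred - obs) ** 2))
print(f"weighted MSE: {loss:.2f}, plain MSE: {plain:.2f}")
```

Here the 6 dBZ error on the extreme pixel dominates the weighted loss, whereas the plain MSE treats both pixels equally.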

RA-UNet demonstrates promising potential for short-term precipitation extrapolation tasks, but there is still room for improvement. The current study uses only radar echo data as input for model training, overlooking the potential contributions of other meteorological variables (e.g., wind fields, humidity, and temperature). However, the integration of multi-source data (e.g., satellite observations, NWP model outputs, and real-time observational data) can not only provide physical constraints for data-driven models but also further enhance forecast accuracy through data assimilation. Future research could consider incorporating multi-source meteorological data and physical process simulations to improve the model’s predictive capability for multi-scale precipitation systems [50]. Additionally, specialized studies targeting more extreme weather events (e.g., typhoons and heavy rainfall) and longer forecast lead times could be conducted to explore the potential of RA-UNet in practical early warning systems.

Author Contributions

Conceptualization, Z.Z.; methodology, Z.Z. and C.H.; software, Z.Z. and J.H.; validation, Z.Z. and Q.S.; formal analysis, Z.Z.; investigation, Z.Z.; resources, Z.Z. and J.H.; data curation, Z.Z. and H.L.; writing—original draft preparation, Z.Z. and Q.S.; writing—review and editing, H.L.; visualization, M.D.; supervision, M.D.; project administration, J.H.; funding acquisition, J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Key R&D Program of China (grant 2023YFC3010700) and the National Natural Science Foundation of China (grant 42275081).

Data Availability Statement

The radar echo reflectivity data can be found at http://www.nmc.cn/publish/radar/chinaall.html (accessed on 21 February 2025).

Acknowledgments

We thank the many contributors from the science team of the IAP who enabled our research and made this project possible.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Fabry, F.; Seed, A.W. Quantifying and predicting the accuracy of radar-based quantitative precipitation forecasts. Adv. Water Resour. 2009, 32, 1043–1049.
2. Li, M.; Yan, B.W. A study on the response characteristics of precipitation to environmental factor changes in the Guanzhong region of Shaanxi. J. Irrig. Drain. 2016, 35, 97–100.
3. Chen, L.; Cao, Y.; Ma, L.; Zhang, J. A deep learning-based methodology for precipitation nowcasting with radar. Earth Space Sci. 2020, 7, e2019EA000812.
4. Czibula, G.; Mihai, A.; Albu, A.I.; Czibula, I.G.; Burcea, S.; Mezghani, A. Autonowp: An approach using deep autoencoders for precipitation nowcasting based on weather radar reflectivity prediction. Mathematics 2021, 9, 1653.
5. Ehsani, M.R.; Zarei, A.; Gupta, H.V.; Barnard, K.; Behrangi, A. Nowcasting-Nets: Deep neural network structures for precipitation nowcasting using IMERG. arXiv 2021, arXiv:2108.06868.
6. Ayzel, G.; Heistermann, M.; Winterrath, T. Optical flow models as an open benchmark for radar-based precipitation nowcasting (rainymotion v0.1). Geosci. Model Dev. 2019, 12, 1387–1402.
7. Kim, D.K.; Suezawa, T.; Mega, T.; Kikuchi, H.; Yoshikawa, E.; Baron, P.; Ushio, T. Improving precipitation nowcasting using a three-dimensional convolutional neural network model from multi parameter phased array weather radar observations. Atmos. Res. 2021, 262, 105774.
8. Sun, J.; Xue, M.; Wilson, J.W.; Zawadzki, I.; Ballard, S.P.; Onvlee-Hooimeyer, J.; Joe, P.; Barker, D.M.; Li, P.W.; Golding, B.; et al. Use of NWP for nowcasting convective precipitation: Recent progress and challenges. Bull. Am. Meteorol. Soc. 2014, 95, 409–426.
9. Bauer, P.; Thorpe, A.; Brunet, G. The quiet revolution of numerical weather prediction. Nature 2015, 525, 47–55.
10. Yano, J.I.; Ziemiański, M.Z.; Cullen, M.; Termonia, P.; Onvlee, J.; Bengtsson, L.; Carrassi, A.; Davy, R.; Deluca, A.; Gray, S.L.; et al. Scientific challenges of convective-scale numerical weather prediction. Bull. Am. Meteorol. Soc. 2018, 99, 699–710.
11. Germann, U.; Galli, G.; Boscacci, M.; Bolliger, M. Radar precipitation measurement in a mountainous region. Q. J. R. Meteorol. Soc. J. Atmos. Sci. Appl. Meteorol. Phys. Oceanogr. 2006, 132, 1669–1692.
12. Sokol, Z.; Mejsnar, J.; Pop, L.; Bližňák, V. Probabilistic precipitation nowcasting based on an extrapolation of radar reflectivity and an ensemble approach. Atmos. Res. 2017, 194, 245–257.
13. Zhang, L.; Wei, M.; Li, N.; Zhou, S.H. Application of the improved optical flow method in echo extrapolation forecasting. Sci. Technol. Eng. 2014, 14, 133–137+148.
14. Wang, G.L.; Zhao, C.G.; Liu, L.P.; Wang, H.Y. Error analysis of radar echo extrapolation forecasting. Plateau Meteorol. 2013, 32, 874–883.
15. Han, F.; Long, M.S.; Li, Y.A.; Xue, F.; Wang, J.M. Application of recurrent neural networks in radar nowcasting. J. Appl. Meteorol. Sci. 2019, 30, 61–69.
16. Saratha, S.; Tajuddin, W. Logic learning in hopfield networks. Mod. Appl. Sci. 2008, 2, 57–63.
17. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
18. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting; MIT Press: Cambridge, MA, USA, 2015; Volume 28, pp. 802–810.
19. Shi, X.; Gao, Z.; Lausen, L.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Deep learning for precipitation nowcasting: A benchmark and a new model. Adv. Neural Inf. Process. Syst. 2017, 30, 5617–5627.
20. Wang, Y.; Long, M.; Wang, J.; Gao, Z.; Yu, P.S. PredRNN: Recurrent neural networks for predictive learning using spatiotemporal LSTMs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 879–888.
21. Wang, Y.; Gao, Z.; Long, M.; Wang, J.; Yu, P.S. PredRNN++: Towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 5123–5132.
22. Wang, Y.; Zhang, J.; Zhu, H.; Long, M.; Wang, J.; Yu, P.S. Memory in memory: A predictive neural network for learning higher-order non-stationarity from spatiotemporal dynamics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9154–9162.
23. Huang, Q.; Chen, S.; Tan, J. TSRC: A deep learning model for precipitation short-term forecasting over China using radar echo data. Remote Sens. 2023, 15, 142.
24. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
25. Gao, Z.; Shi, X.; Wang, H.; Zhu, Y.; Wang, Y.B.; Li, M.; Yeung, D.Y. Earthformer: Exploring space-time transformers for earth system forecasting. Adv. Neural Inf. Process. Syst. 2022, 35, 25390–25403.
26. Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. Transunet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306.
27. Yang, Y.; Mehrkanoon, S. AA-TransUNet: Attention Augmented TransUNet For Nowcasting Tasks. arXiv 2022, arXiv:2202.04996.
28. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680.
29. Xu, Z.; Du, J.; Wang, J.; Jiang, C.; Ren, Y. Satellite Image Prediction Relying on GAN and LSTM Neural Networks. In Proceedings of the ICC 2019 - 2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019; pp. 1–6.
30. Ravuri, S.; Lenc, K.; Willson, M.; Kangin, D.; Lam, R.; Mirowski, P.; Fitzsimons, M.; Athanassiadou, M.; Kashem, S.; Madge, S.; et al. Skilful precipitation nowcasting using deep generative models of radar. Nature 2021, 597, 672–677.
31. Zhang, Y.; Long, M.; Chen, K.; Xing, L.; Jin, R.; Jordan, M.I.; Wang, J. Skilful nowcasting of extreme precipitation with nowcastnet. Nature 2023, 619, 526–532.
32. Gao, Z.; Shi, X.; Han, B.; Wang, H.; Jin, X.; Maddix, D.; Zhu, Y.; Li, M.; Wang, Y. Prediff: Precipitation nowcasting with latent diffusion models. arXiv 2023, arXiv:2307.10422.
33. Zhao, Z.; Dong, X.; Wang, Y.; Hu, C. Advancing realistic precipitation nowcasting with a spatiotemporal transformer-based denoising diffusion model. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4102115.
34. Chen, T. On the importance of noise scheduling for diffusion models. arXiv 2023, arXiv:2301.10972.
35. Hoogeboom, E.; Heek, J.; Salimans, T. simple diffusion: End-to-end diffusion for high resolution images. arXiv 2023, arXiv:2301.11093.
36. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015; Navab, N., Hornegger, J., Wells, W., Frangi, A., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2015; Volume 9351, pp. 234–241.
37. Agrawal, S.; Barrington, L.; Bromberg, C.; Burge, J.; Gazen, C.; Hickey, J. Machine learning for precipitation nowcasting from radar images. arXiv 2019, arXiv:1912.12132.
38. Zhang, C.Q. Research on Weather Radar Echo Extrapolation Algorithm Based on Deep Learning. Master’s Thesis, Nanjing University of Information Science and Technology, Nanjing, China, 2023.
39. Lebedev, V.; Ivashkin, V.; Rudenko, I.; Ganshin, A.; Molchanov, A.; Ovcharenko, S.; Grokhovetskiy, R.; Bushmarinov, I.; Solomentsev, D. Precipitation nowcasting with satellite imagery. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2680–2688.
40. Pan, X.; Lu, Y.; Zhao, K.; Huang, H.; Wang, M.; Chen, H. Improving nowcasting of convective development by incorporating polarimetric radar variables into a deep learning model. Geophys. Res. Lett. 2021, 48, e2021GL095302.
41. Ayzel, G.; Scheffer, T.; Heistermann, M. RainNet v1.0: A convolutional neural network for radar-based precipitation nowcasting. Geosci. Model Dev. 2020, 13, 2631–2644.
42. Trebing, K.; Stańczyk, T.; Mehrkanoon, S. Smaat-unet: Precipitation nowcasting using a small attention-unet architecture. Pattern Recognit. Lett. 2021, 145, 178–186.
43. Heye, A.; Venkatesan, K.; Cain, J. Precipitation nowcasting: Leveraging deep recurrent convolutional neural networks. In Proceedings of the Cray User Group (CUG), Redmond, WA, USA, 11 May 2017.
44. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
45. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
46. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
47. Min, C.; Chen, S.; Gourley, J.J.; Chen, H.; Zhang, A.; Huang, Y.; Huang, C. Coverage of China new generation weather radar network. Adv. Meteorol. 2019, 2019, 5789358.
48. Horn, B.K.P.; Schunck, B.G. Determining optical flow. Artif. Intell. 1981, 17, 185–203.
49. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
50. Tan, J.; Huang, Q.; Chen, S. Deep learning model based on multi-scale feature fusion for precipitation nowcasting. Geosci. Model Dev. 2024, 17, 53–69.

Figure 1. The topography of China and the distribution map of the new-generation weather radar network. White dots represent C-band radars and red dots represent S-band radars [47].

Figure 2. The framework of the short-term forecasting model based on residual and attention mechanisms.

Figure 3. Convolutional block attention module (CBAM) [46].

Figure 4. The variation trends of MAE and SSIM with forecast lead time.

Figure 5. The variation trends of CSI and FAR with forecast lead time at different thresholds.

Figure 6. Forecast results for Case 1 starting from 20 April 2024, 12:48:00 UTC (with a forecast area covering a latitude and longitude range of approximately 100°E–128°E, 18°N–48°N); (top): the first row shows the ground truth, while the subsequent rows display the predictions from different models; (bottom): the variation trends of MAE with forecast lead time.

Figure 7. Forecast results for Case 2 starting from 25 May 2024, 14:24:00 UTC (with a forecast area covering a latitude and longitude range of approximately 95°E–135°E, 18°N–52°N); (top): the first row shows the ground truth, while the subsequent rows display the predictions from different models; (bottom): the variation trends of MAE with forecast lead time.

Table 1. Model parameter settings.

Parameter	Default Value	Description
Initial learning rate	0.001	Controls the speed of model learning. If too small, convergence is slow; if too large, the loss may oscillate or increase.
Optimizer	Adam	The Adam optimization algorithm, an adaptive method based on first and second moment estimates. The default parameters are: learning rate = 0.001, β₁ = 0.9, β₂ = 0.999.
Maximum Iterations	100	The maximum number of training iterations, determining the total number of steps for model training.
Batch size	12	The amount of data input to the model during each training step. If too small, gradient fluctuations are large; if too large, generalization ability may decrease.
Training dataset	29,942	Used for model training (80% of the dataset).
Validation dataset	3742	Used for hyperparameter tuning and early stopping (10% of the dataset).
Test dataset	3742	Used for final performance evaluation (10% of the dataset).
Loss function	MSELoss	Mean Squared Error Loss function, used to compute the average squared difference between predicted and actual values.
Learning rate decay strategy	Adaptive learning rate adjustment	If the validation loss does not decrease for 3 consecutive epochs, the learning rate is reduced to 90% of its previous value.
Model Parameters	27,292,672	Storage requirement, indicating hardware storage demand.
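The learning rate decay strategy in Table 1 can be sketched in a framework-independent way as follows. This is an illustration under stated assumptions, not the authors' training code: the class name `PlateauDecay` is hypothetical, and "does not decrease" is interpreted here as "does not improve on the best validation loss seen so far".

```python
class PlateauDecay:
    """Multiply the learning rate by `factor` after `patience` stale epochs."""

    def __init__(self, lr=0.001, factor=0.9, patience=3):
        self.lr = lr              # initial learning rate from Table 1
        self.factor = factor      # reduce to 90% of the previous value
        self.patience = patience  # consecutive epochs without improvement
        self.best = float("inf")
        self.stale = 0

    def step(self, val_loss):
        """Update the state with this epoch's validation loss; return the new rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.lr *= self.factor
                self.stale = 0
        return self.lr


# Toy validation losses: epochs 3 to 5 fail to beat the best value of 0.8.
scheduler = PlateauDecay()
losses = [1.0, 0.8, 0.85, 0.82, 0.81, 0.7]
history = [scheduler.step(loss) for loss in losses]
print(history)
```

In this toy run the rate stays at 0.001 for four epochs and drops to about 0.0009 after the third consecutive epoch without improvement.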

Table 2. Confusion matrix for precipitation prediction.

Precipitation Event	Predicted Rain	Predicted No Rain
Actual rain	hits	misses
Actual no rain	false alarms	Nan
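The entries of Table 2 lead directly to the two categorical scores discussed in the results, using the standard definitions CSI = hits / (hits + misses + false alarms) and FAR = false alarms / (hits + false alarms). The sketch below is a minimal illustration: the function name `csi_far`, the use of `>=` for the threshold comparison, and the toy fields are assumptions rather than details taken from the study.

```python
import numpy as np


def csi_far(pred_dbz, obs_dbz, threshold_dbz):
    """Return (CSI, FAR) for one reflectivity threshold."""
    pred_event = np.asarray(pred_dbz) >= threshold_dbz
    obs_event = np.asarray(obs_dbz) >= threshold_dbz

    hits = int(np.sum(pred_event & obs_event))
    misses = int(np.sum(~pred_event & obs_event))
    false_alarms = int(np.sum(pred_event & ~obs_event))

    csi_den = hits + misses + false_alarms
    far_den = hits + false_alarms
    # Scores are undefined (NaN) when no event is observed or predicted.
    csi = hits / csi_den if csi_den else float("nan")
    far = false_alarms / far_den if far_den else float("nan")
    return csi, far


# Toy 2 x 3 reflectivity fields in dBZ (made-up numbers).
obs = np.array([[45.0, 42.0, 10.0], [38.0, 41.0, 5.0]])
pred = np.array([[44.0, 35.0, 12.0], [43.0, 40.0, 3.0]])

csi, far = csi_far(pred, obs, threshold_dbz=40.0)
print(f"CSI = {csi:.3f}, FAR = {far:.3f}")
```

At the 40 dBZ threshold this toy case has 2 hits, 1 miss, and 1 false alarm. Neither score uses the fourth cell of the table (no rain observed, no rain predicted).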

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

