Article

Performance Evaluation of Numerical Weather Prediction Models in Forecasting Rainfall Events in Kerala, India

V. Nitha 1, S. K. Pramada 1, N. S. Praseed 1 and Venkataramana Sridhar 2,*
1 Department of Civil Engineering, National Institute of Technology, Calicut 673601, Kerala, India
2 Department of Biological Systems Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
* Author to whom correspondence should be addressed.
Atmosphere 2025, 16(4), 372; https://doi.org/10.3390/atmos16040372
Submission received: 19 January 2025 / Revised: 16 March 2025 / Accepted: 21 March 2025 / Published: 25 March 2025

Abstract

Heavy rainfall events are the main cause of flooding, especially in regions such as Kerala, India. Kerala is vulnerable to extreme weather owing to its geographical location along the Western Ghats. Accurate forecasting of rainfall events is essential for minimizing the impact of floods on life, infrastructure, and agriculture, and region-specific evaluations of Numerical Weather Prediction (NWP) model performance are needed to forecast heavy rainfall accurately in this region. This study evaluated the performance of six NWP models (NCEP, NCMRWF, ECMWF, CMA, UKMO, and JMA) in forecasting heavy rainfall events in Kerala. A comprehensive assessment of these models was performed using traditional performance metrics, categorical precipitation metrics, and Fractional Skill Scores (FSSs) across different forecast lead times. FSSs were calculated for three rainfall thresholds (5 mm, 50 mm, and 100 mm). The results reveal that all models captured rainfall patterns well at the lower threshold of 5 mm, but most struggled to forecast heavy rainfall accurately, especially at longer lead times. JMA performed well in most metrics but had a high False Alarm Ratio (FAR), indicating a tendency to predict rainfall events that did not occur. ECMWF demonstrated consistent performance, NCEP and UKMO performed moderately well, and CMA and NCMRWF had the lowest accuracy owing to larger errors and biases. These findings underscore the trade-offs in model performance, suggesting that model selection should depend on the required accuracy and the type of rainfall event to be predicted. This study recommends the use of Multi-Model Ensembles (MME) to improve forecasting accuracy, integrate the strengths of the best-performing models, and reduce biases. Future research could also focus on expanding observational networks and employing advanced data assimilation techniques for more reliable predictions, particularly in regions with complex terrain such as Kerala.

1. Introduction

Heavy rainfall events are one of the primary contributors to flooding, which poses significant risks to life, infrastructure, and agriculture across many regions of the world [1]. Both seasonal-monsoon-driven extreme events and climate-change-induced sea level rise and storm surge are commonly linked with floods and landslides [2]. In several tropical countries including India, where monsoon-driven rainfall is a crucial part of the climate system, accurate forecasting of rainfall and distribution of good quality data are critical for minimizing flood-related impacts and enhancing disaster monitoring and management strategies [3].
The state of Kerala and adjacent areas in the Western Ghats region, located on the southwestern coast of India, are particularly vulnerable to extreme weather events, including heavy rainfall and landslides, owing to their geographical location and topographical features [4]. These rainfall events often lead to widespread flooding and inundation, necessitating robust forecasting systems capable of predicting such events with high accuracy and reliability. The 2018 flood in Kerala was the most devastating in a century, the previous comparable flood having occurred during the monsoon of 1924 [5]. In recent years, there has been a notable rise in extreme rainfall events around the globe. Because the accuracy of heavy rainfall forecasts is the key factor in issuing flood alerts, predicting these events reliably is crucial for minimizing damage and preventing loss of life [6].
Rainfall forecasts are currently provided using conventional methods such as satellite observations, weather radars, and Numerical Weather Prediction (NWP) models [7,8]. Among these, satellite and weather radar observations provide qualitative forecasts, while NWP models provide quantitative forecasts [9,10]. A preliminary assessment of Global Precipitation Measurement (GPM)-based precipitation estimates over India for the 2014 monsoon season highlighted that all products struggle to detect precipitation in orographic regions such as northeast and southeast India [11]. NWP models are physically based models represented by governing equations, processes, and parameters that cover land, atmosphere, and ocean conditions [12], and the outputs from these models are compared and used to assess impacts across multiple spatial and temporal scales [13]. Ensembles of NWP output have been used since the early 1990s, and it is widely accepted that ensemble forecasts offer probabilistic insights into the limitations caused by unavoidable errors in the initial conditions that deterministic forecasts alone cannot provide [14,15,16].
NWP model verification plays a crucial role in meteorological research and operational forecasting activities [17]. When verification methodologies are carefully designed, the results can effectively address the needs of various stakeholders, including modelers, forecasters, and users of forecast information. Durai et al. evaluated two resolutions (T574 and T382) of the National Centers for Environmental Prediction (NCEP) Global Forecast System over India during the 2011 summer monsoon and found that both showed skill in capturing heavy rainfall regions, with T574 outperforming T382, although both exhibited biases in lower- and upper-tropospheric moisture and circulation [18]. Forecasting heavy rainfall using Quantitative Precipitation Forecasting (QPF) remains a significant challenge, even for advanced high-resolution NWP models [19,20].
Sharma et al. concentrated on evaluating the effectiveness of the United Kingdom Met Office Unified Model (UKMO) in forecasting intense rainfall events across India. The predictions effectively represented the overall characteristics of average monsoon rainfall, particularly highlighting higher precipitation levels along India’s western coast [21]. Ashrit et al. evaluated the accuracy of NWP models utilized by the National Centre for Medium-Range Weather Forecasting (NCMRWF) in forecasting the extreme rainfall that occurred in Kerala in August 2018 [22]. The performance of five global NWP models (National Centre for Unified Modelling (NCUM), UKMO, India Meteorological Department Global Forecast System (IMD GFS), NCEP, and European Centre for Medium-Range Weather Forecasts (ECMWF)) was assessed in forecasting daily rainfall over India during the 2020 monsoon season using both traditional and advanced verification methods. While all models accurately captured large-scale monsoon patterns, forecast accuracy decreased with lead time, with ECMWF performing best overall, though regional performance varied, and models struggled to predict localized, high-intensity rainfall events [23].
The performance of the Global Forecasting System (GFS) and Weather Research and Forecasting (WRF) models for forecasting monsoon rainfall over India during the summer monsoon season of 2014 was evaluated. The GFS model showed reasonable accuracy in predicting large-scale rainfall features and significant improvement from 2013 to 2014, particularly for longer forecast periods. In contrast, the WRF model exhibited a consistent over-prediction in rainfall, highlighting the need for bias correction and improved physical parameterization schemes for better monsoon predictions [24]. ECMWF, NCEP, and UKMO models were assessed for forecasting extreme rainfall over India, and they could predict rainstorms up to five days in advance with biases in spatial distribution and intensity. NCEP showed smaller spread but less accurate averages, and all models exhibited under-prediction and increased errors in longer forecasts [6]. Venkat Rao et al. evaluated the NCEP model for rainfall forecasting in the Nagavali and Vamsadhara river basins and found good performance with correlation coefficients > 0.3 and probability of detection > 0.6 for day-1 and day-3 forecasts. Bias analysis showed a shift from overestimation to underestimation with increasing lead time. Bias correction improved RMSE by >18% for day-1 forecasts, offering insights for flood forecasting, early warning, and water management [25]. The object-based verification method reveals that the WRF model overproduces large rain areas and underestimates the diurnal cycle of rainfall, with a positive size bias, particularly in the afternoon. However, this method is sensitive to object size, which can lead to inaccurate results for smaller-scale events [26]. Another evaluation of the GFS T1534 model for the 2016–2017 monsoon seasons reveals a wet bias over land and overestimation of lighter rainfall, while underestimating heavier rainfall [17]. Singhal et al. conducted an inter-comparison of four gridded quantitative precipitation forecasts—ECMWF, Japan Meteorological Agency (JMA), NCMRWF, and UKMO—over the Ganga River basin in India and found ECMWF to be a suitable substitute for NCMRWF in detecting spatial patterns of extreme precipitation. In terms of NSE, JMA outperforms the other three NWP models. Additionally, JMA exhibited similar patterns of probability of detection when compared to ECMWF [27]. Sonawane et al. evaluated the JMA model’s ability to predict monsoon rainfall over 32 years. The study revealed that the model could effectively represent fluctuations in Indian summer monsoon precipitation, especially when incorporating influences such as El Niño and the Indian Ocean Dipole concurrently [28].
Over the years, a variety of spatial verification methods have been developed to more effectively assess the performance of high-resolution forecasts [29]. Gallus et al. noted that spatial verification provides more detailed and relevant metrics of forecast skill, offering a clearer picture of the accuracy of high-resolution predictions. Neighborhood or fuzzy methods evaluate forecast accuracy within space-time neighborhoods. In these methods, all grid-scale values surrounding an observation are treated as equally plausible estimates of the true value. “Neighborhood verification” assesses forecast skill by varying neighborhood sizes and conducting verification across different spatial scales and intensity thresholds [30]. The Fractions Skill Score (FSS), introduced by Roberts and Lean, exemplifies such fuzzy spatial verification techniques. Instead of a direct grid-to-grid comparison, FSS evaluates forecast accuracy within local neighborhoods of observations [31,32].
The performance of the NWP models can vary depending on several factors, including forecast lead time, precipitation thresholds, and regional geographical characteristics. Kerala, situated along the Western Ghats, experiences complex orographic influences that significantly affect heavy rainfall events. The interaction between the steep terrain and moisture-laden low-level westerly winds from the Arabian Sea enhances precipitation, leading to highly localized and intense rainfall. The intensity and variability of the monsoon flow further influence precipitation patterns, making accurate forecasting particularly challenging. Despite Kerala’s vulnerability to extreme rainfall and associated flood risks, there is a lack of region-specific evaluation of rainfall forecast models, limiting efforts to enhance prediction accuracy under its unique climatic conditions. Given the critical need for reliable heavy rainfall forecasting, this research aims to evaluate the ability of different NWP models in the short-term forecasting of heavy rainfall events in Kerala, India, and to assess their performance using a range of evaluation metrics. By focusing on traditional performance metrics, categorical precipitation metrics, and fractional skill scores, this study provides a comprehensive understanding of the strengths and weaknesses of various models in predicting heavy rainfall across different forecast periods.
The outcomes of this analysis are intended to guide the selection of appropriate NWP models for operational forecasting, particularly in the context of short-term forecasting, which is crucial for timely flood warning systems. Furthermore, the insights from this study are expected to contribute to improving the overall accuracy and reliability of rainfall forecasts, ultimately aiding in the better management of flood risks in vulnerable regions like Kerala.

2. Study Area and Data Sources

Kerala, situated between 8° and 13° N latitude and 74° to 78° E longitude (Figure 1), occupies the southwestern edge of the Indian subcontinent. The state features a highly varied topography characterized by coastal plains, mid-elevation hills, and the rugged Western Ghats, which rise to over 2695 m. This geographical diversity significantly influences Kerala’s climatic and hydrological dynamics. The state experiences a humid tropical monsoon climate dominated by two principal rainfall seasons: the Southwest Monsoon (June–September), contributing over 70% of the annual precipitation, and the Northeast Monsoon (October–December) [33,34,35]. Kerala’s dense river network, comprising 44 rivers, supports its hydrological system. The Western Ghats act as an orographic barrier, intensifying monsoon rainfall and influencing large-scale atmospheric processes like integrated vapor transport and atmospheric river activity. This complex interaction between terrain, monsoon dynamics, and hydrological processes makes Kerala an ideal study area for examining extreme precipitation events, which are important for water resource management in a changing climate.
The IMD and the NCMRWF collaboratively developed a merged satellite–gauge rainfall dataset to improve monsoon rainfall analysis and numerical model validation. This dataset, known as the IMD-NCMRWF merged satellite–gauge product, integrates GPM-based near real-time multi-satellite precipitation estimates with IMD’s dense rain gauge network across India, enhancing accuracy [36,37]. This product is recognized as one of the most reliable gridded rainfall datasets for the Indian region, and the IMD-NCMRWF merged satellite–gauge product is widely utilized in hydro-meteorological research and forecasting applications [38,39,40]. Studies have demonstrated its effectiveness in capturing tropical cyclone (TC) rainfall over India, making it a valuable resource for extreme weather studies [41]. For this study, the gridded rainfall used to validate NWP model rainfall forecasts is the IMD-NCMRWF merged (Satellite + Gauge) dataset for the monsoon season (June–September) from 2018 to 2022, available at a 0.25° × 0.25° grid resolution.
The daily precipitation forecasts from six NWP models were downloaded from the ECMWF TIGGE Data Retrieval website. These models were the NCEP, NCMRWF, ECMWF, China Meteorological Administration (CMA) GFS, UKMO, and JMA. The forecasts cover 1-day, 2-day, and 3-day rainfall predictions, with a 0.25° grid resolution used in this study. The data span the monsoon season (June–September) from 2018 to 2022 and specifically focus on Kerala. Both the observed and past forecast datasets were taken in gridded format, covering the entire Kerala state, and were then considered for further performance evaluation. The dataset consists of a grid of latitude and longitude points ranging from 8° N to 13° N and 74° E to 78° E, with a 0.25° resolution, ensuring consistent spatial coverage across the study region.
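As a minimal sketch of this preprocessing step (not the authors' code: the file names, variable names, and coordinate labels below are assumptions), the two gridded products could be subset to the common Kerala domain and the monsoon months with xarray as follows.

```python
# Hypothetical preprocessing sketch: align observed and forecast grids over Kerala.
import xarray as xr

# 0.25-degree study domain (assumes latitude and longitude are stored in ascending order)
KERALA = dict(lat=slice(8.0, 13.0), lon=slice(74.0, 78.0))

# File and variable names are placeholders for the IMD-NCMRWF merged data and a TIGGE forecast.
obs = xr.open_dataset("imd_ncmrwf_merged_2018_2022.nc")["rainfall"].sel(**KERALA)
fcst = xr.open_dataset("tigge_model_day1_2018_2022.nc")["tp"].sel(**KERALA)

# Keep only the monsoon months (June-September) before computing any verification metric.
obs_jjas = obs.sel(time=obs["time"].dt.month.isin([6, 7, 8, 9]))
fcst_jjas = fcst.sel(time=fcst["time"].dt.month.isin([6, 7, 8, 9]))
```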

3. Methodology

This study evaluated the accuracy of six different operational forecast models: NCEP, NCMRWF, ECMWF, CMA, UKMO, and JMA. The models’ performance was analyzed for their ability to accurately predict rainfall in Kerala during the monsoon season (June–September) from 2018 to 2022. Table 1 below provides details of the six operational models evaluated in this study (source: https://confluence.ecmwf.int/display/TIGGE/Models (accessed on 21 November 2024)). Model performance was evaluated using several accuracy and skill measures for day-1 to day-3 forecasts of rainfall over Kerala.
The overall methodology for this study is summarized in Figure 2. Different verification methods were used in the past to evaluate the performance of different NWP models. It is crucial not to focus solely on a particular method. Even models with an overall strong performance may not always be the best fit for every specific application or region [23,42]. The Traditional Verification Methods, Categorical Statistics Verification Methods, and Fractional Skill Score (FSS) were used in this study to assess precipitation forecast accuracy of NWP models.

3.1. Traditional Verification Methods

Traditional methods, namely, root mean square error (RMSE), anomaly correlation coefficient (ACC), mean error (ME), and relative bias (RBIAS), quantify errors and biases and help to identify discrepancies between observed and predicted rainfall. The mathematical equations for RMSE, ACC, RBIAS, and ME are presented in Table 2. RMSE quantifies the average magnitude of the error, giving more weight to larger discrepancies. The ACC measures the similarity of the forecasted precipitation anomalies (deviations from the mean) with the observed anomalies, indicating how well the forecast captures precipitation patterns, especially during extreme events [25,43,44]. RBIAS is particularly important for assessing systematic biases and identifying over- or under-prediction trends in precipitation forecasts, and it is essential for flood forecasting and model calibration [25,43,44]. The ME represents the average signed difference between the forecasted and reference datasets.
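The sketch below illustrates how these four metrics can be computed for paired daily series; it is not the authors' code, the input values are dummy numbers, and the RBIAS form mirrors the per-point relative difference given in Table 2.

```python
# Illustrative computation of the traditional verification metrics (Table 2).
import numpy as np

def traditional_metrics(obs, fcst):
    """Return RMSE, ACC, RBIAS (%), and ME for matched rainfall series (mm/day)."""
    obs = np.asarray(obs, dtype=float)
    fcst = np.asarray(fcst, dtype=float)

    rmse = np.sqrt(np.mean((obs - fcst) ** 2))      # penalizes large discrepancies
    me = np.mean(fcst - obs)                        # mean (signed) error
    rbias = np.mean((fcst - obs) / obs) * 100.0     # relative bias in per cent (obs must be non-zero)

    # Anomaly correlation: correlation of departures from the respective means
    o_anom, f_anom = obs - obs.mean(), fcst - fcst.mean()
    acc = np.sum(o_anom * f_anom) / np.sqrt(np.sum(o_anom ** 2) * np.sum(f_anom ** 2))
    return rmse, acc, rbias, me

# Dummy example (mm/day)
print(traditional_metrics(obs=[12.0, 30.5, 5.2, 80.0, 2.6], fcst=[15.0, 25.0, 8.0, 60.0, 4.0]))
```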

3.2. Categorical Statistics Verification Methods

Categorical methods, namely, Probability of Detection (POD), False Alarm Ratio (FAR), Critical Success Index (CSI), and Equitable Threat Score (ETS), focus on identifying rainfall events and false alarms, which is crucial for operational forecasting and flood preparedness [25,43,45]. These verification methods evaluate the performance of precipitation forecasts by comparing predicted and observed events using contingency tables (Table 3), categorizing forecasts into hits, misses, false alarms, and correct negatives to calculate skill scores that assess forecast accuracy and reliability. POD measures the ability to correctly forecast rainfall events; FAR quantifies over-predictions; CSI represents the proportion of correctly forecasted rainfall events among all occurrences; and ETS accounts for hits due to chance. Equations for calculating these skill scores are presented in Table 4. These methods typically require a threshold (1 mm/day) to classify events as rain or no-rain, ensuring consistency in verification [25,39,43,45,46,47,48,49,50,51,52].
Here, Hits (H) represents correctly predicted rainfall events; False Alarms (F) indicates instances where rainfall was forecasted but did not occur; Misses (M) refers to observed rainfall events that were not predicted by the forecast; and True negatives (T) denote cases where no rainfall was observed, and the forecast correctly predicted no rainfall.
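A short sketch of how the contingency-table counts (Table 3) and the skill scores of Table 4 could be obtained from gridded daily rainfall is given below; the 1 mm/day threshold follows the text above, while the function and variable names are illustrative only.

```python
# Illustrative contingency-table verification for rain/no-rain events.
import numpy as np

def categorical_scores(obs, fcst, threshold=1.0):
    """POD, FAR, CSI, and ETS for matched rainfall fields (mm/day)."""
    obs_event = np.asarray(obs) >= threshold
    fcst_event = np.asarray(fcst) >= threshold

    hits = np.sum(fcst_event & obs_event)              # H
    false_alarms = np.sum(fcst_event & ~obs_event)     # F
    misses = np.sum(~fcst_event & obs_event)           # M
    true_negatives = np.sum(~fcst_event & ~obs_event)  # T
    total = hits + false_alarms + misses + true_negatives

    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + false_alarms + misses)
    expected_hits = (hits + misses) * (hits + false_alarms) / total  # hits expected by chance
    ets = (hits - expected_hits) / (hits + false_alarms + misses - expected_hits)
    return pod, far, csi, ets

# Dummy example (mm/day)
print(categorical_scores(obs=[0.2, 5.0, 12.0, 0.0, 3.0, 0.5], fcst=[1.5, 4.0, 0.3, 0.0, 6.0, 2.0]))
```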

3.3. Fractional Skill Score (FSS)

Fractional Skill Score (FSS) assesses how well the forecasted rainfall matches the observed rainfall over a specified threshold and spatial domain, which is valuable for extreme weather and localized rainfall predictions [46,47]. The evaluation focused on three rainfall thresholds (5.0 mm, 50.0 mm, and 100.0 mm). The Fractional Brier Score (FBS) and the worst possible Fractional Brier Score ($\mathrm{FBS}_{\mathrm{worst}}$) were computed for each model and threshold by comparing forecasted and observed rainfall fractions. The FSS was then derived to assess model accuracy, with higher FSS indicating better performance. FSS values range from 0 (worst possible forecast) to 1 (perfect forecast) [46,47,53,54,55].
$$\mathrm{FSS} = 1 - \frac{\mathrm{FBS}}{\mathrm{FBS}_{\mathrm{worst}}}\tag{1}$$
$$\mathrm{FBS} = \frac{1}{N}\sum_{i=1}^{N}\left(O_i - F_i\right)^{2}\tag{2}$$
where $F_i$ and $O_i$ are the forecast and observed fractions, respectively, at each point $i$ and have values between 0 and 1, and $N$ is the number of pixels in the verification area.
$$\mathrm{FBS}_{\mathrm{worst}} = \frac{1}{N}\left[\sum_{i=1}^{N}O_i^{2} + \sum_{i=1}^{N}F_i^{2}\right]\tag{3}$$
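The following is a sketch of Equations (1)–(3) for a single two-dimensional rainfall field; the square moving-window average used to obtain the neighborhood fractions and the use of SciPy's uniform_filter are illustrative choices, not necessarily those used in this study.

```python
# Illustrative Fractions Skill Score for one forecast/observation field pair.
import numpy as np
from scipy.ndimage import uniform_filter

def fss(obs_field, fcst_field, threshold, window):
    """FSS (Eqs. 1-3) for 2-D rainfall fields at one threshold and neighborhood size."""
    obs_bin = (np.asarray(obs_field) >= threshold).astype(float)
    fcst_bin = (np.asarray(fcst_field) >= threshold).astype(float)

    # Neighborhood fractions O_i and F_i: mean of the binary field within each window
    o_frac = uniform_filter(obs_bin, size=window, mode="constant")
    f_frac = uniform_filter(fcst_bin, size=window, mode="constant")

    fbs = np.mean((o_frac - f_frac) ** 2)                     # Equation (2)
    fbs_worst = np.mean(o_frac ** 2) + np.mean(f_frac ** 2)   # Equation (3)
    return 1.0 - fbs / fbs_worst                              # Equation (1)

# Dummy grids evaluated at the 5 mm threshold with a 3x3 neighborhood
rng = np.random.default_rng(42)
obs = rng.gamma(shape=2.0, scale=6.0, size=(20, 16))
fc = rng.gamma(shape=2.0, scale=6.0, size=(20, 16))
print(fss(obs, fc, threshold=5.0, window=3))
```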

4. Results and Discussion

4.1. Traditional Verification Methods

The analysis of forecast performance reveals several key trends and insights across the NWP models. Key metrics such as the anomaly correlation coefficient (ACC), root mean square error (RMSE), mean error (ME), and relative bias (RBIAS) are used to evaluate each model's ability to predict rainfall accurately, and Figure 3 compares forecast performance across the NWP models using these metrics. A key observation from Figure 3 is the systematic decline in forecast skill with increasing lead time, reflected in decreasing ACC values and rising RMSE across all models from Day 1 to Day 3. Several studies have reported similar findings, emphasizing the challenges of maintaining forecast reliability as lead time increases [6,25,27,56,57,58]. This decline may be due to errors in the initial conditions, assumptions in the representation of physical processes, and the chaotic nature of the atmosphere, whose effects grow over time and reduce the agreement between predicted and observed rainfall. Among the models, JMA excels in short-term forecasting, achieving the highest ACC and lowest RMSE, which agrees with the findings of Singhal et al. [27]. ECMWF demonstrates strong performance, though its accuracy declines slightly over time. The first-day average ACC for NCEP is approximately 0.34, which aligns with similar studies conducted in the region [6,25]. An ACC greater than 0.3 is regarded as a reliable indicator of precipitation forecast accuracy [24]. NCMRWF shows lower ACC and higher RMSE, indicating poor forecast skill. NCEP and UKMO show moderate performance across all three forecast days.
A positive RBIAS indicates overestimation (the forecast is higher than observed). All models generally over-predicted precipitation, as indicated by their positive RBIAS values. A possible reason for this overprediction is that the models overestimate the area experiencing wet conditions, so rainfall is predicted over a larger region than actually occurred [59,60,61]. Additionally, a significant bias was evident in all models over the Arabian Sea near the Indian west coast [23]. ECMWF had the lowest RBIAS values, NCEP and UKMO showed moderate errors [61], and CMA exhibited the highest RBIAS, indicating significant forecast inaccuracies.
Among the models, JMA performed best, achieving the highest ACC values on all three days (0.394 on Day 1), indicating its superior ability to capture precipitation patterns, which agrees with earlier studies [27,28]. It also achieved the lowest RMSE across all days (14.73 on Day 1 and 15.84 on Day 3), reflecting minimal error magnitudes. While its ME values remained low, highlighting accurate predictions, JMA's RBIAS, though moderate (38.92 on Day 1, improving to 29.13 on Day 3), suggests some room for bias correction. ECMWF demonstrated superior consistency, making it more reliable for predictions. NCEP and UKMO displayed moderate errors, while CMA and NCMRWF struggled to maintain forecast quality. Table 5, Table 6 and Table 7 present the comparative rankings of the NWP models for Day 1, Day 2, and Day 3 forecasts, respectively, based on key performance metrics.

4.2. Categorical Statistics Verification Methods

The performance of NWP models in categorical precipitation verification metrics Probability of Detection (POD), False Alarm Ratio (FAR), Critical Success Index (CSI), and Equitable Threat Score (ETS) for Days 1, 2, and 3 revealed critical insights into their forecasting skill. Figure 4 compares categorical precipitation verification metrics across different NWP models, highlighting variations in performance.
Across all models, a systematic decrease in POD, CSI, and ETS was observed from Day 1 to Day 3, which aligns with the expected decline in forecast skill with increasing lead time due to the chaotic nature of atmospheric dynamics and accumulating uncertainties in the models [24,25,27,49,58]. FAR remained relatively consistent for most models, indicating that false alarms persist at similar levels regardless of lead time and reflecting the challenge of refining detection thresholds [27]. ECMWF and JMA showed high POD, CSI, and ETS, indicating good extreme event detection. The ECMWF model demonstrated a more balanced performance, with consistently high POD values coupled with a relatively moderate FAR (0.41 on all days). This balance resulted in competitive CSI scores and the highest ETS scores among all models [27,58,61]. The relatively smaller decline in ETS compared with the other models further underscores ECMWF's robustness over increasing lead times. JMA had the highest FAR (0.45–0.47), which suggests that it may predict rainfall events that do not occur. The overall CSI values across all models ranged from 0.49 to 0.57 over the three-day period, which aligns with other studies [25].
A few studies have reported a POD above 0.8 for CMA, indicating a strong capability in detecting rainfall events [24,25]. Conversely, in this study, the CMA model consistently underperformed, with the lowest POD values, suggesting limited skill in detecting precipitation events. Its FAR values were comparable to those of the other models, but its CSI and ETS scores were the lowest across all days. CMA faced significant challenges across all metrics, underscoring the need for improvements in its model physics and assimilation methods. These challenges are further compounded in regions with complex topography, where models often fail to capture localized extreme weather events. Similar issues have been reported in other studies conducted in comparable geographical regions [44,61,62,63]. The findings highlight the inherent trade-offs in precipitation forecasting, where achieving a balance between detection, accuracy, and reliability is key to improving forecast skill.
This analysis highlights the variability in model performance and emphasizes the contribution of data assimilation schemes to forecast inaccuracies. Since each model employs a unique assimilation technique, their effectiveness in representing key atmospheric processes differs. This indicates that the reliability of extreme event predictions depends on how well a model integrates observational data and captures essential physical mechanisms [44,63,64,65]. Poor verification scores combined with a high False Alarm Ratio (FAR exceeding 0.5) indicate significant challenges in existing microphysics schemes when simulating warm-rain extreme events in weakly forced synoptic conditions. These shortcomings emphasize the need for improvements in the representation of cloud-to-drizzle autoconversion and accretion processes [50,66]. The sources of error in each model vary with the specific weather conditions, influencing their ability to forecast precipitation accurately. For stratiform rainfall, errors often arise from inaccuracies in cloud microphysics and large-scale dynamics, leading to misrepresentation of precipitation intensity and spatial distribution. In the case of convective rainfall, deficiencies in convection schemes can result in poor simulation of localized thunderstorms, causing under-prediction or false alarms. Orographic rainfall, which is strongly influenced by terrain interactions, presents challenges owing to coarse model resolution and inadequate representation of topographically induced lifting. Seasonal variations also contribute to model biases, as monsoon dynamics, moisture availability, and large-scale circulation patterns affect precipitation predictability. Addressing these errors requires improvements in physical parameterizations, higher-resolution models, and better data assimilation techniques to enhance rainfall forecasting accuracy across different weather conditions.

4.3. Fractional Skill Score (FSS)

Extreme rainfall events can significantly impact model performance. This effect is assessed by computing skill scores across various rainfall thresholds [58]. FSS is calculated using Equations (1)–(3) to evaluate the performance of the six NWP models for rainfall forecasting. The Fractional Skill Score (FSS) values for the first, second, and third forecast days were compared across the models for three rainfall thresholds (5.0 mm, 50.0 mm, and 100.0 mm). Figure 5 illustrates the comparison of FSS across NWP models for three thresholds.
FSS measures the model's ability to match observed data, with values closer to 1 indicating better skill. In general, an FSS value greater than 0.5 is considered a reliable indicator of a forecast's effectiveness [55]. As the forecast period extends, FSS decreases, reflecting a loss of forecast skill over time. This decline reflects atmospheric instability, in which small initial errors grow and make longer-range forecasts less accurate; limited model resolution and boundary condition errors also contribute, as their effects become more significant with time, and the parameterization of physical processes such as convection adds further uncertainty at longer ranges. FSS also decreased as the rainfall threshold increased. This pattern was evident across all models, with FSS values higher at lower thresholds and lower at higher thresholds [23], reflecting the general difficulty of predicting heavy rainfall events accurately compared with lighter ones, as the complexity of extreme events reduces model performance. Overall, the NCEP and UKMO models showed higher forecast skill (FSS), especially for short-term forecasts and lower thresholds. The ECMWF and JMA models, however, exhibited a significant decline in forecast skill, particularly at higher thresholds and longer periods [23]. The findings highlight the importance of considering both the forecast period and the threshold when choosing a model.
As the rainfall threshold decreased, models became more aligned in predicting light to moderate rainfall, since they rely on large-scale atmospheric patterns that are similar across models. This results in FSS values converging, suggesting that river managers can expect consistent forecast dynamics for storms that require less precision in capturing small-scale details [53]. However, at higher thresholds (e.g., 50, 100 mm), small-scale processes like convection and orography become more important, leading to variations in model performance and divergence in FSS values. Therefore, the models’ ability to predict extreme events differs more at higher thresholds, while they are more consistent in forecasting lower-intensity rainfall.

5. Summary

The evaluation of six NWP models, namely, NCEP, NCMRWF, ECMWF, CMA, UKMO, and JMA, reinforces the importance of accurate precipitation forecasts. The analysis of NWP model performance across various metrics, including forecast accuracy, precipitation detection, and forecast skill, reveals key insights into the capabilities and limitations of different models as forecast lead time and rainfall thresholds increase. While all models captured rainfall patterns well, they struggled with accurately forecasting heavy rainfall events, particularly as forecast lead times and rainfall thresholds increased. Heavy precipitation events, especially those influenced by complex terrains such as the Western Ghats along India’s west coast, significantly impact model performance. The interaction between the mountain range and moisture-laden low-level westerly winds from the Arabian Sea enhances rainfall in this region, where the intensity of the large-scale monsoon flow influences both low-level wind patterns and the associated precipitation. This region is crucial for monsoon studies due to its significant role in modulating rainfall patterns.
All models showed declining accuracy as the forecast lead time increased. This pattern is a commonly observed feature of NWP models, which can be attributed to the complex and unpredictable nature of atmospheric processes and the growing uncertainty in the initial conditions as the forecast horizon lengthens. By analyzing traditional performance and categorical precipitation metrics, it was observed that JMA performed best but suffered from false alarms, while ECMWF demonstrated strong overall performance in different metrics. NCEP and UKMO had moderate performance, while CMA and NCMRWF showed very poor performance.
For rainfall thresholds, all models faced challenges as the threshold increased, particularly for extreme rainfall events. JMA and ECMWF showed better performance in the traditional and categorical precipitation metrics; however, in terms of FSS, ECMWF and JMA showed a more pronounced decline in forecast skill as the rainfall threshold increased, particularly for extreme rainfall events.
Incorporating higher-resolution models and advanced data assimilation techniques can significantly improve the accuracy of extreme rainfall forecasts. High-resolution models capture finer-scale atmospheric processes and better represent local terrain influences, which is crucial for regions like Kerala with complex topography. Additionally, data assimilation techniques, such as four-dimensional variational (4D-Var) and ensemble Kalman filtering (EnKF), enhance initial conditions by integrating real-time observational data, leading to more accurate precipitation predictions. Combining these approaches with improved parameterization of convective processes and land–atmosphere interactions can help mitigate model biases and improve performance at higher rainfall thresholds.
Multi-Model Ensembles (MME) improve precipitation forecasting by combining outputs from multiple models to reduce errors and enhance reliability. One key approach is bias correction, where statistical methods like quantile mapping and Bayesian model averaging adjust systematic errors in individual models, leading to more accurate predictions, particularly for extreme rainfall events. Additionally, weighted averaging techniques assign weights to different models based on their past performance. MME also enhance spatial and temporal coverage by blending models that perform well under different conditions, which is particularly beneficial for regions with complex topography like Kerala. Furthermore, machine learning techniques, such as neural networks and random forests, can be employed to intelligently merge model outputs by identifying nonlinear relationships between model biases and observed rainfall patterns. Given Kerala’s highly localized and topographically influenced rainfall patterns, object-based verification methods face challenges in accurately capturing small-scale convective systems and extreme precipitation events. However, integrating object-based evaluation in future studies, along with grid-based assessments, could help provide a more comprehensive assessment by improving the analysis of spatial displacement and structural errors in precipitation forecasts [67].
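As a simple illustration of the weighted-averaging idea described above, the sketch below forms a skill-weighted multi-model mean in which each model's weight is inversely proportional to its RMSE; the model values and RMSE figures are placeholders, not results from this study.

```python
# Illustrative skill-weighted multi-model ensemble (MME) mean.
import numpy as np

def weighted_mme(forecasts, rmse_scores):
    """Combine model forecasts with weights inversely proportional to their RMSE."""
    weights = 1.0 / np.asarray(rmse_scores, dtype=float)
    weights /= weights.sum()                                   # normalize weights to sum to 1
    stacked = np.stack([np.asarray(f, dtype=float) for f in forecasts])
    return np.tensordot(weights, stacked, axes=1)              # weighted mean over models

# Placeholder day-1 forecasts (mm/day) at two grid cells from three hypothetical models
print(weighted_mme(forecasts=[[20.0, 55.0], [28.0, 40.0], [18.0, 70.0]],
                   rmse_scores=[14.7, 16.2, 19.8]))
```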

6. Conclusions

The analysis of forecast performance across multiple NWP models emphasizes the inherent trade-offs involved in precipitation forecasting. JMA and ECMWF performed best, followed by NCEP and UKMO, while CMA and NCMRWF had the lowest accuracy due to higher biases. JMA excelled in short-term detection but struggled with false alarms. ECMWF showed superior consistency and was more reliable for medium-range forecasts but showed a greater decline in performance at higher thresholds and longer forecast periods. NCEP and UKMO were effective for short-term forecasts, particularly for lower rainfall thresholds, but their skill diminished for more extreme events. CMA and NCMRWF faced challenges in precipitation detection and accurate forecasting, with higher biases and lower skill scores across all metrics. These findings underscore the need to select models based on the specific forecast horizon, threshold, and required reliability, with ECMWF and JMA better suited for short- to medium-range forecasts and NCEP and UKMO showing more robustness for lower thresholds and shorter forecast periods. The results highlight the complexity of precipitation forecasting, where achieving a balance between detection, accuracy, and reliability is crucial for improving overall model performance. Using higher-resolution models improves local terrain representation, while data assimilation techniques such as 4D-Var and EnKF enhance initial conditions, leading to better extreme rainfall forecasts.

Author Contributions

Conceptualization, V.N., S.K.P. and V.S.; Methodology, V.N., S.K.P., N.S.P. and V.S.; Software, V.N. and N.S.P.; Formal analysis, N.S.P.; Investigation, V.N. and N.S.P.; Resources, S.K.P.; Data curation, S.K.P. and V.S.; Writing—original draft, V.N. and N.S.P.; Writing—review & editing, S.K.P. and V.S.; Supervision, S.K.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by funding from the Student Innovative Projects, Center for Innovation, Entrepreneurship and Incubation, NIT Calicut, Kerala, India (NITC/CIEI/SIP-BUDGET/2024-25/01/serial No.13). We sincerely appreciate their support in facilitating this research. The corresponding author’s (V. Sridhar) effort was funded in part as a Fulbright-Nehru senior scholar funded by the United States India Educational Foundation.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used in this study for validating model rainfall forecasts is the IMD–NCMRWF merged (Satellite + Gauge) data, available at a 0.25° × 0.25° grid resolution (https://imdpune.gov.in/cmpg/Griddata/Rainfall_25_NetCDF.html (accessed on 24 November 2024)). Additionally, the daily precipitation forecasts from six Numerical Weather Prediction (NWP) models were downloaded from the ECMWF TIGGE Data Retrieval website (https://apps.ecmwf.int/datasets/data/tigge/levtype%3Dsfc/type%3Dcf/ (accessed on 24 November 2024)).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kundzewicz, Z.W.; Kanae, S.; Seneviratne, S.I.; Handmer, J.; Nicholls, N.; Peduzzi, P.; Mechler, R.; Bouwer, L.M.; Arnell, N.; Mach, K.; et al. Flood risk and climate change: Global and regional perspectives. Hydrol. Sci. J. 2013, 59, 1–28. [Google Scholar] [CrossRef]
  2. Rao, G.V.; Nagireddy, N.R.; Keesara, V.R.; Sridhar, V.; Srinivasan, R.; Umamahesh, N.V.; Pratap, D. Real-time flood forecasting using an integrated hydrologic and hydraulic model for the Vamsadhara and Nagavali basins, Eastern India. Nat. Hazards 2024, 120, 6011–6039. [Google Scholar]
  3. Whitehurst, D.; Friedman, B.; Kochersberger, K.; Sridhar, V.; Weeks, J. Drone-based community assessment, planning, and disaster risk management for sustainable development. Remote. Sens. 2021, 13, 1739. [Google Scholar] [CrossRef]
  4. Sujatha, E.R.; Sridhar, V. Mapping debris flow susceptibility using analytical network process in Kodaikkanal Hills, Tamil Nadu (India). J. Earth Syst. Sci. 2017, 126, 116. [Google Scholar] [CrossRef]
  5. Hunt, K.M.R.; Menon, A. The 2018 Kerala floods: A climate change perspective. Clim. Dyn. 2020, 54, 2433–2446. [Google Scholar] [CrossRef]
  6. Sagar, S.K.; Rajeevan, M.; Rao, S.V.B.; Mitra, A. Prediction skill of rainstorm events over India in the TIGGE weather prediction models. Atmos. Res. 2017, 198, 194–204. [Google Scholar] [CrossRef]
  7. Lekula, M.; Lubczynski, M.W.; Shemang, E.M.; Verhoef, W. Validation of satellite-based rainfall in Kalahari. Phys. Chem. Earth Parts A/B/C 2018, 105, 84–97. [Google Scholar] [CrossRef]
  8. Sridevi, C.; Kumar Singh, K.; Suneetha, P.; Reval Durai, V.; Kumar, A. Vještina prognoze oborine iznad Indije tijekom ljetnog monsuna 2015. GFS Modelom. Geofiz. 2018, 35, 40–52. [Google Scholar]
  9. Shahrban, M.; Walker, J.P.; Wang, Q.J.; Seed, A.; Steinle, P. An evaluation of numerical weather prediction based rainfall forecasts. Hydrol. Sci. J. 2016, 61, 2704–2717. [Google Scholar] [CrossRef]
  10. Sridevi, C.; Singh, K.K.; Suneetha, P.; Durai, V.R.; Kumar, A. Rainfall forecasting skill of GFS model at T1534 and T574 resolution over India during the monsoon season. Meteorol. Atmos. Phys. 2019, 132, 35–52. [Google Scholar] [CrossRef]
  11. Prakash, S.; Mitra, A.K.; AghaKouchak, A.; Liu, Z.; Norouzi, H.; Pai, D.S. A preliminary assessment of GPM-based multi-satellite precipitation estimates over a monsoon dominated region. J. Hydrol. 2018, 556, 865–876. [Google Scholar] [CrossRef]
  12. Sridhar, V.; Anderson, K.A. Human-induced modifications to land surface fluxes and their implications on water management under past and future climate change conditions. Agric. For. Meteorol. 2017, 234–235, 66–79. [Google Scholar] [CrossRef]
  13. Setti, S.; Maheswaran, R.; Sridhar, V.; Barik, K.K.; Merz, B.; Agarwal, A. Inter-comparison of gauge-based gridded data, reanalysis and satellite precipitation product with an emphasis on hydrological modeling. Atmosphere 2020, 11, 1252. [Google Scholar] [CrossRef]
  14. Bartholmes, J.C.; Thielen, J.; Ramos, M.H.; Gentilini, S. The european flood alert system EFAS—Part 2: Statistical skill assessment of probabilistic and deterministic operational forecasts. Hydrol. Earth Syst. Sci. 2009, 13, 141–153. [Google Scholar] [CrossRef]
  15. Fritsch, J.M.; Houze Jr, R.A.; Adler, R.; Bluestein, H.; Bosart, L.; Brown, J.; Carr, F.; Davis, C.; Johnson, R.H.; Junker, N.; et al. Quantitative precipitation forecasting: Report of the eighth prospectus development team, US Weather Research Program. Bull. Am. Meteorol. Soc. 1998, 79, 285–299. [Google Scholar]
  16. Zhu, Y. Ensemble forecast: A new approach to uncertainty and predictability. Adv. Atmos. Sci. 2005, 22, 781–788. [Google Scholar] [CrossRef]
  17. Mukhopadhyay, P.; Prasad, V.S.; Krishna, R.P.M.; Deshpande, M.; Ganai, M.; Tirkey, S.; Sarkar, S.; Goswami, T.; Johny, C.J.; Roy, K.; et al. Performance of a very high-resolution global forecast system model (GFS T1534) at 12.5 km over the Indian region during the 2016–2017 monsoon seasons. J. Earth Syst. Sci. 2019, 128, 155. [Google Scholar] [CrossRef]
  18. Durai, V.R.; Bhowmik, S.K.R. Prediction of Indian summer monsoon in short to medium range time scale with high resolution global forecast system (GFS) T574 and T382. Clim. Dyn. 2013, 42, 1527–1551. [Google Scholar] [CrossRef]
  19. Ebert, E.E.; Damrath, U.; Wergen, W.; Baldwin, M.E. Supplement to The WGNE assessment of short-term quantitative precipitation forecasts. Bull. Am. Meteorol. Soc. 2003, 84, 492. [Google Scholar] [CrossRef]
  20. Golding, B. Quantitative precipitation forecasting in the UK. J. Hydrol. 2000, 239, 286–305. [Google Scholar] [CrossRef]
  21. Sharma, K.; Ashrit, R.; Bhatla, R.; Mitra, A.K.; Iyengar, G.R.; Rajagopal, E.N. Skill of predicting heavy rainfall over india: Improvement in recent years using UKMO global model. Pure Appl. Geophys. 2017, 174, 4241–4250. [Google Scholar] [CrossRef]
  22. Ashrit, R.; Sharma, K.; Kumar, S.; Dube, A.; Karunasagar, S.; Arulalan, T.; Mamgain, A.; Chakraborty, P.; Kumar, S.; Lodh, A.; et al. Prediction of the August 2018 heavy rainfall events over Kerala with high-resolution NWP models. Meteorol. Appl. 2020, 27, e1906. [Google Scholar] [CrossRef]
  23. Ashrit, R.; Thota, M.S.; Dube, A.; Kumar, K.N.; Karunasagar, S.; Kumar, S.; Singh, H.; Meka, R.; Krishna, R.P.M.; Mitra, A.K. Evaluation of five high-resolution global model rainfall forecasts over India during monsoon 2020. J. Earth Syst. Sci. 2022, 131, 259. [Google Scholar] [CrossRef]
  24. Rao, Y.R.; Durai, V.R.; Das, A.K. NWP products for monsoon weather monitoring and prediction at various temporal/spatial scales. Monsoon2014 2015, 109, 109–129. [Google Scholar]
  25. Rao, G.V.; Reddy, K.V.; Sridhar, V.; Srinivasan, R.; Umamahesh, N.; Pratap, D. Evaluation of NCEP-GFS-based Rainfall forecasts over the Nagavali and Vamsadhara basins in India. Atmos. Res. 2022, 278, 106326. [Google Scholar]
  26. Davis, C.; Brown, B.; Bullock, R. Object-based verification of precipitation forecasts. Part I: Methodology and ap-plication to mesoscale rain areas. Mon. Weather. Rev. 2006, 134, 1772–1784. [Google Scholar] [CrossRef]
  27. Singhal, A.; Jaseem, M.; Jha, S.K. Spatial connections in extreme precipitation events obtained from NWP forecasts: A complex network approach. Atmos. Res. 2022, 282, 106538. [Google Scholar] [CrossRef]
  28. Sonawane, K.; Pattanaik, D.R.; Pai, D.S. Inter-annual variability of Indian monsoon rainfall in the JMA’s seasonal ensemble prediction system in relation to ENSO and IOD. Mausam 2021, 70, 767–780. [Google Scholar] [CrossRef]
  29. Gilleland, E.; Ahijevych, D.; Brown, B.G.; Casati, B.; Ebert, E.E. Intercomparison of spatial forecast verification methods. Weather. Forecast. 2009, 24, 1416–1430. [Google Scholar] [CrossRef]
  30. Gallus, W.A. Application of object-based verification techniques to ensemble precipitation forecasts. Weather. Forecast. 2010, 25, 144–158. [Google Scholar] [CrossRef]
  31. Roberts, N. Assessing the spatial and temporal variation in the skill of precipitation forecasts from an NWP model. Meteorol. Appl. 2008, 15, 163–169. [Google Scholar]
  32. Roberts, N.M.; Lean, H.W. Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events. Mon. Weather. Rev. 2008, 136, 78–97. [Google Scholar] [CrossRef]
  33. Renu, S.; Pramada, S.K.; Yadav, B.K. Seawater intrusion susceptibility and modeling: A case study of Kerala, India. Acta Geophys. 2024, 73, 1927–1945. [Google Scholar]
  34. Renu, S.; Pramada, S.K. Use of grace and in-situ data to characterize groundwater status along the coast of Kerala. J. Earth Syst. Sci. 2023, 132, 137. [Google Scholar] [CrossRef]
  35. Sivan, S.D.; Pramada, S.K. Spatiotemporal analysis of historic and future drought characteristics over a monsoon dominated humid region (Kerala) in India. Environ. Dev. Sustain. 2024, 1–25. [Google Scholar] [CrossRef]
  36. Mitra, A.K.; Bohra, A.K.; Rajeevan, M.N.; Krishnamurti, T.N. Daily Indian precipitation analysis formed from a merge of rain-gauge data with the TRMM TMPA satellite-derived rainfall estimates. J. Meteorol. Soc. Jpn. Ser. II 2009, 87A, 265–279. [Google Scholar]
  37. Prakash, S.; Mohapatra, M. Mean rainfall characteristics of tropical cyclones over the North Indian Ocean using a merged satellite-gauge daily rainfall dataset. Nat. Hazards 2023, 119, 1437–1459. [Google Scholar] [CrossRef]
  38. Reddy, M.V.; Momin, I.M.; Mitra, A.K.; Pai, D.S. Evaluation and inter-comparison of high-resolution multi-satellite rainfall products over India for the southwest monsoon period. Int. J. Remote Sens. 2019, 40, 4577–4603. [Google Scholar] [CrossRef]
  39. Sharma, K.; Ashrit, R.; Kumar, S.; Milton, S.; Rajagopal, E.N.; Mitra, A.K. Unified model rainfall forecasts over India during 2007–2018: Evaluating extreme rains over hilly regions. J. Earth Syst. Sci. 2021, 130, 82. [Google Scholar]
  40. Prakash, S.; Bhan, S.C. How accurate are infrared-only and rain gauge-adjusted multi-satellite precipitation products in the southwest monsoon precipitation estimation across India? Environ. Monit. Assess. 2023, 195, 515. [Google Scholar] [CrossRef]
  41. Reddy, M.V.; Mitra, A.K.; Momin, I.M.; Krishna, U.V.M. How accurately satellite precipitation products capture the tropical cyclone rainfall? J. Indian Soc. Remote Sens. 2022, 50, 1871–1884. [Google Scholar] [CrossRef]
  42. Mariani, S.; Casaioli, M.; Calza, M. Forecast Verification: A Summary of Common Approaches, and Examples of Application; University degli Studi di Trento, Dipartimento di Ingegneria Civile e Ambientale: Trento, Italy, 2008. [Google Scholar]
  43. Anjum, M.N.; Ahmad, I.; Ding, Y.; Shangguan, D.; Zaman, M.; Ijaz, M.W.; Sarwar, K.; Han, H.; Yang, M. Assessment of IMERG-V06 precipitation product over different hydro-climatic regimes in the Tianshan Mountains, North-Western China. Remote Sens. 2019, 11, 2314. [Google Scholar] [CrossRef]
  44. Zhu, Z.; Wu, J.; Huang, H. The influence of 10–30-day boreal summer intraseasonal oscillation on the extended-range forecast skill of extreme rainfall over southern China. Clim. Dyn. 2023, 62, 69–86. [Google Scholar] [CrossRef]
  45. Ebert, E.; McBride, J. Verification of precipitation in weather systems: Determination of systematic errors. J. Hydrol. 2000, 239, 179–202. [Google Scholar] [CrossRef]
  46. Rossa, A.; Nurmi, P.; Ebert, E. Overview of methods for the verification of quantitative precipitation forecasts. In Precipitation: Advances in Measurement, Estimation and Prediction; Springer: Berlin/Heidelberg, Germany, 2008; pp. 419–452. [Google Scholar]
  47. Wolff, J.K.; Harrold, M.; Fowler, T.; Gotway, J.H.; Nance, L.; Brown, B.G. Beyond the basics: Evaluating model-based precipitation forecasts using traditional, spatial, and object-based methods. Weather. Forecast. 2014, 29, 1451–1472. [Google Scholar] [CrossRef]
  48. Roy, S.S.; Sharma, P.; Sen, B.; Devi, K.S.; Devi, S.S.; Gopal, N.K.; Kumar, N.; Mishra, K.; Katyar, S.; Singh, S.P.; et al. A new paradigm for short-range forecasting of severe weather over the Indian region. Meteorol. Atmos. Phys. 2021, 133, 989–1008. [Google Scholar]
  49. Pattanaik, D.R.; Alone, A. District Level Extended Range Forecast of Monsoon Rainfall Over India: Prospects and Limitations. Pure Appl. Geophys. 2024, 181, 349–372. [Google Scholar] [CrossRef]
  50. Radhakrishna, B.; Gayatri, V.; Rao, T.N. Characteristics of extreme rainfall from warm clouds associated with a mesoscale convective system: Sensitivity of different microphysical and cumulus parameterization schemes. Theor. Appl. Clim. 2025, 156, 72. [Google Scholar] [CrossRef]
  51. Oliva, A.S.; Ojeda, M.G.-V.; Agudo, R.A. Evaluation of the Sensitivity of the Weather Research and Forecasting Model to Changes in Physical Parameterizations During a Torrential Precipitation Event of the El Niño Costero 2017 in Peru. Water 2025, 17, 209. [Google Scholar] [CrossRef]
  52. Madhulatha, A.; Das, A.K.; Bhan, S.; Mohapatra, M.; Pai, D.; Pattanaik, D.; Mukhopadhyay, P. Feasibility of model output statistics (MOS) for improving the quantitative precipitation forecasts of IMD GFS model. J. Hydrol. 2024, 649, 132454. [Google Scholar] [CrossRef]
  53. Das, P.; Posch, A.; Barber, N.; Hicks, M.; Duffy, K.; Vandal, T.; Singh, D.; van Werkhoven, K.; Ganguly, A.R. Hybrid physics-AI outperforms numerical weather prediction for extreme precipitation nowcasting. NPJ Clim. Atmos. Sci. 2024, 7, 282. [Google Scholar] [CrossRef] [PubMed]
  54. Raj, B.; Sahoo, S.; Puviarasan, N.; Chandrasekar, V. Operational assessment of high resolution weather radar based precipitation nowcasting system. Atmosphere 2024, 15, 154. [Google Scholar] [CrossRef]
  55. Skok, G.; Roberts, N. Analysis of Fractions Skill Score properties for random precipitation fields and ECMWF forecasts. Q. J. R. Meteorol. Soc. 2016, 142, 2599–2610. [Google Scholar] [CrossRef]
  56. Ozkaya, A. Assessing the numerical weather prediction (NWP) model in estimating extreme rainfall events: A case study for severe floods in the southwest Mediterranean region, Turkey. J. Earth Syst. Sci. 2023, 132, 125. [Google Scholar] [CrossRef]
  57. Tahir, W.; Ibrahim, Z.; Ramli, S. Geostationary meteorological satellite-based quantitative rainfall estimation (GMS-Rain) for flood forecasting. Malays. J. Civ. Eng. 2009, 21. [Google Scholar]
  58. Ranade, A.; Mitra, A.K.; Singh, N.; Basu, S. A verification of spatio-temporal monsoon rainfall variability across Indian region using NWP model output. Meteorol. Atmos. Phys. 2014, 125, 43–61. [Google Scholar] [CrossRef]
  59. Bhowmik, S.K.R.; Durai, V.R. Application of multimodel ensemble techniques for real time district level rainfall forecasts in short range time scale over Indian region. Meteorol. Atmos. Phys. 2009, 106, 19–35. [Google Scholar] [CrossRef]
  60. Mitra, A.K.; Iyengar, G.R.; Durai, V.R.; Sanjay, J.; Krishnamurti, T.N.; Mishra, A.; Sikka, D.R. Experimental real-time multi-model ensemble (MME) prediction of rainfall during monsoon 2008: Large-scale medium-range aspects. J. Earth Syst. Sci. 2011, 120, 27–52. [Google Scholar] [CrossRef]
  61. Duan, M.; Ma, J.; Wang, P. Preliminary comparison of the CMA, ECMWF, NCEP, and JMA ensemble prediction systems. J. Meteorol. Res. 2012, 26, 26–40. [Google Scholar] [CrossRef]
  62. Ran, Q.; Fu, W.; Liu, Y.; Li, T.; Shi, K.; Sivakumar, B. Evaluation of quantitative precipitation predictions by ECMWF, CMA, and UKMO for flood forecasting: Application to two basins in China. Nat. Hazards Rev. 2018, 19, 05018003. [Google Scholar]
  63. Zhang, K.; Li, J.; Zhu, Z.; Li, T. Implications from Subseasonal prediction skills of the prolonged heavy snow event over Southern China in early 2008. Adv. Atmos. Sci. 2021, 38, 1873–1888. [Google Scholar] [CrossRef]
  64. Davis, C.A.; Manning, K.W.; Carbone, R.E.; Trier, S.B.; Tuttle, J.D. Coherence of warm-season continental rainfall in numerical weather prediction models. Mon. Weather. Rev. 2003, 131, 2667–2679. [Google Scholar] [CrossRef]
  65. Hillard, U.; Sridhar, V.; Lettenmaier, D.P.; McDonald, K.C. Assessing snowmelt dynamics with NASA scatterometer (NSCAT) data and a hydrologic process model. Remote Sens. Environ. 2003, 86, 52–69. [Google Scholar]
  66. Wang, H.; Xue, M.; Yin, J.; Deng, H. Comparison of Simulated Warm-Rain Microphysical Processes in a Record-Breaking Rainfall Event Using Polarimetric Radar Observations. J. Geophys. Res. Atmos. 2023, 128, e2023JD038742. [Google Scholar] [CrossRef]
  67. Hiraga, Y.; Tahara, R. Sensitivity of localized heavy rainfall in Northern Japan to WRF physics parameterization schemes. Atmos. Res. 2024, 314, 107802. [Google Scholar] [CrossRef]
Figure 1. Kerala map with topography.
Figure 2. Overall methodology.
Figure 3. Comparison of the metrics across NWP models: (a) RMSE; (b) ACC; (c) RBIAS; (d) ME.
Figure 4. Comparison of the metrics across NWP models: (a) POD; (b) CSI; (c) FAR; (d) ETS.
Figure 5. Comparison of FSS across NWP models: (a) threshold 100 mm; (b) threshold 50 mm; (c) threshold 5 mm.
Table 1. NWP models used in this study.
NWP Model | Forecast Range (Days) | Organization/Center
NCEP | 0–16 | National Centers for Environmental Prediction (NCEP)
NCMRWF | 0–10 | National Centre for Medium Range Weather Forecasting (NCMRWF), India
ECMWF | 0–15 | European Centre for Medium-Range Weather Forecasts (ECMWF)
CMA | 0–15 | China Meteorological Administration (CMA)
UKMO | 0–15 | United Kingdom Met Office (UKMO)
JMA | 0–11 | Japan Meteorological Agency (JMA)
Table 2. Equations for RMSE, ACC, RBIAS, and ME.
Metric | Equation
RMSE | $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - F_i\right)^{2}}$
ACC | $\mathrm{ACC} = \frac{\sum_{i=1}^{n}\left(O_i - O_m\right)\left(F_i - F_m\right)}{\sqrt{\sum_{i=1}^{n}\left(O_i - O_m\right)^{2}\sum_{i=1}^{n}\left(F_i - F_m\right)^{2}}}$
RBIAS | $\mathrm{RBIAS} = \frac{1}{n}\sum_{i=1}^{n}\frac{P_i - O_i}{O_i} \times 100$
ME | $\mathrm{ME} = \frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)$
Table 3. Contingency table.
Predicted \ Actual | Rainfall (Yes) | No Rainfall (No)
Rainfall (Yes) | Hits (H) | False alarms (F)
No Rainfall (No) | Misses (M) | True negatives (T)
Table 4. Equations for skill score calculation.
Skill Score | Equation | Purpose
Probability of Detection (POD) | $\mathrm{POD} = \frac{H}{H + M}$ | Measures the proportion of actual rainfall events correctly predicted
False Alarm Ratio (FAR) | $\mathrm{FAR} = \frac{F}{H + F}$ | Measures the proportion of predicted rainfall events that were false alarms
Equitable Threat Score (ETS) | $\mathrm{ETS} = \frac{H - \text{Expected Hits}}{H + F + M - \text{Expected Hits}}$ | Evaluates the forecast accuracy while accounting for random chance
Critical Success Index (CSI) | $\mathrm{CSI} = \frac{H}{H + F + M}$ | Measures the proportion of correct predictions among all possible outcomes
Table 5. Ranking of NWP models for the 1st day forecast.
Day 1 Rankings
Model | ACC Rank | RMSE Rank | ME Rank | RBIAS Rank
CMA | 6 | 5 | 5 | 4
ECMWF | 2 | 2 | 2 | 2
JMA | 1 | 1 | 1 | 3
NCEP | 4 | 4 | 4 | 6
NCMRWF | 3 | 6 | 6 | 5
UKMO | 5 | 3 | 3 | 1
Table 6. Ranking of NWP models for the 2nd day forecast.
Day 2 Rankings
Model | ACC Rank | RMSE Rank | ME Rank | RBIAS Rank
CMA | 6 | 5 | 5 | 6
ECMWF | 2 | 2 | 2 | 1
JMA | 1 | 1 | 1 | 3
NCEP | 4 | 3 | 4 | 4
NCMRWF | 3 | 6 | 6 | 5
UKMO | 5 | 4 | 3 | 2
Table 7. Ranking of NWP models for the 3rd day forecast.
Day 3 Rankings
Model | ACC Rank | RMSE Rank | ME Rank | RBIAS Rank
CMA | 4 | 4 | 5 | 6
ECMWF | 3 | 2 | 1 | 1
JMA | 1 | 1 | 2 | 3
NCEP | 2 | 3 | 4 | 4
NCMRWF | 5 | 6 | 6 | 5
UKMO | 6 | 5 | 3 | 2
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
