1. Introduction
Heavy rainfall events are one of the primary contributors to flooding, which poses significant risks to life, infrastructure, and agriculture across many regions of the world [
1]. Both seasonal-monsoon-driven extreme events and climate-change-induced sea level rise and storm surge are commonly linked with floods and landslides [
2]. In several tropical countries including India, where monsoon-driven rainfall is a crucial part of the climate system, accurate forecasting of rainfall and distribution of good quality data are critical for minimizing flood-related impacts and enhancing disaster monitoring and management strategies [
3].
The state of Kerala and adjacent areas of the Western Ghats region, located on the southwestern coast of India, are particularly vulnerable to extreme weather events, including heavy rainfall and landslides, owing to their geographical location and topographical features [
4]. These rainfall events often lead to widespread flooding and inundation, necessitating robust forecasting systems capable of predicting such events with high accuracy and reliability. The 2018 flood in Kerala was the most devastating in a century; the previous major flood occurred during the 1924 monsoon [
5]. Recently, there has been a notable rise in extreme rainfall events around the globe. The accuracy of heavy rainfall forecasts is the key factor in issuing a flood alert, so predicting these events is crucial for minimizing damage and preventing loss of life [
6].
Rainfall forecasts are currently provided using conventional methods such as satellite observations, weather radars, and numerical weather prediction (NWP) models [
7,
8]. Among these, satellites and weather radars provide qualitative forecasts, while NWP models provide quantitative forecasts [
9,
10]. A preliminary assessment of Global Precipitation Measurement (GPM)-based precipitation estimates over India for the 2014 monsoon season highlights that all products struggle with detecting precipitation in orographic regions like northeast and southeast India [
11]. NWP models are physically based models represented by governing equations, processes, and parameters that cover land, atmosphere, and ocean conditions [
12], and the outputs from these models are compared and used to assess the impact across multiple spatial and temporal scales [
13]. Ensembles of NWP forecasts have been in use since the early 1990s. It is widely accepted that ensemble forecasts offer probabilistic insights that are more useful than deterministic forecasts alone for understanding the limitations caused by unavoidable errors in the initial conditions [
14,
15,
16].
NWP model verification plays a crucial role in meteorological research and operational forecasting activities [
17]. When verification methodologies are carefully designed, the results can effectively address the needs of various stakeholders, including modelers, forecasters, and users of forecast information. Durai et al. evaluated the performance of two resolutions of the National Centers for Environmental Prediction (NCEP) Global Forecast System, namely T574 and T382, over India during the 2011 summer monsoon and found that both showed skill in capturing heavy rainfall regions, with T574 outperforming T382, although both exhibited biases in lower and upper tropospheric moisture and circulation [
18]. Quantitative precipitation forecasting (QPF) of heavy rainfall remains a significant challenge, even for advanced high-resolution NWP models [
19,
20].
Sharma et al. concentrated on evaluating the effectiveness of the United Kingdom Met Office Unified Model (UKMO) in forecasting intense rainfall events across India. The predictions effectively represented the overall characteristics of average monsoon rainfall, particularly highlighting higher precipitation levels along India’s western coast [
21]. Ashrit et al. evaluated the accuracy of NWP models utilized by the National Centre for Medium-Range Weather Forecasting (NCMRWF) in forecasting the extreme rainfall that occurred in Kerala in August 2018 [
22]. The performance of five global NWP models (National Centre for Unified Modelling (NCUM), UKMO, India Meteorological Department Global Forecast System (IMD GFS), NCEP, and European Centre for Medium-Range Weather Forecasts (ECMWF)) was assessed in forecasting daily rainfall over India during the 2020 monsoon season using both traditional and advanced verification methods. While all models accurately captured large-scale monsoon patterns, forecast accuracy decreased with lead time, with ECMWF performing best overall, though regional performance varied, and models struggled to predict localized, high-intensity rainfall events [
23].
The performance of the Global Forecast System (GFS) and Weather Research and Forecasting (WRF) models in forecasting monsoon rainfall over India during the 2014 summer monsoon season has also been evaluated. The GFS model showed reasonable accuracy in predicting large-scale rainfall features and significant improvement from 2013 to 2014, particularly for longer forecast periods. In contrast, the WRF model exhibited a consistent over-prediction in rainfall, highlighting the need for bias correction and improved physical parameterization schemes for better monsoon predictions [
24]. ECMWF, NCEP, and UKMO models were assessed for forecasting extreme rainfall over India; they could predict rainstorms up to five days in advance, albeit with biases in spatial distribution and intensity. NCEP showed a smaller spread but less accurate averages, and all models exhibited under-prediction and increased errors at longer lead times [
6]. Venkat Rao et al. evaluated the NCEP model for rainfall forecasting in the Nagavali and Vamsadhara river basins and found good performance with correlation coefficients > 0.3 and probability of detection > 0.6 for day-1 and day-3 forecasts. Bias analysis showed a shift from overestimation to underestimation with increasing lead time. Bias correction improved RMSE by >18% for day-1 forecasts, offering insights for flood forecasting, early warning, and water management [
25]. The object-based verification method reveals that the WRF model overproduces large rain areas and underestimates the diurnal cycle of rainfall, with a positive size bias, particularly in the afternoon. However, this method is sensitive to object size, which can lead to inaccurate results for smaller-scale events [
26]. Another evaluation of the GFS T1534 model for the 2016–2017 monsoon seasons reveals a wet bias over land and overestimation of lighter rainfall, while underestimating heavier rainfall [
17]. Singhal et al. conducted an inter-comparison of four gridded quantitative precipitation forecasts, namely ECMWF, Japan Meteorological Agency (JMA), NCMRWF, and UKMO, over the Ganga River basin in India and found ECMWF to be a suitable substitute for NCMRWF in detecting spatial patterns of extreme precipitation. In terms of Nash–Sutcliffe efficiency (NSE), JMA outperformed the other three NWP models. Additionally, JMA exhibited patterns of probability of detection similar to those of ECMWF [
27]. Sonawane et al. evaluated the JMA model’s ability to predict monsoon rainfall over 32 years. The study revealed that the model could effectively represent fluctuations in Indian summer monsoon precipitation, especially when incorporating influences such as El Niño and the Indian Ocean Dipole concurrently [
28].
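The categorical scores cited in these studies, such as probability of detection, are derived from a contingency table of forecast and observed threshold exceedances. The following is a minimal sketch of that calculation, not the procedure of any cited study; the function name, the sample values, and the 64.5 mm/day threshold are illustrative assumptions.

```python
def contingency_scores(forecast, observed, threshold):
    """Return POD, FAR, CSI and frequency bias for one rainfall threshold."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1            # event forecast and observed
        elif o_event:
            misses += 1          # event observed but not forecast
        elif f_event:
            false_alarms += 1    # event forecast but not observed

    def ratio(num, den):
        # Undefined scores (no events in the sample) are returned as NaN.
        return num / den if den else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
        "bias": ratio(hits + false_alarms, hits + misses),
    }


# Hypothetical daily rainfall (mm) at a single grid cell; not data from this study.
obs = [70.0, 5.0, 120.0, 0.0, 66.0, 30.0]
fcst = [80.0, 68.0, 40.0, 2.0, 90.0, 10.0]
scores = contingency_scores(fcst, obs, threshold=64.5)  # assumed threshold
```

With these sample values there are two hits, one miss, and one false alarm, so a frequency bias of 1 coexists with a POD of only 2/3, which is why several categorical scores are usually read together.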
Over the years, a variety of spatial verification methods have been developed to more effectively assess the performance of high-resolution forecasts [
29]. Gallus et al. noted that spatial verification provides more detailed and relevant metrics of forecast skill, offering a clearer picture of the accuracy of high-resolution predictions. Neighborhood or fuzzy methods evaluate forecast accuracy within space-time neighborhoods. In these methods, all grid-scale values surrounding an observation are treated as equally plausible estimates of the true value. “Neighborhood verification” assesses forecast skill by varying neighborhood sizes and conducting verification across different spatial scales and intensity thresholds [
30]. The Fractions Skill Score (FSS), introduced by Roberts and Lean, exemplifies such fuzzy spatial verification techniques. Instead of a direct grid-to-grid comparison, FSS evaluates forecast accuracy within local neighborhoods of observations [
31,
32].
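As an illustration of the neighborhood idea, the sketch below computes the FSS as one minus the ratio of the mean squared difference between forecast and observed neighborhood fractions to its largest possible value. It is a simplified sketch under stated assumptions: the window is square, cells outside the domain count as non-events, and the function names, grid size, and threshold are hypothetical.

```python
def neighbourhood_fractions(field, threshold, half_width):
    """Fraction of cells >= threshold in a square window around each cell.

    Cells outside the domain are treated as non-events (an assumption).
    """
    n_rows, n_cols = len(field), len(field[0])
    window_cells = (2 * half_width + 1) ** 2
    fractions = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            count = 0
            for di in range(-half_width, half_width + 1):
                for dj in range(-half_width, half_width + 1):
                    r, c = i + di, j + dj
                    inside = 0 <= r < n_rows and 0 <= c < n_cols
                    if inside and field[r][c] >= threshold:
                        count += 1
            fractions[i][j] = count / window_cells
    return fractions


def fss(forecast, observed, threshold, half_width):
    """Fractions Skill Score for one threshold and one neighbourhood size."""
    pf = neighbourhood_fractions(forecast, threshold, half_width)
    po = neighbourhood_fractions(observed, threshold, half_width)
    mse = ref = 0.0
    for row_f, row_o in zip(pf, po):
        for a, b in zip(row_f, row_o):
            mse += (a - b) ** 2        # squared difference of fractions
            ref += a ** 2 + b ** 2     # largest possible squared difference
    return 1.0 - mse / ref if ref > 0 else float("nan")


# Hypothetical 5 x 5 fields (mm/day): the forecast places the single heavy
# rainfall cell one grid cell east of where it was observed.
observed = [[0.0] * 5 for _ in range(5)]
forecast = [[0.0] * 5 for _ in range(5)]
observed[2][2] = 100.0
forecast[2][3] = 100.0

fss_grid = fss(forecast, observed, threshold=50.0, half_width=0)   # grid scale
fss_3x3 = fss(forecast, observed, threshold=50.0, half_width=1)    # 3 x 3 window
```

At grid scale the displaced forecast scores zero, whereas the 3 x 3 neighborhood gives 2/3, which shows how the score rewards forecasts that are close in space even when they do not match cell for cell.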
The performance of NWP models can vary depending on several factors, including forecast lead time, precipitation thresholds, and regional geographical characteristics. Kerala, situated along the Western Ghats, experiences complex orographic influences that significantly affect heavy rainfall events. The interaction between the steep terrain and moisture-laden low-level westerly winds from the Arabian Sea enhances precipitation, leading to highly localized and intense rainfall. The intensity and variability of the monsoon flow further influence precipitation patterns, making accurate forecasting particularly challenging. Despite Kerala’s vulnerability to extreme rainfall and associated flood risks, there is a lack of region-specific evaluation of rainfall forecast models, limiting efforts to enhance prediction accuracy under its unique climatic conditions. Given the critical need for reliable heavy rainfall forecasting, this research aims to evaluate the ability of different NWP models in the short-term forecasting of heavy rainfall events in Kerala, India, and to assess their performance using a range of evaluation metrics. By focusing on traditional performance metrics, categorical precipitation metrics, and the Fractions Skill Score (FSS), this study provides a comprehensive understanding of the strengths and weaknesses of various models in predicting heavy rainfall across different forecast periods.
The outcomes of this analysis are intended to guide the selection of appropriate NWP models for operational forecasting, particularly short-term forecasting, which is crucial for timely flood warning systems. Furthermore, the insights from this study are expected to contribute to improving the overall accuracy and reliability of rainfall forecasts, ultimately aiding the management of flood risks in vulnerable regions like Kerala.
2. Study Area and Data Sources
Kerala, situated between 8° and 13° N latitude and between 74° and 78° E longitude (
Figure 1), occupies the southwestern edge of the Indian subcontinent. The state features a highly varied topography characterized by coastal plains, mid-elevation hills, and the rugged Western Ghats, which rise to over 2695 m. This geographical diversity significantly influences Kerala’s climatic and hydrological dynamics. The state experiences a humid tropical monsoon climate dominated by two principal rainfall seasons: the Southwest Monsoon (June–September), contributing over 70% of the annual precipitation, and the Northeast Monsoon (October–December) [
33,
34,
35]. Kerala’s dense river network, comprising 44 rivers, supports its hydrological system. The Western Ghats act as an orographic barrier, intensifying monsoon rainfall and influencing large-scale atmospheric processes like integrated vapor transport and atmospheric river activity. This complex interaction between terrain, monsoon dynamics, and hydrological processes makes Kerala an ideal study area for examining extreme precipitation events, which are important for water resource management in a changing climate.
The IMD and the NCMRWF collaboratively developed a merged satellite–gauge rainfall dataset to improve monsoon rainfall analysis and numerical model validation. This dataset, known as the IMD-NCMRWF merged satellite–gauge product, integrates GPM-based near real-time multi-satellite precipitation estimates with IMD’s dense rain gauge network across India, enhancing accuracy [
36,
37]. Recognized as one of the most reliable gridded rainfall datasets for the Indian region, this product is widely used in hydro-meteorological research and forecasting applications [
38,
39,
40]. Studies have demonstrated its effectiveness in capturing tropical cyclone (TC) rainfall over India, making it a valuable resource for extreme weather studies [
41]. For this study, the gridded rainfall used to validate NWP model rainfall forecasts is the IMD-NCMRWF merged (Satellite + Gauge) dataset for the monsoon season (June–September) from 2018 to 2022, available at a 0.25° × 0.25° grid resolution.
The daily precipitation forecasts from six NWP models were downloaded from the ECMWF TIGGE Data Retrieval website. These models were the NCEP, NCMRWF, ECMWF, China Meteorological Administration (CMA) GFS, UKMO, and JMA. The forecasts cover 1-day, 2-day, and 3-day rainfall predictions, with a 0.25° grid resolution used in this study. The data span the monsoon season (June–September) from 2018 to 2022 and specifically focus on Kerala. Both the observed and past forecast datasets were taken in gridded format, covering the entire Kerala state, and were then considered for further performance evaluation. The dataset consists of a grid of latitude and longitude points ranging from 8° N to 13° N and 74° E to 78° E, with a 0.25° resolution, ensuring consistent spatial coverage across the study region.
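The grid described above can be reconstructed as follows. This is a minimal sketch that assumes both bounding coordinates are included as grid points; the helper names and the sample location are hypothetical and are not taken from the study.

```python
def grid_axis(start_deg, stop_deg, step_deg=0.25):
    """Coordinates from start to stop (both inclusive) at a fixed spacing."""
    n_points = round((stop_deg - start_deg) / step_deg) + 1
    return [start_deg + k * step_deg for k in range(n_points)]


def nearest_index(axis, value):
    """Index of the grid coordinate closest to the given value."""
    return min(range(len(axis)), key=lambda k: abs(axis[k] - value))


lats = grid_axis(8.0, 13.0)    # 8 N to 13 N
lons = grid_axis(74.0, 78.0)   # 74 E to 78 E

# Hypothetical point of interest, used only to show the lookup.
i_lat = nearest_index(lats, 8.52)
j_lon = nearest_index(lons, 76.94)
```

Under the inclusive-bounds assumption the domain holds 21 latitude and 17 longitude points, or 357 grid cells per forecast day, and the same indices can be used for the observed and forecast fields because they share one grid.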
5. Summary
The evaluation of six NWP models, namely, NCEP, NCMRWF, ECMWF, CMA, UKMO, and JMA, reinforces the importance of accurate precipitation forecasts. The analysis of model performance in terms of forecast accuracy, precipitation detection, and forecast skill reveals the capabilities and limitations of the different models. While all models captured rainfall patterns well, they struggled to forecast heavy rainfall events accurately, particularly as forecast lead times and rainfall thresholds increased. Heavy precipitation events, especially those influenced by complex terrain such as the Western Ghats along India’s west coast, significantly affect model performance. The interaction between the mountain range and moisture-laden low-level westerly winds from the Arabian Sea enhances rainfall in this region, where the intensity of the large-scale monsoon flow influences both low-level wind patterns and the associated precipitation. This region is crucial for monsoon studies due to its significant role in modulating rainfall patterns.
All models showed declining accuracy as the forecast lead time increased. This pattern is a commonly observed feature of NWP models, which can be attributed to the complex and unpredictable nature of atmospheric processes and the growing uncertainty in the initial conditions as the forecast horizon lengthens. By analyzing traditional performance and categorical precipitation metrics, it was observed that JMA performed best but suffered from false alarms, while ECMWF demonstrated strong overall performance in different metrics. NCEP and UKMO had moderate performance, while CMA and NCMRWF showed very poor performance.
For rainfall thresholds, all models faced challenges as the threshold increased, particularly for extreme rainfall events. JMA and ECMWF performed better in the traditional and categorical precipitation metrics; in the FSS analysis, however, ECMWF and JMA showed a more pronounced decline in forecast skill as the rainfall threshold increased, particularly for extreme rainfall events.
Incorporating higher-resolution models and advanced data assimilation techniques can significantly improve the accuracy of extreme rainfall forecasts. High-resolution models capture finer-scale atmospheric processes and better represent local terrain influences, which is crucial for regions like Kerala with complex topography. Additionally, data assimilation techniques, such as four-dimensional variational (4D-Var) and ensemble Kalman filtering (EnKF), enhance initial conditions by integrating real-time observational data, leading to more accurate precipitation predictions. Combining these approaches with improved parameterization of convective processes and land–atmosphere interactions can help mitigate model biases and improve performance at higher rainfall thresholds.
Multi-Model Ensembles (MME) improve precipitation forecasting by combining outputs from multiple models to reduce errors and enhance reliability. One key approach is bias correction, where statistical methods like quantile mapping and Bayesian model averaging adjust systematic errors in individual models, leading to more accurate predictions, particularly for extreme rainfall events. Additionally, weighted averaging techniques assign weights to different models based on their past performance. MME also enhance spatial and temporal coverage by blending models that perform well under different conditions, which is particularly beneficial for regions with complex topography like Kerala. Furthermore, machine learning techniques, such as neural networks and random forests, can be employed to intelligently merge model outputs by identifying nonlinear relationships between model biases and observed rainfall patterns. Given Kerala’s highly localized and topographically influenced rainfall patterns, object-based verification methods face challenges in accurately capturing small-scale convective systems and extreme precipitation events. However, integrating object-based evaluation in future studies, along with grid-based assessments, could help provide a more comprehensive assessment by improving the analysis of spatial displacement and structural errors in precipitation forecasts [
67].
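The weighted averaging idea mentioned above can be sketched as follows. This is only an illustration, assuming that weights are set inversely proportional to each model's past RMSE, which is one of several possible weighting rules; the model names and rainfall values are hypothetical and are not results from this study.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between paired forecasts and observations."""
    squared = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(squared / len(observed))


def inverse_rmse_weights(past_forecasts, past_observed):
    """Weights proportional to 1 / RMSE, normalised to sum to one."""
    # A small floor avoids division by zero for a perfect past record.
    inverse = {
        name: 1.0 / max(rmse(values, past_observed), 1e-9)
        for name, values in past_forecasts.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_ensemble(new_forecasts, weights):
    """Weighted mean of the individual model forecasts for one grid cell."""
    return sum(weights[name] * value for name, value in new_forecasts.items())


# Hypothetical training period (mm/day) for two placeholder models.
past_observed = [10.0, 20.0, 30.0]
past_forecasts = {
    "model_a": [12.0, 22.0, 32.0],   # RMSE of 2 mm/day
    "model_b": [14.0, 24.0, 34.0],   # RMSE of 4 mm/day
}
weights = inverse_rmse_weights(past_forecasts, past_observed)
merged = weighted_ensemble({"model_a": 30.0, "model_b": 60.0}, weights)
```

Here the model with half the past error receives twice the weight (2/3 against 1/3), so the merged forecast of 40 mm/day sits closer to that model than a simple average of 45 mm/day would.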