Enhancing COVID-19 Prevalence Forecasting: A Hybrid Approach Integrating Epidemic Differential Equations and Recurrent Neural Networks

Kong, Liang; Guo, Yanhui; Lee, Chung-wei

doi:10.3390/appliedmath4020022

by Liang Kong ¹, Yanhui Guo ²,* and Chung-wei Lee ²

¹ Department of Mathematical Sciences and Philosophy, University of Illinois Springfield, Springfield, IL 62703, USA
² Department of Computer Science, University of Illinois Springfield, Springfield, IL 62703, USA
* Author to whom correspondence should be addressed.

AppliedMath 2024, 4(2), 427-441; https://doi.org/10.3390/appliedmath4020022

Submission received: 29 December 2023 / Revised: 6 March 2024 / Accepted: 11 March 2024 / Published: 1 April 2024

Abstract

Accurate forecasting of the coronavirus disease 2019 (COVID-19) spread is indispensable for effective public health planning and the allocation of healthcare resources at all levels of governance, both nationally and globally. Conventional prediction models for the COVID-19 pandemic often fall short in precision, due to their reliance on homogeneous time-dependent transmission rates and the oversight of geographical features when isolating study regions. To address these limitations and advance the predictive capabilities of COVID-19 spread models, it is imperative to refine model parameters in accordance with evolving insights into the disease trajectory, transmission rates, and the myriad economic and social factors influencing infection. This research introduces a novel hybrid model that combines classic epidemic equations with a recurrent neural network (RNN) to predict the spread of the COVID-19 pandemic. The proposed model integrates time-dependent features, namely the numbers of individuals classified as susceptible, infectious, recovered, and deceased (SIRD), and incorporates human mobility from neighboring regions as a crucial spatial feature. The study formulates a discrete-time function within the infection component of the SIRD model, ensuring real-time applicability while mitigating overfitting and enhancing overall efficiency compared to various existing models. Validation of the proposed model was conducted using a publicly available COVID-19 dataset sourced from Italy. Experimental results demonstrate the model’s exceptional performance, surpassing existing spatiotemporal models in three-day ahead forecasting. This research not only contributes to the field of epidemic modeling but also provides a robust tool for policymakers and healthcare professionals to make informed decisions in managing and mitigating the impact of the COVID-19 pandemic.

Keywords:

epidemic differential equations; recurrent neural network; prevalence forecasting; COVID-19

MSC:

34H15; 65Z05; 68T07; 68T09

1. Introduction

Coronavirus disease 2019 (COVID-19) has evolved into a global pandemic following the emergence of a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As of 10 December 2022, the World Health Organization (WHO) has reported over 643 million confirmed cases and a lamentable toll of over 6.6 million deaths worldwide [1]. Effectively mitigating the impact of COVID-19 necessitates collaborative efforts to comprehend and predict the multifaceted aspects of this complex phenomenon. This includes the development of accurate epidemic-spreading models and the exploration of both non-pharmaceutical and pharmaceutical intervention policies [2,3,4,5,6,7,8,9].

Despite stringent measures such as travel bans, mask mandates, gathering restrictions, and the closure of public transport and schools, coupled with vaccination efforts and novel antiviral drugs, the pandemic has persisted with an alarming increase in cases in the United States and globally [10]. Since December 2021, the emergence of new Omicron variants has propelled the COVID-19 pandemic to a new peak, surpassing the previous apex of April 2021 [11,12,13]. By July 2022, BA.5, an Omicron subvariant, had become one of the dominant strains in the United States [14,15]. Although global infection cases and deaths have recently shown a decline, some regions, countries, and cities are grappling with an unprecedented surge. China, for instance, is contending with a surge in infections, following the easing of COVID-19 policies in early December 2022.

A fundamental goal of epidemiological modeling is the development of an efficient risk prediction model. This model empowers public health workers and governments to assess the efficacy of public health protocols and predict the risk of COVID-19 infection at regional and national levels. However, the complexities of infection spread render prominent compartmental models unreliable [16,17,18]. Challenges arise due to limited data, making it difficult to model population density, the impact of viral mutations, human mobility under varying lockdown levels, and qualitative social aspects such as culture and lifestyle [19,20]. Despite these challenges, research indicates that region-based mitigation factors, including lockdown levels and monitoring, significantly influence infection rates [21,22,23].

A noteworthy gap exists in the literature regarding the combined use of compartmental and deep learning-based prediction approaches for COVID-19. Our work seeks to address this gap by demonstrating the utility of parsimonious differential equations models and deep learning methods. By employing parsimoniously designed models, we aim to identify critical pandemic features and make policy-relevant discoveries, even with limited data. This approach provides a robust framework for a broader cohort of quantitative researchers to predict COVID-19 infections and enhance our understanding of the ongoing pandemic.

In this study, we underscore the pivotal role of spatial heterogeneity in computational epidemiology. Our focus centers on the development of a hybrid spatiotemporal epidemic model aimed at forecasting infection numbers. This model combines the susceptible–infectious–recovered–deceased (SIRD) compartmental model with a long short-term memory (LSTM) network, addressing the need for a comprehensive approach in predicting the dynamics of infectious diseases. Notably, our spatiotemporal model integrates human mobility by incorporating infection rates from neighboring regions, thus surpassing the predictive capabilities of standalone conventional epidemiological or machine learning-based models. This integration results in enhanced prediction accuracy, mitigating tendencies to over- or under-fit, as evidenced by our findings [24].

The key contributions of this study include the following:

Introduction of the SIRD–eRNN (edge recurrent neural network) model, a hybrid combining the compartmental model with the RNN model, tailored for predicting infection numbers.
Demonstration of significantly reduced model prediction errors, compared to a pure RNN model.
Enhancement of precision and accuracy in short- and medium-term trend forecasting by the SIRD–eRNN model.
Compatibility and generalizability of the proposed model, allowing for the seamless addition of more compartments.

Preliminary results indicate that integrating compartmental and RNN models effectively heightens epidemic trend prediction accuracy, aligning closely with real-world data and holding valuable implications for practical epidemic forecasting.

The paper’s structure unfolds as follows: Section 2 reviews related research, providing context for our contributions. In Section 3, we articulate the problem statement and detail our approaches. Section 4 delves into experimental details, evaluation criteria, performance comparison, and insightful conclusions drawn from the analysis. Finally, Section 5 offers concluding remarks and outlines avenues for future studies, ensuring a comprehensive understanding of the implications and potential advancements in computational epidemiology.

2. Related Works

A plethora of epidemic models has recently emerged to capture the dynamics of COVID-19 and evaluate the efficacy of countermeasures like social distancing, contact tracing, and testing. Two prominent categories of methodologies in epidemiological modeling include mechanistic modeling, rooted in mathematical literature on dynamical systems, and data science methodologies in medicine and public health [25,26]. Compartmental models, a simplified form of infectious disease mathematical modeling, are often governed by ordinary differential equations, exemplified by the SIRD model. This model encapsulates the basic mechanisms of an epidemic, categorizing individuals in a population as susceptible, infected, recovered, or deceased. However, challenges arise when certain variables, such as those related to asymptomatic carriers, are not empirically available, necessitating estimation through statistical models or optimization procedures based on observed data.

While the SIRD model is efficient for temporal data, it shares critical limitations with similar compartmental models. Firstly, the choice of variables is crucial and must adequately characterize the spreading mechanisms of a particular disease, posing challenges when understanding these mechanisms is incomplete. Secondly, SIRD simplifies human mobility or contact structures for mathematical tractability, neglecting the complexity introduced by spatiotemporal data, as evident in the case of COVID-19 [27,28].

In response to these challenges, machine learning models have been proposed to estimate the dynamics of COVID-19 transmission. For example, Huang et al. [29] employed a convolutional neural network to predict the cumulative number of COVID-19 deaths, while Bock et al. [30] explored the performance of different deep learning models with varying input data. El Hajji et al. [31] considered a dynamical system involving both deterministic (with or without delay) and stochastic “SIR” epidemic models with nonlinear incidence rates in a continuous reactor. Bousquet et al. [32] introduced an algorithm combining the SIR model with the LSTM neural network, allowing real-time forecasting and time-dependent parameter estimates. Ma et al. [33] proposed an LSTM–Markov model, integrating the LSTM model’s output with the prediction errors of the Markov model to reduce errors. However, existing models often consider only the number of confirmed cases without incorporating other factors such as human mobility.

As the epidemic progresses, surveillance data become more detailed, raising the question of extending the SIRD temporal model to a spatiotemporal model. This extension would enable the inclusion of information from spatially neighboring regions, enhancing risk prediction. Notably, deep learning models, while powerful, require substantial training data. In the case of the novel SARS-CoV-2 virus, data scarcity leads to gradient vanishing and explosion problems, making the direct application of spatiotemporal LSTM models to COVID-19 challenging [34].

In this study, we introduce the SIRD-eRNN hybrid model, which integrates time-dependent features from the SIRD model with the incorporation of human mobility effects through LSTM. A discrete-time function is derived specifically for the infectious (I) component of the SIRD model. The construction of the SIRD-eRNN model is inherently parsimonious, offering enhanced interpretability, computational efficiency, and resistance to overfitting. Notably, these models can operate in real-time and demonstrate effectiveness even with relatively modest amounts of training data. A notable advantage of our hybrid model is its ability to estimate parameters, including those within the mechanistic model, without reliance on subjectively chosen prior information. This flexibility enhances the adaptability of the model to diverse scenarios. Our approach is scalable and robust, effectively extracting crucial dynamics underlying disease transmission mechanisms from publicly available surveillance data. This scalability ensures accurate predictions of future disease infections, reinforcing the utility and reliability of our proposed model in forecasting and understanding the trajectory of the ongoing pandemic.

3. Proposed Model

To achieve precise and efficient forecasting of the COVID-19 spread, we present a hybrid model that seamlessly integrates the SIRD time-dependent features with the influence of human mobility through an LSTM network. The intricacies of this hybrid model are elucidated in the following sections.

3.1. Susceptible–Infectious–Recovered–Deceased Compartmental Epidemiology Process Model

A modification of the Kermack–McKendrick equations offers a robust framework for capturing the dynamics of disease. The classical SIR model delineates a population into susceptible, infected, and recovered compartments to depict the progression of a contagious ailment. Expanding upon this, the SIRD model introduces a mortality compartment, providing a nuanced representation by distinguishing deaths from recoveries. Particularly valuable for addressing the challenges posed by severe diseases such as COVID-19, the SIRD model facilitates a more realistic portrayal of outbreaks.

By segregating deaths and recoveries, the SIRD model enhances the capacity for in-depth analysis of the entire disease burden and its impact on the population. This comprehensive understanding, integral for informed public health planning and interventions, contributes significantly to refining strategies in the face of the ongoing pandemic.

Furthermore, our consideration extends to a specific geographical region, assumed to have connections with proximate neighbors and isolation from other regions. Within this context, we define key quantities that form the basis of our analytical exploration.

$S (t)$ : the number of susceptible individuals at time $t, S (t) \geq 0$ for any $t \geq 0$ ;
$I (t)$ : the number of infected individuals at time $t, I (t) \geq 0$ for any $t \geq 0$ ;
$R (t)$ : the cumulative number of recovered individuals up to time $t, R (t) \geq 0$ for any $t \geq 0$ ;
$D (t)$ : the number of deceased as a result of the disease up to time $t, D (t) \geq 0$ for any $t \geq 0$ .

We introduce an innovative spatiotemporal model that amalgamates spatial information with a discrete-time infection equation (I-equation) derived from the SIRD differential equations. The I-equation adeptly captures observed temporal dynamics, while a recurrent neural network (RNN) is deployed to capture latent spatial information. Distinct from the original method [22], our model, termed SIRD–eRNN, distinguishes itself by utilizing a differential equation with only three parameters (the I-equation) to accommodate limited temporal data. This characteristic renders the I-equation notably more concise than the RNN.

The following equations describe the classic SIRD model on which the proposed SIRD–eRNN model builds:

\frac{d S}{d t} = - β S I,

(1)

\frac{d I}{d t} = β S I - γ I - μ I,

(2)

\frac{d R}{d t} = γ I,

(3)

\frac{d D}{d t} = μ I,

(4)

where β is the transmission rate, γ is the recovery rate, and μ is the death rate. In this model, the underlying hypothesis is that the recovered people are no longer susceptible to infection. The classic SIRD epidemiological model assumes a closed population and does not account for the importation of infections from outside the modeled region. It conserves the total population size, normalized to

S + I + R + D = 1 .

Theorem 1.

For any non-negative initial condition, the solution of system (1)–(4) remains non-negative and bounded.

Proof.

Rewrite Equation (1) as follows:

\frac{d S (t)}{d t} + β S (t) I (t) = 0,

(5)

Let $W (t) = \exp (\int_{0}^{t} β I (τ) d τ)$ be the integrating factor of Equation (5). Multiplying (5) by $W$ and integrating over $[0, t]$ gives

\int_{0}^{t} \frac{d}{d τ} [S (τ) W (τ)] d τ = 0, t > 0,

which implies

S (t) \exp (\int_{0}^{t} β I (τ) d τ) - S (0) = 0.

For $S (0) \geq 0$,

S (t) = S (0) \exp (- \int_{0}^{t} β I (τ) d τ) \geq 0.

Hence, $S (t)$ is non-negative for $t > 0$. In a similar process, we show that $I (t)$, $R (t)$, and $D (t)$ are non-negative for $t > 0$.

Next, we show that the solutions of system (1)–(4) remain inside the region $Ω$, where

Ω = \{(S (t), I (t), R (t), D (t)) \in R^{4} : N (t) \leq N_{0}\};

here, $N (t) = S (t) + I (t) + R (t) + D (t)$ is the total population and $N_{0}$ is a constant. Indeed, by adding (1)–(4), we have the following:

\frac{d N}{d t} = 0,

which implies $N (t) = N (0) \leq N_{0}$ for all $t > 0$ whenever $N (0) \leq N_{0}$. Hence, the solution is non-negative and bounded. □
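Theorem 1 can be checked numerically. The following is a minimal sketch, not the authors' code: it integrates system (1)–(4) with a classical fourth-order Runge–Kutta scheme and reports the smallest compartment value and the largest deviation of S + I + R + D from 1. The parameter values, initial condition, step size, and the function name `simulate_sird` are illustrative assumptions.

```python
import numpy as np

def simulate_sird(beta, gamma, mu, s0, i0, days, steps_per_day=100):
    """Integrate the SIRD system (1)-(4) with a classical RK4 scheme.

    All arguments are illustrative; R(0) = D(0) = 0 is assumed.
    Returns an array of shape (days * steps_per_day + 1, 4) with columns S, I, R, D.
    """
    def rhs(y):
        s, i, _, _ = y
        return np.array([
            -beta * s * i,                      # dS/dt, Eq. (1)
            beta * s * i - gamma * i - mu * i,  # dI/dt, Eq. (2)
            gamma * i,                          # dR/dt, Eq. (3)
            mu * i,                             # dD/dt, Eq. (4)
        ])

    h = 1.0 / steps_per_day
    y = np.array([s0, i0, 0.0, 0.0])
    traj = [y.copy()]
    for _ in range(days * steps_per_day):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj.append(y.copy())
    return np.array(traj)

# Hypothetical parameters chosen only to produce a visible epidemic curve.
traj = simulate_sird(beta=0.3, gamma=0.08, mu=0.02, s0=0.99, i0=0.01, days=120)
total = traj.sum(axis=1)
print("smallest compartment value:", traj.min())
print("largest deviation of S+I+R+D from 1:", np.abs(total - 1.0).max())
```

Because the right-hand sides of (1)–(4) sum to zero, the total is conserved up to floating-point round-off, which is the numerical counterpart of dN/dt = 0 in the proof.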

3.2. Derivation of Discrete-Time I-Equation

We add a novel feature, $I_{e}$, which represents the external infection resulting from human mobility from the neighbors of a region, where $I_{e} (t) \geq 0$ for any $t \geq 0$. The SIRD epidemic system, with $I_{e}$ as external forcing, becomes the following:

\frac{d S}{d t} = - β_{1} S I - β_{2} S I_{e},

(6)

\frac{d I}{d t} = β_{1} S I + β_{2} S I_{e} - γ I - μ I,

(7)

\frac{d R}{d t} = γ I,

(8)

\frac{d D}{d t} = μ I .

(9)

We first prove that the solutions are uniformly bounded in a positive invariant region through the following theorem.

Theorem 2.

For any non-negative initial condition, the solution of system (6)–(9) remains non-negative and bounded.

Proof.

Equation (6) can be rewritten as follows:

\frac{d S (t)}{d t} + (β_{1} I (t) + β_{2} I_{e} (t)) S (t) = 0,

(10)

Let $V (t) = \exp (\int_{0}^{t} [β_{1} I (τ) + β_{2} I_{e} (τ)] d τ)$ be the integrating factor of Equation (10). Multiplying (10) by $V$ and integrating over $[0, t]$ gives

\int_{0}^{t} \frac{d}{d τ} [S (τ) V (τ)] d τ = 0, t > 0.

We have

S (t) \exp (\int_{0}^{t} [β_{1} I (τ) + β_{2} I_{e} (τ)] d τ) - S (0) = 0.

For $S (0) \geq 0$,

S (t) = S (0) \exp (- \int_{0}^{t} [β_{1} I (τ) + β_{2} I_{e} (τ)] d τ) \geq 0.

Hence, $S (t)$ is non-negative for $t > 0$. In a similar process, we show that $I (t)$, $R (t)$, and $D (t)$ are non-negative for $t > 0$.

Similarly to the previous theorem, we can show that the solutions of system (6)–(9) remain inside the region $Ω$, where

Ω = \{(S (t), I (t), R (t), D (t)) \in R^{4} : N (t) \leq N_{0}\};

here, $N (t) = S (t) + I (t) + R (t) + D (t)$ is the total population and $N_{0}$ is a constant. Indeed, by adding (6)–(9), the forcing terms $β_{2} S I_{e}$ cancel and we obtain the following:

\frac{d N}{d t} = 0,

which implies $N (t) = N (0) \leq N_{0}$ for all $t > 0$ whenever $N (0) \leq N_{0}$. Hence, the solution is non-negative and bounded. □

It follows, from (8) and (9), that the following is true:

R (t) = R (t_{0}) + γ \int_{t_{0}}^{t} I d τ,

D (t) = D (t_{0}) + μ \int_{t_{0}}^{t} I d τ .

Therefore,

S (t) = 1 - I (t) - R (t) - D (t) = 1 - I (t) - R (t_{0}) - γ \int_{t_{0}}^{t} I d τ - D (t_{0}) - μ \int_{t_{0}}^{t} I d τ .

By substituting $S (t)$ into (7), we have

\frac{d I}{d t} = (β_{1} I + β_{2} I_{e}) S - γ I - μ I = (β_{1} I + β_{2} I_{e}) (1 - I (t) - R (t_{0}) - γ \int_{t_{0}}^{t} I d τ - D (t_{0}) - μ \int_{t_{0}}^{t} I d τ) - γ I - μ I .

Approximating the integral by $(t - t_{0})$ times the average of the $p + 1$ most recent values of $I$ (a Riemann-sum-type approximation) and applying the forward Euler method with a step of one day, the discrete approximation is written as follows:

\begin{matrix} I (t + 1) = I (t) + (β_{1} I (t) + β_{2} I_{e} (t)) S (t) - γ I (t) - μ I (t) \\ = (1 - γ - μ) I (t) + (β_{1} I (t) + β_{2} I_{e} (t)) \\ \cdot (1 - I (t) - R (t_{0}) - D (t_{0}) - \frac{(γ + μ) (t - t_{0})}{p + 1} \sum_{j = 0}^{p} I (t - j)), \end{matrix}

(11)

where p is the number of days that are considered before day t.

At the beginning of the infection, when $t_{0} = 0$, $R (t_{0}) = 0$, and $D (t_{0}) = 0$, the discrete-time I-equation simplifies as follows:

\begin{matrix} I (t + 1) = (1 - γ - μ) I (t) + (β_{1} I (t) + β_{2} I_{e} (t)) \\ \cdot (1 - I (t) - (γ + μ) \frac{t}{p + 1} \sum_{j = 0}^{p} I (t - j)) . \end{matrix}

(12)

If we let $I_{e} (t) = 0$ and write $β = β_{1}$, then we have an approximation of $I (t)$ for the original SIRD model, which is a pure temporal model, named the I-model:

\begin{matrix} I (t + 1) = (1 - γ - μ + β) I (t) - β I^{2} (t) \\ - β (γ + μ) \frac{t}{p + 1} I (t) \sum_{j = 0}^{p} I (t - j) . \end{matrix}

(13)
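To make the update rule concrete, the following sketch implements one step of Equation (12) and checks that it reduces to the I-model of Equation (13) when the external term is zero. It is an illustration under stated assumptions rather than the authors' implementation: the function name `i_equation_step`, the history layout, and all numerical values are hypothetical.

```python
import numpy as np

def i_equation_step(I_hist, I_e_t, t, beta1, beta2, gamma, mu, p=3):
    """One step of Eq. (12): return I(t+1).

    I_hist : sequence whose last p+1 entries are I(t-p), ..., I(t)
             (oldest first; this layout is an assumption of the sketch).
    I_e_t  : external infection term I_e(t), here supplied by the caller.
    t      : current day index, with t_0 = 0.
    """
    window = np.asarray(I_hist[-(p + 1):], dtype=float)
    if window.size != p + 1:
        raise ValueError("need at least p + 1 past values of I")
    I_t = window[-1]
    # (gamma + mu) * t / (p + 1) * sum_j I(t - j) approximates R(t) + D(t).
    removed = (gamma + mu) * t / (p + 1) * window.sum()
    S_t = 1.0 - I_t - removed
    return (1.0 - gamma - mu) * I_t + (beta1 * I_t + beta2 * I_e_t) * S_t

# Illustrative values only (not fitted to any data).
hist = [0.0010, 0.0012, 0.0015, 0.0020]
beta, gamma, mu, t, p = 0.3, 0.08, 0.02, 10, 3

# With I_e = 0 the step should coincide with the I-model, Eq. (13).
no_forcing = i_equation_step(hist, 0.0, t, beta, 0.1, gamma, mu, p)
I_t = hist[-1]
i_model = ((1 - gamma - mu + beta) * I_t - beta * I_t ** 2
           - beta * (gamma + mu) * t / (p + 1) * I_t * sum(hist))
print(np.isclose(no_forcing, i_model))

# A positive external term raises the next-day value.
with_forcing = i_equation_step(hist, 0.005, t, beta, 0.1, gamma, mu, p)
print(with_forcing > no_forcing)
```

In the full model, `I_e_t` would come from the edge RNN rather than being passed in by hand.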

It is difficult to predict how a region’s population interacts with those of adjacent areas, which affects the infection speed. As a result, $I_{e} (t)$ is latent information that is difficult to model mathematically. We use a recurrent neural network (RNN) with LSTM cells to extract the latent spatial information $I_{e}$.

3.3. Edge Feature and Latent $I_{e}$

The spatial information is used to define the latent information $I_{e}^{v} (t)$ in a region $v$. Let C be the set of neighbors of region v. The spatial feature of $v$ at time t is defined as follows:

f_{e}^{v} (t) = \frac{1}{|C|} \sum_{i : v_{i} \in C} [I_{i} (t - 1), \dots, I_{i} (t - p)],

(14)

where the vector $f_{e}^{v} (t)$ holds the values for the preceding p days, each element being the average of the case numbers in the neighboring regions on the respective day, and $I_{i} (t)$ is the infected population of region $v_{i}$ at time t. An RNN is employed to estimate the value of $I_{e}^{v} (t)$ from the spatial feature $f_{e}^{v} (t)$. We feed $f_{e}^{v} (t)$ into an edge RNN, which consists of three stacked LSTM cells (see Figure 1a) followed by a dense layer that outputs $I_{e}$. The integrated procedure to generate edge features is shown in Figure 1b. We named our model SIRD–eRNN to reflect its integrated design of SIRD equations and edge RNN.
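The edge feature of Equation (14) is an element-wise average over neighboring regions, which a few lines of NumPy can illustrate. This is a sketch under assumptions: the array layout (regions in rows, days in columns), the function name `edge_feature`, and the toy data are not taken from the paper.

```python
import numpy as np

def edge_feature(I, neighbors, t, p=3):
    """Eq. (14): average the neighbors' values over the p preceding days.

    I         : array of shape (n_regions, n_days) holding I_i(t).
    neighbors : indices of the regions in the neighbor set C of region v.
    Returns a length-p vector [mean_i I_i(t-1), ..., mean_i I_i(t-p)].
    """
    if t - p < 0:
        raise ValueError("not enough history before day t")
    lags = np.arange(t - 1, t - p - 1, -1)           # t-1, t-2, ..., t-p
    return I[np.asarray(neighbors)][:, lags].mean(axis=0)

# Toy data: 3 regions, 4 days; the neighbors of the target region are regions 1 and 2.
I = np.arange(12, dtype=float).reshape(3, 4)
f = edge_feature(I, neighbors=[1, 2], t=3, p=3)
print(f)
```

For the toy array, rows 1 and 2 are `[4, 5, 6, 7]` and `[8, 9, 10, 11]`, so the feature for `t = 3` averages days 2, 1, and 0 of those two rows.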

In Figure 1a, a set of three stacked LSTM cells is configured as an RNN. The term “stacked LSTM network” refers to a specific RNN architecture characterized by multiple LSTM layers arranged sequentially. This design allows the network to comprehend intricate patterns and dependencies within sequential data.

Each LSTM layer within the stack processes input sequences and maintains its independent hidden state, facilitating the capture of complex temporal dependencies in the data. The input sequences are initially introduced into the first LSTM layer in the stack. Subsequently, each LSTM layer retains a hidden state that captures pertinent information from the input sequence. This hidden state undergoes continual updates and is propagated to the subsequent time step. The output from the initial LSTM layer serves as input for the subsequent layer in the stack. As the input sequences progress through each LSTM layer, the network achieves the capability to discern hierarchical and abstract representations of the input data, with each layer capturing distinct levels of abstraction. The final LSTM layer in the stack either produces the output sequence or connects to additional layers for specific tasks like classification, regression, or sequence generation.

The stacking of LSTM layers is instrumental in empowering the network to learn hierarchical features and representations from input sequences. This proves particularly advantageous in tasks involving time-series data, natural language processing, and other sequential data analyses where a nuanced understanding of context across multiple levels is paramount.
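The data flow through a stacked LSTM can be sketched with a plain NumPy forward pass. The code below is a hypothetical illustration, not the network used in the experiments: the hidden size, the random untrained weights, the treatment of the edge feature as a length-p sequence with one value per step, and the sigmoid on the dense output (used here only to keep the result positive) are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_dim, hidden_dim, rng):
    """Random, untrained weights for one LSTM layer (gate order: i, f, g, o)."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * hidden_dim, input_dim)),   # input weights
        "U": rng.normal(0.0, 0.1, (4 * hidden_dim, hidden_dim)),  # recurrent weights
        "b": np.zeros(4 * hidden_dim),
    }

def lstm_layer(xs, params):
    """Run one LSTM layer over xs of shape (steps, input_dim).

    Returns the hidden state at every step, shape (steps, hidden_dim).
    """
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    outputs = []
    for x in xs:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update the cell state
        h = sigmoid(o) * np.tanh(c)                   # update the hidden state
        outputs.append(h)
    return np.stack(outputs)

def edge_rnn_forward(f_e, layers, dense_w, dense_b):
    """Stacked LSTM layers followed by a dense layer giving a scalar estimate."""
    seq = np.asarray(f_e, dtype=float).reshape(-1, 1)  # p steps, one feature each
    for params in layers:
        seq = lstm_layer(seq, params)  # each layer's output sequence feeds the next
    last_hidden = seq[-1]
    # The sigmoid output is an assumption of this sketch, not stated in the paper.
    return sigmoid(dense_w @ last_hidden + dense_b)

rng = np.random.default_rng(0)
hidden = 8  # hypothetical hidden size
layers = [init_lstm(1, hidden, rng),
          init_lstm(hidden, hidden, rng),
          init_lstm(hidden, hidden, rng)]
dense_w = rng.normal(0.0, 0.1, hidden)
I_e = edge_rnn_forward([0.004, 0.003, 0.002], layers, dense_w, 0.0)
print(I_e)
```

Each layer keeps its own hidden and cell state, and only the last layer's final hidden state reaches the dense layer, which mirrors the description of the stack above.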

3.4. Training the Proposed Model

The discrete approximation of the proposed model is written as follows:

\begin{matrix} \hat{I} (t) = (1 - γ - μ) I (t - 1) + (β_{1} I (t - 1) + β_{2} I_{e} (t - 1)) \\ \cdot (1 - I (t - 1) - (γ + μ) \frac{t}{p + 1} \sum_{j = 1}^{p + 1} I (t - j)) . \end{matrix}

(15)

To train the RNN model, the loss function is defined using the mean squared error (MSE) of data and the model’s outputs as follows:

\mathrm{Loss} = \frac{1}{T - p - 1} \sum_{t = p + 1}^{T} {(I (t) - \hat{I} (t))}^{2},

(16)

where T is the total number of days.

Training consists of minimizing the loss function over the parameters of both the infection equation and the recurrent neural network; hence, the two components (I-equation and RNN) of the model are fully coupled while learning from data. The weights of the RNN and the dense layer, together with the I-equation parameters $β_{1}$, $β_{2}$, and $γ$, are learned using the Adam optimizer.
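The one-step prediction of Equation (15) and the loss of Equation (16) can be written down directly. The sketch below evaluates the loss for given parameters on a synthetic series; it does not perform the Adam optimization, and the external term is a fixed array rather than an RNN output. The function names, the day indexing 0..T, and all values are assumptions made for illustration.

```python
import numpy as np

def predict_one_step(I, I_e, t, beta1, beta2, gamma, mu, p=3):
    """Eq. (15): I_hat(t) from I(t-1), ..., I(t-p-1) and I_e(t-1)."""
    window_sum = I[t - p - 1:t].sum()  # I(t-1) + ... + I(t-p-1)
    s_prev = 1.0 - I[t - 1] - (gamma + mu) * t / (p + 1) * window_sum
    return ((1.0 - gamma - mu) * I[t - 1]
            + (beta1 * I[t - 1] + beta2 * I_e[t - 1]) * s_prev)

def mse_loss(I, I_e, beta1, beta2, gamma, mu, p=3):
    """Eq. (16): squared one-step errors for t = p+1, ..., T, scaled by 1/(T-p-1)."""
    T = len(I) - 1  # days are indexed 0..T (an indexing assumption)
    squared = [(I[t] - predict_one_step(I, I_e, t, beta1, beta2, gamma, mu, p)) ** 2
               for t in range(p + 1, T + 1)]
    return sum(squared) / (T - p - 1)

# Synthetic check: generate a series with known (hypothetical) parameters,
# then confirm the loss vanishes there and grows when beta1 is perturbed.
p, T = 3, 30
true = dict(beta1=0.3, beta2=0.1, gamma=0.08, mu=0.02)
I_e = np.full(T + 1, 0.001)   # placeholder for the edge RNN output
I = np.full(T + 1, 0.001)     # first p+1 days act as the initial history
for t in range(p + 1, T + 1):
    I[t] = predict_one_step(I, I_e, t, p=p, **true)

loss_true = mse_loss(I, I_e, p=p, **true)
loss_wrong = mse_loss(I, I_e, p=p, **{**true, "beta1": 0.2})
print(loss_true, loss_wrong)
```

In the coupled model, `I_e` depends on the RNN weights, so a gradient-based optimizer such as Adam would update those weights and the I-equation parameters together by differentiating this loss.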

4. Experimental Results

Retrospective estimates of COVID-19 activity were generated utilizing our SIRD–eRNN model for the timeframe spanning from 24 February 2020 to 18 June 2020, across 20 Italian regions, as illustrated in Figure 2. The data are organized in a tabular format, encompassing columns for date, the 20 distinct regions of Italy, and the new confirmed cases reported on each date in every region. Regions with fewer than three neighboring regions were excluded from the analysis, leaving the following seven regions for experimentation: Campania, Emilia-Romagna, Lazio, Lombardia (Lombardy), Molise, Piemonte (Piedmont), and Umbria.

To assess the model’s performance, we compare the estimates produced by SIRD–eRNN with the ground truth: the daily reported COVID-19 new cases [35]. The root mean squared error (RMSE) serves as the evaluation metric. Setting p = 3 in Equation (12), we partitioned the available data for 113 days into an 80-day training set and a 33-day testing set, adhering to an approximate 7:3 training/testing ratio.

All experiments were conducted using Python 3.7 and TensorFlow 1.5, executed on a server equipped with an Intel Xeon processor boasting 128 GB memory (Intel Corporation, Santa Clara, CA, USA) and an NVIDIA Tesla K40 GPU with 12 GB memory (Nvidia Corporation, Santa Clara, CA, USA).

In Figure 3, we illustrate the ratio of daily confirmed COVID-19 cases per total population, juxtaposed with the projected case numbers, across nine regions. The graphical representation spans over time to provide a comprehensive view of the trends. The predictions derived from our SIRD–eRNN model are depicted by the solid orange line. Notably, the model demonstrates its proficiency in accurately estimating the infection rate during both the training and testing stages, effectively capturing the nuanced dynamics of transmission akin to the real ground truth rate.

A comparative analysis with the prediction results of the IeRNN model [22], as illustrated in Figure 4, reveals that the SIRD–eRNN aligns more closely with the ground truth rate across most regions, with the exception of the Umbria region. This observation underscores the effectiveness of our proposed model in providing reliable and realistic predictions, showcasing its potential for robust applications in forecasting COVID-19 transmission dynamics.

We examined our model predictions using the root mean square error (RMSE), which is defined as follows:

\mathrm{RMSE} = \sqrt{\frac{\sum_{t = 1}^{T} {({\hat{y}}_{t} - y_{t})}^{2}}{T}},

(17)

where $y_{t}$ and ${\hat{y}}_{t}$ are the real and predicted values at time t, respectively, and $T$ is the total number of samples. The smaller the RMSE, the better the prediction results.
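The metric of Equation (17) and the 80-day/33-day partition described earlier in this section can be sketched as follows. The split is assumed here to be chronological (the first 80 days for training), which the text does not state explicitly; the function names and the placeholder series are illustrative.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Eq. (17): root mean squared error between real and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def chronological_split(series, n_train=80):
    """Split a daily series into the first n_train days and the remainder."""
    return series[:n_train], series[n_train:]

# Placeholder series with 113 daily values; real data would replace it.
series = np.linspace(0.0, 1.0, 113)
train, test = chronological_split(series)
print(len(train), len(test))

# Small worked case: the errors are 0, 0 and 2, so the RMSE equals sqrt(4/3).
example = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(example)
```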

In Table 1, depicting the three-day ahead forecast, SIRD–eRNN outperforms the IeRNN model, yielding smaller RMSE values in most regions. The average RMSE in the testing stage for all regions is 3.19314 × 10⁻⁴, which is lower than the corresponding result for IeRNN, which stands at 3.23159 × 10⁻⁴. Excluding the Umbria region, where the RMSE for SIRD–eRNN is 3.62327 × 10⁻⁴ and for IeRNN is 3.72205 × 10⁻⁴, the proposed model consistently demonstrates superior performance.

The discrepancy in performance is particularly noteworthy given the compact and simple nature of the I-Model with its three parameters. Despite this simplicity, it is apparent that the I-Model falls short in delivering accurate three-day ahead predictions compared to the sophistication of the proposed SIRD–eRNN model. It is crucial to emphasize that the infection population rate is a percentage figure, and considering its substantial base number, even minor differences in RMSE values can translate into significant disparities in the estimated rate.

Given the inherent nonlinearity in the evolution of infectious diseases, we specifically compare nonlinear models for three-day ahead predictions. The presented results unequivocally establish the superior accuracy of our model in contrast to IeRNN. Our SIRD–eRNN model consistently outperforms other models, demonstrating its efficacy in achieving precise and reliable predictions.

Our proposed SIRD–eRNN model demonstrates marginally better performance compared to IeRNN, as shown in Table 1. At first glance, the improvement may seem small. However, this incremental gain highlights the efficacy of the modifications we have incorporated into our approach. These enhancements enable our model to make more precise predictions, generalize more robustly, and capture the intricate dynamics of the system more accurately. In short, the modest bump in performance reflects meaningful improvements in the model’s capabilities.

5. Discussion and Future Work

This study proposed a novel hybrid model that combines classic epidemic equations with a recurrent neural network (RNN) to predict the spread of the COVID-19 pandemic. The model integrates time-dependent features, namely the numbers of individuals classified as susceptible, infectious, recovered, and deceased (SIRD), and incorporates human mobility from neighboring regions as a crucial spatial feature. The model was validated using a publicly available COVID-19 dataset sourced from Italy and showed exceptional performance, surpassing existing spatiotemporal models in three-day ahead forecasting. The main research question of this study was as follows: how can the predictive capabilities of COVID-19 spread models be enhanced by integrating epidemic differential equations and recurrent neural networks? The results of this study provide a positive answer to this question and demonstrate the feasibility and effectiveness of the proposed hybrid model.

The results of this study are consistent with previous research that suggests the importance of incorporating time-dependent and spatial features in epidemic modeling. The proposed hybrid model builds on the strengths of both classic epidemic equations and RNNs, which have been widely used in infectious disease modeling and forecasting. The model leverages the SIRD equations to capture the dynamics of the disease transmission and the RNN to learn the nonlinear patterns and temporal dependencies of the data. The model also considers the impact of human mobility on the disease spread, which is often neglected or oversimplified in conventional models. By incorporating these features, the model can account for the heterogeneity and complexity of the COVID-19 pandemic and provide more accurate and reliable predictions.

The implications and benefits of this research are manifold for both society and the research community. For society, the proposed hybrid model can provide timely and precise forecasts of the COVID-19 spread, which can inform and support public health planning and the allocation of healthcare resources at all levels of governance, both nationally and globally. The model can also help policymakers and healthcare professionals to evaluate the effectiveness of different intervention strategies and to design optimal policies to manage and mitigate the impact of the pandemic. For the research community, the proposed hybrid model can contribute to the field of epidemic modeling by introducing a novel and robust approach that combines mathematical and deep learning methods. The model can also serve as a reference and a benchmark for future studies that aim to develop and improve COVID-19 spread models.

However, this research also has some limitations and challenges that need to be acknowledged and addressed. One limitation is the quality and availability of the data used to train and test the model. The COVID-19 dataset sourced from Italy may not be representative of other countries or regions that have different demographic, geographic, and socio-economic characteristics. The data may also contain errors, inconsistencies, or biases due to the variability and uncertainty of the testing and reporting procedures. These factors may affect the validity and generalizability of the model and its predictions. Another challenge is the dynamic and evolving nature of the COVID-19 pandemic, which may require constant updating and fine-tuning of the model parameters and features. The model may not be able to capture the effects of new variants, mutations, or outbreaks of the virus, or the changes in human behavior, mobility, and compliance with public health measures. These factors may reduce the accuracy and reliability of the model and its predictions over time.

These limitations suggest several recommendations for practical implementation and further research. In practice, the proposed hybrid model should be applied and validated in countries or regions with different COVID-19 situations and data sources, and it should be monitored and adjusted regularly to reflect the latest data on the pandemic. In further research, the model can be extended with additional variables that may affect the spread of COVID-19, such as weather, climate, vaccination, testing, contact tracing, and social distancing, and it can be compared with models that use different methods or data. These efforts can help to enhance the predictive capabilities of COVID-19 spread models and to provide further insight into managing the pandemic.

Author Contributions

Conceptualization, L.K. and Y.G.; methodology, Y.G.; software, Y.G.; validation, L.K., Y.G. and C.-w.L.; formal analysis, L.K.; investigation, Y.G.; resources, Y.G.; data curation, Y.G.; writing—original draft preparation, L.K.; writing—review and editing, Y.G.; visualization, Y.G.; supervision, C.-w.L.; project administration, C.-w.L.; funding acquisition, L.K. and Y.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by SHIELD Illinois and the Discovery Partners Institute.

Data Availability Statement

The data that support the findings of this study are openly available at https://github.com/lzj994/COVID-19-Data/blob/master/covid-19/italy.csv (accessed on 1 September 2023).

Acknowledgments

The authors acknowledge the support of SHIELD Illinois and the Discovery Partners Institute.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Figure 1. (a) Three stacked LSTM cells in edge RNN; (b) generating the $I_{e}$ of one region.


Figure 2. Italian regional map.

Figure 3. Prediction results of the SIRD–eRNN model in different regions. Blue curve: exact value; orange curve: prediction; dotted line: train–test split. (a–g): results of Campania, Emilia-Romagna, Lazio, Lombardia, Molise, Piemonte and Umbria.

Figure 4. Prediction results of the IeRNN model in different regions. Blue curve: exact value; orange curve: prediction; dotted line: train–test split. (a–g): results of Campania, Emilia-Romagna, Lazio, Lombardia, Molise, Piemonte and Umbria.

Table 1. Comparison of RMSE between SIRD–eRNN and IeRNN for 3-day-ahead forecasts.

Region	SIRD–eRNN (training)	SIRD–eRNN (test)	IeRNN (training)	IeRNN (test)
Campania	0.00027	0.000096	0.00029	0.00010
Emilia-Romagna	0.00131	0.00048	0.00146	0.00042
Lazio	0.00041	0.00020	0.00044	0.00021
Lombardia	0.00150	0.00049	0.00168	0.00054
Molise	0.00037	0.00019	0.00041	0.00021
Piemonte	0.00164	0.00071	0.00181	0.00074
Umbria	0.00046	0.00006	0.00050	0.00003
Average	0.00085	0.00032	0.00094	0.00032
Average without Umbria	0.00092	0.00036	0.00101	0.00037
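
The two average rows of Table 1 can be recomputed from the per-region values. The short Python sketch below does this as a consistency check; the numbers are transcribed from the table, while the names `rmse` and `column_means` are hypothetical and not taken from the authors' code.

```python
# Per-region RMSE values transcribed from Table 1, in the order
# (SIRD-eRNN training, SIRD-eRNN test, IeRNN training, IeRNN test).
rmse = {
    "Campania": (0.00027, 0.000096, 0.00029, 0.00010),
    "Emilia-Romagna": (0.00131, 0.00048, 0.00146, 0.00042),
    "Lazio": (0.00041, 0.00020, 0.00044, 0.00021),
    "Lombardia": (0.00150, 0.00049, 0.00168, 0.00054),
    "Molise": (0.00037, 0.00019, 0.00041, 0.00021),
    "Piemonte": (0.00164, 0.00071, 0.00181, 0.00074),
    "Umbria": (0.00046, 0.00006, 0.00050, 0.00003),
}


def column_means(rows):
    """Return the arithmetic mean of each of the four RMSE columns."""
    rows = list(rows)
    return tuple(sum(row[k] for row in rows) / len(rows) for k in range(4))


# Mean over all seven regions, and over the six regions other than Umbria.
all_regions = column_means(rmse.values())
without_umbria = column_means(
    values for name, values in rmse.items() if name != "Umbria"
)

if __name__ == "__main__":
    labels = ("SIRD-eRNN train", "SIRD-eRNN test", "IeRNN train", "IeRNN test")
    for label, mean_all, mean_subset in zip(labels, all_regions, without_umbria):
        print(f"{label}: all={mean_all:.5f}, without Umbria={mean_subset:.5f}")
```

To within rounding at the fifth decimal place, the recomputed means agree with the reported "Average" and "Average without Umbria" rows.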

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

