Article

Forecasting the Spreading of COVID-19 across Nine Countries from Europe, Asia, and the American Continents Using the ARIMA Models

by Ovidiu-Dumitru Ilie 1,*, Roxana-Oana Cojocariu 1, Alin Ciobica 1,*, Sergiu-Ioan Timofte 2, Ioannis Mavroudis 3,4 and Bogdan Doroftei 5
1 Department of Research, Faculty of Biology, “Alexandru Ioan Cuza” University, 700505 Iasi, Romania
2 Department of Biology, Faculty of Biology, “Alexandru Ioan Cuza” University, 700505 Iasi, Romania
3 Leeds Teaching Hospitals NHS Trust, Great George St., Leeds LS1 3EX, UK
4 Laboratory of Neuropathology and Electron Microscopy, School of Medicine, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
5 Faculty of Medicine, University of Medicine and Pharmacy “Grigore T. Popa”, 700115 Iasi, Romania
* Authors to whom correspondence should be addressed.
Microorganisms 2020, 8(8), 1158; https://doi.org/10.3390/microorganisms8081158
Submission received: 12 July 2020 / Revised: 29 July 2020 / Accepted: 29 July 2020 / Published: 30 July 2020
(This article belongs to the Special Issue SARS-CoV-2: Epidemiology and Pathogenesis)

Abstract

Since mid-November 2019, when the first SARS-CoV-2-infected patient was officially reported, the new coronavirus has infected over 10 million people, of whom half a million have died during this short period. There is an urgent need to monitor, predict, and restrict COVID-19 in a more efficient manner. This is why Auto-Regressive Integrated Moving Average (ARIMA) models have been developed and used to predict the epidemiological trend of COVID-19 in Ukraine, Romania, the Republic of Moldova, Serbia, Bulgaria, Hungary, USA, Brazil, and India, the last three being the most affected countries at present. To increase accuracy, the daily prevalence data of COVID-19 from 10 March 2020 to 10 July 2020 were collected from the official website of the Romanian Government (GOV.RO) and from the World Health Organization (WHO) and European Centre for Disease Prevention and Control (ECDC) websites. Several ARIMA models were formulated with different ARIMA parameters. ARIMA (1, 1, 0), ARIMA (3, 2, 2), ARIMA (3, 2, 2), ARIMA (3, 1, 1), ARIMA (1, 0, 3), ARIMA (1, 2, 0), ARIMA (1, 1, 0), ARIMA (0, 2, 1), and ARIMA (0, 2, 0) models were chosen as the best models, based on their lowest Mean Absolute Percentage Error (MAPE) values, for Ukraine, Romania, the Republic of Moldova, Serbia, Bulgaria, Hungary, USA, Brazil, and India, respectively (MAPE = 4.70244, 1.40016, 2.76751, 2.16733, 2.98154, 2.11239, 3.21569, 4.10596, 2.78051). This study demonstrates that ARIMA models are suitable for making predictions during the current crisis and offers an idea of the epidemiological stage of these regions.

1. Introduction

The outbreak of the new coronavirus disease (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to a ‘global pandemic’ due to its unprecedented speed of spreading worldwide. Since patient zero was reported in mid-November 2019, over ten million people from two hundred and sixteen territories have been identified as SARS-CoV-2-infected patients [1].
Significant discoveries have been made in this context during the last nine months. During this period, the clinical picture has been established [2,3,4,5,6,7,8,9]. The early studies also revealed a low [3,4,5,10] to medium [6,7,8,9] incidence of gastrointestinal symptoms. The most common of these was diarrhea [11,12,13,14], which suggests a potential route of action of COVID-19 at the level of the digestive tract.
Unfortunately, by 10 July 2020, more than half a million people had died, with a higher predisposition among people suffering from chronic diseases and, especially, the elderly [15]. However, the number of people confirmed positive varies between countries because of their finite capacities in epidemiological surveillance.
This member of the zoonotic coronavirus family has now spread over the entire world. Given that scientists are racing against the clock, a sustainable and reliable strategy for planning health infrastructure to control the spread is crucial. This need is all the more imperative as there is no SARS-CoV-2 treatment or vaccine [15].
Modeling daily cases is pivotal for management and future directions. Estimating the possible progression or regression of COVID-19 through mathematical and statistical models is essential for determining short- and long-term case estimates. Such approaches are useful not only to predict the course of COVID-19 spreading, but also to allocate the resources necessary to restrict it [15].
Distinct approaches have been applied with relatively high accuracy for different prediction purposes. Some examples are represented by statistical methods aiming to predict epidemic cases. These include time series [16], or simulation models [17,18], multivariate linear regression [19], backpropagation neural network [20,21,22], and gray forecasting [23,24].
The evolution of any epidemic is defined and influenced by different factors, more precisely by a tendency towards randomness. Consequently, the statistical tools mentioned above are insufficient for analysis and are difficult to generalize. This is why the Auto-Regressive Integrated Moving Average (ARIMA) model has been successfully applied at a much larger scale in various fields, mainly due to its easy-to-use concept and utility algorithm [25].
Therefore, the present study aims to estimate the prevalence trend in Ukraine, Romania, the Republic of Moldova, Serbia, Bulgaria, and Hungary as Central European countries. Moreover, we will also consider the most affected countries presently, such as USA, Brazil, and India.

2. Materials and Methods

2.1. Data

The daily prevalence data of COVID-19 was taken from The Ministry of Internal Affairs of Romania (https://www.mai.gov.ro/informare-covid-19-grupul-de-comunicare-strategica), World Health Organization (WHO) (https://covid19.who.int/?gclid=CjwKCAjwi_b3BRAGEiwAemPNUYzgrAMkQXN5Z848tjCmGZLJecod03yWxqW_bN248wjgdezXeYg0RoCeFcQAvD_BwE), and European Centre for Disease Prevention and Control (ECDC) (https://www.ecdc.europa.eu/en). MS Excel was used to build a time-series database. Descriptive statistics of the COVID-19 data for the established intervals (10 March and 10 July) are given in Table 1. In order to create an optimum ARIMA model, at least 30 observations are needed [26].
Data analyzed corresponds to the period between 10 March and 10 July. The data set was used to perform and analyze a case estimation model by applying ARIMA that could help us to predict the SARS-CoV-2 evolution in the future.
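As a consistency check on the descriptive statistics in Table 1, the standard error of the mean can be recomputed from the standard deviation. This sketch assumes one observation per calendar day with both endpoints (10 March and 10 July) included, which is an assumption rather than something stated explicitly in the text. Using the prevalence row for Ukraine:

$$n = 22 + 30 + 31 + 30 + 10 = 123, \qquad \mathrm{SE}_{\text{mean}} = \frac{s}{\sqrt{n}} = \frac{15793.16}{\sqrt{123}} \approx 1424.0$$

This agrees with the tabulated SE Mean of 1424.02, which suggests that each series contains 123 daily observations.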
Therefore, for this study, a time series containing at least 45 data points was used to predict COVID-19 prevalence in six Central and Eastern European countries (Romania, Bulgaria, Serbia, Ukraine, Republic of Moldova, and Hungary). Furthermore, the same approach was applied to one country from South America (Brazil), one from North America (United States of America), and one from South Asia (India). Predictions were made for the next fourteen days with 95% relative confidence intervals (CI).
As seen from Figure 1, the COVID-19 outbreak hit Ukraine harder than the other five countries during the established period. The first case in Ukraine was reported on 3 March 2020. Compared with Ukraine, the COVID-19 pandemic started earlier in Romania (26 February) and later in the other four countries (4 March in Hungary, 6 March in Serbia, 7 March in the Republic of Moldova, and 8 March in Bulgaria). In Ukraine, the total number of confirmed cases of COVID-19 reported during the period is 52,043, and the highest number of new cases, 1366, was registered on 6 July.
The overall prevalence for Romania, the second hardest-hit region, was 31,381, followed by the Republic of Moldova with 18,666, Serbia with 17,342, Bulgaria with 6672, and Hungary with 4220 cases. Analogously, the highest incidence among the remaining five regions was in Romania with 614 new cases on 9 July, followed by the Republic of Moldova with 478 on 18 June, Serbia with 445 on 17 April, Bulgaria with 330 on 10 July, and Hungary with 210 on 10 April.
On the other hand, the first case in the USA was reported on 20 January, more than one month earlier than in Romania. In Brazil, the second hardest-hit of these three countries, the first case was reported on 26 February, while in India it was reported on 30 January. The overall prevalence for these three countries is as follows: USA with 3,038,325, Brazil with 1,713,160, and India with 793,892 cases. Concerning the incidence, the highest was, as expected, in the USA with 64,630 on 10 July, followed by Brazil with 54,771 on 21 June, and India with 26,506 on 10 July.

2.2. ARIMA Models

A time series is simply a series of time-dependent data points [27] used for analyses dedicated to revealing reliable and meaningful statistical data for the subsequent prediction of values of a series [28]. Since it was introduced by Box and Jenkins approximately half a century ago, ARIMA began to be used at a much larger scale [26].
In most cases, ARIMA is used because it takes into account trends, periodic changes, and random disturbances. Thus, ARIMA is suitable for a large spectrum of data, from seasonal to cyclical, and allows a temporal dependency to be modeled in a flexible manner.
Non-seasonal ARIMA models are defined by three parameters (p, d, q), where p is the order of autoregression, d is the degree of differencing, and q is the order of the moving average [29]. Setting some of these parameters to zero reduces ARIMA to simpler AR, I, or MA models.
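The role of the differencing degree d can be illustrated with a minimal Python sketch. The helper name `difference` and the toy numbers are illustrative assumptions and are not taken from the study data: differencing a cumulative (prevalence-like) series once yields daily increments (incidence-like values), and differencing twice yields the day-to-day change in those increments.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y'_t = y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# Hypothetical cumulative counts, used only for illustration.
cumulative = [1, 3, 7, 14, 25, 41]

daily_new = difference(cumulative, d=1)        # [2, 4, 7, 11, 16]
change_in_daily = difference(cumulative, d=2)  # [2, 3, 4, 5]
```

Each round of differencing shortens the series by one observation, so a model with d = 2 is fitted on two fewer points than the original series.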
AR (p) explains the present value $Y_t$ in terms of its previous values $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}$ and the current residual $\varepsilon_t$. MA (q) expresses the current value of the time series $Y_t$ in terms of its current and previous residuals $\varepsilon_t, \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}$. The general formulas of AR (p) and MA (q) are given in Equations (1) and (2).
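$$Y_t = \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \ldots + \Phi_p Y_{t-p} + \varepsilon_t \tag{1}$$
$$Y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \ldots - \theta_q \varepsilon_{t-q} \tag{2}$$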
where:
$p$, $q$: number of past values (lags) used by the autoregressive and moving average parts, respectively;
$\Phi$ and $\theta$: parameters of the autoregression and moving average, respectively;
$t$: time;
$Y_t$: observed value at time $t$;
$\varepsilon_t$: value of the random shock at time $t$.
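In other words, the ARMA (p, q) model expresses the current value as a linear combination of its previous values and of the current and previous residuals. The corresponding formula is given in the equation below:
$$Y_t = \alpha + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \ldots + \Phi_p Y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \ldots - \theta_q \varepsilon_{t-q}$$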
where:
$\alpha$: constant;
$\varepsilon_{t-1}$: value of the previous random shock.
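The ARMA recursion can be made concrete with a short simulation. The following is a minimal sketch, not the procedure used in this study: the function name `simulate_arma`, the Gaussian shocks, and the convention that values before the start of the series are treated as zero are all assumptions made for illustration.

```python
import random


def simulate_arma(phi, theta, n, alpha=0.0, sigma=1.0, seed=0):
    """Simulate Y_t = alpha + sum_i phi_i*Y_{t-i} + e_t - sum_j theta_j*e_{t-j}.

    phi and theta are lists of AR and MA coefficients (lengths p and q).
    Pre-sample values of Y and e are taken as zero (an assumption).
    Returns the simulated series and the random shocks.
    """
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    y, e = [], []
    for t in range(n):
        e_t = rng.gauss(0.0, sigma)  # random shock at time t
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * e[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        y.append(alpha + ar + e_t - ma)
        e.append(e_t)
    return y, e


# ARMA(1, 1) example with hypothetical coefficients.
series, shocks = simulate_arma(phi=[0.6], theta=[0.3], n=50)
```

Setting `theta=[]` gives a pure AR (p) process, and setting `phi=[]` gives a pure MA (q) process, matching the two special cases described above.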

2.3. Model Selection

In the present study, three performance criteria, namely Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), were applied to test the predictive accuracy of the ARIMA models. Mathematically, these three criteria are defined as follows:
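$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} e_t^2}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |e_t|$$
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{e_t}{y_t} \right|$$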
where:
$y_t$: value observed at time $t$;
$e_t$: difference between the observed and predicted values at time $t$;
$n$: number of time points.
For a better fit of the data, RMSE, MAE, and MAPE must have low values. All analyses were performed using STATGRAPHICS Centurion (v.18.1.13) software with a statistical significance level of p < 0.005.
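The three criteria can be computed directly from their definitions. The sketch below is a plain Python illustration with hypothetical function names and toy values; it is not the STATGRAPHICS implementation used by the authors.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error of the errors e_t = y_t - yhat_t."""
    errors = [y - f for y, f in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, predicted):
    """Mean Absolute Error."""
    errors = [y - f for y, f in zip(actual, predicted)]
    return sum(abs(e) for e in errors) / len(errors)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (requires nonzero y_t)."""
    ratios = [abs((y - f) / y) for y, f in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy observed and fitted values (hypothetical, for illustration only).
observed = [100, 200, 400]
fitted = [110, 190, 400]
print(rmse(observed, fitted), mae(observed, fitted), mape(observed, fitted))
```

Because MAPE divides each error by the observed value, it is undefined when an observation is zero and is scale-free, which makes it convenient for comparing countries with very different case counts.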

3. Results and Discussion

Forecasting the Prevalence of COVID-19 Pandemic Using the ARIMA Model

ARIMA modeling is composed of four iterative steps: assessment of the model, estimation of parameters, diagnostic checking, and prediction. The first step is to check whether the time series is stationary, that is, whether its mean, variance, and autocorrelation are constant over time, and whether it is seasonal, for better accuracy [30]. In this context, time series plots and Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) graphs (Figure 2) were constructed to verify seasonality and stationarity. The ACF shows whether the previous values of the series are related to the following ones, while the PACF highlights the degree of correlation between a variable and a lag of the said variable [31]. Estimated autocorrelations for the time series of the established countries are shown in Figure 3. Straight lines represent the two-standard-deviation limits, while bars that extend beyond the lines indicate statistically significant autocorrelations.
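The idea of comparing autocorrelation bars against two-standard-deviation limits can be sketched as follows. This example assumes the common large-sample approximation in which the standard error of each sample autocorrelation is 1/sqrt(n) under white noise; the software used in the study may compute the limits differently. The function names and the toy series are hypothetical.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a non-constant series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Return (lag, r_k) pairs whose |r_k| exceeds 2 / sqrt(n)."""
    limit = 2.0 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [(k, r) for k, r in enumerate(acf, start=1) if abs(r) > limit]


# A steadily increasing toy series: strongly autocorrelated, hence non-stationary.
trend = list(range(1, 21))
print(significant_lags(trend, max_lag=5))
```

A slowly decaying run of significant autocorrelations, as produced by the trending toy series, is the usual sign that differencing is needed before fitting AR or MA terms.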
Additionally, a series of ARIMA models was created, and their performances were compared using various statistical tools. All statistical procedures were performed on the transformed COVID-19 data. The ARIMA models with the minimum MAPE values were selected as the best models. Among the tested models, ARIMA (1, 1, 0), ARIMA (3, 2, 2), ARIMA (3, 2, 2), ARIMA (3, 1, 1), ARIMA (1, 0, 3), ARIMA (1, 2, 0), ARIMA (1, 1, 0), ARIMA (0, 2, 1), and ARIMA (0, 2, 0) were chosen as the best models for Ukraine, Romania, the Republic of Moldova, Serbia, Bulgaria, Hungary, USA, Brazil, and India, respectively. The models fitted to the COVID-19 data are presented in Figure 2, Table 2, and Table 3, with minimum MAPE values of 4.70244 (Ukraine), 1.40016 (Romania), 2.76751 (Republic of Moldova), 2.16733 (Serbia), 2.98154 (Bulgaria), 2.11239 (Hungary), 3.21569 (USA), 4.10596 (Brazil), and 2.78051 (India).
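Selecting the model with the minimum MAPE can be sketched as a small grid search. The example below is a deliberately simplified illustration and not the authors' procedure: it only considers ARIMA (p, d, 0) candidates with p equal to 0 or 1 and no constant term, fits the single AR coefficient by least squares on the differenced series, and scores one-step-ahead in-sample predictions. All function names and the toy series are hypothetical.

```python
def difference(series, d):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def mape_for_order(series, p, d, start):
    """In-sample one-step-ahead MAPE (%) of ARIMA(p, d, 0), p in {0, 1}."""
    if p not in (0, 1):
        raise ValueError("this sketch only supports p = 0 or p = 1")
    w = difference(series, d)
    phi = 0.0
    if p == 1:
        # Least-squares AR(1) coefficient on the differenced series.
        num = sum(w[i] * w[i - 1] for i in range(1, len(w)))
        den = sum(w[i - 1] ** 2 for i in range(1, len(w)))
        phi = num / den if den else 0.0
    total = 0.0
    for t in range(start, len(series)):
        i = t - d  # position of y_t in the differenced series
        w_hat = phi * w[i - 1] if p == 1 else 0.0
        # The error on the original scale equals the error on the differenced scale.
        total += abs((w[i] - w_hat) / series[t])
    return 100.0 * total / (len(series) - start)


def select_order(series, candidates):
    """Score every (p, d) candidate on a common sample and return the best."""
    start = max(p + d for p, d in candidates)
    scores = {(p, d, 0): mape_for_order(series, p, d, start) for p, d in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Toy cumulative-like series (hypothetical): quadratic growth.
toy = [t * t + 10 for t in range(30)]
best, scores = select_order(toy, [(0, 1), (1, 1), (0, 2), (1, 2)])
print(best, scores[best])
```

Scoring all candidates on the same range of time points keeps the comparison fair, since higher values of d and p consume more initial observations.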
Table 3 shows the parameter estimates for the best models. The p-values associated with the parameters are less than 0.005, so the terms are significantly different from zero at the 95.0% CI. The fitted and predicted values are presented in Figure 3. As seen in Table 4, the next 14-day estimate of confirmed cases may be between 52,816–59,679 in Ukraine, 31,838–38,650 in Romania, 18,836–21,601 in the Republic of Moldova, 17,639–21,313 in Serbia, 6931–10,000 in Bulgaria, 4225–4319 in Hungary, 3.10259 × 10^6–3.90611 × 10^6 in USA, 1.75087 × 10^6–2.24113 × 10^6 in Brazil, and 8.20308 × 10^5–1.16489 × 10^6 in India.
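To show how a forecast range with 95% limits arises, the sketch below works through the simplest of the selected structures, ARIMA (0, 2, 0), under the assumption of no constant term and Gaussian shocks. In that case the point forecast is a linear extrapolation of the last two observations, and the h-step forecast variance is the shock variance multiplied by 1² + 2² + … + h². The function name and toy data are hypothetical, and the numbers produced are not those of Table 4.

```python
import math


def forecast_arima_020(series, horizon, z=1.96):
    """Point forecasts and approximate 95% limits for ARIMA(0, 2, 0), no constant.

    Returns a list of (point, lower, upper) tuples for h = 1..horizon.
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")
    first = [b - a for a, b in zip(series, series[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    # Shock variance estimated from the second differences (zero-mean assumption).
    sigma2 = sum(x * x for x in second) / len(second)
    last, slope = series[-1], first[-1]
    forecasts = []
    sum_psi_sq = 0.0
    for h in range(1, horizon + 1):
        sum_psi_sq += h * h  # psi weights of a doubly integrated process: 1, 2, 3, ...
        point = last + h * slope
        half_width = z * math.sqrt(sigma2 * sum_psi_sq)
        forecasts.append((point, point - half_width, point + half_width))
    return forecasts


# Toy cumulative series (hypothetical), forecast 14 days ahead.
toy = [0, 1, 4, 9, 16, 26, 37, 50]
for h, (point, lo, hi) in enumerate(forecast_arima_020(toy, 14), start=1):
    print(h, round(point, 1), round(lo, 1), round(hi, 1))
```

The limits widen quickly with the horizon because shocks to a twice-differenced series accumulate in both the level and the slope, which is consistent with the broad 14-day ranges reported for the countries with d = 2.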
In the present study, ARIMA models were selected in which the forecast for future data is given by a parametric model relating the most recent data value to previous data values and previous noise (the residuals, in this context). The output summarizes the statistical significance of the terms in the forecasting model. Terms with p-values less than 0.05 are statistically significantly different from zero at the 95.0% confidence level. The p-values for the AR(x) and MA(x) terms are less than 0.05, so these terms are significantly different from 0. When the trend is increasing, in order to obtain a linearity or central trend, the model also chooses q. The estimated standard deviation of the input white noise depends on the best model that was selected during the simulations performed.
According to the current literature, this would be the first study of its kind. The idea of analyzing a cluster of nations, and the rate of the spread between them, is novel. In addition, this is the first study to address the situation of the most affected nations globally. In the present study, the current situation of the COVID-19 pandemic in Ukraine, Romania, the Republic of Moldova, Serbia, Bulgaria, Hungary, USA, Brazil, and India was presented, and the ongoing trend and extent of the outbreak were estimated by the ARIMA model. To the best of our knowledge, this study is the first to implement ARIMA models to predict the prevalence of COVID-19 in such a manner.
Limited data can be found in the current literature regarding the usage of ARIMA for the prediction of the COVID-19 course. Most reports evaluated the situation in western and southern Asia. Reports regarding the status of Europe are elusive for an unknown reason, and as a consequence, Europe gradually became the second mainland (Table 5). It should also be mentioned that papers that have been subjected to the peer-review process were excluded.
Effective strategies are now all the more imperative to control the spreading of COVID-19. Thus, estimating epidemiological trends is crucial for the allocation of medical resources and production activities.
Among the most effective alternatives that have proved their efficacy is quarantine. Chintalapudi et al. [34] have discussed the beneficial impact that lockdown had on transmissibility within the Italian population. A data-driven model analysis demonstrated a decrease of up to 35% in total registered cases, concomitantly with an increase of up to 66% in recovered cases after lockdown and self-isolation. The accuracy of these two parameters was 93.75% and 84.4%, respectively.
This tendency of regression proved to be true according to the results obtained by another group of authors. The accuracy of six performance metric models was tested. Long short-term memory (LSTM) was found to be the most accurate during the study, and prospective predictions for the next two weeks were made. Thus, a slight decrease in the total number of cumulative cases is expected [35].
These observations are strengthened by the results of Papastefanopoulos et al. [40]. Six different time series approaches were also utilized to test the accuracy concerning the COVID-19 outbreak for the top ten most affected countries. Machine learning time series methods were efficiently used to estimate the percentage of the population that will be affected.
By using a stochastic modified SEIR model (susceptible–exposed–infectious–recovered), and given the lack of effective pharmaceutical interventions against SARS-CoV-2, López et al. [41] concluded that social confinement should remain in place for the next two months. Behavior, awareness, and immunity decay is attributed to 99% of the current wave. The gradual incorporation of up to 50% of daily working proportion should also be considered.
It has been recently shown that Black and South Asian people are more prone to infection and subsequently death than the rest. Among the risk factors are age, being male, deprivation, diabetes, asthma, and numerous other medical conditions, according to the analysis of a cohort consisting of 17,278,392 UK individuals [42].
If all these restrictions are not respected, humanity will face a second wave of infections much more severe than the previous one [37], according to the latest statistics reported by WHO. Most certainly, governments’ internal politics and capability in managing the current situation will be decisive during this temporary crisis [33,36,37,38].
Assuming that 20% of the population of each county in the US will be infected, age-specific mortality patterns show that counties will probably be heavily affected. These findings point to the need for an adequate per capita allocation of medical care resources to outside communities to restrain the spread [43].
Chakraborty et al. [32] revealed that more attention should be paid to people over the age of 65, which is why intensive care and isolation are recommended for them. In addition, they suggest that the lockdown period must be extended, in parallel with the arrangement of medical centers by increasing the number of beds.
Furthermore, Demongeot et al. [39] have brought a new perspective regarding the important role temperature has on COVID-19 spreading, reflected by the total number of active cases. It seems that high temperature directly reduces contagion rates, but this does not mean seasonal temperature could not support the later reappearance following the usage of time series methods.

4. Conclusions

Forecasting the prevalence of a disease is crucial for health departments to create an optimum environment and conditions for patients. As has been presented throughout this manuscript, time series models play an important role in disease prediction. In this study, ARIMA time series models were successfully applied to estimate the overall prevalence of COVID-19 in nine countries, six of them being neighbors, while the other three are the most affected today.

Author Contributions

Writing—original draft, O.-D.I., R.-O.C., S.-I.T.; Software, S.-I.T.; Conceptualization, Visualization, Writing—review and editing, O.-D.I., A.C., I.M.; Methodology and Validation, B.D. All authors have read and agreed to the published version of the manuscript.

Funding

A.C. is supported by a research grant for Young Teams offered by UEFISCDI Romania, No. PN-III-P1-1.1-TE-2016-1210, contract No. 58 from 02/05/2018, called “Complex study regarding the interactions between oxidative stress, inflammation and neurological manifestations in the pathophysiology of irritable bowel syndrome (animal models and human patients)”.

Acknowledgments

Not applicable, with the exception of the research grant mentioned above.

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

  1. Rothan, H.A.; Byrareddy, S.N. The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak. J. Autoimmun. 2020, 109, 102433.
  2. Zhu, N.; Zhang, D.; Wang, W.; Li, X.; Yang, B.; Song, J.; Zhao, X.; Huang, B.; Shi, W.; Lu, R.; et al. A Novel Coronavirus from Patients with Pneumonia in China, 2019. N. Engl. J. Med. 2020, 382, 727–733.
  3. Guan, W.; Ni, Z.; Hu, Y.; Liang, W.; Ou, C.; He, J.; Liu, L.; Shan, H.; Lei, C.; Hui, D.S.C.; et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N. Engl. J. Med. 2020, 382, 1708–1720.
  4. Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X.; et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506.
  5. Chen, N.; Zhou, M.; Dong, X.; Qu, J.; Gong, F.; Han, Y.; Qiu, Y.; Wang, J.; Liu, Y.; Wei, Y.; et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: A descriptive study. Lancet 2020, 395, 507–513.
  6. Jin, X.; Lian, J.-S.; Hu, J.-H.; Gao, J.; Zheng, L.; Zhang, Y.-M.; Hao, S.-R.; Jia, H.-Y.; Cai, H.; Zhang, X.-L.; et al. Epidemiological, clinical and virological characteristics of 74 cases of coronavirus-infected disease 2019 (COVID-19) with gastrointestinal symptoms. Gut 2020, 69, 1002.
  7. Zhou, F.; Yu, T.; Du, R.; Fan, G.; Liu, Y.; Liu, Z.; Xiang, J.; Wang, Y.; Song, B.; Gu, X.; et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: A retrospective cohort study. Lancet 2020, 395, 1054–1062.
  8. Wang, D.; Hu, B.; Hu, C.; Zhu, F.; Liu, X.; Zhang, J.; Wang, B.; Xiang, H.; Cheng, Z.; Xiong, Y.; et al. Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus–Infected Pneumonia in Wuhan, China. JAMA 2020, 323, 1061–1069.
  9. Lin, L.; Jiang, X.; Zhang, Z.; Huang, S.; Zhang, Z.; Fang, Z.; Gu, Z.; Gao, L.; Shi, H.; Mai, L.; et al. Gastrointestinal symptoms of 95 cases with SARS-CoV-2 infection. Gut 2020, 69, 997.
  10. Xu, X.-W.; Wu, X.-X.; Jiang, X.-G.; Xu, K.-J.; Ying, L.-J.; Ma, C.-L.; Li, S.-B.; Wang, H.-Y.; Zhang, S.; Gao, H.-N.; et al. Clinical findings in a group of patients infected with the 2019 novel coronavirus (SARS-Cov-2) outside of Wuhan, China: Retrospective case series. BMJ 2020, 368, m606.
  11. Zhang, H.; Kang, Z.; Gong, H.; Xu, D.; Wang, J.; Li, Z.; Li, Z.; Cui, X.; Xiao, J.; Zhan, J.; et al. Digestive system is a potential route of COVID-19: An analysis of single-cell coexpression pattern of key proteins in viral entry process. Gut 2020, 69, 1010.
  12. Ong, J.; Young, B.E.; Ong, S. COVID-19 in gastroenterology: A clinical perspective. Gut 2020, 69, 1144.
  13. Song, Y.; Liu, P.; Shi, X.L.; Chu, Y.L.; Zhang, J.; Xia, J.; Gao, X.Z.; Qu, T.; Wang, M.Y. SARS-CoV-2 induced diarrhoea as onset symptom in patient with COVID-19. Gut 2020, 69, 1143.
  14. Liang, W.; Feng, Z.; Rao, S.; Xiao, C.; Xue, X.; Lin, Z.; Zhang, Q.; Qi, W. Diarrhoea may be underestimated: A missing link in 2019 novel coronavirus. Gut 2020, 69, 1141.
  15. Wang, L.; Li, J.; Guo, S.; Xie, N.; Yao, L.; Cao, Y.; Day, S.W.; Howard, S.C.; Graff, J.C.; Gu, T.; et al. Real-time estimation and prediction of mortality caused by COVID-19 with patient information based algorithm. Sci. Total Environ. 2020, 727, 138394.
  16. Kurbalija, V.; Radovanović, M.; Ivanović, M.; Schmidt, D.; von Trzebiatowski, G.L.; Burkhard, H.-D.; Hinrichs, C. Time-series analysis in the medical domain: A study of Tacrolimus administration and influence on kidney graft function. Comput. Biol. Med. 2014, 50, 19–31.
  17. Nsoesie, E.; Beckman, R.; Shashaani, S.; Nagaraj, K.; Marathe, M. A Simulation Optimization Approach to Epidemic Forecasting. PLoS ONE 2013, 8, e67164.
  18. Orbann, C.; Sattenspiel, L.; Miller, E.; Dimka, J. Defining epidemics in computer simulation models: How do definitions influence conclusions? Epidemics 2017, 19, 24–32.
  19. Thomson, M.C.; Molesworth, A.M.; Djingarey, M.H.; Yameogo, K.R.; Belanger, F.; Cuevas, L.E. Potential of environmental models to predict meningitis epidemics in Africa. Trop. Med. Int. Health 2006, 11, 781–788.
  20. Liu, Q.; Li, Z.; Ji, Y.; Martinez, L.; Zia, U.H.; Javaid, A.; Lu, W.; Wang, J. Forecasting the seasonality and trend of pulmonary tuberculosis in Jiangsu Province of China using advanced statistical time-series analyses. Infect. Drug Resist. 2019, 12, 2311–2322.
  21. Ren, H.; Li, J.; Yuan, Z.-A.; Hu, J.-Y.; Yu, Y.; Lu, Y.-H. The development of a combined mathematical model to forecast the incidence of hepatitis E in Shanghai, China. BMC Infect. Dis. 2013, 13, 421.
  22. Zhang, X.; Liu, Y.; Yang, M.; Zhang, T.; Young, A.; Li, X. Comparative Study of Four Time Series Methods in Forecasting Typhoid Fever Incidence in China. PLoS ONE 2013, 8, e63116.
  23. Wang, Y.; Shen, Z.; Jiang, Y. Comparison of ARIMA and GM(1,1) models for prediction of hepatitis B in China. PLoS ONE 2018, 13, e0201987.
  24. Zhang, L.; Wang, L.; Zheng, Y.; Wang, K.; Zhang, X.; Zheng, Y. Time Prediction Models for Echinococcosis Based on Gray System Theory and Epidemic Dynamics. Int. J. Environ. Res. Public Health 2017, 14, 262.
  25. Cao, L.; Liu, H.; Li, J.; Yin, X.; Duan, Y.; Wang, J. Relationship of meteorological factors and human brucellosis in Hebei province, China. Sci. Total Environ. 2020, 703, 135491.
  26. Wilson, T.G. Time Series Analysis: Forecasting and Control, 5th Edition, by George E. P. Box, Gwilym M. Jenkins, Gregory C. Reinsel and Greta M. Ljung, 2015. Published by John Wiley and Sons Inc., Hoboken, NJ USA, pp. 712, ISBN: 978-1-118-67502-1. J. Time Ser. Anal. 2016, 37.
  27. Fanoodi, B.; Malmir, B.; Jahantigh, F.F. Reducing demand uncertainty in the platelet supply chain through artificial neural networks and ARIMA models. Comput. Biol. Med. 2019, 113, 103415.
  28. Benvenuto, D.; Giovanetti, M.; Vassallo, L.; Angeletti, S.; Ciccozzi, M. Application of the ARIMA model on the COVID-2019 epidemic dataset. Data Br. 2020, 29, 105340.
  29. Li, X.; Zhang, C.; Zhang, B.; Liu, K. A comparative time series analysis and modeling of aerosols in the contiguous United States and China. Sci. Total Environ. 2019, 690, 799–811.
  30. Elevli, S.; Uzgören, N.; Bingöl, D.; Elevli, B. Drinking water quality control: Control charts for turbidity and pH. J. Water Sanit. Hyg. Dev. 2016, 6, 511–518.
  31. He, Z.; Tao, H. Epidemiology and ARIMA model of positive-rate of influenza viruses among children in Wuhan, China: A nine-year retrospective study. Int. J. Infect. Dis. 2018, 74, 61–70.
  32. Chakraborty, T.; Ghosh, I. Real-time forecasts and risk assessment of novel coronavirus (COVID-19) cases: A data-driven analysis. Chaos Solitons Fractals 2020, 135, 109850.
  33. Ahmar, A.S.; del Val, E.B. SutteARIMA: Short-term forecasting method, a case: Covid-19 and stock market in Spain. Sci. Total Environ. 2020, 729, 138883.
  34. Chintalapudi, N.; Battineni, G.; Amenta, F. COVID-19 virus outbreak forecasting of registered and recovered cases after sixty day lockdown in Italy: A data driven model approach. J. Microbiol. Immunol. Infect. 2020, 53, 396–403.
  35. Kırbaş, İ.; Sözen, A.; Tuncer, A.D.; Kazancıoğlu, F. Comparative analysis and forecasting of COVID-19 cases in various European countries with ARIMA, NARNN and LSTM approaches. Chaos Solitons Fractals 2020, 110015.
  36. Ceylan, Z. Estimation of COVID-19 prevalence in Italy, Spain, and France. Sci. Total Environ. 2020, 729, 138817.
  37. Singh, R.K.; Rani, M.; Bhagavathula, A.S.; Sah, R.; Rodriguez-Morales, A.J.; Kalita, H.; Nanda, C.; Sharma, S.; Sharma, Y.D.; Rabaan, A.A.; et al. Prediction of the COVID-19 Pandemic for the Top 15 Affected Countries: Advanced Autoregressive Integrated Moving Average (ARIMA) Model. JMIR Public Health Surveill. 2020, 6, e19115.
  38. Modeling and Forecasting for the number of cases of the COVID-19 pandemic with the Curve Estimation Models, the Box-Jenkins and Exponential Smoothing Methods. Eurasian J. Med. Oncol. 2020, 4, 160–165.
  39. Demongeot, J.; Flet-Berliac, Y.; Seligmann, H. Temperature Decreases Spread Parameters of the New Covid-19 Case Dynamics. Biology 2020, 9, 94.
  40. Papastefanopoulos, V.; Linardatos, P.; Kotsiantis, S. COVID-19: A Comparison of Time Series Methods to Forecast Percentage of Active Cases per Population. Appl. Sci. 2020, 10, 3880.
  41. López, L.; Rodó, X. The end of social confinement and COVID-19 re-emergence risk. Nat. Hum. Behav. 2020, 4, 746–755.
  42. Williamson, E.J.; Walker, A.J.; Bhaskaran, K.; Bacon, S.; Bates, C.; Morton, C.E.; Curtis, H.J.; Mehrkar, A.; Evans, D.; Inglesby, P.; et al. OpenSAFELY: Factors associated with COVID-19 death in 17 million patients. Nature 2020.
  43. Miller, I.F.; Becker, A.D.; Grenfell, B.T.; Metcalf, C.J.E. Disease and healthcare burden of COVID-19 in the United States. Nat. Med. 2020.
Figure 1. The (a,b) prevalence and (a’,b’) incidence of COVID-19 within the established countries.
Figure 2. The estimated Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) graphs to predict the epidemiological trend of COVID-19 prevalence for Ukraine, Romania, the Republic of Moldova, Serbia, Bulgaria, Hungary, USA, Brazil, and India.
Figure 3. Time-series plots for the best ARIMA models.
Table 1. Descriptive statistics on the prevalence and incidence of coronavirus (COVID-19) in the established countries.

(a) Prevalence

| Continents | Country | Mean | SE Mean | St. Dev | Minimum | Maximum | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| Central and Eastern Europe | Ukraine | 17,545.34 | 1424.02 | 15,793.16 | 1 | 52,043 | 0.5836 | −0.8257 |
| | Romania | 13,958.65 | 837.07 | 9283.60 | 25 | 31,381 | −0.0447 | −1.1718 |
| | Republic of Moldova | 6341.02 | 516.67 | 5730.20 | 3 | 18,666 | 0.6936 | −0.7263 |
| | Serbia | 8159.77 | 468.22 | 5192.87 | 5 | 17,342 | −0.3619 | −1.1779 |
| | Bulgaria | 2063.34 | 151.29 | 1677.90 | 4 | 6672 | 0.7943 | −0.0721 |
| | Hungary | 2618.48 | 138.75 | 1538.90 | 12 | 4220 | −0.5892 | −1.2725 |
| South and North America, and South Asia | USA | 1,242,336.35 | 80,297.41 | 890,541.35 | 696 | 3,038,325 | 0.1309 | −1.1182 |
| | Brazil | 405,199.86 | 45,415.32 | 503,680.30 | 25 | 1,713,160 | 1.1576 | 0.0707 |
| | India | 168,929.42 | 19,430.70 | 215,496.91 | 50 | 793,802 | 1.3365 | 0.7176 |

(b) Incidence

| Continents | Country | Mean | SE Mean | St. Dev | Minimum | Maximum | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| Central and Eastern Europe | Ukraine | 423.10 | 25.90 | 287.33 | 0 | 1366 | 0.3931 | −0.0052 |
| | Romania | 255.00 | 11.65 | 129.25 | 6 | 614 | 0.2298 | −0.0560 |
| | Republic of Moldova | 151.74 | 10.03 | 111.27 | 0 | 478 | 0.7573 | 0.1075 |
| | Serbia | 140.98 | 10.22 | 113.38 | 0 | 445 | 0.8188 | −0.4346 |
| | Bulgaria | 54.21 | 5.08 | 56.44 | 0 | 330 | 1.9756 | 4.8870 |
| | Hungary | 34.23 | 3.09 | 34.36 | 0 | 210 | 1.6979 | 4.7398 |
| South and North America, and South Asia | USA | 24,697.99 | 1152.30 | 12,779.70 | 0 | 64,630 | 0.1378 | 0.8261 |
| | Brazil | 13,927.92 | 1307.41 | 14,499.97 | 0 | 54,771 | 0.8961 | −0.3030 |
| | India | 6453.31 | 644.93 | 7152.72 | 0 | 26,506 | 1.1402 | 0.2809 |
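
The SE Mean and St. Dev columns of Table 1 are linked by the usual relation SE = SD / √n. The short Python sketch below checks that relation for three prevalence rows. It assumes one observation per day from 10 March to 10 July 2020 inclusive (n = 123), which is inferred from the stated data window and not given in the table; the function name `standard_error` is illustrative.

```python
import math

# Assumption: one observation per day, 10 March to 10 July 2020 inclusive.
N_DAYS = 123

# (country, St. Dev, reported SE Mean) copied from Table 1(a), prevalence.
ROWS = [
    ("Ukraine", 15793.16, 1424.02),
    ("Romania", 9283.60, 837.07),
    ("India", 215496.91, 19430.70),
]


def standard_error(st_dev: float, n: int) -> float:
    """Standard error of the mean for n observations."""
    return st_dev / math.sqrt(n)


for country, st_dev, reported in ROWS:
    computed = standard_error(st_dev, N_DAYS)
    print(f"{country}: computed {computed:.2f}, reported {reported:.2f}")
```

Under that assumption, the recomputed values agree with the reported SE Mean column to within rounding.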
Table 2. Comparison of tested Auto-Regressive Integrated Moving Average (ARIMA) models.

| Country | Model | RMSE | MAE | MAPE |
|---|---|---|---|---|
| Ukraine | (1, 1, 0) | 182.403 | 86.857 | 4.70244 |
| | (0, 2, 0) | 184.534 | 84.6694 | 4.75145 |
| | (3, 2, 0) | 140.184 | 87.0874 | 4.86564 |
| | (3, 0, 0) | 140.834 | 83.9104 | 5.02043 |
| | (2, 2, 0) | 141.809 | 86.6818 | 5.08194 |
| Romania | (3, 2, 2) | 72.2811 | 54.8283 | 1.40016 |
| | (1, 2, 3) | 77.4246 | 57.1017 | 1.45906 |
| | (2, 2, 3) | 74.5154 | 55.2809 | 1.48125 |
| | (3, 2, 3) | 76.4564 | 56.4977 | 1.52647 |
| | (2, 2, 2) | 78.7986 | 58.4657 | 1.53212 |
| Republic of Moldova | (3, 2, 2) | 61.1658 | 43.5817 | 2.76751 |
| | (3, 2, 1) | 60.7849 | 43.5131 | 2.77257 |
| | (2, 2, 1) | 60.6597 | 43.5749 | 2.77809 |
| | (3, 2, 3) | 61.6063 | 43.8593 | 2.84718 |
| | (2, 2, 3) | 61.341 | 43.8542 | 2.85937 |
| Serbia | (3, 1, 1) | 43.0079 | 28.8086 | 2.16733 |
| | (2, 1, 3) | 43.0409 | 29.127 | 2.17147 |
| | (1, 1, 3) | 42.8633 | 29.1174 | 2.17271 |
| | (3, 1, 0) | 42.8659 | 28.847 | 2.17729 |
| | (2, 1, 2) | 42.8686 | 29.1841 | 2.17814 |
| Bulgaria | (1, 0, 3) | 33.4732 | 23.1431 | 2.98154 |
| | (2, 0, 2) | 33.7635 | 23.0537 | 3.04647 |
| | (3, 0, 0) | 33.5995 | 22.8279 | 3.08918 |
| | (3, 2, 0) | 35.4064 | 23.7486 | 3.08997 |
| | (2, 2, 2) | 78.7986 | 58.4657 | 1.53212 |
| Hungary | (1, 2, 0) | 23.0452 | 15.101 | 2.11239 |
| | (0, 2, 3) | 21.7985 | 13.6316 | 2.15973 |
| | (3, 2, 0) | 22.6729 | 14.2714 | 2.16096 |
| | (3, 0, 0) | 22.6563 | 14.488 | 2.16571 |
| | (2, 2, 3) | 21.9873 | 13.6272 | 2.16876 |
| USA | (1, 1, 0) | 6539.46 | 4673.82 | 3.21569 |
| | (0, 2, 0) | 6541.2 | 4710.64 | 3.2431 |
| | (3, 2, 1) | 5818.42 | 4379.88 | 3.29508 |
| | (1, 2, 3) | 5868.51 | 4434.36 | 3.29553 |
| | (2, 2, 3) | 5888.31 | 4430.09 | 3.29999 |
| Brazil | (0, 2, 1) | 6134.91 | 3838.17 | 4.10596 |
| | (2, 1, 0) | 6493.37 | 3521.69 | 4.14127 |
| | (2, 2, 1) | 5454.19 | 3118.73 | 4.15452 |
| | (1, 2, 0) | 6515.51 | 3598.89 | 4.16568 |
| | (3, 2, 1) | 5457.52 | 3082.2 | 4.1698 |
| India | (0, 2, 0) | 642.607 | 416.132 | 2.78051 |
| | (1, 1, 0) | 574.812 | 376.235 | 2.7951 |
| | (2, 1, 0) | 570.378 | 373.247 | 3.06874 |
| | (1, 1, 2) | 524.071 | 358.294 | 3.19978 |
| | (3, 0, 1) | 543.562 | 358.125 | 3.29689 |
Table 3. Parameters of ARIMA models.

| Country and Best Model | Parameter | Estimate | Standard Error | t-Statistic | p-Value |
|---|---|---|---|---|---|
| Ukraine (1, 1, 0) | AR(1) | 0.943844 | 0.0325404 | 29.0053 | 0.000000 |
| Romania (3, 2, 2) | AR(3) | −0.410628 | 0.103264 | −3.97648 | 0.000122 |
| | MA(2) | −0.758911 | 0.0916899 | −8.27702 | 0.000000 |
| Republic of Moldova (3, 2, 2) | AR(3) | −0.162489 | 10.6563 | −0.0152482 | 0.987860 |
| | MA(2) | 0.341459 | 26.5106 | 0.0128801 | 0.989746 |
| Serbia (3, 1, 1) | AR(3) | 0.241924 | 1.06252 | 0.227689 | 0.820282 |
| | MA(1) | −0.572339 | 2.98064 | −0.192018 | 0.848058 |
| Bulgaria (1, 0, 3) | AR(1) | 1.02769 | 0.00227845 | 451.048 | 0.000000 |
| | MA(3) | −0.267346 | 0.0937488 | −2.85172 | 0.005128 |
| Hungary (1, 2, 0) | AR(1) | −0.401032 | 0.0836831 | −4.79227 | 0.000005 |
| USA (1, 1, 0) | AR(1) | 0.99441 | 0.0217047 | 45.8154 | 0.000000 |
| Brazil (0, 2, 1) | MA(1) | 0.758422 | 0.0565645 | 13.4081 | 0.000000 |
| India (0, 2, 0) | no parameter(s) | | | | |
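
Each t-statistic in Table 3 is the parameter estimate divided by its standard error. The Python sketch below recomputes that ratio for four rows and labels each parameter against a cutoff of 1.96. The cutoff is an assumption (the large-sample normal value for a two-sided 5% test); the reference distribution actually used to obtain the reported p-values is not stated in the table.

```python
# (label, estimate, standard error, reported t-statistic) copied from Table 3.
ROWS = [
    ("Ukraine AR(1)", 0.943844, 0.0325404, 29.0053),
    ("Romania MA(2)", -0.758911, 0.0916899, -8.27702),
    ("Republic of Moldova AR(3)", -0.162489, 10.6563, -0.0152482),
    ("Serbia MA(1)", -0.572339, 2.98064, -0.192018),
]

# Assumption: large-sample normal cutoff for a two-sided 5% test.
CRITICAL_VALUE = 1.96


def t_statistic(estimate: float, standard_error: float) -> float:
    """Ratio of a parameter estimate to its standard error."""
    return estimate / standard_error


for label, estimate, std_err, reported in ROWS:
    t = t_statistic(estimate, std_err)
    verdict = "significant" if abs(t) > CRITICAL_VALUE else "not significant"
    print(f"{label}: t = {t:.4f} (reported {reported}), {verdict} at about 5%")
```

Read this way, the Republic of Moldova and Serbia parameters have standard errors far larger than their estimates, which is consistent with the large p-values reported for them.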
Table 4. Prediction of total confirmed cases of COVID-19 for the next fourteen days according to ARIMA models with 95% confidence interval.

Ukraine ARIMA (1,1,0), Romania ARIMA (3,2,2), Republic of Moldova ARIMA (3,2,2)

| Period | Ukraine Forecast | Ukraine Lower 95% Limit | Ukraine Upper 95% Limit | Romania Forecast | Romania Lower 95% Limit | Romania Upper 95% Limit | Republic of Moldova Forecast | Republic of Moldova Lower 95% Limit | Republic of Moldova Upper 95% Limit |
|---|---|---|---|---|---|---|---|---|---|
| 11-7-20 | 52,816.0 | 52,454.9 | 53,177.1 | 31,838.2 | 31,694.9 | 31,981.5 | 18,836.6 | 18,715.5 | 18,957.8 |
| 12-7-20 | 53,545.6 | 52,756.2 | 54,335.0 | 32,261.6 | 32,023.7 | 32,499.5 | 19,037.2 | 18,806.5 | 19,268.0 |
| 13-7-20 | 54,234.2 | 52,941.6 | 55,526.9 | 32,719.8 | 32,386.2 | 33,053.5 | 19,259.3 | 18,940.0 | 19,578.5 |
| 14-7-20 | 54,884.2 | 53,031.4 | 56,736.9 | 33,267.3 | 32,849.6 | 33,685.1 | 19,478.9 | 19,081.4 | 19,876.5 |
| 15-7-20 | 55,497.6 | 53,040.6 | 57,954.7 | 33,872.9 | 33,362.9 | 34,383.0 | 19,691.4 | 19,211.9 | 20,170.8 |
| 16-7-20 | 56,076.7 | 52,980.2 | 59,173.1 | 34,469.6 | 33,845.7 | 35,093.5 | 19,901.5 | 19,332.3 | 20,470.6 |
| 17-7-20 | 56,623.2 | 52,859.4 | 60,386.9 | 35,003.7 | 34,237.5 | 35,769.8 | 20,113.1 | 19,448.2 | 20,778.0 |
| 18-7-20 | 57,139.0 | 52,685.5 | 61,592.4 | 35,477.4 | 34,549.7 | 36,405.1 | 20,326.1 | 19,561.3 | 21,090.8 |
| 19-7-20 | 57,625.8 | 52,464.9 | 62,786.7 | 35,938.6 | 34,844.5 | 37,032.8 | 20,539.0 | 19,670.8 | 21,407.3 |
| 20-7-20 | 58,085.3 | 52,202.9 | 63,967.7 | 36,438.8 | 35,182.0 | 37,695.7 | 20,751.6 | 19,775.9 | 21,727.3 |
| 21-7-20 | 58,519.0 | 51,904.2 | 65,133.8 | 36,992.8 | 35,575.4 | 38,410.2 | 20,964.0 | 19,876.7 | 22,051.2 |
| 22-7-20 | 58,928.3 | 51,573.0 | 66,283.7 | 37,572.2 | 35,988.4 | 39,156.0 | 21,176.4 | 19,973.7 | 22,379.1 |
| 23-7-20 | 59,314.7 | 51,212.8 | 67,416.6 | 38,132.9 | 36,369.6 | 39,896.1 | 21,389.0 | 20,067.1 | 22,710.8 |
| 24-7-20 | 59,679.3 | 50,826.8 | 68,531.9 | 38,650.7 | 36,693.2 | 40,608.1 | 21,601.5 | 20,156.8 | 23,046.2 |

Serbia ARIMA (3,1,1), Bulgaria ARIMA (1,0,3), Hungary ARIMA (1,2,0)

| Period | Serbia Forecast | Serbia Lower 95% Limit | Serbia Upper 95% Limit | Bulgaria Forecast | Bulgaria Lower 95% Limit | Bulgaria Upper 95% Limit | Hungary Forecast | Hungary Lower 95% Limit | Hungary Upper 95% Limit |
|---|---|---|---|---|---|---|---|---|---|
| 11-7-20 | 17,639.6 | 17,554.5 | 17,724.8 | 6931.5 | 6865.22 | 6997.79 | 4225.99 | 4180.36 | 4271.62 |
| 12-7-20 | 17,927.0 | 17,765.1 | 18,088.8 | 7179.18 | 7065.11 | 7293.25 | 4233.59 | 4147.54 | 4319.64 |
| 13-7-20 | 18,214.2 | 17,956.8 | 18,471.6 | 7405.16 | 7239.7 | 7570.63 | 4240.54 | 4102.74 | 4378.34 |
| 14-7-20 | 18,501.8 | 18,135.5 | 18,868.0 | 7610.22 | 7392.89 | 7827.55 | 4247.75 | 4051.78 | 4443.73 |
| 15-7-20 | 18,786.8 | 18,300.4 | 19,273.2 | 7820.95 | 7559.81 | 8082.1 | 4254.86 | 3993.94 | 4515.78 |
| 16-7-20 | 19,072.0 | 18,454.5 | 19,689.5 | 8037.53 | 7736.95 | 8338.1 | 4262.01 | 3930.38 | 4593.64 |
| 17-7-20 | 19,355.4 | 18,597.4 | 20,113.5 | 8260.09 | 7922.85 | 8597.34 | 4269.14 | 3861.35 | 4676.93 |
| 18-7-20 | 19,638.3 | 18,730.5 | 20,546.1 | 8488.83 | 8116.76 | 8860.89 | 4276.28 | 3787.29 | 4765.27 |
| 19-7-20 | 19,919.9 | 18,854.1 | 20,985.8 | 8723.89 | 8318.28 | 9129.5 | 4283.42 | 3708.46 | 4858.38 |
| 20-7-20 | 20,200.7 | 18,968.8 | 21,432.6 | 8965.47 | 8527.2 | 9403.73 | 4290.56 | 3625.12 | 4955.99 |
| 21-7-20 | 20,480.4 | 19,075.0 | 21,885.8 | 9213.73 | 8743.44 | 9684.02 | 4297.69 | 3537.49 | 5057.89 |
| 22-7-20 | 20,759.2 | 19,173.2 | 22,345.2 | 9468.87 | 8966.96 | 9970.78 | 4304.83 | 3445.76 | 5163.91 |
| 23-7-20 | 21,036.9 | 19,263.5 | 22,810.3 | 9731.07 | 9197.81 | 10,264.3 | 4311.97 | 3350.08 | 5273.86 |
| 24-7-20 | 21,313.7 | 19,346.4 | 23,281.0 | 10,000.5 | 9436.04 | 10,565.0 | 4319.11 | 3250.6 | 5387.62 |

USA ARIMA (1,1,0), Brazil ARIMA (0,2,1), India ARIMA (0,2,0)

| Period | USA Forecast | USA Lower 95% Limit | USA Upper 95% Limit | Brazil Forecast | Brazil Lower 95% Limit | Brazil Upper 95% Limit | India Forecast | India Lower 95% Limit | India Upper 95% Limit |
|---|---|---|---|---|---|---|---|---|---|
| 11-7-20 | 3.10259 × 10^6 | 3.08965 × 10^6 | 3.11554 × 10^6 | 1.75087 × 10^6 | 1.73873 × 10^6 | 1.76302 × 10^6 | 820,308 | 819,036 | 821,580 |
| 12-7-20 | 3.1665 × 10^6 | 3.13762 × 10^6 | 3.19539 × 10^6 | 1.78858 × 10^6 | 1.76922 × 10^6 | 1.80795 × 10^6 | 846,814 | 843,969 | 849,659 |
| 13-7-20 | 3.23006 × 10^6 | 3.18183 × 10^6 | 3.27828 × 10^6 | 1.8263 × 10^6 | 1.79985 × 10^6 | 1.85275 × 10^6 | 873,320 | 868,560 | 878,080 |
| 14-7-20 | 3.29325 × 10^6 | 3.2228 × 10^6 | 3.3637 × 10^6 | 1.86401 × 10^6 | 1.83027 × 10^6 | 1.89775 × 10^6 | 899,826 | 892,858 | 906,794 |
| 15-7-20 | 3.3561 × 10^6 | 3.26091 × 10^6 | 3.45129 × 10^6 | 1.90172 × 10^6 | 1.86038 × 10^6 | 1.94306 × 10^6 | 926,332 | 916,897 | 935,767 |
| 16-7-20 | 3.41859 × 10^6 | 3.2964 × 10^6 | 3.54077 × 10^6 | 1.93943 × 10^6 | 1.89016 × 10^6 | 1.98871 × 10^6 | 952,838 | 940,702 | 964,974 |
| 17-7-20 | 3.48073 × 10^6 | 3.3295 × 10^6 | 3.63196 × 10^6 | 1.97714 × 10^6 | 1.91958 × 10^6 | 2.03471 × 10^6 | 979,344 | 964,291 | 994,397 |
| 18-7-20 | 3.54253 × 10^6 | 3.36035 × 10^6 | 3.7247 × 10^6 | 2.01486 × 10^6 | 1.94866 × 10^6 | 2.08105 × 10^6 | 1.00585 × 10^6 | 987,679 | 1.02402 × 10^6 |
| 19-7-20 | 3.60398 × 10^6 | 3.3891 × 10^6 | 3.81885 × 10^6 | 2.05257 × 10^6 | 1.9774 × 10^6 | 2.12774 × 10^6 | 1.03236 × 10^6 | 1.01088 × 10^6 | 1.05383 × 10^6 |
| 20-7-20 | 3.66508 × 10^6 | 3.41586 × 10^6 | 3.91431 × 10^6 | 2.09028 × 10^6 | 2.0058 × 10^6 | 2.17476 × 10^6 | 1.05886 × 10^6 | 1.0339 × 10^6 | 1.08382 × 10^6 |
| 21-7-20 | 3.72585 × 10^6 | 3.44073 × 10^6 | 4.01097 × 10^6 | 2.12799 × 10^6 | 2.03387 × 10^6 | 2.22211 × 10^6 | 1.08537 × 10^6 | 1.05675 × 10^6 | 1.11399 × 10^6 |
| 22-7-20 | 3.78627 × 10^6 | 3.46379 × 10^6 | 4.10876 × 10^6 | 2.16571 × 10^6 | 2.06163 × 10^6 | 2.26978 × 10^6 | 1.11187 × 10^6 | 1.07944 × 10^6 | 1.14431 × 10^6 |
| 23-7-20 | 3.84636 × 10^6 | 3.48513 × 10^6 | 4.2076 × 10^6 | 2.20342 × 10^6 | 2.08907 × 10^6 | 2.31776 × 10^6 | 1.13838 × 10^6 | 1.10197 × 10^6 | 1.17479 × 10^6 |
| 24-7-20 | 3.90611 × 10^6 | 3.5048 × 10^6 | 4.30743 × 10^6 | 2.24113 × 10^6 | 2.11621 × 10^6 | 2.36605 × 10^6 | 1.16489 × 10^6 | 1.12435 × 10^6 | 1.20542 × 10^6 |
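
The India column of Table 4 can be approximately reproduced by hand, because an ARIMA (0, 2, 0) model has no AR or MA parameters: its point forecast simply extends the last observed daily increment in a straight line. The Python sketch below illustrates this under several assumptions that are not stated in the table: no constant term; the last cumulative value (793,802) and the last daily increment (26,506) are the maxima listed for India in Table 1, taken here to be the 10 July 2020 values; and the innovation standard deviation (about 649) is back-solved from the width of the first-day interval. The function name `arima_020_forecast` is illustrative.

```python
import math


def arima_020_forecast(last, previous, sigma, horizon, z=1.96):
    """Point forecasts and intervals for an ARIMA(0,2,0) model without constant.

    The second differences are treated as white noise with standard
    deviation `sigma`, so the forecast continues the last increment.
    """
    step = last - previous
    rows = []
    for h in range(1, horizon + 1):
        point = last + h * step
        # The psi-weights of (1 - B)^-2 are 1, 2, 3, ..., so the h-step
        # forecast variance is sigma^2 * (1^2 + 2^2 + ... + h^2).
        variance_multiplier = h * (h + 1) * (2 * h + 1) / 6
        half_width = z * sigma * math.sqrt(variance_multiplier)
        rows.append((h, point, point - half_width, point + half_width))
    return rows


# Assumed inputs (see the paragraph above); none are stated in Table 4 itself.
LAST_TOTAL = 793_802          # Table 1(a), India, maximum prevalence
LAST_INCREMENT = 26_506       # Table 1(b), India, maximum incidence
SIGMA = 649.0                 # back-solved from the 11-7-20 interval width

rows = arima_020_forecast(LAST_TOTAL, LAST_TOTAL - LAST_INCREMENT, SIGMA, horizon=14)
for h, point, lower, upper in rows:
    print(f"day {h:2d}: forecast {point:,.0f}  95% interval [{lower:,.0f}, {upper:,.0f}]")
```

Under these assumptions, the computed forecasts and limits track the India column of Table 4 closely, which also explains why that column grows by a constant 26,506 cases per day while its interval widens with the horizon.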
Table 5. Studies that used distinct statistical approaches to predict the spread of COVID-19.

| Disease | Method(s) | Reference |
|---|---|---|
| COVID-19 | Hybrid ARIMA-WBF | [32] $ |
| | SutteARIMA | [33] * |
| | Seasonal ARIMA | [34] * |
| | ARIMA, NARNN, LSTM | [35] * |
| | ARIMA | [36] * |
| | ARIMA | [37] $ |
| | ARIMA | [38] $ |
| | ARIMA | [39] $ |
| | ARIMA, HWAAS, TBAT, Facebook’s Prophet, DeepAR, N-Beats | [40] $ |

$ European and non-European countries are included; * strictly European countries are included.

Share and Cite

MDPI and ACS Style

Ilie, O.-D.; Cojocariu, R.-O.; Ciobica, A.; Timofte, S.-I.; Mavroudis, I.; Doroftei, B. Forecasting the Spreading of COVID-19 across Nine Countries from Europe, Asia, and the American Continents Using the ARIMA Models. Microorganisms 2020, 8, 1158. https://doi.org/10.3390/microorganisms8081158