1. Introduction
After more than two years of serious economic and health crises, COVID-19 is likely to enter an endemic stage soon. However, concerns about one viral outbreak following another have reached a fever pitch, and the world is now facing a new viral outbreak: the monkeypox outbreak. The monkeypox virus (MPV), the causative agent of monkeypox, is not new; it was first discovered in 1958 in Copenhagen [1]. The first documented human case of MPV infection, however, was in a nine-month-old child from the Democratic Republic of Congo (DRC) in 1970 [2]. Since then, outbreaks have increased but have been primarily limited to the African continent, although a limited spread to Europe and North America has also been noted [3]. More than 48 confirmed cases were observed in six different African countries from 1970 to 1979, including 38 cases in DRC, 4 in Liberia, 3 in Nigeria, and single cases in Cameroon and Côte d’Ivoire. By 1986, the total number of cases had reached 400, with mortality approaching 10%. Similarly, small outbreaks in equatorial Central and West Africa were also observed [4], including 500 cases in DRC alone between 1991 and 1999 [5]. Since then, MPV has been in decline or has reached an endemic situation on the African continent.
However, MPV infection emerged once again on 18 May 2022, when Portugal, Spain, and Canada reported 14, 7, and 13 cases, respectively [6]. MPV continued to spread to Belgium, Sweden, and Italy, which confirmed their first MPV cases. Similarly, on 20 May 2022, Australia reported two cases, one in Sydney and the other in Melbourne. With each passing day, the number of MPV cases continued to grow rapidly: Switzerland and Israel confirmed their first cases on 21 May, and Belgium introduced a 21-day mandatory quarantine for MPV, which reflects the seriousness of this possible pandemic [7]. Thus far, MPV has reached more than 50 countries, including Denmark, Canada, North America, the United Arab Emirates, the Czech Republic, Slovenia, and the Canary Islands.
A cumulative total of 21,099 confirmed cases had been reported worldwide as of 28 July 2022. Similarly, a single death from MPV has also been reported to WHO from 42 countries in five WHO Regions [8]. The majority of the confirmed cases, i.e., 98%, have been reported since May 2022. Adding to the health concerns, MPV has greatly affected people’s lives as well as the world’s economy. Among the questions this raises, the main concerns of people and governments lie in controlling the disease and finding effective community-level or country-wide interventions. For this purpose, a valid analysis and modeling of the data on daily confirmed cases and mortalities are required.
Several mathematical and statistical models and methods are available and have been widely used for studying the behavior of epidemic diseases and pandemics. Models such as grey forecasting methods [9,10], mechanistic models and methods [11], Neural Networks (NN) [12,13], multivariate linear regression [14], computer-generated simulation models [15], time series models [16], and Interrupted Time Series (ITS) regression models [17,18] have been successfully applied to predict the intensity and behavior of epidemic diseases in the near future. Among these, time series analysis and neural networks are key and more realistic methods for predicting the behavior, nature, and future of epidemics, and there is an extensive literature reporting time series analyses that estimate future scenarios for different diseases and epidemics. However, epidemics are largely random phenomena, so the general spread of an outbreak is characterized by randomness, and statistical methods that cannot capture this randomness cannot be generalized to the future prevalence of an epidemic. To address this problem, a valid and more widely accepted method, the Autoregressive Integrated Moving Average (ARIMA) model, has been successfully adopted by practitioners in health science and other fields for estimating epidemics. In many previous studies, the ARIMA model was used for predicting and assessing the incidence and prevalence of diseases. For example, it has been applied to Dengue Fever [19], Malaria [20], Hepatitis [21], Tuberculosis [22], and Influenza [23], among others. Further, the same ARIMA model was used for predicting the intensity of the past SARS outbreak. The ARIMA model is widely used for forecasting and prediction because it can account for changing trends, cyclicity, periodicity, and random disturbance in time series.
In the present study, we predicted the cumulative cases of MPV throughout the world via ARIMA and Neural Networks. The appropriate ARIMA models for cumulative cases were identified, and the number of confirmed cases was then predicted for the next 10 days. The main objective of the present paper is to compare the candidate models, find the most appropriate predictive model, and provide a realistic estimate of the peak time, the intensity of the epidemic, and the future behavior of the outbreak. The study provides a road map for the concerned authorities to supply and plan resources effectively to control the epidemic.
3. Results and Discussion
The daily cumulative counts of monkeypox cases were collected for analysis. Recommendations on the minimum necessary number of time points for time series analysis vary; however, there is considerable consensus that this minimum is in the middle two-digit range, for instance, “… 40 observations is often mentioned as the minimum number of observations for a time series analysis” [27] and “Most time series experts suggest that the use of time series analysis requires at least 50 observations in the time series.” [30]. A total of 84 observations are included in the analysis; therefore, a formal time series analysis can be performed for forecasting. The analysis begins with a graph of the cumulative monkeypox cases, which is presented in Figure 2.
Before proceeding with the analysis, we first summarize the monkeypox data (Table 2); we then apply the ARIMA methodology, followed by the machine learning model. For the ARIMA model, we begin with the first step of the methodology, model identification, which starts with a stationarity test. To check stationarity, we apply the Augmented Dickey-Fuller (ADF) test to the series; once the series is confirmed to be stationary, we construct the correlogram, i.e., the plots of the ACF and PACF, to identify the model (Table 3). The ADF test shows that the series is not stationary, so we apply a differencing transformation to make it stationary.
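For reference, the ADF test is usually based on a regression of the following general form. The symbols here ($\alpha$, $\beta$, $\gamma$, $\delta_i$, and the lag length $k$) are standard textbook notation introduced for illustration; the deterministic terms and lag length actually used in this study are not stated, so this is a sketch of the general test rather than the exact specification applied.

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t ,
\qquad
H_0 : \gamma = 0 \quad \text{versus} \quad H_1 : \gamma < 0 .
```

Under $H_0$ the series has a unit root and is non-stationary. Failing to reject $H_0$ for the original series, and rejecting it for the differenced series, is what motivates setting $d = 1$.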
From the graphical perspective, it is found that the series is non-stationary in nature, and applying the first difference removes this non-stationarity, as shown in Table 4. To proceed with the analysis, we compute the ACF and PACF of the first-differenced series to identify the significant parameters. The correlogram used for the second step of the methodology is given in Figure 3.
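The differencing and correlogram steps can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the data are hypothetical, the helper names (`first_difference`, `sample_acf`, `white_noise_bound`) are invented for this sketch, only the ACF is computed (the PACF is omitted), and the approximate 95% bounds use the usual large-sample value of 1.96 divided by the square root of the series length.

```python
import math

def first_difference(series):
    """Return the first-differenced series y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; the ACF is undefined")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

def white_noise_bound(n):
    """Approximate 95% bound for the ACF of white noise of length n."""
    return 1.96 / math.sqrt(n)

# Hypothetical cumulative case counts (NOT the study data).
cumulative = [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]

diffed = first_difference(cumulative)       # daily increments
acf = sample_acf(diffed, max_lag=3)         # correlogram values
bound = white_noise_bound(len(diffed))      # +/- bound for each lag

# Lags whose autocorrelation falls outside the approximate 95% bounds.
significant_lags = [k for k, r in enumerate(acf) if k > 0 and abs(r) > bound]
```

Lags that fall outside the bounds are the candidates considered when choosing the orders of the model in the identification step.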
Using the orders suggested by the correlogram together with a subjective approach, we estimate the significant parameters of the series. The different candidate models are given in Table 5. From the output, it is found that among the three different classes of models, ARIMA (7,1,7) is the best fit for the series, as it has low values of the accuracy measures. Having selected the model according to the accuracy criteria, we then apply diagnostic checking. To this end, we plot the ACF of the residuals; if no lag falls outside the 95% confidence bounds, the candidate model is considered adequate for modeling the series. The residual ACF of the candidate model ARIMA (7,1,7) is given in Figure 4. From the ACF plot, it can be observed that no lag exceeds the confidence limits, so the model appears adequate for forecasting the monkeypox series. Further, the actual versus fitted values from the ARIMA (7,1,7) model are shown in Figure 5.
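The idea of choosing among candidate models by accuracy measures can be illustrated with a short sketch. The root mean square error (RMSE) is the measure reported in the conclusions; the mean absolute error (MAE) is added here only as an assumed second measure. The candidate names, the fitted values, and the helper `select_best` are hypothetical and serve purely as an example.

```python
import math

def rmse(actual, fitted):
    """Root mean square error between observed and fitted values."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)
    )

def mae(actual, fitted):
    """Mean absolute error between observed and fitted values."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)

def select_best(actual, candidates):
    """Score every candidate and return (name with lowest RMSE, score table)."""
    scores = {
        name: {"rmse": rmse(actual, fitted), "mae": mae(actual, fitted)}
        for name, fitted in candidates.items()
    }
    best = min(scores, key=lambda name: scores[name]["rmse"])
    return best, scores

# Hypothetical observed values and fitted values from two candidate models.
actual = [100, 120, 150, 190, 240]
candidates = {
    "model_A": [98, 123, 149, 192, 238],
    "model_B": [90, 130, 140, 200, 230],
}

best_name, table = select_best(actual, candidates)
```

A low error on the fitted values is only the first filter; the selected model still has to pass the residual diagnostic check described above.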
Actual cases are the observed numbers of monkeypox cases, and fitted cases are those obtained from the ARIMA model. The ARIMA (7,1,7) model is now used for forecasting. The forecast values with their 95% confidence intervals are given in Table 6.
Table 6 gives the point forecasts of monkeypox cases from the ARIMA model together with their confidence intervals.
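For orientation, an ARIMA (7,1,7) model and its 95% forecast interval can be written in general form as follows. The notation ($B$ for the backshift operator, $\phi_i$, $\theta_j$, $c$, and $\hat{\sigma}_h$ for the standard error of the $h$-step-ahead forecast) is standard and introduced here for illustration. Whether a constant term was included, and how the intervals in Table 6 were computed, is not stated, so the normal-based interval below is an assumption.

```latex
\Bigl(1 - \sum_{i=1}^{7} \phi_i B^{i}\Bigr)(1 - B)\, y_t
  = c + \Bigl(1 + \sum_{j=1}^{7} \theta_j B^{j}\Bigr)\varepsilon_t ,
\qquad
\hat{y}_{T+h \mid T} \pm 1.96\, \hat{\sigma}_h .
```

Here $(1 - B)\,y_t = y_t - y_{t-1}$ is the first difference, corresponding to $d = 1$, and the interval widens with the horizon $h$ because $\hat{\sigma}_h$ grows as forecast errors accumulate.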
Multilayer Perceptron Model
In this part, the model is fitted with different combinations of input and hidden neurons in a single hidden layer. The sigmoid activation function is used in the single feed-forward hidden layer. The model is selected according to the accuracy criteria. The different model combinations for the monkeypox data are given in Table 7. From Table 7, it is found that the model with a single input and 10 hidden neurons has the lowest values of the accuracy measures, and its observed versus fitted values, shown in Figure 6, agree well. This model is therefore used for forecasting. Forecast values of the MLP model for the monkeypox data are shown in Table 8.
Here, actual cases are the observed numbers of monkeypox cases, and fitted cases are those obtained from the MLP model.
Table 8 gives the point forecasts of monkeypox cases from the MLP model together with their confidence intervals.
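The structure of such a network, one input, a single hidden layer of 10 sigmoid neurons, and a linear output, can be sketched as a forward pass in plain Python. This is an illustration under several assumptions: the "single input" is interpreted as one lagged value of the series, the weights are random and untrained (the fitting step is omitted entirely), the data are hypothetical and already scaled, and the helper names (`init_mlp`, `forward`, `lagged_pairs`) are invented for this sketch.

```python
import math
import random

def sigmoid(z):
    """Logistic sigmoid activation (adequate for the small inputs used here)."""
    return 1.0 / (1.0 + math.exp(-z))

def init_mlp(n_inputs, n_hidden, seed=0):
    """Create small random weights for a single-hidden-layer network."""
    rng = random.Random(seed)
    return {
        "w_hidden": [
            [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
            for _ in range(n_hidden)
        ],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }

def forward(params, x):
    """Forward pass: sigmoid hidden layer followed by a linear output."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["w_hidden"], params["b_hidden"])
    ]
    return sum(w * h for w, h in zip(params["w_out"], hidden)) + params["b_out"]

def lagged_pairs(series, n_lags):
    """Build (inputs, target) pairs using the previous n_lags values as inputs."""
    return [(series[t - n_lags:t], series[t]) for t in range(n_lags, len(series))]

# Hypothetical scaled series (NOT the study data).
scaled = [0.1, 0.2, 0.35, 0.5, 0.7, 1.0]

pairs = lagged_pairs(scaled, n_lags=1)      # one lagged value per input
params = init_mlp(n_inputs=1, n_hidden=10)  # 1 input, 10 hidden neurons
prediction = forward(params, pairs[0][0])   # untrained one-step-ahead output
```

In practice the weights would be estimated by minimizing the error between `forward` outputs and the targets in `pairs`, and different numbers of inputs and hidden neurons would be compared by the same accuracy measures used for the ARIMA candidates.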
4. Conclusions
In this work, a comparative analysis was made between a classical time series model and a machine learning model. First, we applied the ARIMA methodology and identified the most suitable model for forecasting the series. The results showed that, among the candidate models, the monkeypox series was best described by the ARIMA (7,1,7) model, with a root mean square error of 150.78. For comparison, we applied the multilayer perceptron model with different numbers of hidden neurons in a single hidden layer using the sigmoid activation function. The model with a single input and 10 hidden neurons produced considerably more accurate results, with a root mean square error of 54.40, which is much lower than that of the ARIMA model; furthermore, the actual versus fitted plot confirmed that the multilayer perceptron model fitted the monkeypox data better than the ARIMA model. In future work, the extreme learning machine (ELM), the support vector machine (SVM), and other machine learning methods with different activation functions can be applied to seek a better fit. In light of the conclusions drawn from the study, it can be stated that this new monkeypox outbreak is increasing alarmingly in the countries where cases have been reported. An effort was made to select a suitable model that will help the authorities adopt proper measures for minimizing its effects. If the responsible authorities are unable to stop or reduce the transmission, the entire world may face yet another public health catastrophe. More importantly, this study compared two different forecasting methods and found that the MLP model was more reliable than the conventional ARIMA model. However, the main limitation is that a comprehensive forecasting study of this outbreak remains challenging due to the lack of complete data from each country.
Therefore, efforts should be made to gather complete datasets from the whole world in order to assess its future effects using deep learning or artificial intelligence.