Article

Multi-Regional Modeling of Cumulative COVID-19 Cases Integrated with Environmental Forest Knowledge Estimation: A Deep Learning Ensemble Approach

1
Department of Environmental Education and Management, Faculty of Education, Near East University, Nicosia 700006, Cyprus
2
Department of Environmental Engineering, Faculty of Civil and Environmental Engineering, Near East University, Nicosia 700006, Cyprus
3
Department of Economics, Yusuf Maitama Sule University, Kano 700282, Nigeria
4
Department of Electrical and Computer Engineering, Faculty of Engineering, Baze University, Abuja 900288, Nigeria
5
Faculty of Clinical Sciences, Bayero University, Kano 700006, Nigeria
6
Department of Analytical Chemistry, Faculty of Pharmacy, Near East University, TRNC, Mersin 99138, Turkey
7
Interdisciplinary Research Center for Membrane and Water Security, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
8
Department of Civil Engineering, Faculty of Engineering, Baze University, Abuja 900288, Nigeria
*
Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2022, 19(2), 738; https://doi.org/10.3390/ijerph19020738
Submission received: 5 December 2021 / Revised: 6 January 2022 / Accepted: 6 January 2022 / Published: 10 January 2022

Abstract

Reliable modeling of cumulative cases of COVID-19 (CCC) is essential for determining hospitalization needs and providing a benchmark for health-related policies. In the first scenario, the current study proposes multi-regional modeling of CCC using autoregressive integrated moving average (ARIMA) models based on automatic routines (AUTOARIMA), ARIMA with maximum likelihood (ARIMAML), ARIMA with the generalized least squares method (ARIMAGLS), and an ensemble of the latter two (ARIMAML-ARIMAGLS). In the second scenario, different deep learning (DL) models, viz. long short-term memory (LSTM), random forest (RF), and ensemble learning (EML), were applied to predict the effect of forest knowledge (FK) during the COVID-19 pandemic. Augmented Dickey–Fuller (ADF) and Phillips–Perron (PP) unit root tests, the autocorrelation function (ACF), the partial autocorrelation function (PACF), the Schwarz information criterion (SIC), and residual diagnostics were considered in determining the best ARIMA model for CCC across the multi-region countries. Seven different performance criteria were used to evaluate the accuracy of the models. The obtained results justified both types of ARIMA model, with ARIMAGLS and ensemble ARIMA demonstrating superiority to the other models. Among the DL models analyzed, LSTM-M1 emerged as the best and most reliable estimation model, with both RF and LSTM attaining more than 80% prediction accuracy, while the EML of the DL models proved its merit with 96% accuracy. The outcomes of the two scenarios indicate the superiority of ARIMA time series and DL models in further decision making for FK.

1. Introduction

On 31 December 2019, a cluster of pneumonia cases of unknown cause was reported in China. The cases had emerged in early December 2019, and many of those who were infected lived or worked at the Huanan Seafood Wholesale Market, although the remainder of the cases had no connection to this location [1]. A novel coronavirus, designated 2019-nCoV by WHO on 7 January, was discovered in one of these patients [2]. The new virus was termed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by the Coronavirus Study Group [2], and the disease it causes was named COVID-19 by WHO. As of 30 January 2020, 7736 confirmed cases and 12,167 suspected cases had been reported in China, with 82 cases reported in 18 other countries [3]. On 30 January 2020, WHO declared the SARS-CoV-2 epidemic a public health emergency of international concern (PHEIC), and on 11 March 2020 it declared a pandemic [3,4].
According to the Chinese National Health Commission, the percentage of deaths among confirmed cases in China was 2.1% as of 4 February 2020 [4]. COVID-19 spreads quickly, resulting in a high number of deaths; moreover, the available data and published findings are rapidly expanding. As of 17 May 2021, the disease had claimed the lives of over three million individuals, with over 163 million confirmed cases in over 220 nations and territories [5]. It is unclear what role the Huanan Seafood Market played in the transmission of the new virus. The majority of the first COVID-19 cases were related to this market, indicating that the virus was likely transmitted from animals to humans [6]. According to genetic evidence, the virus was introduced into the Huanan market from an unknown source and quickly spread throughout the city, although human-to-human transmission is known to have occurred earlier [6].
Human-to-human transmission was first suggested by the large number of affected family members and later confirmed by health experts [6]. COVID-19 has had a detrimental influence on Africa’s health, security, politics, and society. Already frail healthcare facilities were overwhelmed by the rapidly growing number of cases during the pandemic’s dramatic peak. The continued functioning of vital health services has also been disrupted in several African nations, resulting in a supply–demand imbalance. The pandemic has had a significant impact on non-communicable disease treatment, routine vaccination, prenatal care, family planning and contraception, and other services. Several researchers have attempted to forecast the CCC [7,8], with ARIMA being used in several of these studies [9,10]. Seasonal ARIMA (SARIMA) models, for example, have been combined with epidemiological models based on phone call data [11,12]. Other research has focused on the impact of governmental measures, such as lockdown and social distancing, on the spread of COVID-19 [11].
Various new AI-based models have yet to be applied to COVID-19 situations, despite suggestions in the literature to employ different versions of these models—such as neural networks—for novel COVID-19 modeling. Another reason to investigate novel modeling methods is the fact that correct simulation of COVID-19 in a research region can save money, energy, and time; as a result, the choice of modeling methodology is given a lot of thought when forecasting these important trends [12,13,14,15,16,17,18,19,20]. On the other hand, studies of COVID-19 related to image segmentation have been explored in [12,13,21]. In poorer countries, where the budget for environmental quality evaluation and monitoring is lower than in wealthier ones, modeling approaches are more relevant. According to the Scopus database’s reported literature for 2020–2021, there exists a lot of interest in power system simulation using the feasibility of ML models. The primary keyword occurrence clusters and temporal regional spans across the literature are presented in Figure 1a,b, respectively. Over 2000 articles were included, demonstrating the importance of this topic in terms of COVID-19 modeling. The investigation of new machine learning models capable of solving engineering challenges is always ongoing, and both academics and scientists are interested in the research domain of novel and sophisticated modeling methodologies that can be applied to COVID-19.
The present study makes the following contributions. To the best of the authors’ knowledge, this is the first research in which an ARIMA model is combined with GLS and deep learning models (random forest (RF) and long short-term memory (LSTM)) to predict CCC under environmental protection knowledge. This research also considers the Economic Community of West African States (ECOWAS), a significant economic grouping. This paper examines the time series characteristics of the CCC using two unit root tests before modeling, eliminating the risk of relying on a single unit root test. The ARIMA models estimated with the maximum likelihood (ML) method and those estimated with GLS are compared. Although ARIMA was used by ArunKumar et al. [7], Alabdulrazzaq et al. [9], and Guleryuz [14], none of these studies applied the GLS, RF, LSTM, or EML estimation methods or more than one unit root test. This research can help policymakers to analyze hospitalization needs and adopt interventions targeted at flattening the COVID-19 curve.

2. Materials and Methods

The current study proposes two different scenarios. The first scenario aimed to model the cumulative COVID-19 cases in four different countries using various classifications of ARIMA models: ARIMA based on automatic routines (AUTOARIMA), ARIMA estimated according to the Box–Jenkins procedure with the maximum likelihood method (ARIMAML), ARIMA estimated with the generalized least squares method (ARIMAGLS), and an ensemble of ARIMAML and ARIMAGLS (ARIMAML-ARIMAGLS). The second scenario employed a novel deep learning model for the estimation of uncertain environmental knowledge regarding forests during the COVID-19 era. For this purpose, the experimental data used in this research were divided, with 70% used for the calibration phase and 30% for the verification phase. The model’s results were evaluated using the k-fold cross-validation methodology, which is considered the best way to achieve unbiased model performance prediction with a small data set [7,8]. Although various validation methods can be used, the k-fold cross-validation strategy represents the most practical option for achieving an unbiased goodness-of-fit prediction for a restricted data set.
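The data division described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `holdout_split` and `k_fold_indices` are hypothetical, the 70/30 split is assumed to be taken in sample order, and the number of folds k is not stated in the paper, so it is left as a parameter.

```python
from typing import List, Tuple


def holdout_split(n: int, train_fraction: float = 0.7) -> Tuple[List[int], List[int]]:
    """Split sample indices 0..n-1 in order: calibration first, verification last."""
    cut = int(round(n * train_fraction))
    indices = list(range(n))
    return indices[:cut], indices[cut:]


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index pairs; each sample is validated exactly once."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    base, extra = divmod(n, k)  # the first `extra` folds get one additional sample
    folds = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds
```

Each model would then be fitted on the `train` indices of a fold and scored on its `validation` indices, with the scores averaged over the k folds.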
The difficulty of determining whether one model outperforms others in practice is the fundamental incentive for using several data-intelligence models. As a result, selecting acceptable models for a specific scenario can be difficult for modelers [22,23]. This complexity can be addressed only by identifying and selecting several data-driven (and primarily linear) models, despite their shortcomings in handling extremely non-linear and complicated data. Figure 2 presents the flowcharts used in the construction of the current study for scenarios I and II, respectively. As shown in the flowchart, the input data are gathered, pre-processed, and normalized according to Equation (1). The data were normalized before the model was trained, which is commonly done to improve the model’s efficiency and accuracy.
$$y = 0.05 + 0.95 \times \frac{x - x_{min}}{x_{max} - x_{min}} \tag{1}$$
where $y$ represents normalized data, $x$ represents measured data, and $x_{max}$ and $x_{min}$ represent the measured data’s maximum and minimum values, respectively.
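As a small illustration of Equation (1), the sketch below rescales a list of measurements to the interval [0.05, 1]. The function name `normalize` is hypothetical, and the guard against a constant series is an added assumption, since Equation (1) is undefined when the maximum equals the minimum.

```python
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Apply y = 0.05 + 0.95 * (x - x_min) / (x_max - x_min) to every value."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant series cannot be normalized with Equation (1).
        raise ValueError("x_max and x_min must differ")
    span = x_max - x_min
    return [0.05 + 0.95 * (x - x_min) / span for x in values]
```

For example, `normalize([0, 5, 10])` maps the minimum to 0.05, the midpoint to about 0.525, and the maximum to 1.0.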

2.1. ARIMA Model

Box and Jenkins [24] introduced the ARIMA model concept. Equation (2) represents the ARIMA (p, d, q) model. Autoregressive order, integration order, and moving average order are represented by the letters p, d, and q, respectively. ARIMA is a type of model used in time series forecasting where a collection of observed data from the past are analyzed and used to design a model describing the underlying relationship. This model is further used to predict/extrapolate into the future [20]. A variable’s future value is assumed to be a linear function of several observed data points and random errors [25,26,27,28]. The time series is generated using:
$$\Psi Y_t = \alpha_0 + \Gamma \varepsilon_t \tag{2}$$
where $Y_t$ is the stationary dependent variable, $\varepsilon_t$ is the white noise error term, $\Psi = 1 - \sum_{i=1}^{p} \alpha_i L^i$ and $\Gamma = \sum_{j=0}^{q} \beta_j L^j$, and $L$ is the lag operator defined as $L^i Y_t = Y_{t-i}$ for $i = 1, 2, \ldots$. For more details on ARIMA, see [24]. If $Y_t$ is stationary, the series can be modelled as an ARMA (p, q) process; otherwise it has to be modelled as an ARIMA (p, d, q) process. Figure 3 presents the procedures employed by this study in modeling the CCC.
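The integration order d in ARIMA (p, d, q) corresponds to differencing the series d times before an ARMA (p, q) model is fitted. The following sketch, with the hypothetical function name `difference` and made-up illustrative numbers, shows this step using the lag operator, since the first difference is (1 − L) applied to the series.

```python
from typing import List, Sequence


def difference(series: Sequence[float], d: int = 1) -> List[float]:
    """Apply (1 - L)^d to a series; each pass shortens it by one observation."""
    if d < 0:
        raise ValueError("d must be non-negative")
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# Illustrative cumulative counts (not data from the study).
cumulative = [1, 3, 6, 10, 15]
daily_new = difference(cumulative, d=1)     # first difference of a cumulative series
change_in_new = difference(cumulative, d=2)  # second difference
```

For a cumulative case count, the first difference is the number of new cases per period, which is why a cumulative series is typically non-stationary at level.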

2.2. Random Forest (RF)

Random forest (RF) is an effective supervised learning technique mainly used for classification and regression problems in machine learning [29]. Breiman [30] introduced RF as a practical ensemble algorithm which adds an additional layer of randomness to the bagging approach [29,30,31,32]. RF fulfils its role by using a random sampling mechanism to generate several decision trees. Generally, forecasts are derived from the mean outputs of these trees, which form a vast ensemble. Bootstrapping or a random selection of inputs is used to create the various base trees, which is how the decisions are made. Lately, there has been a surge of interest in RF, which is already used in various applications. The RF architecture used to determine the final features to form the RF tree is presented in Figure 4.
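The two sources of randomness mentioned above, bootstrapped rows and randomly selected inputs, followed by averaging of the tree outputs, can be sketched as follows. This is a deliberately simplified illustration and not the model used in the study: each base learner is a one-split regression tree (a stump) on a single randomly chosen feature, and all names (`fit_stump`, `fit_forest`, `predict_forest`) are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence

Row = Sequence[float]
Tree = Callable[[Row], float]


def fit_stump(X: List[Row], y: List[float], feature: int) -> Tree:
    """Fit a one-split regression tree on one feature by minimizing squared error."""
    best = None
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= threshold]
        right = [t for row, t in zip(X, y) if row[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((t - left_mean) ** 2 for t in left) + sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # the feature is constant in this sample: predict the mean
        overall = mean(y)
        return lambda row: overall
    _, threshold, left_mean, right_mean = best
    return lambda row: left_mean if row[feature] <= threshold else right_mean


def fit_forest(X: List[Row], y: List[float], n_trees: int = 50, seed: int = 0) -> List[Tree]:
    """Grow n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # bootstrap: draw rows with replacement
        feature = rng.randrange(n_features)            # random selection of one input
        trees.append(fit_stump([X[i] for i in sample], [y[i] for i in sample], feature))
    return trees


def predict_forest(trees: List[Tree], row: Row) -> float:
    """The forest prediction is the mean of the individual tree outputs."""
    return mean(tree(row) for tree in trees)
```

A full random forest grows deep trees and samples a subset of features at every split; the stump version only keeps the bagging-plus-random-inputs structure visible in a few lines.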

2.3. Long Short-Term Memory Neural Network (LSTM)

The long short-term memory neural network (LSTM) is a widely used deep learning model in science and engineering. It is capable of analyzing complex and high-dimensional data in a relatively short period with minimal human resources when compared with conventional data collection and analysis [33,34]. The LSTM is a type of recurrent neural network (RNN) that can successfully tackle exploding and vanishing gradients during RNN training while also increasing RNN performance. The LSTM model was created to compensate for the traditional RNN’s inability to memorize sequences of 10 or more steps. Recurrent models are chained structures in which the same unit is connected and repeated. The LSTM model, which uses special memory cells to store information, has a chain with an almost identical structure to that of the RNN [14,35,36] (see Figure 5).
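The role of the memory cell can be made concrete with a single forward step of a standard LSTM unit. The sketch below uses scalar inputs and states for readability; it is an illustration of the textbook gate equations, not the network configured in this study, and the names `lstm_step` and `params` as well as the gate labels are assumptions.

```python
import math
from typing import Dict, Tuple

Gate = Tuple[float, float, float]  # (input weight, recurrent weight, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h_prev: float, c_prev: float,
              params: Dict[str, Gate]) -> Tuple[float, float]:
    """One LSTM time step for scalar input x, hidden state h and cell state c."""
    def gate(name: str, activation) -> float:
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much of the new candidate to write
    g = gate("candidate", math.tanh)  # candidate content for the cell
    o = gate("output", sigmoid)       # how much of the cell to expose
    c = f * c_prev + i * g            # additive update of the memory cell
    h = o * math.tanh(c)              # new hidden state
    return h, c
```

Because the cell state is updated additively (`f * c_prev + i * g`), gradients can flow across many time steps without being repeatedly squashed, which is the usual explanation for why the LSTM copes better with long sequences than a plain RNN.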

3. Data Processing and Validation

The following paragraph provides definitions of the variables analyzed in this study. The cumulative total of COVID-19 patients’ daily laboratory records is referred to as the CCC. The data for this study were taken from the World Health Organization’s (WHO) COVID-19 global data database. Each country’s sample runs from the first day a COVID-19 case for the first scenario was recorded in the country through to 1 September 2021. The sample size (N) for each country can be seen in Table 1. Table 1 also shows the CCC’s descriptive data for the four nations (LY, NG, TR, and ZA). The sample size for each country is represented by the number of observations (N). The greater the number of observations, the earlier the country reported the first instance of COVID-19. Accordingly, the first COVID-19 case in the region was discovered in NG, followed by ZA, TR, and then LY. The mean (Y) reflects the CCC’s average value, the median (Ymed) shows the CCC’s value in the center of the sample for each country, and the standard deviation (σ) represents the CCC’s dispersion from the mean for each country. The earliest number of instances documented for each nation is given by the minimum (Ymin). For instance, NG’s first COVID-19 record is 5, while LY’s is 1. The maximum value (Ymax) is the CCC as of 1 September 2021. A histogram of the instances for each country is shown in Figure 6.
For the second scenario, a well-structured questionnaire was developed and subdivided into seven sections with questions and possible responses under subheadings. These included questions on the demographic characteristics of students and questions to evaluate students’ general knowledge regarding forests, forest protection, the importance of forests, poor forest administration, the dangers of deforestation, and how individuals and governments can take responsibility for forests. For exploratory and data-driven analysis, the important variables were selected based on dependency analysis, in which the following variables from the questionnaire were used: forest knowledge (FK), forest importance to the country (FIC), priority for recreational activities (ICA), vital goal of forest (VGF), the government is responsible for taking care of forest problems (GRF), sources of forest knowledge (SFK), benefit of forest protection to man and his environment (BFE), responsibility of individuals to protect the forest in their locality (RIF), and the dangers of cutting forest down (DCF). To understand the effect of COVID-19 on forest knowledge and determine the most dominant parameter, a sensitivity analysis was performed and the results are presented in Figure 7.
The degree to which the relationship between the parameters can be expressed using a linear function and a non-linear function is referred to as sensitivity analysis. The strength of the correlation is not dependent on the direction or sign. A positive coefficient indicates that an increase in the first parameter would correspond to a rise in the second parameter. In contrast, a negative correlation indicates an inverse relationship, in which one parameter increases when the other parameter decreases [37,38].

Evaluation Criteria

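Root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), symmetric MAPE (SMAPE), and Theil inequality coefficient (TIC) were the evaluation criteria used in this study. Additionally, the determination coefficient (R2) and correlation coefficient (R) were used to assess goodness-of-fit, and one statistical error, the mean squared error (MSE), was used to evaluate models of the second scenario. The above evaluation criteria are presented in Equations (3)–(10).
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=T+1}^{T+h} \left(\hat{y}_t - y_t\right)^2}{h}} \tag{3}$$
$$\mathrm{MAE} = \frac{1}{h} \sum_{t=T+1}^{T+h} \left|\hat{y}_t - y_t\right| \tag{4}$$
$$\mathrm{MAPE} = \frac{1}{h} \sum_{t=T+1}^{T+h} \left|\frac{\hat{y}_t - y_t}{y_t}\right| \times 100 \tag{5}$$
$$\mathrm{SMAPE} = \frac{1}{h} \sum_{t=T+1}^{T+h} \left|\frac{\hat{y}_t - y_t}{\hat{y}_t + y_t}\right| \times 2 \times 100 \tag{6}$$
$$U_1 = \sqrt{\frac{\sum_{t=T+1}^{T+h} \left(\hat{y}_t - y_t\right)^2}{h}} \left[\sqrt{\frac{\sum_{t=T+1}^{T+h} \hat{y}_t^2}{h}} + \sqrt{\frac{\sum_{t=T+1}^{T+h} y_t^2}{h}}\right]^{-1} \tag{7}$$
$$R^2 = 1 - \frac{\sum_{j=1}^{N} \left(Y_{obs,j} - Y_{com,j}\right)^2}{\sum_{j=1}^{N} \left(Y_{obs,j} - \bar{Y}_{obs}\right)^2} \tag{8}$$
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left(Y_{obs,i} - Y_{com,i}\right)^2 \tag{9}$$
$$R = \frac{\sum_{i=1}^{N} \left(Y_{obs,i} - \bar{Y}_{obs}\right)\left(Y_{com,i} - \bar{Y}_{com}\right)}{\sqrt{\sum_{i=1}^{N} \left(Y_{obs,i} - \bar{Y}_{obs}\right)^2 \sum_{i=1}^{N} \left(Y_{com,i} - \bar{Y}_{com}\right)^2}} \tag{10}$$
Equations (3)–(7) include the actual value $y_t$, the forecast value $\hat{y}_t$, the forecast horizon $h$, and the training/testing sample $T$, so that the forecast period runs from $T + 1$ to $T + h$. In Equations (8)–(10), $Y_{obs}$ and $Y_{com}$ denote the observed and computed values, $\bar{Y}_{obs}$ and $\bar{Y}_{com}$ their means, and $N$ the number of observations.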
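For readers who wish to reproduce the evaluation criteria, a minimal standard-library sketch is given below. The function names are hypothetical, the inputs are assumed to be equal-length sequences of actual (observed) and forecast (computed) values over the evaluation period, and the SMAPE follows the form written above, which assumes that no pair of actual and forecast values sums to zero.

```python
import math
from typing import Sequence

Series = Sequence[float]


def rmse(actual: Series, forecast: Series) -> float:
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual: Series, forecast: Series) -> float:
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual: Series, forecast: Series) -> float:
    return sum(abs((f - a) / a) for a, f in zip(actual, forecast)) / len(actual) * 100


def smape(actual: Series, forecast: Series) -> float:
    return sum(abs((f - a) / (f + a)) for a, f in zip(actual, forecast)) / len(actual) * 2 * 100


def theil_u1(actual: Series, forecast: Series) -> float:
    h = len(actual)
    scale = math.sqrt(sum(f * f for f in forecast) / h) + math.sqrt(sum(a * a for a in actual) / h)
    return rmse(actual, forecast) / scale


def mse(observed: Series, computed: Series) -> float:
    return sum((o - c) ** 2 for o, c in zip(observed, computed)) / len(observed)


def r_squared(observed: Series, computed: Series) -> float:
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - c) ** 2 for o, c in zip(observed, computed))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - residual / total


def pearson_r(observed: Series, computed: Series) -> float:
    mean_obs = sum(observed) / len(observed)
    mean_com = sum(computed) / len(computed)
    cov = sum((o - mean_obs) * (c - mean_com) for o, c in zip(observed, computed))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_com = sum((c - mean_com) ** 2 for c in computed)
    return cov / math.sqrt(var_obs * var_com)
```

Theil's U1 is bounded between 0 and 1, with 0 indicating a perfect forecast, which makes it convenient for comparing models across countries whose case counts differ greatly in scale.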

4. Results and Discussion

In this section, the results for both multi-regional COVID-19 modeling and the prediction effect of forest knowledge during the COVID-19 pandemic are analyzed. It is worth mentioning that, to the best of the authors’ knowledge, there is no existing published research that employed this approach. Another motivation for the current research was to conduct a comprehensive bibliographic review of COVID-19 using AI-based models. The results of both analyses are presented in the section below.

4.1. Results for Various Types of ARIMA

As stated above, various ARIMA models were employed to analyze and forecast the CCC of four different countries. Prior to modeling, pre-analysis was conducted to determine the reliability of the data. As such, the stationarity of the data was evaluated using formal unit root tests, i.e., augmented Dickey–Fuller (ADF) and Phillips–Perron (PP) tests. These tests are well established in the literature. Table 1 presents the results of the unit root tests. From the table it can be seen that the CCC values for each country are non-stationary at level and become stationary at first difference, excluding TR and ZA, which required second differences. In addition, ADF revealed that the CCC for TR is I(2) while that of the other countries is I(1). When the two tests gave contradictory findings, the PP unit root test took precedence since it can detect near-unit root processes.
ARIMA models have been employed to predict COVID-19-related parameters; for example, Toga et al. [39] applied ARIMA and ANN to forecast COVID-19 prevalence in Turkey. However, ensemble ARIMA has not yet received appropriate attention in the literature. For each of the countries (TR, LY, NG, and ZA), four models were evaluated and their forecast performance was assessed. These four models were: ARIMA with maximum likelihood (ARIMAML), ARIMA with generalized least squares (ARIMAGLS), ARIMA with automatic routines (AUTOARIMA), and ARIMAML with ARIMAGLS (ARIMAML-ARIMAGLS). Before model development, the dominant model selection approach was used, which significantly affects the accuracy of any intelligent computational model. Several input selection approaches, including correlation, auto-correlation, and principal components analysis, have been reported in the literature but are associated with linearity problems. Forecast statistics such as RMSE, MAE, MAPE, SMAPE, Theil U1, and Theil U2 are presented in Table 2 and Table 3, and their minimum values were used to assess the performance of the models. This study employed two forecasting techniques: individual model forecasting and ensemble forecasting with ARIMAML and ARIMAGLS. In the ensemble, the results of ARIMAML and ARIMAGLS are simply averaged. The ensembled model’s output was then analyzed and compared with the output of the individual models.
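The simple-averaging ensemble described above amounts to taking the arithmetic mean of the member forecasts at each time step. A minimal sketch follows; the function name `average_ensemble` and the numbers in the usage lines are purely illustrative and are not results from this study.

```python
from typing import List, Sequence


def average_ensemble(*forecasts: Sequence[float]) -> List[float]:
    """Average several equal-length forecast series point by point."""
    if not forecasts:
        raise ValueError("at least one forecast series is required")
    if len({len(f) for f in forecasts}) != 1:
        raise ValueError("all forecast series must have the same length")
    return [sum(values) / len(values) for values in zip(*forecasts)]


# Hypothetical member forecasts for two future dates.
arima_ml = [10.0, 20.0]
arima_gls = [14.0, 18.0]
combined = average_ensemble(arima_ml, arima_gls)
```

Equal weights are the simplest choice; they require no extra estimation and tend to cancel errors when the member models miss in opposite directions.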
The ensemble’s principal goal is to create more accurate and dependable estimates than those produced by a single model [40]. It was also confirmed by [41,42,43] that the ensemble technique has numerous advantages over the use of linear modeling methods, including in the initial stage of model selection and in the output of already selected models used for the ensemble. This can reduce the inconsistency in model development because no single model suits and fits all data. The expected performance and accuracy of the model depend on the nature, relationship between variables, uniformity, size, range, etc. of the data, as well as the method used. Because each country’s CCC is integrated of order one, they can be represented by ARIMA (p, 1, q) processes. The combination of autoregressive order (p) and moving average order (q) that produced the lowest Schwarz information criterion (SIC) with white-noise errors was chosen. Because each country’s CCC is I(1), the first difference of each CCC was taken before estimation. The probable moving average order is determined using the autocorrelation function (ACF), whereas the probable autoregressive order is determined using the partial autocorrelation function (PACF). Figure 8 and Figure 9 show the ACF and PACF graphs, respectively. According to autoregression theory, typical time series data, particularly COVID-19 variables such as CCC, may be forecasted using the lags of the same time series or processes that influence the output variables, as is the case in this study. If the same variable is used, the ACF and PACF may be used to determine the optimal number of lags, whereas the cross-correlation function (CCF) can be utilized if various variables are used.
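To make the order-identification step concrete, the sketch below computes the sample ACF and derives the PACF from it with the Durbin–Levinson recursion. This is a generic textbook procedure offered as an illustration; the paper does not state how its ACF and PACF values were computed, and the function names `acf` and `pacf` are assumptions.

```python
from typing import List, Sequence


def acf(series: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelations r_0..r_max_lag (r_0 is always 1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


def pacf(series: Sequence[float], max_lag: int) -> List[float]:
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    result = [1.0]
    prev: List[float] = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result
```

In practice these functions would be applied to the differenced CCC series: a sharp cut-off in the ACF after lag q suggests the moving average order, and a cut-off in the PACF after lag p suggests the autoregressive order, with the final choice settled by the SIC.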
Understanding the modeling process and data division is crucial in any modeling procedure; as such, the modeling forecast for the training phase is presented in Table 3. The results of various performance evaluation metrics, such as RMSE, MAPE, and MAE, are presented for all of the employed models. Based on these results, it can be seen that the lowest RMSE values for LY, NG, TR, and ZA are associated with AUTOARIMA, ARIMA(ML-GLS), ARIMA(ML-GLS), and ARIMAML, respectively. It has been reported in the literature that, for better analysis of model accuracy, different performance indices should be included. MAPE is one such recommended index, and MAPE values between 1 and 10 are recommended for the best results. The training results indicated that the performance of almost all of the models can be justified by considering the values of MAPE (see Figure 10). According to Figure 10, the MAPE values were 1.5798 for LY, 1.2575 for NG, 1.6228 for TR, and 1.0482 for ZA. ARIMAGLS outperformed the other models for LY, NG, and ZA in terms of MAPE values, while AUTOARIMA was the best model for TR.
A quantitative comparison of the results can be conducted using the testing phase sample presented in Table 4. The testing phase represents an essential stage in any modeling in order to validate and generalize the model. The modeling results for CCC show that AUTOARIMA and ARIMAGLS outperformed the other models. It is also worth mentioning that, in most situations, RMSE and Theil U1 agreed on the same model as being the best, and that when other accuracy metrics (MAE, MAPE, SMAPE, and Theil) conflict with RMSE, RMSE takes precedence. Figure 11a,b shows the forecast comparison graphs for the training and testing samples for each nation. The line graphs showing the actual and forecast levels of CCC are barely distinguishable. This suggests that using ARIMA to forecast the CCC is valid.
The Taylor diagrams for the training and testing samples are shown in Figure 12a,b. The model with the largest dot had the best RMSE. ARIMAML-ARIMAGLS had the best prediction accuracy for seven countries, ARIMAGLS for four countries, and ARIMAML for three countries, as shown in Figure 8. Figure 9 indicates that ARIMAGLS outperforms the other models in prediction accuracy for three countries, whereas ARIMAML outperformed the other models for one country. Table 3 and Table 4 are visually summarized in Figure 8 and Figure 9, respectively. The fan plots for the training and testing samples are shown in Figure 13a,b. The RMSE values of the four models are shown on the graph. The greater the model’s prediction performance, the narrower the angle of the sector of the fan plot (see Figure 13a,b). The performance of four of the best-performing models described in this study was examined in the preceding analysis. The top performing models, according to individual forecast data, are ARIMAGLS and ARIMAML, while combining the two models produces the best prediction accuracy. Because GLS yields consistent empirical fit and parameter estimations, the ARIMAGLS model beats the other models. Finally, owing to the benefits of combining predictions, the ensemble of ARIMAML and ARIMAGLS produces the greatest prediction accuracy in most circumstances when compared with the separate models.

4.2. Results of Deep Learning Models

For the development of the AI-based models used in the current study, simulations were performed in MATLAB 9.3 (R2020a). Suitable model architectures for both the LSTM and RF models were optimized and selected using trial and error. As reported in the literature [44,45,46], modeling results must satisfy certain evaluation indicators. The outcomes of the simulated models were evaluated using the most utilized performance criteria, including R2, MSE, RMSE, and R, in both the calibration (70%) and verification (30%) stages. The predicted results were derived from M1 and M2, and the simulated quantitative assessment results are presented in tabular form. Table 5 shows the results of the performance analysis for RF, LSTM, and EML models. It can be seen from the results that all three AI-based models can produce good performance accuracy for the evaluation of FK and management. This is due to the powerful ability of non-linear AI-based models to describe complex systems. Between the two AI-based models (RF and LSTM), LSTM-M1 emerged as the best combination for FK estimation with values of R2 = 0.9393, MSE = 0.0450, and R = 0.9692 in the calibration phase.
Further analysis of the results demonstrated that RF-M2 served as the second-best model, followed by RF-M1, and lastly LSTM-M2. The estimation results regarding goodness-of-fit are presented by radar charts (see Figure 14). For the single models, it can be concluded that the performance accuracy follows the order: LSTM-M1 > RF-M2 > RF-M1 > LSTM-M2. All models achieved good results, but LSTM-M1 served the purpose of estimation best, since LSTM is a deep learning model that better captures the non-linear relationship between the variables. Similarly, the profound advantages of deep learning and LSTM include their capability to analyze complex and high-dimensional data in a relatively short period with minimal human resources compared with conventional data collection and analysis. In comparing the predictive performance of the models in this study, R values greater than 0.7 were taken to indicate an excellent model. According to [47], R2 values greater than 0.8 are satisfactory for any analysis using AI-based models. To further analyze the predictive performance of the models, point-by-point probability plots were generated between the observed and predicted values for the best models, as depicted in Figure 15. From the plots, it can be observed that higher agreement between the observed and predicted values was achieved by LSTM-M1. For this reason, quantitative analysis of the models can be performed using the determination coefficient (R2). LSTM-M1 increased the prediction accuracy of the best RF by 4% in the calibration phase and by 2% in the verification phase.
The data used in the second scenario were pre-analyzed using a variety of techniques including normalization and reliability tests. The conceptual understanding of each input parameter is critical in assessing the strength of predictive models in soft-computing analysis. As a result, all of the study regions were subjected to stationarity and consistency studies utilizing Cronbach’s alpha technique and the unit root test. It should be noted that the preliminary examination of a single parameter or input for any time series is extremely important because its forecast accuracy might significantly add to the models’ improved performance. According to Dickey et al. [22], the ADF test is essential for obtaining trustworthy and valid results that ensure the stationarity of all variables. All of the criteria mentioned were met by the experimental data used in this study. The findings demonstrate that the computational modeling methodologies explored have varied levels of appropriateness when considering the evaluation criteria. Furthermore, the aggregate findings showed that EML was the most effective simulation in terms of performance requirements. Though it is difficult to rank the methods according to their acquired precision, the EML model had the best forecast accuracy, with a fit to the data of above 97%. The error plots in Figure 16 present a visual comparison of the model combinations with regard to MSE. For the total goodness-of-fit, an error plot depicts the level of agreement between the observed and predicted values. The error plot clearly reveals that the EML model is more accurate than the RF and LSTM models.
The above findings are supported by the capacity of the EML approach to handle non-linear systems. Unlike RMSE, MSE seems to have a more natural standard measure error and is explicit. It is a model performance measure that is commonly employed in regression analysis. The MAE for a test set is the average of the absolute values of the forecast errors over all instances in the verification set. Table 4 also demonstrates its promise in terms of error values. According to the literature, the lowest MSE values suggest the best results and vice versa. The ensemble model’s efficiency can be linked to the hybrid model’s ability to produce more promising outcomes than a single model. For both research and engineering, it is critical to report how dependable AI-based models are.
The overall comparison between the best single model and ensemble learning is provided using a two-dimensional Taylor diagram, as presented in Figure 17. By considering the actual and estimated values, a Taylor diagram highlights and summarizes several statistical indices such as R, RMSE, and standard deviation [48]. Taylor diagrams can be found in a wide range of fields, including applied and social sciences. To the best knowledge of the researchers, this is the first study to employ this graph in FK forecasting. In addition, this graphic can be used to compare the internal consistency of different models. The diagram can be viewed as a collection of polar plot points. A detailed description and discussion of the Taylor diagram can be found in [49]. In the testing phase, the EML approach obtained the greater goodness of fit, with a value of R = 0.98, as shown in Figure 17. These findings show that deep learning and ensemble techniques are capable of capturing complicated non-linear patterns between the variables for both training and testing.

5. Conclusions

This study estimated and evaluated the forecast performance of four distinct ARIMA models: ARIMAML, ARIMAGLS, AUTOARIMA, and ARIMAML-ARIMAGLS. The models were estimated using CCC time series data for four countries (Libya, Nigeria, Turkey, and South Africa). Two sub-samples were employed: 75% for training and 25% for testing. For the training sub-sample, AUTOARIMA was found to be the best model for Libya, ARIMAGLS for Turkey, and ARIMAML for Nigeria and South Africa. For the testing sub-sample, AUTOARIMA had the best predictive ability for Libya and Nigeria, while ARIMAML was the best for Turkey and South Africa. No evidence was found that ensembling ARIMAML and ARIMAGLS produced the best forecast accuracy in either sub-sample. The results of this study can serve as a reference for modeling the CCC and devising health-related policies.
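The evaluation design summarized above can be sketched as follows. The 75%/25% chronological split is taken from the text; how the ARIMAML and ARIMAGLS forecasts were combined is not stated in this section, so an equal-weight average is assumed purely for illustration. The series, the two forecast vectors, and all function names are hypothetical.

```python
import math


def chronological_split(series, train_fraction=0.75):
    """Split a time series in order, without shuffling, into train and test."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def average_ensemble(forecast_a, forecast_b):
    """Equal-weight combination of two forecasts (assumed combination rule)."""
    return [(a + b) / 2 for a, b in zip(forecast_a, forecast_b)]


def rmse(observed, predicted):
    """Root mean squared error of a forecast."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be non-zero."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))


# Hypothetical cumulative case counts and two hypothetical model forecasts.
series = [10, 12, 15, 19, 24, 30, 37, 45]
train, test = chronological_split(series)   # first 6 points, last 2 points
forecast_ml = [35, 44]                       # stands in for ARIMAML
forecast_gls = [40, 47]                      # stands in for ARIMAGLS
ensemble = average_ensemble(forecast_ml, forecast_gls)

scores = {
    "ml": rmse(test, forecast_ml),
    "gls": rmse(test, forecast_gls),
    "ensemble": rmse(test, ensemble),
}
```

In this toy case the two component errors have opposite signs, so averaging reduces the RMSE. That cancellation is not guaranteed, which is consistent with the finding that the ensemble was not the best model in either sub-sample.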
For the DL results, AI-based models were developed based on sensitivity analysis to estimate forest knowledge using RF and LSTM models. Performance was evaluated using R², R, MSE, and RMSE. The predictive results demonstrated that AI-based models could predict forest knowledge with fewer input combinations. The results further indicated that all of the deep learning models are capable and satisfactory tools for modeling forest knowledge. LSTM-M1 emerged as the best and most reliable estimation model among the single AI models analyzed. Although it is difficult to rank the models by their achieved accuracies, the EML approach showed the best relative prediction accuracy, attaining a goodness of fit greater than 97%. The outcomes also support the further development of AI-based models in this field. Other non-linear models and optimization techniques, such as non-linear ensemble techniques (NET), Gaussian process regression models (GPRM), gradient boosting (GB), extreme learning machines (ELM), genetic algorithms (GA), emerging optimization (EO), and kernel models (KM), should be employed to improve the estimation accuracy. It is also suggested that these techniques be extended to other geo-environmental locations across the globe. This is in line with Areepong and Sunthornwat [50], who concluded that future research should focus on estimating the maximum number of visitors that can enter a country while maintaining control of the number of COVID-19 cases. It may also be of interest to investigate the use of both forecasting approaches to anticipate and assess the spread of COVID-19 in other nations.

Author Contributions

Conceptualization, A.A. and S.I.A.; methodology, S.I.A., S.M. and A.G.U.; software, S.I.A. and S.M.; validation, F.A., H.S.M. and A.A.J.; formal analysis, A.G.U. and S.I.A.; investigation, F.A. and A.A.; resources, A.A.; data curation, A.G.U. and H.S.M.; writing—original draft preparation, A.A.; writing—review and editing, S.I.A. and S.M.; visualization, S.I.A. and A.A.J.; supervision, S.I.A. and F.A.; project administration, A.A. and F.A.; funding acquisition, A.A. and S.I.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Yes, all Agreed.

Data Availability Statement

All supporting information is accessible upon request from the editor-in-chief. Other data that support the conclusions of this study may be found in the World Health Organization’s (WHO) COVID-19 database, available at https://covid19.who.int/info (accessed on 10 November 2021 and 30 November 2021).

Acknowledgments

The authors would like to acknowledge Near East University, King Fahd University of Petroleum and Minerals, and the World Health Organization (WHO, Geneva, Switzerland) for providing the assistance required to carry out this research.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lu, H.; Stratton, C.W.; Tang, Y.-W. Outbreak of pneumonia of unknown etiology in Wuhan, China: The mystery and the miracle. J. Med. Virol. 2020, 92, 401–402.
  2. Ajay, H.; Sahajal, D.; Inderpaul, S.; Ritesh Agarwal, S. Primary cavitary sarcoidosis: A case report, systematic review, and proposal of new diagnostic criteria. Lung India 2018, 35, 41–46.
  3. Burki, T.K. Coronavirus in China. Lancet Respir. Med. 2020, 8, 238.
  4. Yu, P.; Zhu, J.; Zhang, Z.; Han, Y. A Familial Cluster of Infection Associated With the 2019 Novel Coronavirus Indicating Possible Person-to-Person Transmission During the Incubation Period. J. Infect. Dis. 2020, 221, 1757–1761.
  5. Sohail, A.; Iftikhar, M.; Arif, R.; Ahmad, H.; Gepreel, K.A.; Iftikhar, S. Dengue control measures via cytoplasmic incompatibility and modern programming tools. Results Phys. 2021, 21, 103819.
  6. Pang, N.T.P.; Kamu, A.; Kassim, M.A.M.; Ho, C.M. Monitoring the impact of Movement Control Order (MCO) in flattening the cummulative daily cases curve of Covid-19 in Malaysia: A generalized logistic growth modeling approach. Infect. Dis. Model. 2021, 6, 898–908.
  7. ArunKumar, K.; Kalaga, D.V.; Kumar, C.M.S.; Chilkoor, G.; Kawaji, M.; Brenza, T.M. Forecasting the dynamics of cumulative COVID-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: Auto-Regressive Integrated Moving Average (ARIMA) and Seasonal Auto-Regressive Integrated Moving Average (SARIMA). Appl. Soft Comput. 2021, 103, 107161.
  8. Rostami-Tabar, B.; Rendon-Sanchez, J.F. Forecasting COVID-19 daily cases using phone call data. Appl. Soft Comput. 2021, 100, 106932.
  9. Alabdulrazzaq, H.; Alenezi, M.N.; Rawajfih, Y.; Alghannam, B.A.; Al-Hassan, A.A.; Al-Anzi, F.S. On the accuracy of ARIMA based prediction of COVID-19 spread. Results Phys. 2021, 27, 104509.
  10. Tang, Y.; Wang, S. Mathematic modeling of COVID-19 in the United States. Emerg. Microbes Infect. 2020, 9, 827–829.
  11. Mati, S. Do as your neighbours do? Assessing the impact of lockdown and reopening on the active COVID-19 cases in Nigeria. Soc. Sci. Med. 2021, 270, 113645.
  12. Shi, F.; Wang, J.; Shi, J.; Wu, Z.; Wang, Q.; Tang, Z.; He, K.; Shi, Y.; Shen, D. Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation, and Diagnosis for COVID-19. IEEE Rev. Biomed. Eng. 2021, 14, 4–15.
  13. Zhou, L.; Li, Z.; Zhou, J.; Li, H.; Chen, Y.; Huang, Y.; Xie, D.; Zhao, L.; Fan, M.; Hashmi, S.; et al. A Rapid, Accurate and Machine-Agnostic Segmentation and Quantification Method for CT-Based COVID-19 Diagnosis. IEEE Trans. Med. Imaging 2020, 39, 2638–2652.
  14. Guleryuz, D. Forecasting outbreak of COVID-19 in Turkey; Comparison of Box–Jenkins, Brown’s exponential smoothing and long short-term memory models. Process. Saf. Environ. Prot. 2021, 149, 927–935.
  15. Gong, Y. Advanced Bash-Scripting Guide An in-depth exploration of the art of shell scripting Table of Contents. Water Resour. Manag. 2013, 37, 2267–2274.
  16. Tiwari, M.K.; Adamowski, J. Urban water demand forecasting and uncertainty assessment using ensemble wavelet-bootstrap-neural network models. Water Resour. Res. 2013, 49, 6486–6507.
  17. Pham, Q.B.; Gaya, M.; Abba, S.; Abdulkadir, R.; Esmaili, P.; Linh, N.T.T.; Sharma, C.; Malik, A.; Khoi, D.N.; Dung, T.D.; et al. Modelling of Bunus regional sewage treatment plant using machine learning approaches. Desalin. Water Treat. 2020, 203, 80–90.
  18. Vandamme, J.-P.; Meskens, N.; Superby, J.-F. Predicting Academic Performance by Data Mining Methods. Educ. Econ. 2007, 15, 405–419.
  19. Zhang, J.; Zhu, Y.; Zhang, X.; Ye, M.; Yang, J. Developing a Long Short-Term Memory (LSTM) based model for predicting water table depth in agricultural areas. J. Hydrol. 2018, 561, 918–929.
  20. Pham, Q.B.; Abba, S.I.; Usman, A.G.; Linh, N.T.T.; Gupta, V.; Malik, A.; Costache, R.; Vo, N.D.; Tri, D.Q. Potential of Hybrid Data-Intelligence Algorithms for Multi-Station Modelling of Rainfall. Water Resour. Manag. 2019, 33, 5067–5087.
  21. Wang, G.; Liu, X.; Li, C.; Xu, Z.; Ruan, J.; Zhu, H.; Meng, T.; Li, K.; Huang, N.; Zhang, S. A Noise-Robust Framework for Automatic Segmentation of COVID-19 Pneumonia Lesions from CT Images. IEEE Trans. Med. Imaging 2020, 39, 2653–2663.
  22. Dickey, D.A.; Fuller, W.A. Distribution of the Estimators for Autoregressive Time Series with a Unit Root. J. Am. Stat. Assoc. 1979, 74, 427–431.
  23. Musa, B.; Yimen, N.; Abba, S.I.; Adun, H.H.; Dagbasi, M. Multi-state load demand forecasting using hybridized support vector regression integrated with optimal design of off-grid energy Systems—A metaheuristic approach. Processes 2021, 9, 1166.
  24. Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control; John Wiley & Sons: Chichester, UK, 2015.
  25. Shi, X.Z.; Chen, Z.R.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 802–810.
  26. Liu, P.; Wang, J.; Sangaiah, A.K.; Xie, Y.; Yin, X. Analysis and Prediction of Water Quality Using LSTM Deep Neural Networks in IoT Environment. Sustainability 2019, 11, 2058.
  27. Abba, S.; Hadi, S.J.; Sammen, S.S.; Salih, S.Q.; Abdulkadir, R.; Pham, Q.B.; Yaseen, Z.M. Evolutionary computational intelligence algorithm coupled with self-tuning predictive model for water quality index determination. J. Hydrol. 2020, 587, 124974.
  28. Mubarak, A.; Esmaili, P.; Ameen, Z.; Abdulkadir, R.; Gaya, M.; Ozsoz, M.; Saini, G.; Abba, S. Metro-environmental data approach for the prediction of chemical oxygen demand in new Nicosia wastewater treatment plant. Desal. Water Treat. 2021, 221, 31–40.
  29. Abba, S.I.; Linh, N.T.T.; Abdullahi, J.; Ali, S.I.A.; Pham, Q.B.; Abdulkadir, R.A.; Costache, R.; Anh, D.T. Hybrid machine learning ensemble techniques for modeling dissolved oxygen concentration. IEEE Access 2020, 8, 157218–157237.
  30. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  31. Hameed, H.I.A.; Seidu, R. Random forest tree for predicting fecal indicator organisms in drinking water supply. In Proceedings of the 2017 IEEE International Conference on Behavioral, Economic, Socio-cultural Computing (BESC), Krakow, Poland, 16–18 October 2017; pp. 1–6.
  32. Al-Mukhtar, M. Random forest, support vector machine, and neural networks to modelling suspended sediment in Tigris River-Baghdad. Environ. Monit. Assess. 2019, 191, 673.
  33. Wunsch, A.; Liesch, T.; Broda, S. Groundwater Level Forecasting with Artificial Neural Networks: A Comparison of LSTM, CNN and NARX. Hydrol. Earth Syst. Sci. Discuss. 2020, 552, 1–23.
  34. Ismail, A.A.; Wood, T.; Bravo, H.C. Improving Long-Horizon Forecasts with Expectation-Biased LSTM NetWorks. 2018. Available online: http://arxiv.org/abs/1804.06776 (accessed on 10 November 2021).
  35. Zhang, D.; Hølland, E.S.; Lindholm, G.; Ratnaweera, H. Hydraulic modeling and deep learning based flow forecasting for optimizing inter catchment wastewater transfer. J. Hydrol. 2018, 567, 792–802.
  36. Qin, H. Comparison of Deep Learning Models On Time Series Forecasting: A Case Study of Dissolved Oxygen Prediction. arXiv 2019, arXiv:1911.08414.
  37. Usman, A.G.; Işik, S.; Abba, S.I.; Meriçli, F. Chemometrics-based models hyphenated with ensemble machine learning for retention time simulation of isoquercitrin in Coriander sativum L. using high-performance liquid chromatography. J. Sep. Sci. 2021, 44, 843–849.
  38. Abdullahi, H.U.; Usman, A.G.; Abba, S.I. Modelling the Absorbance of a Bioactive Compound in HPLC Method using Artificial Neural Network and Multilinear Regression Methods. DUJOPAS 2020, 6, 362–371.
  39. Toğa, G.; Atalay, B.; Toksari, M.D. COVID-19 prevalence forecasting using Autoregressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANN): Case of Turkey. J. Infect. Public Health 2021, 14, 811–816.
  40. Baba, M.N.; Makhtar, M.; Abdullah, S.; Awang, M.K. Current Issues in Ensemble Methods and its Ap-Plications. J. Theor. Appl. Inf. Technol. 2015, 81, 266–276.
  41. Nourani, V.; Elkiran, G.; Abba, S.I. Wastewater treatment plant performance analysis using artificial intelligence—An ensemble approach. Water Sci. Technol. 2018, 78, 2064–2076.
  42. Kazienko, P.; Lughofer, E.; Trawiński, B. Hybrid and ensemble methods in machine learning J. UCS special issue. J. Univers. Comput. Sci. 2013, 19, 457–461.
  43. Abba, S.I.; Elkiran, G.; Nourani, V. Non-linear ensemble modeling for multi-step ahead prediction of treated cod in wastewater treatment plant. In Proceedings of the International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions; Springer: Cham, Switzerland, 2019.
  44. Pham, Q.B.; Sammen, S.S.; Abba, S.I.; Mohammadi, B.; Shahid, S.; Abdulkadir, R.A. A new hybrid model based on relevance vector machine with flower pollination algorithm for phycocyanin pigment concentration estimation. Environ. Sci. Pollut. Res. 2021, 28, 32564–32579.
  45. Abba, S.I.; Abdulkadir, R.A.; Sammen, S.S.; Usman, A.G.; Meshram, S.G.; Malik, A.; Shahid, S. Comparative implementation between neuro-emotional genetic algorithm and novel ensemble computing techniques for modelling dissolved oxygen concentration. Hydrol. Sci. J. 2021, 66, 1584–1596.
  46. Sammen, S.S.; Ehteram, M.; Abba, S.I.; Abdulkadir, R.A.; Ahmed, A.N.; El-Shafie, A. A new soft computing model for daily streamflow forecasting. Stoch. Environ. Res. Risk Assess. 2021, 35, 2479–2491.
  47. Abba, S.I.; Abdulkadir, R.A.; Sammen, S.S.; Pham, Q.B.; Lawan, A.A.; Esmaili, P.; Malik, A.; Al-Ansari, N. Integrating feature extraction approaches with hybrid emotional neural networks for water quality index modeling. Appl. Soft Comput. 2022, 114, 108036.
  48. Shamshirband, S.; Esmaeilbeiki, F.; Zarehaghi, D.; Neyshabouri, M.; Samadianfard, S.; Ghorbani, M.A.; Mosavi, A.; Nabipour, N.; Chau, K.W. Comparative analysis of hybrid models of firefly optimization algorithm with support vector machines and multilayer perceptron for predicting soil temperature at different depths. Eng. Appl. Comput. Fluid Mech. 2020, 14, 939–953.
  49. Taylor, K.E. Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res. 2001, 106, 7183–7192.
  50. Areepong, Y.; Sunthornwat, R. Forecasting modeling of the number of cumulative COVID-19 cases with deaths and recoveries removal in Thailand. Sci. Eng. Health Stud. 2020, 15, 21020004.
Figure 1. (a) Major keywords in the literature on COVID-19, determined using machine learning models (2020–2021); (b) investigated research regions for the COVID-19 prediction.
Figure 2. The overall flowchart of the models.
Figure 3. Algorithm for developing the ARIMA models.
Figure 4. Overall description of RF.
Figure 5. Long short-term memory neural network.
Figure 6. Bar plot of CCC for the four regions.
Figure 7. Box-auto correlation analysis between the variables.
Figure 8. ACF of the first difference of CCC for the four countries.
Figure 9. PACF of the first difference of CCC for the four countries.
Figure 10. MAPE and RMSE values for: (a) LY, (b) NG, (c) TR, and (d) ZA for both training and testing phases.
Figure 11. Forecast comparison graph for the: (a) training, and (b) testing samples.
Figure 12. Taylor diagram for the: (a) training, and (b) testing samples.
Figure 13. Fan plot of RMSE for the: (a) training, and (b) testing samples.
Figure 14. Radar chart for all of the models for R² and R.
Figure 15. Results of cumulative distribution function for all of the single models.
Figure 16. Error performance in terms of MSE for all of the models.
Figure 17. Taylor graphical representations of ensemble models.
Table 1. Descriptive statistics of the variables.

| Country | Ȳ | Y_med | Y_max | Y_min | σ | N |
|---|---|---|---|---|---|---|
| LY | 129,333.3 | 111,124 | 366,789 | 1 | 116,504.4 | 605 |
| NG | 104,581.5 | 95,934 | 213,464 | 5 | 72,519.22 | 631 |
| TR | 2,906,611 | 2,355,839 | 8,503,220 | 1 | 2,635,359 | 619 |
| ZA | 1,252,298 | 1,231,597 | 2,927,499 | 5 | 960,752.1 | 625 |
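Table 2. ADF and PP unit root test results.

| Test | Variables | Level: None | Level: Constant | Level: Constant and Trend | First difference: None | First difference: Constant | First difference: Constant and Trend | Decision |
|---|---|---|---|---|---|---|---|---|
| ADF | LY | 2.676 | 1.594 | −1.986 | −2.291 *** | −3.933 *** | −4.527 *** | I(1) |
| ADF | NG | 1.244 | −0.705 | −2.225 | −2.179 *** | −3.058 *** | −3.041 *** | I(1) |
| ADF | TR | 0.931 | 0.311 | −2.582 | −0.89 | −1.865 | −2.167 | I(2) |
| ADF | ZA | 0.929 | −0.444 | −3.166 | −1.72 | −2.368 | −2.277 | I(2) |
| PP | LY | 7.204 | 3.413 | −1.797 | −10.572 *** | −17.217 *** | −20.344 *** | I(1) |
| PP | NG | 3.755 | −0.116 | −1.627 | −11.264 *** | −17.258 *** | −17.263 *** | I(1) |
| PP | TR | 7.839 | 3.865 | −1.414 | −0.907 | −1.882 | −2.143 | I(2) |
| PP | ZA | 4.329 | 0.882 | −2.221 | −4.091 *** | −4.112 *** | −4.567 *** | I(1) |

*** signifies rejection of the null hypothesis at the 1% level of significance.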
Table 3. Evaluation results for the training sample.

| Country | Models | RMSE | MAE | MAPE | SMAPE | Theil U1 | Theil U2 |
|---|---|---|---|---|---|---|---|
| LY | ARIMAML | 287.4491 | 171.1810 | 6.329595 | 4.651772 | 0.001302 | 4.456228 |
| LY | AUTOARIMA | 266.0884 | 158.2471 | 5.097415 | 4.000233 | 0.001204 | 3.065781 |
| LY | ARIMAGLS | 287.4962 | 170.3428 | 1.579800 | 1.613471 | 0.001302 | 0.904637 |
| LY | ARIMAML-ARIMAGLS | 287.4308 | 170.5967 | 3.587180 | 3.010201 | 0.001302 | 2.431895 |
| NG | ARIMAML | 194.9946 | 107.5219 | 9.913560 | 5.122160 | 0.000946 | 4.291490 |
| NG | AUTOARIMA | 220.0399 | 124.1118 | 1.442778 | 1.494742 | 0.001067 | 0.834548 |
| NG | ARIMAGLS | 195.6538 | 106.9622 | 1.257531 | 1.378221 | 0.000949 | 0.900122 |
| NG | ARIMAML-ARIMAGLS | 195.1350 | 107.0657 | 5.410406 | 3.742184 | 0.000946 | 2.199975 |
| TR | ARIMAML | 1384.8780 | 667.1446 | 9.421745 | 1.801850 | 0.000259 | 6.971105 |
| TR | AUTOARIMA | 9364.0090 | 6847.1200 | 1.622860 | 1.533954 | 0.001748 | 0.715107 |
| TR | ARIMAGLS | 1384.8720 | 666.1256 | 8.916807 | 1.775690 | 0.000259 | 6.619425 |
| TR | ARIMAML-ARIMAGLS | 1384.8530 | 666.6342 | 9.169276 | 1.788919 | 0.000259 | 6.795260 |
| ZA | ARIMAML | 1685.9570 | 818.2752 | 14.266580 | 3.603760 | 0.000768 | 13.075140 |
| ZA | AUTOARIMA | 6842.2840 | 4613.7460 | 1.256974 | 1.241767 | 0.003113 | 0.538986 |
| ZA | ARIMAGLS | 1690.7090 | 819.5312 | 1.048277 | 1.105426 | 0.000770 | 0.673285 |
| ZA | ARIMAML-ARIMAGLS | 1686.6170 | 814.9397 | 7.180849 | 2.674733 | 0.000768 | 6.467583 |
Table 4. Evaluation results for the testing sample.

| Country | Models | RMSE | MAE | MAPE | SMAPE | Theil U1 | Theil U2 |
|---|---|---|---|---|---|---|---|
| LY | ARIMAML | 287.4491 | 171.1810 | 6.329595 | 4.651772 | 0.001302 | 4.456228 |
| LY | AUTOARIMA | 266.0884 | 158.2471 | 5.097415 | 4.000233 | 0.001204 | 3.065781 |
| LY | ARIMAGLS | 287.4962 | 170.3428 | 1.579800 | 1.613471 | 0.001302 | 0.904637 |
| LY | ARIMAML-ARIMAGLS | 287.4308 | 170.5967 | 3.587180 | 3.010201 | 0.001302 | 2.431895 |
| NG | ARIMAML | 194.9946 | 107.5219 | 9.913560 | 5.122160 | 0.000946 | 4.291490 |
| NG | AUTOARIMA | 220.0399 | 124.1118 | 1.442778 | 1.494742 | 0.001067 | 0.834548 |
| NG | ARIMAGLS | 195.6538 | 106.9622 | 1.257531 | 1.378221 | 0.000949 | 0.900122 |
| NG | ARIMAML-ARIMAGLS | 195.1350 | 107.0657 | 5.410406 | 3.742184 | 0.000946 | 2.199975 |
| TR | ARIMAML | 1384.8780 | 667.1446 | 9.421745 | 1.801850 | 0.000259 | 6.971105 |
| TR | AUTOARIMA | 9364.0090 | 6847.1200 | 1.622860 | 1.533954 | 0.001748 | 0.715107 |
| TR | ARIMAGLS | 1384.8720 | 666.1256 | 8.916807 | 1.775690 | 0.000259 | 6.619425 |
| TR | ARIMAML-ARIMAGLS | 1384.8530 | 666.6342 | 9.169276 | 1.788919 | 0.000259 | 6.795260 |
| ZA | ARIMAML | 1685.9570 | 818.2752 | 14.266580 | 3.603760 | 0.000768 | 13.075140 |
| ZA | AUTOARIMA | 6842.2840 | 4613.7460 | 1.256974 | 1.241767 | 0.003113 | 0.538986 |
| ZA | ARIMAGLS | 1690.7090 | 819.5312 | 1.048277 | 1.105426 | 0.000770 | 0.673285 |
| ZA | ARIMAML-ARIMAGLS | 1686.6170 | 814.9397 | 7.180849 | 2.674733 | 0.000768 | 6.467583 |
Table 5. Results and performance analysis of the models.

| Models | Calibration R² | Calibration MSE | Calibration R | Calibration RMSE | Verification R² | Verification MSE | Verification R | Verification RMSE |
|---|---|---|---|---|---|---|---|---|
| RF-M1 | 0.8982 | 0.0705 | 0.9477 | 0.2655 | 0.8526 | 1.0205 | 0.9234 | 1.0102 |
| RF-M2 | 0.9082 | 0.0626 | 0.9530 | 0.2502 | 0.8985 | 0.7715 | 0.9479 | 0.8784 |
| LSTM-M1 | 0.9447 | 0.0336 | 0.9720 | 0.1833 | 0.9393 | 0.0450 | 0.9692 | 0.2121 |
| LSTM-M2 | 0.8876 | 0.3705 | 0.9421 | 0.6087 | 0.8864 | 0.8374 | 0.9415 | 0.9151 |
| NN-EML | 0.9776 | 0.0305 | 0.9881 | 0.1746 | 0.9694 | 0.0374 | 0.9845 | 0.1933 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
