**3. Results**

The model can be applied to any country or region, and to different time periods. We illustrate its use, without loss of generality, with data available until 31 March 2020. The data source is the daily World Health Organisation reports (see World Health Organisation 2020), from which we have extracted the "Total confirmed new cases". Figure 1 presents the observed evolution of the daily new cases of infection for China (starting from 20 January) and for Iran, South Korea and Italy (starting from 21 February). We consider data until the end of March (Figure 1) and make predictions for the beginning of April because at that time contagion counts in the analysed countries were still high and predictions were challenging. Since the count response variable is Poisson distributed, its variance depends on the number of observed counts, which has been declining in the considered countries from April onwards, if not earlier.

**Figure 1.** Observed infection counts.

Figure 1 shows that, as of 31 March 2020, COVID-19 contagion in China has completed a full cycle, with an upward trend, a peak, and a downward trend. South Korea seems to have followed a similar path, with a smaller intensity, and Italy a similar one with a larger intensity. The contagion dynamics in Iran are more difficult to interpret and are still quite erratic.

The application of our model can better qualify these conclusions. The estimated model parameters for China, using all data available until 31 March, are shown in Table 1.


**Table 1.** Model estimates for China, with standard errors and *p*-values.

Table 1 shows that all estimated autoregressive coefficients are significant, confirming the presence of both a short-term dependence and a long-term trend. In terms of interpretation, the estimate of *α* shows that, if the expectation of new cases for yesterday was close to 0, 100 new cases observed yesterday generate about 40 new expected cases today. According to the value estimated for *β*, an expectation of 100 new cases for yesterday instead generates about 2 new expected cases today, if no cases were observed yesterday.

With the aim of better interpreting the time series of the other countries, which on 31 March seem not to have completed their contagion cycle yet, we repeatedly fit the model to the Chinese data, using increasing amounts of data, in a retrospective way. More precisely, we first fit the model on the first 15 counts from China (a minimal requirement for statistical consistency of the results), then on the first 16, and so on. For each fit we plot the estimated *α* and *β* parameters in Figure 2.
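The retrospective refitting described above can be sketched as a simple expanding-window loop. This is only an illustration: `fit` is a hypothetical callable standing in for whatever routine estimates the PAR model and returns the pair (*α*, *β*); it is not part of the original analysis, and the default of 15 observations simply mirrors the minimum mentioned in the text.

```python
from typing import Callable, List, Sequence, Tuple


def expanding_window_estimates(
    counts: Sequence[int],
    fit: Callable[[Sequence[int]], Tuple[float, float]],
    min_obs: int = 15,
) -> List[Tuple[int, float, float]]:
    """Refit the model on counts[:n] for n = min_obs, ..., len(counts).

    `fit` is a placeholder (assumption): any function that takes a window
    of daily counts and returns the estimated (alpha, beta).
    Returns one (n, alpha, beta) triple per window.
    """
    results = []
    for n in range(min_obs, len(counts) + 1):
        alpha, beta = fit(counts[:n])
        results.append((n, alpha, beta))
    return results
```

Plotting the second and third elements of each triple against the first gives a picture of the kind shown in Figure 2.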

**Figure 2.** Evolution of the *α* and *β* parameters for Chinese daily infection counts.

Figure 2 shows that, until 11 February (the 23rd day reported), *β* is greater than *α*, indicating the presence of a still increasing trend (the *β* component) that absorbs the short-term component. After that date, as data from the downward trend accumulate, *β* starts decreasing and *α* increasing. The estimates approach the values in Table 1 around 20 February: after this date the estimated parameters become stable, in the sense that the difference between subsequent estimates falls below 0.01.
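The stability criterion can be made concrete with a small helper. The sketch below assumes that "stable" means every later change between consecutive estimates stays below the tolerance; the function name and the example values are illustrative and are not taken from Figure 2.

```python
from typing import Optional, Sequence


def first_stable_index(estimates: Sequence[float], tol: float = 0.01) -> Optional[int]:
    """Return the earliest index from which all consecutive changes are < tol.

    Scans backwards from the last estimate and stops at the first change
    that is not below the tolerance. Returns None if even the last change
    is too large, or if fewer than two estimates are available.
    """
    stable_from = None
    for i in range(len(estimates) - 1, 0, -1):
        if abs(estimates[i] - estimates[i - 1]) < tol:
            stable_from = i - 1
        else:
            break
    return stable_from


# Made-up sequence of alpha estimates from successive refits.
alphas = [0.10, 0.25, 0.38, 0.395, 0.40, 0.401]
print(first_stable_index(alphas))  # index from which the estimates are stable
```

Applying the same check to both the *α* and the *β* sequences, and taking the later of the two indices, gives a date after which both parameters can be regarded as stable.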

The results obtained from the Chinese data suggest using the PAR model to assess which stage of the contagion cycle the other countries have reached. We thus estimate the model parameters for the other three countries, using the data available until 31 March. Our results show that, for Iran, on that date the *α* parameter prevails, with an estimated value equal to 0.96, indicating a process mainly driven by a short-term dependence on the previous time points. However, further analyses reveal that the parameters estimated for Iran are very unstable. The estimated *β* parameter for South Korea is not significant, indicating the absence of a trend effect on the daily counts, consistent with what is observed in Figure 1. For Italy, instead, *α* is about 0.51, higher than *β* (about 0.38), as in China but with a smaller difference between the two parameters, indicating that, at the end of March, the trend component is weakening.

To conclude, we believe that our model can constitute a useful statistical tool for decision makers: in each country, once a minimal series of data is collected (we suggest 15 days), the values of *α* and *β* can be monitored over time to reveal which stage the contagion dynamics have reached: well beyond the peak (as in China and South Korea); close to or right after the peak (as in Italy on 31 March); or in a situation that could indicate that the peak has been reached, but which needs more data to be understood (as in Iran at the end of March).

Because the model is fully reproducible, its application can easily be extended to more countries and time periods as data become available.

To better understand the advantages of our proposed specification and, at the same time, to show its possible improvement, we now compare it with two alternative models, one simpler and one more complex.

The first one is a classic exponential growth model, that is, a regression of the logarithm of the number of daily new cases on time, expressed as days since the outbreak:

$$
\log(y_t) = \kappa_0 + \kappa_1 t. \tag{1}
$$
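As an illustration of model (1), the following sketch estimates *κ*<sub>0</sub> and *κ*<sub>1</sub> by ordinary least squares on the log counts. Two assumptions are made here that the text does not state: time is indexed as t = 1, ..., n, and all counts are strictly positive (zero counts would need separate handling, since the logarithm is undefined). The function names are illustrative.

```python
import math
from typing import Sequence, Tuple


def fit_exponential_growth(counts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log(y_t) = kappa0 + kappa1 * t, with t = 1, ..., n."""
    n = len(counts)
    if n < 2:
        raise ValueError("at least two observations are needed")
    if any(y <= 0 for y in counts):
        raise ValueError("all counts must be strictly positive to take logs")
    times = range(1, n + 1)
    logs = [math.log(y) for y in counts]
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    kappa1 = sxy / sxx
    kappa0 = l_mean - kappa1 * t_mean
    return kappa0, kappa1


def predict_exponential(kappa0: float, kappa1: float, t: float) -> float:
    """Predicted daily count on day t under the fitted exponential model."""
    return math.exp(kappa0 + kappa1 * t)
```

Because *κ*<sub>1</sub> is constant over the whole sample, the fitted curve grows (or decays) at a single rate, which is the "static" behaviour the comparison with the autoregressive models is meant to highlight.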

The second alternative model we consider is a PARX model (Agosto et al. 2016), that is, a Poisson autoregressive model with a covariate. As a covariate we use time: the number of days since the outbreak, as in the classical exponential model. Thus, we extend the PAR model as follows:

$$
\log(\lambda_t) = \omega + \alpha\log(1 + y_{t-1}) + \beta\log(\lambda_{t-1}) + \gamma t.
$$
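To make the recursion concrete, the sketch below computes the intensities implied by the PARX specification above for given parameter values, together with the Poisson log-likelihood that an estimation routine would maximise. The starting value log(1 + y<sub>0</sub>) for the log-intensity is an assumption made for illustration, as are the function names; setting `gamma = 0` gives the PAR model without the covariate.

```python
import math
from typing import List, Sequence


def parx_intensities(
    y: Sequence[int], omega: float, alpha: float, beta: float, gamma: float = 0.0
) -> List[float]:
    """Intensities from log(lam_t) = omega + alpha*log(1 + y_{t-1})
    + beta*log(lam_{t-1}) + gamma*t, for t = 1, ..., len(y) - 1.

    The initial log-intensity is set to log(1 + y_0) (an assumption).
    """
    log_lam = [math.log(1 + y[0])]
    for t in range(1, len(y)):
        log_lam.append(
            omega + alpha * math.log(1 + y[t - 1]) + beta * log_lam[-1] + gamma * t
        )
    return [math.exp(v) for v in log_lam]


def poisson_loglik(y: Sequence[int], lam: Sequence[float]) -> float:
    """Poisson log-likelihood of the counts y given the intensities lam."""
    return sum(
        yt * math.log(lt) - lt - math.lgamma(yt + 1) for yt, lt in zip(y, lam)
    )
```

Maximising `poisson_loglik(y, parx_intensities(y, ...))` over the parameters with a numerical optimiser is one way to obtain estimates; the optimiser itself is left out of this sketch.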

We now apply the three models, estimated using data until the end of March, to make 10-day-ahead predictions of the daily new cases. The results obtained for South Korea, Iran and Italy are shown in Figures 3–5.

**Figure 3.** Daily infection counts in South Korea: observed and predicted values.

**Figure 4.** Daily infection counts in Iran: observed and predicted values.

**Figure 5.** Daily infection counts in Italy: observed and predicted values.
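Multi-step predictions from an autoregressive count model require a value for the not yet observed count of the previous day. The text does not specify how this is handled; the sketch below uses one common plug-in choice, replacing each unobserved count with its predicted intensity. The initial log-intensity log(1 + y<sub>0</sub>), the plug-in rule and the function name are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def forecast_parx(
    y: Sequence[int],
    omega: float,
    alpha: float,
    beta: float,
    gamma: float = 0.0,
    horizon: int = 10,
) -> List[float]:
    """Iterate the PAR/PARX recursion `horizon` days beyond the sample.

    In-sample, observed counts drive the recursion. Out-of-sample, each
    unobserved count is replaced by the previous predicted intensity
    (a plug-in rule, assumed here).
    """
    # Filter the log-intensity through the observed sample.
    log_lam = math.log(1 + y[0])
    for t in range(1, len(y)):
        log_lam = omega + alpha * math.log(1 + y[t - 1]) + beta * log_lam + gamma * t

    forecasts = []
    last_count = float(y[-1])
    for t in range(len(y), len(y) + horizon):
        log_lam = omega + alpha * math.log(1 + last_count) + beta * log_lam + gamma * t
        lam = math.exp(log_lam)
        forecasts.append(lam)
        last_count = lam  # plug in the prediction for the unobserved count
    return forecasts
```

With `gamma = 0` the function produces PAR forecasts; a non-zero `gamma` adds the time covariate of the PARX model.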

Figures 3–5 all show the limits of the exponential model which, being a "static" model, cannot capture time variations in the contagion dynamics, unlike both the PAR and the PARX models. The latter, being dynamic models, can better adapt to variations in disease counts, without the need to frequently adjust the estimates and find a saturation point, as would be the case for the exponential model.

To compare the models in terms of out-of-sample predictive performance, in Table 2 we report the values of the Root Mean Squared Error (RMSE) and the Mean Percentage Error (MPE) for the three specifications.
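For reference, the two error measures can be computed as in the sketch below. The sign convention for the MPE (observed minus predicted, divided by observed, expressed in percent) is an assumption, since conventions differ; with this choice a negative MPE indicates over-prediction on average. Observed values must be non-zero for the MPE to be defined.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mpe(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean percentage error, in percent: mean of (actual - predicted) / actual."""
    n = len(actual)
    return 100.0 * sum((a - p) / a for a, p in zip(actual, predicted)) / n
```

For example, with observed counts of 100 and 200 and predictions of 110 and 190, the RMSE is 10 and the MPE is -2.5%: the errors have the same absolute size, but the over-prediction weighs more in relative terms.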


**Table 2.** Out-of-sample error measures.

The results in Table 2 show that the PAR model always outperforms the other two, except in the case of South Korea, for which the preferable specification turns out to be the Poisson autoregression including the time since outbreak as a covariate. This finding is consistent with what is observed in Figures 3–5 and confirms the superiority of Poisson autoregressive models over the exponential growth model. This advantage explains the potential impact of our proposal, which is successfully implemented and updated weekly on the infographic website of the Center for European Policy Studies<sup>1</sup>.

**Author Contributions:** All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The work of the authors is receiving support from the European Union's Horizon 2020 training and innovation programme "FIN-TECH", under the grant agreement No. 825215 (Topic ICT-35-2018, Type of actions: CSA). The paper is the result of the joint collaboration between the two authors.

**Conflicts of Interest:** The authors declare no conflicts of interest.
