**1. Introduction**

In exponential growth, the population grows at a rate proportional to its current size. This is unrealistic, since in reality a population will not exceed some maximum size, called its carrying capacity. The logistic equation [1] (Chapter 6) deals with this problem by ensuring that the growth rate of the population decreases as the population approaches its carrying capacity [2]. Statistical modelling of the logistic equation's growth and decay is accomplished with the *logistic distribution* [3] and [4] (Chapter 22), whose tails are heavier than those of the ubiquitous normal distribution. The normal and logistic distributions are both symmetric; however, real data often exhibit skewness [5], which has given rise to extensions of the normal distribution that accommodate skewness, as in the skew normal [6] and epsilon skew normal [7] distributions. Subsequently, skew logistic distributions were also devised, as in [8,9].
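As a minimal illustration of the contrast between the two growth laws, the following Python sketch evaluates the textbook closed-form solutions of dN/dt = rN (exponential) and dN/dt = rN(1 − N/K) (logistic). The function names and the parameter values (initial size 10, rate 0.5, carrying capacity 1000) are illustrative assumptions and are not taken from the paper.

```python
import math

def exponential_growth(t, n0, r):
    """Closed-form solution of dN/dt = r*N with N(0) = n0."""
    return n0 * math.exp(r * t)

def logistic_growth(t, n0, r, k):
    """Closed-form solution of dN/dt = r*N*(1 - N/k) with N(0) = n0."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))

def logistic_growth_rate(n, r, k):
    """Right-hand side of the logistic equation: growth rate at size n."""
    return r * n * (1.0 - n / k)

# Illustrative (assumed) parameter values, not taken from the paper.
n0, r, k = 10.0, 0.5, 1000.0

for t in (0, 5, 10, 20, 40):
    print(t,
          round(exponential_growth(t, n0, r), 1),
          round(logistic_growth(t, n0, r, k), 1))
```

Under these assumptions the exponential curve grows without bound, whereas the logistic curve levels off at the carrying capacity. The logistic growth rate is largest at N = K/2 and falls to zero at N = K, which gives the bell-shaped growth-then-decay profile that the logistic distribution describes.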

Epidemics, such as COVID-19, are traditionally modelled by compartmental models such as the SIR (Susceptible-Infected-Removed) model and its extension, the SEIR (Susceptible-Exposed-Infected-Removed) model, which estimate the trajectory of an epidemic [10]. These models typically rely on assumptions about how the disease is transmitted and progresses [11], and are routinely used to understand the consequences of policies such as mask wearing and social distancing [12]. Time series models [13], on the other hand, employ historical data to make forecasts about the future, are generally simpler than compartmental models, and are able to forecast, for example, the number of cases, hospitalisations and deaths. The SIR model can be interpreted as a logistic growth model [14,15]. However, as the data is inherently skewed, a skewed logistic statistical model is a natural choice, although such a model does not rely on biological assumptions in its forecasts [16].

**Citation:** Levene, M. A Skew Logistic Distribution for Modelling COVID-19 Waves and Its Evaluation Using the Empirical Survival Jensen–Shannon Divergence. *Entropy* **2022**, *24*, 600. https://doi.org/10.3390/e24050600

Academic Editors: Karagrigoriou Alexandros and Makrides Andreas

Received: 21 March 2022 Accepted: 21 April 2022 Published: 25 April 2022

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Herein, we present a novel yet simple (one may argue the simplest) three-parameter skewed extension of the logistic distribution that allows for asymmetry; cf. [16]. If, instead of our extension, we were to deploy one of the other skew logistic distributions (such as the one described in [8]), the results would no doubt be comparable to those we obtain herein; nevertheless, we pursue our simpler extension and detail its statistical properties.

In the context of analysing epidemics, the logistic distribution is normally preferred, as it is a natural distribution to use in modelling population growth and decay. Nevertheless, we briefly compare the results we obtain in modelling COVID-19 waves with the skew logistic distribution to those obtained when a skew normal distribution is employed instead (more specifically, we choose the flexible epsilon skew normal distribution [7]). This comparison indicates that utilising the epsilon skew normal distribution leads, overall, to results comparable to those obtained with the skew logistic distribution. However, in practice, it is still preferable to make use of the skew logistic distribution, as it is the natural model to deploy in this context [17] and, on the whole, it is more consistent with the data because its tails are heavier than those of a skew normal distribution.
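The remark about heavier tails can be checked numerically in the symmetric case. The sketch below compares the upper-tail probabilities P(X > x) of a standard normal distribution and a logistic distribution rescaled to unit variance (scale s = √3/π, since the logistic variance is s²π²/3). It deals only with the symmetric normal and logistic distributions, not the skew versions discussed in the paper, and the function names are illustrative assumptions.

```python
import math

def normal_sf(x, mu=0.0, sigma=1.0):
    """Survival function P(X > x) of the normal distribution."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def logistic_sf(x, mu=0.0, s=1.0):
    """Survival function P(X > x) of the logistic distribution."""
    return 1.0 / (1.0 + math.exp((x - mu) / s))

# Logistic variance is s^2 * pi^2 / 3, so this scale gives unit variance.
S_UNIT_VAR = math.sqrt(3.0) / math.pi

for x in (2.0, 3.0, 4.0, 5.0):
    print(x, normal_sf(x), logistic_sf(x, s=S_UNIT_VAR))
```

With matched means and variances, the logistic tail probability exceeds the normal one at each of these points, and the ratio grows with x because the logistic tail decays exponentially while the normal tail decays like exp(−x²/2).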

Epidemics are said to come in "waves". The precise definition of a wave is somewhat elusive [18], but it is generally accepted that, given a time series of, say, the number of daily hospitalisations, a wave spans the period from one valley (minimum) in the time series to the next valley, with a peak (maximum) in between them. There is no strict requirement that waves do not overlap, although, for simplicity, we will not consider any such overlap; see [18] for an attempt to give an operational definition of the concept of an epidemic wave. In order to combine waves, we make use of the concept of *bi-logistic growth* [19,20], or more generally, multi-logistic growth, which allows us to sum two or more instances of logistic growth when the time series spans more than a single wave.
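To make the valley-peak-valley picture concrete, the following Python sketch sums two logistic growth components and locates the peaks and valleys of the resulting daily series. It uses the standard symmetric logistic curve rather than the skew logistic distribution of this paper, and the function names and parameter values (saturation level, rate and midpoint of each wave) are hypothetical choices made purely for illustration.

```python
import math

def logistic_rate(t, k, r, t_mid):
    """Daily rate (time derivative) of a logistic growth curve with
    saturation level k, growth rate r and midpoint t_mid."""
    e = math.exp(-r * (t - t_mid))
    return k * r * e / (1.0 + e) ** 2

def bi_logistic_rate(t, wave1, wave2):
    """Sum of two logistic components, one per wave."""
    return logistic_rate(t, *wave1) + logistic_rate(t, *wave2)

# Hypothetical (k, r, t_mid) values for two well-separated waves.
wave1 = (1000.0, 0.15, 50.0)
wave2 = (2000.0, 0.10, 200.0)

# One value per day, so the list index equals the day number.
daily = [bi_logistic_rate(t, wave1, wave2) for t in range(0, 301)]

# A peak (valley) is a day strictly higher (lower) than both neighbours.
peaks = [i for i in range(1, len(daily) - 1)
         if daily[i - 1] < daily[i] > daily[i + 1]]
valleys = [i for i in range(1, len(daily) - 1)
           if daily[i - 1] > daily[i] < daily[i + 1]]

print("peaks:", peaks)
print("valleys:", valleys)
```

In this synthetic example each component contributes one peak, and the single valley between them marks where the first wave ends and the second begins. Real time series are noisy, so a practical wave detector would need smoothing or a more careful operational definition.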

To fit the skew logistic distribution to the time series data we employ maximum likelihood, and to evaluate the goodness-of-fit we make use of the recently formulated *empirical survival Jensen–Shannon divergence* (E*SJS*) [21,22] and the well-established *Kolmogorov–Smirnov two-sample test statistic* (*KS*2) [23] (Section 6.3). The E*SJS* is an information-theoretic goodness-of-fit measure of a fitted parametric continuous distribution, which overcomes the inadequacy of the *coefficient of determination*, *R*², as a goodness-of-fit measure for nonlinear models [24]. The *KS*2 statistic also overcomes this inadequacy of *R*²; however, we observe that the 95% bootstrap confidence intervals [25] we obtain for the E*SJS* are narrower than those for the *KS*2, suggesting that the E*SJS* is more powerful [26] than the *KS*2. Another well-known limitation of the *KS*2 statistic is that it is less sensitive than the E*SJS* statistic to discrepancies at the tails of the distribution, in the sense that, as opposed to the E*SJS*, it is "local", i.e., its value is determined by a single point [27].
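The "local" nature of the Kolmogorov–Smirnov two-sample statistic can be seen from a direct implementation. The sketch below follows the standard textbook definition (the largest absolute gap between the two empirical distribution functions); it is not the paper's own code, and the function names are illustrative assumptions. It also returns a point at which the largest gap is attained, to emphasise that the value depends on a single location.

```python
import bisect

def ecdf(sample):
    """Return the empirical distribution function of a sample."""
    xs = sorted(sample)
    n = len(xs)

    def cdf(x):
        # Fraction of observations less than or equal to x.
        return bisect.bisect_right(xs, x) / n

    return cdf

def ks_two_sample(a, b):
    """Two-sample KS statistic and a point where the largest gap occurs."""
    cdf_a, cdf_b = ecdf(a), ecdf(b)
    # The gap between two step functions can only change at observed values.
    points = sorted(set(a) | set(b))
    gaps = [(abs(cdf_a(x) - cdf_b(x)), x) for x in points]
    d, x_at = max(gaps)  # ties in the gap are broken by the larger x
    return d, x_at

d, x_at = ks_two_sample([1, 2, 3, 4], [3, 4, 5, 6])
print(d, x_at)
```

Because the statistic is a maximum over points, changes to the samples that leave the largest gap untouched do not alter its value, which is the sense in which it is insensitive to discrepancies elsewhere, including in the tails.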

The rest of the paper is organised as follows. In Section 2, we introduce a skew logistic distribution, which is a simple extension of the standard, symmetric, logistic distribution obtained by adding to it a single skew parameter, and derive some of its properties. In Section 3, we formulate the solution to the maximum likelihood estimation of the parameters of the skew logistic distribution. In Section 4, we make use of an extension of the skew logistic distribution to the bi-skew logistic distribution to model a time series of COVID-19 data items having more than a single wave. In Section 5, we provide an analysis of daily COVID-19 deaths in the UK from 30 January 2020 to 30 July 2021, assuming the skew logistic distribution as an underlying model of the data. The evaluation of the goodness-of-fit of the skew logistic distribution to the data makes use of the recently formulated E*SJS*, and compares the results to those obtained when employing the *KS*2 instead. We observe that the same technique, which we applied to the analysis of COVID-19 deaths, can be used to model new cases and hospitalisations. Finally, in Section 6, we present our concluding remarks. It is worth noting that in the more general setting of information modelling, being able to detect epidemic waves may help supply chains in planning increased resistance to such adverse events [28]. We note that all computations were carried out using the MATLAB software package.
