The authors wish to update the Abstract and Section 3 in their paper published in the International Journal of Environmental Research and Public Health (IJERPH) [1].
They would like to rewrite the abstract as follows:
Abstract: Foodborne diseases have a major impact on public health and are often underreported. This is because many patients delay treatment when they suffer from foodborne diseases. In Hunan Province (China), a total of 21,226 confirmed foodborne disease cases were reported from 1 March 2015 to 28 February 2016 by the Foodborne Surveillance Database (FSD) of the China National Centre for Food Safety Risk Assessment (CFSA). The purpose of this study was to make use of the daily number of visiting patients to forecast the daily true number of patients. Our main contribution is that we take the reporting delays into consideration and apply a Bayesian hierarchical model to this forecasting problem. The data show that there were 21,226 confirmed cases reported among 21,866 visiting patients, a proportion as high as 97%. Given this observation, the Bayesian hierarchical model was established to predict the daily true number of patients using the number of visiting patients. We use several scoring rules to assess the performance of different nowcasting procedures. We conclude that Bayesian nowcasting with consideration of right truncation of the reporting delays performs well for short-term forecasting and could effectively predict the epidemic trends of foodborne diseases. Meanwhile, this approach could provide a methodological basis for future foodborne disease monitoring and control strategies, which are crucial for public health.
At the end of the first paragraph in Section 3, the authors would like to update the last two sentences of the paragraph and add some citations. The revised sentences are as follows:
In this paper, we apply a Bayesian nowcasting model proposed by Höhle and an der Heiden [2] to forecast the daily total number of cases. Thanks to the R package “surveillance” provided by Salmon et al. [3], inference for the model can be performed easily. The package also contains a few other nowcasting methods, which we also tried and compared using the scoring rules implemented in the package. The results are shown in Section 4. Below we review the model.
The authors revised Section 3.2 to describe the inference approach proposed by Höhle and an der Heiden (in Section 3.2 of [2]) in greater detail. This part now reads as follows:
For the convenience of the reader, we describe the inference approach proposed by Höhle and an der Heiden (in Section 3.2 of [2]) in greater detail.
Define $p_d$ as the (time-homogeneous) probability that a case will have a reporting delay of $d$ days, $d = 0, \ldots, D$. The $p_d$’s satisfy the following equation: $\sum_{d=0}^{D} p_d = 1$. Following Kalbfleisch and Lawless’s previous work [4] and Zeger et al.’s previous work [5], we assume that the occurrence time of cases follows an underlying inhomogeneous Poisson process. A reasonable data generating process for the daily number of cases is thus as follows:
$$N(t,\infty) \sim \mathrm{Po}(\lambda_t), \qquad (n_{t,0}, \ldots, n_{t,D})^\top \mid N(t,\infty), \boldsymbol{p} \sim \mathrm{M}(N(t,\infty), \boldsymbol{p}),$$
where $\mathrm{Po}(\lambda)$ denotes the Poisson distribution with expectation $\lambda$ and $\mathrm{M}(N, \boldsymbol{p})$ denotes the multinomial distribution with size parameter $N$ and probability vector $\boldsymbol{p}$. Nowcasting for a given time $T$ can thus be divided into steps of determining the $\lambda_t$’s, estimating the unknown delay distribution (i.e., the $p_d$’s), and finally predicting the unobserved $n_{t,d}$’s in order to compute the total $N(t,\infty)$. As $T$ increases, and if the assumption about a time-homogeneous delay distribution is acceptable, the available data make it possible to estimate the delay distribution better and better, and hence the quality of the predictions near $T$ improves with time.
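The right truncation behind this nowcasting problem can be illustrated with a small sketch. It assumes the notation `n[t][d]` for the number of cases with onset on day `t` that are reported with a delay of `d` days, a maximum delay `D`, and a current day `T`; the counts and the helper name `observed_total` are hypothetical and serve only as an illustration.

```python
def observed_total(n, t, T, D):
    """Cases with onset on day t whose reports have arrived by day T."""
    if t > T:
        return 0
    max_delay = min(T - t, D)  # longer delays are not yet observable at time T
    return sum(n[t][d] for d in range(max_delay + 1))


# Hypothetical counts for onset days t = 0..3 and delays d = 0..2 (so D = 2).
n = [
    [5, 3, 1],
    [4, 2, 2],
    [6, 1, 2],
    [3, 2, 1],
]
D = 2
T = 3

observed = [observed_total(n, t, T, D) for t in range(T + 1)]
final = [sum(row) for row in n]  # totals once every delay has elapsed
print(observed, final)
```

For the two most recent onset days the observed totals fall short of the final totals, which is the gap a nowcast tries to fill.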
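Consider a fixed time $T$ and define $\boldsymbol{p} = (p_0, \ldots, p_D)^\top$ as the probability vector denoting that a case is reported with a delay of $d = 0, \ldots, D$ days given the observed incomplete information at time $T$, i.e., the set of $n_{t,d}$ where $t + d \le T$. We choose as prior distribution the generalised Dirichlet distribution $\boldsymbol{p} \sim \mathrm{GD}(\boldsymbol{\alpha}, \boldsymbol{\beta})$ with fixed constants $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$. Now we use Property 3 in the Web appendix of [2], which shows that the posterior of $\boldsymbol{p}$ under right-truncated multinomial sampling is again a GD distribution, with updated parameters $\boldsymbol{\alpha}^{*}$ and $\boldsymbol{\beta}^{*}$ given in that appendix. Hence, for a given $T$ we can assume the following model hierarchy for the time points $t = T-D, \ldots, T$:
$$\lambda_t \sim \mathrm{Ga}(\alpha_\lambda, \beta_\lambda), \qquad N(t,\infty) \mid \lambda_t \sim \mathrm{Po}(\lambda_t), \qquad N(t,T) \mid N(t,\infty), \boldsymbol{p} \sim \mathrm{Bin}\bigl(N(t,\infty), F(T-t)\bigr),$$
where $F(T-t) = \sum_{d=0}^{T-t} p_d$ is the proportion reported within a delay of $T-t$ days. We denote by $\mathrm{Ga}(\alpha_\lambda, \beta_\lambda)$ the gamma distribution with shape $\alpha_\lambda$ and rate $\beta_\lambda$. For this hierarchical model, the marginal distribution of $N(t,\infty)$ is a negative binomial distribution with the following mean and variance:
$$\mathrm{E}[N(t,\infty)] = \frac{\alpha_\lambda}{\beta_\lambda}, \qquad \mathrm{Var}[N(t,\infty)] = \frac{\alpha_\lambda(\beta_\lambda + 1)}{\beta_\lambda^{2}}.$$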
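The negative binomial marginal can be checked numerically. The sketch below assumes a shape/rate parametrisation of the gamma prior, under which the mean is shape/rate and the variance is shape(rate + 1)/rate²; the parameter values and the function name are illustrative only.

```python
import math


def gamma_poisson_pmf(n, shape, rate):
    """P(N = n) when lambda ~ Gamma(shape, rate) and N | lambda ~ Poisson(lambda)."""
    log_pmf = (
        math.lgamma(n + shape) - math.lgamma(shape) - math.lgamma(n + 1)
        + shape * math.log(rate / (rate + 1.0))
        - n * math.log(rate + 1.0)
    )
    return math.exp(log_pmf)


shape, rate = 2.0, 0.5  # hypothetical prior parameters
support = range(2001)   # the tail beyond 2000 is negligible for these values
pmf = [gamma_poisson_pmf(n, shape, rate) for n in support]

mean = sum(n * w for n, w in zip(support, pmf))
variance = sum((n - mean) ** 2 * w for n, w in zip(support, pmf))
print(mean, shape / rate)
print(variance, shape * (rate + 1.0) / rate ** 2)
```

The numerically summed moments agree with the closed-form expressions, which gives a quick sanity check on the chosen parametrisation.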
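To estimate $N(t,\infty)$ given the observed counts at time $T$, we have to perform two steps: (1) update the delay distribution and (2) update the prediction for $N(t,\infty)$:
- (1) For the given $T$ we compute $\boldsymbol{\alpha}^{*}$ and $\boldsymbol{\beta}^{*}$ as stated above. We then draw for $k = 1, \ldots, K$ random vectors $\boldsymbol{p}^{(k)} \sim \mathrm{GD}(\boldsymbol{\alpha}^{*}, \boldsymbol{\beta}^{*})$ by the algorithm of Wong [6] and calculate $F^{(k)}(T-t) = \sum_{d=0}^{T-t} p_d^{(k)}$.
- (2) Given the updated delay distribution and the observed counts $N(t,T)$, we can now update the prediction of $N(t,\infty)$. For $t = T-D, \ldots, T$ we approximate by Monte Carlo sampling
$$f\bigl(N(t,\infty) \mid N(t,T)\bigr) \approx \frac{1}{K} \sum_{k=1}^{K} f\bigl(N(t,\infty) \mid N(t,T), \boldsymbol{p}^{(k)}\bigr).$$
An application of Bayes’ theorem provides
$$f\bigl(N(t,\infty) \mid N(t,T), \boldsymbol{p}^{(k)}\bigr) = c^{-1}\, f\bigl(N(t,T) \mid N(t,\infty), \boldsymbol{p}^{(k)}\bigr)\, f\bigl(N(t,\infty)\bigr),$$
where $c$ is the normalization constant and $f\bigl(N(t,T) \mid N(t,\infty), \boldsymbol{p}^{(k)}\bigr) = 0$ for all $N(t,\infty) < N(t,T)$. The factors of the last equation can be evaluated using the distributional assumptions of the model hierarchy. For numerical convenience we do not sum over the entire support $\{N(t,T), N(t,T)+1, \ldots\}$ to get the normalization, but instead approximate
$$c \approx \sum_{n=N(t,T)}^{N_{\max}} f\bigl(N(t,T) \mid N(t,\infty) = n, \boldsymbol{p}^{(k)}\bigr)\, f\bigl(N(t,\infty) = n\bigr),$$
where $N_{\max}$ is chosen sufficiently large.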
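The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation in the surveillance package: the delay vector is drawn with the standard stick-breaking construction of a generalised Dirichlet distribution (its component ordering may differ from the one used in [2]), the parameters `alpha_star` and `beta_star` are hypothetical placeholders rather than posterior values computed from data, the gamma prior uses a shape/rate parametrisation, and all function names are invented for this sketch.

```python
import math
import random


def sample_delay_distribution(alpha, beta, rng):
    """Stick-breaking draw of (p_0, ..., p_D) from Beta(alpha[j], beta[j]) fractions."""
    p, remaining = [], 1.0
    for a, b in zip(alpha, beta):
        z = rng.betavariate(a, b)
        p.append(remaining * z)
        remaining *= 1.0 - z
    p.append(remaining)  # the last component takes whatever mass is left
    return p


def log_nb_pmf(n, shape, rate):
    """Log PMF of the gamma-Poisson (negative binomial) marginal."""
    return (math.lgamma(n + shape) - math.lgamma(shape) - math.lgamma(n + 1)
            + shape * math.log(rate / (rate + 1.0)) - n * math.log(rate + 1.0))


def log_binom_pmf(k, n, q):
    """Log PMF of Bin(n, q) at k, assuming 0 <= k <= n and 0 < q < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(q) + (n - k) * math.log1p(-q))


def nowcast_pmf(observed, delay, p_samples, shape, rate, n_max):
    """Monte Carlo average of the posterior PMF of the final count on observed..n_max."""
    support = range(observed, n_max + 1)
    pmf = [0.0] * (n_max + 1)  # values below `observed` keep zero mass
    for p in p_samples:
        q = sum(p[: delay + 1])              # proportion reported within `delay` days
        q = min(max(q, 1e-12), 1.0 - 1e-12)  # numerical guard for the logarithms
        weights = [math.exp(log_binom_pmf(observed, n, q) + log_nb_pmf(n, shape, rate))
                   for n in support]
        c = sum(weights)                     # truncated normalization constant
        for n, w in zip(support, weights):
            pmf[n] += w / (c * len(p_samples))
    return pmf


rng = random.Random(1)
alpha_star, beta_star = [2.0, 2.0], [3.0, 2.0]  # hypothetical parameters, D = 2
p_samples = [sample_delay_distribution(alpha_star, beta_star, rng) for _ in range(200)]
pmf = nowcast_pmf(observed=7, delay=1, p_samples=p_samples, shape=2.0, rate=0.2, n_max=100)
posterior_mean = sum(n * w for n, w in enumerate(pmf))
print(round(posterior_mean, 2))
```

Working on the log scale keeps the binomial and negative binomial factors stable for larger counts, and normalising within each draw mirrors the truncated sum used in place of the full support.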
Finally, the authors would like to update “proposed in this paper” to “described in this paper”.
The changes do not affect the results. The manuscript will be updated and the original will remain online on the article webpage, with a reference to this addendum.