Article
Peer-Review Record

Forecasting Canadian Age-Specific Mortality Rates: Application of Functional Time Series Analysis

Mathematics 2023, 11(18), 3808; https://doi.org/10.3390/math11183808
by Azizur Rahman * and Depeng Jiang
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 25 July 2023 / Revised: 2 September 2023 / Accepted: 4 September 2023 / Published: 5 September 2023

Round 1

Reviewer 1 Report

This is a well-written paper modeling and forecasting Canadian age-specific mortality rates using well-established methods from the functional time series literature. Although the methods are not new, the overall methodology is novel for this particular application. My only comments are:

1) It would be nice to make at least some of the codes used for this analysis publicly available for reproducibility's sake. Also, the software/packages used should be mentioned in the paper.

 2) None of the figures quantify uncertainty (e.g. using 95% CIs). When making statements as in lines 268-270 (-increase in mortality observed in males compared to total and females -), it'd be nice to know whether this increase is statistically significant. If it is not possible to obtain 95% CIs, it should be mentioned as a limitation. 

 

Author Response

 

First, we would like to thank the reviewer for his/her insightful thoughts and suggestions, and for the time he/she dedicated to reading this manuscript thoroughly. We have tried to address all the points the reviewer raised in order to improve the manuscript. Our point-by-point responses to the reviewer’s comments and suggestions follow.

Reviewer’s comments and responses (all added and corrected lines in the text are shaded in yellow)

 

  1. It would be nice to make at least some of the codes used for this analysis publicly available for reproducibility’s sake. Also, the software/packages used should be mentioned in the paper.

 

Response: We thank the reviewer for raising this point. We have added some of the code to the supplementary file for reproducibility, and the software and packages used are now mentioned in lines 241-242, page 6.

  2. None of the figures quantify uncertainty (e.g., using 95% CIs). When making statements as in lines 268-270 (-increase in mortality observed in males compared to total and females-), it would be nice to know whether this increase is statistically significant. If it is not possible to obtain 95% CIs, it should be mentioned as a limitation.

Response: This is an excellent comment, and we have now explained this as a limitation in lines 407 to 410 of Section 4, page 12.

R code: We will also provide this code in the supplementary file with the final submission.

## A Functional Time Series Model for Forecasting Canadian Age-Specific Mortality
#  CanMat:  a matrix with 17 rows and 29 columns
#             containing the log mortality values for 5-year age groups (rows)
#             and years 1991 through 2019 (columns)

# provide directory of the RData
getwd()
setwd("C:/Lenevo/Lenevo_laptop/RData")

# Installing and loading some required packages
install.packages("fda") 
install.packages("ftsa")
install.packages("demography")
install.packages("forecast")
install.packages("rainbow")
install.packages("fields")
install.packages("ROMIplot")
library(fda)
library(ftsa)
library(demography)
library(forecast)
library(rainbow)
library(ROMIplot)
library(fields)


# Loading Canada Mortality Rate (Total series)
can.mort.log<-read.csv("Log_mortality_rate_Canada.csv",header=F)
medage<-read.csv("MedAge.csv",header=T)
myear<-read.csv("Mortality_Year.csv",header=T)
rownames(can.mort.log)<-medage$MedAge
colnames(can.mort.log)<-myear$Year
mat=as.matrix(can.mort.log)

# Plotting raw data for Total series
dev.new()  # portable replacement for the Windows-only windows()
matplot(mat,type="l")
CanMat = mat
dim(CanMat)
CanLogMat_t = as.matrix(CanMat)

# Figure 1 Plot of total mortality rates in Canada for three selected years

  dimnames(CanLogMat_t)[[2]] <- paste('Year', 1991:2019, sep='')

  Fig1.data = cbind(CanLogMat_t[, c('Year1991', 'Year2001', 'Year2011')])
  nr=ncol(Fig1.data)                     
  CanTime = c(7,12,17,22,27,32,37,42,47,52,57,62,67,72,77,82,87)
  CanRng = c(7,87)
  dev.new()  # portable replacement for the macOS-only quartz()
  matplot(CanTime, Fig1.data,
          type='l',lwd=2,xlab='Age',ylab='Log Death Rate',col=1:3,
          cex.lab=1.5,cex.axis=1.5)
  legend("topleft", colnames(Fig1.data), col=seq_len(nr), cex=0.8, 
         lty=seq_len(nr), lwd=2)
  title("Canada:Total Death Rate")

#  Smoothing log mortality observations (total series)---------------------#

# Obtaining smoothed functions for total series
  nbasis_t = 80
  norder_t = 6
  CanBasis_t = create.bspline.basis(CanRng, nbasis_t, norder_t)
  D2fdPar_t = fdPar(CanBasis_t, lambda=1e-7)
  CanLogMortfd = smooth.basis(CanTime, CanLogMat_t, D2fdPar_t)$fd
  tt=CanLogMortfd$coefs # B-spline coefficient matrix of the smoothed curves (nbasis x years)
  rownames(tt)<-1:80
  colnames(tt)<-myear$Year
  mort_fts.t<- rainbow::fts(x = 1:80, y = (tt),xname="Age",yname="Total Mortality Rates")#Main line
  colnames(mort_fts.t$y) <- seq_along(colnames(mort_fts.t$y))

# Canada Mortality Rate (Male)----------------------------------------------#
  can.mortm.log<-read.csv("Log_mortality_male_Canada.csv",header=F)
  mat_m=as.matrix(can.mortm.log)
  CanMatm = mat_m
  dim(CanMatm)
  CanLogMatm = as.matrix(CanMatm)

# Smoothing the log mortality observations for male series
  nbasis_m = 85
  norder_m = 6
  CanBasis_m = create.bspline.basis(CanRng, nbasis_m, norder_m)
  D2fdPar_m = fdPar(CanBasis_m, lambda=1e-7)
  CanLogMortmfd = smooth.basis(CanTime, CanLogMatm, D2fdPar_m)$fd
  mm=CanLogMortmfd$coefs # B-spline coefficient matrix of the smoothed curves (nbasis x years)
  mort_fts.m<- rainbow::fts(x = 1:85, y = (mm),xname="Age",yname="Male Mortality Rates")#Main line
  colnames(mort_fts.m$y) <- seq_along(colnames(mort_fts.m$y))


# Canada Mortality Rate (Female)-------------------------------------------#
  can.mortf.log<-read.csv("Log_mortality_female_Canada.csv",header=F)
  matf=as.matrix(can.mortf.log)
  CanMatf = matf
  dim(CanMatf)
  CanLogMatf = as.matrix(CanMatf)
  
# Smoothing the log mortality observations for female series
  
  nbasis_f = 85
  norder_f = 6
  CanBasis_f = create.bspline.basis(CanRng, nbasis_f, norder_f)
  D2fdPar_f = fdPar(CanBasis_f, lambda=1e-7)
  CanLogMortffd = smooth.basis(CanTime, CanLogMatf, D2fdPar_f)$fd
  ff=CanLogMortffd$coefs # B-spline coefficient matrix of the smoothed curves (nbasis x years)
  mort_fts.f<- rainbow::fts(x = 1:85, y = (ff),xname="Age",yname="Female Mortality Rates")#Main line
  colnames(mort_fts.f$y) <- seq_along(colnames(mort_fts.f$y))


# Figure 2 Plot of smoothed functional log mortality series (a) to (c)

  dev.new()  # portable replacement for the Windows-only windows()
  par(mfrow=c(1,3),mar=c(8,8,4,2))
  plot(mort_fts.m,col=rainbow(40),xlab="Age",
       ylab="Log Mortality Rates (per 1000 population)",
       main="Smoothed Mortality:Canada(Male)")
  plot(mort_fts.f,col=rainbow(40),xlab="Age",
       ylab="Log Mortality Rates (per 1000 population)",
       main="Smoothed Mortality:Canada(Female)")
  plot(mort_fts.t,col=rainbow(40),xlab="Age",
       ylab="Log Mortality Rates (per 1000 population)",
       main="Smoothed Mortality:Canada(Total)")
  
# Figure 3 Mean function for smoothed functional mortality series 

  dev.new()  # portable replacement for the Windows-only windows()
  par(mfrow=c(1,1),mar=c(8,8,4,2))
  # main= is omitted here so that the title() call below does not overprint it
  plot(mean.fd(CanLogMortfd),xlab="Age",ylim=c(-1,2.2),col="black") 
  lines(mean.fd(CanLogMortmfd),col="red",lty=2)
  lines(mean.fd(CanLogMortffd),col="green",lty=3)
  title("Smoothed Mean Function of Log Mortality Rate Canada: Total(black), Male(Red) and Female(Green)")
  
# Figure 4 Ten-year (2020-2029) forecasted mortality rate of Canada for Total, Male and Female series
  dev.new()  # portable replacement for the Windows-only windows()
  par(mfrow=c(1,3),mar=c(8,8,4,2))
  plot(forecast(ftsm(mort_fts.t,order=2),h=10)) # Total Series
  legend("topleft",c("2020","2029"),col=c("red","blue"),lty=1)
  title("Forecasted for 2020-2029:Total (both Sex)")
   
  plot(forecast(ftsm(mort_fts.m,order=2),h=10)) # Male Series
  legend("topleft",c("2020","2029"),col=c("red","blue"),lty=1)
  title("Forecasted for 2020-2029:Male")
  
  plot(forecast(ftsm(mort_fts.f,order=2),h=10)) # Female Series
  legend("topleft",c("2020","2029"),col=c("red","blue"),lty=1)
  title("Forecasted for 2020-2029:Female")

# Figure 5 Interval (95% CI) and point forecasted values of 2020 Mortality rate Canada
  dev.new()  # portable replacement for the Windows-only windows()
  par(mfrow=c(1,3),mar=c(8,8,4,2))
  canmort.t=forecast(ftsm(mort_fts.t,order=2),h=1)
  #Plot the lower and upper bounds
  plot(canmort.t,ylim=c(-1.5,2))
  lines(canmort.t$lower,col=2);lines(canmort.t$upper,col=2)
  title("Forecasted value:Total 2020")

  canmort.m=forecast(ftsm(mort_fts.m,order=2),h=1)
  #Plot the lower and upper bounds
  plot(canmort.m,ylim=c(-1.5,2))
  lines(canmort.m$lower,col=2);lines(canmort.m$upper,col=2)
  title("Forecasted value:Male 2020")
  
  canmort.f=forecast(ftsm(mort_fts.f,order=2),h=1)
  #Plot the lower and upper bounds  
  plot(canmort.f,ylim=c(-1.5,2))
  lines(canmort.f$lower,col=2);lines(canmort.f$upper,col=2)
  title("Forecasted value:Female 2020")
 
# Figure 6, 7 and 8 Principal Component Regression Output (Total, Male and Female Series)
  plot(forecast(ftsm(mort_fts.t,order=2),h=1),"components") # total series

  plot(forecast(ftsm(mort_fts.m,order=2),h=1),"components") # Male series

  plot(forecast(ftsm(mort_fts.f,order=2),h=1),"components") # Female series

  
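# Figure 9 Image plot measuring the age effect for the total series
  dev.new()  # portable replacement for the Windows-only windows()
  par(mfrow=c(1,2),mar=c(8,8,4,2))
  image.plot(as.matrix(medage),as.matrix(myear),CanLogMat_t,
           xlab="age",ylab="year",cex.lab=1.5,cex.axis=1.5)
  # Dx, Nx and mx are left as NULL in the original response; supply the data before running
  ROMI.plot(Dx = NULL, Nx = NULL, mx = NULL, smooth = TRUE)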

 

 

#----------------------------End-----------------------------------#

Author Response File: Author Response.docx

Reviewer 2 Report

This paper focuses on forecasting Canadian age-specific mortality rates using a functional data analysis approach. The introduction is clear, but the literature review is a bit lacking. The authors give a good understanding of the problem at hand, the whole process, and the methods used. The efficacy of the work is demonstrated through the real data example.

 

Below are my main comments:

 

·       For equation 3, it is not stated which space the function f belongs to, or what x is.

 

·       In equation 3, the t in f_t denotes the (functional) feature, but in equation 6 the t in y_t denotes time; what, then, is x? The notation needs to be consistent and clear.

 

·       What is z in equation 6, and what is capital Epsilon_t in equation 6?

 

·       Integrated squared forecast error is defined before section 3 but is never used.

 

·       There is no comparison to any other methods. (ARIMA, LSTM, other Functional models, etc). Some baseline comparison is needed to understand the merit and demerit of the approach.

 

·       In Figures 6, 7 and 8, does basis function 1 correspond to the first functional principal component? If yes, this is not very clear.

 

·       In section 4, the paper claims “a novel approach of forecasting demographic series”, can the novelty of the approach please be elaborated/explained? As a reader, it seems like an application of FPCA/FAR(1).

 

 

Below are my minor comments:

·       A more recent literature review will make the paper stronger.

 

·       Currently, in the functional literature, this book is very popular and covers a lot of the recent advances in the field.

https://www.taylorfrancis.com/books/mono/10.1201/9781315117416/introduction-functional-data-analysis-piotr-kokoszka-matthew-reimherr

 

·       A lot of functional data analysis work is being done on forecasting, especially in machine learning and deep learning contexts, in case the authors want to expand the literature review. For example:

https://ieeexplore.ieee.org/abstract/document/9378087

Please see the comments above

Author Response

First, we would like to thank the reviewer for his/her insightful thoughts and suggestions, and for the time he/she dedicated to reading this manuscript thoroughly. We have tried to address all the points the reviewer raised in order to improve the manuscript. Our point-by-point responses to the reviewer’s comments and suggestions follow.

Reviewer’s comments and responses (all added and corrected lines in the text are shaded in yellow)

 

  1. For equation 3, it is not stated which space the function f belongs to, or what x is.

Response: We thank the reviewer for raising this point. The manuscript has now been proofread more carefully, and we have addressed this issue in lines 168-169, page 4, highlighted in yellow.

 

  2. In equation 3, the t in f_t denotes the (functional) feature, but in equation 6 the t in y_t denotes time; what, then, is x? The notation needs to be consistent and clear.

Response: This is an excellent comment. We have now explained all the notation used in each equation in lines 218, 221, and 223-224, page 5. Moreover, we have tried to define the notation clearly in lines 121-124, page 3.

 

  3. What is z in equation 6, and what is capital Epsilon_t in equation 6?

Response: Thank you for pointing this out; we had missed it. We have now defined z and Epsilon_t for equation 6 in lines 221 and 223-224, page 5.

 

  4. Integrated squared forecast error is defined before section 3 but is never used.

Response: This is an excellent and interesting comment. The integrated squared forecast error is computed by default when we obtain the 10-year mortality forecasts for all three groups shown in Figures 4 and 5. As described in lines 231-235 of Section 3, when we predict the values for the next s (10-year) period, the “ftsa” package produces this value automatically. However, we did not report it, because our article did not include a baseline comparison to assess the merits and demerits of the approach.

 

 

  5. There is no comparison to any other methods (ARIMA, LSTM, other functional models, etc.). Some baseline comparison is needed to understand the merits and demerits of the approach.

Response: This is an excellent comment. We now compare the FTSA method with an exponential smoothing model for all three series, reporting the estimated fitted curves and an interval forecast accuracy measure in the Supplementary Material. In addition, we describe the theory of the interval forecast accuracy measure in the paragraph “Evaluation of Interval Forecast Accuracy” in lines 245-256, page 6. As mentioned in lines 316-320, page 9, our model obtained the smallest interval forecast scores compared with a discrete-value model such as exponential smoothing.

Regarding prediction, we mention in the manuscript that, to keep the visualization simple, we report only a one-year-ahead forecast (forecast horizon h=1). However, our method can also forecast further ahead by setting h=2, 3, 4, and so on.

 

  6. In Figures 6, 7 and 8, does basis function 1 correspond to the first functional principal component? If yes, this is not very clear.

Response: This is an excellent comment, and we have responded in lines 327-330 for Figure 6, lines 337-339 for Figure 7, and lines 341-343 for Figure 8, on pages 9 and 10.

 

  7. In section 4, the paper claims “a novel approach of forecasting demographic series”; can the novelty of the approach please be elaborated/explained? As a reader, it seems like an application of FPCA/FAR(1).

Response: This is an excellent comment, and we have corrected this line in Section 4. We no longer claim that it is a novel method; rather, we describe it as an “applied approach” that was introduced by Hyndman and Ullah (2007).

In addition, the authors have tried to improve the language and the literature section using the suggested books and articles.
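
As an illustration of the interval forecast accuracy measure referred to in the response on baseline comparison, the following R sketch computes a mean interval score. It assumes the widely used interval score form: the interval width plus a penalty of 2/alpha times the distance by which an observation falls outside the interval. The manuscript's exact definition (lines 245-256) is not reproduced in this record, and the function name and toy numbers below are hypothetical.

```r
# Mean interval score for (1 - alpha) * 100% prediction intervals.
# Assumed form: width + (2 / alpha) * amount by which the observation
# falls below the lower bound or above the upper bound.
mean_interval_score <- function(lower, upper, actual, alpha = 0.05) {
  stopifnot(length(lower) == length(upper),
            length(lower) == length(actual),
            all(lower <= upper), alpha > 0, alpha < 1)
  width     <- upper - lower
  below_pen <- (2 / alpha) * pmax(lower - actual, 0)
  above_pen <- (2 / alpha) * pmax(actual - upper, 0)
  mean(width + below_pen + above_pen)
}

# Toy numbers (hypothetical, not taken from the paper)
lower  <- c(0.0, 0.0)
upper  <- c(1.0, 1.0)
actual <- c(0.5, 2.0)   # the second value lies above its interval
score  <- mean_interval_score(lower, upper, actual, alpha = 0.2)
```

A smaller score indicates narrower intervals that still cover the observations, which is the sense in which the response treats the smallest score as best.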

Author Response File: Author Response.pdf

Reviewer 3 Report

Review of "Forecasting Canadian Age-Specific Mortality Rates: Application of Functional Time Series Analysis" by Rahman & Jiang (2023)

The authors have developed a manuscript that attempts to forecast Canadian mortality rates using a functional autoregressive model of order 1. The methodology is not novel and is based on Hyndman's work. The methods are well applied to Canadian datasets and the results are clear. However, I think that the current version is not novel from a methodological point of view, and I cannot recommend it for publication in the Mathematics journal (which serves as a theoretical/methodological journal for experts in mathematics/statistics topics).

Given that this is an application, I recommend that the authors submit the work to an applied journal, or propose a novel method and re-submit to the Mathematics journal.

Author Response

Thank you so much for taking the time to review our paper. Yes, the method is not new, but its application in this setting, the Canadian population, is completely new. At first I was also unsure whether to submit it to the "Mathematics" journal, but I found that this is a special issue focusing on the theory and application of FDA in different scenarios. Since our paper is an application based on the Canadian elderly population, we chose the "Mathematics" journal.

Round 2

Reviewer 2 Report

The authors have improved the paper by considering the feedback. Thank you for sharing the revised version of the paper.

There are still some minor comments that need to be addressed:

  1. There is no comparison to any other methods (ARIMA, LSTM, other functional models, etc.). Some baseline comparison is needed to understand the merits and demerits of the approach.

I could not find where the forecast accuracy is given for the authors' approach. Can it also be added to the table in the supplementary materials, alongside ARIMA and ES?

  2. In Figures 6, 7 and 8, does basis function 1 correspond to the first functional principal component? If yes, this is not very clear.

This is still not clear to me. (basis function 1) = (first principal component)?

 

Author Response

First, we would like to thank the reviewer for his/her insightful thoughts and suggestions, and for the time he/she dedicated to reading this manuscript thoroughly. We have tried to address all the points the reviewer raised in order to improve the manuscript. Our point-by-point responses to the reviewer’s comments and suggestions follow.

Reviewer’s comments and responses (all added and corrected lines in the text are shaded in yellow)

 

 

  1. There is no comparison to any other methods (ARIMA, LSTM, other functional models, etc.). Some baseline comparison is needed to understand the merits and demerits of the approach.

Response: This is an excellent comment. We now compare the FTSA method with an exponential smoothing model for all three series, reporting the estimated fitted curves and an interval forecast accuracy measure (mean interval score) in the Supplementary Material. In addition, we describe the theory of the interval forecast accuracy measure in the paragraph “Evaluation of Interval Forecast Accuracy” in lines 245-256, page 6. As mentioned in lines 316-320, page 9, our model obtained the smallest interval forecast scores compared with a discrete-value model such as exponential smoothing.

In the supplementary table, a footnote explains that the FTSA approach uses ARIMA(2,1,1) as the univariate method for forecasting the principal component scores. For that reason we did not use ARIMA to forecast the y values as a discrete time series model; instead we used the exponential smoothing technique (ets). The table shows the forecast accuracy measure (mean interval score) for both ARIMA(2,1,1) (used within our FTSA) and ets (as a univariate discrete approach).

 

  2. In Figures 6, 7 and 8, does basis function 1 correspond to the first functional principal component? If yes, this is not very clear.

Response: This is an excellent comment, and we have responded specifically in lines 325-330 for Figure 6, lines 337-339 for Figure 7, and lines 341-343 for Figure 8, on pages 9 and 10.
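
The reviewer's question can be illustrated with a small, generic R sketch on simulated curves (not the authors' data or code). Under the usual functional principal component decomposition, each yearly curve is the mean function plus a sum of basis functions multiplied by year-specific coefficients. In that setting, the first basis function is the first principal component function, and its coefficient series is what a univariate time series model then forecasts. All variable names and simulated values are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical setup: 17 age-group midpoints and 29 years of curves
age     <- seq(7, 87, by = 5)
n_years <- 29

# Simulated curves = mean function + two components with yearly scores
mu      <- -1 + 0.03 * age
comp1   <- sin(pi * (age - 7) / 80)
comp2   <- cos(pi * (age - 7) / 80)
scores1 <- cumsum(rnorm(n_years))
scores2 <- 0.3 * rnorm(n_years)
Y <- mu + outer(comp1, scores1) + outer(comp2, scores2)  # 17 x 29 matrix

# Center each age (row) by its mean over years, then decompose
mean_curve <- rowMeans(Y)
Yc <- Y - mean_curve
sv <- svd(Yc)

phi  <- sv$u[, 1:2]           # columns: basis function 1 and 2 (unit-norm principal components)
beta <- crossprod(Yc, phi)    # 29 x 2: coefficient (score) series per year

# Share of variance captured by each component
explained <- sv$d^2 / sum(sv$d^2)

# Curves rebuilt from the mean plus the first two basis functions
Y_hat   <- mean_curve + phi %*% t(beta)
max_err <- max(abs(Y - Y_hat))
```

Because the simulated curves contain only two components, the two-term reconstruction recovers them up to rounding error. With real mortality data the reconstruction is approximate and the explained shares indicate how much each basis function contributes.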

Author Response File: Author Response.docx

Reviewer 3 Report

Review of "Forecasting Canadian Age-Specific Mortality Rates: Application of Functional Time Series Analysis" by Rahman and Jiang (2023)

In this second review, and given the reasons explained by the authors, I have reconsidered the manuscript for possible publication in the Mathematics journal. The following issues must be addressed by the authors before I can recommend publication:

1. L182: delta could be 0.9 instead of 90% ? i.e., delta could be a number and not a percentage.

2. L216: "Tt" <-> "y_t" ?

3. L424: For this further work, I recommend adding the reference: Kuschel, K., Carrasco, R., Idrovo-Aguirre, B.J., Duran, C., Contreras-Reyes, J.E. (2023). Preparing Cities for Future Pandemics: Unraveling the Influence of Urban and Housing Variables on COVID-19 Incidence in Santiago de Chile. Healthcare 11(16), 2259.

 

1. L141: "discreate" <-> "discrete".

2. Figure 4 and others: "forecsted" <-> "forecasted".

 

 

Author Response

First, we would like to thank the reviewer for his/her insightful thoughts and suggestions, and for the time he/she dedicated to reading this manuscript thoroughly. We have tried to address all the points the reviewer raised in order to improve the manuscript. Our point-by-point responses to the reviewer’s comments and suggestions follow.

Reviewer’s comments and responses (all added and corrected lines in the text are shaded in yellow)

 

  1. L182: delta could be 0.9 instead of 90%? i.e., delta could be a number and not a percentage.

Response: We thank the reviewer for raising this point. The manuscript has now been proofread more carefully, and we have addressed this issue in line 182, page 4, highlighted in yellow.

  2. L216: "Tt" <-> "y_t" ?

Response: This is an excellent comment, and we have now corrected it in line 216.

  3. L424: For this further work, I recommend adding the reference: Kuschel, K., Carrasco, R., Idrovo-Aguirre, B.J., Duran, C., Contreras-Reyes, J.E. (2023). Preparing Cities for Future Pandemics: Unraveling the Influence of Urban and Housing Variables on COVID-19 Incidence in Santiago de Chile. Healthcare 11(16), 2259.

Response: Thank you for pointing this out; we have added the reference in line 424.

  4. L141: "discreate" <-> "discrete".

Response: Now corrected in line 141.

  5. Figure 4 and others: "forecsted" <-> "forecasted".

Response: Corrected for Figure 4 and for Figure 5 onwards.

Author Response File: Author Response.docx
