Article

Bayesian Identification Procedure for Triple Seasonal Autoregressive Models

by Ayman A. Amin 1,* and Saeed A. Alghamdi 2,*
1 Department of Statistics, Mathematics, and Insurance, Faculty of Commerce, Menoufia University, Menoufia 32952, Egypt
2 Department of Statistics, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
* Authors to whom correspondence should be addressed.
Mathematics 2023, 11(18), 3823; https://doi.org/10.3390/math11183823
Submission received: 1 August 2023 / Revised: 28 August 2023 / Accepted: 4 September 2023 / Published: 6 September 2023
(This article belongs to the Special Issue Bayesian Inference, Prediction and Model Selection)

Abstract

Triple seasonal autoregressive (TSAR) models have been introduced to model time series data with three layers of seasonality; however, the Bayesian identification problem of these models has not been tackled in the literature. In this paper, we fill this gap by presenting a Bayesian procedure to identify the best order of TSAR models. Assuming that the TSAR model errors are normally distributed and employing three priors on the model parameters, i.e., normal-gamma, Jeffreys’ and g priors, we derive the marginal posterior distributions of the TSAR model parameters. In particular, we show that the marginal posteriors are multivariate t and gamma distributions for the TSAR model coefficients vector and precision, respectively. Using the marginal posterior distribution of the TSAR model coefficients vector, we present an identification procedure for the TSAR models based on a sequence of t-tests of significance. We evaluate the accuracy of the proposed Bayesian identification procedure by conducting an extensive simulation study, followed by a real application to hourly electricity load datasets in six European countries.

1. Introduction

Modeling time series data with multiple seasonalities is a challenging task [1,2,3], and researchers have extended autoregressive moving average (ARMA) models to fit such series and accommodate multiple seasonalities [4,5,6,7]. Bayesian analysis of seasonal ARMA (SARMA) models and their extensions is difficult because the likelihood function of these models is complicated and analytically intractable, which in turn complicates the posterior and predictive analyses. Accordingly, two families of approximations have been introduced in the literature to simplify the Bayesian analysis of these models: analytical approximations and approximations based on Markov chain Monte Carlo (MCMC) methods. Analytical approximations modify the posterior and predictive densities of SARMA models so that they take standard, closed-form distributions that are analytically tractable; see, for example, Shaarawy and Broemeling [8], Shaarawy and Ismail [9], Shaarawy and Ali [10]. MCMC-based approximations use one or more MCMC methods to simulate from the available conditional posterior or predictive densities of SARMA models and thereby empirically approximate the corresponding marginal posterior or predictive densities; see, for example, Barnett et al. [11], Vermaak et al. [12], Amin [13].
The Bayesian analysis of single SARMA models is well established in the literature [14], starting from the work of Barnett et al. [11,15], which presents the MCMC methods to develop the Bayesian estimation of seasonal autoregressive (SAR) and ARMA models. In addition, Vermaak et al. [12] used the Metropolis within Gibbs sampling algorithm to propose the Bayesian estimation of SAR models in the field of modeling speech production for voiced sounds. Ismail [16,17] introduced the Gibbs sampler to present the Bayesian analysis of the SAR and seasonal moving average (SMA) models, respectively. On the other hand, Shaarawy and Ali [10] presented the analytical approximation to introduce the Bayesian identification of SAR models. Moreover, Ismail and Amin [18] applied the Gibbs sampler to present the Bayesian inference of SARMA models, and, recently, Amin [19] introduced both a Bayesian estimation and prediction of SARMA models via a Gibbs sampler.
The Bayesian analysis of double seasonal ARMA (DSARMA) models is relatively recent, and few works have been presented in the literature. Ismail and Zahran [20] presented analytical approximations to introduce the Bayesian estimation of double SAR (DSAR) models, and Amin [21] applied the Gibbs sampler to develop the Bayesian estimation of DSARMA models. In addition, Amin [22,23] presented analytical approximations to introduce the Bayesian estimation of DSARMA models and the Bayesian identification of DSAR models, respectively. Moreover, recently, Amin [24,25] presented the Bayesian analysis of DSAR models via the Gibbs sampler. For time series with three layers of seasonality, the Bayesian analysis has not been introduced or discussed, except in our recent work that used analytical approximations [13] and the Gibbs sampling algorithm [26] to present the Bayesian estimation of TSAR models. However, specifying the best TSAR model order, i.e., TSAR model identification, is the first and an important stage of the Bayesian analysis of these models, and it needs to be tackled in order to model high-frequency time series with three layers of seasonality in real applications.
Therefore, our contribution in this paper is to fill this gap by introducing a testing-based Bayesian procedure to identify the best order of TSAR models. Assuming that the TSAR model errors are normally distributed and employing three priors on the model parameters, i.e., normal-gamma, Jeffreys’ and g priors, we first derive the marginal posterior distributions of the TSAR model parameters. In particular, we derive the marginal posteriors to be multivariate t and gamma distributions for the TSAR model coefficients vector and precision, respectively. Since the derived marginal posterior distribution of the TSAR model coefficients vector is a multivariate t-distribution, we present an identification procedure for the TSAR models based on a sequence of t-tests of significance, which is simple to adopt and apply in real applications. We evaluate the accuracy of the proposed Bayesian identification procedure by conducting an extensive simulation study. In addition, we show the applicability of our work on real hourly electricity load time series datasets in six European countries.
The remainder of this paper is structured as follows. The TSAR models are first introduced in Section 2. In Section 3, we discuss the posterior analysis of these TSAR models, followed by the proposed Bayesian identification procedure. We discuss the simulation study setting and results and then present a real application of the proposed Bayesian identification of TSAR models on hourly electricity load time series in six European countries in Section 4. Finally, in Section 5, the paper is concluded.

2. TSAR Models

The time series $\{z_t\}$ is said to be generated from the zero-mean triple seasonal autoregressive (TSAR) model of order $p$, $P_1$, $P_2$ and $P_3$, i.e., TSAR$(p)(P_1)_{s_1}(P_2)_{s_2}(P_3)_{s_3}$, if it has the form:
$$\phi_p(B)\,\Phi_{P_1}(B^{s_1})\,\Pi_{P_2}(B^{s_2})\,\Psi_{P_3}(B^{s_3})\,z_t = \varepsilon_t, \tag{1}$$
where $\{\varepsilon_t\}$ is a sequence of random errors that are assumed to be independently and identically normally distributed with zero mean and unknown precision $\tau$; $s_1$, $s_2$ and $s_3$ are the seasonal periods; and $B$ is the back-shift operator defined as $B^r z_t = z_{t-r}$. The polynomial
$$\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$$
is the non-seasonal autoregressive operator of order $p$. The TSAR model (1) has three seasonal autoregressive operators to accommodate the triple seasonal patterns [13], which are:
$$\Phi_{P_1}(B^{s_1}) = 1 - \Phi_1 B^{s_1} - \Phi_2 B^{2 s_1} - \cdots - \Phi_{P_1} B^{P_1 s_1},$$
$$\Pi_{P_2}(B^{s_2}) = 1 - \Pi_1 B^{s_2} - \Pi_2 B^{2 s_2} - \cdots - \Pi_{P_2} B^{P_2 s_2}$$
and
$$\Psi_{P_3}(B^{s_3}) = 1 - \Psi_1 B^{s_3} - \Psi_2 B^{2 s_3} - \cdots - \Psi_{P_3} B^{P_3 s_3}$$
with orders $P_1$, $P_2$ and $P_3$, respectively.
The non-seasonal and seasonal autoregressive coefficients are $\phi = (\phi_1, \phi_2, \dots, \phi_p)^T$, $\Phi = (\Phi_1, \Phi_2, \dots, \Phi_{P_1})^T$, $\Pi = (\Pi_1, \Pi_2, \dots, \Pi_{P_2})^T$ and $\Psi = (\Psi_1, \Psi_2, \dots, \Psi_{P_3})^T$, respectively.
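Using the summation notation, we can expand and simplify the compact form of the TSAR model (1) to be:
$$
\begin{aligned}
z_t ={} & \sum_{i=1}^{p} \phi_i z_{t-i} + \sum_{j=1}^{P_1} \Phi_j z_{t-j s_1} + \sum_{m=1}^{P_2} \Pi_m z_{t-m s_2} + \sum_{k=1}^{P_3} \Psi_k z_{t-k s_3} \\
& - \sum_{i=1}^{p}\sum_{j=1}^{P_1} \phi_i \Phi_j z_{t-i-j s_1} - \sum_{i=1}^{p}\sum_{m=1}^{P_2} \phi_i \Pi_m z_{t-i-m s_2} - \sum_{i=1}^{p}\sum_{k=1}^{P_3} \phi_i \Psi_k z_{t-i-k s_3} \\
& - \sum_{j=1}^{P_1}\sum_{m=1}^{P_2} \Phi_j \Pi_m z_{t-j s_1-m s_2} - \sum_{j=1}^{P_1}\sum_{k=1}^{P_3} \Phi_j \Psi_k z_{t-j s_1-k s_3} - \sum_{m=1}^{P_2}\sum_{k=1}^{P_3} \Pi_m \Psi_k z_{t-m s_2-k s_3} \\
& + \sum_{i=1}^{p}\sum_{j=1}^{P_1}\sum_{m=1}^{P_2} \phi_i \Phi_j \Pi_m z_{t-i-j s_1-m s_2} + \sum_{i=1}^{p}\sum_{j=1}^{P_1}\sum_{k=1}^{P_3} \phi_i \Phi_j \Psi_k z_{t-i-j s_1-k s_3} \\
& + \sum_{i=1}^{p}\sum_{m=1}^{P_2}\sum_{k=1}^{P_3} \phi_i \Pi_m \Psi_k z_{t-i-m s_2-k s_3} + \sum_{j=1}^{P_1}\sum_{m=1}^{P_2}\sum_{k=1}^{P_3} \Phi_j \Pi_m \Psi_k z_{t-j s_1-m s_2-k s_3} \\
& - \sum_{i=1}^{p}\sum_{j=1}^{P_1}\sum_{m=1}^{P_2}\sum_{k=1}^{P_3} \phi_i \Phi_j \Pi_m \Psi_k z_{t-i-j s_1-m s_2-k s_3} + \varepsilon_t .
\end{aligned}
\tag{2}
$$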
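The expansion of the operator product into lagged terms can be checked numerically by multiplying the four polynomials in the back-shift operator. The following is a minimal sketch, assuming NumPy is available; the function names `seasonal_operator` and `expand_tsar` and the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

def seasonal_operator(coefs, s):
    """Coefficients of 1 - c_1 B^s - c_2 B^(2s) - ... as an array indexed by the power of B."""
    poly = np.zeros(len(coefs) * s + 1)
    poly[0] = 1.0
    for j, c in enumerate(coefs, start=1):
        poly[j * s] = -c
    return poly

def expand_tsar(phi, Phi, Pi, Psi, s1, s2, s3):
    """Coefficient of z_{t-l} (l = 1, 2, ...) in the expanded TSAR form."""
    poly = seasonal_operator(phi, 1)
    for coefs, s in ((Phi, s1), (Pi, s2), (Psi, s3)):
        # Multiplying polynomials in B is a convolution of their coefficients.
        poly = np.convolve(poly, seasonal_operator(coefs, s))
    # Moving every lagged term to the right-hand side flips its sign.
    return -poly[1:]

# Hypothetical TSAR(1)(1)_12(1)_60(1)_600 coefficients, for illustration only.
ar = expand_tsar([0.5], [0.3], [0.2], [0.1], s1=12, s2=60, s3=600)
lag = lambda l: ar[l - 1]
print(lag(1), lag(12), lag(13), lag(73), lag(673))
```

In this sketch the single-coefficient lags carry a plus sign, the pairwise products a minus sign, the triple products a plus sign and the four-way product a minus sign, which is the alternating pattern of the expanded form.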
In addition, we can write the matrix form of the TSAR model as:
$$Z = X\beta + \varepsilon, \tag{3}$$
where $Z = (z_1, z_2, \dots, z_n)^T$ is the vector of observed time series, $\varepsilon = (\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n)^T$ is the vector of unobserved model errors, $X$ is an $n \times q$ design matrix, with $q = (1+p)(1+P_1)(1+P_2)(1+P_3) - 1$, whose $t$th row is:
$$
\begin{aligned}
X_t = \big(\, & z_{t-1}, \dots, z_{t-p},\; z_{t-s_1}, z_{t-s_1-1}, \dots, z_{t-s_1-p}, \dots, z_{t-P_1 s_1}, z_{t-P_1 s_1-1}, \dots, z_{t-P_1 s_1-p}, \\
& z_{t-s_2}, z_{t-s_2-1}, \dots, z_{t-s_2-p},\; z_{t-s_2-s_1}, z_{t-s_2-s_1-1}, \dots, z_{t-s_2-s_1-p}, \dots, \\
& z_{t-s_2-P_1 s_1}, z_{t-s_2-P_1 s_1-1}, \dots, z_{t-s_2-P_1 s_1-p}, \dots, \\
& z_{t-P_2 s_2}, z_{t-P_2 s_2-1}, \dots, z_{t-P_2 s_2-p},\; z_{t-P_2 s_2-s_1}, z_{t-P_2 s_2-s_1-1}, \dots, z_{t-P_2 s_2-s_1-p}, \dots, \\
& z_{t-P_2 s_2-P_1 s_1}, z_{t-P_2 s_2-P_1 s_1-1}, \dots, z_{t-P_2 s_2-P_1 s_1-p}, \\
& z_{t-s_3}, z_{t-s_3-1}, \dots, z_{t-s_3-p},\; z_{t-s_3-s_1}, z_{t-s_3-s_1-1}, \dots, z_{t-s_3-s_1-p}, \dots, \\
& z_{t-s_3-P_1 s_1}, z_{t-s_3-P_1 s_1-1}, \dots, z_{t-s_3-P_1 s_1-p}, \\
& z_{t-s_3-s_2}, z_{t-s_3-s_2-1}, \dots, z_{t-s_3-s_2-p},\; z_{t-s_3-s_2-s_1}, z_{t-s_3-s_2-s_1-1}, \dots, z_{t-s_3-s_2-s_1-p}, \dots, \\
& z_{t-s_3-P_1 s_1-s_2}, z_{t-s_3-P_1 s_1-s_2-1}, \dots, z_{t-s_3-P_1 s_1-s_2-p}, \dots, \\
& z_{t-s_3-P_2 s_2}, z_{t-s_3-P_2 s_2-1}, \dots, z_{t-s_3-P_2 s_2-p},\; z_{t-s_3-s_1-P_2 s_2}, z_{t-s_3-s_1-P_2 s_2-1}, \dots, z_{t-s_3-s_1-P_2 s_2-p}, \dots, \\
& z_{t-s_3-P_1 s_1-P_2 s_2}, z_{t-s_3-P_1 s_1-P_2 s_2-1}, \dots, z_{t-s_3-P_1 s_1-P_2 s_2-p}, \dots, \\
& z_{t-P_3 s_3}, z_{t-P_3 s_3-1}, \dots, z_{t-P_3 s_3-p},\; z_{t-P_3 s_3-s_1}, z_{t-P_3 s_3-s_1-1}, \dots, z_{t-P_3 s_3-s_1-p}, \dots, \\
& z_{t-P_3 s_3-P_1 s_1}, z_{t-P_3 s_3-P_1 s_1-1}, \dots, z_{t-P_3 s_3-P_1 s_1-p}, \\
& z_{t-P_3 s_3-s_2}, z_{t-P_3 s_3-s_2-1}, \dots, z_{t-P_3 s_3-s_2-p},\; z_{t-P_3 s_3-s_1-s_2}, z_{t-P_3 s_3-s_1-s_2-1}, \dots, z_{t-P_3 s_3-s_1-s_2-p}, \dots, \\
& z_{t-P_3 s_3-P_1 s_1-s_2}, z_{t-P_3 s_3-P_1 s_1-s_2-1}, \dots, z_{t-P_3 s_3-P_1 s_1-s_2-p}, \dots, \\
& z_{t-P_3 s_3-P_2 s_2}, z_{t-P_3 s_3-P_2 s_2-1}, \dots, z_{t-P_3 s_3-P_2 s_2-p},\; z_{t-P_3 s_3-s_1-P_2 s_2}, z_{t-P_3 s_3-s_1-P_2 s_2-1}, \dots, z_{t-P_3 s_3-s_1-P_2 s_2-p}, \dots, \\
& z_{t-P_3 s_3-P_1 s_1-P_2 s_2}, z_{t-P_3 s_3-P_1 s_1-P_2 s_2-1}, \dots, z_{t-P_3 s_3-P_1 s_1-P_2 s_2-p} \,\big),
\end{aligned}
\tag{4}
$$
and $\beta$ is the vector of TSAR model coefficients given as:
$$
\begin{aligned}
\beta = \big(\, & \phi_1, \dots, \phi_p,\; \Phi_1, -\phi_1\Phi_1, \dots, -\phi_p\Phi_1, \dots, \Phi_{P_1}, -\phi_1\Phi_{P_1}, \dots, -\phi_p\Phi_{P_1}, \\
& \Pi_1, -\phi_1\Pi_1, \dots, -\phi_p\Pi_1,\; -\Phi_1\Pi_1, \phi_1\Phi_1\Pi_1, \dots, \phi_p\Phi_1\Pi_1, \dots, -\Phi_{P_1}\Pi_1, \phi_1\Phi_{P_1}\Pi_1, \dots, \phi_p\Phi_{P_1}\Pi_1, \dots, \\
& \Pi_{P_2}, -\phi_1\Pi_{P_2}, \dots, -\phi_p\Pi_{P_2},\; -\Phi_1\Pi_{P_2}, \phi_1\Phi_1\Pi_{P_2}, \dots, \phi_p\Phi_1\Pi_{P_2}, \dots, -\Phi_{P_1}\Pi_{P_2}, \phi_1\Phi_{P_1}\Pi_{P_2}, \dots, \phi_p\Phi_{P_1}\Pi_{P_2}, \\
& \Psi_1, -\phi_1\Psi_1, \dots, -\phi_p\Psi_1,\; -\Phi_1\Psi_1, \phi_1\Phi_1\Psi_1, \dots, \phi_p\Phi_1\Psi_1, \dots, -\Phi_{P_1}\Psi_1, \phi_1\Phi_{P_1}\Psi_1, \dots, \phi_p\Phi_{P_1}\Psi_1, \\
& -\Pi_1\Psi_1, \phi_1\Pi_1\Psi_1, \dots, \phi_p\Pi_1\Psi_1,\; \Phi_1\Pi_1\Psi_1, -\phi_1\Phi_1\Pi_1\Psi_1, \dots, -\phi_p\Phi_1\Pi_1\Psi_1, \dots, \\
& \Phi_{P_1}\Pi_1\Psi_1, -\phi_1\Phi_{P_1}\Pi_1\Psi_1, \dots, -\phi_p\Phi_{P_1}\Pi_1\Psi_1, \dots, \\
& -\Pi_{P_2}\Psi_1, \phi_1\Pi_{P_2}\Psi_1, \dots, \phi_p\Pi_{P_2}\Psi_1,\; \Phi_1\Pi_{P_2}\Psi_1, -\phi_1\Phi_1\Pi_{P_2}\Psi_1, \dots, -\phi_p\Phi_1\Pi_{P_2}\Psi_1, \dots, \\
& \Phi_{P_1}\Pi_{P_2}\Psi_1, -\phi_1\Phi_{P_1}\Pi_{P_2}\Psi_1, \dots, -\phi_p\Phi_{P_1}\Pi_{P_2}\Psi_1, \dots, \\
& \Psi_{P_3}, -\phi_1\Psi_{P_3}, \dots, -\phi_p\Psi_{P_3},\; -\Phi_1\Psi_{P_3}, \phi_1\Phi_1\Psi_{P_3}, \dots, \phi_p\Phi_1\Psi_{P_3}, \dots, -\Phi_{P_1}\Psi_{P_3}, \phi_1\Phi_{P_1}\Psi_{P_3}, \dots, \phi_p\Phi_{P_1}\Psi_{P_3}, \\
& -\Pi_1\Psi_{P_3}, \phi_1\Pi_1\Psi_{P_3}, \dots, \phi_p\Pi_1\Psi_{P_3},\; \Phi_1\Pi_1\Psi_{P_3}, -\phi_1\Phi_1\Pi_1\Psi_{P_3}, \dots, -\phi_p\Phi_1\Pi_1\Psi_{P_3}, \dots, \\
& \Phi_{P_1}\Pi_1\Psi_{P_3}, -\phi_1\Phi_{P_1}\Pi_1\Psi_{P_3}, \dots, -\phi_p\Phi_{P_1}\Pi_1\Psi_{P_3}, \dots, \\
& -\Pi_{P_2}\Psi_{P_3}, \phi_1\Pi_{P_2}\Psi_{P_3}, \dots, \phi_p\Pi_{P_2}\Psi_{P_3},\; \Phi_1\Pi_{P_2}\Psi_{P_3}, -\phi_1\Phi_1\Pi_{P_2}\Psi_{P_3}, \dots, -\phi_p\Phi_1\Pi_{P_2}\Psi_{P_3}, \dots, \\
& \Phi_{P_1}\Pi_{P_2}\Psi_{P_3}, -\phi_1\Phi_{P_1}\Pi_{P_2}\Psi_{P_3}, \dots, -\phi_p\Phi_{P_1}\Pi_{P_2}\Psi_{P_3} \,\big)^T .
\end{aligned}
\tag{5}
$$
It has to be noted that, when the TSAR model orders $p$, $P_1$, $P_2$ and $P_3$ are unknown, the design matrix $X$ in the TSAR model (3) is a function of $p$, $P_1$, $P_2$ and $P_3$, which complicates the posterior analysis of these models and shows the importance of our work in this paper.
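To make the dependence of the design matrix on the model order concrete, the sketch below builds the list of lags in the same order as the row $X_t$ and stacks the lagged series into $X$. It is only an illustration under stated assumptions: the helper names `tsar_lags` and `design_matrix` are hypothetical, all lags are assumed to be distinct, and rows are formed only for time points that have a full set of lagged values.

```python
import itertools
import numpy as np

def tsar_lags(p, P1, P2, P3, s1, s2, s3):
    """Lags i + j*s1 + m*s2 + k*s3 in the column order of X_t (i varies fastest)."""
    lags = []
    for k, m, j, i in itertools.product(range(P3 + 1), range(P2 + 1),
                                        range(P1 + 1), range(p + 1)):
        lag = i + j * s1 + m * s2 + k * s3
        if lag > 0:  # skip the (0, 0, 0, 0) term, which is z_t itself
            lags.append(lag)
    return lags

def design_matrix(z, lags):
    """Return (X, y): one row per time point that has all required lagged values."""
    z = np.asarray(z, dtype=float)
    start = max(lags)  # number of initial values conditioned on
    X = np.column_stack([z[start - lag: len(z) - lag] for lag in lags])
    return X, z[start:]

lags = tsar_lags(1, 1, 1, 1, s1=12, s2=60, s3=600)
X, y = design_matrix(np.arange(1000.0), lags)
print(len(lags), X.shape)
```

Changing any of the four orders changes both the number of columns, $q = (1+p)(1+P_1)(1+P_2)(1+P_3) - 1$, and the number of usable rows, which is why the orders have to be fixed before the posterior can be computed.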

3. Posterior Analysis and Proposed Identification Procedure for TSAR Models

We can obtain the posterior distribution of the TSAR model parameters by combining the prior distribution of these model parameters with the likelihood function of the time series data $\{z_t\}$ [27]. First, for the likelihood function of the observed time series data $\{z_t\}$, assuming the random errors $\{\varepsilon_t\}$ of the TSAR model (3) are normally distributed and using a straightforward transformation from the model errors $\varepsilon$ to $Z$ [28], the likelihood function conditional on the first $P$ initial values, where $P = P_3 s_3 + P_2 s_2 + P_1 s_1 + p$, can be given as:
$$L(\beta, \tau \mid Z) \propto \tau^{\frac{n-P}{2}} \exp\left\{-\frac{\tau}{2}\, \varepsilon^T \varepsilon\right\} \propto \tau^{\frac{n-P}{2}} \exp\left\{-\frac{\tau}{2}\, (Z - X\beta)^T (Z - X\beta)\right\}. \tag{6}$$
On the other hand, for the prior specification of the TSAR model parameters $\beta$ and $\tau$, we assume the products of non-seasonal and seasonal coefficients to be free coefficients and consider three prior distributions: normal-gamma, g and Jeffreys’ priors. Suppose that $\tau \sim G\left(\frac{\nu}{2}, \frac{\lambda}{2}\right)$ and $\beta \sim N_q(\mu_\beta, \tau^{-1}\Sigma_\beta)$; the normal-gamma prior of $\beta$ and $\tau$ can then be written as:
$$\zeta_n(\beta, \tau) \propto \tau^{\frac{\nu+q}{2}-1} \exp\left\{-\frac{\tau}{2}\left[\lambda + (\beta - \mu_\beta)^T \Sigma_\beta^{-1} (\beta - \mu_\beta)\right]\right\}, \tag{7}$$
where $\mu_\beta$, $\Sigma_\beta$, $\nu$ and $\lambda$ are hyper-parameters of the normal-gamma prior that need to be estimated to conduct the Bayesian analysis.
In addition, we employ the g prior for the parameters $\beta$ and $\tau$ with the objective of simplifying the elicitation of the covariances of these parameters. Thus, the g prior of $\beta$ and $\tau$ can be given as:
$$\zeta_g(\beta, \tau) \propto \tau^{\left(\frac{q}{2}-1\right)} \exp\left\{-\frac{g\tau}{2}\, (\beta - \bar{\beta})^T (X^T X) (\beta - \bar{\beta})\right\}, \tag{8}$$
where $g$ is a hyper-parameter that might be specified as a function of the time series size $n$ and the number of model coefficients; for more details about how to set the value of the $g$ hyper-parameter, see Fernandez et al. [29].
In case little or no information is available about the model parameters $\beta$ and $\tau$, Jeffreys’ prior can be employed, and it is given as:
$$\zeta_j(\beta, \tau) \propto \tau^{-1}, \quad \tau > 0. \tag{9}$$
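In order to obtain the joint posterior of $\beta$ and $\tau$, we multiply the likelihood function given in (6) by each one of the three prior distributions given in (7)–(9). Following Bayes’ rule, for the normal-gamma prior (7), the joint posterior of $\beta$ and $\tau$ is given as:
$$\zeta_n(\beta, \tau \mid Z) \propto \tau^{\frac{n-P+\nu+q}{2}-1} \exp\left\{-\frac{\tau}{2}\left[\lambda + (\beta - \mu_\beta)^T \Sigma_\beta^{-1} (\beta - \mu_\beta) + (Z - X\beta)^T (Z - X\beta)\right]\right\}. \tag{10}$$
In addition, for the g prior (8), the joint posterior of $\beta$ and $\tau$ is given as:
$$\zeta_g(\beta, \tau \mid Z) \propto \tau^{\frac{n-P+q}{2}-1} \exp\left\{-\frac{\tau}{2}\left[(\beta - \bar{\beta})^T (g X^T X) (\beta - \bar{\beta}) + (Z - X\beta)^T (Z - X\beta)\right]\right\}. \tag{11}$$
Moreover, for Jeffreys’ prior (9), the joint posterior of $\beta$ and $\tau$ is given as:
$$\zeta_j(\beta, \tau \mid Z) \propto \tau^{\frac{n-P}{2}-1} \exp\left\{-\frac{\tau}{2}\, (Z - X\beta)^T (Z - X\beta)\right\}. \tag{12}$$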
In our previous work [13], we derived from the joint posteriors in (10)–(12) the marginal posteriors of the TSAR model coefficients vector β and precision τ , and we summarize these results in the following theorem and two corollaries.
Theorem 1. 
Given the conditional likelihood function (6) and the normal-gamma prior of $\beta$ and $\tau$ (7), the marginal posterior of the TSAR model coefficients vector $\beta$ is a multivariate t-distribution with degrees of freedom $v_n = n + \nu - P$, mean vector $\mu_n = A_n^{-1} B_n$ and dispersion matrix $V_n = \frac{C_n}{v_n - 2} A_n^{-1}$, and the marginal posterior of the TSAR model precision $\tau$ is a gamma distribution with the two parameters $\frac{v_n}{2}$ and $\frac{C_n}{2}$, where $A_n^{-1} = (X^T X + \Sigma_\beta^{-1})^{-1}$, $B_n = X^T Z + \Sigma_\beta^{-1} \mu_\beta$ and $C_n = Z^T Z + \lambda + \mu_\beta^T \Sigma_\beta^{-1} \mu_\beta - B_n^T A_n^{-1} B_n$.
Proof. 
We multiply the conditional likelihood function (6) of the observed time series data $\{z_t\}$ by the normal-gamma prior of $\beta$ and $\tau$ (7) to obtain the joint posterior of $\beta$ and $\tau$ as given in Equation (10). We then integrate (10) over $\tau$ and complete the square with respect to $\beta$, leading to the given marginal posterior of the TSAR model coefficients vector $\beta$. On the other hand, we complete the square in the exponent of (10) with respect to $\beta$ and then integrate over $\beta$, leading to the given marginal posterior of the TSAR model precision $\tau$. □
The following two corollaries are special cases of Theorem 1 for when little or no information is available a priori about the TSAR model parameters. In these cases, we employ the g and Jeffreys’ priors, and the marginal posteriors of the TSAR model coefficients vector $\beta$ and precision $\tau$ are again multivariate t and gamma distributions, respectively, with different parameters.
Corollary 1. 
Given the conditional likelihood function (6) and the g prior of $\beta$ and $\tau$ (8), the marginal posterior of the TSAR model coefficients vector $\beta$ is a multivariate t-distribution with degrees of freedom $v_g = n - P$, mean vector $\mu_g = A_g^{-1} B_g$ and dispersion matrix $V_g = \frac{C_g}{v_g - 2} A_g^{-1}$, and the marginal posterior of the TSAR model precision $\tau$ is a gamma distribution with the two parameters $\frac{v_g}{2}$ and $\frac{C_g}{2}$, where $A_g^{-1} = ((g+1) X^T X)^{-1}$, $B_g = X^T Z + g (X^T X) \bar{\beta}$ and $C_g = Z^T Z + g \bar{\beta}^T (X^T X) \bar{\beta} - B_g^T A_g^{-1} B_g$.
Proof. 
We set $\lambda = \nu = 0$, $\mu_\beta = \bar{\beta}$ and $\Sigma_\beta^{-1} = g X^T X$; the result then follows directly from Theorem 1. □
Corollary 2. 
Given the conditional likelihood function (6) and Jeffreys’ prior of $\beta$ and $\tau$ (9), the marginal posterior of the TSAR model coefficients vector $\beta$ is a multivariate t-distribution with degrees of freedom $v_j = n - P - q$, mean vector $\mu_j = A_j^{-1} B_j$ and dispersion matrix $V_j = \frac{C_j}{v_j - 2} A_j^{-1}$, and the marginal posterior of the TSAR model precision $\tau$ is a gamma distribution with the two parameters $\frac{v_j}{2}$ and $\frac{C_j}{2}$, where $A_j^{-1} = (X^T X)^{-1}$, $B_j = X^T Z$ and $C_j = Z^T Z - B_j^T A_j^{-1} B_j$.
Proof. 
We set $\lambda = 0$, $\Sigma_\beta^{-1} = 0$ and $\nu = -q$; the result then follows directly from Theorem 1. □
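The completing-the-square step that underlies Theorem 1, and through the substitutions in the two proofs above also both corollaries, can be written out explicitly. The following is a sketch that uses only the quantities $A_n$, $B_n$, $C_n$, $\mu_n$ and $v_n$ defined in Theorem 1; the shorthand $Q(\beta)$ is introduced here for convenience.

```latex
\begin{aligned}
&\lambda + (\beta-\mu_\beta)^T \Sigma_\beta^{-1} (\beta-\mu_\beta) + (Z - X\beta)^T (Z - X\beta) \\
&\quad = \beta^T A_n \beta - 2 \beta^T B_n + Z^T Z + \lambda + \mu_\beta^T \Sigma_\beta^{-1} \mu_\beta \\
&\quad = \underbrace{(\beta-\mu_n)^T A_n (\beta-\mu_n)}_{Q(\beta)} + C_n , \\[6pt]
&\int_0^\infty \tau^{\frac{v_n+q}{2}-1} \exp\left\{-\frac{\tau}{2}\left[C_n + Q(\beta)\right]\right\} d\tau
\;\propto\; \left[C_n + Q(\beta)\right]^{-\frac{v_n+q}{2}}
\;\propto\; \left[1 + \frac{1}{v_n} (\beta-\mu_n)^T \left(\frac{C_n}{v_n} A_n^{-1}\right)^{-1} (\beta-\mu_n)\right]^{-\frac{v_n+q}{2}} , \\[6pt]
&\int \tau^{\frac{v_n+q}{2}-1} \exp\left\{-\frac{\tau}{2}\left[C_n + Q(\beta)\right]\right\} d\beta
\;\propto\; \tau^{\frac{v_n}{2}-1} \exp\left\{-\frac{\tau}{2} C_n\right\} .
\end{aligned}
```

The second line is the kernel of a multivariate t-distribution with $v_n$ degrees of freedom, location $\mu_n$ and scale matrix $\frac{C_n}{v_n} A_n^{-1}$, whose covariance matrix is $\frac{v_n}{v_n-2}$ times the scale matrix, i.e., $\frac{C_n}{v_n-2} A_n^{-1}$. The third line is the kernel of a gamma distribution with parameters $\frac{v_n}{2}$ and $\frac{C_n}{2}$.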
It is important to mention a useful property of the multivariate t-distribution: each single component of a random vector that follows this distribution has a univariate t-distribution, and any subvector of it has a multivariate t-distribution [30]. Accordingly, using Theorem 1 and Corollaries 1 and 2, the marginal posterior of each one of the TSAR model coefficients vectors $\phi$, $\Phi$, $\Pi$ and $\Psi$ is a multivariate t-distribution, and the marginal posterior of any single component of each one of them is a univariate t-distribution.
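The posterior quantities in Theorem 1 and Corollaries 1 and 2 reduce to a few matrix operations. The sketch below is a minimal NumPy illustration rather than the authors' implementation: the function names are hypothetical, the prior precision matrix $\Sigma_\beta^{-1}$ is passed directly so that it can be set to zero, `X` is assumed to contain only the $n - P$ rows that remain after conditioning on the initial values, and the functions return the scale matrix $\frac{C}{v} A^{-1}$ of the t-distribution instead of the dispersion matrix.

```python
import numpy as np

def tsar_posterior(X, Z, mu0, Sigma0_inv, nu, lam):
    """Location, scale matrix and degrees of freedom of the multivariate t posterior of beta."""
    A = X.T @ X + Sigma0_inv
    B = X.T @ Z + Sigma0_inv @ mu0
    mean = np.linalg.solve(A, B)                      # A^{-1} B
    C = Z @ Z + lam + mu0 @ Sigma0_inv @ mu0 - B @ mean
    df = X.shape[0] + nu                              # (n - P) + nu
    scale = (C / df) * np.linalg.inv(A)
    return mean, scale, df

def g_prior_posterior(X, Z, g, beta_bar):
    """Corollary 1: lambda = nu = 0, mu_beta = beta_bar, Sigma_beta^{-1} = g X'X."""
    return tsar_posterior(X, Z, beta_bar, g * (X.T @ X), nu=0, lam=0.0)

def jeffreys_posterior(X, Z):
    """Corollary 2: lambda = 0, Sigma_beta^{-1} = 0, nu = -q."""
    q = X.shape[1]
    return tsar_posterior(X, Z, np.zeros(q), np.zeros((q, q)), nu=-q, lam=0.0)

def component_marginal(mean, scale, df, i):
    """Univariate t marginal of component i: (location, scale, degrees of freedom)."""
    return mean[i], np.sqrt(scale[i, i]), df

# Synthetic data with hypothetical coefficients, for illustration only.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
Z = X @ np.array([0.5, -0.2, 0.0]) + rng.standard_normal(200)
mean_j, scale_j, df_j = jeffreys_posterior(X, Z)
mean_g, scale_g, df_g = g_prior_posterior(X, Z, g=0.1, beta_bar=np.zeros(3))
print(df_j, df_g, component_marginal(mean_j, scale_j, df_j, 0))
```

Under Jeffreys' prior the posterior location coincides with the least-squares estimate, which gives a quick sanity check on any implementation.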
Based on the derived marginal posterior of the TSAR model coefficients, we propose a Bayesian procedure to identify the best value of the TSAR model order. We first assume that the maximum values of the TSAR model orders $p$, $P_1$, $P_2$ and $P_3$ are known and given as $k_1$, $k_2$, $k_3$ and $k_4$, respectively. We then consider the following testing scheme:
  • Test:
    $H_0^{(4,1)}: \Psi_{k_4} = 0$ against $H_1^{(4,1)}: \Psi_{k_4} \neq 0$,
    using the marginal posterior of the coefficient $\Psi_{k_4}$, which is a univariate t-distribution.
  • If $H_0^{(4,1)}$ is rejected, then the identified value for $P_3$ is $P_3^* = k_4$. Otherwise, test:
    $H_0^{(4,2)}: \Psi_{k_4-1} = \Psi_{k_4} = 0$ against $H_1^{(4,2)}: \Psi_{k_4-1} \neq 0$ or $\Psi_{k_4} \neq 0$,
    using the marginal posterior of the coefficients $(\Psi_{k_4-1}, \Psi_{k_4})$, which is a multivariate t-distribution.
  • Continue executing this sequence of t-tests of significance until the null hypothesis $H_0^{(4,i_4)}: \Psi_{k_4-i_4+1} = \cdots = \Psi_{k_4} = 0$ is rejected, and then the identified value for $P_3$ is $P_3^* = k_4 - i_4 + 1$, where $0 \le P_3^* \le k_4$.
  • Test:
    $H_0^{(3,1)}: \Pi_{k_3} = \Psi_{k_4-i_4+2} = \cdots = \Psi_{k_4} = 0$ against
    $H_1^{(3,1)}: \Pi_{k_3} \neq 0$ or $\Psi_{k_4-i_4+2} \neq 0$ or $\cdots$ or $\Psi_{k_4} \neq 0$,
    using the marginal posterior of the coefficients $(\Pi_{k_3}, \Psi_{k_4-i_4+2}, \dots, \Psi_{k_4})$, which is a multivariate t-distribution.
  • If $H_0^{(3,1)}$ is rejected, then the identified value for $P_2$ is $P_2^* = k_3$. Otherwise, test:
    $H_0^{(3,2)}: \Pi_{k_3-1} = \Pi_{k_3} = \Psi_{k_4-i_4+2} = \cdots = \Psi_{k_4} = 0$ against
    $H_1^{(3,2)}: \Pi_{k_3-1} \neq 0$ or $\Pi_{k_3} \neq 0$ or $\Psi_{k_4-i_4+2} \neq 0$ or $\cdots$ or $\Psi_{k_4} \neq 0$,
    using the marginal posterior of the coefficients $(\Pi_{k_3-1}, \Pi_{k_3}, \Psi_{k_4-i_4+2}, \dots, \Psi_{k_4})$, which is a multivariate t-distribution.
  • Continue executing this sequence of t-tests of significance until the null hypothesis $H_0^{(3,i_3)}: \Pi_{k_3-i_3+1} = \cdots = \Pi_{k_3} = \Psi_{k_4-i_4+2} = \cdots = \Psi_{k_4} = 0$ is rejected, and then the identified value for $P_2$ is $P_2^* = k_3 - i_3 + 1$, where $0 \le P_2^* \le k_3$.
  • In the same way, run sequences of t-tests of significance for the TSAR model coefficients vectors $\phi$ and $\Phi$ until the null hypotheses are rejected, and then the identified values for $p$ and $P_1$ are $0 \le p^* = k_1 - i_1 + 1 \le k_1$ and $0 \le P_1^* = k_2 - i_2 + 1 \le k_2$, respectively.
The outcome of this testing scheme is the values $p^*$, $P_1^*$, $P_2^*$ and $P_3^*$ as the best values for the TSAR model orders $p$, $P_1$, $P_2$ and $P_3$, respectively.
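The backward testing scheme above can be sketched in a few lines once the posterior location, scale matrix and degrees of freedom are available. The code below is an illustration under explicit assumptions, not the authors' implementation. It assumes SciPy is available, and it assesses each joint hypothesis with the standard result that, for a multivariate t vector of dimension $m$ with $v$ degrees of freedom, the quadratic form in the scale matrix divided by $m$ follows an $F(m, v)$ distribution (for $m = 1$ this is the two-sided t-test). The text does not spell out the statistic used for the joint tests, so this choice is an assumption. The names `joint_p_value`, `identify_orders` and the argument `groups` (positions in $\beta$ of the coefficients $1, \dots, k$ of each operator, listed in testing order) are hypothetical.

```python
import numpy as np
from scipy import stats

def joint_p_value(mean, scale, df, idx):
    """p-value for H0: beta[idx] = 0 under a multivariate t posterior (F form, assumed)."""
    m = len(idx)
    mu = mean[idx]
    S = scale[np.ix_(idx, idx)]
    f_stat = mu @ np.linalg.solve(S, mu) / m
    return stats.f.sf(f_stat, m, df)

def identify_orders(mean, scale, df, groups, alpha=0.05):
    """Run the backward testing scheme group by group and return the identified orders."""
    carried = []  # coefficients of earlier operators already judged to be zero
    orders = []
    for idx in groups:
        k = len(idx)
        order = 0  # stays 0 if no null hypothesis is rejected
        for i in range(1, k + 1):
            tested = idx[k - i:] + carried  # last i coefficients plus carried zeros
            if joint_p_value(mean, scale, df, tested) < alpha:
                order = k - i + 1
                break
        carried = carried + idx[order:]
        orders.append(order)
    return orders

# Toy posterior with two operators of maximum order 2 (hypothetical numbers).
mean = np.array([0.5, 0.0, 0.4, 0.0])
scale = 0.001 * np.eye(4)
orders = identify_orders(mean, scale, df=100, groups=[[2, 3], [0, 1]])
print(orders)
```

In the toy example the second coefficient of each operator has posterior location zero, so each sequence stops at the second test and both identified orders equal one.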
An important point to mention here is the computational challenge of applying the proposed Bayesian identification procedure. Capturing triple seasonality requires observing the time series under study for a long time, which results in a time series with a very large sample size that can be considered big time series data. For example, observing the hourly electricity load for four years results in a time series of size $n \approx 35{,}000$. Analyzing big time series data has a high computational cost, especially when the analysis is conducted using interpreted software.

4. Simulation Study and Real Application

In this section, we evaluate the accuracy of the proposed Bayesian identification for TSAR models by conducting an extensive simulation study, and then we show the applicability of the proposed Bayesian identification procedure on real time series datasets of electricity load in six European countries that have three layers of seasonality.

4.1. Simulation Study

In this simulation study, we evaluate the accuracy of the proposed Bayesian identification procedure for TSAR models by covering different seasonality patterns with different data types and sample sizes. We generate 500 time series datasets of size $n$ (from 3000 to 6000 with an increment of 1000 observations) from five TSAR models; the design of these TSAR models and the true parameter values are presented in Table 1. As can be seen from this table, the seasonal periods for the generated time series are $s_1 = 12$, $s_2 = 60$ and $s_3 = 600$. Since the time series size has to be greater than $k_4 s_3 + k_3 s_2 + k_2 s_1 + k_1$ (which equals 2019 for the maximum orders $k_1 = k_2 = k_3 = k_4 = 3$ used below), a minimum time series size of $n = 3000$ is justified.
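The lower bound on the series size can be verified with simple arithmetic. The short sketch below uses the seasonal periods stated above and the maximum orders of three used in this simulation study; the variable names are hypothetical.

```python
# Maximum orders and seasonal periods of the simulation design.
k1 = k2 = k3 = k4 = 3
s1, s2, s3 = 12, 60, 600

# Number of initial values consumed by the largest candidate model.
initial_values = k4 * s3 + k3 * s2 + k2 * s1 + k1

# Number of columns of the design matrix for the largest candidate model.
n_columns = (1 + k1) * (1 + k2) * (1 + k3) * (1 + k4) - 1

print(initial_values)  # 2019
print(n_columns)       # 255
```

With $n = 3000$, the largest candidate model therefore leaves $3000 - 2019 = 981$ usable observations for 255 coefficients.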
After obtaining the 500 time series datasets from each one of these five TSAR models with different sizes, we apply the proposed Bayesian identification procedure to identify the best value for the TSAR model order as follows.
  • First, we assume that the maximum values of the model order are $k_1 = k_2 = k_3 = k_4 = 3$, since the maximum order value in the simulated TSAR models is not more than two.
  • Second, we employ different priors for the model parameters $\beta$ and $\tau$ in order to evaluate the robustness or sensitivity to prior specification. In particular, we employ Jeffreys’ prior and the g prior with five values for the hyper-parameter $g$, i.e., $g_1 = 1/n$, $g_2 = q/n$, $g_3 = \sqrt{1/n}$, $g_4 = \sqrt{q/n}$ and $g_5 = \ln(q+1)/\ln(n)$.
  • Third, for each time series, we execute the proposed testing scheme in Section 3 using the marginal posterior of the TSAR model coefficients resulting from the employed priors, and the outcome of this testing scheme is the identified TSAR model order, i.e., $p^*$, $P_1^*$, $P_2^*$ and $P_3^*$.
For all the generated time series datasets, we obtain the percentage of correctly identified true TSAR models by simply comparing the identified TSAR order obtained from the testing scheme with the true values of $p$, $P_1$, $P_2$ and $P_3$ employed to generate the given time series.
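Generating a series from a seasonal autoregressive model amounts to running the expanded recursion forward with random errors. The sketch below is one possible way to do this, assuming NumPy and standard normal errors (unit precision); the function name `simulate_from_polynomial`, the burn-in length and the coefficient values are hypothetical and are not the design values of Table 1.

```python
import numpy as np

def simulate_from_polynomial(poly, n, burn_in=500, seed=0):
    """Simulate n observations from poly(B) z_t = eps_t, where poly[0] = 1."""
    rng = np.random.default_rng(seed)
    poly = np.asarray(poly, dtype=float)
    max_lag = len(poly) - 1
    total = n + burn_in + max_lag
    eps = rng.standard_normal(total)
    ar = -poly[1:]  # coefficients on z_{t-1}, ..., z_{t-max_lag}
    z = np.zeros(total)
    for t in range(max_lag, total):
        # z[t-max_lag:t][::-1] is (z_{t-1}, z_{t-2}, ..., z_{t-max_lag}).
        z[t] = ar @ z[t - max_lag:t][::-1] + eps[t]
    return z[-n:]  # drop the start-up and burn-in values

# Illustration: (1 - 0.5 B)(1 - 0.3 B^12) z_t = eps_t with hypothetical coefficients.
seasonal = np.zeros(13)
seasonal[0], seasonal[12] = 1.0, -0.3
poly = np.convolve([1.0, -0.5], seasonal)
series = simulate_from_polynomial(poly, n=3000)
print(series.shape)
```

The same function applies to a full TSAR model once the four operators have been multiplied into a single polynomial; the percentage of correctly identified models is then the share of simulated series for which the identified orders equal the generating ones.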
In order to explain how the proposed Bayesian identification procedure works to identify the best value of the TSAR model order, we present the results of the testing scheme for only one generated time series with n = 3000 from TSAR Model I in Table 2. In this table, we present the null hypothesis, p-value calculated from the t distribution and whether the status of the null hypothesis is rejected or not using a 5% significance level for each test. It is worth noting that using the 5% significance level for each test does not guarantee that the overall significance level of the testing scheme equals 0.05, and indeed it is difficult to determine the overall significance level; for more discussion about this point, see, for example, Lütkepohl [31]. The results of the simulation study for all TSAR models are presented in Table 3.
From the simulation results in Table 3, we can draw some important conclusions. First, the proposed Bayesian identification procedure provides a high percentage of correctly identified true TSAR models: in most cases, this percentage is at least 85%. Second, the larger the time series size, the higher the percentage of correctly identified true TSAR models. Third, employing different prior distributions for the TSAR model coefficients leads to markedly different identification results, and their impact is evident in the percentage of correctly identified true TSAR models. Fourth, employing the g prior with $g = \ln(q+1)/\ln(n)$ for the TSAR model parameters $\beta$ and $\tau$ substantially improves the percentage of correctly identified true TSAR models compared with the other $g$ values and with Jeffreys’ prior: in all cases that employ the g prior with $g = \ln(q+1)/\ln(n)$, the percentage of correctly identified true TSAR models is at least 92.2%.
In order to evaluate the robustness of the proposed Bayesian procedure to violation of the normality assumption, we generate the 500 time series datasets from the five TSAR models while assuming other distributions for the model errors. These error distributions include Student’s t with 15 degrees of freedom, i.e., t(15), Laplace, log-normal and skew-normal with moderate skewness, i.e., skewness = 0.75. We apply the proposed Bayesian TSAR identification procedure in the same way as discussed above, and the Bayesian identification results for Model I are presented in Table 4.
These results show that the proposed Bayesian TSAR identification procedure is robust to violation of the normality assumption, and the same conclusion is obtained from the results of the other four TSAR models. In general, the results of the conducted simulation study confirm the accuracy of the proposed Bayesian procedure for TSAR model identification, especially when assuming the g prior for the TSAR model coefficients $\beta$ and precision $\tau$.

4.2. Applications on Real Hourly Electricity Load Datasets

In this section, we apply the proposed Bayesian identification procedure of TSAR models to real hourly time series datasets with three layers of seasonality. These datasets are electricity load per hour collected over about four years, from Saturday 1 January 2006 to Thursday 31 December 2009, in six European countries. These electricity load datasets exhibit three seasonality layers: intraday, intraweek and intrayear patterns. The time plots for these time series datasets of electricity load are displayed in Figure 1. For more details about these electricity load datasets, the reader is referred to Amin [26], where these datasets are first introduced and analyzed.
It is clear from Figure 1 that the seasonality with three layers is very strong in these hourly electricity load time series; accordingly, we set the maximum order values of the TSAR model as $k_1 = k_2 = k_3 = k_4 = 4$, and we apply our proposed Bayesian identification procedure for the TSAR model with the same design as in the simulation study, using Jeffreys’ prior and the g prior with $g = \ln(q+1)/\ln(n)$. We present the results of the identified TSAR model for each one of these electricity load datasets in Table 5.
We can observe from the results of this real application in Table 5 that the proposed identification procedure identifies almost the same TSAR model for the electricity load datasets that have the same stochastic behavior, as displayed in Figure 1. In addition, the identified TSAR models for these electricity load datasets are very similar for both Jeffreys’ and g priors.

5. Conclusions

In this paper, we introduced a Bayesian identification procedure for TSAR models, which are used to model time series with triple seasonality. We first employed three prior distributions, i.e., Jeffreys’, g and normal-gamma priors, on the TSAR model coefficients and precision, and we assumed that the TSAR model errors are identically and normally distributed. We then derived the marginal posteriors of the TSAR model coefficients vector and precision to be multivariate t and gamma distributions, respectively. Using the derived marginal posterior of the TSAR model coefficients, we proposed an identification procedure for determining the best order of TSAR models based on a sequence of t-tests of significance. We conducted an extensive simulation study, whose results confirmed the accuracy of the proposed Bayesian TSAR identification, and we applied our proposed procedure to real hourly electricity load datasets in six European countries. Since the current work considers only the autoregressive component, future work may include the moving average component in the time series model [8] and an extension to multivariate time series models [32].

Author Contributions

Conceptualization, A.A.A.; Methodology, A.A.A.; Software, A.A.A.; Writing—original draft, A.A.A.; Writing—review & editing, S.A.A.; Project administration, S.A.A.; Funding acquisition, S.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All the datasets used in this paper are available from the authors upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Taylor, J.W. Triple seasonal methods for short-term electricity demand forecasting. Eur. J. Oper. Res. 2010, 204, 139–152.
  2. Taylor, J.W. A comparison of univariate time series methods for forecasting intraday arrivals at a call center. Manag. Sci. 2008, 54, 253–265.
  3. Taylor, J.W. An evaluation of methods for very short-term load forecasting using minute-by-minute British data. Int. J. Forecast. 2008, 24, 645–658.
  4. Taylor, J.W. Exponentially weighted methods for forecasting intraday time series with multiple seasonal cycles. Int. J. Forecast. 2010, 26, 627–646.
  5. Taylor, J.W.; Snyder, R.D. Forecasting intraday time series with multiple seasonal cycles using parsimonious seasonal exponential smoothing. Omega 2012, 40, 748–757.
  6. De Livera, A.M.; Hyndman, R.J.; Snyder, R.D. Forecasting time series with complex seasonal patterns using exponential smoothing. J. Am. Stat. Assoc. 2011, 106, 1513–1527.
  7. Sulandari, W.; Suhartono, S.; Rodrigues, P.C. Exponential Smoothing on Modeling and Forecasting Multiple Seasonal Time Series: An Overview. Fluct. Noise Lett. 2021, 20, 2130003.
  8. Shaarawy, S.; Broemeling, L. Bayesian inferences and forecasts with moving averages processes. Commun. Stat.-Theory Methods 1984, 13, 1871–1888.
  9. Shaarawy, S.; Ismail, M. Bayesian inference for seasonal ARMA models. Egypt. Stat. J. 1987, 31, 323–336.
  10. Shaarawy, S.M.; Ali, S.S. Bayesian identification of seasonal autoregressive models. Commun. Stat.-Theory Methods 2003, 32, 1067–1084.
  11. Barnett, G.; Kohn, R.; Sheather, S. Robust Bayesian Estimation Of Autoregressive–Moving-Average Models. J. Time Ser. Anal. 1997, 18, 11–28.
  12. Vermaak, J.; Niranjan, M.; Godsill, S.J. Markov Chain Monte Carlo Estimation for the Seasonal Autoregressive Process with Application to Pitch Modelling; Technical Report; Department of Engineering, University of Cambridge: Cambridge, UK, 1998.
  13. Amin, A.A. Bayesian Inference of Triple Seasonal Autoregressive Models. Pak. J. Stat. Oper. Res. 2022, 18, 853–865.
  14. Amin, A.A.; Emam, W.; Tashkandy, Y.; Chesneau, C. Bayesian Subset Selection of Seasonal Autoregressive Models. Mathematics 2023, 11, 2878.
  15. Barnett, G.; Kohn, R.; Sheather, S. Bayesian estimation of an autoregressive model using Markov chain Monte Carlo. J. Econom. 1996, 74, 237–254.
  16. Ismail, M.A. Bayesian Analysis of Seasonal Autoregressive Models. J. Appl. Stat. Sci. 2003, 12, 123–136.
  17. Ismail, M.A. Bayesian Analysis of the Seasonal Moving Average Model: A Gibbs Sampling Approach. Jpn. J. Appl. Stat. 2003, 32, 61–75.
  18. Ismail, M.A.; Amin, A.A. Gibbs sampling for SARMA models. Pak. J. Stat. 2014, 30, 153–168.
  19. Amin, A. Gibbs Sampling for Bayesian Prediction of SARMA Processes. Pak. J. Stat. Oper. Res. 2019, 15, 397–418.
  20. Ismail, M.A.; Zahran, A.R. Bayesian inference on double seasonal autoregressive models. J. Appl. Stat. Sci. 2013, 21, 13.
  21. Amin, A. Gibbs sampling for double seasonal ARMA models. In Proceedings of the 29th Annual International Conference on Statistics and Computer Modeling in Human and Social Sciences, Cairo, Egypt, 28–30 March 2017.
  22. Amin, A.A. Bayesian inference for double SARMA models. Commun. Stat.-Theory Methods 2018, 47, 5333–5345.
  23. Amin, A.A. Bayesian identification of double seasonal autoregressive time series models. Commun. Stat.-Simul. Comput. 2019, 48, 2501–2511.
  24. Amin, A.A. Bayesian analysis of double seasonal autoregressive models. Sankhya B 2020, 82, 328–352.
  25. Amin, A.A. Full Bayesian analysis of double seasonal autoregressive models with real applications. J. Appl. Stat. 2023, 1–21.
  26. Amin, A.A. Gibbs sampling for Bayesian estimation of triple seasonal autoregressive models. Commun. Stat.-Theory Methods 2023, 52, 7303–7322.
  27. Broemeling, L.D. Bayesian Analysis of Linear Models; CRC Press: Boca Raton, FL, USA, 1985.
  28. Broemeling, L.D. Bayesian Analysis of Time Series; CRC Press: Boca Raton, FL, USA, 2019.
  29. Fernandez, C.; Ley, E.; Steel, M.F. Benchmark priors for Bayesian model averaging. J. Econom. 2001, 100, 381–427.
  30. Box, G.E.; Tiao, G. Bayesian Inference in Statistical Analysis; John Wiley & Sons: Hoboken, NJ, USA, 1973.
  31. Lütkepohl, H. New Introduction to Multiple Time Series Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2005.
  32. Reinsel, G.C. Elements of Multivariate Time Series Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2003.
Figure 1. Time plots for electricity load datasets in European countries.
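Table 1. Simulation design.

| TSAR Model | $\phi_1$ | $\phi_2$ | $\Phi_1$ | $\Phi_2$ | $\Pi_1$ | $\Pi_2$ | $\Psi_1$ | $\tau$ |
|---|---|---|---|---|---|---|---|---|
| I. $(1)(1)_{12}(1)_{60}(1)_{600}$ | 0.5 | | 0.4 | | 0.5 | | 0.4 | 1.0 |
| II. $(2)(1)_{12}(1)_{60}(1)_{600}$ | 0.6 | 0.3 | 0.9 | | −0.8 | | 0.7 | 1.0 |
| III. $(1)(2)_{12}(1)_{60}(1)_{600}$ | 0.9 | | 0.5 | −0.4 | 0.9 | | 0.8 | 1.0 |
| IV. $(2)(2)_{12}(1)_{60}(1)_{600}$ | 0.6 | 0.3 | 0.5 | −0.4 | −0.9 | | 0.8 | 1.0 |
| V. $(2)(2)_{12}(2)_{60}(1)_{600}$ | 0.6 | −0.3 | 0.5 | 0.4 | 0.7 | −0.4 | 0.6 | 1.0 |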
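To make the simulation design in Table 1 concrete, the following is a minimal sketch of how a series from Model I could be generated. It assumes the usual multiplicative form of the TSAR model, $\phi(B)\,\Phi(B^{12})\,\Pi(B^{60})\,\Psi(B^{600})\,y_t = \varepsilon_t$ with each factor written as $1 - \sum_j c_j B^{js}$, and errors with variance $1/\tau$. The function names, burn-in length and seed are hypothetical choices for illustration, not the authors' code.

```python
import numpy as np

def tsar_lag_polynomial(phi, Phi, Pi, Psi, s1=12, s2=60, s3=600):
    """Expand the product of the four AR factors into one polynomial in B.

    Assumed sign convention: each factor is 1 - sum_j c_j * B^(j*s).
    Returns an array whose entry i is the coefficient of B^i.
    """
    def factor(coefs, s):
        poly = np.zeros(len(coefs) * s + 1)
        poly[0] = 1.0
        for j, c in enumerate(coefs, start=1):
            poly[j * s] = -c
        return poly

    poly = np.array([1.0])
    for coefs, s in ((phi, 1), (Phi, s1), (Pi, s2), (Psi, s3)):
        poly = np.convolve(poly, factor(coefs, s))  # polynomial multiplication
    return poly

def simulate_tsar(poly, n, tau=1.0, burn_in=2000, seed=0):
    """Simulate n observations; tau is the error precision (variance = 1/tau)."""
    rng = np.random.default_rng(seed)
    lags = np.nonzero(poly[1:])[0] + 1   # lags with nonzero coefficients
    weights = -poly[lags]                # y_t = sum_i weights_i * y_{t-lag_i} + eps_t
    total = n + burn_in
    eps = rng.normal(scale=1.0 / np.sqrt(tau), size=total)
    offset = len(poly) - 1               # zero padding serves as initial values
    y = np.zeros(offset + total)
    for t in range(total):
        y[offset + t] = weights @ y[offset + t - lags] + eps[t]
    return y[-n:]                        # drop padding and burn-in

# Model I of Table 1: all four orders equal to 1, tau = 1.0
poly = tsar_lag_polynomial(phi=[0.5], Phi=[0.4], Pi=[0.5], Psi=[0.4])
y = simulate_tsar(poly, n=3000, tau=1.0)
```

Expanding the product first makes the cross terms explicit: Model I has only four free coefficients, but the expanded polynomial has 15 nonzero lags, the largest at lag 673.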
Table 2. Testing scheme results for one generated time series from TSAR Model I.

| Null Hypothesis | p-Value | Status |
|---|---|---|
| $H_0^{(4,1)}: \Psi_3 = 0$ | 0.8857 | Not rejected |
| $H_0^{(4,2)}: \Psi_3 = \Psi_2 = 0$ | 0.7066 | Not rejected |
| $H_0^{(4,3)}: \Psi_3 = \Psi_2 = \Psi_1 = 0$ | 0.0001 | Rejected; $P_3^* = 1$ |
| $H_0^{(3,1)}: \Pi_3 = \Psi_3 = \Psi_2 = 0$ | 0.8739 | Not rejected |
| $H_0^{(3,2)}: \Pi_3 = \Pi_2 = \Psi_3 = \Psi_2 = 0$ | 0.9247 | Not rejected |
| $H_0^{(3,3)}: \Pi_3 = \Pi_2 = \Pi_1 = \Psi_3 = \Psi_2 = 0$ | 0.0002 | Rejected; $P_2^* = 1$ |
| $H_0^{(2,1)}: \Phi_3 = \Pi_3 = \Pi_2 = \Psi_3 = \Psi_2 = 0$ | 0.9279 | Not rejected |
| $H_0^{(2,2)}: \Phi_3 = \Phi_2 = \Pi_3 = \Pi_2 = \Psi_3 = \Psi_2 = 0$ | 0.9678 | Not rejected |
| $H_0^{(2,3)}: \Phi_3 = \Phi_2 = \Phi_1 = \Pi_3 = \Pi_2 = \Psi_3 = \Psi_2 = 0$ | 0.0004 | Rejected; $P_1^* = 1$ |
| $H_0^{(1,1)}: \phi_3 = \Phi_3 = \Phi_2 = \Pi_3 = \Pi_2 = \Psi_3 = \Psi_2 = 0$ | 0.9758 | Not rejected |
| $H_0^{(1,2)}: \phi_3 = \phi_2 = \Phi_3 = \Phi_2 = \Pi_3 = \Pi_2 = \Psi_3 = \Psi_2 = 0$ | 0.9809 | Not rejected |
| $H_0^{(1,3)}: \phi_3 = \phi_2 = \phi_1 = \Phi_3 = \Phi_2 = \Pi_3 = \Pi_2 = \Psi_3 = \Psi_2 = 0$ | 0.0007 | Rejected; $p^* = 1$ |
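The backward testing scheme recorded in Table 2 can be sketched as a short loop. The sketch below only replays the control flow: the p-values are looked up from Table 2 rather than computed, and the significance level of 0.05, the maximum order of 3 per component, the coefficient labels and the function name `identify_orders` are assumptions made for illustration.

```python
# Table 2 rows: coefficients restricted to zero under H0, and the reported p-value.
# "phi" is the non-seasonal component; "Phi", "Pi", "Psi" are the seasonal ones.
TABLE2 = [
    ("Psi3", 0.8857),
    ("Psi3 Psi2", 0.7066),
    ("Psi3 Psi2 Psi1", 0.0001),
    ("Pi3 Psi3 Psi2", 0.8739),
    ("Pi3 Pi2 Psi3 Psi2", 0.9247),
    ("Pi3 Pi2 Pi1 Psi3 Psi2", 0.0002),
    ("Phi3 Pi3 Pi2 Psi3 Psi2", 0.9279),
    ("Phi3 Phi2 Pi3 Pi2 Psi3 Psi2", 0.9678),
    ("Phi3 Phi2 Phi1 Pi3 Pi2 Psi3 Psi2", 0.0004),
    ("phi3 Phi3 Phi2 Pi3 Pi2 Psi3 Psi2", 0.9758),
    ("phi3 phi2 Phi3 Phi2 Pi3 Pi2 Psi3 Psi2", 0.9809),
    ("phi3 phi2 phi1 Phi3 Phi2 Pi3 Pi2 Psi3 Psi2", 0.0007),
]
P_LOOKUP = {frozenset(names.split()): p for names, p in TABLE2}

def identify_orders(p_value, symbols=("Psi", "Pi", "Phi", "phi"),
                    max_order=3, alpha=0.05):
    """Backward sequential identification of the four TSAR orders.

    p_value: callable mapping a frozenset of coefficient names (all set to
    zero under H0) to a p-value. alpha = 0.05 is an assumed level.
    """
    carried = set()   # coefficients already accepted as zero
    orders = {}
    for sym in symbols:
        order = 0     # stays 0 if no hypothesis for this component is rejected
        for step in range(1, max_order + 1):
            # Test the `step` highest lags of this component jointly with
            # the restrictions carried over from earlier components.
            tested = {f"{sym}{j}" for j in range(max_order, max_order - step, -1)}
            if p_value(frozenset(carried | tested)) < alpha:
                order = max_order - step + 1   # lowest lag in the rejected set
                break
        carried |= {f"{sym}{j}" for j in range(order + 1, max_order + 1)}
        orders[sym] = order
    return orders

orders = identify_orders(P_LOOKUP.__getitem__)
print(orders)
```

With the Table 2 p-values, each component's third hypothesis is the first to be rejected, so every identified order equals 1, which matches the true Model I. In a full implementation the lookup would be replaced by a function that computes the p-value from the marginal posterior of the coefficients.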
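Table 3. Percentage of correctly identified true TSAR models.

| Model | n | J | $g_1$ | $g_2$ | $g_3$ | $g_4$ | $g_5$ |
|---|---|---|---|---|---|---|---|
| I | 3000 | 88.2 | 73.2 | 76.4 | 73.8 | 86.6 | 95.0 |
| I | 4000 | 87.0 | 82.4 | 85.0 | 83.4 | 91.0 | 96.4 |
| I | 5000 | 89.2 | 85.4 | 87.2 | 85.6 | 93.2 | 98.2 |
| I | 6000 | 89.0 | 86.6 | 88.2 | 86.8 | 93.4 | 97.4 |
| II | 3000 | 85.6 | 68.0 | 73.6 | 69.0 | 82.2 | 94.4 |
| II | 4000 | 86.8 | 80.8 | 82.4 | 81.6 | 89.0 | 96.0 |
| II | 5000 | 87.4 | 84.0 | 86.6 | 84.8 | 91.6 | 98.2 |
| II | 6000 | 87.8 | 83.4 | 86.2 | 84.6 | 93.4 | 98.2 |
| III | 3000 | 85.4 | 69.0 | 74.4 | 70.6 | 83.4 | 92.2 |
| III | 4000 | 86.0 | 79.8 | 83.2 | 81.0 | 89.4 | 96.0 |
| III | 5000 | 88.8 | 86.0 | 87.6 | 86.4 | 91.4 | 96.4 |
| III | 6000 | 88.8 | 85.8 | 87.6 | 86.8 | 92.0 | 96.6 |
| IV | 3000 | 86.8 | 74.6 | 78.0 | 75.2 | 85.2 | 92.6 |
| IV | 4000 | 87.2 | 81.8 | 83.6 | 82.4 | 89.0 | 95.4 |
| IV | 5000 | 90.4 | 86.8 | 88.2 | 87.2 | 92.4 | 97.0 |
| IV | 6000 | 89.6 | 87.8 | 89.2 | 88.6 | 93.2 | 98.4 |
| V | 3000 | 90.8 | 80.8 | 83.8 | 82.0 | 89.6 | 95.6 |
| V | 4000 | 90.2 | 86.4 | 88.2 | 87.2 | 92.8 | 97.2 |
| V | 5000 | 93.4 | 90.8 | 92.2 | 91.0 | 95.2 | 97.6 |
| V | 6000 | 92.4 | 91.6 | 92.2 | 91.8 | 94.6 | 98.4 |

J refers to Jeffreys’ prior, and $g_1$ to $g_5$ refer to the g prior with different g values.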
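Table 4. Percentage of correctly identified true TSAR models for Model I with different error distributions.

| Error Distribution | n | J | $g_1$ | $g_2$ | $g_3$ | $g_4$ | $g_5$ |
|---|---|---|---|---|---|---|---|
| t(15) | 3000 | 86.6 | 72 | 75.4 | 72.6 | 84.4 | 93.6 |
| t(15) | 4000 | 88 | 83.6 | 85.6 | 84.4 | 90.4 | 97.8 |
| t(15) | 5000 | 88 | 84.2 | 86 | 84.8 | 91.6 | 97.4 |
| t(15) | 6000 | 87.6 | 84.8 | 86.4 | 85.2 | 92.8 | 97.6 |
| Laplace | 3000 | 88.6 | 76 | 80.4 | 77.6 | 86.8 | 93.8 |
| Laplace | 4000 | 86.6 | 80.6 | 83.8 | 81 | 90.4 | 97.8 |
| Laplace | 5000 | 88 | 84.2 | 86.6 | 85 | 91.4 | 97 |
| Laplace | 6000 | 85.8 | 83.4 | 84.6 | 83.4 | 92.6 | 98 |
| Log-normal | 3000 | 86 | 70 | 74.2 | 70.6 | 83.8 | 93.4 |
| Log-normal | 4000 | 85.2 | 77.6 | 81.4 | 78 | 88.4 | 95.4 |
| Log-normal | 5000 | 86.6 | 83.2 | 84.8 | 83.4 | 90.6 | 95.8 |
| Log-normal | 6000 | 88.4 | 85.4 | 87.4 | 86 | 92 | 96.8 |
| Skew-normal | 3000 | 84.8 | 68.4 | 74 | 70 | 83 | 93.8 |
| Skew-normal | 4000 | 86.4 | 81.6 | 84.2 | 81.8 | 90 | 96 |
| Skew-normal | 5000 | 90.8 | 85.8 | 89 | 87.2 | 93.8 | 97.8 |
| Skew-normal | 6000 | 88.6 | 85.2 | 87.6 | 85.8 | 92 | 98.2 |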
Table 5. Identified TSAR models for electricity load datasets.

| Time Series Dataset | Jeffreys’ Prior | g Prior ($g = \ln(q+1)/\ln(n)$) |
|---|---|---|
| Austria | TSAR$(4)(2)_{24}(4)_{168}(4)_{8736}$ | TSAR$(4)(2)_{24}(4)_{168}(4)_{8736}$ |
| Belgium | TSAR$(2)(4)_{24}(4)_{168}(3)_{8736}$ | TSAR$(2)(4)_{24}(4)_{168}(3)_{8736}$ |
| Czech Republic | TSAR$(4)(4)_{24}(4)_{168}(3)_{8736}$ | TSAR$(4)(4)_{24}(4)_{168}(3)_{8736}$ |
| France | TSAR$(2)(4)_{24}(3)_{168}(3)_{8736}$ | TSAR$(2)(4)_{24}(3)_{168}(3)_{8736}$ |
| Germany | TSAR$(2)(4)_{24}(3)_{168}(4)_{8736}$ | TSAR$(2)(3)_{24}(3)_{168}(4)_{8736}$ |
| Spain | TSAR$(3)(4)_{24}(4)_{168}(3)_{8736}$ | TSAR$(3)(3)_{24}(4)_{168}(3)_{8736}$ |
