1. Introduction
Fitting a Gegenbauer autoregressive moving-average (GARMA) model to non-Gaussian data becomes intricate when real-world time series exhibit pronounced anomalies, such as non-additivity and heteroscedasticity. Addressing this behavior has garnered significant interest, leading to numerous studies over the past few decades. For example, Albarracin et al. [1] analyzed the structure of GARMA models in practical applications, Hunt et al. [2] proposed an R (Version R-4.4.0) package called 'garma' to fit and forecast GARMA models, and Darmawan et al. [3] used a GARMA model to forecast COVID-19 data in Indonesia.
The conditional heteroscedastic autoregressive moving-average (CHARMA) model is commonly employed to capture unobserved heterogeneity in real-world data [4]. Another closely related model is the random coefficient autoregressive (RCA) model, introduced by Nicholls and Quinn [5], with recent investigations into its properties conducted by Appadoo et al. [6].
The GARMA model has recently emerged as a suitable framework for identifying and handling such features in real-world data under specific parameter values [7]. Inference for the estimators of the GARMA model was given by Beaumont and Smallwood [8], and an efficient estimation approach for the regression parameters of the generalized autoregressive moving-average model was provided by Hossain et al. [9].
In this context, this study proposes a random coefficient approach, namely the random coefficient Gegenbauer autoregressive moving-average (RCGARMA) model, to capture unobserved heterogeneity. The RCGARMA model extends the GARMA model by introducing an additional source of random variation into the standard coefficient model. While the GARMA model has been extensively analyzed with non-random coefficients (see [10,11,12], among others), the RCGARMA model provides flexibility in modeling unobserved heterogeneity in the profit structures of dependent data, alongside long- and short-term dependence and seasonal fluctuations at different frequencies. Analyzing the statistical properties and estimation of this model is crucial for its application to real-world data.
To begin, we introduce the fundamentals of the random GARMA model, along with notations and commonly used assumptions. Throughout this paper, we focus on random coefficients in the GARMA model, akin to random-effect regression models, where the fixed coefficients of the GARMA model are randomly perturbed (see, for example, [13,14], and other relevant literature).
We consider a scenario in which we observe a sequence of random variables $\{X_t\}$, generated by the following recursive model [15]:

$$X_t = \sum_{j=1}^{\infty} \psi_j X_{t-j} + \varepsilon_t, \qquad (1)$$

where $\{\varepsilon_t\}$ is the noise, which is assumed to have zero mean and variance $\sigma_{\varepsilon}^2$, and the coefficient sequence $\{\psi_j\}$ is determined by the following assumptions:

$$\psi_j = \psi_j(u, d) = \sum_{k=0}^{[j/2]} \frac{(-1)^k (2u)^{j-2k}\, \Gamma(d + j - k)}{\Gamma(d)\, k!\, (j - 2k)!}, \qquad (2)$$

where the $\psi_j(u, d)$ are the Gegenbauer polynomial coefficients, defined in terms of their generating function (see, for instance, Magnus et al. [16] and Rainville [17]), as follows:

$$(1 - 2uz + z^2)^{-d} = \sum_{j=0}^{\infty} \psi_j(u, d)\, z^j, \qquad |z| \le 1,$$

where $\Gamma(\cdot)$ denotes the Gamma function and $[j/2]$ stands for the integer part of $j/2$ [2].
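As a concrete check on this definition, the coefficients can be computed either directly from the Gamma-function formula or via the standard three-term recursion for Gegenbauer coefficients. The following Python sketch (the function names are ours, not from the paper's R code) implements both so they can be cross-validated:

```python
from math import gamma, floor

def gegenbauer_coef(j, u, d):
    """Explicit formula: sum over k = 0..[j/2] of
    (-1)^k (2u)^(j-2k) Gamma(d + j - k) / (Gamma(d) k! (j-2k)!)."""
    total = 0.0
    for k in range(floor(j / 2) + 1):
        total += ((-1) ** k * (2 * u) ** (j - 2 * k) * gamma(d + j - k)
                  / (gamma(d) * gamma(k + 1) * gamma(j - 2 * k + 1)))
    return total

def gegenbauer_recursion(m, u, d):
    """Same coefficients C_0..C_m via the three-term recursion:
    C_0 = 1, C_1 = 2du,
    C_j = 2u((d-1)/j + 1) C_{j-1} - (2(d-1)/j + 1) C_{j-2}."""
    c = [1.0, 2.0 * d * u]
    for j in range(2, m + 1):
        c.append(2 * u * ((d - 1) / j + 1) * c[j - 1]
                 - (2 * (d - 1) / j + 1) * c[j - 2])
    return c[: m + 1]
```

For $|u| < 1$ and $0 < d < 1/2$ the recursion is the numerically preferable route, since the alternating Gamma-function sum loses precision as $j$ grows.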
Therefore, in this study, we extend the GARMA models to RCGARMA models, as discussed above, by including dynamic random coefficient effects. We assume that the row vector of coefficients $(\psi_1, \psi_2, \ldots)$ in Equation (2), which gives the impact of the time-varying variables $X_{t-j}$ on $X_t$ in Equation (1) and is initially presumed to be a fixed and time-invariant parameter vector, does change over time. In this scenario, the parameter vector can be partitioned as $\psi_t = \psi + b_t$, where $\psi$ is constant, reflecting the fixed coefficients, and $\{b_t\}$ denotes the sequence of random coefficients related to the nuisance parameters that can be omitted from the model. To identify this model, we assume that $E(b_t) = 0$ and $\mathrm{Var}(b_t) = \Sigma_b$. It is also assumed that $b_t$ is uncorrelated with $b_s$, and that $b_t$ is uncorrelated with $\varepsilon_s$, for all $t \ne s$. We use $\Theta$ to represent the set of all model coefficients; then, the general model (Equation (1)) becomes:

$$X_t = \sum_{j=1}^{\infty} \left( \psi_j + b_{j,t} \right) X_{t-j} + \varepsilon_t. \qquad (3)$$
The concept of fixed and random coefficient time-series models has been applied to test for the presence of random coefficients in autoregressive models [18] and to handle possible nonlinear features of real-world data [19]. Another relevant reference is the study by Mundlak [20], which introduced the dynamic random effects model for panel data analysis. Mundlak's work highlights the importance of incorporating time-varying random effects into panel data.
In this context, it can be implicitly assumed that truncating the right-hand side of Equation (3) at lag m is valid for n = 0, 1, …, m. Under this assumption, the RCGARMA model should be accurately formulated as follows:

$$X_t = \left( \Theta + b_t \right)^{\top} \mathbf{X}_{t-1} + \varepsilon_t, \qquad (4)$$

where:
- the sequence $\{X_t\}$ represents the truncated Gegenbauer process, with its behavior contingent on the selected finite truncation lag order, denoted by $m$, and $\Theta = (\psi_1, \ldots, \psi_m)^{\top}$ collects the fixed coefficients. This concept draws parallels to the MA approximation presented by Dissanayake et al. [21], along with comprehensive literature reviews on diverse issues in long-memory time series, encompassing Gegenbauer processes and their associated properties.
- $b_t = (b_{1,t}, \ldots, b_{m,t})^{\top}$ is an $m \times 1$ vector of random coefficients.
- $\mathbf{X}_{t-1} = (X_{t-1}, \ldots, X_{t-m})^{\top}$ is an $m \times 1$ vector of past observations.
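To make the truncated formulation concrete, the following Python sketch (our own illustrative code and parameter values, not the paper's R implementation) simulates a path of such a process with i.i.d. Gaussian random coefficients and noise:

```python
import numpy as np

def simulate_rcgarma(n, psi, sigma_b, sigma_eps, seed=0):
    """Simulate X_t = (psi + b_t)' (X_{t-1}, ..., X_{t-m})' + eps_t,
    with b_t ~ N(0, sigma_b^2 I_m) drawn freshly at each step and
    eps_t ~ N(0, sigma_eps^2); the m pre-sample values are set to zero."""
    rng = np.random.default_rng(seed)
    psi = np.asarray(psi, dtype=float)
    m = len(psi)
    x = np.zeros(n + m)
    for t in range(m, n + m):
        b_t = rng.normal(0.0, sigma_b, size=m)   # random coefficient perturbation
        past = x[t - m:t][::-1]                  # (X_{t-1}, ..., X_{t-m})
        x[t] = (psi + b_t) @ past + rng.normal(0.0, sigma_eps)
    return x[m:]
```

Stability requires the fixed coefficients and the random-coefficient variance to be jointly small enough; with, for example, psi = (0.5, -0.2) and sigma_b = 0.05, the generated paths remain stable.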
Concerning the estimation of the unknown parameters of interest, we define, alongside $\Theta$, the vector of variance components. Under the assumption that the random sequences $\{b_t\}$ are allowed to exhibit correlations with the error process $\{\varepsilon_t\}$, we represent these components as follows: $\Sigma_b$ is the $m \times m$ matrix representing the variance of $b_t$; $\sigma_{b\varepsilon}$ is the $m \times 1$ vector representing the covariance between $b_t$ and $\varepsilon_t$; and $\sigma_{\varepsilon}^2$ is the variance of the error process $\{\varepsilon_t\}$.
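Given these components, the one-step conditional variance of the process, in the notation used here, is the quadratic form $V_t = \mathbf{X}_{t-1}^{\top} \Sigma_b \mathbf{X}_{t-1} + 2\,\mathbf{X}_{t-1}^{\top} \sigma_{b\varepsilon} + \sigma_{\varepsilon}^2$, obtained by expanding $\mathrm{Var}\left( (\Theta + b_t)^{\top} \mathbf{X}_{t-1} + \varepsilon_t \mid \mathcal{F}_{t-1} \right)$. A minimal Python helper (our own naming) evaluates it:

```python
import numpy as np

def conditional_variance(z, sigma_b, sigma_be, sigma2_eps):
    """V_t = z' Sigma_b z + 2 z' sigma_be + sigma2_eps for the
    lag vector z = (X_{t-1}, ..., X_{t-m})'."""
    z = np.asarray(z, dtype=float)
    return float(z @ np.asarray(sigma_b) @ z
                 + 2.0 * z @ np.asarray(sigma_be) + sigma2_eps)
```

With uncorrelated effects ($\sigma_{b\varepsilon} = 0$) the formula reduces to $\mathbf{X}_{t-1}^{\top} \Sigma_b \mathbf{X}_{t-1} + \sigma_{\varepsilon}^2$, which is precisely the heteroskedastic structure that motivates the weighted estimator considered later.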
The ordinary least squares (OLS) method is commonly used in this case. It estimates the parameters by minimizing the sum of squared differences between the observed and predicted values, and it assumes independent, homoskedastic, and normally distributed errors.
However, the assumptions of our model do not align with those of ordinary least squares estimation. While OLS assumes independence and homoskedasticity of errors, our model may exhibit heteroskedasticity and correlation structures due to the presence of random effects. Additionally, the errors in our model are not strictly bound to a normal distribution.
To address these issues, we employ an estimation procedure using conditional least squares (CLS) and weighted least squares (WLS) estimators. CLS adjusts for heteroskedasticity by incorporating the conditional variance structure into the estimation process, while WLS provides more robust estimates than OLS by assigning weights to observations based on their variances. This methodology is based on the work of Hwang and Basawa [22], which has demonstrated favorable performance for generalized random coefficient autoregressive processes. The studies by Nicholls and Quinn [5] and Hwang and Basawa [23] are also relevant in this context.
By implementing these procedures, one can obtain significant estimates for random effects. This paper begins by assessing the parameters using the conditional least squares estimation method in
Section 2.
Section 3 introduces an alternative estimator based on the weighted conditional least squares estimation method. Following this,
Section 4 and
Section 5 compare the performance of these methods using simulation data and real-world data, respectively.
2. Conditional Least Squares Estimation Method
The conditional least squares estimation method is a flexible technique that offers consistency and asymptotic normality under specific conditions. Consistency ensures that the estimated coefficients approach the true values as more data are obtained, while asymptotic normality indicates that, as the sample size increases, the distribution of the estimators approximates a normal distribution. By minimizing the sum of squared deviations, we find the best-fitting curve representing the relationship between the variables in the data, making this a widely used and reliable technique in statistical analysis.
The utilization of conditional least squares estimation in the context of GARMA models with random effects provides a robust approach to estimating model parameters while accounting for the inherent uncertainties and complexities introduced by random effects. Conditional least squares estimation in random effects GARMA models stands out for its capability to handle issues stemming from unobserved heterogeneity and time-varying dynamics [24]. This approach enables researchers to discern the influence of random effects on model parameters, thereby enhancing the interpretability and robustness of the estimated coefficients [25].
Moreover, the interpretability of the results obtained through CLS estimation in GARMA models with random effects is augmented by the incorporation of a conditional framework. By conditioning the estimation on information present at each time point, researchers can acquire a deeper understanding of the temporal progression of model parameters and their individual impacts on the observed data [26].
In this section, we use the conditional least squares method to estimate the unknown parameters in the RCGARMA model. Viewing it as a regression model with predictor variable $\mathbf{X}_{t-1}$ and response variable $X_t$, least squares estimation involves minimizing the sum of squares of the differences. The initial step involves estimating the vector of mean parameters $\Theta$ of the regression function (see Equation (4)).
2.1. Estimation of Parameters
In this section, we discuss the estimation of $\Theta$ through the conditional least squares estimation procedure. The estimator, denoted as $\hat{\Theta}_n$, represents the optimal selection of values for our parameters and is calculated using a sample $(X_1, \ldots, X_n)$. When performing conditional least squares estimation, the objective is to minimize the following conditional sum of squares:

$$S_n(\Theta) = \sum_{t=m+1}^{n} \left( X_t - \Theta^{\top} \mathbf{X}_{t-1} \right)^2, \qquad (5)$$

with respect to the vector $\Theta = (\psi_1, \ldots, \psi_m)^{\top}$. This is achieved by solving the system $\partial S_n(\Theta) / \partial \Theta = 0$. Replacing $\Theta$ in the last equation with $\hat{\Theta}_n$ yields the following result:

$$\hat{\Theta}_n = \left( \sum_{t=m+1}^{n} \mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top} \right)^{-1} \sum_{t=m+1}^{n} \mathbf{X}_{t-1} X_t.$$
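In matrix form, the minimizer is obtained from the normal equations of a regression of $X_t$ on the lag vector. A short Python sketch of this computation (our own code; the paper's experiments use R):

```python
import numpy as np

def cls_estimate(x, m):
    """Conditional least squares for X_t = theta' (X_{t-1},...,X_{t-m})' + u_t:
    minimize sum_t (X_t - theta' Z_t)^2 by solving the normal equations."""
    n = len(x)
    # Row for time t holds (X_{t-1}, ..., X_{t-m}); column j is x[t-j].
    Z = np.column_stack([x[m - j : n - j] for j in range(1, m + 1)])
    y = x[m:]
    theta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return theta_hat
```

On a long simulated series with fixed coefficients, the estimates concentrate around the true values, which is the consistency property discussed above.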
The derivation of the asymptotic properties of our estimator $\hat{\Theta}_n$ relies on the following standard conditions:
- (C.0)
The square matrix $V = E\left( \mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top} \right)$ must have full rank.
- (C.1)
The stationary distribution of $\{X_t\}$ must have a fourth-order moment, meaning that $E(X_t^4) < \infty$.
In time-series analysis, condition (C.0) is crucial for the estimation process: it essentially ensures that the information provided by the data is sufficient and not redundant, allowing for accurate estimation of the parameters and facilitating reliable inference and prediction. Condition (C.1) requires the stationary distribution of $\{X_t\}$ to exhibit a fourth-order moment, i.e., $E(X_t^4) < \infty$. Assuming that these two conditions are verified, we have the following theorem, which describes the limit distribution of the estimator of $\Theta$.
Theorem 1. Let $\hat{\Theta}_n$ be the estimator of Θ.
Under (C.0) and (C.1), we have:

$$\sqrt{n} \left( \hat{\Theta}_n - \Theta \right) \xrightarrow{d} N\left( 0,\; V^{-1} W V^{-1} \right),$$

where $V = E\left( \mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top} \right)$ and $W = E\left( u_t^2\, \mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top} \right)$, with $u_t = X_t - \Theta^{\top} \mathbf{X}_{t-1}$.

Proof. To prove the result of this theorem, consider $M_n = \sum_{t=m+1}^{n} u_t \mathbf{X}_{t-1}$, whose $j$th component is $\sum_{t} u_t X_{t-j}$, $j$ = 1, …, $m$. Given that $\{u_t \mathbf{X}_{t-1}\}$ is a stationary ergodic zero-mean martingale difference sequence, we apply the central limit theorem for stationary ergodic martingale differences proposed by Billingsley [27]:

$$n^{-1/2} M_n \xrightarrow{d} N(0, W). \qquad (7)$$

Furthermore, by the ergodic theorem, we have

$$n^{-1} \sum_{t=m+1}^{n} \mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top} \xrightarrow{a.s.} V. \qquad (8)$$

Finally, from (7) and (8), we find that

$$\sqrt{n} \left( \hat{\Theta}_n - \Theta \right) = \left( n^{-1} \sum_{t=m+1}^{n} \mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top} \right)^{-1} n^{-1/2} M_n \xrightarrow{d} N\left( 0,\; V^{-1} W V^{-1} \right),$$

which proves the result of Theorem 1. □
In the following section, we calculate the estimators for the variance component parameters of the RCGARMA model described earlier. These parameters define the variance of the random errors, the covariances between the errors and the random effects, and the variance matrix of the random effects in the model. The covariance matrix is typically expressed as an unknown linear combination of known cofactor matrices, as shown below.
2.2. Covariance Parameter Estimators
Let $\eta$ be the unknown parameter vector containing all model parameters representing the variance components. We partition the parameter vector into three sub-vectors: $\eta_1$ includes all parameters related to the time-invariant variance of the errors, $\eta_2$ includes all covariance parameters in the RCGARMA process, and $\eta_3$ includes the parameters for the matrix elements of the variance of the random effects in the RCGARMA process. Specifically, it can be noted as:

$$\eta = \left( \eta_1, \eta_2^{\top}, \eta_3^{\top} \right)^{\top} = \left( \sigma_{\varepsilon}^2,\; \sigma_{b\varepsilon}^{\top},\; \mathrm{vech}(\Sigma_b)^{\top} \right)^{\top},$$

where $\mathrm{vech}(\Sigma_b)$ stacks the distinct elements $\Sigma_b(i,j)$, and $\Sigma_b(i,j)$ denotes the ($i$, $j$)th element of the variance matrix of $b_t$, $i, j = 1, \ldots, m$.
To estimate the variance component parameters in the RCGARMA model using the least squares estimator, assume the available dataset is $(X_1, \ldots, X_n)$, and note that the conditional mean and conditional variance of $X_t$ are, respectively, given as follows:

$$E\left( X_t \mid \mathcal{F}_{t-1} \right) = \Theta^{\top} \mathbf{X}_{t-1}, \qquad V_t = \mathrm{Var}\left( X_t \mid \mathcal{F}_{t-1} \right) = \mathbf{X}_{t-1}^{\top} \Sigma_b \mathbf{X}_{t-1} + 2\, \mathbf{X}_{t-1}^{\top} \sigma_{b\varepsilon} + \sigma_{\varepsilon}^2,$$

where $\mathcal{F}_{t-1}$ denotes the $\sigma$-field generated by the past observations.
The conditional least squares estimator is used to estimate the parameters $\eta$. This method allows researchers to decompose the total variability observed in the data into different components, including the variance contributed by the random model effects. A study by Ngatchou-Wandji [28] applied conditional least squares to estimate the parameters in a class of heteroscedastic time-series models, demonstrating its consistency and asymptotic normality.
By utilizing conditional least squares to estimate the variance components in the RCGARMA model (e.g., GARMA models with random effects), the estimator, denoted as $\hat{\eta}_n$, is acquired by minimizing the sum of squares represented by

$$\tilde{S}_n(\eta) = \sum_{t=m+1}^{n} \left( u_t^2 - V_t(\eta) \right)^2,$$

where $u_t$ is the residual from the conditional mean, as in Equation (5), evaluated at $\hat{\Theta}_n$. This process allows us to derive the conditional least squares estimator $\hat{\eta}_n$, obtained as a solution to the equation represented by:

$$\frac{\partial \tilde{S}_n(\eta)}{\partial \eta} = 0.$$

The derivation of the asymptotic properties of this estimator relies on the following standard conditions: (C.0), as described above, and (C.2), which states that the stationary distribution of $\{X_t\}$ must have an eighth-order moment, meaning that $E(X_t^8) < \infty$.
Then, it can be shown that $\hat{\eta}_n$ converges in distribution to a multivariate normal distribution, as given in the theorem below.
Theorem 2. Under (C.0) and (C.2), we have:

$$\sqrt{n} \left( \hat{\eta}_n - \eta \right) \xrightarrow{d} N\left( 0,\; U^{-1} R\, U^{-1} \right),$$

where $U = E\left( \frac{\partial V_t}{\partial \eta} \frac{\partial V_t}{\partial \eta^{\top}} \right)$ and $R = E\left( \left( u_t^2 - V_t \right)^2 \frac{\partial V_t}{\partial \eta} \frac{\partial V_t}{\partial \eta^{\top}} \right)$.

3. Weighted Conditional Least Squares Estimation Method
The weighted conditional least squares estimation method for generalized autoregressive moving-average (GARMA) models with random effects is a sophisticated statistical technique used to improve efficiency in the presence of heteroskedasticity. The ordinary least squares estimation discussed above may not be the optimal choice when dealing with data exhibiting varying error variances, as it does not take this heteroskedasticity into account. In such cases, a more suitable approach involves employing a conditional weighted least squares estimator for the parameter of interest, denoted as $\Theta$. By incorporating weighting factors into the estimation process, this method assigns different weights to observations based on the variability of their error terms.
This allows for a more precise estimation of the parameters in the presence of heteroscedasticity in GARMA models, ultimately leading to more reliable statistical inference. This approach is particularly crucial when faced with the heteroskedasticity in the data reflected in Equation (3).
The recognition of this heteroskedasticity highlights the need for more advanced estimation techniques to ensure the accuracy and reliability of the statistical analysis. This section concentrates on implementing this approach. We seek to enhance efficiency by employing a conditional weighted least squares estimator for $\Theta$.
Assuming that the nuisance parameter $\eta$ is known, the weighted conditional least squares estimator $\tilde{\Theta}_n$ of $\Theta$ is obtained by minimizing:

$$S_n^{w}(\Theta) = \sum_{t=m+1}^{n} \frac{\left( X_t - \Theta^{\top} \mathbf{X}_{t-1} \right)^2}{V_t}.$$

Since $E\left( u_t \mid \mathcal{F}_{t-1} \right) = 0$ and $E\left( u_t^2 \mid \mathcal{F}_{t-1} \right) = V_t$, the estimator $\tilde{\Theta}_n$ is given by:

$$\tilde{\Theta}_n = \left( \sum_{t=m+1}^{n} \frac{\mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top}}{V_t} \right)^{-1} \sum_{t=m+1}^{n} \frac{\mathbf{X}_{t-1} X_t}{V_t}. \qquad (9)$$
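Computationally, the weighted estimator only changes the normal equations by the factors $1/V_t$. A Python sketch (our own code, with the conditional variances passed in as known quantities):

```python
import numpy as np

def wls_estimate(x, m, variances):
    """Weighted CLS: minimize sum_t (X_t - theta' Z_t)^2 / V_t,
    with V_t > 0 supplied in `variances` (one per usable observation)."""
    n = len(x)
    Z = np.column_stack([x[m - j : n - j] for j in range(1, m + 1)])
    y = x[m:]
    w = 1.0 / np.asarray(variances, dtype=float)
    A = Z.T @ (Z * w[:, None])   # sum_t Z_t Z_t' / V_t
    b = Z.T @ (y * w)            # sum_t Z_t X_t / V_t
    return np.linalg.solve(A, b)
```

When all $V_t$ are equal, the weighted estimator coincides with the unweighted conditional least squares estimator, which provides a convenient sanity check on an implementation.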
Consider the following conditions:
- (C.3)
$E\left( \dfrac{\|\mathbf{X}_{t-1}\|^2}{V_t} \right) < \infty$.
- (C.4)
The differentiability of $V_t(\cdot)$ at $\eta$ is established as follows:
There exists a linear map L such that:

$$V_t(\eta') - V_t(\eta) = L\left( \eta' - \eta \right) + o\left( \| \eta' - \eta \| \right) \quad \text{as } \eta' \to \eta.$$

Then, we show that $\tilde{\Theta}_n$ converges in distribution in the theorem below.
Theorem 3. Under (C.0), (C.1), and (C.3), we have:

$$\sqrt{n} \left( \tilde{\Theta}_n - \Theta \right) \xrightarrow{d} N\left( 0,\; V_w^{-1} \right),$$

where $V_w = E\left( \dfrac{\mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top}}{V_t} \right)$.

Proof. Using the ergodic theorem, we find that the first factor of (9) converges strongly:

$$n^{-1} \sum_{t=m+1}^{n} \frac{\mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top}}{V_t} \xrightarrow{a.s.} V_w. \qquad (10)$$

The next step is to check the convergence of the second factor of (9).
Let $D_t = \dfrac{u_t \mathbf{X}_{t-1}}{V_t}$. We have $E\left( D_t \mid \mathcal{F}_{t-1} \right) = 0$, and under (C.3), $E\left( D_t D_t^{\top} \right) = V_w < \infty$.
So, via the martingale central limit theorem of Billingsley [27]:

$$n^{-1/2} \sum_{t=m+1}^{n} D_t \xrightarrow{d} N\left( 0, V_w \right). \qquad (11)$$

Finally, from Equations (10) and (11), we have:

$$\sqrt{n} \left( \tilde{\Theta}_n - \Theta \right) \xrightarrow{d} N\left( 0,\; V_w^{-1} V_w V_w^{-1} \right) = N\left( 0,\; V_w^{-1} \right).$$

□
When $\eta$ is unknown, we replace it in $V_t$ with $\hat{\eta}_n$. Then, we denote $\hat{V}_t = V_t(\hat{\eta}_n)$ and write $\hat{\Theta}_n^{w}$ for the resulting weighted estimator. Herein, we give the limit distribution of $\hat{\Theta}_n^{w}$.
Theorem 4. Under (C.0) and (C.2)–(C.4), we have:

$$\sqrt{n} \left( \hat{\Theta}_n^{w} - \Theta \right) \xrightarrow{d} N\left( 0,\; V_w^{-1} \right).$$

Proof. Write

$$\hat{A}_n = n^{-1} \sum_{t=m+1}^{n} \frac{\mathbf{X}_{t-1} \mathbf{X}_{t-1}^{\top}}{\hat{V}_t} \quad \text{and} \quad \hat{B}_n = n^{-1/2} \sum_{t=m+1}^{n} \frac{u_t \mathbf{X}_{t-1}}{\hat{V}_t}.$$

First, we show that $\hat{A}_n$ has the same limit as the known-$\eta$ version. Using Theorem 2 and under (C.4), we find that:

$$\hat{A}_n \xrightarrow{p} V_w. \qquad (12)$$

Next, we show that the same holds for $\hat{B}_n$. Under (C.4) and Theorem 2, we find that:

$$\hat{B}_n \xrightarrow{d} N\left( 0, V_w \right). \qquad (13)$$

Using Equations (12) and (13) and the equality $\sqrt{n} \left( \hat{\Theta}_n^{w} - \Theta \right) = \hat{A}_n^{-1} \hat{B}_n$, and finally, from Slutsky's theorem [29], we prove that:

$$\sqrt{n} \left( \hat{\Theta}_n^{w} - \Theta \right) \xrightarrow{d} N\left( 0,\; V_w^{-1} \right).$$

□
4. Comparison of Methods by Simulation
A simulation was designed to compare the performance of the two estimators (the conditional least squares estimator and the weighted conditional least squares estimator) in a GARMA model with random effects (RCGARMA). We investigated the behavior of the proposed approximate ratio for these two estimators using the following expression:

$$\mathrm{Ratio} = \frac{\mathrm{MSE}\left( \hat{\Theta}_n \right)}{\mathrm{MSE}\left( \tilde{\Theta}_n \right)},$$

where $\hat{\Theta}_n$ and $\tilde{\Theta}_n$ are, respectively, the conditional least squares estimator and the weighted conditional least squares estimator. R (Version R-4.4.0) programs were used to generate the data from the GARMA(1,0) model with random coefficients (we utilized the code provided in Appendix A), defined as follows:
where the model parameters (the Gegenbauer parameters $u$ and $d$, the noise variance $\sigma_{\varepsilon}^2$, and the variance components of the random coefficients) were each assigned fixed numerical values for each simulation run.
In this study, we performed simulations and created the essential tables and graphs. To accomplish this, realizations were generated for several sample sizes, for various values of $d$ (ranging from 0.1 to 0.5) and $u$ (ranging from −0.8 to 0.8). Additionally, the truncation lag m was fixed, and 100 replications were conducted. Replication proved somewhat challenging due to the complexity of the techniques involved; the models contained numerous parameters, making convergence difficult to achieve.
Each section of Table 1 presents the ratio of mean squared errors for different combinations of $d$ and $u$ with the same series length n. These findings are also depicted in the four panels of Figure 1.
The ratio of mean squared errors serves as a comparative measure, where values exceeding 1 (>1) indicate that the second estimator performs better than the first. This comparison enables an assessment of the relative performance between the two estimators.
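Operationally, the ratio is computed from the replicated parameter estimates. A small Python helper makes the comparison explicit (illustrative code with made-up numbers in the check; the paper's simulations use R):

```python
import numpy as np

def mse_ratio(cls_estimates, wls_estimates, true_value):
    """Ratio MSE(CLS)/MSE(WLS) over replications;
    values greater than 1 favor the weighted estimator."""
    mse_cls = np.mean((np.asarray(cls_estimates) - true_value) ** 2)
    mse_wls = np.mean((np.asarray(wls_estimates) - true_value) ** 2)
    return mse_cls / mse_wls
```

Each cell of a table like Table 1 corresponds to one call of this function on the 100 replicated estimates for a given parameter combination.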
When analyzing the results presented in Table 1 and visualized in Figure 1, it is evident that all ratios of mean squared errors surpassed the threshold of 1. This observation underscores the superior efficiency of the weighted conditional least squares estimator over the conditional least squares estimator across various parameter combinations.
These findings reinforce the validity and robustness of the simulation study, providing empirical evidence in support of the enhanced performance of the weighted estimator in comparison to its unweighted counterpart.