Heteroskedasticity of Unknown Form in Spatial Autoregressive Models with a Moving Average Disturbance Term

Doğan, Osman

doi:10.3390/econometrics3010101


by Osman Doğan

Program in Economics, The Graduate School and University Center, The City University of New York, New York, NY 10016, USA

Econometrics 2015, 3(1), 101-127; https://doi.org/10.3390/econometrics3010101

Submission received: 14 October 2014 / Accepted: 27 January 2015 / Published: 26 February 2015

(This article belongs to the Special Issue Spatial Econometrics)


Abstract

In this study, I investigate the necessary condition for the consistency of the maximum likelihood estimator (MLE) of spatial models with a spatial moving average process in the disturbance term. I show that the MLE of spatial autoregressive and spatial moving average parameters is generally inconsistent when heteroskedasticity is not considered in the estimation. I also show that the MLE of parameters of exogenous variables is inconsistent and determine its asymptotic bias. I provide simulation results to evaluate the performance of the MLE. The simulation results indicate that the MLE imposes a substantial amount of bias on both autoregressive and moving average parameters.

Keywords:

spatial dependence; spatial moving average; spatial autoregressive; maximum likelihood estimator; MLE; asymptotics; heteroskedasticity; SARMA(1,1)

JEL classifications:

C13; C21; C31

1. Introduction

The spatial dependence among the disturbance terms of a spatial model is generally assumed to take the form of a spatial autoregressive process. The spatial model that has a spatial lag in the dependent variable and an autoregressive process in the disturbance term is known as the SARAR model. The main characteristic of an autoregressive process is that the effect of a location-specific shock is transmitted to all other locations, with its effects gradually fading away for higher-order neighbors. The spatial autoregressive process may not be appropriate if there is strong evidence of the localized transmission of shocks. That is, the autoregressive process is not the correct specification when the effects of shocks are contained within a small region and are not transmitted to other regions. An alternative to an autoregressive process is a spatial moving average process, where the effects of shocks are more localized. Haining [1], Anselin [2] and, more recently, Hepple [3] and Fingleton [4,5] consider a spatial moving average process for the disturbance terms. The spatial model that contains a spatial lag of the dependent variable and a spatial moving average process for the disturbance term is known as the SARMA model.

In the literature, various estimation methods have been proposed [6,7,8,9,10,11,12,13,14,15,16]. The ML method is the best known and most commonly used estimator for both SARAR and SARMA specifications. Lee [11] establishes the first-order asymptotic properties of the MLE for the case of SARAR(1,0). Generalized method of moments (GMM) estimators are also considered for the estimation of spatial models. Kelejian and Prucha [6,7] suggest a two-step GMM estimator (GMME) for the SARAR(1,1) specification. One disadvantage of the two-step GMME is that it is usually inefficient relative to the MLE [10,17,18].

To increase efficiency, Lee [12], Liu et al. [10] and Lee and Liu [13] formulate one-step GMMEs based on a set of moment functions involving linear and quadratic moment functions. In this approach, the reduced form of spatial models motivates the formulation of moment functions. The reduced equations indicate that the endogenous variable, i.e., the spatial lag term, is a function of a stochastic and a non-stochastic term. The linear moment functions are based on the orthogonality condition between the non-stochastic term and the disturbance term, while the quadratic moment functions are formulated for the stochastic term. Then, the parameter vector is estimated simultaneously with a one-step GMME. Lee [12] shows that the one-step GMME can be asymptotically equivalent to the MLE when disturbance terms are i.i.d. normal. In the case where disturbances are simply i.i.d., Liu et al. [10] and Lee and Liu [13] suggest a one-step GMME that can be more efficient than the (quasi) MLE.

Fingleton [4,5] extends the two-step GMME suggested by Kelejian and Prucha [6,7] to spatial models that have a moving average process in the disturbance term, i.e., SARMA(1,1). Baltagi and Liu [19] modify the moment functions considered in Fingleton [4] in the manner of Arnold and Wied [20] and suggest a GMME for the case of SARMA(0,1). The spatial moving average parameter in both Fingleton [4] and Baltagi and Liu [19] is estimated by a non-linear least squares estimator (NLSE). The asymptotic distribution of the NLSE of the spatial moving average parameter is not provided in either Fingleton [4] or Baltagi and Liu [19]. Recently, Kelejian and Prucha [9] and Drukker et al. [21] provided a basic theorem regarding the asymptotic distribution of their estimator under fairly general conditions. The estimation approach suggested in Kelejian and Prucha [9] and Drukker et al. [21] can easily be adapted for the estimation of the SARMA(1,1) and SARMA(0,1) models. Finally, although the Kelejian and Prucha approach in Fingleton [4] and Baltagi and Liu [19] has computational advantages, it may be inefficient relative to the ML method¹.

In the presence of an unknown form of heteroskedasticity, Lin and Lee [22] show that the MLE for the case of SARAR(1,0) may not be consistent, as the log-likelihood function is not maximized at the true parameter vector. They suggest a robust GMME for the SARAR(1,0) specification by modifying the moment functions considered in Lee [12]. Likewise, Kelejian and Prucha [9] modify the moment functions of their previous two-step GMME to allow for an unknown form of heteroskedasticity.

The spatial moving average model introduces a different interaction structure. Therefore, it is of interest to investigate the implications of a moving average process for estimation and testing issues. In this paper, I investigate the effect of heteroskedasticity on the MLE for the case of SARMA(1,1) and SARMA(0,1) along the lines of Lin and Lee [22]. The analytical results show that when heteroskedasticity is not considered in the estimation, the necessary condition for the consistency of the MLE is generally not satisfied for both the SARMA(1,1) and SARMA(0,1) models. For the SARMA(1,1) specification, I also show that the MLE of the other parameters is inconsistent, and I determine its asymptotic bias. My simulation results indicate that the MLE of the spatial autoregressive and moving average parameters is substantially biased. However, the simulation results also show that the MLE of the other parameters has a negligible amount of bias in large samples.

The rest of this paper is organized as follows. In Section 2, I specify the SARMA(1,1) model in more detail and list assumptions that are required for the asymptotic analysis. In Section 3, I briefly discuss the implications of the spatial processes proposed for the disturbance term in the literature. Section 4 investigates the necessary condition for the consistency of the MLE of the autoregressive and moving average parameters. Section 5 provides expressions for the asymptotic bias of the MLE of parameters of the exogenous variables. Section 6 contains a small Monte Carlo simulation. Section 7 closes with concluding remarks.

2. Model Specification and Assumptions

In this study, the following first order SARMA(1,1) specification is considered:

Y_{n} = λ_{0} W_{n} Y_{n} + X_{n} β_{0} + u_{n}, u_{n} = ε_{n} - ρ_{0} M_{n} ε_{n}

(1)

where $Y_{n}$ is an $n \times 1$ vector of observations of the dependent variable, $X_{n}$ is an $n \times k$ matrix of non-stochastic exogenous variables with an associated $k \times 1$ vector of population coefficients $β_{0}$, $W_{n}$ and $M_{n}$ are $n \times n$ spatial weight matrices of known constants with zero diagonal elements, and $ε_{n}$ is an $n \times 1$ vector of disturbances. The variables $W_{n} Y_{n}$ and $M_{n} ε_{n}$ are known as the spatial lag of the dependent variable and the spatial lag of the disturbance term, respectively. The spatial effect parameters $λ_{0}$ and $ρ_{0}$ are known as the spatial autoregressive and moving average parameters, respectively. As the spatial data are characterized with triangular arrays, the variables in Equation (1) have subscript $n$². The model specifications with $λ_{0} \neq 0$, $ρ_{0} \neq 0$ and with $λ_{0} = 0$, $ρ_{0} \neq 0$ are known, respectively, as SARMA(1,1) and SARMA(0,1) in the literature. Let Θ be the parameter space of the model. In order to distinguish the true parameter vector from other possible values in Θ, the model is stated with the true parameter vector $θ_{0} = (β_{0}', δ_{0}')'$ with $δ_{0} = (λ_{0}, ρ_{0})'$.

For notational simplicity, I denote $S_{n}(λ) = (I_{n} - λ W_{n})$, $R_{n}(ρ) = (I_{n} - ρ M_{n})$, $G_{n}(λ) = W_{n} S_{n}^{-1}(λ)$, $H_{n}(ρ) = M_{n} R_{n}^{-1}(ρ)$, $\bar{X}_{n}(ρ) = R_{n}^{-1}(ρ) X_{n}$, and $\bar{G}_{n}(δ) = R_{n}^{-1}(ρ) G_{n}(λ) R_{n}(ρ)$. Furthermore, at the true parameter values $(ρ_{0}, λ_{0})$, I denote $S_{n}(λ_{0}) = S_{n}$, $R_{n}(ρ_{0}) = R_{n}$, $G_{n}(λ_{0}) = G_{n}$, $H_{n}(ρ_{0}) = H_{n}$, $\bar{X}_{n}(ρ_{0}) = \bar{X}_{n}$ and $\bar{G}_{n}(δ_{0}) = \bar{G}_{n}$.
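The notation above maps onto a handful of matrix operations. The following is a minimal NumPy sketch, not the author's code: the function name `sarma_matrices`, the small circular weight matrix and all parameter values are illustrative assumptions. It builds the matrices defined in this section and generates $Y_{n}$ from Equation (1).

```python
import numpy as np

def sarma_matrices(W, M, X, lam, rho):
    """Return S, R, G, H, X_bar and G_bar as defined in Section 2."""
    n = W.shape[0]
    I = np.eye(n)
    S = I - lam * W                      # S_n(lambda)
    R = I - rho * M                      # R_n(rho)
    G = W @ np.linalg.inv(S)             # G_n(lambda) = W_n S_n^{-1}(lambda)
    H = M @ np.linalg.inv(R)             # H_n(rho)    = M_n R_n^{-1}(rho)
    X_bar = np.linalg.solve(R, X)        # R_n^{-1}(rho) X_n
    G_bar = np.linalg.solve(R, G @ R)    # R_n^{-1}(rho) G_n(lambda) R_n(rho)
    return S, R, G, H, X_bar, G_bar

# Hypothetical example: 6 units on a circle, each linked to its two neighbours.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
M = W.copy()

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta0, lam0, rho0 = np.array([1.0, 2.0]), 0.4, 0.3   # assumed true values
S, R, G, H, X_bar, G_bar = sarma_matrices(W, M, X, lam0, rho0)

# Generate Y from Equation (1): S Y = X beta + u with u = R eps.
eps = rng.normal(size=n)
Y = np.linalg.solve(S, X @ beta0 + R @ eps)
```

Linear solves are used instead of explicit inverses wherever the inverse itself is not needed; `G` and `H` keep the explicit inverse only to mirror the definitions.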

The model in Equation (1) is considered under the following assumptions.

Assumption 1.

The elements $ε_{ni}$ of the disturbance term $ε_{n}$ are distributed independently with mean zero and variance $σ_{ni}^{2}$, and $E|ε_{ni}|^{ν} < \infty$ for some $ν > 4$ for all n and i.

Assumption 1 requires the elements of the disturbance term to have finite moments of order higher than four. This moment condition is required for the application of the central limit theorem for quadratic forms given in Kelejian and Prucha [9]. In addition, the variance of a quadratic form in $ε_{n}$ exists and is finite when the first four moments are finite. Finally, Liapunov’s inequality guarantees that the moments of order less than ν are also uniformly bounded for all n and i.

Assumption 2.

The spatial weight matrices $M_{n}$ and $W_{n}$ are uniformly bounded in absolute value in row and column sums. Moreover, $S_{n}^{-1}$, $S_{n}^{-1}(λ)$, $R_{n}^{-1}$ and $R_{n}^{-1}(ρ)$ exist and are uniformly bounded in absolute value in row and column sums for all values of ρ and λ in a compact parameter space.

The uniform boundedness conditions in Assumption 2 keep the spatial autocorrelation in the model at a tractable level [6]³. Assumption 2 also implies that the model in Equation (1) represents an equilibrium relation for the dependent variable. Under this assumption, the reduced form of the model exists and is given by $Y_{n} = S_{n}^{-1} X_{n} β_{0} + S_{n}^{-1} R_{n} ε_{n}$. The uniform boundedness of $S_{n}^{-1}(λ)$ and $R_{n}^{-1}(ρ)$ in Assumption 2 is only required for the MLE, not for the GMME [10]. When $W_{n}$ is row normalized, a closed subset of the interval $(1/λ_{min}, 1)$, where $λ_{min}$ is the smallest eigenvalue of $W_{n}$, can be considered as the parameter space for $λ_{0}$. Analogously, a closed subset of $(1/ρ_{min}, 1)$, where $ρ_{min}$ is the smallest eigenvalue of $M_{n}$, can be the parameter space of $ρ_{0}$ ([15], p. 128)⁴.
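The eigenvalue bounds on the parameter space can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper `rook_weights` and the 5 × 5 lattice are hypothetical choices, not the weight matrices used in the paper. It computes the interval $(1/λ_{min}, 1)$ for a row-normalized rook matrix and tests whether $S_{n}(λ)$ is invertible at a candidate value.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix on a side x side lattice (illustrative helper)."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                A[i, i + side] = A[i + side, i] = 1.0
            if c + 1 < side:
                A[i, i + 1] = A[i + 1, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

W = rook_weights(5)
# This W is similar to a symmetric matrix, so its eigenvalues are real;
# any imaginary parts returned by eigvals are rounding noise.
eig = np.linalg.eigvals(W).real
lam_min, lam_max = eig.min(), eig.max()
lower, upper = 1.0 / lam_min, 1.0 / lam_max   # upper equals 1 for a row-normalised W

def s_is_invertible(lam, tol=1e-10):
    """S_n(lam) = I - lam * W is singular exactly when lam * omega = 1 for some eigenvalue omega."""
    return bool(np.min(np.abs(1.0 - lam * eig)) > tol)

print(f"candidate interval for lambda: ({lower:.3f}, {upper:.3f})")
```

Any closed subset of the printed interval keeps $S_{n}(λ)$ nonsingular; the same check applies to $R_{n}(ρ)$ with the eigenvalues of $M_{n}$.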

The next assumption states the regularity conditions for the exogenous variables.

Assumption 3.

The matrix $X_{n}$ is an $n \times k$ matrix consisting of constant elements that are uniformly bounded. It has full column rank k. Moreover, $\lim_{n \to \infty} \frac{1}{n} X_{n}' X_{n}$ and $\lim_{n \to \infty} \frac{1}{n} \bar{X}_{n}'(ρ) \bar{X}_{n}(ρ)$ exist and are nonsingular for all values of ρ in a compact parameter space.

3. Spatial Processes for the Disturbance Term

In the literature, there are three main parametric processes to model spatial autocorrelation among disturbance terms: (i) the spatial autoregressive process (SAR); (ii) the spatial moving average process (SMA); and (iii) the spatial error components model (SEC). The implied covariance structure is different under each specification. In this section, I describe the transmission and the effect of shocks under each specification. The SAR process is specified as:

u_{n} = ρ_{0} M_{n} u_{n} + ε_{n}

(2)

where $u_{n}$ is an $n \times 1$ vector of regression disturbances and $ε_{n}$ is an $n \times 1$ vector of i.i.d. innovations with variance $σ_{0}^{2}$. Under the assumption of an equilibrium, i.e., $R_{n}$ is invertible, the reduced form of Equation (2) is $u_{n} = R_{n}^{-1} ε_{n}$ with the covariance matrix $E(u_{n} u_{n}') = Ω_{n} = σ_{0}^{2} R_{n}^{-1} R_{n}^{-1'}$. Note that even if the innovations are homoskedastic, the diagonal elements of $Ω_{n}$ are generally not equal, implying heteroskedasticity for the regression disturbances. An expansion of $(I_{n} - ρ_{0} M_{n})^{-1}$ for $|ρ_{0}| < 1$ yields $(I_{n} - ρ_{0} M_{n})^{-1} = \sum_{j=0}^{\infty} ρ_{0}^{j} M_{n}^{j} = I_{n} + ρ_{0} M_{n} + ρ_{0}^{2} M_{n}^{2} + \dots$. Hence, the SAR specification of the disturbance term implies that a shock at location i is transmitted to all other locations. The first term $I_{n}$ implies that the shock at location i directly affects location i and, through the other terms given by the powers of $M_{n}$, affects higher order neighbors. Eventually, the shock feeds back to location i through the interconnectedness of neighbors. Note that $|ρ_{0}| < 1$ ensures that the magnitude of the transmitted shock decreases for the higher orders of neighbors. As a result, the SAR specification allows researchers to model the global transmission of shocks where the full effect of a shock to location i is the sum of the initial shock and the feedback from other locations.
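The series expansion and the implied heteroskedasticity can be verified numerically. The following sketch assumes a row-normalized rook matrix on a 5 × 5 lattice, $ρ_{0} = 0.5$ and $σ_{0}^{2} = 1$; these choices and the helper `rook_weights` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix on a side x side lattice (illustrative helper)."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                A[i, i + side] = A[i + side, i] = 1.0
            if c + 1 < side:
                A[i, i + 1] = A[i + 1, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

M = rook_weights(5)
n = M.shape[0]
rho0 = 0.5                      # assumed value with |rho0| < 1
R_inv = np.linalg.inv(np.eye(n) - rho0 * M)

# Truncated expansion I + rho0 M + rho0^2 M^2 + ...
series = np.zeros((n, n))
term = np.eye(n)
for _ in range(200):
    series += term
    term = rho0 * (term @ M)

# Covariance of u_n under homoskedastic innovations with sigma_0^2 = 1.
Omega = R_inv @ R_inv.T
diag = np.diag(Omega)
print("series matches inverse:", np.allclose(series, R_inv))
print("smallest and largest Var(u_i):", diag.min(), diag.max())
```

On this lattice the diagonal of $Ω_{n}$ is not constant because corner, edge and interior units have different numbers of neighbours, and every entry of $R_{n}^{-1}$ is positive, which is the global transmission described above.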

If a more localized spatial dependence is conjectured for an economic model, then a spatial moving average process (SMA) specification is more suitable [1,3,4,5]. The SMA process is specified as:

u_{n} = ε_{n} - ρ_{0} M_{n} ε_{n}

(3)

where $ρ_{0}$ is the spatial moving average parameter. The reduced form does not involve an inverse of a square matrix. Hence, the transmission of a shock emanating from location i is limited to its immediate neighbors given by the nonzero elements in the i-th row of $M_{n}$. Under this specification, the covariance matrix of $u_{n}$ is $Ω_{n} = σ_{0}^{2} R_{n} R_{n}' = σ_{0}^{2} (I_{n} - ρ_{0} (M_{n} + M_{n}') + ρ_{0}^{2} M_{n} M_{n}')$. The spatial covariance is limited to the nonzero elements of $(M_{n} + M_{n}')$ and $M_{n} M_{n}'$. In comparison with the SAR specification, the range of covariance induced by the SMA model is much smaller.
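The limited range of the SMA covariance can be seen in a small numerical check. The sketch below assumes a row-normalized rook matrix on a 5 × 5 lattice, $ρ_{0} = 0.5$ and $σ_{0}^{2} = 1$ (all illustrative assumptions) and compares the SMA covariance matrix with its SAR counterpart.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix on a side x side lattice (illustrative helper)."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                A[i, i + side] = A[i + side, i] = 1.0
            if c + 1 < side:
                A[i, i + 1] = A[i + 1, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

M = rook_weights(5)
n = M.shape[0]
I = np.eye(n)
rho0 = 0.5                                   # assumed value
R = I - rho0 * M

Omega_sma = R @ R.T                          # sigma_0^2 = 1
expanded = I - rho0 * (M + M.T) + rho0**2 * (M @ M.T)

# Pairs of units that are at most two rook steps apart.
A = (M > 0).astype(int)
within_two = (np.eye(n, dtype=int) + A + A @ A) > 0

R_inv = np.linalg.inv(R)
Omega_sar = R_inv @ R_inv.T                  # SAR counterpart for comparison

print("nonzero covariances, SMA:", np.count_nonzero(Omega_sma))
print("nonzero covariances, SAR:", np.count_nonzero(Omega_sar))
```

Under the SMA process, units more than two steps apart are uncorrelated, whereas the SAR covariance matrix has no zero entries on a connected lattice.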

Kelejian and Robinson [23] suggest another specification, which is called the spatial error components (SEC) model. This specification is similar to the SMA process in the sense that the implied covariance matrix does not involve a matrix inverse. Formally, the SEC model is given by $u_{n} = M_{n} ε_{n} + ϵ_{n}$, where $ε_{n}$ is an $n \times 1$ vector of regional innovations, whereas $ϵ_{n}$ is an $n \times 1$ vector of locational innovations. Assuming that $ε_{n}$ and $ϵ_{n}$ are independent, the variance-covariance matrix becomes $Ω_{n} = σ_{ϵ}^{2} I_{n} + σ_{ε}^{2} M_{n} M_{n}'$, which indicates that the spatial correlation in a SEC specification is even more localized.

There have been some direct attempts to parametrize the covariance matrix of $u_{n}$, rather than defining a process for the disturbance term. For example, Besag [24] considers a conditional first-order autoregressive model (CAR(1)), such that the covariance matrix of $u_{n}$ takes the form of $Ω_{n} = σ_{0}^{2} (I_{n} - ρ_{0} M_{n})^{-1}$, where $M_{n}$ is assumed to be a symmetric contiguity matrix. This covariance structure implies a process of $u_{n} = (I_{n} - ρ_{0} M_{n})^{-1/2} ε_{n}$. As in the case of the SAR process, a shock in a location is transmitted to all other locations, but now with a smaller amplitude. Another example is $Ω_{n} = σ_{0}^{2} (I_{n} + ρ_{0} M_{n})$, where $M_{n}$ is assumed to be symmetric [25,26]. In this case, the spatial correlation is restricted to first order neighbors, i.e., the non-zero elements of $M_{n}$.

The elements of $Ω_{n}$ can also be specified through a covariance generating function. For example, in Ripley [27], the covariance generating function is defined in terms of the distance between two locations in such a way that the resulting covariance matrix is always non-negative definite. Let $d_{ij}$ be the distance between locations i and j and $Ω_{ij,n}$ be the covariance between these two locations. Then, the covariance generating function is defined by:

Ω_{ij,n} = \begin{cases} σ_{0}^{2} \frac{2}{π} [\cos^{-1}(\frac{d_{ij}}{2ψ}) - \frac{d_{ij}}{2ψ} (1 - \frac{d_{ij}^{2}}{4ψ^{2}})^{1/2}], & \text{if } d_{ij} \leq 2ψ \\ 0, & \text{otherwise.} \end{cases}

(4)

Intuitively, $Ω_{ij,n}$ is proportional to the intersection area of two discs of common radius ψ centered on locations i and j. The covariance generating function in Equation (4) depends on the single parameter ψ and decreases almost linearly in $d_{ij}$ [25,27]. Another family of covariance generating functions, first introduced by Whittle in 1954, is a two-parameter family defined in terms of gamma and Bessel functions. This family has the following specification:

Ω_{ij,n} = σ_{0}^{2} [2^{ν - 1} Γ(ν)]^{-1} (δ d_{ij})^{ν} K_{ν}(δ d_{ij})

(5)

where $K_{ν}(\cdot)$ is the modified Bessel function and $Γ(\cdot)$ is the standard gamma function. The parameters $ν > 0$ and $δ > 0$ are respectively known as a shape parameter and a spatial parameter. The spatial parameter δ determines how far the spatial correlation will stretch. For the special case where $ν = \frac{1}{2}$, this covariance generating function gives an exponentially decaying spatial correlation [25]. There is also a more general exponential covariance generating function that depends on two parameters. This function is specified by $Ω_{ij,n} = σ_{0}^{2} γ \exp(λ d_{ij})$, where γ and λ are parameters that need to be estimated. This function also exhibits exponential decay for the spatial correlations.

In the literature, there are some other covariance generating function families. However, the majority of these functions do not necessarily ensure that $Ω_{n}$ is a positive-definite matrix [25,28]. The formal properties of the MLE for spatial models that have a covariance structure determined by a parametric function are investigated in an early study by Mardia and Marshall [29]. In that study, the authors state conditions under which the MLE is consistent and has an asymptotic normal distribution.

In this study, the spatial model specified in Equation (1) is considered. The interaction between the spatial autoregressive process and the moving average process for this model induces a complicated pattern for the transmission of a location-specific shock. Under Assumption 2, the reduced form of the model is given by $Y_{n} = S_{n}^{-1} X_{n} β_{0} + S_{n}^{-1} R_{n} ε_{n}$. The last term in the reduced form can be written as $S_{n}^{-1} R_{n} ε_{n} = ε_{n} - ρ_{0} M_{n} ε_{n} + \sum_{l=1}^{\infty} λ_{0}^{l} W_{n}^{l} ε_{n} - ρ_{0} \sum_{l=1}^{\infty} λ_{0}^{l} W_{n}^{l} M_{n} ε_{n}$. In this representation, the higher powers of $W_{n}$ do not have zero diagonal elements, which, in turn, implies that the total effect of a region-specific shock also contains the feedback effects passed through other locations. The corresponding expression in the case of the SARAR(1,1) specification is given by $S_{n}^{-1} R_{n}^{-1} ε_{n} = \sum_{l=0}^{\infty} λ_{0}^{l} W_{n}^{l} \sum_{k=0}^{\infty} ρ_{0}^{k} M_{n}^{k} ε_{n}$. Again, the induced pattern involves the interaction of two weight matrices and two parameters.

Following Fingleton [4], I illustrate the transmission pattern for a shock under each specification by using a rook weight matrix over a $15 \times 15$ lattice. Figure 1 shows the impact of a shock emanating from the unit located at the center of the lattice⁵. In the case of SAR and SARAR(1,1), the effect of the shock is more vigorous over the whole lattice. For the SMA specification, the shock is only transmitted to the immediate neighbors, as shown in Figure 1b. In contrast, the effect of the shock gradually dies out under the SARMA(1,1) model.

Figure 1. The effect of a shock. (a) The effect of a shock: spatial autoregressive process (SAR). (b) The effect of a shock: spatial moving average process (SMA). (c) The effect of a shock: SARAR(1,1). (d) The effect of a shock: SARMA(1,1).
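The transmission patterns summarized in Figure 1 can be reproduced in outline with a few linear solves. The sketch below is an assumption-laden illustration rather than the code behind the figure: the parameter values $λ_{0} = 0.6$ and $ρ_{0} = 0.3$, the choice $W_{n} = M_{n}$ and the helper `rook_weights` are hypothetical.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix on a side x side lattice (illustrative helper)."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                A[i, i + side] = A[i + side, i] = 1.0
            if c + 1 < side:
                A[i, i + 1] = A[i + 1, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

side = 15
W = M = rook_weights(side)
n = side * side
lam0, rho0 = 0.6, 0.3            # assumed values, not taken from the paper
I = np.eye(n)
S, R = I - lam0 * W, I - rho0 * M

# Unit shock at the centre of the lattice.
centre = (side // 2) * side + side // 2
shock = np.zeros(n)
shock[centre] = 1.0

effects = {
    "SAR": np.linalg.solve(R, shock),                               # R^{-1} eps
    "SMA": R @ shock,                                               # R eps
    "SARAR(1,1)": np.linalg.solve(S, np.linalg.solve(R, shock)),    # S^{-1} R^{-1} eps
    "SARMA(1,1)": np.linalg.solve(S, R @ shock),                    # S^{-1} R eps
}

for name, e in effects.items():
    print(f"{name:11s} own effect {e[centre]: .4f}   two steps away {e[centre + 2]: .6f}")
```

Under the SMA process only the centre and its four rook neighbours respond, while the other three specifications produce a nonzero response two steps away. Reshaping each vector with `e.reshape(side, side)` gives a surface that can be plotted.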

4. The MLE of $λ_{0}$ and $ρ_{0}$

The log-likelihood function for the model in Equation (1), under the assumption that the disturbance terms of the model are i.i.d. normal with mean zero and variance $σ_{0}^{2}$, can be written as:

\ln L_{n}(ζ) = -\frac{n}{2} \ln(2π) - \frac{n}{2} \ln(σ^{2}) + \ln|S_{n}(λ)| - \ln|R_{n}(ρ)| - \frac{1}{2σ^{2}} (S_{n}(λ) Y_{n} - X_{n} β)' R_{n}'^{-1}(ρ) R_{n}^{-1}(ρ) (S_{n}(λ) Y_{n} - X_{n} β)

(6)

where $ζ = (θ', σ^{2})'$. The first order conditions with respect to β and $σ^{2}$ are respectively given by:

\frac{\partial \ln L_{n}(ζ)}{\partial β} = \frac{1}{σ^{2}} \bar{X}_{n}'(ρ) R_{n}^{-1}(ρ) (S_{n}(λ) Y_{n} - X_{n} β)

(7a)

\frac{\partial \ln L_{n}(ζ)}{\partial σ^{2}} = -\frac{n}{2σ^{2}} + \frac{1}{2σ^{4}} ε_{n}'(θ) ε_{n}(θ)

(7b)

where $ε_{n}(θ) = R_{n}^{-1}(ρ) (S_{n}(λ) Y_{n} - X_{n} β)$. The solutions of the first order conditions for a given δ yield the MLE of $β_{0}$ and $σ_{0}^{2}$:

\hat{β}_{n}(δ) = (\bar{X}_{n}'(ρ) \bar{X}_{n}(ρ))^{-1} \bar{X}_{n}'(ρ) R_{n}^{-1}(ρ) S_{n}(λ) Y_{n}

(8a)

\hat{σ}_{n}^{2}(θ) = \frac{1}{n} ε_{n}'(θ) ε_{n}(θ)

(8b)

Concentrating the log-likelihood function by eliminating $σ^{2}$ gives the following equation:

\ln L_{n}(θ) = -\frac{n}{2} (\ln(2π) + 1 - \ln n) - \frac{n}{2} \ln(\frac{ε_{n}'(θ) ε_{n}(θ)}{|S_{n}(λ)|^{\frac{2}{n}} |R_{n}(ρ)|^{-\frac{2}{n}}})

(9)

The above representation is useful for exploring the role of the Jacobian terms $|S_{n}(λ)|$ and $|R_{n}(ρ)|$ in the ML estimation. The MLE of θ is the extremum estimator obtained from the maximization of Equation (9). In an equivalent way, the MLE of $θ_{0}$ can be defined by:

\hat{θ}_{n} = \arg\min_{θ \in Θ} \{\frac{ε_{n}'(θ) ε_{n}(θ)}{|S_{n}(λ)|^{\frac{2}{n}} |R_{n}(ρ)|^{-\frac{2}{n}}}\}

(10)

In the special case where $|S_{n}(λ)| = |R_{n}(ρ)| = 1$, the MLE is the NLSE obtained from the minimization of $ε_{n}'(θ) ε_{n}(θ)$, i.e., $\hat{θ}_{NLSE,n} = \arg\min_{θ \in Θ} ε_{n}'(θ) ε_{n}(θ)$. It is clear that the Jacobian terms $|S_{n}(λ)|$ and $|R_{n}(ρ)|$ play the role of a weight (or a penalty) on $ε_{n}'(θ) ε_{n}(θ)$. The penalty is a function of the spatial parameters and the spatial weight matrices, which can be defined as $f(λ, ρ, W_{n}, M_{n}) = |S_{n}(λ)|^{\frac{2}{n}} |R_{n}(ρ)|^{-\frac{2}{n}}$. For the SARAR(1,1) specification, the last term in Equation (9) is given by $-\frac{n}{2} \ln(\frac{ε_{n}'(θ) ε_{n}(θ)}{|S_{n}(λ)|^{\frac{2}{n}} |R_{n}(ρ)|^{\frac{2}{n}}})$, where $ε_{n}(θ) = R_{n}(ρ) (S_{n}(λ) Y_{n} - X_{n} β)$. Therefore, in the case of SARAR(1,1), the MLE of $θ_{0}$ is given by:

\hat{θ}_{n} = \arg\min_{θ \in Θ} \{\frac{ε_{n}'(θ) ε_{n}(θ)}{|S_{n}(λ)|^{\frac{2}{n}} |R_{n}(ρ)|^{\frac{2}{n}}}\}

(11)

It is hard to make any general statement about the effects and magnitudes of the penalty functions in both cases. Hepple [30] illustrates that the Jacobian term imposes a substantial penalty for the SARAR(0,1) specification. To illustrate the effect of penalty functions for the case of SARMA(1,1) and SARAR(1,1), I use a distance-based weight matrix for a sample of 91 countries, such that each country is connected to every other country. The elements of the weight matrices are specified by:

w_{ij} = m_{ij} = \begin{cases} 0, & \text{if } i = j \\ \frac{d_{ij}^{-2}}{\sum_{l \neq i} d_{il}^{-2}}, & \text{if } i \neq j \end{cases}

(12)

where $d_{ij}$ is the great circle distance between the capitals of countries i and j⁶. Figure 2 shows the surface plots of the penalty functions over a grid of spatial parameters.

Figure 2. The penalty functions for the dense weight matrix. (a) The penalty function for SARMA(1,1). (b) The penalty function for SARAR(1,1).

For the SARAR(1,1) specification, the value of the penalty function decreases as the parameter combination $(λ, ρ)$ moves away from $(0, 0)$ in any direction, as shown in Figure 2b⁷. On the other hand, there is no such monotonic decrease in the penalty function under the SARMA(1,1) specification, as illustrated in Figure 2a. The penalty function of SARMA(1,1) takes relatively larger values when there is strong spatial dependence in the disturbance term, i.e., when ρ is near $+1$ or $-1$. In contrast, the penalty function has smaller values when there is strong spatial dependence in the dependent variable. This pattern indicates that the sum $ε_{n}'(θ) ε_{n}(θ)$ is penalized as ρ moves toward either $+1$ or $-1$. In the case of SARAR(1,1), this sum gets larger as $(λ, ρ)$ moves toward $(\pm 1, \pm 1)$ in any direction, suggesting that the solution of the minimization problem is restricted to the region $(-1, 1) \times (-1, 1)$. Finally, in a small neighborhood of $(0, 0)$, the surface plots in Figure 2 indicate that the penalty functions take values around one, suggesting that the parameter estimates from the MLE can be similar to those from the NLSE under both specifications.
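The two penalty functions are straightforward to evaluate on a grid. The sketch below is illustrative only: it substitutes a row-normalized rook matrix on a 7 × 7 lattice for the 91-country distance matrix, which is not reproduced here, and the function name `penalty` is a hypothetical choice. Log-determinants are used to avoid overflow or underflow in the determinants.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix on a side x side lattice (illustrative helper)."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                A[i, i + side] = A[i + side, i] = 1.0
            if c + 1 < side:
                A[i, i + 1] = A[i + 1, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

def penalty(lam, rho, W, M, model="SARMA"):
    """|S(lam)|^{2/n} |R(rho)|^{-2/n} for SARMA(1,1), |S(lam)|^{2/n} |R(rho)|^{2/n} for SARAR(1,1)."""
    n = W.shape[0]
    I = np.eye(n)
    # slogdet returns (sign, log|det|); both determinants are positive for |lam|, |rho| < 1 here.
    log_det_S = np.linalg.slogdet(I - lam * W)[1]
    log_det_R = np.linalg.slogdet(I - rho * M)[1]
    sign = -1.0 if model == "SARMA" else 1.0
    return float(np.exp((2.0 / n) * (log_det_S + sign * log_det_R)))

W = M = rook_weights(7)          # stand-in weight matrix (assumption)
grid = np.linspace(-0.9, 0.9, 19)
sarma_grid = np.array([[penalty(l, r, W, M, "SARMA") for r in grid] for l in grid])
sarar_grid = np.array([[penalty(l, r, W, M, "SARAR") for r in grid] for l in grid])

print("SARMA penalty range:", sarma_grid.min(), sarma_grid.max())
print("SARAR penalty range:", sarar_grid.min(), sarar_grid.max())
```

For this stand-in matrix the SARAR(1,1) penalty never exceeds one, while the SARMA(1,1) penalty exceeds one when ρ is large in absolute value and λ is near zero, in line with the qualitative description above. The exact surfaces depend on the weight matrix used.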

Next, I investigate the effect of heteroskedasticity on the MLE for the case of SARMA(1,1). I assume that the true data generating process is characterized by Assumption 1. More explicitly, the MLE $\hat{σ}_{n}^{2}(δ)$ can be written as:

\hat{σ}_{n}^{2}(δ) = \frac{1}{n} Y_{n}' S_{n}'(λ) R_{n}'^{-1}(ρ) \bar{M}_{n}(ρ) R_{n}^{-1}(ρ) S_{n}(λ) Y_{n}

(13)

where $\bar{M}_{n}(ρ) = (I_{n} - P_{n}(ρ))$ is a projection-type matrix with $P_{n}(ρ) = \bar{X}_{n}(ρ) (\bar{X}_{n}'(ρ) \bar{X}_{n}(ρ))^{-1} \bar{X}_{n}'(ρ)$. At the true parameter vector, Equation (1) implies $R_{n}^{-1} S_{n} Y_{n} = R_{n}^{-1} X_{n} β_{0} + ε_{n}$. Substituting this expression into $\hat{σ}_{n}^{2}(δ)$ evaluated at $δ_{0}$ and using the fact that $\bar{X}_{n}'(ρ) \bar{M}_{n}(ρ) = 0_{k \times n}$ and $\bar{M}_{n}(ρ) \bar{X}_{n}(ρ) = 0_{n \times k}$, the MLE $\hat{σ}_{n}^{2}(δ_{0})$ can be written as:

\hat{σ}_{n}^{2}(δ_{0}) = \frac{1}{n} ε_{n}' \bar{M}_{n} ε_{n}

(14)

where $\bar{M}_{n} = \bar{M}_{n}(ρ_{0})$.

At $δ_{0}$, the probability limit of $\hat{σ}_{n}^{2}(δ_{0})$ is:

\underset{n \to \infty}{plim} \hat{σ}_{n}^{2}(δ_{0}) = \underset{n \to \infty}{plim} \frac{1}{n} ε_{n}' ε_{n} - \underset{n \to \infty}{plim} \frac{1}{n^{2}} ε_{n}' \bar{X}_{n} (\frac{1}{n} \bar{X}_{n}' \bar{X}_{n})^{-1} \bar{X}_{n}' ε_{n}

(15)

For the first term on the right-hand side, we have $\frac{1}{n} ε_{n}' ε_{n} = \frac{1}{n} \sum_{i=1}^{n} σ_{ni}^{2} + o_{p}(1)$ by Chebyshev’s weak law of large numbers. The second term vanishes by virtue of Lemma 1(4) in Appendix A and Assumption 3. Therefore, we have:

\hat{σ}_{n}^{2}(δ_{0}) = \frac{1}{n} \sum_{i=1}^{n} σ_{ni}^{2} + o_{p}(1)

(16)

The result in Equation (16) indicates that $\hat{σ}_{n}^{2}(δ_{0})$ is asymptotically equivalent to the average of the individual variances.

Concentrating out β and $σ^{2}$ from the log-likelihood function in Equation (6) yields:

\ln L_{n}(δ) = -\frac{n}{2} (\ln(2π) + 1) - \frac{n}{2} \ln \hat{σ}_{n}^{2}(δ) + \ln|S_{n}(λ)| - \ln|R_{n}(ρ)|

(17)
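Equation (17) can be evaluated directly for a given δ. The following is a minimal sketch under stated assumptions, not the author's implementation: the function name `concentrated_loglik`, the simulated data and the parameter values are hypothetical. It computes $\hat{β}_{n}(δ)$ from Equation (8a), $\hat{σ}_{n}^{2}(δ)$ from Equation (13) and then the concentrated log-likelihood.

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix on a side x side lattice (illustrative helper)."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                A[i, i + side] = A[i + side, i] = 1.0
            if c + 1 < side:
                A[i, i + 1] = A[i + 1, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

def concentrated_loglik(delta, Y, X, W, M):
    """Evaluate Equation (17) at delta = (lam, rho); also return beta_hat and sigma2_hat."""
    lam, rho = delta
    n = Y.shape[0]
    I = np.eye(n)
    S, R = I - lam * W, I - rho * M
    X_bar = np.linalg.solve(R, X)                              # R^{-1}(rho) X
    y_bar = np.linalg.solve(R, S @ Y)                          # R^{-1}(rho) S(lam) Y
    beta_hat = np.linalg.lstsq(X_bar, y_bar, rcond=None)[0]    # Equation (8a)
    resid = y_bar - X_bar @ beta_hat                           # M_bar(rho) R^{-1}(rho) S(lam) Y
    sigma2_hat = resid @ resid / n                             # Equation (13)
    # slogdet returns (sign, log|det|); determinants are assumed positive on the parameter space.
    log_det_S = np.linalg.slogdet(S)[1]
    log_det_R = np.linalg.slogdet(R)[1]
    ll = (-0.5 * n * (np.log(2.0 * np.pi) + 1.0)
          - 0.5 * n * np.log(sigma2_hat) + log_det_S - log_det_R)
    return ll, beta_hat, sigma2_hat

# Simulated data (all values below are assumptions for illustration).
rng = np.random.default_rng(1)
W = M = rook_weights(6)
n = W.shape[0]
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta0, lam0, rho0 = np.array([1.0, 1.0]), 0.4, 0.3
eps = rng.normal(size=n)
Y = np.linalg.solve(np.eye(n) - lam0 * W, X @ beta0 + (np.eye(n) - rho0 * M) @ eps)

ll, beta_hat, sigma2_hat = concentrated_loglik((lam0, rho0), Y, X, W, M)
print("concentrated log-likelihood at the true delta:", ll)
```

Maximizing this function over a compact set of $(λ, ρ)$ values with any numerical optimizer would give the MLE of δ. The sketch only evaluates the objective.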

The MLEs $\hat{λ}_{n}$ and $\hat{ρ}_{n}$ are extremum estimators obtained from the maximization of Equation (17). The first order conditions of Equation (17) with respect to λ and ρ are⁸:

\frac{\partial \ln L_{n}(δ)}{\partial λ} = -\frac{n}{2 \hat{σ}_{n}^{2}(δ)} \frac{\partial \hat{σ}_{n}^{2}(δ)}{\partial λ} - tr(G_{n}(λ))

(18a)

\frac{\partial \ln L_{n}(δ)}{\partial ρ} = -\frac{n}{2 \hat{σ}_{n}^{2}(δ)} \frac{\partial \hat{σ}_{n}^{2}(δ)}{\partial ρ} + tr(H_{n}(ρ))

(18b)

where $G_{n}(λ) = W_{n} S_{n}^{-1}(λ)$ and $H_{n}(ρ) = M_{n} R_{n}^{-1}(ρ)$. For the consistency of $\hat{λ}_{n}$ and $\hat{ρ}_{n}$, the necessary condition is $plim_{n \to \infty} \frac{1}{n} \frac{\partial \ln L_{n}(δ_{0})}{\partial δ} = 0$. Below, I investigate the probability limit of the following expression:

\frac{1}{n} \frac{\partial \ln L_{n}(δ_{0})}{\partial δ} = \begin{pmatrix} \frac{1}{n} (-\frac{n}{\frac{2}{n} ε_{n}' \bar{M}_{n} ε_{n}} \frac{\partial \hat{σ}_{n}^{2}(δ_{0})}{\partial λ}) - \frac{1}{n} tr(G_{n}) \\ \frac{1}{n} (-\frac{n}{\frac{2}{n} ε_{n}' \bar{M}_{n} ε_{n}} \frac{\partial \hat{σ}_{n}^{2}(δ_{0})}{\partial ρ}) + \frac{1}{n} tr(H_{n}) \end{pmatrix}

(19)

Under Assumption 2, both $H_{n}$ and $G_{n}$ are uniformly bounded in absolute value in row and column sums. Therefore, $\frac{1}{n} tr(H_{n})$ and $\frac{1}{n} tr(G_{n})$ in Equation (19) are of order $O(1)$. With these results for $\frac{1}{n} tr(H_{n})$ and $\frac{1}{n} tr(G_{n})$, a convenient result for the probability limit of Equation (19) can be obtained, which is stated in the following proposition.

Proposition 1.

Under Assumptions 1 through 3, we have:

\frac{1}{n} \frac{\partial \ln L_{n}(δ_{0})}{\partial δ} = \begin{pmatrix} \frac{Cov(\bar{G}_{n,ii}, σ_{ni}^{2})}{\bar{σ}^{2}} + o_{p}(1) \\ -\frac{Cov(H_{n,ii}, σ_{ni}^{2})}{\bar{σ}^{2}} + o_{p}(1) \end{pmatrix}

(20)

where $Cov(\bar{G}_{n,ii}, σ_{ni}^{2})$ is the covariance between the diagonal elements of $\bar{G}_{n}$, $\{\bar{G}_{n,11}, \bar{G}_{n,22}, \dots, \bar{G}_{n,nn}\}$, and the individual variances $\{σ_{n1}^{2}, σ_{n2}^{2}, \dots, σ_{nn}^{2}\}$. Similarly, $Cov(H_{n,ii}, σ_{ni}^{2})$ denotes the covariance between the diagonal elements of $H_{n}$, $\{H_{n,11}, H_{n,22}, \dots, H_{n,nn}\}$, and the individual variances $\{σ_{n1}^{2}, σ_{n2}^{2}, \dots, σ_{nn}^{2}\}$. In line with Equation (16), $\bar{σ}^{2} = \frac{1}{n} \sum_{i=1}^{n} σ_{ni}^{2}$ denotes the average of the individual variances.

Proof.

See Appendix B. ▢
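The limits in Equation (20) depend only on the weight matrices, the true spatial parameters and the individual variances, so they can be computed without simulation. The sketch below is illustrative: the function name `score_limits`, the rook lattice, the parameter values and the variance design (variance proportional to the number of neighbours) are assumptions, and $\bar{σ}^{2}$ is taken to be the average of the individual variances as suggested by Equation (16).

```python
import numpy as np

def rook_weights(side):
    """Row-normalised rook contiguity matrix on a side x side lattice (illustrative helper)."""
    n = side * side
    A = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            if r + 1 < side:
                A[i, i + side] = A[i + side, i] = 1.0
            if c + 1 < side:
                A[i, i + 1] = A[i + 1, i] = 1.0
    return A / A.sum(axis=1, keepdims=True)

def pop_cov(a, b):
    """Covariance between two length-n sequences, with divisor n."""
    return float(np.mean(a * b) - a.mean() * b.mean())

def score_limits(W, M, lam0, rho0, sigma2):
    """Leading terms of Equation (20): (Cov(G_bar_ii, s2)/s_bar, -Cov(H_ii, s2)/s_bar)."""
    n = W.shape[0]
    I = np.eye(n)
    S, R = I - lam0 * W, I - rho0 * M
    G = W @ np.linalg.inv(S)
    H = M @ np.linalg.inv(R)
    G_bar = np.linalg.solve(R, G @ R)
    s_bar = sigma2.mean()                      # assumed definition of sigma-bar squared
    return (pop_cov(np.diag(G_bar), sigma2) / s_bar,
            -pop_cov(np.diag(H), sigma2) / s_bar)

W = M = rook_weights(10)
lam0, rho0 = 0.4, 0.5                          # assumed true values
n_neighbours = (W > 0).sum(axis=1).astype(float)

homo = score_limits(W, M, lam0, rho0, np.ones(W.shape[0]))
hetero = score_limits(W, M, lam0, rho0, n_neighbours)
print("homoskedastic limits:  ", homo)
print("heteroskedastic limits:", hetero)
```

With equal variances both limits are zero, so the necessary condition for consistency holds. When the variances move with the number of neighbours, the limits are nonzero on this lattice, which is the violation described by the proposition.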

The above proposition indicates that the MLE of the spatial autoregressive and moving average parameters is not consistent, as long as the covariance terms in Equation (20) are not zero. Notice that, when the disturbance terms are homoskedastic, the covariance terms in Equation (20) are zero. In the special case where $W_{n} = M_{n}$ and $λ_{0} = ρ_{0}$, we have $S_{n} = R_{n}$ and $G_{n} = H_{n}$, so that $\bar{G}_{n} = R_{n}^{-1} G_{n} R_{n} = R_{n}^{-1} H_{n} R_{n} = R_{n}^{-1} M_{n} R_{n}^{-1} R_{n} = H_{n}$. Hence, the necessary condition for the consistency of $\hat{λ}_{n}$ is identical to the one for $\hat{ρ}_{n}$.

The result in Proposition 1 indicates that the consistency of the MLE depends on the specification of weight matrices. It is of interest to investigate specifications that yield zero covariances. An obvious case is when there is no variation in the diagonal elements of

{\bar{G}}_{n}

and

H_{n}

. Then, the necessary condition for the consistency of

{\hat{λ}}_{n}

and

{\hat{ρ}}_{n}

is not violated, even if the disturbances are heteroskedastic. For example, there is no variation in the diagonal elements of

{\bar{G}}_{n}

and

H_{n}

when

W_{n}

and

M_{n}

are block-diagonal matrices with an identical sub-matrix in the diagonal blocks and zeros elsewhere. This type of block diagonal weight matrix can be seen in social interaction scenarios where a block represents a group in which each individual is equally affected by the members of the group [32,33]. Suppose that there are R groups, each of which has m members, so that

n = m R

. If we assign equal weight to each member of a group, then

W_{n} = M_{n} = I_{R} \otimes B_{m}

, where

B_{m} = \frac{1}{m - 1} (l_{m} l_{m}^{^{'}} - I_{m})

, and

l_{m}

is an m-dimensional column vector of ones. For this setup, there is no variation in the diagonal elements of

{\bar{G}}_{n}

and

H_{n}

; therefore

Cov ({\bar{G}}_{n, i i}, σ_{n i}^{2}) = Cov (H_{n, i i}, σ_{n i}^{2}) = 0

.
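The equal-group-size case can be checked numerically. The sketch below builds $W_{n} = M_{n} = I_{R} \otimes B_{m}$ with a Kronecker product and measures the spread of the diagonal elements of ${\bar{G}}_{n}$ and $H_{n}$; the group count, group size and parameter values are illustrative assumptions, and $R_{n} = I_{n} - ρ_{0} M_{n}$ is the assumed form of the moving average filter.

```python
import numpy as np

n_groups, m = 4, 5                                  # illustrative values of R and m
B_m = (np.ones((m, m)) - np.eye(m)) / (m - 1)       # equal weights within a group
W = np.kron(np.eye(n_groups), B_m)                  # W_n = M_n = I_R (x) B_m
n = n_groups * m
I = np.eye(n)

lam0, rho0 = 0.3, -0.6                              # illustrative parameter values
S, R = I - lam0 * W, I - rho0 * W
G = W @ np.linalg.inv(S)                            # G_n = W_n S_n^{-1}
H = W @ np.linalg.inv(R)                            # H_n = M_n R_n^{-1}
G_bar = np.linalg.solve(R, G @ R)                   # R_n^{-1} G_n R_n

def diagonal_spread(A):
    """Largest minus smallest diagonal element."""
    d = np.diag(A)
    return d.max() - d.min()

spread_G_bar = diagonal_spread(G_bar)
spread_H = diagonal_spread(H)
```

Both spreads are zero up to floating-point error, so the covariance of either diagonal with any variance profile is zero as well.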

There is also no variation in the diagonal elements of

{\bar{G}}_{n}

and

H_{n}

when the circular world weight matrices considered in Kelejian and Prucha [7] are employed. In these weight matrices, the order of observations is important, since the observations are related to some units in front and to some in back. As an example, consider a “one ahead and one behind” weight matrix, where each observation is related to the one immediately after and immediately before it. For this scenario, we also have

Cov ({\bar{G}}_{n, i i}, σ_{n i}^{2}) = Cov (H_{n, i i}, σ_{n i}^{2}) = 0

. The circular world weight matrices can be adjusted to create some variation in the diagonal elements of

{\bar{G}}_{n}

and

H_{n}

. For example, Kelejian and Prucha [34] construct a different version in which the first and the last thirds of the sample observations have five neighbors in front and five in back, while the middle third has only one neighbor in front and one in back. Under this scenario, the Monte Carlo results in Kelejian and Prucha [34] show that the MLE is significantly biased for the case of SARAR(1,1).
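The contrast between the two circular designs can be illustrated with a short sketch. The helper names, the sample size and the value of $λ_{0}$ are assumptions made for illustration; the mixed design follows the verbal description above (five neighbors per side in the outer thirds, one per side in the middle third), and the weights are row-normalized.

```python
import numpy as np

def circular_weights(neighbors_each_side):
    """Row-normalized circular weights; entry i is unit i's neighbor count per side."""
    k = np.asarray(neighbors_each_side)
    n = k.size
    W = np.zeros((n, n))
    for i in range(n):
        for step in range(1, int(k[i]) + 1):
            W[i, (i + step) % n] = 1.0   # neighbors "in front"
            W[i, (i - step) % n] = 1.0   # neighbors "in back"
    return W / W.sum(axis=1, keepdims=True)

def g_diagonal(W, lam0):
    """Diagonal of G_n = W_n (I_n - lam0 W_n)^{-1}."""
    n = W.shape[0]
    return np.diag(W @ np.linalg.inv(np.eye(n) - lam0 * W))

n, lam0 = 30, 0.3                                    # illustrative values
position = np.arange(n)
uniform = circular_weights(np.ones(n, dtype=int))    # one ahead and one behind
outer = (position < n // 3) | (position >= 2 * n // 3)
mixed = circular_weights(np.where(outer, 5, 1))      # 5 per side outside, 1 in the middle

d_uniform = g_diagonal(uniform, lam0)
d_mixed = g_diagonal(mixed, lam0)
spread_uniform = d_uniform.max() - d_uniform.min()
spread_mixed = d_mixed.max() - d_mixed.min()
```

The uniform design gives a constant diagonal, whereas the mixed design produces variation in the diagonal elements, which is what allows a nonzero covariance with the variances.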

5. The MLE of $β_{0}$

In the previous section, I showed that the consistency of the MLE of the spatial autoregressive and moving average parameters is not ensured. In this section, I investigate the consistency of the MLE of

β_{0}

. The result in Equation (8a) indicates that the MLE

{\hat{β}}_{n} ({\hat{δ}}_{n})

is also inconsistent, since it is based on the inconsistent estimators

{\hat{λ}}_{n}

and

{\hat{ρ}}_{n}

. The asymptotic bias of

{\hat{β}}_{n} ({\hat{δ}}_{n})

can be determined from Equation (8a). By using

S_{n} (λ) = S_{n} + (λ_{0} - λ) W_{n}

, the MLE

{\hat{β}}_{n} (δ)

can be written as:

\begin{matrix} {\hat{β}}_{n} (δ) = & β_{0} + {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) R_{n} ε_{n} \end{matrix}

\begin{matrix} + (λ_{0} - λ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) G_{n} X_{n} β_{0} \end{matrix}

\begin{matrix} + (λ_{0} - λ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) G_{n} R_{n} ε_{n} \end{matrix}

(21)

Under Assumption 3, the term

{(\frac{1}{n} {\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1}

is uniformly bounded in absolute value in row and column sums. By Lemma 1(5) of Appendix A, terms involving

ε_{n}

in Equation (21) vanish in probability. Thus,

\begin{matrix} {\hat{β}}_{n} (δ) & = β_{0} + (λ_{0} - λ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) G_{n} X_{n} β_{0} + o_{p} (1) \end{matrix}

(22)

The asymptotic bias of

{\hat{β}}_{n} ({\hat{δ}}_{n})

follows from Equation (22), which is given by

(λ_{0} - {\hat{λ}}_{n}) {({\bar{X}}_{n}^{^{'}} ({\hat{ρ}}_{n}) {\bar{X}}_{n} ({\hat{ρ}}_{n}))}^{- 1} {\bar{X}}_{n}^{^{'}} ({\hat{ρ}}_{n}) R_{n}^{- 1} ({\hat{ρ}}_{n}) G_{n} X_{n} β_{0}

. This result shows that the asymptotic bias of

{\hat{β}}_{n} ({\hat{δ}}_{n})

depends on weight matrices and the regressors matrix and is not zero unless the spatial parameters are consistent. Note that the bias is the OLS estimator obtained from the artificial regression of

R_{n}^{- 1} ({\hat{ρ}}_{n}) G_{n} X_{n} β_{0}

on

{\bar{X}}_{n} ({\hat{ρ}}_{n})

. For the special case of

{\hat{λ}}_{n} = λ_{0} + o_{p} (1)

, we have

{\hat{β}}_{n} (δ) = β_{0} + o_{p} (1)

. In this case, there is no asymptotic bias, and the inconsistency of

{\hat{ρ}}_{n}

has no effect on

{\hat{β}}_{n} ({\hat{δ}}_{n})

.
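Because the bias term is an OLS coefficient from an artificial regression, it can be computed with a least-squares routine. The following sketch assumes $R_{n} (ρ) = I_{n} - ρ M_{n}$ and $G_{n} = W_{n} S_{n}^{-1}$; the function name, the circular weight matrix and all numerical values are hypothetical.

```python
import numpy as np

def beta_asymptotic_bias(X, W, M, beta0, lam0, lam_hat, rho_hat):
    """(lam0 - lam_hat) times the OLS coefficient of R^{-1}(rho_hat) G X beta0 on X_bar(rho_hat)."""
    n = W.shape[0]
    I = np.eye(n)
    G = W @ np.linalg.inv(I - lam0 * W)                # G_n at the true lambda_0
    R_hat = I - rho_hat * M                            # R_n(rho_hat), assumed form
    X_bar = np.linalg.solve(R_hat, X)                  # X_bar_n(rho_hat)
    target = np.linalg.solve(R_hat, G @ X @ beta0)     # R_n^{-1}(rho_hat) G_n X_n beta_0
    ols_coef, *_ = np.linalg.lstsq(X_bar, target, rcond=None)
    return (lam0 - lam_hat) * ols_coef

# Hypothetical inputs: a "one ahead and one behind" circular weight matrix.
rng = np.random.default_rng(0)
n = 40
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
X = rng.standard_normal((n, 2))
beta0 = np.array([1.0, 1.0])

bias = beta_asymptotic_bias(X, W, W, beta0, lam0=0.3, lam_hat=0.1, rho_hat=0.2)
no_bias = beta_asymptotic_bias(X, W, W, beta0, lam0=0.3, lam_hat=0.3, rho_hat=0.2)
```

Setting `lam_hat` equal to `lam0` returns a zero vector regardless of `rho_hat`, which mirrors the special case discussed above.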

The specification with

λ_{0} = 0

in Equation (1) is called the spatial moving average model (SARMA(0,1) or SMA). For the SARMA(0,1) specification, the log-likelihood function simplifies to:

\begin{matrix} ln L_{n} (ζ) = - \frac{n}{2} ln (2 π) - \frac{n}{2} ln (σ^{2}) - ln |R_{n} (ρ)| - \frac{1}{2 σ^{2}} {(Y_{n} - X_{n} β)}^{^{'}} R_{n}^{^{'} - 1} (ρ) R_{n}^{- 1} (ρ) (Y_{n} - X_{n} β) \end{matrix}

(23)

where

ζ = {(θ^{^{'}}, σ^{2})}^{^{'}}

with

θ = {(ρ, β^{^{'}})}^{^{'}}

. For a given value of ρ, the first order conditions yield:

\begin{matrix} {\hat{β}}_{n} (ρ) = {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) Y_{n} \end{matrix}

\begin{matrix} {\hat{σ}}_{n}^{2} (ρ) = \frac{1}{n} ε_{n}^{^{'}} (θ) ε_{n} (θ) \end{matrix}

where

ε_{n} (θ) = R_{n}^{- 1} Y_{n} - {\bar{X}}_{n} β

. The necessary condition for the consistency of the MLE

{\hat{ρ}}_{n}

can be obtained from Equation (20). From the second row of Equation (20), we have

\frac{1}{n} \frac{\partial ln L_{n} (ρ_{0})}{\partial ρ} = - \frac{Cov (H_{n, i i}, σ_{n i}^{2})}{{\bar{σ}}^{2}} + o_{p} (1)

, which implies that the MLE

{\hat{ρ}}_{n}

is inconsistent. Substitution of

Y_{n} = X_{n} β_{0} + R_{n} ε_{n}

into

{\hat{β}}_{n} (ρ)

yields:

\begin{matrix} {\hat{β}}_{n} (ρ) & = β_{0} + {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) R_{n} ε_{n} \end{matrix}

(24)

The variance of

{({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) R_{n} ε_{n}

in Equation (24) has an order of

O (\frac{1}{n})

by Lemma 1(5) of Appendix A. Then, Chebyshev’s inequality implies that

{\hat{β}}_{n} (ρ) = β_{0} + o_{p} (1)

. Hence,

{\hat{β}}_{n} ({\hat{ρ}}_{n})

has no asymptotic bias, even though

{\hat{ρ}}_{n}

is inconsistent.
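The SARMA(0,1) first order conditions translate directly into code. The sketch below assumes $R_{n} (ρ) = I_{n} - ρ M_{n}$ and evaluates ${\hat{β}}_{n} (ρ)$, ${\hat{σ}}_{n}^{2} (ρ)$ and the concentrated version of Equation (23) at a given ρ; the function name, the circular weight matrix, the variance profile and the seed are illustrative assumptions and not the paper's design.

```python
import numpy as np

def sma_concentrated(y, X, M, rho):
    """Return beta_hat(rho), sigma2_hat(rho) and the concentrated log-likelihood."""
    n = y.size
    R = np.eye(n) - rho * M                      # R_n(rho), assumed form
    y_bar = np.linalg.solve(R, y)                # R_n^{-1}(rho) Y_n
    X_bar = np.linalg.solve(R, X)                # X_bar_n(rho)
    beta_hat, *_ = np.linalg.lstsq(X_bar, y_bar, rcond=None)
    resid = y_bar - X_bar @ beta_hat             # eps_n(theta) at beta_hat(rho)
    sigma2_hat = resid @ resid / n
    _, logdet = np.linalg.slogdet(R)             # ln|R_n(rho)|
    loglik = (-0.5 * n * (np.log(2 * np.pi) + 1.0)
              - 0.5 * n * np.log(sigma2_hat) - logdet)
    return beta_hat, sigma2_hat, loglik

# Hypothetical data: heteroskedastic innovations, true rho_0 = 0.6.
rng = np.random.default_rng(0)
n = 500
idx = np.arange(n)
M = np.zeros((n, n))
M[idx, (idx + 1) % n] = 0.5
M[idx, (idx - 1) % n] = 0.5
X = rng.standard_normal((n, 2))
beta0 = np.array([1.0, 1.0])
sigma = np.where(idx < n // 2, 0.5, 1.5)         # two variance regimes
eps = sigma * rng.standard_normal(n)
y = X @ beta0 + (np.eye(n) - 0.6 * M) @ eps

# Evaluate at a deliberately wrong value of rho.
beta_at_wrong_rho, sigma2_at_wrong_rho, ll = sma_concentrated(y, X, M, rho=-0.3)
```

Even at a value of ρ far from the truth, the estimate of $β_{0}$ stays near $(1, 1)'$ in this sketch, in line with the result that the error term in Equation (24) is linear in $ε_{n}$ and vanishes in probability.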

For the spatial autoregressive model, where

ρ_{0} = 0

in Equation (1), the result in Equation (20) simplifies to

\frac{1}{n} \frac{\partial ln L_{n} (λ_{0})}{\partial λ} = \frac{Cov (G_{n, i i}, σ_{n i}^{2})}{{\bar{σ}}^{2}} + o_{p} (1)

. The term

{({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) G_{n} X_{n} β_{0}

in Equation (22) simplifies to

{(X_{n}^{^{'}} X_{n})}^{- 1} X_{n}^{^{'}} G_{n} X_{n} β_{0}

, so that:

\begin{matrix} {\hat{β}}_{n} (λ) = β_{0} + (λ_{0} - λ) {(X_{n}^{^{'}} X_{n})}^{- 1} X_{n}^{^{'}} G_{n} X_{n} β_{0} + o_{p} (1) \end{matrix}

(25)

The result in Equation (25) is the exact result stated in Lin and Lee [22] for the case of SARAR(1,0).

I collect the above results for the MLE of

β_{0}

in the following proposition.

Proposition 2.

Consider the model in Equation (1) under Assumptions 1 through 3; then:

(1): For the SARMA(1,1) model, we have:

$\begin{matrix} {\hat{β}}_{n} (δ) = β_{0} + (λ_{0} - λ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) R_{n}^{- 1} (ρ) G_{n} X_{n} β_{0} + o_{p} (1) \end{matrix}$

(26)
(2): For the SARMA(0,1) model, where $λ_{0} = 0$ , we have ${\hat{β}}_{n} (ρ) = β_{0} + o_{p} (1)$ .
(3): For the SARMA(1,0) model, where $ρ_{0} = 0$ , we have:

$\begin{matrix} {\hat{β}}_{n} (λ) = β_{0} + (λ_{0} - λ) {(X_{n}^{^{'}} X_{n})}^{- 1} X_{n}^{^{'}} G_{n} X_{n} β_{0} + o_{p} (1) \end{matrix}$

(27)

In Section 4 and Section 5, I showed that the MLE of

δ_{0}

and

β_{0}

is generally inconsistent when heteroskedasticity is present in the model. Besides its computational burden, the consistency of MLE is not ensured. In the next section, I confirm these large sample results through a Monte Carlo simulation.

6. Monte Carlo Simulation

In this section, the finite sample properties of the MLE are investigated through a Monte Carlo experiment for the cases of (i) SARMA(0,1) and (ii) SARMA(1,1). For both models, I assume heteroskedastic innovations in the data generating processes.

6.1. Design

There are two regressors and no intercept term, such that

X_{n} = [x_{n, 1}, x_{n, 2}]

and

β_{0} = {(β_{10}, β_{20})}^{'}

, where

x_{n, 1}

and

x_{n, 2}

are

n \times 1

independent random vectors that are generated from a Normal(0,1). I consider

n =

100, 500, 1,000; let

W_{n} = M_{n}

, and set

β_{0} = {(1, 1)}^{^{'}}

for all experiments. For the spatial autoregressive and moving average parameters

(λ_{0}, ρ_{0})

, I employ combinations from the set

B = (- 0.6, - 0.3, 0, 0.3, 0.6)

to allow for weak and strong spatial interactions.

The row normalized spatial weight matrix is based on the small group interaction scenario described in Lin and Lee [22]. In this scenario, the weight matrix is a block diagonal matrix where each block represents a group interaction. The size of each block is determined by the group size, which is determined by a random draw from Uniform(15,50). Let

{g_{1}, \dots, g_{G}}

be the set of groups, where G is the total number of groups. Denote the size of each group by

m_{i}

for

i = 1, \dots, G

. Then, the block for group i is given by

B_{i} = \frac{1}{m_{i} - 1} (l_{m_{i}} l_{m_{i}}^{^{'}} - I_{m_{i}})

, where

l_{m_{i}}

is the

m_{i} \times 1

vector of ones. Then,

W_{n} = M_{n} = Diag (B_{1}, \dots, B_{G})

.

The observations in a group have the same variance, and I use the group size to create heteroskedasticity. If the group size is greater than 35, I set the variance of that group equal to its size raised to

0.4

power; otherwise I let the variance be the square of the inverse of the group size. Then, the i-th element of the innovation vector

ε_{n}

is generated according to

ε_{n i} = σ_{n i} ξ_{n i}

, where

σ_{n i}

is the standard deviation of the i-th innovation and

ξ_{n i}

’s are i.i.d. Normal(0,1).
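The design described above can be sketched as follows. Several details are assumptions, since the text does not specify them: group sizes are drawn as integers from 15 to 50 inclusive, the number of groups is passed in directly so that n is the sum of the drawn sizes, and the function name and seed are hypothetical.

```python
import numpy as np

def simulate_design(n_groups, rng):
    """Block-diagonal W_n = M_n, group-level variances and heteroskedastic innovations."""
    sizes = rng.integers(15, 51, size=n_groups)      # assumed: integer sizes in 15..50
    n = int(sizes.sum())                             # assumed: n is the sum of group sizes
    W = np.zeros((n, n))
    sigma2 = np.empty(n)
    start = 0
    for m in sizes:
        m = int(m)
        stop = start + m
        # Block B_i = (l l' - I) / (m_i - 1): equal weight on the other group members.
        W[start:stop, start:stop] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
        # Variance rule from the text: size^0.4 if size > 35, else (1 / size)^2.
        sigma2[start:stop] = m ** 0.4 if m > 35 else (1.0 / m) ** 2
        start = stop
    xi = rng.standard_normal(n)                      # i.i.d. Normal(0,1)
    eps = np.sqrt(sigma2) * xi                       # eps_ni = sigma_ni * xi_ni
    return sizes, W, sigma2, eps

rng = np.random.default_rng(0)
sizes, W, sigma2, eps = simulate_design(n_groups=5, rng=rng)
```

Because the blocks have different sizes, the diagonal elements of the resulting ${\bar{G}}_{n}$ and $H_{n}$ vary across groups, and the variance rule ties that variation to the group size.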

I use the following expressions to measure the level of signal-to-noise in this setup [35]:

\begin{matrix} R_{S A R M A (1, 1)}^{2} & = 1 - \frac{tr (R_{n}^{^{'}} S_{n}^{- 1^{'}} S_{n}^{- 1} R_{n} Σ_{n})}{β_{0}^{^{'}} X_{n}^{^{'}} S_{n}^{- 1^{'}} S_{n}^{- 1} X_{n} β_{0} + tr (R_{n}^{^{'}} S_{n}^{- 1^{'}} S_{n}^{- 1} R_{n} Σ_{n})} \end{matrix}

(28)

\begin{matrix} R_{S A R M A (0, 1)}^{2} & = 1 - \frac{tr (R_{n}^{^{'}} R_{n} Σ_{n})}{β_{0}^{^{'}} X_{n}^{^{'}} X_{n} β_{0} + tr (R_{n}^{^{'}} R_{n} Σ_{n})} \end{matrix}

(29)

where

Σ_{n}

is the diagonal

n \times n

covariance matrix of the disturbance terms. This setup yields an

R^{2}

value close to

0.55

. For each specification, the Monte Carlo experiment is based on 1,000 repetitions.
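The signal-to-noise measure in Equation (28) can be computed without forming $Σ_{n}$ explicitly, since the trace reduces to a weighted sum of squared column norms of $S_{n}^{-1} R_{n}$. The sketch below assumes $R_{n} = I_{n} - ρ_{0} M_{n}$; the function name and the inputs are hypothetical. Setting $λ_{0} = 0$ reproduces Equation (29).

```python
import numpy as np

def r2_sarma(X, beta0, W, M, lam0, rho0, sigma2):
    """Signal-to-noise measure of Equation (28); lam0 = 0 gives Equation (29)."""
    n = W.shape[0]
    I = np.eye(n)
    S_inv = np.linalg.inv(I - lam0 * W)          # S_n^{-1}
    A = S_inv @ (I - rho0 * M)                   # S_n^{-1} R_n (assumed form of R_n)
    # tr(R' S^{-1}' S^{-1} R Sigma) = sum_i sigma_i^2 * ||column i of A||^2
    noise = np.sum(A ** 2, axis=0) @ sigma2
    signal_vec = S_inv @ X @ beta0               # S_n^{-1} X_n beta_0
    signal = signal_vec @ signal_vec
    return 1.0 - noise / (signal + noise)

# Hypothetical inputs for illustration.
rng = np.random.default_rng(0)
n = 60
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
X = rng.standard_normal((n, 2))
beta0 = np.array([1.0, 1.0])
sigma2 = np.where(idx < n // 2, 0.5, 2.0)

r2_full = r2_sarma(X, beta0, W, W, lam0=0.3, rho0=0.6, sigma2=sigma2)
r2_sma = r2_sarma(X, beta0, W, W, lam0=0.0, rho0=0.6, sigma2=sigma2)
```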

6.2. Simulation Results

The simulation results are presented in Appendix C and Appendix D. In each table, the empirical mean (Mean), the bias (Bias), the empirical standard error (SE) and the root mean square error (RMSE) of the parameter estimates are presented next to each other.
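The four reported statistics are related by the identity RMSE² = Bias² + SE² when the standard error is computed with divisor equal to the number of repetitions. A minimal sketch, assuming that divisor (the text does not state it) and using a hypothetical function name and made-up draws:

```python
import numpy as np

def summarize(estimates, true_value):
    """Mean, Bias, SE and RMSE of Monte Carlo estimates of a scalar parameter."""
    est = np.asarray(estimates, dtype=float)
    mean = est.mean()
    bias = mean - true_value
    se = est.std()                                   # assumed divisor: number of repetitions
    rmse = np.sqrt(np.mean((est - true_value) ** 2))
    return {"Mean": mean, "Bias": bias, "SE": se, "RMSE": rmse}

# Made-up draws, only to exercise the function.
rng = np.random.default_rng(0)
draws = -0.4 + 0.6 * rng.standard_normal(1000)
stats = summarize(draws, true_value=-0.6)
```

The same identity can be used to cross-check table entries: for example, a bias of 0.195 and an SE of 0.618 imply an RMSE of about 0.648.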

First, I consider the simulation results for the SARMA(0,1) model, which are presented in Table C1 of Appendix C. The MLE imposes almost no bias on

β_{10}

and

β_{20}

in all cases. The moving average parameter

ρ_{0}

has a substantial amount of bias when

n = 100

, but the amount of bias decreases as the sample size increases. Despite this, the MLE imposes a significant amount of bias on

ρ_{0}

when

n = 500

and

n =

1,000 in cases where the true value of

ρ_{0}

is nonzero. Overall, the simulation results are consistent with our large sample results. That is, the MLE of

β_{10}

and

β_{20}

is consistent, while the MLE of

ρ_{0}

is inconsistent in the presence of heteroskedasticity.

Now, we turn to the simulation results for the case of SARMA(1,1). First, I consider the simulation results for

λ_{0}

and

ρ_{0}

. Table D2 shows the estimation results for

n = 100

. The MLE imposes a substantial amount of bias on both parameters in all cases. The amount of bias for

λ_{0}

is relatively larger when there exists a strong negative spatial dependence in the dependent variable. There is a similar pattern for

ρ_{0}

, where the amount of bias and RMSE is, in general, larger for the cases of high negative spatial dependence in both the dependent variable and disturbance term. The pattern that we see for

λ_{0}

and

ρ_{0}

shows itself for the estimation results of

β_{10}

and

β_{20}

. That is, the reported biases and RMSEs are relatively larger for

β_{10}

and

β_{20}

, when there are strong spatial dependences in the model.

Table D3 contains the simulation results when

n = 500

. The same pattern that I described for

λ_{0}

and

ρ_{0}

is also prevalent in Table D3. The MLE still imposes a substantial amount of bias on

λ_{0}

and

ρ_{0}

. The noticeable improvement in the estimation results for

β_{10}

and

β_{20}

suggests that these parameters are less affected by the inconsistency of the MLE of

λ_{0}

and

ρ_{0}

, when the sample size is moderately large. The estimation results in Table D4 for

β_{10}

and

β_{20}

are also consistent with this claim. That is, when the sample size is large, i.e.,

n =

1,000, the MLE imposes trivial bias on

β_{10}

and

β_{20}

in most cases. On the other hand, the estimation results in Table D4 show that the MLE imposes significant bias on

λ_{0}

and

ρ_{0}

, which, in turn, implies the inconsistency of the MLE for these parameters.

I now evaluate the finite sample efficiency measured by the RMSE of the MLE through the surface plots given in Appendix E. Figure E1 shows the surface plots of RMSEs for

β_{10}

and

β_{20}

. It is clear from the surface plots that the MLE has higher RMSEs when strong spatial dependence exists in the model. The surface plots in Figure E2 are for

λ_{0}

and

ρ_{0}

. These surface plots indicate that the MLE of these parameters has higher RMSEs when there exists strong negative spatial dependence in both the dependent variable and disturbance term.

7. Conclusions

In this study, I show that the MLE of the spatial autoregressive and moving average parameters for the SARMA(1,1) specification is generally inconsistent in the presence of heteroskedastic disturbances. The analytical results indicate that the concentrated log-likelihood function is not maximized at the true parameter values when heteroskedasticity is not considered in the estimation. The necessary condition for the consistency of the MLE depends on the specification of spatial weight matrices. I also show that the MLE of the parameters of the exogenous variables is inconsistent, and I state the expression for the corresponding asymptotic bias.

The Monte Carlo results show that the MLE imposes a substantial amount of bias on the spatial autoregressive and moving average parameters in all cases for all sample sizes when the spatial weight matrix has non-identical blocks on the diagonals. The simulation results also show that the inconsistency of the spatial autoregressive and moving average parameters has almost no effect on the estimates of the parameters of the exogenous variables for cases where the sample size is large.

Appendix

A: Some Useful Lemmas

Lemma 1.

Let

A_{n}

,

B_{n}

and

C_{n}

be

n \times n

matrices with

(i, j)

-th elements, respectively denoted by

a_{n, i j}

,

b_{n, i j}

and

c_{n, i j}

. Assume that

A_{n}

and

B_{n}

have zero diagonal elements and

C_{n}

has uniformly-bounded row and column sums in absolute value. Let

q_{n}

be an

n \times 1

vector with uniformly-bounded elements in absolute value. Assume that

ε_{n}

satisfies Assumption 1 with the covariance matrix denoted by

Σ_{n}

=

Diag {σ_{n 1}^{2}, \dots, σ_{n n}^{2}}

. Then,

(1): $E (ε_{n}^{^{'}} A_{n} ε_{n} \cdot ε_{n}^{^{'}} B_{n} ε_{n}) = \sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{n, i j} (b_{n, i j} + b_{n, j i}) σ_{n i}^{2} σ_{n j}^{2} = tr (Σ_{n} A_{n} (B_{n}^{^{'}} Σ_{n} + Σ_{n} B_{n}))$
(2): $\begin{matrix} E {(ε_{n}^{^{'}} C_{n} ε_{n})}^{2} & = \sum_{i = 1}^{n} c_{n, i i}^{2} [E (ε_{n i}^{4}) - 3 σ_{n i}^{4}] + {(\sum_{i = 1}^{n} c_{n, i i} σ_{n i}^{2})}^{2} \\ + \sum_{i = 1}^{n} \sum_{j = 1}^{n} c_{n, i j} (c_{n, i j} + c_{n, j i}) σ_{n i}^{2} σ_{n j}^{2} \\ = \sum_{i = 1}^{n} c_{n, i i}^{2} [E (ε_{n i}^{4}) - 3 σ_{n i}^{4}] + {tr}^{2} (Σ_{n} C_{n}) + tr (Σ_{n} C_{n} C_{n}^{^{'}} Σ_{n} + Σ_{n} C_{n} Σ_{n} C_{n}), \end{matrix}$
(3): $\begin{matrix} Var (ε_{n}^{^{'}} C_{n} ε_{n}) & = \sum_{i = 1}^{n} c_{n, i i}^{2} [E (ε_{n i}^{4}) - 3 σ_{n i}^{4}] + \sum_{i = 1}^{n} \sum_{j = 1}^{n} c_{n, i j} (c_{n, i j} + c_{n, j i}) σ_{n i}^{2} σ_{n j}^{2} \\ = \sum_{i = 1}^{n} c_{n, i i}^{2} [E (ε_{n i}^{4}) - 3 σ_{n i}^{4}] + tr (Σ_{n} C_{n} C_{n}^{^{'}} Σ_{n} + Σ_{n} C_{n} Σ_{n} C_{n}) . \end{matrix}$
(4): $E (ε_{n}^{^{'}} C_{n} ε_{n}) = O (n), Var (ε_{n}^{^{'}} C_{n} ε_{n}) = O (n), ε_{n}^{^{'}} C_{n} ε_{n} = O_{p} (n) .$
(5): $E (C_{n} ε_{n}) = 0, Var (C_{n} ε_{n}) = O (n), C_{n} ε_{n} = O_{p} (n), Var (q_{n}^{^{'}} C_{n} ε_{n}) = O (n), q_{n}^{^{'}} C_{n} ε_{n} = O_{p} (n) .$

Proof.

For (1), (2), (3), (4) and (5), see Lemmas A.1 through A.4 in Lin and Lee [22] and Lemma 2 in Dogan and Suleyman [36]. ▢

Lemma 2.

Consider

{\bar{M}}_{n} = (I_{n} - P_{n})

, where

P_{n} = {\bar{X}}_{n} {({\bar{X}}_{n}^{^{'}} {\bar{X}}_{n})}^{- 1} {\bar{X}}_{n}^{^{'}}

under Assumption 3. Assume that

ε_{n}

satisfies Assumption 1 with the covariance matrix denoted by

Σ_{n}

=

Diag {σ_{n 1}^{2}, \dots, σ_{n n}^{2}}

. Then,

(1): ${\bar{M}}_{n}$ and $P_{n}$ are uniformly bounded in absolute value in both row and column sums.
(2): $Var (P_{n} ε_{n}) = O (\frac{1}{n}), P_{n} ε_{n} = o_{p} (1), Var (ε_{n}^{^{'}} P_{n} ε_{n}) = O (\frac{1}{n}), ε_{n}^{^{'}} P_{n} ε_{n} = O_{p} (1) .$
(3): Elements of $P_{n}$ are $O (\frac{1}{n})$.

Proof.

The proof is similar to the proof of Lemma 3 in Dogan and Suleyman [36]. Hence, it is omitted. ▢

B: Proof of Proposition 1

For the probability limit of terms in Equation (19), the partial derivatives

\frac{\partial {\hat{σ}}_{n}^{2} (δ)}{\partial ρ}

,

\frac{\partial {\hat{σ}}_{n}^{2} (δ)}{\partial λ}

and

\frac{\partial {\bar{M}}_{n} (ρ)}{\partial ρ}

are required, which are given by:

\begin{matrix} (1) & \frac{\partial {\bar{M}}_{n} (ρ)}{\partial ρ} = - [R_{n}^{- 1} (ρ) M_{n} {\bar{X}}_{n} (ρ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ)] - [{\bar{X}}_{n} (ρ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} \\ \times {\bar{X}}_{n}^{^{'}} (ρ) M_{n}^{^{'}} R_{n}^{^{'} - 1} (ρ)] + [{\bar{X}}_{n} (ρ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) H_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ)] \\ + [{\bar{X}}_{n} (ρ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ) H_{n} (ρ) {\bar{X}}_{n} (ρ) {({\bar{X}}_{n}^{^{'}} (ρ) {\bar{X}}_{n} (ρ))}^{- 1} {\bar{X}}_{n}^{^{'}} (ρ)] \end{matrix}

\begin{matrix} (2) \frac{\partial {\hat{σ}}_{n}^{2} (δ)}{\partial ρ} & = [\frac{2}{n} Y_{n}^{^{'}} S_{n}^{^{'}} (λ) H_{n}^{^{'}} (ρ) R_{n}^{^{'} - 1} (ρ) {\bar{M}}_{n} (ρ) R_{n}^{- 1} (ρ) S_{n} (λ) Y_{n}] \\ - [\frac{2}{n} Y_{n}^{^{'}} S_{n}^{^{'}} (λ) R_{n}^{^{'} - 1} (ρ) P_{n} (ρ) H_{n}^{^{'}} {\bar{M}}_{n} (ρ) R_{n}^{- 1} (ρ) S_{n} (λ) Y_{n}] . \end{matrix}

\begin{matrix} (3) \frac{\partial {\hat{σ}}_{n}^{2} (δ)}{\partial λ} & = - [\frac{2}{n} Y_{n}^{^{'}} S_{n}^{^{'}} (λ) R_{n}^{^{'} - 1} (ρ) {\bar{M}}_{n} (ρ) R_{n}^{- 1} (ρ) W_{n} Y_{n}] . \end{matrix}

First, the probability limit of the first row in Equation (19) is investigated:

\begin{matrix} \underset{n \to \infty}{plim} \frac{1}{n} (- \frac{n}{\frac{2}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \frac{\partial {\hat{σ}}_{n}^{2} (δ_{0})}{\partial λ}) & = \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} {\bar{G}}_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} + \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} {\bar{G}}_{n} {\bar{X}}_{n} β_{0}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \end{matrix}

(30)

where we use

{\bar{X}}_{n}^{^{'}} {\bar{M}}_{n} = 0_{k \times n}

. For the second term on the r.h.s. of Equation (30), we have:

\begin{matrix} \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} R_{n}^{- 1} G_{n} X_{n} β_{0}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} & = 0 \end{matrix}

(31)

since the numerator converges in probability to zero by Lemma 1(5) and Lemma 2(1), and for the term in the denominator, we have

\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n} = \frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2} + o_{p} (1)

, as shown in Equation (16). The overall result is zero, since

\frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2}

is uniformly bounded for all n by Assumption 1. As for the first term on the r.h.s of Equation (30), we have:

\begin{matrix} \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} {\bar{G}}_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} = \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{G}}_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} - \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{X}}_{n} {({\bar{X}}_{n}^{^{'}} {\bar{X}}_{n})}^{- 1} {\bar{X}}_{n}^{^{'}} {\bar{G}}_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \end{matrix}

(32)

We first evaluate the last term in (32). The numerator of this term tends to zero in probability as n goes to infinity by Lemma 1(4) and Assumption 3. Hence, the last term in Equation (32) vanishes.

Now, we return to the first term in the r.h.s. of Equation (32). By Lemma 1(4),

Var (\frac{1}{n} ε_{n}^{^{'}} {\bar{G}}_{n} ε_{n}) = O (\frac{1}{n}) = o (1)

. Then, the Chebyshev inequality implies that

{plim}_{n \to \infty} (\frac{1}{n} ε_{n}^{^{'}} {\bar{G}}_{n} ε_{n} - E (\frac{1}{n} ε_{n}^{^{'}} {\bar{G}}_{n} ε_{n})) = {plim}_{n \to \infty} (\frac{1}{n} ε_{n}^{^{'}} {\bar{G}}_{n} ε_{n} - \frac{1}{n} \sum_{i = 1}^{n} {\bar{G}}_{n, i i} σ_{n i}^{2}) = 0

. Hence,

\begin{matrix} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{G}}_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} = \frac{\frac{1}{n} \sum_{i = 1}^{n} {\bar{G}}_{n, i i} σ_{n i}^{2}}{\frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2}} + o_{p} (1) \end{matrix}

(33)

These results imply the following one:

\frac{1}{n} (- \frac{n}{\frac{2}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \frac{\partial {\hat{σ}}_{n}^{2} (δ_{0})}{\partial λ}) = \frac{\frac{1}{n} \sum_{i = 1}^{n} {\bar{G}}_{n, i i} σ_{n i}^{2}}{\frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2}} + o_{p} (1)

(34)

Now, we return to the first term in the second row of Equation (19):

\begin{matrix} \underset{n \to \infty}{plim} \frac{1}{n} (- \frac{n}{\frac{2}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \frac{\partial {\hat{σ}}_{n}^{2} (δ_{0})}{\partial ρ}) & = - \underset{n \to \infty}{plim} \frac{\frac{1}{n} Y_{n}^{^{'}} S_{n}^{^{'}} H_{n}^{^{'}} R_{n}^{^{'} - 1} {\bar{M}}_{n} R_{n}^{- 1} S_{n} Y_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \end{matrix}

\begin{matrix} + \underset{n \to \infty}{plim} \frac{\frac{1}{n} Y_{n}^{^{'}} S_{n}^{^{'}} R_{n}^{^{'} - 1} P_{n} H_{n}^{^{'}} {\bar{M}}_{n} R_{n}^{- 1} S_{n} Y_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \end{matrix}

(35)

Each term is handled separately below by using

R_{n}^{- 1} S_{n} Y_{n} = {\bar{X}}_{n} β_{0} + ε_{n}

,

S_{n} Y_{n} = X_{n} β_{0} + R_{n} ε_{n}

,

{\bar{X}}_{n}^{^{'}} {\bar{M}}_{n} = 0_{k \times n}

and

{\bar{M}}_{n} {\bar{X}}_{n} = 0_{n \times k}

. Note that

\frac{1}{n} Y_{n}^{^{'}} S_{n}^{^{'}} R_{n}^{^{'} - 1} P_{n} H_{n}^{^{'}} {\bar{M}}_{n} R_{n}^{- 1} S_{n} Y_{n} = \frac{1}{n} β_{0}^{^{'}} {\bar{X}}_{n}^{^{'}} P_{n} H_{n}^{^{'}} {\bar{M}}_{n} ε_{n} + \frac{1}{n} ε_{n}^{^{'}} P_{n} H_{n}^{^{'}} {\bar{M}}_{n} ε_{n}

. By Lemma 1(5) and Lemma 2(1),

\frac{1}{n} β_{0}^{^{'}} {\bar{X}}_{n}^{^{'}} P_{n} H_{n}^{^{'}} {\bar{M}}_{n} ε_{n} = o_{p} (1)

. For the remaining term, by Lemma 2, we have

\frac{1}{n} ε_{n}^{^{'}} P_{n} H_{n}^{^{'}} {\bar{M}}_{n} ε_{n} = o_{p} (1)

. Hence, the second term on the r.h.s. of Equation (35) vanishes.

The first term on the r.h.s. of Equation (35) can be written as:

\begin{matrix} - \underset{n \to \infty}{plim} & \frac{\frac{1}{n} Y_{n}^{^{'}} S_{n}^{^{'}} R_{n}^{^{'} - 1} {\bar{M}}_{n} H_{n} R_{n}^{- 1} S_{n} Y_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} = - \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} H_{n} {\bar{X}}_{n} β_{0}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} - \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} H_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \end{matrix}

(36)

Substituting

{\bar{M}}_{n} = I_{n} - {\bar{X}}_{n} {({\bar{X}}_{n}^{^{'}} {\bar{X}}_{n})}^{- 1} {\bar{X}}_{n}^{^{'}}

into Equation (36) yields:

\begin{matrix} - \underset{n \to \infty}{plim} \frac{\frac{1}{n} Y_{n}^{^{'}} S_{n}^{^{'}} R_{n}^{^{'} - 1} {\bar{M}}_{n} H_{n} R_{n}^{- 1} S_{n} Y_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} & = - \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} H_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} - \underset{n \to \infty}{plim} \frac{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} H_{n} {\bar{X}}_{n} β_{0}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \end{matrix}

\begin{matrix} + \underset{n \to \infty}{plim} \frac{\frac{1}{n^{2}} ε_{n}^{^{'}} {\bar{X}}_{n} {(\frac{1}{n} {\bar{X}}_{n}^{^{'}} {\bar{X}}_{n})}^{- 1} {\bar{X}}_{n}^{^{'}} H_{n} ε_{n}}{\frac{1}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \end{matrix}

(37)

By Lemma 1(5) and Equation (16), the second term on the r.h.s of Equation (37) vanishes. The third term vanishes by Lemma 1(4) and Equation (16). The probability limit of the remaining term can be found by the Chebyshev inequality. By Lemma 1(4), we have

Var (\frac{1}{n} ε_{n}^{^{'}} H_{n} ε_{n}) = O (\frac{1}{n}) = o (1)

. Hence,

{plim}_{n \to \infty} (\frac{1}{n} ε_{n}^{^{'}} H_{n} ε_{n} - E (\frac{1}{n} ε_{n}^{^{'}} H_{n} ε_{n})) = {plim}_{n \to \infty} (\frac{1}{n} ε_{n}^{^{'}} H_{n} ε_{n} - \frac{1}{n} \sum_{i = 1}^{n} H_{n, i i} σ_{n i}^{2}) = 0

. Combining these results, we get the following result for the first term in the second row of Equation (19):

\frac{1}{n} (- \frac{n}{\frac{2}{n} ε_{n}^{^{'}} {\bar{M}}_{n} ε_{n}} \frac{\partial {\hat{σ}}_{n}^{2} (δ_{0})}{\partial ρ}) = - \frac{\frac{1}{n} \sum_{i = 1}^{n} H_{n, i i} σ_{n i}^{2}}{\frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2}} + o_{p} (1)

(38)

By combining the results in Equations (34) and (38), we obtain:

\begin{matrix} \frac{1}{n} \frac{\partial ln L_{n} (δ_{0})}{\partial δ} & = (\begin{matrix} \frac{\frac{1}{n} \sum_{i = 1}^{n} {\bar{G}}_{n, i i} σ_{n i}^{2}}{\frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2}} - \frac{1}{n} t r (G_{n}) + o_{p} (1) \\ - (\frac{\frac{1}{n} \sum_{i = 1}^{n} H_{n, i i} σ_{n i}^{2}}{\frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2}} - \frac{1}{n} t r (H_{n})) + o_{p} (1) \end{matrix}) \end{matrix}

(39)

For notational simplicity, denote

H_{n}^{*} = \frac{1}{n} t r (H_{n}) = \frac{1}{n} \sum_{i = 1}^{n} H_{n, i i}

,

{\bar{G}}_{n}^{*} = \frac{1}{n} t r ({\bar{G}}_{n}) = \frac{1}{n} \sum_{i = 1}^{n} {\bar{G}}_{n, i i}

and

{\bar{σ}}^{2} = \frac{1}{n} \sum_{i = 1}^{n} σ_{n i}^{2}

. Then, Equation (39) can be written in a more convenient form as:

\begin{matrix} \frac{1}{n} \frac{\partial ln L_{n} (δ_{0})}{\partial δ} & = (\begin{matrix} \frac{\frac{1}{n} \sum_{i = 1}^{n} ({\bar{G}}_{n, i i} - {\bar{G}}_{n}^{*}) (σ_{n i}^{2} - {\bar{σ}}^{2})}{{\bar{σ}}^{2}} + \frac{1}{n} t r ({\bar{G}}_{n} - G_{n}) + o_{p} (1) \\ - \frac{\frac{1}{n} \sum_{i = 1}^{n} (H_{n, i i} - H_{n}^{*}) (σ_{n i}^{2} - {\bar{σ}}^{2})}{{\bar{σ}}^{2}} + o_{p} (1) \end{matrix}) \end{matrix}

\begin{matrix} = (\begin{matrix} \frac{cov ({\bar{G}}_{n, i i}, σ_{n i}^{2})}{{\bar{σ}}^{2}} + o_{p} (1) \\ - \frac{cov (H_{n, i i}, σ_{n i}^{2})}{{\bar{σ}}^{2}} + o_{p} (1) \end{matrix}) \end{matrix}

(40)
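The passage from Equation (39) to Equation (40) rests on an elementary identity, spelled out here as a sketch. The generic sequence $a_{i}$ and its average $a^{*}$ are notation introduced only for this illustration; taking $a_{i} = {\bar{G}}_{n, i i}$ or $a_{i} = H_{n, i i}$ gives the two rows.

```latex
\frac{1}{n} \sum_{i = 1}^{n} a_{i} σ_{n i}^{2}
  = \frac{1}{n} \sum_{i = 1}^{n} (a_{i} - a^{*}) (σ_{n i}^{2} - {\bar{σ}}^{2}) + a^{*} {\bar{σ}}^{2},
  \qquad a^{*} = \frac{1}{n} \sum_{i = 1}^{n} a_{i}

\frac{\frac{1}{n} \sum_{i = 1}^{n} a_{i} σ_{n i}^{2}}{{\bar{σ}}^{2}} - a^{*}
  = \frac{Cov (a_{i}, σ_{n i}^{2})}{{\bar{σ}}^{2}}

\frac{1}{n} tr ({\bar{G}}_{n}) = \frac{1}{n} tr (R_{n}^{- 1} G_{n} R_{n}) = \frac{1}{n} tr (G_{n})
```

The first line follows by expanding the product and using the definitions of the two averages. The last line uses the invariance of the trace under similarity transformations, which is why the term involving $tr ({\bar{G}}_{n} - G_{n})$ drops out of the first row.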

C: Simulation Results for SARMA(0,1)

**Table C1.** Simulation results for SARMA(0,1).
$n = 100$	$β_{1}$	$β_{2}$	ρ
ρ	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]
−0.6	(0.987)[−0.013](0.209)[0.209]	(1.009)[0.009](0.218)[0.218]	(−0.405)[0.195](0.618)[0.648]
−0.3	(1.000)[−0.000](0.211)[0.211]	(0.998)[−0.002](0.203)[0.203]	(−0.205)[0.095](0.630)[0.637]
0.0	(1.001)[0.001](0.214)[0.214]	(1.008)[0.008](0.229)[0.229]	(0.101)[0.101](0.565)[0.574]
0.3	(1.000)[0.000](0.217)[0.217]	(0.993)[−0.007](0.222)[0.222]	(0.434)[0.134](0.386)[0.409]
0.6	(0.996)[−0.004](0.212)[0.212]	(0.998)[−0.002](0.210)[0.210]	(0.710)[0.110](0.204)[0.232]
$n = 500$
−0.6	(1.006)[0.006](0.083)[0.083]	(0.995)[−0.005](0.082)[0.082]	(−0.652)[−0.052](0.377)[0.380]
−0.3	(1.001)[0.001](0.083)[0.083]	(1.002)[0.002](0.084)[0.084]	(−0.354)[−0.054](0.388)[0.392]
0.0	(0.998)[−0.002](0.084)[0.084]	(1.002)[0.002](0.082)[0.082]	(0.007)[0.007](0.293)[0.293]
0.3	(1.005)[0.005](0.085)[0.085]	(0.998)[−0.002](0.080)[0.080]	(0.346)[0.046](0.189)[0.194]
0.6	(0.997)[−0.003](0.078)[0.078]	(1.002)[0.002](0.081)[0.081]	(0.652)[0.052](0.095)[0.108]
$n =$ 1,000
−0.6	(1.000)[−0.000](0.058)[0.058]	(1.000)[0.000](0.059)[0.059]	(−0.682)[−0.082](0.284)[0.296]
−0.3	(0.999)[−0.001](0.059)[0.059]	(0.998)[−0.002](0.058)[0.058]	(−0.342)[−0.042](0.274)[0.277]
0.0	(1.000)[−0.000](0.057)[0.057]	(1.004)[0.004](0.059)[0.059]	(0.010)[0.010](0.191)[0.191]
0.3	(0.998)[−0.002](0.058)[0.058]	(1.000)[0.000](0.058)[0.058]	(0.330)[0.030](0.125)[0.128]
0.6	(1.002)[0.002](0.057)[0.057]	(0.999)[−0.001](0.057)[0.057]	(0.630)[0.030](0.072)[0.078]

D: Simulation Results for SARMA(1,1)

**Table D2.** Simulation results for SARMA(1,1): $n = 100$ .
		λ	$β_{1}$	$β_{2}$	ρ
λ	ρ	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]
−0.6	−0.6	(−1.583)[−0.983](4.262)[4.374]	(0.874)[−0.126](0.342)[0.364]	(0.898)[−0.102](0.350)[0.365]	(−0.273)[0.327](0.981)[1.034]
−0.6	−0.3	(−1.790)[−1.190](4.346)[4.506]	(0.848)[−0.152](0.371)[0.401]	(0.847)[−0.153](0.361)[0.392]	(−0.178)[0.122](0.997)[1.004]
−0.6	0.0	(−1.794)[−1.194](4.355)[4.516]	(0.867)[−0.133](0.357)[0.381]	(0.865)[−0.135](0.353)[0.378]	(0.021)[0.021](0.934)[0.934]
−0.6	0.3	(−1.404)[−0.804](3.687)[3.773]	(0.839)[−0.161](0.379)[0.412]	(0.851)[−0.149](0.382)[0.410]	(0.264)[−0.036](0.709)[0.710]
−0.6	0.6	(−0.591)[0.009](1.108)[1.108]	(0.760)[−0.240](0.455)[0.515]	(0.760)[−0.240](0.455)[0.515]	(0.470)[−0.130](0.342)[0.366]
−0.3	−0.6	(−0.907)[−0.607](3.275)[3.331]	(0.912)[−0.088](0.325)[0.337]	(0.907)[−0.093](0.324)[0.337]	(−0.259)[0.341](0.822)[0.890]
−0.3	−0.3	(−1.132)[−0.832](3.497)[3.594]	(0.882)[−0.118](0.351)[0.370]	(0.881)[−0.119](0.362)[0.381]	(−0.136)[0.164](0.906)[0.920]
−0.3	0.0	(−1.335)[−1.035](3.840)[3.977]	(0.857)[−0.143](0.361)[0.388]	(0.861)[−0.139](0.367)[0.393]	(−0.005)[−0.005](0.861)[0.861]
−0.3	0.3	(−1.045)[−0.745](3.364)[3.445]	(0.840)[−0.160](0.399)[0.430]	(0.835)[−0.165](0.400)[0.433]	(0.220)[−0.080](0.709)[0.714]
−0.3	0.6	(−0.574)[−0.274](1.873)[1.893]	(0.768)[−0.232](0.466)[0.521]	(0.758)[−0.242](0.459)[0.519]	(0.436)[−0.164](0.390)[0.423]
0.0	−0.6	(−0.452)[−0.452](2.570)[2.609]	(0.904)[−0.096](0.354)[0.367]	(0.898)[−0.102](0.350)[0.365]	(−0.292)[0.308](0.721)[0.784]
0.0	−0.3	(−0.690)[−0.690](3.123)[3.199]	(0.903)[−0.097](0.337)[0.350]	(0.889)[−0.111](0.340)[0.358]	(−0.208)[0.092](0.772)[0.778]
0.0	0.0	(−0.834)[−0.834](3.174)[3.282]	(0.841)[−0.159](0.383)[0.415]	(0.857)[−0.143](0.391)[0.416]	(−0.079)[−0.079](0.804)[0.808]
0.0	0.3	(−0.450)[−0.450](2.131)[2.178]	(0.839)[−0.161](0.407)[0.438]	(0.838)[−0.162](0.412)[0.442]	(0.238)[−0.062](0.590)[0.593]
0.0	0.6	(−0.278)[−0.278](1.068)[1.104]	(0.768)[−0.232](0.469)[0.523]	(0.763)[−0.237](0.463)[0.521]	(0.411)[−0.189](0.349)[0.397]
0.3	−0.6	(0.068)[−0.232](1.429)[1.448]	(0.938)[−0.062](0.311)[0.317]	(0.951)[−0.049](0.307)[0.311]	(−0.384)[0.216](0.543)[0.585]
0.3	−0.3	(−0.157)[−0.457](2.174)[2.221]	(0.903)[−0.097](0.344)[0.358]	(0.902)[−0.098](0.345)[0.359]	(−0.279)[0.021](0.623)[0.623]
0.3	0.0	(−0.211)[−0.511](2.030)[2.094]	(0.867)[−0.133](0.376)[0.399]	(0.864)[−0.136](0.381)[0.404]	(−0.161)[−0.161](0.660)[0.679]
0.3	0.3	(−0.203)[−0.503](2.007)[2.069]	(0.819)[−0.181](0.437)[0.473]	(0.813)[−0.187](0.432)[0.471]	(0.095)[−0.205](0.621)[0.654]
0.3	0.6	(−0.022)[−0.322](0.735)[0.802]	(0.659)[−0.341](0.508)[0.612]	(0.657)[−0.343](0.503)[0.609]	(0.329)[−0.271](0.381)[0.468]
0.6	−0.6	(0.422)[−0.178](0.712)[0.733]	(0.981)[−0.019](0.231)[0.232]	(0.981)[−0.019](0.230)[0.231]	(−0.584)[0.016](0.346)[0.346]
0.6	−0.3	(0.376)[−0.224](0.580)[0.621]	(0.976)[−0.024](0.253)[0.254]	(0.965)[−0.035](0.255)[0.257]	(−0.511)[−0.211](0.329)[0.391]
0.6	0.0	(0.270)[−0.330](0.842)[0.905]	(0.961)[−0.039](0.294)[0.296]	(0.945)[−0.055](0.292)[0.297]	(−0.412)[−0.412](0.386)[0.564]
0.6	0.3	(0.152)[−0.448](1.345)[1.418]	(0.921)[−0.079](0.326)[0.335]	(0.920)[−0.080](0.335)[0.344]	(−0.286)[−0.586](0.415)[0.719]
0.6	0.6	(0.159)[−0.441](0.767)[0.884]	(0.802)[−0.198](0.436)[0.479]	(0.800)[−0.200](0.432)[0.476]	(−0.059)[−0.659](0.414)[0.779]

**Table D3.** Simulation results for SARMA(1,1): $n = 500$ .
		λ	$β_{1}$	$β_{2}$	ρ
λ	ρ	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]
−0.6	−0.6	(−3.051)[−2.451](7.286)[7.687]	(0.914)[−0.086](0.257)[0.271]	(0.911)[−0.089](0.256)[0.271]	(−1.040)[−0.440](1.921)[1.970]
−0.6	−0.3	(−2.905)[−2.305](7.213)[7.572]	(0.916)[−0.084](0.253)[0.267]	(0.918)[−0.082](0.254)[0.267]	(−0.725)[−0.425](1.913)[1.960]
−0.6	0.0	(−1.771)[−1.171](5.677)[5.797]	(0.953)[−0.047](0.203)[0.208]	(0.949)[−0.051](0.204)[0.210]	(−0.123)[−0.123](1.427)[1.432]
−0.6	0.3	(−0.977)[−0.377](3.577)[3.597]	(0.985)[−0.015](0.142)[0.143]	(0.982)[−0.018](0.140)[0.141]	(0.303)[0.003](0.814)[0.814]
−0.6	0.6	(−0.667)[−0.067](0.139)[0.154]	(1.003)[0.003](0.088)[0.088]	(1.006)[0.006](0.085)[0.085]	(0.609)[0.009](0.087)[0.087]
−0.3	−0.6	(−0.985)[−0.685](4.189)[4.244]	(0.979)[−0.021](0.163)[0.165]	(0.975)[−0.025](0.162)[0.164]	(−0.608)[−0.008](1.201)[1.201]
−0.3	−0.3	(−1.513)[−1.213](5.164)[5.304]	(0.953)[−0.047](0.187)[0.193]	(0.960)[−0.040](0.189)[0.193]	(−0.577)[−0.277](1.472)[1.498]
−0.3	0.0	(−1.196)[−0.896](4.602)[4.689]	(0.972)[−0.028](0.174)[0.177]	(0.968)[−0.032](0.171)[0.174]	(−0.155)[−0.155](1.284)[1.293]
−0.3	0.3	(−0.457)[−0.157](1.797)[1.804]	(0.996)[−0.004](0.103)[0.103]	(0.994)[−0.006](0.102)[0.102]	(0.312)[0.012](0.449)[0.449]
−0.3	0.6	(−0.460)[−0.160](0.212)[0.266]	(1.000)[−0.000](0.082)[0.082]	(1.006)[0.006](0.082)[0.082]	(0.557)[−0.043](0.132)[0.139]
0.0	−0.6	(0.040)[0.040](1.086)[1.087]	(0.998)[−0.002](0.090)[0.090]	(0.998)[−0.002](0.086)[0.086]	(−0.371)[0.229](0.468)[0.521]
0.0	−0.3	(−0.220)[−0.220](2.089)[2.100]	(0.994)[−0.006](0.103)[0.103]	(0.994)[−0.006](0.109)[0.109]	(−0.333)[−0.033](0.703)[0.704]
0.0	0.0	(−0.205)[−0.205](1.807)[1.819]	(0.996)[−0.004](0.101)[0.101]	(0.995)[−0.005](0.101)[0.101]	(−0.075)[−0.075](0.681)[0.685]
0.0	0.3	(−0.077)[−0.077](0.731)[0.735]	(0.996)[−0.004](0.085)[0.085]	(0.998)[−0.002](0.087)[0.087]	(0.298)[−0.002](0.328)[0.328]
0.0	0.6	(−0.153)[−0.153](0.253)[0.296]	(0.987)[−0.013](0.136)[0.137]	(0.989)[−0.011](0.135)[0.136]	(0.521)[−0.079](0.197)[0.213]
0.3	−0.6	(0.317)[0.017](0.140)[0.141]	(1.003)[0.003](0.084)[0.084]	(1.000)[0.000](0.082)[0.082]	(−0.430)[0.170](0.201)[0.263]
0.3	−0.3	(0.228)[−0.072](0.173)[0.188]	(1.003)[0.003](0.086)[0.086]	(0.998)[−0.002](0.083)[0.083]	(−0.323)[−0.023](0.272)[0.273]
0.3	0.0	(0.137)[−0.163](0.715)[0.734]	(0.998)[−0.002](0.086)[0.087]	(0.997)[−0.003](0.086)[0.086]	(−0.174)[−0.174](0.408)[0.444]
0.3	0.3	(0.199)[−0.101](0.211)[0.234]	(0.996)[−0.004](0.100)[0.100]	(0.996)[−0.004](0.100)[0.100]	(0.216)[−0.084](0.362)[0.372]
0.3	0.6	(0.245)[−0.055](0.194)[0.202]	(0.961)[−0.039](0.211)[0.214]	(0.958)[−0.042](0.209)[0.213]	(0.587)[−0.013](0.205)[0.205]
0.6	−0.6	(0.545)[−0.055](0.086)[0.102]	(0.998)[−0.002](0.082)[0.082]	(1.000)[−0.000](0.084)[0.084]	(−0.652)[−0.052](0.102)[0.115]
0.6	−0.3	(0.486)[−0.114](0.082)[0.141]	(0.998)[−0.002](0.082)[0.082]	(1.001)[0.001](0.081)[0.081]	(−0.583)[−0.283](0.103)[0.301]
0.6	0.0	(0.411)[−0.189](0.091)[0.209]	(1.000)[0.000](0.083)[0.083]	(0.997)[−0.003](0.081)[0.081]	(−0.490)[−0.490](0.124)[0.505]
0.6	0.3	(0.324)[−0.276](0.088)[0.290]	(1.007)[0.007](0.089)[0.090]	(1.003)[0.003](0.092)[0.092]	(−0.344)[−0.644](0.200)[0.674]
0.6	0.6	(0.288)[−0.312](0.159)[0.350]	(0.943)[−0.057](0.253)[0.259]	(0.941)[−0.059](0.253)[0.260]	(−0.070)[−0.670](0.387)[0.774]

**Table D4.** Simulation results for SARMA(1,1): $n = 1{,}000$ .
		λ	$β_{1}$	$β_{2}$	ρ
λ	ρ	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]	(Mean)[Bias](SE)[RMSE]
−0.6	−0.6	(−3.449)[−2.849](8.618)[9.077]	(0.907)[−0.093](0.283)[0.298]	(0.906)[−0.094](0.282)[0.297]	(−1.323)[−0.723](2.487)[2.590]
−0.6	−0.3	(−4.151)[−3.551](9.648)[10.280]	(0.877)[−0.123](0.311)[0.334]	(0.880)[−0.120](0.312)[0.335]	(−1.135)[−0.835](2.798)[2.920]
−0.6	0.0	(−1.675)[−1.075](6.110)[6.204]	(0.957)[−0.043](0.201)[0.205]	(0.958)[−0.042](0.199)[0.204]	(−0.148)[−0.148](1.666)[1.672]
−0.6	0.3	(−0.650)[−0.050](2.400)[2.401]	(0.991)[−0.009](0.092)[0.093]	(0.991)[−0.009](0.093)[0.093]	(0.352)[0.052](0.568)[0.570]
−0.6	0.6	(−0.682)[−0.082](0.095)[0.126]	(1.007)[0.007](0.059)[0.060]	(1.007)[0.007](0.057)[0.058]	(0.595)[−0.005](0.054)[0.055]
−0.3	−0.6	(−0.698)[−0.398](3.631)[3.653]	(0.983)[−0.017](0.128)[0.129]	(0.985)[−0.015](0.129)[0.129]	(−0.624)[−0.024](1.152)[1.153]
−0.3	−0.3	(−1.691)[−1.391](6.083)[6.240]	(0.952)[−0.048](0.204)[0.210]	(0.954)[−0.046](0.204)[0.209]	(−0.704)[−0.404](1.839)[1.883]
−0.3	0.0	(−0.829)[−0.529](4.086)[4.120]	(0.981)[−0.019](0.141)[0.143]	(0.982)[−0.018](0.142)[0.143]	(−0.103)[−0.103](1.241)[1.245]
−0.3	0.3	(−0.385)[−0.085](1.415)[1.418]	(1.000)[−0.000](0.073)[0.073]	(0.999)[−0.001](0.074)[0.074]	(0.300)[−0.000](0.335)[0.335]
−0.3	0.6	(−0.476)[−0.176](0.169)[0.244]	(1.005)[0.005](0.058)[0.058]	(1.007)[0.007](0.058)[0.058]	(0.524)[−0.076](0.105)[0.130]
0.0	−0.6	(0.090)[0.090](0.866)[0.870]	(0.998)[−0.002](0.064)[0.064]	(1.000)[−0.000](0.067)[0.067]	(−0.361)[0.239](0.373)[0.443]
0.0	−0.3	(−0.096)[−0.096](1.508)[1.511]	(0.996)[−0.004](0.083)[0.083]	(0.995)[−0.005](0.078)[0.078]	(−0.323)[−0.023](0.575)[0.575]
0.0	0.0	(−0.111)[−0.111](1.287)[1.291]	(0.997)[−0.003](0.071)[0.071]	(0.994)[−0.006](0.070)[0.070]	(−0.063)[−0.063](0.525)[0.529]
0.0	0.3	(−0.074)[−0.074](0.181)[0.195]	(0.997)[−0.003](0.060)[0.060]	(0.999)[−0.001](0.060)[0.060]	(0.260)[−0.040](0.169)[0.174]
0.0	−0.6	(0.068)[−0.232](1.429)[1.448]	(0.938)[−0.062](0.311)[0.317]	(0.951)[−0.049](0.307)[0.311]	(−0.384)[0.216](0.543)[0.585]
0.3	−0.6	(0.342)[0.042](0.090)[0.099]	(1.000)[0.000](0.060)[0.060]	(1.001)[0.001](0.059)[0.059]	(−0.429)[0.171](0.118)[0.208]
0.3	−0.3	(0.251)[−0.049](0.111)[0.121]	(1.000)[−0.000](0.063)[0.063]	(1.004)[0.004](0.060)[0.061]	(−0.338)[−0.038](0.152)[0.157]
0.3	0.0	(0.174)[−0.126](0.125)[0.178]	(1.001)[0.001](0.058)[0.058]	(0.998)[−0.002](0.059)[0.059]	(−0.188)[−0.188](0.229)[0.296]
0.3	0.3	(0.225)[−0.075](0.145)[0.163]	(0.999)[−0.001](0.061)[0.061]	(0.999)[−0.001](0.058)[0.058]	(0.223)[−0.077](0.271)[0.281]
0.3	0.6	(0.274)[−0.026](0.147)[0.149]	(0.996)[−0.004](0.097)[0.097]	(0.992)[−0.008](0.097)[0.097]	(0.609)[0.009](0.148)[0.148]
0.6	−0.6	(0.562)[−0.038](0.055)[0.067]	(1.002)[0.002](0.059)[0.059]	(1.000)[0.000](0.059)[0.059]	(−0.668)[−0.068](0.066)[0.095]
0.6	−0.3	(0.496)[−0.104](0.058)[0.119]	(1.002)[0.002](0.059)[0.059]	(1.000)[0.000](0.058)[0.058]	(−0.590)[−0.290](0.071)[0.299]
0.6	0.0	(0.417)[−0.183](0.058)[0.192]	(0.998)[−0.002](0.062)[0.062]	(1.000)[0.000](0.061)[0.061]	(−0.495)[−0.495](0.072)[0.501]
0.6	0.3	(0.324)[−0.276](0.061)[0.282]	(1.004)[0.004](0.060)[0.060]	(1.002)[0.002](0.060)[0.060]	(−0.372)[−0.672](0.114)[0.682]
0.6	0.6	(0.320)[−0.280](0.176)[0.331]	(0.977)[−0.023](0.175)[0.177]	(0.975)[−0.025](0.176)[0.177]	(0.007)[−0.593](0.416)[0.725]
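
The four statistics reported in each cell of Tables D2 to D4 are linked by a simple identity: the squared RMSE equals the squared bias plus the squared SE. The following is a minimal sketch of how such a cell could be computed from Monte Carlo draws; the function names `mc_summary` and `rmse_from_bias_se` are hypothetical, and the use of the population standard deviation (`ddof=0`) is an assumption, since the paper does not state the exact formula it uses.

```python
import math

import numpy as np


def mc_summary(estimates, true_value):
    """Summarize Monte Carlo estimates of one parameter (hypothetical helper).

    Assumes SE is the population standard deviation of the estimates
    (ddof=0); the paper does not specify this.
    """
    est = np.asarray(estimates, dtype=float)
    mean = est.mean()
    bias = mean - true_value                      # [Bias] = Mean - true value
    se = est.std(ddof=0)                          # (SE): spread of the estimates
    rmse = math.sqrt(np.mean((est - true_value) ** 2))
    return {"mean": mean, "bias": bias, "se": se, "rmse": rmse}


def rmse_from_bias_se(bias, se):
    """RMSE implied by the decomposition RMSE^2 = Bias^2 + SE^2."""
    return math.hypot(bias, se)


# Check against the last row of Table D4 (lambda column): bias -0.280, SE 0.176.
print(round(rmse_from_bias_se(-0.280, 0.176), 3))
```

Under this assumption, the reported bias and SE for that row reproduce the tabulated RMSE of 0.331 to three decimals, which is a quick way to spot-check individual table entries.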

E: Surface Plots of RMSEs for SARMA(1,1)

**Figure E1.** RMSEs of $β_{1}$ and $β_{2}$. (a) $β_{1}$, $n = 100$. (b) $β_{2}$, $n = 100$. (c) $β_{1}$, $n = 500$. (d) $β_{2}$, $n = 500$. (e) $β_{1}$, $n = 1{,}000$. (f) $β_{2}$, $n = 1{,}000$.

**Figure E2.** RMSEs of λ and ρ. (a) λ, $n = 100$. (b) ρ, $n = 100$. (c) λ, $n = 500$. (d) ρ, $n = 500$. (e) λ, $n = 1{,}000$. (f) ρ, $n = 1{,}000$.

Acknowledgements

I would like to thank two anonymous referees, Suleyman Taspinar, Wim Vijverberg and Gabriel Movsesyan for their helpful comments on earlier drafts of this paper.

Conflicts of Interest

The author declares no conflict of interest.

References

1. R.P. Haining. “The moving average model for spatial interaction.” Trans. Inst. Br. Geogr. 3 (1978): 202–225.
2. L. Anselin. Spatial Econometrics: Methods and Models. New York, NY, USA: Springer, 1988.
3. L.W. Hepple. Bayesian and Maximum Likelihood Estimation of the Linear Model with Spatial Moving Average Disturbances. Working Papers Series; Bristol, UK: School of Geographical Sciences, University of Bristol, 2003.
4. B. Fingleton. “A generalized method of moments estimator for a spatial model with moving average errors, with application to real estate prices.” Empir. Econ. 34 (2008): 35–57.
5. B. Fingleton. “A generalized method of moments estimator for a spatial panel model with an endogenous spatial lag and spatial moving average errors.” Spat. Econ. Anal. 3 (2008): 27–44.
6. H.H. Kelejian, and I.R. Prucha. “A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances.” J. Real Estate Financ. Econ. 17 (1998): 99–121.
7. H.H. Kelejian, and I.R. Prucha. “A generalized moments estimator for the autoregressive parameter in a spatial model.” Int. Econ. Rev. 40 (1999): 509–533.
8. D. Das, H.H. Kelejian, and I.R. Prucha. “Small sample properties of estimators of spatial autoregressive models with autoregressive disturbances.” Pap. Reg. Sci. 82 (2003): 1–26.
9. H.H. Kelejian, and I.R. Prucha. “Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances.” J. Econom. 157 (2010): 53–67.
10. X. Liu, L.F. Lee, and C.R. Bollinger. “An efficient GMM estimator of spatial autoregressive models.” J. Econom. 159 (2010): 303–319.
11. L.F. Lee. “Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models.” Econometrica 72 (2004): 1899–1925.
12. L.F. Lee. “GMM and 2SLS estimation of mixed regressive, spatial autoregressive models.” J. Econom. 137 (2007): 489–514.
13. L.F. Lee, and X. Liu. “Efficient GMM estimation of high order spatial autoregressive models with autoregressive disturbances.” Econom. Theory 26 (2010): 187–230.
14. J.P. LeSage. “Bayesian estimation of spatial autoregressive models.” Int. Reg. Sci. Rev. 20 (1997): 113–129.
15. J. LeSage, and R.K. Pace. Introduction to Spatial Econometrics (Statistics: A Series of Textbooks and Monographs). London, UK: Chapman and Hall/CRC, 2009.
16. L.W. Hepple. “Bayesian techniques in spatial and network econometrics: 1. Model comparison and posterior odds.” Environ. Plan. 27 (1995): 447–469.
17. I.R. Prucha. “Instrumental variables/method of moments estimation.” In Handbook of Regional Science. Edited by M.M. Fischer and P. Nijkamp. Berlin, Germany: Springer Berlin Heidelberg, 2014, pp. 1597–1617.
18. L.F. Lee. “The method of elimination and substitution in the GMM estimation of mixed regressive, spatial autoregressive models.” J. Econom. 140 (2007): 155–189.
19. B.H. Baltagi, and L. Liu. “An improved generalized moments estimator for a spatial moving average error model.” Econ. Lett. 113 (2011): 282–284.
20. M. Arnold, and D. Wied. “Improved GMM estimation of the spatial autoregressive error model.” Econ. Lett. 108 (2010): 65–68.
21. D.M. Drukker, P. Egger, and I.R. Prucha. “On two-step estimation of a spatial autoregressive model with autoregressive disturbances and endogenous regressors.” Econom. Rev. 32 (2013): 686–733.
22. X. Lin, and L.F. Lee. “GMM estimation of spatial autoregressive models with unknown heteroskedasticity.” J. Econom. 157 (2010): 34–52.
23. H.H. Kelejian, and D. Robinson. “A suggested method of estimation for spatial interdependent models with autocorrelated errors, and an application to a county expenditure model.” Pap. Reg. Sci. 72 (1993): 297–312.
24. J. Besag. “Spatial interaction and the statistical analysis of lattice systems.” J. R. Stat. Soc. Ser. B (Methodological) 36 (1974): 192–236.
25. S. Richardson, C. Guihenneuc, and V. Lasserre. “Spatial linear models with autocorrelated error structure.” J. R. Stat. Soc. Ser. D (The Statistician) 41 (1992): 539–557.
26. L.W. Hepple. “Bayesian techniques in spatial and network econometrics: 2. Computational methods and algorithms.” Environ. Plan. 27 (1995): 615–644.
27. B.D. Ripley. Spatial Statistics. Wiley Series in Probability and Statistics; Hoboken, NJ, USA: John Wiley & Sons, 2005.
28. R. Haining. “Trend-surface models with regional and local scales of variation with an application to aerial survey data.” Technometrics 29 (1987): 461–469.
29. K.V. Mardia, and R.J. Marshall. “Maximum likelihood estimation of models for residual covariance in spatial regression.” Biometrika 71 (1984): 135–146.
30. L.W. Hepple. “A maximum likelihood model for econometric estimation with spatial series.” In Theory and Practice in Regional Science. Edited by I. Masser. London Papers in Regional Science 6; London, UK: Pion Limited, 1976.
31. K.M. Abadir, and J.R. Magnus. Matrix Algebra. Econometric Exercises; New York, NY, USA: Cambridge University Press, 2005.
32. L.F. Lee. “Identification and estimation of econometric models with group interactions, contextual factors and fixed effects.” J. Econom. 140 (2007): 333–374.
33. L.F. Lee, X. Liu, and X. Lin. “Specification and estimation of social interaction models with network structures.” Econom. J. 13 (2010): 145–176.
34. H.H. Kelejian, and I.R. Prucha. Specification and Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances. College Park, MD, USA: Department of Economics, University of Maryland, 2007.
35. R.K. Pace, J.P. LeSage, and S. Zhu. Spatial Dependence in Regressors and Its Effect on Performance of Likelihood-Based and Instrumental Variable Estimators, 30th Anniversary ed. Advances in Econometrics; Bingley, UK: Emerald Group Publishing Limited, 2012, Volume 30, pp. 257–295.
36. O. Dogan, and S. Taspinar. GMM Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances. Working Papers 001; New York, NY, USA: City University of New York Graduate Center, Ph.D. Program in Economics, 2013.

¹Fingleton [4] and Baltagi and Liu [19] do not compare the finite sample efficiency of their estimators with the MLE.
²See Kelejian and Prucha [9].
³For a definition and some properties of uniform boundedness, see Kelejian and Prucha [9].
⁴There are some other formulations for the parameter spaces in the literature. For details, see Kelejian and Prucha [9] and LeSage and Pace [15]. Note that the parameter spaces for $β_{0}$ and $σ_{0}^{2}$ are not required to be compact. As shown in Equations (8a) and (8b), the MLE of these parameters is an OLS-type estimator; hence, boundedness is enough for the parameter spaces.
⁵For ease of comparison, I set $λ_{0} = 0.9$ for SAR, $ρ_{0} = - 0.9$ for SMA, $(λ_{0}, ρ_{0}) = (0.5, 0.9)$ for SARAR(1,1) and $(λ_{0}, ρ_{0}) = (0.5, - 0.9)$ for SARMA(1,1). The disturbance of the unit located at the center of the lattice is increased by three.
⁶ $d_{ij} = R_{0} \times \arccos \left( \cos (| \text{longitude}_{i} - \text{longitude}_{j} |) \cos (\text{latitude}_{i}) \cos (\text{latitude}_{j}) + \sin (\text{latitude}_{i}) \sin (\text{latitude}_{j}) \right)$, where $R_{0}$ is the Earth’s radius.
⁷For SARAR(1,1), the penalty function is $f (λ, ρ, W_{n}, M_{n}) = {| S_{n} (λ) |}^{\frac{2}{n}} {| R_{n} (ρ) |}^{\frac{2}{n}}$ .
⁸For these results, I use the derivative rule given by $\frac{\partial ln | R_{n} (ρ) |}{\partial ρ} = tr (R_{n}^{- 1} (ρ) \times \frac{\partial R_{n} (ρ)}{\partial ρ})$ . For a proof, see (Abadir and Magnus [31], p. 372). Also note the commutative property of $R_{n}^{- 1} (ρ) M_{n} = M_{n} R_{n}^{- 1} (ρ) = H_{n} (ρ)$ .
⁹Here, $Diag (B_{1}, \dots, B_{G})$ denotes the block diagonal matrix whose $i$th diagonal block is the $m_{i} \times m_{i}$ matrix $B_{i}$.
¹⁰Note that $\frac{1}{n} tr ({\bar{G}}_{n} - G_{n}) = 0$ .
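
The great-circle distance in footnote 6 can be evaluated directly from coordinates. The sketch below is only illustrative: it assumes coordinates are supplied in decimal degrees, uses a mean Earth radius of 6371 km for $R_{0}$ (the paper does not state a value), and the function name `great_circle_distance` is hypothetical. The argument of the arccosine is clamped to $[-1, 1]$ to guard against floating-point rounding.

```python
import math


def great_circle_distance(lat_i, lon_i, lat_j, lon_j, radius=6371.0):
    """Distance d_ij from footnote 6 (spherical law of cosines).

    Inputs are in decimal degrees (an assumption); `radius` plays the role
    of R_0 and defaults to an assumed mean Earth radius in kilometres.
    """
    phi_i = math.radians(lat_i)
    phi_j = math.radians(lat_j)
    dlon = math.radians(abs(lon_i - lon_j))
    c = (math.cos(dlon) * math.cos(phi_i) * math.cos(phi_j)
         + math.sin(phi_i) * math.sin(phi_j))
    c = max(-1.0, min(1.0, c))  # clamp rounding error before arccos
    return radius * math.acos(c)


# Two points on the equator, 90 degrees of longitude apart:
# the result is one quarter of the circumference, radius * pi / 2.
quarter = great_circle_distance(0.0, 0.0, 0.0, 90.0)
```

Distances computed this way could then be used to build a distance-based weights matrix; the specific weighting scheme is described in the simulation design and is not reproduced here.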

© 2015 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( http://creativecommons.org/licenses/by/4.0/).
