*Article* **A Parameterization of Models for Unit Root Processes: Structure Theory and Hypothesis Testing**

**Dietmar Bauer 1,\*, Lukas Matuschek 1,2, Patrick de Matos Ribeiro 1,2 and Martin Wagner 3,4,5**


Received: 18 April 2018; Accepted: 4 November 2020; Published: 10 November 2020

**Abstract:** We develop and discuss a parameterization of vector autoregressive moving average processes with arbitrary unit roots and (co)integration orders. The detailed analysis of the topological properties of the parameterization—based on the state space canonical form of Bauer and Wagner (2012)—is an essential input for establishing statistical and numerical properties of pseudo maximum likelihood estimators as well as, e.g., pseudo likelihood ratio tests based on them. The general results are exemplified in detail for the empirically most relevant cases, the (multiple frequency or seasonal) I(1) and the I(2) case. For these two cases we also discuss the modeling of deterministic components in detail.

**Keywords:** canonical form; cointegration; hypothesis testing; parameterization; state space representation; unit roots

#### **1. Introduction**

Since the seminal contribution of Clive W.J. Granger (1981) that introduced the concept of cointegration, the modeling of multivariate (economic) time series with models and methods that allow for unit roots and cointegration has become standard econometric practice with applications ranging from macroeconomics to finance to climate science.

The most prominent (parametric) model class for cointegration analysis are vector autoregressive (VAR) models, popularized by the important contributions of Søren Johansen and Katarina Juselius and their co-authors, see, e.g., the monographs Johansen (1995) and Juselius (2006). The popularity of VAR cointegration analysis stems not only from the (relative) simplicity of the model class, but also from the fact that the VAR cointegration literature is very well-developed and provides a large battery of tools for diagnostic testing, impulse response analysis, forecast error variance decompositions and the like. All this makes VAR cointegration analysis to a certain extent the benchmark in the literature.<sup>1</sup>

The imposition of specific cointegration properties on an estimated VAR model becomes increasingly complicated as one moves away from the I(1) case. As discussed in Section 2, e.g., in the I(2) case a triple of indices needs to be chosen (fixed or determined via testing) to describe the cointegration properties. The imposition of cointegration properties in the estimation algorithm then leads to "switching" type algorithms that come together with non-trivial parameterization restrictions involving non-linear inter-relations, compare Paruolo (1996) or Paruolo (2000).<sup>2</sup> Mathematically, these complications arise from the fact that in the VAR setting the unit root and cointegration properties are related to rank restrictions on the autoregressive polynomial matrix and its derivatives.

<sup>1</sup> Please note that the original contribution to the estimation of cointegrating relationships has been least squares estimation in a non- or semi-parametric regression setting, see, e.g., Engle and Granger (1987). A recent survey of regression-based cointegration analysis is provided by Wagner (2018).

Restricting cointegration analysis to VAR processes may, however, be unduly limiting. First, it is well-known since Zellner and Palm (1974) that VAR processes are not invariant with respect to marginalization, i.e., subsets of the variables of a VAR process are in general vector autoregressive moving average (VARMA) processes. Second, similar to the first argument, aggregation of VAR processes also leads to VARMA processes, an issue relevant, e.g., in the context of temporal aggregation and in mixed-frequency settings. Third, the linearized solutions to dynamic stochastic general equilibrium (DSGE) models are typically VARMA rather than VAR processes, see, e.g., Campbell (1994). Fourth, a VARMA model may be a more parsimonious description of the data generating process (DGP) than a VAR model, with parsimony becoming more important with increasing dimension of the process.<sup>3</sup>

If one accepts the above arguments as a motivation for considering VARMA processes in cointegration analysis, it is convenient to move to the—essentially equivalent (see Hannan and Deistler 1988, chps. 1 and 2)—state space framework. A key challenge when moving from VAR to VARMA models—or state space models—is that *identification* becomes an important issue for the latter model class, whereas unrestricted VAR models are (reduced-form) identified. In other words, there are so-called equivalence classes of VARMA models that lead to the same dynamic behavior of the observed process. As is well-known, to achieve identification, restrictions have to be placed on the coefficient matrices in the VARMA case, e.g., zero or exclusion restrictions. A mapping attaching to every transfer function, i.e., the function relating the error sequence to the observed process, a unique VARMA (or state space) system from the corresponding class of observationally equivalent systems is called *canonical form*. Since not all entries of the coefficient matrices in canonical form are free parameters, for statistical analysis a so-called *parameterization* is required that maps the free parameters of the coefficient matrices in canonical form into a parameter vector. These issues, including the importance of properties such as continuity and differentiability of parameterizations, are discussed in detail in Hannan and Deistler (1988, chp. 2) and, of course, are also relevant for our setting in this paper.

The convenience of the state space framework for unit root and cointegration analysis stems from the fact that (static and dynamic) cointegration can be characterized by orthogonality constraints, see Bauer and Wagner (2012), once an appropriate basis for the state vector, which is a (potentially singular) VAR process of order one, is chosen. The integration properties are governed by the eigenvalue structure of unit modulus eigenvalues of the system matrix in the state equation. Eigenvalues of unit modulus and orthogonality constraints arguably are easier restrictions to deal with or to implement than the interrelated rank restrictions considered in the VAR or VARMA setting. The canonical form of Bauer and Wagner (2012) is designed for cointegration analysis by using a basis of the state vector that puts the unit root and cointegration properties to the center and forefront. Consequently, these results are key input for the present paper and are thus briefly reviewed in Section 3.

<sup>2</sup> The complexity of these inter-relations is probably well illustrated by the fact that only Jensen (2013) notes that "even though the I(2) models are formulated as submodels of I(1) models, some I(1) models are in fact submodels of I(2) models".

<sup>3</sup> The literature often uses VAR models as approximations, based on the fact that VARMA processes often can be approximated by VAR models with the order tending to infinity with the sample size at certain rates. This line of work goes back to Lewis and Reinsel (1985) for stationary processes and was extended to (co)integrated processes by Saikkonen (1992), Saikkonen and Luukkonen (1997) and Bauer and Wagner (2005). In addition to the issue of the existence and properties of a sequence of VAR approximations, the question whether a VAR approximation is parsimonious remains.

An important problem with respect to appropriately defining the "free parameters" in VARMA models is the fact that no continuous parameterization of all VARMA or state space models of a certain order *n* exists in the multivariate case (see Hazewinkel and Kalman 1976). This implies that the model set, *M<sub>n</sub>* say, has to be partitioned into subsets on which continuous parameterizations exist, i.e., *M<sub>n</sub>* = ∪<sub>Γ∈*G*</sub> *M*<sub>Γ</sub> for some multi-index Γ varying in an index set *G*. Based on the canonical form of Bauer and Wagner (2012), the partitioning is according to systems—in addition to other restrictions such as fixed order *n*—with fixed unit root properties, to be precise over systems with given state space unit root structure. This has the advantage that, e.g., pseudo maximum likelihood (PML) estimation can straightforwardly be performed over systems with fixed unit root properties without any further ado, i.e., without having to consider (or ignore) rank restrictions on polynomial matrices. The definition and detailed discussion of the properties of this parameterization is the first main result of the paper.

The second main set of results, provided in Section 4, is a detailed discussion of the relationships between the different subsets of models *M*<sub>Γ</sub> for different indices Γ and the parameterization of the respective model sets. Knowledge concerning these relations is important to understand the asymptotic behavior of PML estimators and pseudo likelihood ratio tests based on them. In particular, the structure of the closure *M̄* of the considered model set *M* has to be understood, since the difference *M̄* \ *M* cannot be avoided when maximizing the pseudo likelihood function.<sup>4</sup> Additionally, the inclusion properties between different sets *M*<sub>Γ</sub> need to be understood, as this knowledge is important for developing hypothesis tests, in particular tests for the dimensions of cointegrating spaces. Hypothesis testing, with a focus on the MFI(1) and I(2) cases, is discussed in Section 5, which shows how the parameterization results of the paper can be used to formulate a large number of hypotheses on (static and polynomial) cointegrating relationships as considered in the VAR cointegration literature. This discussion also includes commonly used deterministic components such as intercept, seasonal dummies, and linear trend, as well as restrictions on these components.

The paper is organized as follows: Section 2 briefly reviews VAR and VARMA models with unit roots and cointegration and discusses some of the complications arising in the VARMA case in addition to the complications arising due to the presence of unit roots and cointegration already in the VAR case. Section 3 presents the canonical form and the parameterization based on it, with the discussion starting with the multiple frequency I(1)—MFI(1)—and I(2) cases prior to a discussion of the general case. This section also provides several important definitions like, e.g., of the state space unit root structure. Section 4 contains a detailed discussion concerning the topological structure of the model sets and Section 5 discusses testing of a large number of hypotheses on the cointegrating spaces commonly tested in the cointegration literature. The discussion in Section 5 focuses on the empirically most relevant MFI(1) and I(2) cases and includes the usual deterministic components considered in the literature. Section 6 briefly summarizes and concludes the paper. All proofs are relegated to the Appendices A and B.

Throughout we use the following notation: *L* denotes the lag operator, i.e., *L*({*x<sub>t</sub>*}<sub>*t*∈Z</sub>) := {*x*<sub>*t*−1</sub>}<sub>*t*∈Z</sub>, for brevity written as *Lx<sub>t</sub>* = *x*<sub>*t*−1</sub>. For a matrix *γ* ∈ C<sup>*s*×*r*</sup>, *γ*<sup>∗</sup> ∈ C<sup>*r*×*s*</sup> denotes its conjugate transpose. For *γ* ∈ C<sup>*s*×*r*</sup> with full column rank *r* < *s*, we define *γ*<sub>⊥</sub> ∈ C<sup>*s*×(*s*−*r*)</sup> of full column rank such that *γ*<sup>∗</sup>*γ*<sub>⊥</sub> = 0. *I<sub>p</sub>* denotes the *p*-dimensional identity matrix, 0<sub>*m*×*n*</sub> the *m* × *n* zero matrix. For two matrices *A* ∈ C<sup>*m*×*n*</sup>, *B* ∈ C<sup>*k*×*l*</sup>, *A* ⊗ *B* ∈ C<sup>*mk*×*nl*</sup> denotes the Kronecker product of *A* and *B*. For a complex valued quantity *x*, R(*x*) denotes its real part, I(*x*) its imaginary part and *x̄* its complex conjugate. For a set *V*, *V̄* denotes its closure.<sup>5</sup> For two sets *V* and *W*, *V* \ *W* denotes the difference of *V* and *W*, i.e., {*v* ∈ *V* : *v* ∉ *W*}. For a square matrix *A* we denote the spectral radius (i.e., the maximum of the moduli of its eigenvalues) by *λ*<sub>|max|</sub>(*A*) and by det(*A*) its determinant.

<sup>4</sup> Below we often use the term "likelihood" as short form of "likelihood function".

<sup>5</sup> We are confident that this dual usage of notation does not lead to confusion.

#### **2. Vector Autoregressive and Vector Autoregressive Moving Average Processes and Parameterizations**

In this paper, we define VAR processes {*y<sub>t</sub>*}<sub>*t*∈Z</sub>, *y<sub>t</sub>* ∈ R<sup>*s*</sup>, as solutions of

$$a(L)y\_t = y\_t + \sum\_{j=1}^p a\_j y\_{t-j} = \varepsilon\_t + \Phi d\_t, \tag{1}$$

with *a*(*L*) := *I<sub>s</sub>* + ∑<sup>*p*</sup><sub>*j*=1</sub> *a<sub>j</sub>L<sup>j</sup>*, where *a<sub>j</sub>* ∈ R<sup>*s*×*s*</sup> for *j* = 1, ... , *p*, Φ ∈ R<sup>*s*×*m*</sup>, *a<sub>p</sub>* ≠ 0, a white noise process {*ε<sub>t</sub>*}<sub>*t*∈Z</sub>, *ε<sub>t</sub>* ∈ R<sup>*s*</sup>, with Σ := E(*ε<sub>t</sub>ε<sub>t</sub>*′) > 0 and a vector sequence {*d<sub>t</sub>*}<sub>*t*∈Z</sub>, *d<sub>t</sub>* ∈ R<sup>*m*</sup>, comprising deterministic components like, e.g., the intercept, seasonal dummies or a linear trend. Furthermore, we impose the *non-explosiveness* condition det *a*(*z*) ≠ 0 for all |*z*| < 1, with *a*(*z*) := *I<sub>s</sub>* + ∑<sup>*p*</sup><sub>*j*=1</sub> *a<sub>j</sub>z<sup>j</sup>* and *z* denoting a complex variable.<sup>6</sup>

Thus, for *given* autoregressive order *p*, with—as defining characteristic of the order—*a<sub>p</sub>* ≠ 0, the considered class of VAR models with *specified* deterministic components {*d<sub>t</sub>*}<sub>*t*∈Z</sub> is given by the set of all polynomial matrices *a*(*z*) such that (i) the non-explosiveness condition holds, (ii) *a*(0) = *I<sub>s</sub>* and (iii) *a<sub>p</sub>* ≠ 0; together with the set of all matrices Φ ∈ R<sup>*s*×*m*</sup>.
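The non-explosiveness condition can be checked numerically via the companion matrix of *a*(*z*), whose eigenvalues are the reciprocals of the non-zero roots of det *a*(*z*); the following is a minimal numpy sketch (all function names are ours, purely illustrative):

```python
import numpy as np

def companion(a_list):
    """Companion matrix of a(z) = I_s + a_1 z + ... + a_p z^p."""
    s = a_list[0].shape[0]
    p = len(a_list)
    A = np.zeros((s * p, s * p))
    A[:s, :] = np.hstack([-a for a in a_list])  # first block row: -a_1, ..., -a_p
    if p > 1:
        A[s:, :-s] = np.eye(s * (p - 1))        # shifted identity below
    return A

def non_explosive(a_list, tol=1e-8):
    """det a(z) != 0 for |z| < 1: all companion eigenvalues have modulus <= 1."""
    return bool(np.all(np.abs(np.linalg.eigvals(companion(a_list))) <= 1 + tol))

def stable(a_list, tol=1e-8):
    """det a(z) != 0 for |z| <= 1: all companion eigenvalues have modulus < 1."""
    return bool(np.all(np.abs(np.linalg.eigvals(companion(a_list))) < 1 - tol))
```

For instance, a bivariate random walk (*a*<sub>1</sub> = −*I*<sub>2</sub>) is non-explosive but not stable, reflecting its unit roots.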

Equivalently, the model class can be characterized by a set of rational matrix functions *k*(*z*) := *a*(*z*)−1, referred to as *transfer functions*, and the input-output description for the deterministic variables, i.e.,

$$\begin{aligned} V\_{p,\Phi} &:= \quad V\_p \times \mathbb{R}^{s \times m}, \\ V\_p &:= \quad \left\{ k(z) = \sum\_{j=0}^{\infty} k\_j z^j = a(z)^{-1} : a(z) = I\_s + \sum\_{j=1}^p a\_j z^j, \det a(z) \neq 0 \text{ for } |z| < 1, a\_p \neq 0 \right\}. \end{aligned}$$

The associated parameter space is Θ<sub>*p*,Φ</sub> := Θ<sub>*p*</sub> × R<sup>*sm*</sup> ⊂ R<sup>*s*²*p*+*sm*</sup>, where the parameters

*θ* := [*θ*<sup>*a*′</sup>, *θ*<sup>Φ′</sup>]′ = [vec(*a*<sub>1</sub>)′, . . . , vec(*a<sub>p</sub>*)′, vec(Φ)′]′ (2)

are obtained from stacking the entries of the matrices *a<sub>j</sub>* and Φ, respectively.
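The stacking in (2) can be sketched as follows (hypothetical helper functions, with `vec` denoting column-wise vectorization); the resulting vector has the dimension *s*²*p* + *sm* of Θ<sub>*p*,Φ</sub>:

```python
import numpy as np

def vec(M):
    """Column-wise vectorization vec(M)."""
    return M.flatten(order="F")

def stack_theta(a_list, Phi):
    """theta = [vec(a_1)', ..., vec(a_p)', vec(Phi)']' as in (2)."""
    return np.concatenate([vec(a) for a in a_list] + [vec(Phi)])

def unstack_theta(theta, s, p, m):
    """Inverse map: recover (a_1, ..., a_p) and Phi from theta."""
    a_list = [theta[j * s * s:(j + 1) * s * s].reshape(s, s, order="F")
              for j in range(p)]
    Phi = theta[p * s * s:].reshape(s, m, order="F")
    return a_list, Phi
```

The round trip `unstack_theta(stack_theta(...))` recovers the coefficient matrices exactly, illustrating that this parameterization is a bijection onto R<sup>*s*²*p*+*sm*</sup>.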

**Remark 1.** *In the above discussion the parameters, θ*<sub>Σ</sub> *say, describing the variance covariance matrix* Σ *of ε<sub>t</sub> are not considered. These can easily be included, similarly to* Φ*, by, e.g., parameterizing positive definite symmetric s* × *s matrices via their lower triangular Cholesky factor. This leads to a parameter space* Θ<sub>*p*,Φ,Σ</sub> ⊂ R<sup>*s*²*p*+*sm*+*s*(*s*+1)/2</sup>*. We omit θ*<sub>Σ</sub> *for brevity, since typically no cross-parameter restrictions involving parameters corresponding to* Σ *are considered, whereas, as discussed in Section 5, parameter restrictions involving—in this paper in the state space rather than the VAR setting—both elements of* Θ<sub>*p*</sub> *and* Φ*, to, e.g., impose the absence of a linear trend in the cointegrating space, are commonly considered in the cointegration literature.*<sup>7</sup> *The estimator of the variance covariance matrix* Σ *often equals the sample variance of suitable residuals ε̂<sub>t</sub>*(*θ*) *from* (1)*, if there are no cross-restrictions between θ and θ*<sub>Σ</sub>*. This holds, e.g., for the Gaussian pseudo maximum likelihood estimator. Thus, explicitly including θ*<sub>Σ</sub> *and* Θ<sub>Σ</sub> *in the discussion would only overload notation without adding additional insights, given the simple nature of the parameterization of* Σ*.*

<sup>6</sup> Our definition of VAR processes differs to a certain extent from some widely used definitions in the literature. Given our focus on unit root and cointegration analysis we, unlike Hannan and Deistler (1988), allow for determinantal roots at the unit circle that, as is well known, lead to integrated processes. We also include deterministic components in our definition, i.e., we allow for a special case of exogenous variables, compare also Remark 2 below. There is, however, also a large part of the literature that refers to this setting simply as (cointegrated) vector autoregressive models, see, e.g., Johansen (1995) and Juselius (2006).

<sup>7</sup> Of course, the statistical properties of the parameter estimators depend in many ways on the deterministic components.

**Remark 2.** *Our consideration of deterministic components is a special case of including exogenous variables. We include exogenous deterministic variables with a static input-output behavior governed solely by the matrix* Φ*. More general exogenous variables that are dynamically related to the output* {*yt*}*t*∈<sup>Z</sup> *could be considered, thereby considering so-called VARX models rather than VAR models, which would necessitate considering in addition to the transfer function k*(*z*) *also a transfer function l*(*z*)*, say, linking the exogenous variables dynamically to the output.*

For the VAR case, the fact that the mapping assigning to a given transfer function *k*(*z*) ∈ *V<sub>p</sub>* a parameter vector *θ<sup>a</sup>* ∈ Θ<sub>*p*</sub>—the parameterization—is continuous with continuously differentiable inverse is immediate.<sup>8</sup> Homeomorphicity of a parameterization is important for the properties of parameter estimators, e.g., the ordinary least squares (OLS) or Gaussian PML estimator, compare the discussion in Hannan and Deistler (1988, Theorem 2.5.3 and Remark 1, p. 65).

For OLS estimation one typically considers the larger set *V*<sup>OLS</sup><sub>*p*</sub> *without* the non-explosiveness condition and *without* the assumption *a<sub>p</sub>* ≠ 0:

$$V\_p^{OLS} \quad := \quad \left\{ k(z) = \sum\_{j=0}^{\infty} k\_j z^j = a(z)^{-1} : a(z) = I\_s + \sum\_{j=1}^p a\_j z^j \right\}.$$

Considering *V*<sup>OLS</sup><sub>*p*</sub> allows for unconstrained optimization. It is well-known that for {*ε<sub>t</sub>*}<sub>*t*∈Z</sub> as given above, the OLS estimator is consistent over the larger set *V*<sup>OLS</sup><sub>*p*</sub>, i.e., without imposing non-explosiveness and also when specifying *p* too high. Alternatively, and closely related to OLS in the VAR case, the pseudo likelihood can be maximized over Θ<sub>*p*,Φ</sub>. With this approach, maxima or suprema can occur at the boundary of the parameter space, i.e., maximization effectively has to consider the closure Θ̄<sub>*p*,Φ</sub>. It is well-known that the PML estimator is consistent in the stable case (cf. Hannan and Deistler 1988, Theorem 4.2.1), but the maximization problem is complicated by the restrictions on the parameter space stemming from the non-explosiveness condition. Avoiding these complications, together with the asymptotic equivalence of OLS and PML in the stable VAR case, explains why VAR models are usually estimated by OLS.<sup>9</sup>

To be more explicit, ignore deterministic components for a moment and consider the case where the DGP is a stationary VAR process, i.e., a solution of (1) with *a*(*z*) satisfying the *stability* condition det *a*(*z*) ≠ 0 for |*z*| ≤ 1. Define the corresponding set of *stable* transfer functions by *V*<sub>*p*,•</sub>:

$$V\_{p, \bullet} \quad := \left\{ a(z)^{-1} \in V\_p : \det a(z) \neq 0 \text{ for } |z| \le 1, a\_p \neq 0 \right\}.$$

Clearly, *V*<sub>*p*,•</sub> is an open subset of *V<sub>p</sub>*. If the DGP is a stationary VAR process, the above-mentioned consistency result for the OLS estimator over *V*<sup>OLS</sup><sub>*p*</sub> implies that the probability that the estimated transfer function, *k̂*(*z*) = *â*(*z*)<sup>−1</sup> say, is contained in *V*<sub>*p*,•</sub> converges to one as the sample size tends to infinity. Moreover, the asymptotic distribution of the estimated parameters is normal, under appropriate assumptions on {*ε<sub>t</sub>*}<sub>*t*∈Z</sub>.

The situation is a bit more involved if the transfer function of the DGP corresponds to a point in the set *V̄*<sub>*p*,•</sub> \ *V*<sub>*p*,•</sub>, which contains systems with *unit roots*, i.e., determinantal roots of *a*(*z*) on the unit circle, as well as lower order autoregressive systems—with these two cases non-disjoint. The stable lower order case is relatively unproblematic from a statistical perspective. If, e.g., OLS estimation is performed over *V*<sup>OLS</sup><sub>*p*</sub>, while the true model corresponds to an element of *V*<sub>*p*<sup>∗</sup>,•</sub>, with *p*<sup>∗</sup> < *p*, the OLS estimator is still consistent, since *V*<sub>*p*<sup>∗</sup>,•</sub> ⊂ *V*<sup>OLS</sup><sub>*p*</sub>. Furthermore, standard chi-squared pseudo likelihood ratio test based inference still applies. The integrated case, for a precise definition see the discussion below Definition 1, is a bit more difficult to deal with, as in this case not all parameters are asymptotically normally distributed and nuisance parameters may be present. Consequently, parameterizations that do not take the specific nature of unit root processes into account are not very useful for inference in the unit root case, see, e.g., Sims et al. (1990, Theorem 1). Studying the unit root and cointegration properties is facilitated by resorting to suitable parameterizations that "zoom in on the relevant characteristics".

<sup>8</sup> The set *V<sub>p</sub>* is endowed with the *pointwise topology T<sub>pt</sub>*, defined in Section 3. For now, in the context of VAR models, it suffices to know that convergence in pointwise topology is equivalent to convergence of the VAR coefficient matrices *a*<sub>1</sub>, ... , *a<sub>p</sub>* in the Frobenius norm.

<sup>9</sup> Please note that in case of restricted estimation, i.e., zero restrictions or cross-equation restrictions, OLS is not asymptotically equivalent to PML in general.

In case the only determinantal root of *a*(*z*) on the unit circle is at *z* = 1, the system corresponds to a so-called *I*(*d*) process, with the integration order *d* > 0 made precise in Definition 1 below. Consider first the I(1) case: As is well-known, the rank of the matrix *a*(1) equals the dimension of the cointegrating space given in Definition 3 below—also referred to as the cointegrating rank. Therefore, determination of the rank of this matrix is of key importance. With the parameterization used so far, imposing a certain (maximal) rank on *a*(1) implies complicated restrictions on the matrices *a<sub>j</sub>*, *j* = 1, ... , *p*. This in turn renders the correspondingly restricted optimization unnecessarily complicated and not conducive to developing tests for the cointegrating rank. It is more convenient to consider the so-called *vector error correction model* (VECM) representation of autoregressive processes, discussed in full detail in the monograph Johansen (1995). To this end let us first introduce the differencing operator at frequency 0 ≤ *ω* ≤ *π*:

$$\Delta\_{\omega} \quad := \begin{cases} I\_s - 2 \cos(\omega) L + L^2 & \text{for } 0 < \omega < \pi \\\ I\_s - \cos(\omega) L & \text{for } \omega \in \{0, \pi\} \end{cases} \tag{3}$$
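A direct implementation of the operator (3), dropping initial values, can be sketched as follows (the function name is ours, purely illustrative):

```python
import numpy as np

def delta_omega(y, omega):
    """Apply the differencing operator (3) at frequency omega to a series y (T x s).

    Delta_omega = 1 - 2 cos(omega) L + L^2 for 0 < omega < pi and
    Delta_omega = 1 - cos(omega) L for omega in {0, pi}; initial
    observations are dropped, so the output is shorter than the input.
    """
    y = np.asarray(y, dtype=float)
    if 0.0 < omega < np.pi:
        return y[2:] - 2.0 * np.cos(omega) * y[1:-1] + y[:-2]
    return y[1:] - np.cos(omega) * y[:-1]
```

Note that Δ<sub>0</sub> is the ordinary first difference 1 − *L* and Δ<sub>*π*</sub> = 1 + *L*, while for 0 < *ω* < *π* the operator removes a pair of complex conjugate unit roots at e<sup>±i*ω*</sup>.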

For notational brevity, we omit the dependence on *L* in Δ<sub>*ω*</sub>(*L*), henceforth denoted as Δ<sub>*ω*</sub>. Using this notation, the I(1) error correction representation is given by

$$\Delta\_0 y\_t = -\Pi y\_{t-1} + \sum\_{j=1}^{p-1} \Gamma\_j \Delta\_0 y\_{t-j} + \varepsilon\_t + \Phi d\_t \tag{4}$$

$$= -\alpha\beta' y\_{t-1} + \sum\_{j=1}^{p-1} \Gamma\_j \Delta\_0 y\_{t-j} + \varepsilon\_t + \Phi d\_t,$$

with the matrix Π := *a*(1) = *I<sub>s</sub>* + ∑<sup>*p*</sup><sub>*j*=1</sub> *a<sub>j</sub>* of rank 0 ≤ *r* ≤ *s* factorized as Π = *αβ*′ into the product of two full column rank matrices *α*, *β* ∈ R<sup>*s*×*r*</sup> and Γ<sub>*j*</sub> := ∑<sup>*p*</sup><sub>*m*=*j*+1</sub> *a<sub>m</sub>*, *j* = 1, . . . , *p* − 1.

This constitutes a reparameterization, where *k*(*z*) ∈ *V<sub>p</sub>* is now represented by the matrices (*α*, *β*, Γ<sub>1</sub>, ... , Γ<sub>*p*−1</sub>) and a corresponding parameter vector *θ*<sup>VECM</sup><sub>*a*</sub> ∈ Θ<sup>VECM</sup><sub>*p*,*r*</sub>. Please note that stacking the entries of the matrices does not lead to a homeomorphic mapping from *V<sub>p</sub>* to Θ<sup>VECM</sup><sub>*p*,*s*</sub>, since for 0 < *r* ≤ *s* the matrices *α* and *β* are not identifiable from the product *αβ*′, since *αβ*′ = *αMM*<sup>−1</sup>*β*′ = *α̃β̃*′ for all regular matrices *M* ∈ R<sup>*r*×*r*</sup>. One way to obtain identifiability is to introduce the restriction *β* = [*I<sub>r</sub>*, *β*′<sub>∗</sub>]′, with *β*<sub>∗</sub> ∈ R<sup>(*s*−*r*)×*r*</sup> and *α* ∈ R<sup>*s*×*r*</sup>. With this additional restriction the parameter vector *θ*<sup>VECM</sup><sub>*a*</sub> is given by stacking the vectorized matrices *α*, *β*<sub>∗</sub>, Γ<sub>1</sub>, ... , Γ<sub>*p*−1</sub>, similarly to (2). Then Θ<sup>VECM</sup><sub>*p*,*r*,Φ</sub> = Θ<sup>VECM</sup><sub>*p*,*r*</sub> × R<sup>*sm*</sup> ⊂ R<sup>*ps*²−(*s*−*r*)²+*sm*</sup>. Note for completeness that the normalization *β* = [*I<sub>r</sub>*, *β*′<sub>∗</sub>]′ may necessitate a re-ordering of the variables in {*y<sub>t</sub>*}<sub>*t*∈Z</sub> since—without potential reordering—this parameterization implies a restriction of generality as, e.g., processes where the first variable is integrated, but does not cointegrate with the other variables, cannot be represented.
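The error correction form (4) rests on the polynomial identity *a*(*z*) = *a*(1)*z* + (1 − *z*)(*I<sub>s</sub>* − ∑<sup>*p*−1</sup><sub>*j*=1</sub> Γ<sub>*j*</sub>*z<sup>j</sup>*) with Γ<sub>*j*</sub> as defined above, which can be verified numerically for arbitrary coefficients (an illustrative numpy sketch, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
s, p = 3, 4
a = [rng.standard_normal((s, s)) for _ in range(p)]                 # a_1, ..., a_p
a_one = np.eye(s) + sum(a)                                          # a(1)
Gamma = [sum(a[m] for m in range(j + 1, p)) for j in range(p - 1)]  # Gamma_j

def poly_eval(coeffs, z):
    """Evaluate the monic matrix polynomial I + sum_j coeffs[j-1] z^j at scalar z."""
    n = coeffs[0].shape[0]
    return np.eye(n) + sum(c * z ** (j + 1) for j, c in enumerate(coeffs))

# a(z) = a(1) z + (1 - z) (I_s - sum_{j=1}^{p-1} Gamma_j z^j): check at sample points.
for z in [0.3, -0.7, 1.9]:
    rhs = a_one * z + (1 - z) * (
        np.eye(s) - sum(G * z ** (j + 1) for j, G in enumerate(Gamma)))
    assert np.allclose(poly_eval(a, z), rhs)
```

Since both sides are matrix polynomials of degree *p*, agreement at sufficiently many points establishes the identity coefficient by coefficient.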

Define the following sets of transfer functions:

$$\begin{aligned} V\_{p,r} &:= \left\{ a(z)^{-1} \in V\_p : \det a(z) \neq 0 \text{ for } \{z : |z| = 1, z \neq 1\}, \text{ rank}(a(1)) \le r \right\}, \\ V\_{p,r}^{RRR} &:= \left\{ a(z)^{-1} \in V\_p^{OLS} : \text{rank}(a(1)) \le r \right\}. \end{aligned}$$

*Econometrics* **2020**, *8*, 42

The dimension of the parameter vector *θ*<sup>VECM</sup><sub>*a*</sub> depends on the dimension of the cointegrating space, thus the parameterization of *k*(*z*) ∈ *V*<sub>*p*,*r*</sub> depends on *r*. The so-called reduced rank regression (RRR) estimator, given by the maximizer of the pseudo likelihood over *V*<sup>RRR</sup><sub>*p*,*r*</sub>, is consistent, see, e.g., Johansen (1995, chp. 6). The RRR estimator uses an "implicit" normalization of *β* and thereby implicitly addresses the mentioned identification problem. However, for testing hypotheses involving the free parameters in *α* or *β*, typically the identifying assumption given above is used, as discussed in Johansen (1995, chp. 7).

Furthermore, since *V*<sub>*p*,*r*</sub> ⊂ *V*<sub>*p*,*r*<sup>∗</sup></sub> for *r* < *r*<sup>∗</sup> ≤ *s*, with Θ<sup>VECM</sup><sub>*p*,*r*</sub> a lower dimensional subset of Θ<sup>VECM</sup><sub>*p*,*r*<sup>∗</sup></sub>, pseudo likelihood ratio testing can be used to sequentially test for the rank *r*, starting with the hypothesis of rank *r* = 0 against the alternative of rank 0 < *r* ≤ *s*, and increasing the assumed rank consecutively until the null hypothesis is not rejected.
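The sequential procedure can be sketched generically; the likelihood ratio statistic and the critical values below are placeholders to be supplied by the user, nothing here is from the paper:

```python
def select_rank(lr_stat, crit, s):
    """Sequentially test H(r): rank <= r for r = 0, 1, ... and return the
    first r at which the null is not rejected (s if all nulls are rejected).

    lr_stat(r): pseudo likelihood ratio statistic for the hypothesis pair at rank r.
    crit(r):    corresponding critical value (both user supplied callables).
    """
    for r in range(s):
        if lr_stat(r) <= crit(r):
            return r
    return s
```

With statistics decreasing in *r*, the loop stops at the smallest rank compatible with the data, mirroring the consecutive testing described above.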

Ensuring that {*y<sub>t</sub>*}<sub>*t*∈Z</sub> generated from (4) is indeed an I(1) process requires on the one hand that Π is of reduced rank, i.e., *r* < *s*, and on the other hand that the matrix

$$\alpha'\_{\perp} \Gamma \beta\_{\perp} \quad := \quad \alpha'\_{\perp} \left( I\_s - \sum\_{j=1}^{p-1} \Gamma\_j \right) \beta\_{\perp} \tag{5}$$

has full rank. It is well-known that condition (5) is fulfilled on the complement of a "thin" algebraic subset of *V*<sup>RRR</sup><sub>*p*,*r*</sub> and is therefore ignored in estimation, as it is "generically" fulfilled.<sup>10</sup>
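Condition (5) can be checked numerically from the autoregressive coefficients alone, since only the column spaces of *α* and *β* enter; a sketch using an SVD to obtain the orthogonal complements (function names ours):

```python
import numpy as np

def orth_complement(M):
    """Orthonormal basis of the orthogonal complement of the column space of M."""
    U, _, _ = np.linalg.svd(M)
    return U[:, np.linalg.matrix_rank(M):]

def i1_condition(a_list):
    """Check the full rank condition (5) for a VAR with coefficients a_1, ..., a_p.

    alpha_perp and beta_perp are taken as bases of the left and right null
    spaces of Pi, which suffices since only their column spaces enter (5).
    """
    s = a_list[0].shape[0]
    p = len(a_list)
    Pi = np.eye(s) + sum(a_list)  # up to sign, the matrix Pi of (4)
    Gamma = np.eye(s) - sum(
        sum(a_list[m] for m in range(j + 1, p)) for j in range(p - 1))
    M = orth_complement(Pi).T @ Gamma @ orth_complement(Pi.T)
    if M.size == 0:               # Pi has full rank: nothing to check
        return True
    return np.linalg.matrix_rank(M) == M.shape[0]
```

E.g., the bivariate random walk passes the check, whereas (1 − *L*)²*y<sub>t</sub>* = *ε<sub>t</sub>* fails it, in line with the latter being I(2).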

The I(2) case is similar in structure to the I(1) case, but with two rank restrictions and one full rank condition to exclude even higher integration orders. The corresponding VECM is given by

$$\Delta\_0^2 y\_t = -\alpha\beta' y\_{t-1} - \Gamma \Delta\_0 y\_{t-1} + \sum\_{j=1}^{p-2} \Psi\_j \Delta\_0^2 y\_{t-j} + \varepsilon\_t, \tag{6}$$

with *α*, *β* as defined in (4), Γ as defined in (5) and Ψ<sub>*j*</sub> := −∑<sup>*p*−1</sup><sub>*k*=*j*+1</sub> Γ<sub>*k*</sub>, *j* = 1, ... , *p* − 2. From (5) we already know that reduced rank of

$$\alpha'\_{\perp} \Gamma \beta\_{\perp} =: \xi \eta', \tag{7}$$

with *ξ*, *η* ∈ R<sup>(*s*−*r*)×*m*</sup>, *m* < *s* − *r*, is required for higher integration orders. The condition for the corresponding solution process {*y<sub>t</sub>*}<sub>*t*∈Z</sub> to be an I(2) process is given by full rank of

$$\xi\_{\perp}'\alpha\_{\perp}'\left(\Gamma\beta(\beta'\beta)^{-1}(\alpha'\alpha)^{-1}\alpha'\Gamma + I\_s - \sum\_{j=1}^{p-2}\Psi\_j\right)\beta\_{\perp}\eta\_{\perp},$$

which again is typically ignored in estimation, just like condition (5) in the I(1) case. Thus, I(2) processes correspond to a "thin subset" of *V*<sup>RRR</sup><sub>*p*,*r*</sub>, which in turn constitutes a "thin subset" of *V*<sup>OLS</sup><sub>*p*</sub>. The fact that integrated processes correspond to "thin sets" in *V*<sup>OLS</sup><sub>*p*</sub> implies that obtaining estimated systems with specific integration and cointegration properties requires restricted estimation based on parameterizations tailor-made to highlight these properties.

Already for the I(2) case, formulating parameterizations that allow conveniently studying the integration and cointegration properties is a quite challenging task. Johansen (1997) contains several different (re-)parameterizations for the I(2) case and Paruolo (1996) defines "integration indices", *r*<sub>0</sub>, *r*<sub>1</sub>, *r*<sub>2</sub> say, as the numbers of columns of the matrices *β* ∈ R<sup>*s*×*r*<sub>0</sub></sup>, *β*<sub>1</sub> := *β*<sub>⊥</sub>*η* ∈ R<sup>*s*×*r*<sub>1</sub></sup> and *β*<sub>2</sub> := *β*<sub>⊥</sub>*η*<sub>⊥</sub> ∈ R<sup>*s*×*r*<sub>2</sub></sup>. Clearly, the indices *r*<sub>0</sub>, *r*<sub>1</sub>, *r*<sub>2</sub> are linked to the ranks of the above matrices Π and *α*′<sub>⊥</sub>Γ*β*<sub>⊥</sub>, as *r*<sub>0</sub> = *r* and *r*<sub>1</sub> = *m*, and the columns of [*β*, *β*<sub>1</sub>, *β*<sub>2</sub>] form a basis of R<sup>*s*</sup>, such that *s* = *r*<sub>0</sub> + *r*<sub>1</sub> + *r*<sub>2</sub>.

<sup>10</sup> A similar property holds for *V*<sup>RRR</sup><sub>*p*,*r*</sub> being a "thin" subset of *V*<sup>OLS</sup><sub>*p*</sub>. This implies that the probability that the OLS estimator calculated over *V*<sup>OLS</sup><sub>*p*</sub> corresponds to an element of *V*<sup>RRR</sup><sub>*p*,*r*</sub> ⊂ *V*<sup>OLS</sup><sub>*p*</sub> is equal to zero in general.

It holds that {*β*′<sub>2</sub>*y<sub>t</sub>*}<sub>*t*∈Z</sub> is an I(2) process without cointegration and {*β*′<sub>1</sub>*y<sub>t</sub>*}<sub>*t*∈Z</sub> is an I(1) process without cointegration. The process {*β*′*y<sub>t</sub>*}<sub>*t*∈Z</sub> is typically *I*(1) and in this case cointegrates with {*β*′<sub>2</sub>Δ<sub>0</sub>*y<sub>t</sub>*}<sub>*t*∈Z</sub> to stationarity. Thus, there is a direct correspondence of these indices to the dimensions of the different cointegrating spaces—both static and dynamic (with precise definitions given below in Definition 3).<sup>11</sup> Please note that again, as already in the I(1) case, different values of the integration indices *r*<sub>0</sub>, *r*<sub>1</sub>, *r*<sub>2</sub> lead to parameter spaces of different dimensions. Furthermore, in these parameterizations matrices describing different cointegrating spaces are (i) not identified and (ii) linked by restrictions, compare the discussion in Paruolo (2000, sct. 2.2) and (7). These facts render the analysis of the cointegration properties in I(2) VAR systems complicated. Also, in the I(2) VAR case usually some form of RRR estimator is considered over suitable subsets *V*<sup>RRR</sup><sub>*p*,*r*,*m*</sub> of *V*<sup>RRR</sup><sub>*p*,*r*</sub>, again based on implicit normalizations. Inference, however, again requires one to consider parameterizations explicitly.
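For illustration, the indices (*r*<sub>0</sub>, *r*<sub>1</sub>, *r*<sub>2</sub>) can be computed from the ranks of Π and *α*′<sub>⊥</sub>Γ*β*<sub>⊥</sub> (a numpy sketch under the definitions above; function names are ours):

```python
import numpy as np

def _perp(M):
    """Orthonormal basis of the orthogonal complement of the column space of M."""
    U, _, _ = np.linalg.svd(M)
    return U[:, np.linalg.matrix_rank(M):]

def integration_indices(a_list):
    """(r0, r1, r2) from the ranks of Pi and alpha_perp' Gamma beta_perp."""
    s = a_list[0].shape[0]
    p = len(a_list)
    Pi = np.eye(s) + sum(a_list)  # up to sign, the matrix Pi of (4)
    Gamma = np.eye(s) - sum(
        sum(a_list[m] for m in range(j + 1, p)) for j in range(p - 1))
    r0 = int(np.linalg.matrix_rank(Pi))
    M = _perp(Pi).T @ Gamma @ _perp(Pi.T)
    r1 = int(np.linalg.matrix_rank(M)) if M.size else 0
    return r0, r1, s - r0 - r1
```

E.g., for (1 − *L*)²*y<sub>t</sub>* = *ε<sub>t</sub>* in two dimensions this yields (0, 0, 2), two common I(2) trends, while the bivariate random walk yields (0, 2, 0).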

Estimation and inference issues are fundamentally more complex in the VARMA case than in the VAR case. This stems from the fact that unrestricted estimation—unlike in the VAR case—is not possible due to a lack of identification, as discussed below. This means that in the VARMA case identification and parameterization issues need to be tackled as the first step, compare the discussion in Hannan and Deistler (1988, chp. 2).

In this paper, we consider VARMA processes as solutions of the vector difference equation

$$\begin{array}{rcl} y_t + \sum_{j=1}^p a_j y_{t-j} & = & \varepsilon_t + \sum_{j=1}^q b_j \varepsilon_{t-j} + \Phi d_t \end{array}$$

with $a(L) := I_s + \sum_{j=1}^{p} a_j L^j$, where $a_j \in \mathbb{R}^{s \times s}$ for $j = 1, \ldots, p$, $a_p \neq 0$, and the non-explosiveness condition $\det(a(z)) \neq 0$ for $|z| < 1$. Similarly, $b(L) := I_s + \sum_{j=1}^{q} b_j L^j$, where $b_j \in \mathbb{R}^{s \times s}$ for $j = 1, \ldots, q$, $b_q \neq 0$, and $\Phi \in \mathbb{R}^{s \times m}$. The transfer function corresponding to a VARMA process is $k(z) := a(z)^{-1} b(z)$.

It is well-known that without further restrictions the VARMA *realization* $(a(z), b(z))$ of the transfer function $k(z) = a(z)^{-1} b(z)$ is not identified, i.e., different pairs of polynomial matrices $(a(z), b(z))$ can realize the same transfer function $k(z)$. It is clear that $k(z) = a(z)^{-1} m(z)^{-1} m(z) b(z) = (m(z)a(z))^{-1}(m(z)b(z))$ for all non-singular polynomial matrices $m(z)$, so that $(m(z)a(z), m(z)b(z))$ also realizes $k(z)$. Thus, the mapping $\pi$ attaching the transfer function $k(z) = a(z)^{-1}b(z)$ to the pair of polynomial matrices $(a(z), b(z))$ is not *injective*.<sup>12</sup>
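This non-injectivity is easy to check numerically. The following sketch (scalar case, with hypothetical coefficient values) recursively computes the power series coefficients of $k(z) = a(z)^{-1}b(z)$ from $k(z)a(z) = b(z)$, once for an ARMA realization and once for the observationally equivalent realization obtained by pre-multiplying both polynomials with $m(z) = 1 + 0.3z$:

```python
import numpy as np

def impulse_response(a, b, n):
    """First n power series coefficients k_j of k(z) = a(z)^{-1} b(z)
    for scalar polynomials a, b given as coefficient lists [1, a_1, ...]."""
    k = np.zeros(n)
    for j in range(n):
        bj = b[j] if j < len(b) else 0.0
        acc = bj
        for i in range(1, min(j, len(a) - 1) + 1):
            acc -= a[i] * k[j - i]
        k[j] = acc
    return k

# ARMA(1,0): a(z) = 1 - 0.5 z, b(z) = 1
k1 = impulse_response([1.0, -0.5], [1.0], 10)
# Pre-multiply both polynomials by m(z) = 1 + 0.3 z:
# a(z) = (1 - 0.5 z)(1 + 0.3 z) = 1 - 0.2 z - 0.15 z^2, b(z) = 1 + 0.3 z
k2 = impulse_response([1.0, -0.2, -0.15], [1.0, 0.3], 10)

print(np.allclose(k1, k2))  # True: both realizations share one transfer function
```

Both coefficient sequences coincide, so no statistical procedure can distinguish the two realizations from the data alone.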

Consequently, for a given rational transfer function $k(z)$ we refer to the class $\{(a(z), b(z)) : k(z) = a(z)^{-1}b(z)\}$ as the class of *observationally equivalent* VARMA realizations of $k(z)$. Achieving identification requires defining a canonical form, selecting one member of each class of observationally equivalent VARMA realizations for the set of considered transfer functions. A first step towards a canonical form is to only consider *left coprime* pairs $(a(z), b(z))$.<sup>13</sup> However, left coprimeness is not sufficient for identification and thus further restrictions are required, leading to parameter vectors of smaller dimension than $s^2(p+q)$. A widely used canonical form is the (reverse) echelon canonical form, see Hannan and Deistler (1988, Theorem 2.5.1, p. 59), based on (monic) normalizations of the diagonal elements of $a(z)$ and degree relationships between diagonal and off-diagonal elements as well as the entries in $b(z)$, which lead to zero restrictions. The (reverse) echelon canonical form in conjunction with a transformation to an error correction model was used in VARMA cointegration analysis in the I(1) case, e.g., in Poskitt (2006, Theorem 4.1), but, as in the VAR case, understanding the interdependencies of rank conditions already becomes complicated once one moves to the I(2) case.

<sup>11</sup> Below Example 3 we clarify how these indices are related to the state space unit root structure defined in Bauer and Wagner (2012, Definition 2) and link these to the dimensions of the cointegrating spaces in Section 5.2.

<sup>12</sup> Uniqueness of realizations in the VAR case stems from the normalization $b(z) = I_s$: for observationally equivalent realizations $(m(z)a(z), m(z)b(z))$ the requirement $m(z)b(z) = I_s$ forces $m(z) = I_s$, which reduces the class of observationally equivalent VAR realizations of the same transfer function $k(z) = a(z)^{-1}b(z)$, with $b(z) = I_s$, to a singleton.

<sup>13</sup> The pair $(a(z), b(z))$ is left coprime if all its left divisors are unimodular matrices. Unimodular matrices are polynomial matrices with constant non-zero determinant. Thus, pre-multiplication of, e.g., $a(z)$ with a unimodular matrix $u(z)$ does not affect the determinantal roots that shape the dynamic behavior of the solutions of VAR models.

In the VARMA case matters are further complicated by another well-known problem that makes statistical analysis considerably more involved compared to the VAR case. Although there exists a generalization of the autoregressive order to the VARMA case, such that any transfer function corresponding to a VARMA system has an *order* $n \in \mathbb{N}$ (with the precise definition given in the next section), it has been known since Hazewinkel and Kalman (1976) that no continuous parameterization of all rational transfer functions of order $n$ exists if $s > 1$. Therefore, if one wants to keep the above-discussed advantages that continuity of a parameterization provides, the set of transfer functions of order $n$, henceforth referred to as $M_n$, has to be partitioned into sets on which continuous parameterizations exist, i.e., $M_n = \bigcup_{\Gamma \in G} M_\Gamma$ for some index set $G$, as already mentioned in the introduction.<sup>14</sup> For any given partitioning of the set $M_n$ it is important to understand the relationships between the different subsets $M_\Gamma$, as well as the closures $\overline{M_\Gamma}$ of the pieces $M_\Gamma$, since in case of misspecification of $M_\Gamma$ points in $\overline{M_\Gamma} \setminus M_\Gamma$ cannot be avoided even asymptotically in, e.g., pseudo maximum likelihood estimation. These issues are more complicated in the VARMA case than in the VAR case, see the discussion in Hannan and Deistler (1988, Remark 1 after Theorem 2.5.3).

Based on these considerations, the following section provides and discusses a parameterization that focuses on unit root and cointegration properties, resorting to the state space framework that—as mentioned in the introduction—provides advantages for cointegration analysis. In particular, we derive an almost everywhere homeomorphic parameterization, based on partitioning the set of all considered transfer functions according to a multi-index $\Gamma$ that contains, among other elements, the state space unit root structure. This implies that certain cointegration properties are invariant for all systems corresponding to a subset $M_\Gamma$, i.e., the parameterization allows one to directly impose cointegration properties such as the "integration indices" of Paruolo (1996) mentioned before.

#### **3. The Canonical Form and the Parameterization**

As a first step we define the class of VARMA processes considered in this paper, using the differencing operator defined in (3):

**Definition 1.** *The s-dimensional real VARMA process* $\{y_t\}_{t \in \mathbb{Z}}$ *has* unit root structure $\Omega := ((\omega_1, h_1), \ldots, (\omega_l, h_l))$ *with* $0 \le \omega_1 < \omega_2 < \cdots < \omega_l \le \pi$, $h_k \in \mathbb{N}$, $k = 1, \ldots, l$, $l \ge 1$*, if it is a solution of the difference equation*

$$
\Delta_{\Omega}(y_t - \Phi d_t) := \prod_{k=1}^{l} \Delta_{\omega_k}^{h_k}(y_t - \Phi d_t) = v_t, \tag{8}
$$

*where* $\{d_t\}_{t \in \mathbb{Z}}$ *is an m-dimensional deterministic sequence,* $\Phi \in \mathbb{R}^{s \times m}$ *and* $\{v_t\}_{t \in \mathbb{Z}}$ *is a linearly regular stationary VARMA process, i.e., there exists a pair of left coprime matrix polynomials* $(a(z), b(z))$, $\det a(z) \neq 0$ *for* $|z| \le 1$*, such that* $v_t = a(L)^{-1}b(L)(\varepsilon_t) =: c(L)(\varepsilon_t)$ *for a white noise process* $\{\varepsilon_t\}_{t \in \mathbb{Z}}$ *with* $\mathbb{E}(\varepsilon_t \varepsilon_t') = \Sigma > 0$*, with furthermore* $c(z) \neq 0$ *for* $z = e^{i\omega_k}$, $k = 1, \ldots, l$*.*


A linearly regular stationary VARMA process has empty unit root structure $\Omega_0 := \{\}$.
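To make the composite operator $\Delta_\Omega$ concrete, the following sketch builds the corresponding lag polynomial. Since the defining Equation (3) is not reproduced in this section, the concrete operators are an illustrative assumption following the common seasonal-differencing convention: $\Delta_0 = 1 - L$, $\Delta_\pi = 1 + L$ and $\Delta_\omega = 1 - 2\cos(\omega)L + L^2$ for $0 < \omega < \pi$ (a conjugate root pair on the unit circle):

```python
import numpy as np

def delta(omega):
    """Coefficients of the real differencing polynomial at frequency omega:
    1 - L for omega = 0, 1 + L for omega = pi,
    1 - 2*cos(omega)*L + L^2 for 0 < omega < pi."""
    if omega == 0.0:
        return np.array([1.0, -1.0])
    if omega == np.pi:
        return np.array([1.0, 1.0])
    return np.array([1.0, -2.0 * np.cos(omega), 1.0])

def delta_Omega(structure):
    """Polynomial of Delta_Omega for Omega = ((omega_1,h_1),...,(omega_l,h_l))."""
    poly = np.array([1.0])
    for omega, h in structure:
        for _ in range(h):
            poly = np.convolve(poly, delta(omega))
    return poly

# Quarterly seasonal unit roots at 0, pi/2 and pi, each of order one:
# the product (1 - L)(1 + L^2)(1 + L) equals 1 - L^4.
print(delta_Omega([(0.0, 1), (np.pi / 2, 1), (np.pi, 1)]))
```

Applying the resulting coefficients as a filter to $\{y_t - \Phi d_t\}$ yields the stationary process $\{v_t\}$ of the definition.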

<sup>14</sup> When using the echelon canonical form, the partitioning is according to the so-called *Kronecker indices* related to a basis selection for the row-space of the *Hankel* matrix corresponding to the transfer function *k*(*z*), see, e.g., Hannan and Deistler (1988, chp. 2.4) for a precise definition.

As discussed in Bauer and Wagner (2012) the state space framework is convenient for the analysis of VARMA unit root processes. Detailed treatments of the state space framework are given in Hannan and Deistler (1988) and—in the context of unit root processes—Bauer and Wagner (2012).

A state space representation of a unit root VARMA process is<sup>15</sup>

$$\begin{array}{rcl} y_t &=& C x_t + \Phi d_t + \varepsilon_t, \\ x_{t+1} &=& A x_t + B \varepsilon_t, \end{array} \tag{9}$$

for a white noise process $\{\varepsilon_t\}_{t \in \mathbb{Z}}$, $\varepsilon_t \in \mathbb{R}^s$, a deterministic process $\{d_t\}_{t \in \mathbb{Z}}$, $d_t \in \mathbb{R}^m$, and the unobserved state process $\{x_t\}_{t \in \mathbb{Z}}$, $x_t \in \mathbb{C}^n$, with $A \in \mathbb{C}^{n \times n}$, $B \in \mathbb{C}^{n \times s}$, $C \in \mathbb{C}^{s \times n}$ and $\Phi \in \mathbb{R}^{s \times m}$.

**Remark 3.** *Bauer and Wagner (2012, Theorem 2) show that every real valued unit root VARMA process* {*yt*}*t*∈<sup>Z</sup> *as given in* (8) *has a real valued state space representation with* {*xt*}*t*∈<sup>Z</sup> *real valued and real valued system matrices* (*A*, *B*, *C*)*. Considering complex valued state space representations in* (9) *is merely for algebraic convenience, as in general some eigenvalues of A are complex valued. Note for completeness that Bauer and Wagner (2012) contains a detailed discussion why considering the A-matrix in the canonical form in (up to reordering) the Jordan normal form is useful for cointegration analysis. For the sake of brevity we abstain from including this discussion again in the present paper. The key aspect of this construction is its usefulness for cointegration analysis, which becomes visible in Remark 4, where the "simple" unit root properties of blocks of the state vector are discussed.*

The transfer function $k(z)$ with real valued power series coefficients corresponding to a real valued unit root process $\{y_t\}_{t \in \mathbb{Z}}$ as given in Definition 1 is given by the rational matrix function $k(z) = \Delta_\Omega(z)^{-1} a(z)^{-1} b(z)$. The (possibly complex valued) matrix triple $(A, B, C)$ *realizes* the transfer function $k(z)$ if and only if $\pi(A, B, C) := I_s + zC(I_n - zA)^{-1}B = k(z)$. Please note that, as for VARMA realizations, for a transfer function $k(z)$ there exist multiple state space realizations $(A, B, C)$, with possibly different state dimensions $n$. A state space system $(A, B, C)$ is *minimal* if there exists no state space system of lower state dimension realizing the same transfer function $k(z)$. The *order* of the transfer function $k(z)$ is the state dimension of a minimal system $(A, B, C)$ realizing $k(z)$.

All minimal state space realizations of a transfer function $k(z)$ only differ in the basis of the state (cf. Hannan and Deistler 1988, Theorem 2.3.4), i.e., $\pi(A, B, C) = \pi(\tilde{A}, \tilde{B}, \tilde{C})$ for two minimal state space systems $(A, B, C)$ and $(\tilde{A}, \tilde{B}, \tilde{C})$ is equivalent to the existence of a regular matrix $T \in \mathbb{C}^{n \times n}$ such that $A = T\tilde{A}T^{-1}$, $B = T\tilde{B}$, $C = \tilde{C}T^{-1}$. Thus, the matrices $A$ and $\tilde{A}$ are similar for all minimal realizations of a transfer function $k(z)$.
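The invariance of the transfer function under a change of state basis can be verified directly from its power series coefficients $k_0 = I_s$ and $k_j = CA^{j-1}B$ for $j \ge 1$. A small numerical sketch with arbitrary (hypothetical) system matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

def markov_params(A, B, C, n):
    """Power series coefficients of pi(A,B,C) = I + z C (I - z A)^{-1} B:
    k_0 = I and k_j = C A^(j-1) B for j >= 1."""
    s = C.shape[0]
    out = [np.eye(s)]
    M = B
    for _ in range(n):
        out.append(C @ M)
        M = A @ M
    return out

A = np.array([[0.9, 0.1], [0.0, 0.5]])
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))

T = rng.standard_normal((2, 2))  # a (generically) regular basis change
At, Bt, Ct = T @ A @ np.linalg.inv(T), T @ B, C @ np.linalg.inv(T)

k = markov_params(A, B, C, 8)
kt = markov_params(At, Bt, Ct, 8)
print(all(np.allclose(x, y) for x, y in zip(k, kt)))  # True
```

The coefficients agree because $\tilde{C}\tilde{A}^{j-1}\tilde{B} = CT^{-1}(TAT^{-1})^{j-1}TB = CA^{j-1}B$, which is exactly the similarity statement above.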

By imposing restrictions on the matrices of a minimal state space system $(A, B, C)$ realizing $k(z)$, Bauer and Wagner (2012, Theorem 2) provide a canonical form, i.e., a mapping of the set $M_n$ of transfer functions with real valued power series coefficients defined below onto unique state space realizations $(\mathcal{A}, \mathcal{B}, \mathcal{C})$. The set $M_n$ is defined as

$$M_n := \left\{ k(z) = \pi(A, B, C) \,\middle|\, \begin{array}{c} |\lambda_{\max}(A)| \le 1, \\ A \in \mathbb{R}^{n \times n}, \, B \in \mathbb{R}^{n \times s}, \, C \in \mathbb{R}^{s \times n}, \, (A, B, C) \text{ minimal} \end{array} \right\}.$$

To describe the necessary restrictions of the canonical form the following definition is useful:

<sup>15</sup> Here and below we will only consider state space systems in so-called innovation representation, with the same error in both the output equation and the state equation. Since every state space system has an innovation representation this is no restriction, compare Aoki (1990, chp. 7.1).

**Definition 2.** *A matrix* $B = [b_{i,j}]_{i=1,\ldots,c, \, j=1,\ldots,s} \in \mathbb{C}^{c \times s}$ *is* positive upper triangular (p.u.t.) *if there exist integers* $1 \le j_1 < j_2 < \cdots < j_c \le s$ *such that* $b_{i,j} = 0$ *for* $j < j_i$ *and* $b_{i,j_i} \in \mathbb{R}_+$*; i.e., B is of the form*

$$B = \begin{bmatrix} 0 & \cdots & 0 & b_{1,j_1} & * & & \cdots & & * \\ 0 & \cdots & & 0 & b_{2,j_2} & * & & \cdots & * \\ \vdots & & & & & \ddots & & & \vdots \\ 0 & & & \cdots & & & 0 & b_{c,j_c} & * \end{bmatrix},$$

*where the symbol* ∗ *indicates unrestricted complex-valued entries.*

A unique state space realization of *k*(*z*) ∈ *Mn* is given as follows (cf. Bauer and Wagner 2012, Theorem 2):

**Theorem 1.** *For every transfer function k*(*z*) ∈ *Mn there exists a unique minimal (complex) state space realization* (A, B, C) *such that*

$$\begin{array}{rcl} y_t &=& \mathcal{C} x_{t,\mathbb{C}} + \Phi d_t + \varepsilon_t, \\ x_{t+1,\mathbb{C}} &=& \mathcal{A} x_{t,\mathbb{C}} + \mathcal{B} \varepsilon_t \end{array}$$

*with:*

*(i)* $\mathcal{A} := \operatorname{diag}(\mathcal{A}_u, \mathcal{A}_\bullet) := \operatorname{diag}(\mathcal{A}_{1,\mathbb{C}}, \ldots, \mathcal{A}_{l,\mathbb{C}}, \mathcal{A}_\bullet)$*,* $\mathcal{A}_u \in \mathbb{C}^{n_u \times n_u}$, $\mathcal{A}_\bullet \in \mathbb{R}^{n_\bullet \times n_\bullet}$*, where it holds for* $k = 1, \ldots, l$ *that*

**–** *for* $0 < \omega_k < \pi$*:*

$$\mathcal{A}_{k,\mathbb{C}} := \begin{bmatrix} J_k & 0 \\ 0 & \overline{J}_k \end{bmatrix} \in \mathbb{C}^{2d^k \times 2d^k},$$

**–** *for* $\omega_k \in \{0, \pi\}$*:*

$$\mathcal{A}_{k,\mathbb{C}} := J_k \in \mathbb{R}^{d^k \times d^k},$$

*with*

$$J_k := \begin{bmatrix} \bar{z}_k I_{d_1^k} & [I_{d_1^k}, 0_{d_1^k \times (d_2^k - d_1^k)}] & 0 & \cdots & 0 \\ 0_{d_2^k \times d_1^k} & \bar{z}_k I_{d_2^k} & [I_{d_2^k}, 0_{d_2^k \times (d_3^k - d_2^k)}] & 0 & \vdots \\ 0 & 0 & \bar{z}_k I_{d_3^k} & \ddots & 0 \\ \vdots & \vdots & & \ddots & [I_{d_{h_k-1}^k}, 0_{d_{h_k-1}^k \times (d_{h_k}^k - d_{h_k-1}^k)}] \\ 0 & 0 & \cdots & 0 & \bar{z}_k I_{d_{h_k}^k} \end{bmatrix}, \tag{10}$$

*where* $0 < d_1^k \le d_2^k \le \cdots \le d_{h_k}^k$*.*

**–** *for* $0 < \omega_k < \pi$*:*

$$\mathcal{B}\_{k,\mathbb{C}} := \left[ \begin{array}{c} \mathcal{B}\_k \\ \overline{\mathcal{B}}\_k \end{array} \right] \in \mathbb{C}^{2d^k \times s} \text{ and } \mathcal{C}\_{k,\mathbb{C}} := \left[ \mathcal{C}\_k, \,\, \overline{\mathcal{C}}\_k \right] \in \mathbb{C}^{s \times 2d^k}.$$

**–** *for* $\omega_k \in \{0, \pi\}$*:*

$$\mathcal{B}\_{k,\mathbb{C}} := \mathcal{B}\_k \in \mathbb{R}^{d^k \times s} \text{ and } \mathcal{C}\_{k,\mathbb{C}} := \mathcal{C}\_k \in \mathbb{R}^{s \times d^k}.$$


**Remark 4.** *As indicated in Remark 3 and discussed in detail in Bauer and Wagner (2012), considering complex valued quantities is merely for algebraic convenience. For econometric analysis, interest is, of course, in real valued quantities. These can be straightforwardly obtained from the representation given in Theorem 1 as follows. First define a transformation matrix (and its inverse):*

$$T_{\mathbb{R},d} := \left[ I_d \otimes \begin{bmatrix} 1 \\ i \end{bmatrix}, \; I_d \otimes \begin{bmatrix} 1 \\ -i \end{bmatrix} \right] \in \mathbb{C}^{2d \times 2d}, \qquad T_{\mathbb{R},d}^{-1} = \frac{1}{2} \begin{bmatrix} I_d \otimes \begin{bmatrix} 1 & -i \end{bmatrix} \\ I_d \otimes \begin{bmatrix} 1 & i \end{bmatrix} \end{bmatrix}.$$
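A quick numerical sanity check of $T_{\mathbb{R},d}$, its inverse, and the fact that it maps a stacked conjugate pair $[x', \bar{x}']'$ to a real vector (a sketch; the Kronecker layout follows the definition just given):

```python
import numpy as np

def T_R(d):
    """T_{R,d} = [I_d kron [1; i], I_d kron [1; -i]]."""
    up = np.kron(np.eye(d), np.array([[1.0], [1.0j]]))
    lo = np.kron(np.eye(d), np.array([[1.0], [-1.0j]]))
    return np.hstack([up, lo])

d = 2
T = T_R(d)
Tinv = 0.5 * np.vstack([np.kron(np.eye(d), np.array([[1.0, -1.0j]])),
                        np.kron(np.eye(d), np.array([[1.0, 1.0j]]))])
print(np.allclose(T @ Tinv, np.eye(2 * d)))   # True: Tinv is the inverse

rng = np.random.default_rng(1)
x = rng.standard_normal(d) + 1j * rng.standard_normal(d)
x_real = T @ np.concatenate([x, np.conj(x)])
print(np.allclose(x_real.imag, 0.0))          # True: conjugate pairs map to real vectors
```

Componentwise, each pair $(x_j, \bar{x}_j)$ is sent to $(2\operatorname{Re} x_j, -2\operatorname{Im} x_j)$, which is why the transformed state process below is real valued.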

*Starting from the complex valued canonical representation* (A, B, C)*, a real valued canonical representation*

$$\begin{array}{rcl} y_t &=& \mathcal{C}_{\mathbb{R}} x_{t,\mathbb{R}} + \Phi d_t + \varepsilon_t, \\ x_{t+1,\mathbb{R}} &=& \mathcal{A}_{\mathbb{R}} x_{t,\mathbb{R}} + \mathcal{B}_{\mathbb{R}} \varepsilon_t, \end{array}$$

*with real valued matrices* (AR, BR, CR) *follows from using the just defined transformation matrix. In particular it holds that:*

$$\begin{array}{rclcl} \mathcal{A}_{\mathbb{R}} &:=& \operatorname{diag}(\mathcal{A}_{u,\mathbb{R}}, \mathcal{A}_\bullet) &:=& \operatorname{diag}(\mathcal{A}_{1,\mathbb{R}}, \ldots, \mathcal{A}_{l,\mathbb{R}}, \mathcal{A}_\bullet), \\ \mathcal{B}_{\mathbb{R}} &:=& [\mathcal{B}_{u,\mathbb{R}}', \mathcal{B}_\bullet']' &:=& [\mathcal{B}_{1,\mathbb{R}}', \ldots, \mathcal{B}_{l,\mathbb{R}}', \mathcal{B}_\bullet']', \\ \mathcal{C}_{\mathbb{R}} &:=& [\mathcal{C}_{u,\mathbb{R}}, \mathcal{C}_\bullet] &:=& [\mathcal{C}_{1,\mathbb{R}}, \ldots, \mathcal{C}_{l,\mathbb{R}}, \mathcal{C}_\bullet], \end{array}$$

*with*

$$(\mathcal{A}_{k,\mathbb{R}}, \mathcal{B}_{k,\mathbb{R}}, \mathcal{C}_{k,\mathbb{R}}) := \begin{cases} \left( T_{\mathbb{R},d^k} \mathcal{A}_{k,\mathbb{C}} T_{\mathbb{R},d^k}^{-1}, \; T_{\mathbb{R},d^k} \mathcal{B}_{k,\mathbb{C}}, \; \mathcal{C}_{k,\mathbb{C}} T_{\mathbb{R},d^k}^{-1} \right) & \text{if } 0 < \omega_k < \pi, \\ (\mathcal{A}_k, \mathcal{B}_k, \mathcal{C}_k) & \text{if } \omega_k \in \{0, \pi\}. \end{cases}$$

*Before we turn to the real valued state process corresponding to the real valued canonical representation, we first consider the complex valued state process* $\{x_{t,\mathbb{C}}\}_{t \in \mathbb{Z}}$ *in more detail. This process is partitioned according to the partitioning of the matrices* $\mathcal{C}_{k,\mathbb{C}}$ *into* $x_{t,\mathbb{C}} := [x_{t,u}', x_{t,\bullet}']' := [x_{t,1,\mathbb{C}}', \ldots, x_{t,l,\mathbb{C}}', x_{t,\bullet}']'$*, where*

$$x_{t,k,\mathbb{C}} := \begin{cases} [x_{t,k}', \overline{x}_{t,k}']' & \text{if } 0 < \omega_k < \pi, \\ x_{t,k} & \text{if } \omega_k \in \{0, \pi\}, \end{cases}$$

*with*

$$x_{t+1,k} = J_k x_{t,k} + \mathcal{B}_k \varepsilon_t \qquad \text{for } k = 1, \ldots, l.$$

*For* $k = 1, \ldots, l$ *the sub-vectors* $x_{t,k}$ *are further decomposed into* $x_{t,k} := [(x_{t,k}^1)', \ldots, (x_{t,k}^{h_k})']'$*, with* $x_{t,k}^j \in \mathbb{C}^{d_j^k}$ *for* $j = 1, \ldots, h_k$*, according to the partitioning* $\mathcal{C}_k = [\mathcal{C}_{k,1}, \ldots, \mathcal{C}_{k,h_k}]$*.*

*The partitioning of the complex valued process* $\{x_{t,\mathbb{C}}\}_{t \in \mathbb{Z}}$ *leads to an analogous partitioning of the real valued state process* $\{x_{t,\mathbb{R}}\}_{t \in \mathbb{Z}}$*,* $x_{t,\mathbb{R}} := [x_{t,u,\mathbb{R}}', x_{t,\bullet}']' := [x_{t,1,\mathbb{R}}', \ldots, x_{t,l,\mathbb{R}}', x_{t,\bullet}']'$*, obtained from*

$$x_{t,k,\mathbb{R}} := \begin{cases} T_{\mathbb{R},d^k} \, x_{t,k,\mathbb{C}} & \text{if } 0 < \omega_k < \pi, \\ x_{t,k} & \text{if } \omega_k \in \{0, \pi\}, \end{cases}$$

*with the corresponding block of the state equation given by*

$$
x_{t+1,k,\mathbb{R}} = \mathcal{A}_{k,\mathbb{R}} x_{t,k,\mathbb{R}} + \mathcal{B}_{k,\mathbb{R}} \varepsilon_t.
$$

*For* $k = 1, \ldots, l$ *the sub-vectors* $x_{t,k,\mathbb{R}}$ *are further decomposed into* $x_{t,k,\mathbb{R}} := [(x_{t,k,\mathbb{R}}^1)', \ldots, (x_{t,k,\mathbb{R}}^{h_k})']'$*, with* $x_{t,k,\mathbb{R}}^j \in \mathbb{R}^{2d_j^k}$ *if* $0 < \omega_k < \pi$ *and* $x_{t,k,\mathbb{R}}^j \in \mathbb{R}^{d_j^k}$ *if* $\omega_k \in \{0, \pi\}$*, for* $j = 1, \ldots, h_k$*, and* $\mathcal{C}_{k,\mathbb{R}} := [\mathcal{C}_{k,1,\mathbb{R}}, \ldots, \mathcal{C}_{k,h_k,\mathbb{R}}]$ *decomposed accordingly.*

*Bauer and Wagner (2012, Theorem 3, p. 1328) show that the processes* $\{x_{t,k,\mathbb{R}}^j\}_{t \in \mathbb{Z}}$ *have unit root structure* $((\omega_k, h_k - j + 1))$ *for* $j = 1, \ldots, h_k$ *and* $k = 1, \ldots, l$*. Furthermore, for* $j = 1, \ldots, h_k$ *and* $k = 1, \ldots, l$ *the processes* $\{x_{t,k,\mathbb{R}}^j\}_{t \in \mathbb{Z}}$ *are not cointegrated, as defined in Definition 3 below. For* $\omega_k = 0$*, the process* $\{x_{t,k,\mathbb{R}}^j\}_{t \in \mathbb{Z}}$ *is the* $d_j^k$*-dimensional process of* stochastic trends *of order* $h_1 - j + 1$*, while the* $2d_j^k$ *components of* $\{x_{t,k,\mathbb{R}}^j\}_{t \in \mathbb{Z}}$*, for* $0 < \omega_k < \pi$*, and the* $d_j^k$ *components of* $\{x_{t,l,\mathbb{R}}^j\}_{t \in \mathbb{Z}}$*, for* $\omega_k = \pi$*, are referred to as* stochastic cycles *of order* $h_k - j + 1$ *at their corresponding frequencies* $\omega_k$*.*
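These "simple" unit root properties of the state blocks can be read off the impulse responses. For $\omega_k = 0$, $h_k = 2$, $d_1^k = d_2^k = 1$ the relevant $\mathcal{A}$-block is the Jordan-type matrix $J = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. The following sketch (with hypothetical $\mathcal{B}$ entries) shows that the response of the first state component to a single shock grows linearly (a double unit root, i.e., I(2)), while that of the second is constant (a single unit root, i.e., I(1)):

```python
import numpy as np

# A-block of the canonical form for omega = 0, h_k = 2, d_1^k = d_2^k = 1:
J = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.5],
              [1.0]])              # hypothetical B entries, s = 1

# Response of the state to a single shock eps_0 = 1:
# x_t = J^(t-1) B, and J^n = [[1, n], [0, 1]].
resp = [np.linalg.matrix_power(J, t - 1) @ B for t in range(1, 6)]
first = [float(r[0, 0]) for r in resp]   # linear growth: I(2) component
second = [float(r[1, 0]) for r in resp]  # constant: I(1) component
print(first)    # [0.5, 1.5, 2.5, 3.5, 4.5]
print(second)   # [1.0, 1.0, 1.0, 1.0, 1.0]
```

Differencing the first response once leaves a constant and only differencing twice removes it, while one difference suffices for the second: exactly the order reduction by one per index $j$ stated above.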

**Remark 5.** *Parameterizing the stable part of the transfer function using the echelon canonical form is merely one possible choice. Any other canonical form of the stable subsystem and suitable parameterization based on it can be used instead for the stable subsystem.*

**Remark 6.** *Starting from a state space system* (9) *with matrices* $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ *in canonical form, a solution for* $y_t$, $t > 0$ *(with the solution for* $t < 0$ *obtained completely analogously)—for some* $x_1 = [x_{1,u}', x_{1,\bullet}']'$*—is given by*

$$y_t = \sum_{j=1}^{t-1} \mathcal{C}_u \mathcal{A}_u^{j-1} \mathcal{B}_u \varepsilon_{t-j} + \mathcal{C}_u \mathcal{A}_u^{t-1} x_{1,u} + \sum_{j=1}^{t-1} \mathcal{C}_\bullet \mathcal{A}_\bullet^{j-1} \mathcal{B}_\bullet \varepsilon_{t-j} + \mathcal{C}_\bullet \mathcal{A}_\bullet^{t-1} x_{1,\bullet} + \Phi d_t + \varepsilon_t.$$

*Clearly, the term* $\mathcal{C}_u \mathcal{A}_u^{t-1} x_{1,u}$ *is stochastically singular and acts effectively like a deterministic component, which may lead to an identification problem with* $\Phi d_t$*. If the deterministic component* $\Phi d_t$ *is rich enough to "absorb"* $\mathcal{C}_u \mathcal{A}_u^{t-1} x_{1,u}$*, then one solution of the identification problem is to set* $x_{1,u} = 0$*. Rich enough here means, e.g., in the I(1) case with* $\mathcal{A}_u = I$ *that* $d_t$ *contains an intercept. Analogously, in the MFI(1) case* $d_t$ *has to contain seasonal dummy variables corresponding to all unit root frequencies. The term* $\mathcal{C}_\bullet \mathcal{A}_\bullet^{t-1} x_{1,\bullet}$ *decays exponentially and, therefore, does not impact the asymptotic properties of any statistical procedure. It is, therefore, inconsequential for statistical analysis but convenient (with respect to our definition of unit root processes) to set* $x_{1,\bullet} = \sum_{j=1}^{\infty} \mathcal{A}_\bullet^{j-1} \mathcal{B}_\bullet \varepsilon_{1-j}$*. This corresponds to the steady state or stationary solution of the stable block of the state equation, and renders* $\{x_{t,\bullet}\}_{t \in \mathbb{N}}$ *or, when the solution on* $\mathbb{Z}$ *is considered,* $\{x_{t,\bullet}\}_{t \in \mathbb{Z}}$ *stationary. Please note that these issues with respect to starting values, potential identification problems and their impact or non-impact on statistical procedures also occur in the VAR setting.*

Bauer and Wagner (2012, Theorem 2) show that minimality of the canonical state space realization $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ implies full row rank of the p.u.t. blocks $\mathcal{B}_{k,h_k,j}$ of $\mathcal{B}_{k,h_k}$. In addition to proposing the canonical form, Bauer and Wagner (2012) also provide details on how to transform any minimal state space realization into canonical form: Given a minimal state space system $(A, B, C)$ realizing the transfer function $k(z) \in M_n$, the first step is to find a similarity transformation $T$ such that $\tilde{A} = TAT^{-1}$ is of the form given in (10), by using an eigenvalue decomposition, compare Chatelin (1993). In the second step the corresponding stable subsystem $(\tilde{A}_\bullet, \tilde{B}_\bullet, \tilde{C}_\bullet)$ is transformed to echelon canonical form as described in Hannan and Deistler (1988, chp. 2). These two transformations do not lead to a unique realization, because the restrictions on $\mathcal{A}$ do not uniquely determine the *unstable subsystem* $(\mathcal{A}_u, \mathcal{B}_u, \mathcal{C}_u)$.

For example, in the case $\Omega = ((\omega_1, h_1)) = ((0, 1))$, $n_\bullet = 0$, $d_1^1 < s$, such that $(I_{d_1^1}, \mathcal{B}_1, \mathcal{C}_1)$ is a corresponding state space system, the same transfer function $k(z) = I_s + z\mathcal{C}_1(1 - z)^{-1}\mathcal{B}_1 = I_s + \mathcal{C}_1\mathcal{B}_1 z(1 - z)^{-1}$ is realized also by all systems $(I_{d_1^1}, T\mathcal{B}_1, \mathcal{C}_1 T^{-1})$, with $T \in \mathbb{C}^{d_1^1 \times d_1^1}$ an arbitrary regular matrix. To find a unique realization, the product $\mathcal{C}_1\mathcal{B}_1$ needs to be uniquely decomposed into factors $\mathcal{C}_1$ and $\mathcal{B}_1$. This is achieved by performing a QR decomposition of $\mathcal{C}_1\mathcal{B}_1$ (without pivoting) that leads to $\mathcal{C}_1'\mathcal{C}_1 = I$. The additional restriction of $\mathcal{B}_1$ being a p.u.t. matrix of full row rank then leads to a unique factorization of $\mathcal{C}_1\mathcal{B}_1$ into $\mathcal{C}_1$ and $\mathcal{B}_1$. In the general case with an arbitrary unit root structure $\Omega$, similar arguments lead to p.u.t. restrictions on sub-blocks $\mathcal{B}_{k,h_k,j}$ of $\mathcal{B}_u$ and orthogonality restrictions on sub-blocks of $\mathcal{C}_u$.
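The factorization step can be sketched numerically (the values are hypothetical, and `numpy`'s unpivoted QR plays the role described above): given an identified product $M = \mathcal{C}_1\mathcal{B}_1$ of rank $d_1^1$, a QR decomposition plus a sign normalization yields $\mathcal{C}_1$ with $\mathcal{C}_1'\mathcal{C}_1 = I$ and a p.u.t. $\mathcal{B}_1$:

```python
import numpy as np

# Hypothetical rank-2 product M = C1 B1 with s = 3, d = 2:
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 1.0]])
d = 2

Q, R = np.linalg.qr(M)                    # QR decomposition without pivoting
C1, B1 = Q[:, :d].copy(), R[:d, :].copy()

# Normalize signs so each row of B1 has a positive leading entry (p.u.t.):
for i in range(d):
    j = np.flatnonzero(np.abs(B1[i]) > 1e-10)[0]
    if B1[i, j] < 0:
        B1[i] *= -1.0
        C1[:, i] *= -1.0

print(np.allclose(C1.T @ C1, np.eye(d)))  # True: orthonormal columns
print(np.allclose(C1 @ B1, M))            # True: the product is recovered
```

Since the QR factors of a full-column-rank leading block are unique once the leading entries of $R$ are forced positive, this pins down one representative of the equivalence class $(T\mathcal{B}_1, \mathcal{C}_1 T^{-1})$.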

The canonical form introduced in Theorem 1 was designed to be useful for cointegration analysis. Seeing this first requires a definition of static and polynomial cointegration (cf. Bauer and Wagner 2012, Definitions 3 and 4).

#### **Definition 3.**

*For two unit root structures* $\Omega = ((\omega_1, h_1), \ldots, (\omega_l, h_l))$ *and* $\tilde{\Omega} = ((\tilde{\omega}_1, \tilde{h}_1), \ldots, (\tilde{\omega}_{\tilde{l}}, \tilde{h}_{\tilde{l}}))$*, with* $F(\Omega) := \{\omega_1, \ldots, \omega_l\}$ *denoting the set of unit root frequencies, write* $\tilde{\Omega} \preceq \Omega$ *if:*

**–** $F(\tilde{\Omega}) \subseteq F(\Omega)$*,*

**–** *for all* $\omega \in F(\tilde{\Omega})$ *and* $\tilde{k}$, $k$ *such that* $\tilde{\omega}_{\tilde{k}} = \omega_k = \omega$ *it holds that* $\tilde{h}_{\tilde{k}} \le h_k$*.*

*Furthermore,* $\tilde{\Omega} \prec \Omega$ *if* $\tilde{\Omega} \preceq \Omega$ *and* $\tilde{\Omega} \neq \Omega$*. For two unit root structures* $\tilde{\Omega} \preceq \Omega$ *define the decrease* $\delta_k(\Omega, \tilde{\Omega})$ *of the integration order at frequency* $\omega_k$*, for* $k = 1, \ldots, l$*, as*

$$\delta\_k(\Omega, \tilde{\Omega}) \quad := \begin{cases} h\_k - \tilde{h}\_k & \exists \tilde{k} : \tilde{\omega}\_{\tilde{k}} = \omega\_k \in F(\tilde{\Omega}), \\ h\_k & \omega\_k \notin F(\tilde{\Omega}) \end{cases}.$$

*Given a process* $\{y_t\}_{t \in \mathbb{Z}}$ *with unit root structure* $\Omega$*, let the vector polynomial* $\beta(z) \neq 0$ *be such that:*

**–** $\beta(L)'(\{y_t\}_{t \in \mathbb{Z}})$ *has unit root structure* $\tilde{\Omega}$*,*

**–** $\max_{k=1,\ldots,l} \|\beta(e^{i\omega_k})\| \, \delta_k(\Omega, \tilde{\Omega}) \neq 0$*.*

*In this case the vector polynomial* $\beta(z)$ *is a* polynomial cointegrating vector (PCIV) of order $(\Omega, \tilde{\Omega})$*. (v) All PCIVs of order* $(\Omega, \tilde{\Omega})$ *span the* polynomial cointegrating space of order $(\Omega, \tilde{\Omega})$*.*

<sup>16</sup> The definition of cointegrating spaces as linear subspaces allows characterizing them by a basis and implies a well-defined dimension. These advantages, however, have the implication that the zero vector is an element of all cointegrating spaces, despite not being a cointegrating vector in our definition, where the zero vector is excluded. This issue is, of course, well known in the cointegration literature.

#### **Remark 7.**


To illustrate the advantages of the canonical form for cointegration analysis consider

$$y_t = \sum_{k=1}^{l} \sum_{j=1}^{h_k} \mathcal{C}_{k,j,\mathbb{R}} x_{t,k,\mathbb{R}}^j + \mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t.$$

By Remark 4, the process $\{x_{t,k,\mathbb{R}}^j\}_{t \in \mathbb{Z}}$ is not cointegrated. This implies that $\beta \in \mathbb{R}^s$, $\beta \neq 0$, reduces the integration order at the unit root $z_k$ to $h_k - j$ if and only if $\beta'[\mathcal{C}_{k,1,\mathbb{R}}, \ldots, \mathcal{C}_{k,j,\mathbb{R}}] = 0$ and $\beta'\mathcal{C}_{k,j+1,\mathbb{R}} \neq 0$, or equivalently $\beta'[\mathcal{C}_{k,1}, \ldots, \mathcal{C}_{k,j}] = 0$ and $\beta'\mathcal{C}_{k,j+1} \neq 0$ (using the transformation to the complex matrices of the canonical form, as discussed in Remark 4, and that $\beta'[\mathcal{C}_k, \overline{\mathcal{C}}_k] = 0$ if and only if $\beta'\mathcal{C}_k = 0$). Thus, the CIVs are characterized by orthogonality to sub-blocks of $\mathcal{C}_u$.

The real valued representation given in Remark 4, used in its partitioned form just above, immediately leads to a necessary orthogonality constraint for polynomial cointegration of degree one, since for $\beta(z) = \beta_0 + \beta_1 z$

$$\begin{array}{rcl}
\beta(L)'(y_t) &=& \beta(L)'(\mathcal{C}_{u,\mathbb{R}} x_{t,u,\mathbb{R}} + \mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t) \\
&=& \beta_0' \mathcal{C}_{u,\mathbb{R}} x_{t,u,\mathbb{R}} + \beta_1' \mathcal{C}_{u,\mathbb{R}} x_{t-1,u,\mathbb{R}} + \beta(L)'(\mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t) \\
&=& \beta_0' \mathcal{C}_{u,\mathbb{R}} (\mathcal{A}_{u,\mathbb{R}} x_{t-1,u,\mathbb{R}} + \mathcal{B}_{u,\mathbb{R}} \varepsilon_{t-1}) + \beta_1' \mathcal{C}_{u,\mathbb{R}} x_{t-1,u,\mathbb{R}} + \beta(L)'(\mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t) \\
&=& (\beta_0' \mathcal{C}_{u,\mathbb{R}} \mathcal{A}_{u,\mathbb{R}} + \beta_1' \mathcal{C}_{u,\mathbb{R}}) x_{t-1,u,\mathbb{R}} + \beta_0' \mathcal{C}_{u,\mathbb{R}} \mathcal{B}_{u,\mathbb{R}} \varepsilon_{t-1} + \beta(L)'(\mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t) \\
&=& (\beta_0' \mathcal{C}_u \mathcal{A}_u + \beta_1' \mathcal{C}_u) x_{t-1,u} + \beta_0' \mathcal{C}_u \mathcal{B}_u \varepsilon_{t-1} + \beta(L)'(\mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t)
\end{array}$$

follows. Since all terms except the first are stationary or deterministic, a necessary condition for a reduction of the unit root structure is the orthogonality of $[\beta_0', \beta_1']'$ to sub-blocks of $\begin{bmatrix} \mathcal{C}_{u,\mathbb{R}} \mathcal{A}_{u,\mathbb{R}} \\ \mathcal{C}_{u,\mathbb{R}} \end{bmatrix}$ or sub-blocks of the complex matrix $\begin{bmatrix} \mathcal{C}_u \mathcal{A}_u \\ \mathcal{C}_u \end{bmatrix}$. Please note, however, that this orthogonality condition is not sufficient for $[\beta_0', \beta_1']'$ to be a PCIV, because it does not imply $\max_{k=1,\ldots,l} \|\beta(e^{i\omega_k})\| \, \delta_k(\Omega, \tilde{\Omega}) \neq 0$. For a detailed discussion of polynomial cointegration, when considering also higher polynomial degrees, see Bauer and Wagner (2012, sct. 5).
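This orthogonality condition can be illustrated with a small numerical sketch in an I(2) setting (all values hypothetical, with $s = 2$, $\mathcal{C}_u = I_2$ and $\mathcal{A}_u$ a Jordan-type block): choosing $\beta_1 = -\mathcal{A}_u'\beta_0$ enforces $\beta_0'\mathcal{C}_u\mathcal{A}_u + \beta_1'\mathcal{C}_u = 0$, and the response of $\beta_0'y_t + \beta_1'y_{t-1}$ to a single shock then vanishes after one period:

```python
import numpy as np

Au = np.array([[1.0, 1.0],
               [0.0, 1.0]])        # unstable A-block: unit root at omega = 0, h = 2
Cu = np.eye(2)                     # hypothetical C_u
Bu = np.array([[0.2, 0.0],
               [1.0, 0.5]])        # hypothetical B_u

beta0 = np.array([1.0, 0.0])
beta1 = -Au.T @ beta0              # enforces beta0' Cu Au + beta1' Cu = 0 (Cu = I here)

# Response of y_t to the single shock eps_0 = e_1: y_t = Cu Au^(t-1) Bu e_1 for t >= 1.
e1 = np.array([1.0, 0.0])
y = {t: Cu @ np.linalg.matrix_power(Au, t - 1) @ Bu @ e1 for t in range(1, 8)}

pc = [float(beta0 @ y[t] + beta1 @ y[t - 1]) for t in range(2, 8)]
print(np.allclose(pc, 0.0))        # True: the polynomial combination has no
                                   # persistent response, although y_t itself trends
```

The untransformed response $\beta_0'y_t$ grows linearly in $t$, so the reduction really is achieved by the lag polynomial and not by $\beta_0$ alone.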

The following examples illustrate cointegration analysis in the state space framework for the empirically most relevant cases, i.e., the I(1), the MFI(1) and the I(2) case.

**Example 1** (Cointegration in the I(1) case)**.** *In the I(1) case, neglecting the stable subsystem and the deterministic components for simplicity, it holds that*

$$\begin{array}{rcll}
y_t &=& \mathcal{C}_1 x_{t,1} + \varepsilon_t, & y_t, \varepsilon_t \in \mathbb{R}^s,\ x_{t,1} \in \mathbb{R}^{d_1^1},\ \mathcal{C}_1 \in \mathbb{R}^{s \times d_1^1},\\
x_{t+1,1} &=& x_{t,1} + \mathcal{B}_1 \varepsilon_t, & \mathcal{B}_1 \in \mathbb{R}^{d_1^1 \times s}.
\end{array}$$

*The vector $\beta \in \mathbb{R}^s$, $\beta \neq 0$, is a CIV of order $((0,1),\{\})$ if and only if $\beta'\mathcal{C}_1 = 0$.*
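The I(1) mechanics can be illustrated numerically. The following sketch uses hypothetical dimensions $s = 3$, $d_1^1 = 2$ and arbitrarily chosen matrices $\mathcal{C}_1$, $\mathcal{B}_1$ (not from the paper); it simulates the system and confirms that a vector orthogonal to $\mathcal{C}_1$ removes the stochastic trend:

```python
# Hypothetical I(1) system: y_t = C1 x_t + eps_t, x_{t+1} = x_t + B1 eps_t.
import numpy as np

rng = np.random.default_rng(0)
C1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # C_1, s = 3, d_1^1 = 2
B1 = rng.standard_normal((2, 3))                      # B_1, arbitrary

T = 10_000
eps = rng.standard_normal((T, 3))
x = np.zeros(2)
y = np.empty((T, 3))
for t in range(T):
    y[t] = C1 @ x + eps[t]          # y_t = C_1 x_t + eps_t
    x = x + B1 @ eps[t]             # x_{t+1} = x_t + B_1 eps_t (random walk)

beta = np.array([1.0, 1.0, -1.0])   # beta' C_1 = 0, so beta is a CIV
assert np.allclose(beta @ C1, 0)
# beta' y_t = beta' eps_t is white noise with bounded variance, while each
# individual component of y_t is dominated by the random walk in x_t:
assert np.var(y @ beta) < np.var(y[:, 0])
```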

**Example 2** (Cointegration in the MFI(1) case with complex unit root $z_k$)**.** *In the MFI(1) case with unit root structure $\Omega = ((\omega_k, 1))$ and complex unit root $z_k$, neglecting the stable subsystem and the deterministic components for simplicity, it holds that*

$$\begin{aligned} y_t &= \mathcal{C}_{k,\mathbb{R}} x_{t,k,\mathbb{R}} + \varepsilon_t \\ &= \begin{bmatrix} \mathcal{C}_k & \overline{\mathcal{C}}_k \end{bmatrix} \begin{bmatrix} x_{t,k} \\ \overline{x}_{t,k} \end{bmatrix} + \varepsilon_t, \end{aligned}$$

$$y_t, \varepsilon_t \in \mathbb{R}^s,\ x_{t,k,\mathbb{R}} \in \mathbb{R}^{2d_1^k},\ x_{t,k} \in \mathbb{C}^{d_1^k},\ \mathcal{C}_{k,\mathbb{R}} \in \mathbb{R}^{s \times 2d_1^k},\ \mathcal{C}_k \in \mathbb{C}^{s \times d_1^k},$$

$$\begin{bmatrix} x_{t+1,k} \\ \overline{x}_{t+1,k} \end{bmatrix} = \begin{bmatrix} z_k I_{d_1^k} & 0 \\ 0 & \overline{z}_k I_{d_1^k} \end{bmatrix} \begin{bmatrix} x_{t,k} \\ \overline{x}_{t,k} \end{bmatrix} + \begin{bmatrix} \mathcal{B}_k \\ \overline{\mathcal{B}}_k \end{bmatrix} \varepsilon_t, \quad \mathcal{B}_k \in \mathbb{C}^{d_1^k \times s}.$$

*The vector $\beta \in \mathbb{R}^s$, $\beta \neq 0$, is a CIV of order $(\Omega, \{\})$ if and only if*

$$
\beta' \mathcal{C}\_k = 0 \text{ (and thus } \beta' \overline{\mathcal{C}}\_k = 0).
$$

*The vector polynomial $\beta(z) = \beta_0 + \beta_1 z$, with $\beta_0, \beta_1 \in \mathbb{R}^s$, $[\beta_0', \beta_1'] \neq 0$, is a PCIV of order $(\Omega, \{\})$ if and only if*

$$\left[\beta_0', \beta_1'\right] \begin{bmatrix} z_k \mathcal{C}_k & \overline{z}_k \overline{\mathcal{C}}_k \\ \mathcal{C}_k & \overline{\mathcal{C}}_k \end{bmatrix} = 0,\tag{11}$$

*which is equivalent to*

$$(z_k \beta_0' + \beta_1')\mathcal{C}_k = 0.$$

*The fact that the matrix in* (11) *has a block structure with two blocks of complex conjugate columns implies some additional structure also on the space of PCIVs, here with polynomial degree one. More specifically, if $\beta_0 + \beta_1 z$ is a PCIV of order $(\Omega, \{\})$, then $-\beta_1 + (\beta_0 + 2\cos(\omega_k)\beta_1)z$ is also a PCIV of order $(\Omega, \{\})$. This follows from*

$$\begin{aligned} (z_k(-\beta_1)' + (\beta_0 + 2\cos(\omega_k)\beta_1)')\mathcal{C}_k &= (\beta_0' + (2\Re(z_k) - z_k)\beta_1')\mathcal{C}_k \\ &= (\beta_0' + \overline{z}_k\beta_1')\mathcal{C}_k \\ &= \overline{z}_k(z_k\beta_0' + \beta_1')\mathcal{C}_k = 0. \end{aligned}$$

*Thus, the space of PCIVs of degree (up to) one inherits some additional structure emanating from the occurrence of complex eigenvalues in complex conjugate pairs.*
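The displayed chain of equalities is an algebraic identity that holds for arbitrary real $\beta_0, \beta_1$, since $2\cos(\omega_k) = z_k + \overline{z}_k$ for $|z_k| = 1$. A quick numerical check with hypothetical dimensions and a randomly chosen complex block:

```python
# Check: (z(-b1) + (b0 + 2cos(w) b1))' C  ==  conj(z) (z b0 + b1)' C  for |z| = 1,
# so beta0 + beta1 z a PCIV implies -beta1 + (beta0 + 2cos(w) beta1) z is one too.
import numpy as np

rng = np.random.default_rng(1)
s, d = 4, 2                        # hypothetical dimensions
w = 2 * np.pi / 3                  # hypothetical unit root frequency
z = np.exp(1j * w)                 # complex unit root z_k
C = rng.standard_normal((s, d)) + 1j * rng.standard_normal((s, d))
b0 = rng.standard_normal(s)
b1 = rng.standard_normal(s)

lhs = (z * (-b1) + (b0 + 2 * np.cos(w) * b1)) @ C
rhs = np.conj(z) * ((z * b0 + b1) @ C)
assert np.allclose(lhs, rhs)       # identity holds for arbitrary b0, b1
```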

**Example 3** (Cointegration in the I(2) case)**.** *In the I(2) case, neglecting the stable subsystem and the deterministic components for simplicity, it holds that*

$$\begin{aligned}
y_t &= \mathcal{C}^E_{1,1} x^E_{t,1} + \mathcal{C}^G_{1,2} x^G_{t,2} + \mathcal{C}^E_{1,2} x^E_{t,2} + \varepsilon_t, \\
x^E_{t+1,1} &= x^E_{t,1} + x^G_{t,2} + \mathcal{B}_{1,1}\varepsilon_t, \\
x^G_{t+1,2} &= x^G_{t,2} + \mathcal{B}_{1,2,1}\varepsilon_t, \\
x^E_{t+1,2} &= x^E_{t,2} + \mathcal{B}_{1,2,2}\varepsilon_t,
\end{aligned}$$

$$y_t, \varepsilon_t \in \mathbb{R}^s,\ x^E_{t,1}, x^G_{t,2} \in \mathbb{R}^{d_1^1},\ x^E_{t,2} \in \mathbb{R}^{d_2^1 - d_1^1},\ \mathcal{C}^E_{1,1}, \mathcal{C}^G_{1,2} \in \mathbb{R}^{s \times d_1^1},\ \mathcal{C}^E_{1,2} \in \mathbb{R}^{s \times (d_2^1 - d_1^1)},\ \mathcal{B}_{1,1}, \mathcal{B}_{1,2,1} \in \mathbb{R}^{d_1^1 \times s},\ \mathcal{B}_{1,2,2} \in \mathbb{R}^{(d_2^1 - d_1^1) \times s}.$$

*The vector $\beta \in \mathbb{R}^s$, $\beta \neq 0$, is a CIV of order $((0,2),(0,1))$ if and only if*

$$
\beta' \mathcal{C}\_{1,1}^E = 0 \quad \text{and} \quad \beta'[\mathcal{C}\_{1,2}^G, \mathcal{C}\_{1,2}^E] \neq 0.
$$

*The vector $\beta \in \mathbb{R}^s$, $\beta \neq 0$, is a CIV of order $((0,2),\{\})$ if and only if*

$$
\beta'[\mathcal{C}^E_{1,1}, \mathcal{C}^G_{1,2}, \mathcal{C}^E_{1,2}] = 0.
$$

*The vector polynomial $\beta(z) = \beta_0 + \beta_1 z$, with $\beta_0, \beta_1 \in \mathbb{R}^s$, is a PCIV of order $((0,2),\{\})$ if and only if*

$$\left[\beta_0', \beta_1'\right] \begin{bmatrix} \mathcal{C}^E_{1,1} & \mathcal{C}^E_{1,1} + \mathcal{C}^G_{1,2} & \mathcal{C}^E_{1,2} \\ \mathcal{C}^E_{1,1} & \mathcal{C}^G_{1,2} & \mathcal{C}^E_{1,2} \end{bmatrix} = 0 \quad \text{and} \quad \beta(1) = \beta_0 + \beta_1 \neq 0.$$

*The above orthogonality constraint indicates that the two cases $\mathcal{C}^G_{1,2} = 0$ and $\mathcal{C}^G_{1,2} \neq 0$ have to be considered separately for polynomial cointegration analysis. Consider first the case $\mathcal{C}^G_{1,2} = 0$. In this case the orthogonality constraints imply $\beta_0'\mathcal{C}^E_{1,1} = 0$, $\beta_1'\mathcal{C}^E_{1,1} = 0$ and $(\beta_0 + \beta_1)'\mathcal{C}^E_{1,2} = 0$. Thus, the vector $\beta_0 + \beta_1$ is a CIV of order $((0,2),\{\})$ and therefore $\beta(z) = \beta_0 + \beta_1 z$ is of "non-minimum" degree, one in this case rather than zero ($\beta_0 + \beta_1$). For a formal definition of minimum degree PCIVs see Bauer and Wagner (2003, Definition 4). In case $\mathcal{C}^G_{1,2} \neq 0$ there are PCIVs of degree one that are not simple transformations of static CIVs. Consider $\beta(z) = \beta_0 + \beta_1 z = \gamma_1(1-z) + \gamma_2$ such that $\{\gamma_1'(y_t - y_{t-1}) + \gamma_2' y_t\}_{t\in\mathbb{Z}}$ is stationary. The integrated contribution to $\{\gamma_1'(y_t - y_{t-1})\}_{t\in\mathbb{Z}}$ is given by $\gamma_1'(1-L)(\{\mathcal{C}^E_{1,1} x^E_{t,1}\}_{t\in\mathbb{Z}}) = \{\gamma_1'\mathcal{C}^E_{1,1} x^G_{t-1,2} + \gamma_1'\mathcal{C}^E_{1,1}\mathcal{B}_{1,1}\varepsilon_{t-1}\}_{t\in\mathbb{Z}}$, with $\gamma_1'\mathcal{C}^E_{1,1} \neq 0$. This term is eliminated by $\{\gamma_2'\mathcal{C}^G_{1,2} x^G_{t,2}\}_{t\in\mathbb{Z}}$ in $\{\gamma_2' y_t\}_{t\in\mathbb{Z}}$ if $\gamma_1'\mathcal{C}^E_{1,1} + \gamma_2'\mathcal{C}^G_{1,2} = 0$, which is only possible if $\mathcal{C}^G_{1,2} \neq 0$. Additionally, $\gamma_2'[\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}] = 0$ needs to hold, such that there is no further integrated contribution to $\{\gamma_2' y_t\}_{t\in\mathbb{Z}}$. Neither $\gamma_1$ nor $\gamma_2$ is a CIV, since both violate the necessary conditions given in the definition of CIVs, which implies that $\beta(z)$ is indeed a "minimum degree" PCIV.*
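A simulation sketch of the second case, for a hypothetical $s = 3$ system with $\mathcal{C}^E_{1,1} = e_1$, $\mathcal{C}^E_{1,2} = e_2$, $\mathcal{C}^G_{1,2} = e_3$ (so $\mathcal{C}^G_{1,2} \neq 0$; these choices are ours, for illustration): the vectors $\gamma_1 = -e_1$ and $\gamma_2 = e_3$ satisfy the constraints above, and $\gamma_1'(y_t - y_{t-1}) + \gamma_2' y_t$ is indeed stationary:

```python
# Hypothetical I(2) system per Example 3 with C^E_11 = e1, C^E_12 = e2, C^G_12 = e3:
# gamma1 = -e1, gamma2 = e3 satisfy gamma1'C^E_11 + gamma2'C^G_12 = 0 and
# gamma2'[C^E_11, C^E_12] = 0, so the degree-one combination is stationary.
import numpy as np

rng = np.random.default_rng(2)
T = 20_000
eps = rng.standard_normal((T, 3))
x1 = xg = x2 = 0.0                                    # scalar states
B11 = np.array([1.0, 0.0, 0.0])                       # B_{1,1}
B121 = np.array([0.0, 1.0, 0.0])                      # B_{1,2,1}
B122 = np.array([0.0, 0.0, 1.0])                      # B_{1,2,2}
y = np.empty((T, 3))
for t in range(T):
    y[t] = np.array([x1, x2, xg]) + eps[t]            # y_t = e1 x1 + e2 x2 + e3 xg + eps
    x1, xg, x2 = (x1 + xg + B11 @ eps[t],             # x^E_{t+1,1} = x^E_{t,1} + x^G_{t,2} + ...
                  xg + B121 @ eps[t],                 # x^G_{t+1,2}: random walk
                  x2 + B122 @ eps[t])                 # x^E_{t+1,2}: random walk

pcv = -(y[1:, 0] - y[:-1, 0]) + y[1:, 2]              # gamma1' Delta y_t + gamma2' y_t
assert np.var(pcv) < np.var(y[:, 0])                  # stationary vs. I(2) component
```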

As shown above, the unit root and cointegration properties of $\{y_t\}_{t\in\mathbb{Z}}$ depend on the sub-blocks of $\mathcal{C}_u$ and the eigenvalue structure of $\mathcal{A}_u$. We therefore define the more encompassing *state space unit root structure*, which contains information on the geometric and algebraic multiplicities of the eigenvalues of $\mathcal{A}_u$ (cf. Bauer and Wagner 2012, Definition 2).

**Definition 4.** *A unit root process $\{y_t\}_{t\in\mathbb{Z}}$ with a canonical state space representation as given in Theorem 1 has* state space unit root structure

$$\Omega_S := \left( (\omega_1, d_1^1, \dots, d_{h_1}^1), \dots, (\omega_l, d_1^l, \dots, d_{h_l}^l) \right),$$

*where $0 \le d_1^k \le d_2^k \le \dots \le d_{h_k}^k \le s$ for $k = 1, \dots, l$. For $\{y_t\}_{t\in\mathbb{Z}}$ with empty unit root structure set $\Omega_S := \{\}$.*

**Remark 8.** *The state space unit root structure $\Omega_S$ contains information concerning the integration properties of the process $\{y_t\}_{t\in\mathbb{Z}}$, since the integers $d_j^k$, $k = 1, \dots, l$, $j = 1, \dots, h_k$, describe (multiplied by two for $k$ such that $0 < \omega_k < \pi$) the numbers of non-cointegrated stochastic trends or cycles of the corresponding integration orders, compare again Remark 4. As such, $\Omega_S$ describes properties of the stochastic process $\{y_t\}_{t\in\mathbb{Z}}$—and, therefore, the state space unit root structure $\Omega_S$ partitions unit root processes according to these (co-)integration properties. These (co-)integration properties, however, are invariant to the chosen canonical representation, or more generally invariant to whether a VARMA or a state space representation is considered. For all minimal state space representations of a unit root process $\{y_t\}_{t\in\mathbb{Z}}$ these indices—being related to the Jordan normal form—are invariant.*

As mentioned in Section 2, Paruolo (1996, Definition 3) introduces integration indices at frequency zero as a triple of integers $(r_0, r_1, r_2)$. These correspond to the numbers of columns of the matrices $\beta$, $\beta_1$, $\beta_2$ in the error correction representation of I(2) VAR processes, see, e.g., Johansen (1997, sct. 3). Here, $r_2$ is the number of stochastic trends of order two, i.e., $r_2 = d_1^1$. Furthermore, $r_1$ is the number of stochastic trends of order one that do not cointegrate with $\beta_2'\Delta_0\{y_t\}_{t\in\mathbb{Z}}$ and hence $r_1 = d_2^1 - d_1^1$. Therefore, the integration indices at frequency zero are in one-to-one correspondence with the state space unit root structure $\Omega_S = ((0, d_1^1, d_2^1))$ for I(2) processes and the dimension $s = r_0 + r_1 + r_2$ of the process.

The canonical form given in Theorem 1 imposes p.u.t. structures on sub-blocks of the matrix $\mathcal{B}_u$. The occurrence of these blocks—related to $d_j^k > d_{j-1}^k$—is determined by the state space unit root structure $\Omega_S$. The number of free entries in these p.u.t. blocks, however, is not determined by $\Omega_S$. Consequently, we need structure indices $p \in \mathbb{N}_0^{n_u}$ indicating for each row the position of a potentially restricted positive element, as formalized below:

**Definition 5** (Structure indices)**.** *For the block $\mathcal{B}_u \in \mathbb{C}^{n_u \times s}$ of the matrix $\mathcal{B}$ of a state space realization $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ in canonical form, define the corresponding structure indices $p \in \mathbb{N}_0^{n_u}$ as*

$$p_i := \begin{cases} 0 & \text{if the } i\text{-th row of } \mathcal{B}_u \text{ is not part of a p.u.t. block,} \\ j & \text{if the } i\text{-th row of } \mathcal{B}_u \text{ is part of a p.u.t. block and its } j\text{-th entry is restricted to be positive.} \end{cases}$$

**Remark 9.** *Since the sub-blocks of $\mathcal{B}_u$ corresponding to complex unit roots are of the form $\mathcal{B}_{k,\mathbb{C}} = [\mathcal{B}_k', \overline{\mathcal{B}}_k']'$, the entries restricted to be positive are located in the same columns and rows of both $\mathcal{B}_k$ and $\overline{\mathcal{B}}_k$. Thus, the structure indices $p_i$ of the corresponding rows are identical for $\mathcal{B}_k$ and $\overline{\mathcal{B}}_k$. Therefore, it would be possible to omit the parts of $p$ corresponding to the blocks $\overline{\mathcal{B}}_k$. It is, however, as will be seen in Definition 9, advantageous for the comparison of unit root structures and structure indices that $p$ is a vector with $n_u$ entries.*

**Example 4.** *Consider the following state space system:*

$$\begin{aligned} y_t &= \begin{bmatrix} \mathcal{C}^E_{1,1} & \mathcal{C}^G_{1,2} & \mathcal{C}^E_{1,2} \end{bmatrix} x_t + \varepsilon_t, & y_t, \varepsilon_t \in \mathbb{R}^2,\ x_t \in \mathbb{R}^3,\ \mathcal{C}^E_{1,1}, \mathcal{C}^G_{1,2}, \mathcal{C}^E_{1,2} \in \mathbb{R}^{2 \times 1}, \\ x_{t+1} &= \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} x_t + \begin{bmatrix} \mathcal{B}_{1,1} \\ \mathcal{B}_{1,2,1} \\ \mathcal{B}_{1,2,2} \end{bmatrix} \varepsilon_t, & x_0 = 0,\ \mathcal{B}_{1,1}, \mathcal{B}_{1,2,1}, \mathcal{B}_{1,2,2} \in \mathbb{R}^{1 \times 2}. \end{aligned} \tag{12}$$

*In canonical form $\mathcal{B}_{1,2,1}$ and $\mathcal{B}_{1,2,2}$ are p.u.t. matrices and $\mathcal{B}_{1,1}$ is unrestricted. If, e.g., the second entry $b_{1,2,1,2}$ of $\mathcal{B}_{1,2,1}$ and the first entry $b_{1,2,2,1}$ of $\mathcal{B}_{1,2,2}$ are restricted to be positive, then*

$$\mathcal{B}_u = \begin{bmatrix} * & * \\ 0 & b_{1,2,1,2} \\ b_{1,2,2,1} & * \end{bmatrix},$$

*where the symbol $*$ denotes unrestricted entries. In this case $p = [0, 2, 1]'$.*
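The mapping from structure indices to restriction patterns can be sketched in code (the helper `put_pattern` is our own, not part of the paper's notation); for $p = [0, 2, 1]'$ it reproduces the restriction pattern of the matrix displayed above:

```python
# Sketch of Definition 5: from structure indices p recover which entries of a
# n_u x s block B_u are unrestricted ('*'), zero ('0'), or positive ('+').
import numpy as np

def put_pattern(p, s):
    """Character mask of B_u restrictions implied by structure indices p."""
    mask = []
    for pi in p:
        if pi == 0:                                   # row not in a p.u.t. block
            mask.append(['*'] * s)
        else:                                         # zeros left of position pi
            mask.append(['0'] * (pi - 1) + ['+'] + ['*'] * (s - pi))
    return np.array(mask)

pattern = put_pattern([0, 2, 1], 2)
# rows: ['*','*'], ['0','+'], ['+','*'] -- matching the displayed B_u
assert pattern.tolist() == [['*', '*'], ['0', '+'], ['+', '*']]
```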

For given state space unit root structure $\Omega_S$ the matrix $\mathcal{A}_u$ is fully determined. The parameterization of the set of feasible matrices $\mathcal{B}_u$ for given structure indices $p$ and of the set of stable subsystems $(\mathcal{A}_\bullet, \mathcal{B}_\bullet, \mathcal{C}_\bullet)$ for given Kronecker indices $\alpha_\bullet$ (cf. Hannan and Deistler 1988, chp. 2) is straightforward, since the entries in these matrices are either unrestricted, restricted to zero or restricted to be positive. Matters are a bit more complicated for $\mathcal{C}_u$. One possibility to parameterize the set of possible matrices $\mathcal{C}_u$ for a given state space unit root structure $\Omega_S$ is to use real and complex valued Givens rotations (cf. Golub and van Loan 1996, chp. 5.1).

**Definition 6** (Real Givens rotation)**.** *The* real Givens rotation *$R_{q,i,j}(\theta) \in \mathbb{R}^{q \times q}$, $\theta \in [0, 2\pi)$, is defined as*

$$R_{q,i,j}(\theta) := \begin{bmatrix} I_{i-1} & & & & 0 \\ & \cos(\theta) & 0 & \sin(\theta) & \\ & 0 & I_{j-1-i} & 0 & \\ & -\sin(\theta) & 0 & \cos(\theta) & \\ 0 & & & & I_{q-j} \end{bmatrix}.$$

**Remark 10.** *Givens rotations allow transforming any vector $v = [v_1, v_2, \dots, v_q]' \in \mathbb{R}^q$ into a vector of the form $[\tilde{v}_1, 0, \dots, 0]'$ with $\tilde{v}_1 \ge 0$. This is achieved by the following algorithm:*


*This algorithm determines a unique vector $\theta = [\theta_1, \dots, \theta_{q-1}]'$ for every vector $v \in \mathbb{R}^q$.*
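Since this reduction is standard, a minimal sketch can be given (our own variant, zeroing the entries from the last one upwards; the exact ordering and indexing of the angles relative to the paper's algorithm is an assumption):

```python
# Reduce v in R^q to [v1~, 0, ..., 0] with v1~ >= 0 via real Givens rotations.
import numpy as np

def givens(q, i, j, theta):
    """R_{q,i,j}(theta) from Definition 6 (1-based indices, i < j)."""
    R = np.eye(q)
    c, s = np.cos(theta), np.sin(theta)
    R[i-1, i-1], R[i-1, j-1] = c, s
    R[j-1, i-1], R[j-1, j-1] = -s, c
    return R

def reduce_vector(v):
    """Return (v1~, angles) with the rotations mapping v to [v1~, 0, ..., 0]."""
    v = np.asarray(v, dtype=float).copy()
    q = len(v)
    thetas = []
    for j in range(q, 1, -1):                      # zero out entries q, q-1, ..., 2
        theta = np.arctan2(v[j-1], v[0]) % (2*np.pi)
        v = givens(q, 1, j, theta) @ v             # new v[0] = hypot(v[0], v[j-1]) >= 0
        thetas.append(theta)
    return v[0], thetas
```

For instance, `reduce_vector([3.0, 4.0])` rotates the vector onto the first axis, returning its norm 5 together with the single angle used.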

**Remark 11.** *The determinant of real Givens rotations is equal to one, i.e., $\det(R_{s,i,j}(\theta)) = 1$ for all $s, i, j \in \mathbb{N}$ and all $\theta \in [0, 2\pi)$. Thus, it is not possible to factorize an orthonormal matrix $Q$ with $\det(Q) = -1$ into a product of Givens rotations. This obvious fact has implications for the parameterization of $\mathcal{C}$-matrices as is detailed below.*

**Definition 7** (Complex Givens rotation)**.** *The* complex Givens rotation *$Q_{q,i,j}(\phi) \in \mathbb{C}^{q \times q}$, $\phi := [\phi_1, \phi_2]' \in \Theta_{\mathbb{C}} := [0, \pi/2] \times [0, 2\pi)$, is defined as*

$$Q_{q,i,j}(\phi) := \begin{bmatrix} I_{i-1} & & & & 0 \\ & \cos(\phi_1) & 0 & \sin(\phi_1)e^{i\phi_2} & \\ & 0 & I_{j-1-i} & 0 & \\ & -\sin(\phi_1)e^{-i\phi_2} & 0 & \cos(\phi_1) & \\ 0 & & & & I_{q-j} \end{bmatrix}.$$

**Remark 12.** *Complex Givens rotations allow transforming any vector $v = [v_1, v_2, \dots, v_q]' \in \mathbb{C}^q$ into a vector of the form $[\tilde{v}_1, 0, \dots, 0]'$ with $\tilde{v}_1 \in \mathbb{C}$. This is achieved by the following algorithm:*


*3. Set*

$$\phi_{q-j,1} = \begin{cases} \tan^{-1}\left(\frac{b_j}{a_j}\right) & \text{if } a_j > 0, \\ \pi/2 & \text{if } a_j = 0,\ b_j > 0, \\ 0 & \text{if } a_j = 0,\ b_j = 0, \end{cases} \qquad \phi_{q-j,2} = (\phi_{a,j} - \phi_{b,j}) \bmod 2\pi.$$

*Then $Q_{2,1,2}(\phi_{q-j})[v_1^{(j)}, v_{q-j+1}]' = [v_1^{(j+1)}, 0]'$ such that $v^{(j+1)} = Q_{q,1,q-j+1}(\phi_{q-j})v^{(j)} = [v_1^{(j+1)}, v_2, \dots, v_{q-j}, 0]'$, with $v_1^{(j+1)} \in \mathbb{C}$.*

*4. If j* = *q* − 1*, stop. Else increment j by one (j* → *j* + 1*) and continue at step 2.*

*This algorithm determines a unique vector $\phi = [\phi_{1,1}, \phi_{1,2}, \dots, \phi_{q-1,2}]'$ for every vector $v \in \mathbb{C}^q$.*
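A minimal numerical sketch of a single reduction step (the helper name is ours): writing the two entries to be combined as $v_1 = ae^{i\phi_a}$, $v_2 = be^{i\phi_b}$, the choice $\phi_1 = \tan^{-1}(b/a)$, $\phi_2 = (\phi_a - \phi_b) \bmod 2\pi$ annihilates the second entry:

```python
# One step of the complex reduction in Remark 12 for a 2-vector.
import numpy as np

def complex_givens2(phi1, phi2):
    """Q_{2,1,2}([phi1, phi2]) from Definition 7."""
    return np.array([[np.cos(phi1), np.sin(phi1) * np.exp(1j * phi2)],
                     [-np.sin(phi1) * np.exp(-1j * phi2), np.cos(phi1)]])

v1, v2 = (1 - 1j) / 2, (1 + 1j) / 2               # entries to be combined
a, phi_a = abs(v1), np.angle(v1) % (2 * np.pi)    # polar coordinates
b, phi_b = abs(v2), np.angle(v2) % (2 * np.pi)
phi1 = np.arctan2(b, a)                           # = tan^{-1}(b/a) since a > 0
phi2 = (phi_a - phi_b) % (2 * np.pi)
w = complex_givens2(phi1, phi2) @ np.array([v1, v2])
assert abs(w[1]) < 1e-12                          # second entry annihilated
assert np.isclose(abs(w[0]), np.hypot(a, b))      # modulus sqrt(a^2 + b^2) kept
```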

To set the stage for the general case, we start the discussion of the parameterization of the set of matrices (A, B, C) in canonical form with the MFI(1) and I(2) cases. These two cases display all ingredients required later for the general case. The MFI(1) case illustrates the usage of either real or complex Givens rotations, depending on whether the considered C-block corresponds to a real or complex unit root. The I(2) case highlights recursive orthogonality constraints on the parameters of the C-block, which are related to the polynomial cointegration properties (cf. Example 3).

#### *3.1. The Parameterization in the MFI(1) Case*

The state space unit root structure of an MFI(1) process is given by $\Omega_S = ((\omega_1, d_1^1), \dots, (\omega_l, d_1^l))$. For the corresponding state space system $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ in canonical form, the sub-blocks of $\mathcal{A}_u$ are equal to $J_k = z_k I_{d_1^k}$, the sub-blocks $\mathcal{B}_k$ of $\mathcal{B}_u$ are p.u.t. and $\mathcal{C}_k'\mathcal{C}_k = I_{d_1^k}$, for $k = 1, \dots, l$.

Starting with the sub-blocks of $\mathcal{C}_u$, it is convenient to separate the discussion of the parameterization of $\mathcal{C}_u$-blocks into the real case, where $\omega_k \in \{0, \pi\}$ and $\mathcal{C}_k \in \mathbb{R}^{s \times d_1^k}$, and the complex case with $0 < \omega_k < \pi$ and $\mathcal{C}_k \in \mathbb{C}^{s \times d_1^k}$. For real unit roots the two cases $d_1^k < s$ and $d_1^k = s$ have to be distinguished. For brevity of notation refer to the considered real block simply as $\mathcal{C} \in \mathbb{R}^{s \times d}$. Using this notation, the set of matrices to be parameterized is

$$O\_{s,d} \quad := \quad \{ \mathcal{C} \in \mathbb{R}^{s \times d} | \mathcal{C}' \mathcal{C} = I\_d \}.$$

The parameterization of $O_{s,d}$ is based on combining real Givens rotations, as given in Definition 6, which allow transforming every matrix in $O_{s,d}$ to the form $[I_d, 0_{(s-d) \times d}']'$ for $d < s$. For $d = s$, Givens rotations allow transforming every matrix $\mathcal{C} \in O_{s,s}$ either to $I_s$ or to $I_s^- := \operatorname{diag}(I_{s-1}, -1)$, since, compare Remark 11, for the transformed matrix $\tilde{\mathcal{C}}^{(s)}$ it holds that $\det(\mathcal{C}) = \det(\tilde{\mathcal{C}}^{(s)}) \in \{-1, 1\}$. This is achieved with the following algorithm:


The parameter vector $\theta = [\theta_L', \theta_R']'$ contains the angles of the employed Givens rotations and provides one way of parameterizing $O_{s,d}$. The following Lemma 1 demonstrates the usefulness of this parameterization.

**Lemma 1** (Properties of the parameterization of $O_{s,d}$)**.** *Define for $d \le s$ a mapping $\theta \to C_O(\theta)$ from $\Theta_O^{\mathbb{R}} := [0, 2\pi)^{d(s-d)} \times [0, 2\pi)^{d(d-1)/2}$ to $O_{s,d}$ by*

$$\begin{aligned} C_O(\theta) &:= \left[ \prod_{i=1}^{d} \prod_{j=1}^{s-d} R_{s,i,d+j}(\theta_{L,(s-d)(i-1)+j}) \right]' \begin{bmatrix} I_d \\ 0_{(s-d) \times d} \end{bmatrix} \left[ \prod_{i=1}^{d-1} \prod_{j=1}^{i} R_{d,d-i,d-i+j}(\theta_{R,i(i-1)/2+j}) \right] \\ &=: R_L(\theta_L)' \begin{bmatrix} I_d \\ 0_{(s-d) \times d} \end{bmatrix} R_R(\theta_R), \end{aligned}$$

*with θ* := [*θ <sup>L</sup>*, *θ R*] *, where <sup>θ</sup><sup>L</sup>* := [*θL*,1, ... , *<sup>θ</sup>L*,*d*(*s*−*<sup>d</sup>*)] *and <sup>θ</sup><sup>R</sup>* := [*θR*,1, ... , *<sup>θ</sup>R*,*d*(*d*−1)/2] *. The following properties hold:*


*For $d < s$, it holds that:*

*(iii) For every $\mathcal{C} \in O_{s,d}$ there exists a vector $\theta \in \Theta_O^{\mathbb{R}}$ such that*

$$\mathcal{C} = C_O(\theta) = R_L(\theta_L)' \begin{bmatrix} I_d \\ 0_{(s-d) \times d} \end{bmatrix} R_R(\theta_R).$$

*The algorithm discussed above defines the inverse mapping $C_O^{-1} : O_{s,d} \to \Theta_O^{\mathbb{R}}$.*

*(iv) The inverse mapping $C_O^{-1}(\cdot)$—the parameterization of $O_{s,d}$—is infinitely often differentiable on the pre-image of the interior of $\Theta_O^{\mathbb{R}}$, which is an open and dense subset of $O_{s,d}$.*

*For $d = s$, it holds that:*


$$\mathcal{C} = C_O(\theta) = R_L(\theta_L)'\, I_d\, R_R(\theta_R) = R_R(\theta_R).$$

*In this case, steps 1–4 of the algorithm discussed above define the inverse mapping $C_O^{-1} : O_{s,s}^+ \to \Theta_O^{\mathbb{R}}$.*

*(vii) Define $v := [\pi, \dots, \pi]' \in \mathbb{R}^{s(s-1)/2}$. Then a parameterization of $O_{s,s}$ is given by*

$$C_O^{\pm}(\mathcal{C}) := \begin{cases} v + C_O^{-1}(\mathcal{C}) & \text{if } \mathcal{C} \in O_{s,s}^+, \\ -\left(v + C_O^{-1}(\mathcal{C} I_s^-)\right) & \text{if } \mathcal{C} \in O_{s,s}^-. \end{cases}$$

*The parameterization is infinitely often differentiable with infinitely often differentiable inverse on an open and dense subset of Os*,*s.*
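The forward map $C_O(\theta)$ can be sketched numerically as follows (function and loop names are our own; the dimensions $s = 3$, $d = 2$ are chosen for illustration); by construction the output lies in $O_{s,d}$ for any choice of angles:

```python
# Sketch of C_O(theta) from Lemma 1: products of Givens rotations around [I_d; 0].
import numpy as np

def givens(q, i, j, theta):
    """R_{q,i,j}(theta) from Definition 6 (1-based indices, i < j)."""
    R = np.eye(q)
    c, s = np.cos(theta), np.sin(theta)
    R[i-1, i-1], R[i-1, j-1] = c, s
    R[j-1, i-1], R[j-1, j-1] = -s, c
    return R

def C_O(theta_L, theta_R, s, d):
    RL, k = np.eye(s), 0
    for i in range(1, d + 1):                 # R_L = prod R_{s,i,d+j}(theta_L,.)
        for j in range(1, s - d + 1):
            RL = RL @ givens(s, i, d + j, theta_L[k]); k += 1
    RR, k = np.eye(d), 0
    for i in range(1, d):                     # R_R = prod R_{d,d-i,d-i+j}(theta_R,.)
        for j in range(1, i + 1):
            RR = RR @ givens(d, d - i, d - i + j, theta_R[k]); k += 1
    return RL.T @ np.vstack([np.eye(d), np.zeros((s - d, d))]) @ RR

rng = np.random.default_rng(3)
C = C_O(rng.uniform(0, 2 * np.pi, 2), rng.uniform(0, 2 * np.pi, 1), 3, 2)
assert np.allclose(C.T @ C, np.eye(2))        # C lies in O_{3,2} for any angles
```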

**Remark 13.** *The following arguments illustrate why $C_O^{-1}$ is not continuous on the pre-image of the boundary of $\Theta_O^{\mathbb{R}}$: Consider the unit sphere $O_{3,1} = \{C \in \mathbb{R}^3 \mid C'C = \|C\|^2 = 1\}$. One way to parameterize the unit sphere is to use degrees of longitude and latitude. Two types of discontinuities occur: First, after fixing the location of the zero degree of longitude, i.e., the prime meridian, its anti-meridian is described by both 180°W and 180°E. Using the half-open interval $[0, 2\pi)$ in our parameterization causes a similar discontinuity. Second, the degree of longitude is irrelevant at the north pole. As seen in Remark 10, a similar issue occurs in our parameterization when the first two entries of $C$ to be compared are both equal to zero. In this case the parameter of the Givens rotation is set to zero, although every $\theta$ would produce the same result. Both discontinuities clearly occur on a thin subset of $O_{s,d}$.*

*As in the parameterization of the VAR I(1) case in the VECM framework, where the restriction $\beta = [I_{s-d}, \beta^{*\prime}]'$ can only be imposed when the upper $(s-d) \times (s-d)$ block of the true $\beta_0$ of the DGP is of full rank (cf. Johansen 1995, chp. 5.2), the set where the discontinuities occur can effectively be changed by a permutation of the components of the observed time series. This corresponds to redefining the locations of the prime meridian and the poles.*

**Remark 14.** *Please note that the parameterization partitions the parameter vector θ into two parts θ<sup>L</sup>* ∈ [0, 2*π*)*d*(*s*−*d*) *and <sup>θ</sup><sup>R</sup>* <sup>∈</sup> [0, 2*π*)(*d*−1)*d*/2*. Since changing the parameter values in <sup>θ</sup><sup>R</sup> does not change the column space of CO*(*θ*)*, which, as seen above, determines the cointegrating vectors, θ<sup>L</sup> fully characterizes the (static) cointegrating space. Please note that the dimension of θ<sup>L</sup> is d*(*s* − *d*) *and thus coincides with the number of free parameters in β in the VECM framework (cf. Johansen 1995, chp. 5.2).*

**Example 5.** *Consider the matrix*

$$\mathcal{C} = \begin{bmatrix} 0 & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{2} \\ \frac{1}{\sqrt{2}} & \frac{1}{2} \end{bmatrix}$$

*with $d = 2$ and $s = 3$. As discussed, the static cointegrating space is characterized by the left kernel of this matrix. The left kernel of a matrix in $\mathbb{R}^{3 \times 2}$ with full rank two is a one-dimensional space, whose basis vector, normalized to length one, is parameterized by two free parameters. Thus, for the characterization of the static cointegrating space two parameters are required, which exactly coincides with the dimension of $\theta_L$ given in Remark 14. The parameters in $\theta_R$ correspond to the choice of a basis of the image of $\mathcal{C}$. Having fixed the two-dimensional subspace through $\theta_L$, only one free parameter for the choice of an orthonormal basis remains, which again coincides with the dimension given in Remark 14. To obtain the parameter vector, the starting point is a QR decomposition $\mathcal{C} = \tilde{\mathcal{C}} R_R(\theta_R)$. In this example $R_R(\theta_R) = R_{2,1,2}(\theta_{R,1})$, with $\theta_{R,1}$ to be determined. To find $\theta_{R,1}$, solve $[0\ \ \tfrac{1}{\sqrt{2}}] R_{2,1,2}(\theta_{R,1})' = [r\ \ 0]$ for $r \ge 0$ and $\theta_{R,1} \in [0, 2\pi)$. In other words, find $r \ge 0$ and $\theta_{R,1} \in [0, 2\pi)$ such that $[0\ \ \tfrac{1}{\sqrt{2}}] = r[\cos(\theta_{R,1})\ \ \sin(\theta_{R,1})]$, which leads to $r = \tfrac{1}{\sqrt{2}}$, $\theta_{R,1} = \tfrac{\pi}{2}$. Thus, the orthonormal matrix $R_R(\theta_R)$ is equal to $R_{2,1,2}(\tfrac{\pi}{2})$ and the transpose of the upper triangular matrix $\tilde{\mathcal{C}}$ is equal to:*

$$\tilde{\mathcal{C}} = \tilde{\mathcal{C}}^{(0)} = \mathcal{C}\, R_{2,1,2}\left(\frac{\pi}{2}\right)' = \begin{bmatrix} 0 & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{2} \\ \frac{1}{\sqrt{2}} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{2} & \frac{1}{\sqrt{2}} \\ \frac{1}{2} & -\frac{1}{\sqrt{2}} \end{bmatrix}.$$

*Second, transform the entries in the lower $1 \times 2$ sub-block of $\tilde{\mathcal{C}}^{(0)}$ to zero, starting with the last column. For this, find $\theta_{L,2} \in [0, 2\pi)$ such that $R_{3,2,3}(\theta_{L,2})[0\ \ \tfrac{1}{\sqrt{2}}\ \ -\tfrac{1}{\sqrt{2}}]' = [0\ \ 1\ \ 0]'$, i.e., $[\tfrac{1}{\sqrt{2}}\ \ -\tfrac{1}{\sqrt{2}}]' = r[\cos(\theta_{L,2})\ \ \sin(\theta_{L,2})]'$. This yields $r = 1$, $\theta_{L,2} = \tfrac{7\pi}{4}$. Next compute $\tilde{\mathcal{C}}^{(1)} = R_{3,2,3}(\tfrac{7\pi}{4})\tilde{\mathcal{C}}^{(0)}$:*

$$\tilde{\mathcal{C}}^{(1)} = R_{3,2,3}\left(\frac{7\pi}{4}\right) \mathcal{C}\, R_{2,1,2}\left(\frac{\pi}{2}\right)' = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 0 & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{2} \\ \frac{1}{\sqrt{2}} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & 0 \\ 0 & 1 \\ \frac{1}{\sqrt{2}} & 0 \end{bmatrix}.$$

*In the final step find $\theta_{L,1} \in [0, 2\pi)$ such that $R_{3,1,3}(\theta_{L,1})[\tfrac{1}{\sqrt{2}}\ \ 0\ \ \tfrac{1}{\sqrt{2}}]' = [1\ \ 0\ \ 0]'$, i.e., $[\tfrac{1}{\sqrt{2}}\ \ \tfrac{1}{\sqrt{2}}]' = r[\cos(\theta_{L,1})\ \ \sin(\theta_{L,1})]'$. The solution is $r = 1$, $\theta_{L,1} = \tfrac{\pi}{4}$. Combining the transformations leads to*

$$R\_{3,1,3}\left(\frac{\pi}{4}\right) \cdot R\_{3,2,3}\left(\frac{7\pi}{4}\right) \cdot C \cdot R\_{2,1,2}\left(\frac{\pi}{2}\right)^{\prime} =$$

$$\begin{bmatrix} \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & 1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 0 & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{2} \\ \frac{1}{\sqrt{2}} & \frac{1}{2} \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}.$$

*The parameter vector for this matrix is therefore $\theta = [\theta_L', \theta_R']' = [\tfrac{\pi}{4}, \tfrac{7\pi}{4}, \tfrac{\pi}{2}]'$ with $\theta = C_O^{-1}(\mathcal{C})$.*
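The worked example can be verified numerically: applying the three rotations with the stated angles to $\mathcal{C}$ indeed yields $[I_2', 0']'$ (the `givens` helper below is our own implementation of Definition 6):

```python
# Verify Example 5: R_{3,1,3}(pi/4) R_{3,2,3}(7pi/4) C R_{2,1,2}(pi/2)' = [I_2; 0].
import numpy as np

def givens(q, i, j, theta):
    """R_{q,i,j}(theta) from Definition 6 (1-based indices, i < j)."""
    R = np.eye(q)
    c, s = np.cos(theta), np.sin(theta)
    R[i-1, i-1], R[i-1, j-1] = c, s
    R[j-1, i-1], R[j-1, j-1] = -s, c
    return R

r2 = 1 / np.sqrt(2)
C = np.array([[0, r2], [-r2, 0.5], [r2, 0.5]])
res = (givens(3, 1, 3, np.pi / 4) @ givens(3, 2, 3, 7 * np.pi / 4)
       @ C @ givens(2, 1, 2, np.pi / 2).T)
assert np.allclose(res, [[1, 0], [0, 1], [0, 0]])
```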

In the case of complex unit roots, referring for brevity again to the considered block $\mathcal{C}_k$ simply as $\mathcal{C} \in \mathbb{C}^{s \times d}$, the set of matrices to be parameterized is

$$\mathcal{U}\_{s,d} \quad := \quad \{ \mathcal{C} \in \mathbb{C}^{s \times d} | \mathcal{C}' \mathcal{C} = I\_d \}.$$

The parameterization of this set is based on combining complex Givens rotations, as given in Definition 7, which can be used to transform every matrix in $U_{s,d}$ to the form $[D_d, 0_{(s-d) \times d}']'$ with a diagonal matrix $D_d$ whose diagonal elements are of unit modulus. This transformation is achieved with the following algorithm:


9. Transform the diagonal entries of the transformed matrix $\tilde{\mathcal{C}}^{(d)} = [D_d, 0_{(s-d) \times d}']'$ into polar coordinates and collect the angles in a parameter vector $\phi_D$.

The following lemma demonstrates the usefulness of this parameterization.

**Lemma 2** (Properties of the parameterization of $U_{s,d}$)**.** *Define for $d \le s$ a mapping $\phi \to C_U(\phi)$ from $\Theta_U^{\mathbb{C}} := \Theta_{\mathbb{C}}^{d(s-d)} \times \Theta_{\mathbb{C}}^{d(d-1)/2} \times [0, 2\pi)^d$ to $U_{s,d}$ by*

$$\mathcal{C}_U(\phi) := \left[\prod_{i=1}^{d}\prod_{j=1}^{s-d} Q_{s,i,d+j}(\phi_{L,(s-d)(i-1)+j})\right]' \begin{bmatrix} D_d(\phi_D) \\ 0_{(s-d)\times d} \end{bmatrix} \left[\prod_{i=1}^{d-1}\prod_{j=1}^{i} Q_{d,d-i,d-i+j}(\phi_{R,i(i-1)/2+j})\right] =: Q_L(\phi_L)' \begin{bmatrix} D_d(\phi_D) \\ 0_{(s-d)\times d} \end{bmatrix} Q_R(\phi_R),$$

*with $\phi := [\phi_L', \phi_R', \phi_D']'$, where $\phi_L = [\phi_{L,1}, \dots, \phi_{L,d(s-d)}]'$, $\phi_R := [\phi_{R,1}, \dots, \phi_{R,d(d-1)/2}]'$ and $\phi_D := [\phi_{D,1}, \dots, \phi_{D,d}]'$, and where $D_d(\phi_D) = \operatorname{diag}(e^{i\phi_{D,1}}, \dots, e^{i\phi_{D,d}})$. The following properties hold:*


$$\mathcal{C} = \mathcal{C}_U(\phi) = Q_L(\phi_L)' \begin{bmatrix} D_d(\phi_D) \\ 0_{(s-d)\times d} \end{bmatrix} Q_R(\phi_R).$$

*The algorithm discussed above defines the inverse mapping $\mathcal{C}_U^{-1} : \mathcal{U}_{s,d} \to \Theta_U^{\mathbb{C}}$.*

*(iv) The inverse mapping $\mathcal{C}_U^{-1}(\cdot)$—the parameterization of $\mathcal{U}_{s,d}$—is infinitely often differentiable on an open and dense subset of $\mathcal{U}_{s,d}$.*

**Remark 15.** *Note the partitioning of the parameter vector $\phi$ into the parts $\phi_L$, $\phi_D$ and $\phi_R$. The component $\phi_L$ fully characterizes the column space of $\mathcal{C}_U(\phi)$, i.e., $\phi_L$ determines the cointegrating spaces.*

**Example 6.** *Consider the matrix*

$$\mathcal{C} = \begin{bmatrix} \frac{1-i}{2} & \frac{1-i}{2} \\ \frac{1+i}{2} & \frac{-1-i}{2} \\ 0 & 0 \end{bmatrix}.$$

*The starting point is again a QR-type decomposition $\mathcal{C} = \tilde{\mathcal{C}} Q_R(\phi_R) = \tilde{\mathcal{C}} Q_{2,1,2}(\phi_{R,1})$. To find a complex Givens rotation such that $[\frac{1-i}{2}, \frac{1-i}{2}] Q_{2,1,2}(\phi_{R,1})' = [re^{i\phi_a}, 0]$ with $r > 0$, transform the entries of $[\frac{1-i}{2}, \frac{1-i}{2}]$ into polar coordinates. The equation $[\frac{1-i}{2}, \frac{1-i}{2}] = [ae^{i\phi_a}, be^{i\phi_b}]$ has the solutions $a = b = \frac{1}{\sqrt{2}}$ and $\phi_a = \phi_b = \frac{7\pi}{4}$. Using the results of Remark 12, the parameters of the Givens rotation are $\phi_{R,1,1} = \tan^{-1}(\frac{b}{a}) = \frac{\pi}{4}$ and $\phi_{R,1,2} = \phi_a - \phi_b = 0$. Right-multiplication of $\mathcal{C}$ with $Q_{2,1,2}([\frac{\pi}{4}, 0])'$ leads to*

$$\tilde{\mathcal{C}} = \mathcal{C} Q_{2,1,2}\left(\left[\frac{\pi}{4}, 0\right]\right)' = \mathcal{C} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}' = \begin{bmatrix} \frac{1-i}{\sqrt{2}} & 0 \\ 0 & \frac{-1-i}{\sqrt{2}} \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} D_2(\phi_D) \\ 0_{1 \times 2} \end{bmatrix}.$$

*Since the entries in the lower $1 \times 2$ sub-block of $\tilde{\mathcal{C}}$ are already equal to zero, the remaining complex Givens rotations are $Q_{3,2,3}([0,0]) = Q_{3,1,3}([0,0]) = I_3$. Finally, the parameter values corresponding to the diagonal matrix $D_2(\phi_D) = \operatorname{diag}(e^{i\phi_{D,1}}, e^{i\phi_{D,2}}) = \operatorname{diag}(\frac{1-i}{\sqrt{2}}, \frac{-1-i}{\sqrt{2}})$ are $\phi_{D,1} = \frac{7\pi}{4}$ and $\phi_{D,2} = \frac{5\pi}{4}$.*

*The parameter vector for this matrix is therefore $\phi = [\phi_L', \phi_R', \phi_D']' = \left[[0, 0, 0, 0], \left[\frac{\pi}{4}, 0\right], \left[\frac{7\pi}{4}, \frac{5\pi}{4}\right]\right]'$, with $\phi = \mathcal{C}_U^{-1}(\mathcal{C})$.*
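The reconstruction $\mathcal{C} = Q_L(\phi_L)' [D_2(\phi_D)', 0']' Q_R(\phi_R)$ can be verified numerically. The sketch below assumes that with a zero phase parameter the complex Givens rotation $Q_{2,1,2}([\frac{\pi}{4}, 0])$ reduces to the real rotation displayed above, and that $\phi_L = [0,0,0,0]$ gives $Q_L = I_3$.

```python
import numpy as np

# Example 6 parameters: phi_D = [7*pi/4, 5*pi/4], phi_R = [pi/4, 0].
D2 = np.diag(np.exp(1j * np.array([7 * np.pi / 4, 5 * np.pi / 4])))

# Q_{2,1,2}([pi/4, 0]); with zero phase parameter this is the real
# Givens rotation (an assumption about the convention of Definition 7).
QR = np.array([[1 / np.sqrt(2), 1 / np.sqrt(2)],
               [-1 / np.sqrt(2), 1 / np.sqrt(2)]])

# Rebuild C = [D_2; 0_{1x2}] Q_R (Q_L = I_3 since phi_L = 0).
C = np.vstack([D2, np.zeros((1, 2))]) @ QR
print(np.round(2 * C, 12))  # entries (1-i, 1-i), (1+i, -1-i), (0, 0)
```

The rebuilt matrix has unitary columns and matches the matrix of Example 6 entry by entry.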

#### Components of the Parameter Vector

Based on the results of the preceding sections we can now describe the parameter vectors for the general case. The dimensions of the parameter vectors of the respective blocks of the system matrices $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ depend on the multi-index $\Gamma$, consisting of the state space unit root structure $\Omega_S$, the structure indices $p$ and the Kronecker indices $\alpha_\bullet$ for the stable subsystem. A parameterization of the set of all systems in canonical form with given multi-index $\Gamma$ for the MFI(1) case, therefore, combines the following components:

- $\theta_{B,f} := [\theta_{B,f,1}', \dots, \theta_{B,f,l}']' \in \Theta_{B,f} = \mathbb{R}^{d_{B,f}}$, with

$$\theta_{B,f,k} := \begin{cases} [b^k_{1,p^k_1+1}, b^k_{1,p^k_1+2}, \dots, b^k_{1,s}, b^k_{2,p^k_2+1}, \dots, b^k_{d^k_1,s}]' & \text{for } \omega_k \in \{0, \pi\}, \\[4pt] [\mathcal{R}(b^k_{1,p^k_1+1}), \mathcal{I}(b^k_{1,p^k_1+1}), \mathcal{R}(b^k_{1,p^k_1+2}), \dots, \mathcal{I}(b^k_{1,s}), \mathcal{R}(b^k_{2,p^k_2+1}), \dots, \mathcal{I}(b^k_{d^k_1,s})]' & \text{for } 0 < \omega_k < \pi, \end{cases}$$

for $k = 1, \dots, l$, with $p_j^k$ denoting the $j$-th entry of the structure indices $p$ corresponding to $\mathcal{B}_k$. The vectors $\theta_{B,f,k}$ contain the real and imaginary parts of the free entries in $\mathcal{B}_k$ not restricted by the p.u.t. structures.
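The split of a real p.u.t. block into its positive leading entries and its free entries can be sketched as follows, under the assumption that $p_j$ gives the (1-based) column of the positive leading entry of row $j$, with zeros to its left; the helper name `put_split` is ours.

```python
import numpy as np

def put_split(B, p):
    """Split a p.u.t. block B into the positive leading entries (theta_B_p)
    and the free entries to their right (theta_B_f), row by row.
    p[j] is the 1-based column of the positive leading entry of row j."""
    theta_p = [float(B[j, p[j] - 1]) for j in range(B.shape[0])]
    theta_f = [float(B[j, k]) for j in range(B.shape[0])
               for k in range(p[j], B.shape[1])]
    return theta_p, theta_f

# First block of the matrix B in Example 7, with structure indices p = [1, 3]:
B1 = np.array([[1.0, -1.0, 2.0],
               [0.0, 0.0, 2.0]])
print(put_split(B1, [1, 3]))  # -> ([1.0, 2.0], [-1.0, 2.0])
```

The output reproduces $\theta_{B,p,1} = [1, 2]$ and $\theta_{B,f,1} = [-1, 2]$ of Example 7.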


**Example 7.** *Consider an MFI(1) process with $\Omega_S = ((0, 2), (\frac{\pi}{2}, 2))$, $p = [1, 3, 1, 2, 1, 2]'$, $n_\bullet = 0$, and system matrices*

$$\mathcal{A} = \operatorname{diag}(1, 1, i, i, -i, -i), \quad \mathcal{B} = \left[\begin{array}{ccc} 1 & -1 & 2 \\ 0 & 0 & 2 \\ \hline 1 & 1+i & 1-i \\ 0 & 2 & i \\ \hline 1 & 1-i & 1+i \\ 0 & 2 & -i \end{array}\right], \quad \mathcal{C} = \left[\begin{array}{cc|cc|cc} 0 & \frac{1}{\sqrt{2}} & \frac{1-i}{2} & \frac{1-i}{2} & \frac{1+i}{2} & \frac{1+i}{2} \\ \frac{-1}{\sqrt{2}} & \frac{1}{2} & \frac{1+i}{2} & \frac{-1-i}{2} & \frac{1-i}{2} & \frac{-1+i}{2} \\ \frac{1}{\sqrt{2}} & \frac{1}{2} & 0 & 0 & 0 & 0 \end{array}\right],$$

*in canonical form. For this example it holds that $\theta_{B,f} = [[-1, 2], [1, 1, 1, -1, 0, 1]]'$, $\theta_{B,p} = [[1, 2], [1, 2]]'$ and*

$$\theta_{C,E} = \left[\left[\left[\frac{\pi}{4}, \frac{7\pi}{4}\right], \left[\frac{\pi}{2}\right]\right], \left[[0, 0, 0, 0], \left[\frac{\pi}{4}, 0\right], \left[\frac{7\pi}{4}, \frac{5\pi}{4}\right]\right]\right]',$$

*with the parameter values corresponding to the $\mathcal{C}$-blocks collected in $\theta_{C,E}$ considered in Examples 5 and 6.*

#### *3.2. The Parameterization in the I(2) Case*

The canonical form provided above for the general case has the following form for I(2) processes with unit root structure $\Omega_S = ((0, d_1^1, d_2^1))$:

$$\mathcal{A} = \begin{bmatrix} I_{d_1^1} & I_{d_1^1} & 0 & 0 \\ 0 & I_{d_1^1} & 0 & 0 \\ 0 & 0 & I_{d_2^1 - d_1^1} & 0 \\ 0 & 0 & 0 & \mathcal{A}_\bullet \end{bmatrix}, \quad \mathcal{B} = \begin{bmatrix} \mathcal{B}_{1,1} \\ \mathcal{B}_{1,2,1} \\ \mathcal{B}_{1,2,2} \\ \mathcal{B}_\bullet \end{bmatrix}, \quad \mathcal{C} = \begin{bmatrix} \mathcal{C}_{1,1}^E & \mathcal{C}_{1,2}^G & \mathcal{C}_{1,2}^E & \mathcal{C}_\bullet \end{bmatrix},$$

where $0 < d_1^1 \le d_2^1 \le s$, $\mathcal{B}_{1,2,1}$ and $\mathcal{B}_{1,2,2}$ are p.u.t., $\mathcal{C}_{1,1}^E \in O_{s,d_1^1}$, $\mathcal{C}_{1,2}^E \in O_{s,d_2^1-d_1^1}$, $(\mathcal{C}_{1,1}^E)'\mathcal{C}_{1,2}^E = 0_{d_1^1 \times (d_2^1-d_1^1)}$, $(\mathcal{C}_{1,1}^E)'\mathcal{C}_{1,2}^G = 0_{d_1^1 \times d_1^1}$, $(\mathcal{C}_{1,2}^E)'\mathcal{C}_{1,2}^G = 0_{(d_2^1-d_1^1) \times d_1^1}$ and $(\mathcal{A}_\bullet, \mathcal{B}_\bullet, \mathcal{C}_\bullet)$ is in echelon canonical form with Kronecker indices $\alpha_\bullet$. All matrices are real valued.

The parameterizations of the p.u.t. matrices $\mathcal{B}_{1,2,1}$ and $\mathcal{B}_{1,2,2}$ are as discussed above. The entries of $\mathcal{B}_{1,1}$ are unrestricted and thus included in the parameter vector $\theta_{B,f}$, which also contains the free entries in $\mathcal{B}_{1,2,1}$ and $\mathcal{B}_{1,2,2}$. The subsystem $(\mathcal{A}_\bullet, \mathcal{B}_\bullet, \mathcal{C}_\bullet)$ is parameterized using the echelon canonical form.

The parameterization of $\mathcal{C}_{1,1}^E \in O_{s,d_1^1}$ proceeds as in the MFI(1) case, using $\mathcal{C}_O^{-1}(\mathcal{C}_{1,1}^E)$. The parameterization of $\mathcal{C}_{1,2}^E$ has to take into account the restriction that $\mathcal{C}_{1,2}^E$ is orthogonal to $\mathcal{C}_{1,1}^E$; thus the set to be parameterized is given by

$$O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E) := \{\mathcal{C}_{1,2}^E \in \mathbb{R}^{s \times (d_2^1-d_1^1)} \mid (\mathcal{C}_{1,1}^E)'\mathcal{C}_{1,2}^E = 0_{d_1^1 \times (d_2^1-d_1^1)},\ (\mathcal{C}_{1,2}^E)'\mathcal{C}_{1,2}^E = I_{d_2^1-d_1^1}\}. \tag{13}$$

The parameterization of this set again uses real Givens rotations. For $\mathcal{C} \in O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E)$ it follows that $R_L(\theta_L)\mathcal{C} = [0_{d_1^1 \times (d_2^1-d_1^1)}', \tilde{\mathcal{C}}']'$ for a matrix $\tilde{\mathcal{C}}$ such that $\tilde{\mathcal{C}}'\tilde{\mathcal{C}} = I_{d_2^1-d_1^1}$, with $R_L(\theta_L)$ corresponding to $\mathcal{C}_{1,1}^E$. The matrix $\tilde{\mathcal{C}}$ is parameterized as discussed in Lemma 1.

**Corollary 1** (Properties of the parameterization of $O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E)$)**.** *Define for $d_1^1 < d_2^1 \le s$ a mapping $\tilde{\theta} \to \mathcal{C}_{O,d_2^1-d_1^1}(\tilde{\theta}; \mathcal{C}_{1,1}^E)$ from $\Theta_{O,d_2^1-d_1^1}^{\mathbb{R}} := [0, 2\pi)^{(d_2^1-d_1^1)(s-d_2^1)} \times [0, 2\pi)^{(d_2^1-d_1^1)(d_2^1-d_1^1-1)/2} \to O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E)$ by*

$$\mathcal{C}_{O,d_2^1-d_1^1}(\tilde{\theta}; \mathcal{C}_{1,1}^E) := R_L(\theta_L)' \begin{bmatrix} 0_{d_1^1 \times (d_2^1-d_1^1)} \\ \mathcal{C}_O(\tilde{\theta}) \end{bmatrix},$$

*where $\theta_L$ denotes the parameter values corresponding to $[\theta_L', \theta_R']' = \mathcal{C}_O^{-1}(\mathcal{C}_{1,1}^E)$ as defined in Lemma 1. The following properties hold:*


*For $d_2^1 < s$ it holds that:*

*(iii) For every $\mathcal{C}_{1,2}^E \in O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E)$ there exists a vector $\tilde{\theta} = [\tilde{\theta}_L', \tilde{\theta}_R']' \in \Theta_{O,d_2^1-d_1^1}^{\mathbb{R}}$ such that*

$$\mathcal{C}_{1,2}^E = \mathcal{C}_{O,d_2^1-d_1^1}(\tilde{\theta}; \mathcal{C}_{1,1}^E) = R_L(\theta_L)' \begin{bmatrix} 0_{d_1^1 \times (d_2^1-d_1^1)} \\ R_L(\tilde{\theta}_L)' \begin{bmatrix} I_{d_2^1-d_1^1} \\ 0_{(s-d_2^1) \times (d_2^1-d_1^1)} \end{bmatrix} R_R(\tilde{\theta}_R) \end{bmatrix}.$$

*The algorithm discussed above Lemma 1 defines the inverse mapping $\mathcal{C}_{O,d_2^1-d_1^1}^{-1}(\cdot\,; \mathcal{C}_{1,1}^E) : O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E) \to \Theta_{O,d_2^1-d_1^1}^{\mathbb{R}}$.*

*(iv) The inverse mapping $\mathcal{C}_{O,d_2^1-d_1^1}^{-1}(\cdot\,; \mathcal{C}_{1,1}^E)$—the parameterization of $O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E)$—is infinitely often differentiable on the pre-image of the interior of $\Theta_{O,d_2^1-d_1^1}^{\mathbb{R}}$. This is an open and dense subset of $O_{s,d_2^1-d_1^1}(\mathcal{C}_{1,1}^E)$.*

*For $d_2^1 = s$ it holds that:*

*(v) $O_{s,s-d_1^1}(\mathcal{C}_{1,1}^E)$ is a disconnected space with two disjoint non-empty closed subsets:*

$$\begin{split} O^{+}_{s,s-d_1^1}(\mathcal{C}^E_{1,1}) &:= \{\mathcal{C}^E_{1,2} \in \mathbb{R}^{s \times (s-d_1^1)} \mid (\mathcal{C}^E_{1,1})'\mathcal{C}^E_{1,2} = 0_{d_1^1 \times (s-d_1^1)},\ (\mathcal{C}^E_{1,2})'\mathcal{C}^E_{1,2} = I_{s-d_1^1},\ \det([\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}]) = 1\}, \\ O^{-}_{s,s-d_1^1}(\mathcal{C}^E_{1,1}) &:= \{\mathcal{C}^E_{1,2} \in \mathbb{R}^{s \times (s-d_1^1)} \mid (\mathcal{C}^E_{1,1})'\mathcal{C}^E_{1,2} = 0_{d_1^1 \times (s-d_1^1)},\ (\mathcal{C}^E_{1,2})'\mathcal{C}^E_{1,2} = I_{s-d_1^1},\ \det([\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}]) = -1\}. \end{split}$$

*(vi) For every $\mathcal{C}_{1,2}^E \in O^{+}_{s,s-d_1^1}(\mathcal{C}_{1,1}^E)$ there exists a vector $\tilde{\theta} \in \Theta_{O,s-d_1^1}^{\mathbb{R}}$ such that*

$$\mathcal{C}_{1,2}^E = \mathcal{C}_{O,s-d_1^1}(\tilde{\theta}; \mathcal{C}_{1,1}^E) = R_L(\theta_L)' \begin{bmatrix} 0_{d_1^1 \times (s-d_1^1)} \\ R_R(\tilde{\theta}_R) \end{bmatrix}.$$

*Steps 1–4 of the algorithm discussed above Lemma 1 define the inverse mapping $\mathcal{C}^{-1}_{O,s-d_1^1}(\cdot\,; \mathcal{C}^E_{1,1}) : O^{+}_{s,s-d_1^1}(\mathcal{C}^E_{1,1}) \to \Theta^{\mathbb{R}}_{O,s-d_1^1}$.*

*(vii) Define $v := [\pi, \dots, \pi]' \in \mathbb{R}^{(s-d_1^1)(s-d_1^1-1)/2}$. Then a parameterization of $O_{s,s-d_1^1}(\mathcal{C}_{1,1}^E)$ is given by*

$$\mathcal{C}^{\pm}_{O,s-d_1^1}(\mathcal{C}^E_{1,2}; \mathcal{C}^E_{1,1}) := \begin{cases} v + \mathcal{C}^{-1}_{O,s-d_1^1}(\mathcal{C}^E_{1,2}; \mathcal{C}^E_{1,1}) & \text{if } \mathcal{C}^E_{1,2} \in O^{+}_{s,s-d_1^1}(\mathcal{C}^E_{1,1}), \\[4pt] -\left(v + \mathcal{C}^{-1}_{O,s-d_1^1}(\mathcal{C}^E_{1,2} I^{-}_{s-d_1^1}; \mathcal{C}^E_{1,1})\right) & \text{if } \mathcal{C}^E_{1,2} \in O^{-}_{s,s-d_1^1}(\mathcal{C}^E_{1,1}). \end{cases}$$

*The parameterization is infinitely often differentiable with infinitely often differentiable inverse on an open and dense subset of $O_{s,s-d_1^1}(\mathcal{C}_{1,1}^E)$.*

The proof of Corollary 1 uses the same arguments as the proof of Lemma 1 and is, therefore, omitted. It remains to provide a parameterization for $\mathcal{C}^G_{1,2}$, restricted to be orthogonal to both $\mathcal{C}^E_{1,1}$ and $\mathcal{C}^E_{1,2}$. Thus, the set to be parameterized is given by

$$O_{s,G}(\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}) := \{\mathcal{C}^G_{1,2} \in \mathbb{R}^{s \times d_1^1} \mid (\mathcal{C}^E_{1,1})'\mathcal{C}^G_{1,2} = 0_{d_1^1 \times d_1^1},\ (\mathcal{C}^E_{1,2})'\mathcal{C}^G_{1,2} = 0_{(d_2^1-d_1^1) \times d_1^1}\}.$$

The parameterization of $O_{s,G}(\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2})$ is straightforward: Left multiplication of $\mathcal{C}^G_{1,2}$ with $R_L(\theta_L)$ as defined in Lemma 1 and of the lower $(s-d_1^1) \times d_1^1$ block with $R_L(\tilde{\theta}_L)$ as defined in Corollary 1 transforms the upper $d_2^1 \times d_1^1$ block to zero and collects the free parameters in the lower $(s-d_2^1) \times d_1^1$ block. Clearly, this is a bijective and infinitely often differentiable mapping on $O_{s,G}(\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2})$ and thus a useful parameterization, since the matrix $\mathcal{C}^G_{1,2}$ is only multiplied by two constant invertible matrices. The entries of the matrix product are then collected in a parameter vector as shown in Corollary 2.

**Corollary 2** (Properties of the parameterization of $O_{s,G}(\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2})$)**.** *Define for given matrices $\mathcal{C}^E_{1,1} \in O_{s,d_1^1}$ and $\mathcal{C}^E_{1,2} \in O_{s,d_2^1-d_1^1}(\mathcal{C}^E_{1,1})$ a mapping $\lambda \to \mathcal{C}_{O,G}(\lambda; \mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2})$ from $\mathbb{R}^{d_1^1(s-d_2^1)} \to O_{s,G}(\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2})$ by*

$$\mathcal{C}_{O,G}(\lambda; \mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}) := R_L(\theta_L)' \begin{bmatrix} 0_{d_1^1 \times d_1^1} \\ R_L(\tilde{\theta}_L)' \begin{bmatrix} 0_{(d_2^1-d_1^1) \times d_1^1} \\ \begin{matrix} \lambda_1 & \cdots & \lambda_{d_1^1} \\ \lambda_{d_1^1+1} & \cdots & \lambda_{2d_1^1} \\ \vdots & & \vdots \\ \lambda_{d_1^1(s-d_2^1-1)+1} & \cdots & \lambda_{d_1^1(s-d_2^1)} \end{matrix} \end{bmatrix} \end{bmatrix},$$

*where $\theta_L$ denotes the parameter values corresponding to $[\theta_L', \theta_R']' = \mathcal{C}_O^{-1}(\mathcal{C}^E_{1,1})$ as defined in Lemma 1 and $\tilde{\theta}_L$ denotes the parameter values corresponding to $[\tilde{\theta}_L', \tilde{\theta}_R']' = \mathcal{C}^{-1}_{O,d_2^1-d_1^1}(\mathcal{C}^E_{1,2}; \mathcal{C}^E_{1,1})$ as defined in Corollary 1. The set $O_{s,G}(\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2})$ is closed and both $\mathcal{C}_{O,G}$ as well as $\mathcal{C}^{-1}_{O,G}(\cdot)$—the parameterization of $O_{s,G}(\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2})$—are infinitely often differentiable.*

#### Components of the Parameter Vector

In the I(2) case, the multi-index $\Gamma$ contains the state space unit root structure $\Omega_S = ((0, d_1^1, d_2^1))$, the structure indices $p \in \mathbb{N}_0^{d_1^1+d_2^1}$ encoding the p.u.t. structures of $\mathcal{B}_{1,2,1}$ and $\mathcal{B}_{1,2,2}$, and the Kronecker indices $\alpha_\bullet$ for the stable subsystem. The parameterization of the set of all systems in canonical form with given multi-index $\Gamma$ for the I(2) case uses the following components:


**Example 8.** *Consider an I(2) process with $\Omega_S = ((0, 1, 2))$, $p = [0, 1, 1]'$, $n_\bullet = 0$ and system matrices*

$$\mathcal{A} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \mathcal{B} = \begin{bmatrix} -1 & 2 & -2 \\ 1 & -1 & 3 \\ 2 & 0 & 1 \end{bmatrix}, \quad \mathcal{C} = \begin{bmatrix} 0 & -1 & \frac{1}{\sqrt{2}} \\ \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{2} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{2} \end{bmatrix}.$$

*In this case, $\theta_{B,f,1} = [-1, 2, -2, -1, 3, 0, 1]'$ and $\theta_{B,p,1} = [1, 2]'$. It follows from*

$$\begin{split} & R_{3,1,2}\left(\frac{7\pi}{4}\right) R_{3,1,3}\left(\frac{\pi}{2}\right) \mathcal{C}^E_{1,1} = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}', \\ & R_{3,1,2}\left(\frac{7\pi}{4}\right) R_{3,1,3}\left(\frac{\pi}{2}\right) \mathcal{C}^E_{1,2} = \begin{bmatrix} 0 & \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{bmatrix}' \quad \text{and} \quad R_{2,1,2}\left(\frac{7\pi}{4}\right) \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{-1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \\ & R_{3,1,2}\left(\frac{7\pi}{4}\right) R_{3,1,3}\left(\frac{\pi}{2}\right) \mathcal{C}^G_{1,2} = \begin{bmatrix} 0 & 1 & 1 \end{bmatrix}' \quad \text{and} \quad R_{2,1,2}\left(\frac{7\pi}{4}\right) \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ \sqrt{2} \end{bmatrix} \end{split}$$

*that $\theta_{C,E} = [\theta_{C,E,1,1}', \theta_{C,E,1,2}']' = \left[\left[\frac{\pi}{2}, \frac{7\pi}{4}\right], \left[\frac{7\pi}{4}\right]\right]'$ and $\theta_{C,G} = [\sqrt{2}]$.*
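The rotations of Example 8 can be verified numerically. The sketch below again assumes the real Givens rotation convention matching the factors of Example 5 (the helper name `givens` is ours).

```python
import numpy as np

def givens(s, i, j, phi):
    """Real Givens rotation R_{s,i,j}(phi) (1-based indices)."""
    R = np.eye(s)
    i, j = i - 1, j - 1
    R[i, i] = R[j, j] = np.cos(phi)
    R[i, j] = np.sin(phi)
    R[j, i] = -np.sin(phi)
    return R

# C from Example 8: columns C^E_{1,1}, C^G_{1,2}, C^E_{1,2}.
C = np.array([[0.0, -1.0, 1 / np.sqrt(2)],
              [-1 / np.sqrt(2), 1 / np.sqrt(2), 0.5],
              [1 / np.sqrt(2), 1 / np.sqrt(2), 0.5]])

# R_L(theta_L) built from the two rotations R_{3,1,2}(7pi/4) R_{3,1,3}(pi/2):
RL = givens(3, 1, 2, 7 * np.pi / 4) @ givens(3, 1, 3, np.pi / 2)

print(RL @ C[:, 0])  # C^E_{1,1} is rotated to the first unit vector
print(RL @ C[:, 2])  # C^E_{1,2} is rotated to [0, 1/sqrt(2), -1/sqrt(2)]'
print(RL @ C[:, 1])  # C^G_{1,2} is rotated to [0, 1, 1]'
# Rotating the lower 2x1 block of the transformed C^G_{1,2} once more
# exposes the free parameter theta_C_G = sqrt(2):
print(givens(2, 1, 2, 7 * np.pi / 4) @ np.array([1.0, 1.0]))
```

This reproduces $\theta_{C,E} = [[\frac{\pi}{2}, \frac{7\pi}{4}], [\frac{7\pi}{4}]]'$ and $\theta_{C,G} = [\sqrt{2}]$.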

#### *3.3. The Parameterization in the General Case*

Inspecting the canonical form shows that all relevant building blocks are already present in the MFI(1) and the I(2) cases and can be combined to deal with the general case: The entries in $\mathcal{B}_u$ are either unrestricted or follow restrictions according to given structure indices $p$, and the parameter space is chosen accordingly, as discussed for the MFI(1) and I(2) cases. The restrictions on the matrix $\mathcal{C}_u$ and its blocks $\mathcal{C}_k$ require more sophisticated parameterizations of parts of unitary or orthonormal matrices as well as of orthogonal complements. These are dealt with in Lemmas 1 and 2 and Corollaries 1 and 2 above. The extension of Corollaries 1 and 2 to complex matrices and to matrices which are orthogonal to a larger number of blocks of $\mathcal{C}_k$ is straightforward.

The following theorem characterizes the properties of parameterizations for sets *M*Γ of transfer functions with (general) multi-index Γ and describes the relations between sets of transfer functions and the corresponding sets ΔΓ of triples (A, B, C) of system matrices in canonical form, defined below. Discussing the continuity and differentiability of mappings on sets of transfer functions and on sets of matrix triples also requires the definition of a topology on both sets.

#### **Definition 8.**


Please note that in the definition of the pointwise topology convergence does not need to be uniform in $j$; moreover, the power series coefficients do not need to converge to zero for $j \to \infty$. Hence, the concept can also be used for unstable systems.

**Theorem 2.** *The set $M_n$ can be partitioned into pieces $M_\Gamma$, where $\Gamma := \{\Omega_S, p, \alpha_\bullet\}$, i.e.,*

$$M_n = \bigcup_{\Gamma = \{\Omega_S, p, \alpha_\bullet\} \,:\, n_u(\Omega_S) + n_\bullet(\alpha_\bullet) = n} M_\Gamma,$$

*where $n_u(\Omega_S) := \sum_{k=1}^{l} \sum_{j=1}^{h_k} d_j^k \delta_k$, with $\delta_k = 1$ for $\omega_k \in \{0, \pi\}$ and $\delta_k = 2$ for $0 < \omega_k < \pi$, is the state dimension of the unstable subsystem $(\mathcal{A}_u, \mathcal{B}_u, \mathcal{C}_u)$ with state space unit root structure $\Omega_S$, and $n_\bullet(\alpha_\bullet) := \sum_{i=1}^{s} \alpha_{\bullet,i}$ is the state dimension of the stable subsystem with Kronecker indices $\alpha_\bullet = (\alpha_{\bullet,1}, \dots, \alpha_{\bullet,s})$, $\alpha_{\bullet,i} \in \mathbb{N}_0$. For every multi-index $\Gamma$ there exists a parameter space $\Theta_\Gamma \subset \mathbb{R}^{d(\Gamma)}$ for some integer $d(\Gamma)$, endowed with the Euclidean norm, and a function $\varphi_\Gamma : \Delta_\Gamma \to \Theta_\Gamma$, such that for every $(\mathcal{A}, \mathcal{B}, \mathcal{C}) \in \Delta_\Gamma$ the parameter vector $\theta := \varphi_\Gamma(\mathcal{A}, \mathcal{B}, \mathcal{C}) \in \Theta_\Gamma$ is composed of:*

	- *(i) The mapping $\psi_\Gamma : M_\Gamma \to \Delta_\Gamma$ that attaches a triple $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ in canonical form to a transfer function in $M_\Gamma$ is continuous. It is the inverse (restricted to $M_\Gamma$) of the $T_{pt}$-continuous function $\pi : (\mathcal{A}, \mathcal{B}, \mathcal{C}) \to k(z) = I_s + z\mathcal{C}(I_n - z\mathcal{A})^{-1}\mathcal{B}$.*
	- *(ii) Every parameter vector $\theta = [\theta_{B,f}', \theta_{B,p}', \theta_{C,E}', \theta_{C,G}', \theta_\bullet']' \in \Theta_\Gamma \subset \Theta_{B,f} \times \Theta_{B,p} \times \Theta_{C,E} \times \Theta_{C,G} \times \Theta_\bullet$ corresponds to a triple $(\mathcal{A}(\theta), \mathcal{B}(\theta), \mathcal{C}(\theta)) \in \Delta_\Gamma$ and a transfer function $k(z) = \pi(\mathcal{A}(\theta), \mathcal{B}(\theta), \mathcal{C}(\theta)) \in M_\Gamma$. The mapping $\varphi_\Gamma^{-1} : \theta \to (\mathcal{A}(\theta), \mathcal{B}(\theta), \mathcal{C}(\theta))$ is continuous on $\Theta_\Gamma$.*

*(iii) For every multi-index $\Gamma$ the set of points in $\Delta_\Gamma$ where the mapping $\varphi_\Gamma$ is continuous is open and dense in $\Delta_\Gamma$.*
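The dimension count $n_u(\Omega_S)$ entering the partition above can be sketched as follows, under the assumption that $\Omega_S$ is supplied as a list of tuples $(\omega_k, d_1^k, \dots, d_{h_k}^k)$; the helper name `n_u` mirrors the notation of the theorem.

```python
import math

def n_u(omega_S):
    """State dimension of the unstable subsystem: each integer d^k_j is
    counted once for omega_k in {0, pi} and twice for 0 < omega_k < pi."""
    total = 0
    for omega_k, *d_k in omega_S:
        delta = 1 if omega_k in (0.0, math.pi) else 2
        total += delta * sum(d_k)
    return total

# The MFI(1) structure of Example 7 and the I(2) structure of Example 8:
print(n_u([(0.0, 2), (math.pi / 2, 2)]))  # -> 6
print(n_u([(0.0, 1, 2)]))                 # -> 3
```

The results match the state dimensions of the systems in Examples 7 ($\mathcal{A}$ is $6 \times 6$) and 8 ($\mathcal{A}$ is $3 \times 3$ with $n_\bullet = 0$).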

As mentioned in Section 2, the parameterization of $\Phi$ is straightforward. The $s \times m$ entries of $\Phi$ are collected in a parameter vector $d$. Thus, there is a one-to-one correspondence between state space realizations $(\mathcal{A}, \mathcal{B}, \mathcal{C}, \Phi) \in \Delta_\Gamma \times \mathbb{R}^{s \times m}$ and parameter vectors $\tau = [\theta', d']' \in \Theta_\Gamma \times \mathbb{R}^{sm}$. The same holds true for the parameters used for the symmetric, positive definite innovation covariance matrix $\Sigma \in \mathbb{R}^{s \times s}$, obtained, e.g., from a lower triangular Cholesky factor of $\Sigma$.
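The Cholesky-based parameterization of $\Sigma$ can be sketched as follows: the $s(s+1)/2$ free parameters fill a lower triangular factor $L$, and $\Sigma = LL'$ is then symmetric and positive definite whenever the diagonal of $L$ is positive. The helper name `sigma_from_params` is ours.

```python
import numpy as np

def sigma_from_params(c, s):
    """Map s*(s+1)/2 free parameters to a symmetric positive definite
    Sigma via a lower triangular Cholesky factor L; the map is
    one-to-one if the diagonal of L is kept positive."""
    L = np.zeros((s, s))
    L[np.tril_indices(s)] = c  # fill the lower triangle row by row
    return L @ L.T

# Three parameters determine a 2x2 innovation covariance matrix:
Sigma = sigma_from_params([1.0, 0.5, 2.0], 2)
print(Sigma)                           # values [[1, 0.5], [0.5, 4.25]]
print(np.linalg.eigvalsh(Sigma) > 0)   # both eigenvalues positive
```

Conversely, `np.linalg.cholesky(Sigma)` recovers the factor $L$, so the correspondence between $\Sigma$ and the parameter vector is indeed one-to-one on the set of positive definite matrices.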

#### **4. The Topological Structure**

The parameterization of $M_n$ in Theorem 2 partitions $M_n$ into subsets $M_\Gamma$ for a selection of multi-indices $\Gamma$. To every multi-index $\Gamma$ there corresponds a parameter set $\Theta_\Gamma$. Thus, in practical applications, maximizing the pseudo likelihood requires choosing the multi-index $\Gamma$. Maximizing the pseudo likelihood over the set $M_\Gamma$ effectively amounts to including also all elements in the closure of $M_\Gamma$, because of the continuity of the parameterization. It is thus necessary to characterize the closures of the sets $M_\Gamma$.

Moreover, maximizing the pseudo likelihood function over all possible multi-indices is time-consuming and not desirable. Fortunately, the results discussed below show that there exists a generic multi-index $\Gamma_g$ such that $M_n \subset \overline{M_{\Gamma_g}}$. This generic choice corresponds to the set of all stable systems of order $n$, corresponding to the generic neighborhood of the echelon canonical form. This multi-index, therefore, is a natural starting point for estimation.

However, in particular for hypothesis testing, it will be necessary to maximize the pseudo likelihood over sets of transfer functions of order $n$ with specific state space unit root structure $\Omega_S$, denoted as $M(\Omega_S, n_\bullet)$ below, where $n_\bullet$ denotes the dimension of the stable part of the state. We show below that also in this case there exists a generic multi-index $\Gamma_g(\Omega_S, n_\bullet)$ such that $M(\Omega_S, n_\bullet) \subset \overline{M_{\Gamma_g(\Omega_S, n_\bullet)}}$.

The main tool to obtain these results is investigating the properties of the mappings $\psi_\Gamma$ that map transfer functions in $M_\Gamma$ to triples $(\mathcal{A}, \mathcal{B}, \mathcal{C}) \in \Delta_\Gamma$, as well as analyzing the closures of the sets $\Delta_\Gamma$. The relation between parameter vectors $\theta \in \Theta_\Gamma$ and triples of system matrices $(\mathcal{A}, \mathcal{B}, \mathcal{C}) \in \Delta_\Gamma$ is easier to understand than the relation between $\Delta_\Gamma$ and $M_\Gamma$, due to the results of Theorem 2. Consequently, this section focuses on the relations between $\Delta_\Gamma$ and $M_\Gamma$—and their closures—for different multi-indices $\Gamma$.

To define the closures we embed the sets $\Delta_\Gamma$ of matrices in canonical form with multi-indices $\Gamma$ corresponding to transfer functions of order $n$ into the space $\Delta_n$ of all conformable complex matrix triples $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ with $\mathcal{A} \in \mathbb{C}^{n \times n}$, where additionally $|\lambda_{max}(\mathcal{A})| \le 1$. Since the elements of $\Delta_n$ are matrix triples, this set is isomorphic to a subset of the finite dimensional space $\mathbb{C}^{n^2 + 2ns}$, equipped with the Euclidean topology. Please note that $\Delta_n$ also contains non-minimal state space realizations, corresponding to transfer functions of lower order.

**Remark 16.** *In principle the set $\Delta_n$ also contains state space realizations of transfer functions $k(z) = I_s + \sum_{j=1}^{\infty} K_j z^j$ with complex valued coefficients $K_j$. Since the subset of $\Delta_n$ of state space systems realizing transfer functions with real valued $K_j$ is closed in $\Delta_n$, realizations corresponding to transfer functions with coefficients with non-zero imaginary part are irrelevant for the analysis of the closures of the sets $\Delta_\Gamma$.*

After investigating the closure of $\Delta_\Gamma$ in $\Delta_n$, denoted by $\overline{\Delta_\Gamma}$, we consider the set of corresponding transfer functions $\pi(\overline{\Delta_\Gamma})$. Since we effectively maximize the pseudo likelihood over $\overline{\Delta_\Gamma}$, we have to understand for which multi-indices $\tilde{\Gamma}$ the set $\pi(\Delta_{\tilde{\Gamma}})$ is a subset of $\pi(\overline{\Delta_\Gamma})$. Moreover, we find a covering $\pi(\overline{\Delta_\Gamma}) \subset \bigcup_{i \in I} M_{\Gamma_i}$. This restricts the set of multi-indices $\Gamma$ that may occur as possible multi-indices of the limit of a sequence in $\pi(\Delta_\Gamma)$ and thus the set of transfer functions that can be obtained by maximization of the pseudo likelihood.

The sets $M_\Gamma$ are embedded into the vector space $M$ of all causal transfer functions $k(z) = I_s + \sum_{j=1}^{\infty} K_j z^j$. The vector space $M$ is isomorphic to the infinite dimensional space $\prod_{j \in \mathbb{N}} \mathbb{R}^{s \times s}$, equipped with the pointwise topology. Since, as mentioned above, maximization of the pseudo likelihood function over $M_\Gamma$ effectively includes $\overline{M_\Gamma}$, it is important to determine, for any given multi-index $\Gamma$, the multi-indices $\tilde{\Gamma}$ for which the set $M_{\tilde{\Gamma}}$ is a subset of $\overline{M_\Gamma}$. Please note that $\overline{M_\Gamma}$ is not necessarily equal to $\pi(\overline{\Delta_\Gamma})$. The continuity of $\pi$, as shown in Theorem 2 (i), implies the following inclusions:

$$M\_{\Gamma} = \pi(\Delta\_{\Gamma}) \subset \pi(\overline{\Delta\_{\Gamma}}) \subset \overline{M\_{\Gamma}}.$$

In general all these inclusions are strict. For a discussion in case of stable transfer functions see Hannan and Deistler (1988, Theorem 2.5.3).

We first define a partial ordering on the set of multi-indices $\Gamma$. Subsequently we examine the closures $\overline{\Delta_\Gamma}$ in $\Delta_n$ and finally the closures $\overline{M_\Gamma}$ in $M$.

#### **Definition 9.**

*(i) For two state space unit root structures $\Omega_S$ and $\tilde{\Omega}_S$ with corresponding matrices $\mathcal{A}_u \in \mathbb{C}^{n_u \times n_u}$ and $\tilde{\mathcal{A}}_u \in \mathbb{C}^{\tilde{n}_u \times \tilde{n}_u}$ in canonical form, it holds that $\tilde{\Omega}_S \le \Omega_S$ if and only if there exists a permutation matrix $S$ such that*

$$S \mathcal{A}_u S' = \begin{bmatrix} \tilde{\mathcal{A}}_u & \mathcal{A}_{12} \\ 0 & \tilde{\mathcal{A}}_2 \end{bmatrix}.$$

*Moreover, $\tilde{\Omega}_S < \Omega_S$ holds if additionally $\tilde{\Omega}_S \neq \Omega_S$.*

*(ii) For two state space unit root structures $\Omega_S$ and $\tilde{\Omega}_S$ and dimensions of the stable subsystems $n_\bullet, \tilde{n}_\bullet \in \mathbb{N}_0$ we define*

$(\tilde{\Omega}_S, \tilde{n}_\bullet) \le (\Omega_S, n_\bullet)$ *if and only if* $\tilde{\Omega}_S \le \Omega_S$ *and* $\tilde{n}_\bullet \le n_\bullet$.

*Strict inequality holds, if at least one of the two inequalities above holds strictly.*

*(iii) For two pairs* $(\Omega_S, p)$ *and* $(\tilde\Omega_S, \tilde p)$ *with corresponding matrices* $\mathcal{A}_u \in \mathbb{C}^{n_u \times n_u}$ *and* $\tilde{\mathcal{A}}_u \in \mathbb{C}^{\tilde n_u \times \tilde n_u}$ *in canonical form, it holds that* $(\tilde\Omega_S, \tilde p) \le (\Omega_S, p)$ *if and only if there exists a permutation matrix* $\mathcal{S}$ *such that*

$$\mathcal{S}\mathcal{A}_{u}\mathcal{S}' = \begin{bmatrix} \tilde{\mathcal{A}}_{u} & \tilde{J}_{12} \\ 0 & \tilde{J}_{2} \end{bmatrix}, \qquad \mathcal{S}p = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix},$$

*where* $p_1 \in \mathbb{N}_0^{\tilde n_u}$ *and* $\tilde p$ *restricts at least as many entries as* $p_1$*, i.e.,* $\tilde p_i \ge (p_1)_i$ *holds for all* $i = 1, \ldots, \tilde n_u$*. Moreover,* $(\tilde\Omega_S, \tilde p) < (\Omega_S, p)$ *holds if additionally* $(\tilde\Omega_S, \tilde p) \neq (\Omega_S, p)$*.*

*(iv) Let* $\alpha_\bullet = (\alpha_{\bullet,1}, \ldots, \alpha_{\bullet,s})$*,* $\alpha_{\bullet,i} \in \mathbb{N}_0$*, and* $\tilde\alpha_\bullet = (\tilde\alpha_{\bullet,1}, \ldots, \tilde\alpha_{\bullet,s})$*,* $\tilde\alpha_{\bullet,i} \in \mathbb{N}_0$*. Then* $\tilde\alpha_\bullet \le \alpha_\bullet$ *if and only if* $\tilde\alpha_{\bullet,i} \le \alpha_{\bullet,i}$*,* $i = 1, \ldots, s$*. Moreover,* $\tilde\alpha_\bullet < \alpha_\bullet$ *holds if at least one inequality is strict (compare Hannan and Deistler 1988, sct. 2.5).*

*Finally, define*

$$\tilde\Gamma = (\tilde\Omega_S, \tilde p, \tilde\alpha_\bullet) \le \Gamma = (\Omega_S, p, \alpha_\bullet) \quad \text{if and only if} \quad (\tilde\Omega_S, \tilde p) \le (\Omega_S, p) \text{ and } \tilde\alpha_\bullet \le \alpha_\bullet.$$

*Strict inequality holds, if at least one of the inequalities above holds strictly.*

Please note that (i) implies that $\tilde\Omega_S$ only contains unit roots that are also contained in $\Omega_S$, with the integration orders $\tilde h_k$ of the unit roots in $\tilde\Omega_S$ smaller than or equal to the integration orders of the respective unit roots in $\Omega_S$. Thus, denoting the unit root structures corresponding to $\tilde\Omega_S$ and $\Omega_S$ by $\tilde\Omega$ and $\Omega$, it follows that $\tilde\Omega_S \le \Omega_S$ implies $\tilde\Omega \preceq \Omega$. The reverse does not hold: e.g., for $\Omega_S = ((0,1,1))$ (where hence $\Omega = ((0,2))$) and $\tilde\Omega_S = ((0,2))$ (with $\tilde\Omega = ((0,1))$) it holds that $\tilde\Omega \prec \Omega$, but neither $\tilde\Omega_S \le \Omega_S$ nor $\Omega_S \le \tilde\Omega_S$ holds, as here

$$\mathcal{A}_{u} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad \tilde{\mathcal{A}}_{u} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

This partial ordering is convenient for the characterization of the closure of ΔΓ.
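The ordering in Definition 9 (i) can be checked mechanically for small examples. The following Python sketch (our own illustration; it brute-forces over permutation matrices and only checks the block-triangular pattern, not the internal structure of the blocks $\tilde J_{12}$ and $\tilde J_2$) confirms that neither of the two $2 \times 2$ matrices above dominates the other, while the $3 \times 3$ matrix from the I(2) example of Section 4.1.1 does dominate $I_2$:

```python
import numpy as np
from itertools import permutations

def leq_unitroot(A_big, A_small, tol=1e-9):
    """Simplified check of Definition 9 (i): does a permutation S exist with
    S A_big S' = [[A_small, J_12], [0, J_2]] (block pattern only)?"""
    n, k = A_big.shape[0], A_small.shape[0]
    if k > n:
        return False
    for perm in permutations(range(n)):
        S = np.eye(n)[list(perm)]          # permutation matrix
        M = S @ A_big @ S.T
        if np.allclose(M[:k, :k], A_small, atol=tol) and np.allclose(M[k:, :k], 0, atol=tol):
            return True
    return False

# Omega_S = ((0,1,1)): one Jordan chain of length 2 at z = 1
A_u = np.array([[1.0, 1.0], [0.0, 1.0]])
# Omega_S~ = ((0,2)): two separate unit roots at z = 1
A_u_tilde = np.eye(2)
print(leq_unitroot(A_u, A_u_tilde), leq_unitroot(A_u_tilde, A_u))  # neither holds

# the 3x3 system from the I(2) example below dominates I_2
A_3 = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(leq_unitroot(A_3, np.eye(2)))
```

Brute force over permutations is of course only feasible for small $n_u$, which suffices for illustration here.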

#### *4.1. The Closure of* ΔΓ *in* Δ*<sup>n</sup>*

Please note that the block-structure of $\mathcal{A}$ implies that every system in $\Delta_\Gamma$ can be separated into two subsystems $(\mathcal{A}_u, \mathcal{B}_u, \mathcal{C}_u)$ and $(\mathcal{A}_\bullet, \mathcal{B}_\bullet, \mathcal{C}_\bullet)$. Define $\Delta_{\Omega_S,p} := \Delta_{(\Omega_S,p,\{\})}$ as the set of all state space realizations in canonical form corresponding to state space unit root structure $\Omega_S$, structure indices $p$ and $n_\bullet = 0$. Analogously define $\Delta_{\alpha_\bullet} := \Delta_{(\{\},\{\},\alpha_\bullet)}$ as the set of all state space realizations in canonical form with $\Omega_S = \{\}$ and Kronecker indices $\alpha_\bullet$. Examining $\Delta_{\Omega_S,p}$ and $\Delta_{\alpha_\bullet}$ separately simplifies the analysis.

#### 4.1.1. The Closure of $\Delta_{\Omega_S,p}$

The canonical form imposes a lot of structure, i.e., restrictions, on the matrices $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$. By definition $\Delta_{\Omega_S,p} = \Delta^{\mathcal{A}}_{\Omega_S,p} \times \Delta^{\mathcal{B}}_{\Omega_S,p} \times \Delta^{\mathcal{C}}_{\Omega_S,p}$ and the closures of the three factors can be analyzed separately. $\Delta^{\mathcal{A}}_{\Omega_S,p}$ and $\Delta^{\mathcal{C}}_{\Omega_S,p}$ are easy to investigate. The structure of $\mathcal{A}$ is fully determined by $\Omega_S$ and consequently $\Delta^{\mathcal{A}}_{\Omega_S,p}$ consists of a single matrix $\mathcal{A}$, which immediately implies that $\overline{\Delta^{\mathcal{A}}_{\Omega_S,p}} = \Delta^{\mathcal{A}}_{\Omega_S,p}$. The matrix $\mathcal{C}$, compare Theorem 1, is composed of blocks $\mathcal{C}^E_k$ that are sub-blocks of unitary (or orthonormal) matrices and blocks $\mathcal{C}^G_k$ that have to fulfill (recursive) orthogonality constraints. The corresponding sets were shown to be closed in Lemmas 1 and 2 and Corollaries 1 and 2. Thus, $\overline{\Delta^{\mathcal{C}}_{\Omega_S,p}} = \Delta^{\mathcal{C}}_{\Omega_S,p}$.

It remains to discuss $\Delta^{\mathcal{B}}_{\Omega_S,p}$. The structure indices $p$ defining the p.u.t. structures of the matrices $\mathcal{B}_k$ restrict some entries to be positive. Combining all the parameters—the unrestricted entries, with complex values parameterized by real and imaginary part, and the positive entries—into a parameter vector leads to an open subset of $\mathbb{R}^m$ for some $m$. For convergent sequences of systems with fixed $\Omega_S$ and $p$, limits of entries restricted to be positive may be zero. When this happens, two cases have to be distinguished. First, all p.u.t. sub-matrices still have full row rank. In this case the limiting system, $(\mathcal{A}_0, \mathcal{B}_0, \mathcal{C}_0)$ say, is still minimal and can be transformed to a system in canonical form $(\tilde{\mathcal{A}}_0, \tilde{\mathcal{B}}_0, \tilde{\mathcal{C}}_0)$ with *fewer* unrestricted entries in $\tilde{\mathcal{B}}_0$.

Second, if at least one of the row ranks of the p.u.t. blocks decreases in the limit, the limiting system is no longer minimal. Consequently, $(\tilde\Omega_S, \tilde p) < (\Omega_S, p)$ holds in the limit.

To illustrate this point consider again Example 4 with Equation (12) rewritten as

$$x_{t+1,1} = x_{t,1} + x_{t,2} + \mathcal{B}_{1,1}\varepsilon_t, \quad x_{t+1,2} = x_{t,2} + \mathcal{B}_{1,2,1}\varepsilon_t, \quad x_{t+1,3} = x_{t,3} + \mathcal{B}_{1,2,2}\varepsilon_t.$$

If $\mathcal{B}_{1,2,1} = [0, b_{1,2,1,2}] \neq 0$ and $\mathcal{B}_{1,2,2} = [b_{1,2,2,1}, b_{1,2,2,2}] \neq 0$ with $b_{1,2,2,1} > 0$, it holds that $\{y_t\}_{t\in\mathbb{Z}}$ is an I(2) process with state space unit root structure $\Omega_S = ((0, 1, 2))$.

Now consider a sequence of systems with all parameters except for $b_{1,2,1,2}$ constant and $b_{1,2,1,2} \to 0$. The limiting system is then given by

$$\begin{aligned}
y_t &= \mathcal{C}^E_{1,1}x_{t,1} + \mathcal{C}^G_{1,2}x_{t,2} + \mathcal{C}^E_{1,2}x_{t,3} + \varepsilon_t, \\
\begin{bmatrix} x_{t+1,1} \\ x_{t+1,2} \\ x_{t+1,3} \end{bmatrix} &= \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_{t,1} \\ x_{t,2} \\ x_{t,3} \end{bmatrix} + \begin{bmatrix} b_{1,1,1} & b_{1,1,2} \\ 0 & 0 \\ b_{1,2,2,1} & b_{1,2,2,2} \end{bmatrix} \varepsilon_t, \qquad x_{1,1} = x_{1,2} = x_{1,3} = 0.
\end{aligned}$$

In the limiting system $x_{t,2} = 0$ holds for all $t$ and is thus redundant, and $\{y_t\}_{t\in\mathbb{Z}}$ is an I(1) process rather than an I(2) process. Dropping $x_{t,2}$ leads to a state space realization of the limiting process $\{y_t\}_{t\in\mathbb{Z}}$ given by

$$\begin{aligned}
y_t &= \mathcal{C}^E_{1,1}x_{t,1} + \mathcal{C}^E_{1,2}x_{t,3} + \varepsilon_t = \tilde{C}\tilde{x}_t + \varepsilon_t, && \tilde{x}_t \in \mathbb{R}^2, \\
\begin{bmatrix} x_{t+1,1} \\ x_{t+1,3} \end{bmatrix} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_{t,1} \\ x_{t,3} \end{bmatrix} + \begin{bmatrix} b_{1,1,1} & b_{1,1,2} \\ b_{1,2,2,1} & b_{1,2,2,2} \end{bmatrix} \varepsilon_t = \tilde{x}_t + \tilde{B}\varepsilon_t, && x_{1,1} = x_{1,3} = 0.
\end{aligned}$$

In case $\tilde{B}$ has full rank, the above system is minimal. Since $b_{1,2,2,1} > 0$, the matrix $\tilde{B}$ needs to be transformed into p.u.t. format. By definition all systems in the sequence, with $b_{1,2,1,2} \neq 0$, have structure indices $p = [0, 2, 1]'$ as discussed in Example 12. The limiting system—in case of full rank of $\tilde{B}$—has indices $\tilde p = [1, 2]'$. To relate this to Definition 9 choose the permutation matrix

$$\mathcal{S} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \quad \text{to arrive at} \quad \mathcal{S}\mathcal{A}_u\mathcal{S}' = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} I_2 & \tilde{J}_{12} \\ 0 & \tilde{J}_2 \end{bmatrix}, \qquad \mathcal{S}p = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} (p_1)_1 \\ (p_1)_2 \\ p_2 \end{bmatrix}.$$

This shows that $\tilde p_i > (p_1)_i$, $i = 1, 2$, and thus the limiting system has a smaller multi-index $\Gamma$ than the systems of the sequence. In case $\tilde{B}$ has reduced rank equal to one, a further reduction of the system order to $n = 1$ along similar lines as discussed is possible, again leading to a limiting system with a smaller multi-index $\Gamma$.
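The loss of minimality in the limit can also be verified numerically: a state space system is minimal only if its controllability matrix has full rank. The following Python sketch (with illustrative numbers of our choosing for the free entries of $\mathcal{B}$) shows the rank dropping from 3 to 2 as $b_{1,2,1,2} \to 0$, so that the state $x_{t,2}$ becomes redundant:

```python
import numpy as np

def ctrb_rank(A, B):
    # rank of the controllability matrix [B, AB, ..., A^{n-1}B];
    # a state space system is minimal only if this rank equals n
    n = A.shape[0]
    blocks = [np.linalg.matrix_power(A, k) @ B for k in range(n)]
    return np.linalg.matrix_rank(np.hstack(blocks))

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

def B_of(eps):
    # second state row is B_{1,2,1} = [0, b_{1,2,1,2}] with b_{1,2,1,2} = eps;
    # the remaining entries are arbitrary illustrative values
    return np.array([[0.5, 0.3],
                     [0.0, eps],
                     [0.7, 0.2]])

print(ctrb_rank(A, B_of(0.4)))  # minimal system: rank 3
print(ctrb_rank(A, B_of(0.0)))  # limiting system: rank 2
```

Since row 2 of $\mathcal{A}^k\mathcal{B}$ equals row 2 of $\mathcal{B}$ for all $k$ here, the rank drop in the limit is exact, not a numerical artifact.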

The discussion shows that the closure of $\Delta^{\mathcal{B}}_{\Omega_S,p}$ is related to lower order systems in the sense of Definition 9. The precise statement is given in Theorem 3 after a discussion of the closure of the stable subsystems.

#### 4.1.2. The Closure of $\Delta_{\alpha_\bullet}$

Consider a convergent sequence of systems $\{(\mathcal{A}_j, \mathcal{B}_j, \mathcal{C}_j)\}_{j\in\mathbb{N}}$ in $\Delta_{\alpha_\bullet}$ and denote the limiting system by $(A_0, B_0, C_0)$. Clearly, $\lambda_{|max|}(A_0) \le 1$ holds true for the limit $A_0$ of the sequence $\{\mathcal{A}_j\}_{j\in\mathbb{N}}$ with $\lambda_{|max|}(\mathcal{A}_j) < 1$ for all $j$. Therefore, two cases have to be discussed for the limit: either $\lambda_{|max|}(A_0) < 1$, such that the limiting system is again stable, or $\lambda_{|max|}(A_0) = 1$, such that $A_0$ has eigenvalues on the unit circle.


The first case is well understood, compare Hannan and Deistler (1988, chp. 2), since the limit in this case corresponds to a stable transfer function. In the second case the limiting system can be separated into two subsystems $(\tilde{J}_2, \tilde{B}_u, \tilde{C}_u)$ and $(\tilde{A}_\bullet, \tilde{B}_\bullet, \tilde{C}_\bullet)$, according to the block diagonal structure of $\tilde{A}$. The state space unit root structure of the limiting system $(A_0, B_0, C_0)$ depends on the multiplicities of the eigenvalues of the matrix $\tilde{J}_2$ and is greater (in the sense of Definition 9) than the empty state space unit root structure. At the same time the Kronecker indices of the subsystem $(\tilde{A}_\bullet, \tilde{B}_\bullet, \tilde{C}_\bullet)$ are smaller than $\alpha_\bullet$, compare again Hannan and Deistler (1988, chp. 2). Since the Kronecker indices impose restrictions on some entries of the matrices $\mathcal{A}_j$ and thus also on $A_0$, the block $\tilde{J}_2$ and consequently also the limiting state space unit root structure might be subject to further restrictions.

#### 4.1.3. The Conformable Index Set and the Closure of $\Delta_\Gamma$

The previous subsections show that the closure of $\Delta_\Gamma$ does not only contain systems corresponding to transfer functions with multi-index smaller than or equal to $\Gamma$, but also systems that are related in a different way, which is formalized below.

**Definition 10** (Conformable index set)**.** *Given a multi-index* $\Gamma = (\Omega_S, p, \alpha_\bullet)$*, the set of conformable multi-indices* $\mathcal{K}(\Gamma)$ *contains all multi-indices* $\tilde\Gamma = (\tilde\Omega_S, \tilde p, \tilde\alpha_\bullet)$*, where:*

• *The pair* $(\tilde\Omega_S, \tilde p)$ *with corresponding matrix* $\tilde{\mathcal{A}}_u$ *in canonical form extends* $(\Omega_S, p)$ *with corresponding matrix* $\mathcal{A}_u$ *in canonical form, i.e., there exists a permutation matrix* $\mathcal{S}$ *such that*

$$\mathcal{S}\tilde{\mathcal{A}}_{u}\mathcal{S}' = \begin{bmatrix} \mathcal{A}_{u} & 0 \\ 0 & \tilde{J}_{2} \end{bmatrix} \quad \text{and} \quad \mathcal{S}\tilde{p} = \begin{bmatrix} p \\ \tilde{p}_{2} \end{bmatrix},$$


Please note that the definition implies Γ ∈ K(Γ). The importance of the set K(Γ) is clarified in the following theorem:

**Theorem 3.** *Transfer functions corresponding to state space realizations with multi-index* $\tilde\Gamma \le \Gamma$ *are contained in the set* $\pi(\overline{\Delta_\Gamma})$*. The set* $\pi(\overline{\Delta_\Gamma})$ *is contained in the union of all sets* $M_{\check\Gamma}$ *for* $\check\Gamma \le \tilde\Gamma$ *with* $\tilde\Gamma$ *conformable to* $\Gamma$*, i.e.,*

$$\bigcup_{\tilde\Gamma \le \Gamma} M_{\tilde\Gamma} \subset \pi(\overline{\Delta_{\Gamma}}) \subset \bigcup_{\tilde\Gamma \in \mathcal{K}(\Gamma)} \bigcup_{\check\Gamma \le \tilde\Gamma} M_{\check\Gamma}.$$

Theorem 3 provides a characterization of the transfer functions corresponding to systems in the closure of $\Delta_\Gamma$. The conformable set $\mathcal{K}(\Gamma)$ plays a key role here, since it characterizes the set of all minimal systems that can be obtained as limits of convergent sequences from within the set $\Delta_\Gamma$. Conformable indices extend the matrix $\mathcal{A}_u$ corresponding to the unit root structure by the block $\tilde{J}_2$.

The second inclusion in Theorem 3 is potentially strict, depending on the Kronecker indices *α*• in Γ. Equality holds, e.g., in the following case:

**Corollary 3.** *For every multi-index* $\Gamma$ *with* $n_\bullet = 0$ *the set of conformable indices consists only of* $\Gamma$*, which implies* $\pi(\overline{\Delta_\Gamma}) = \bigcup_{\tilde\Gamma \le \Gamma} M_{\tilde\Gamma}$*.*

#### *4.2. The Closure of* $M_\Gamma$

It remains to investigate the closure of $M_\Gamma$ in $M$. Hannan and Deistler (1988, Theorem 2.6.5 (ii) and Remark 3, p. 73) show that for any order $n$ there exist Kronecker indices $\alpha_{\bullet,g} = \alpha_{\bullet,g}(n)$ corresponding to the *generic neighborhood* $M_{\alpha_{\bullet,g}}$ for transfer functions of order $n$ such that

$$M_{\bullet,n} := \bigcup_{\alpha_\bullet : n_\bullet(\alpha_\bullet) = n} M_{\alpha_\bullet} \subset \overline{M_{\alpha_{\bullet,g}}},$$

where $M_{\alpha_\bullet} := \pi(\Delta_{\alpha_\bullet})$. Here $M_{\bullet,n}$ denotes the set of all transfer functions of order $n$ with state space realizations $(A, B, C)$ satisfying $\lambda_{|max|}(A) < 1$. Every transfer function in $M_{\bullet,n}$ can be approximated by a sequence of transfer functions in $M_{\alpha_{\bullet,g}}$.

It can easily be seen that a generic neighborhood also exists for systems with state space unit root structure $\Omega_S$ and without stable subsystem: Set the structure indices $p$ such that a minimal number of entries is restricted in the p.u.t. sub-blocks of $\mathcal{B}_u$, i.e., for any block $\mathcal{B}_{k,h_k,j} \in \mathbb{C}^{n_{k,h_k,j} \times s}$, or $\mathcal{B}_{k,h_k,j} \in \mathbb{R}^{n_{k,h_k,j} \times s}$ in case of a real unit root, set the corresponding structure indices to $p = [1, \ldots, n_{k,h_k,j}]'$. Any p.u.t. matrix can be approximated by a matrix in this generic neighborhood, with some of the positive entries restricted by the p.u.t. structure tending to zero. Combining these results with Theorem 3 implies the existence of a generic neighborhood for the canonical form considered in this paper:

**Theorem 4.** *Let* $M(\Omega_S, n_\bullet)$ *be the set of all transfer functions* $k(z) \in M_{n_u(\Omega_S)+n_\bullet}$ *with state space unit root structure* $\Omega_S$*. For every* $\Omega_S$ *and* $n_\bullet$*, there exists a multi-index* $\Gamma_g := \Gamma_g(\Omega_S, n_\bullet)$ *such that*

$$M(\Omega_S, n_\bullet) \subset \overline{M_{\Gamma_g}}. \tag{14}$$

*Moreover, it holds that* $M(\Omega_S, n_\bullet) \subset \overline{M_{\alpha_{\bullet,g}(n)}}$ *for every* $\Omega_S$ *and* $n_\bullet$ *satisfying* $n_u(\Omega_S) + n_\bullet \le n$*.*

Theorem 4 is the basis for choosing a generic multi-index $\Gamma$ for maximizing the pseudo likelihood function. For every $\Omega_S$ and $n_\bullet$ there exists a generic piece that—in its closure—contains all transfer functions of order $n_u(\Omega_S) + n_\bullet$ and state space unit root structure $\Omega_S$: the set of transfer functions corresponding to the multi-index with the largest possible structure indices $p$ in the sense of Definition 9 (iii) and generic Kronecker indices for the stable subsystem. Choosing these sets and their corresponding parameter spaces as model sets is, therefore, the most convenient choice for numerical maximization if only $\Omega_S$ and $n_\bullet$ are known.

If, e.g., only an upper bound for the system order $n$ is known and the goal is only to obtain consistent estimators, using $\alpha_{\bullet,g}(n)$ is a feasible choice, since all transfer functions in the closure of the set $M_{\alpha_{\bullet,g}(n)}$ can be approximated arbitrarily well, regardless of their potential state space unit root structure $\Omega_S$, $n_u(\Omega_S) \le n$. For testing hypotheses, however, it is important to understand the topological relations between sets corresponding to different multi-indices $\Gamma$. In the following we focus on the multi-indices $\Gamma_g(\Omega_S, n_\bullet)$ for arbitrary $\Omega_S$ and $n_\bullet$.

The closure of $M(\Omega_S, n_\bullet)$ also contains transfer functions that have a different state space unit root structure than $\Omega_S$. Considering convergent sequences of state space realizations $(A_j, B_j, C_j)_{j\in\mathbb{N}}$ of transfer functions in $M(\Omega_S, n_\bullet)$, the state space unit root structure of $(A_0, B_0, C_0) := \lim_{j\to\infty}(A_j, B_j, C_j)$ may differ in three ways: integration orders may decrease when restricted entries of $\mathcal{B}_u$ converge to zero; additional unit roots may appear when eigenvalues of the stable subsystem converge to the unit circle; and Jordan chains may split into shorter ones when off-diagonal entries of $\mathcal{A}_u$ are replaced by zeros in the limit.
The first change of $\Omega_S$ described above results in a transfer function with a smaller state space unit root structure according to Definition 9 (ii). The implications of the other two cases are summarized in the following definition:

**Definition 11** (Attainable unit root structures)**.** *For given* $n_\bullet$ *and* $\Omega_S$ *the set* $\mathcal{A}(\Omega_S, n_\bullet)$ *of* attainable unit root structures *contains all pairs* $(\tilde\Omega_S, \tilde n_\bullet)$*, where* $\tilde\Omega_S$ *with corresponding matrix* $\tilde{\mathcal{A}}_u$ *in canonical form extends* $\Omega_S$ *with corresponding matrix* $\mathcal{A}_u$ *in canonical form, i.e., there exists a permutation matrix* $\mathcal{S}$ *such that*

$$\mathcal{S}\tilde{\mathcal{A}}_{u}\mathcal{S}' = \begin{bmatrix} \check{\mathcal{A}}_{u} & \tilde{J}_{12} \\ 0 & \tilde{J}_{2} \end{bmatrix},$$

*where* $\check{\mathcal{A}}_u$ *can be obtained by replacing off-diagonal entries in* $\mathcal{A}_u$ *by zeros and where* $\tilde n_\bullet := n_\bullet - d_J$ *with* $d_J$ *the dimension of* $\tilde{J}_2 \in \mathbb{C}^{d_J \times d_J}$*.*

**Remark 17.** *It is a direct consequence of the definition of* $\mathcal{A}(\Omega_S, n_\bullet)$ *that* $(\tilde\Omega_S, \tilde n_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)$ *implies* $\mathcal{A}(\tilde\Omega_S, \tilde n_\bullet) \subset \mathcal{A}(\Omega_S, n_\bullet)$*.*

#### **Theorem 5.**

*(i)* $M_\Gamma$ *is* $T_{pt}$*-open in* $\overline{M_\Gamma}$ *(see Definition 8 for a definition of* $T_{pt}$*).*

*(ii) For every generic multi-index* $\Gamma_g$ *corresponding to* $\Omega_S$ *and* $n_\bullet$ *it holds that*

$$\pi(\overline{\Delta_{\Gamma_g}}) \subset \bigcup_{\tilde\Gamma \in \mathcal{K}(\Gamma_g)} \bigcup_{\check\Gamma \le \tilde\Gamma} M_{\check\Gamma} \subset \bigcup_{(\tilde\Omega_S, \tilde n_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)} \ \bigcup_{(\check\Omega_S, \check n_\bullet) \le (\tilde\Omega_S, \tilde n_\bullet)} M(\check\Omega_S, \check n_\bullet) = \overline{M_{\Gamma_g}}.$$

Theorem 5 has important consequences for statistical analysis, e.g., PML estimation, since—as stated several times already—maximizing the pseudo likelihood function over $\Theta_\Gamma$ effectively amounts to calculating the supremum over the larger set $\overline{M_\Gamma}$. Depending on the choice of $\Gamma$ the following asymptotic behavior may occur:


Finally, note that Theorem 5 also implies the following result relevant for the determination of the unit root structure, further discussed in Sections 5.1.1 and 5.2.1:

**Corollary 4.** *For every pair* $(\tilde\Omega_S, \tilde n_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)$ *it holds that*

$$\overline{M(\tilde\Omega_S, \tilde n_\bullet)} \subset \overline{M(\Omega_S, n_\bullet)}.$$

#### **5. Testing Commonly Used Hypotheses in the MFI(1) and I(2) Cases**

This section discusses a large number of hypotheses (or restrictions) on cointegrating spaces, adjustment coefficients and deterministic components that are often tested in the empirical literature. As in the VECM framework, discussed for the I(2) case in Section 2, testing hypotheses on the cointegrating spaces or adjustment coefficients may necessitate different reparameterizations.

*5.1. The MFI(1) Case*

The two by far most widely used cases of MFI(1) processes are I(1) processes and seasonally (co)integrated processes for quarterly data with state space unit root structure $((0, d^1_1), (\pi/2, d^2_1), (\pi, d^3_1))$. In general, assuming for notational simplicity $\omega_1 = 0$ and $\omega_l = \pi$, it holds for $t > 0$ and $x_{1,u} = 0$ that

$$\begin{aligned}
y_t &= \sum_{k=1}^{l} \mathcal{C}_{k,\mathbb{R}} x_{t,k,\mathbb{R}} + \mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t \\
&= \mathcal{C}_1 x_{t,1} + \sum_{k=2}^{l-1} \left( \mathcal{C}_k x_{t,k} + \overline{\mathcal{C}}_k \overline{x}_{t,k} \right) + \mathcal{C}_l x_{t,l} + \mathcal{C}_\bullet x_{t,\bullet} + \Phi d_t + \varepsilon_t \\
&= \mathcal{C}_1 \mathcal{B}_1 \sum_{j=1}^{t-1} \varepsilon_{t-j} + 2 \sum_{k=2}^{l-1} \mathcal{R}\left( \mathcal{C}_k \mathcal{B}_k \sum_{j=1}^{t-1} (z_k)^{j-1} \varepsilon_{t-j} \right) + \mathcal{C}_l \mathcal{B}_l \sum_{j=1}^{t-1} (-1)^{j-1} \varepsilon_{t-j} \\
&\quad + \mathcal{C}_\bullet \sum_{j=1}^{t-1} \mathcal{A}_\bullet^{j-1} \mathcal{B}_\bullet \varepsilon_{t-j} + \mathcal{C}_\bullet \mathcal{A}_\bullet^{t-1} x_{1,\bullet} + \Phi d_t + \varepsilon_t \\
&= \mathcal{C}_1 \mathcal{B}_1 \sum_{j=1}^{t-1} \varepsilon_{t-j} + 2 \sum_{k=2}^{l-1} \sum_{j=1}^{t-1} \left[ \mathcal{R}(\mathcal{C}_k \mathcal{B}_k) \cos(\omega_k (j-1)) + \mathcal{I}(\mathcal{C}_k \mathcal{B}_k) \sin(\omega_k (j-1)) \right] \varepsilon_{t-j} \\
&\quad + \mathcal{C}_l \mathcal{B}_l \sum_{j=1}^{t-1} (-1)^{j-1} \varepsilon_{t-j} + \mathcal{C}_\bullet \sum_{j=1}^{t-1} \mathcal{A}_\bullet^{j-1} \mathcal{B}_\bullet \varepsilon_{t-j} + \mathcal{C}_\bullet \mathcal{A}_\bullet^{t-1} x_{1,\bullet} + \Phi d_t + \varepsilon_t.
\end{aligned}$$

The above equation provides an additive decomposition of $\{y_t\}_{t\in\mathbb{Z}}$ into stochastic trends and cycles, the deterministic components and the stationary components. The stochastic cycles at frequencies $0 < \omega_k < \pi$ are, of course, given by the combination of sine and cosine terms. For the MFI(1) case this can also be seen directly from considering the real valued canonical form discussed in Remark 4, with the matrices $\mathcal{A}_{k,\mathbb{R}}$, $k = 2, \ldots, l-1$, in this case given by

$$\mathcal{A}_{k,\mathbb{R}} = I_{d^k_1} \otimes \begin{bmatrix} \cos(\omega_k) & -\sin(\omega_k) \\ \sin(\omega_k) & \cos(\omega_k) \end{bmatrix}.$$
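The rotation structure of $\mathcal{A}_{k,\mathbb{R}}$ is easily illustrated numerically. The following Python sketch (our own illustration, with $\omega_k = \pi/2$ and $d^k_1 = 2$ chosen as an example) checks that the matrix is orthogonal with all eigenvalues on the unit circle and that, at the quarterly frequency, the implied stochastic cycle has period four:

```python
import numpy as np

omega = np.pi / 2                      # quarterly seasonal frequency
R = np.array([[np.cos(omega), -np.sin(omega)],
              [np.sin(omega),  np.cos(omega)]])
A_kR = np.kron(np.eye(2), R)           # A_{k,R} for d_1^k = 2

# rotation matrix: orthogonal, all eigenvalues on the unit circle
print(np.allclose(A_kR @ A_kR.T, np.eye(4)))
print(np.allclose(np.abs(np.linalg.eigvals(A_kR)), 1.0))
# at omega = pi/2 the cycle repeats after four periods
print(np.allclose(np.linalg.matrix_power(A_kR, 4), np.eye(4)))
```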

The ranks of $\mathcal{C}_k \mathcal{B}_k$ are equal to the integers $d^k_1$ in $\Omega_S = ((\omega_1, d^1_1), \ldots, (\omega_l, d^l_1))$. The number of stochastic trends is equal to $d^1_1$, the number of stochastic cycles at frequency $\omega_k$ is equal to $2d^k_1$ for $k = 2, \ldots, l-1$ and equal to $d^l_1$ for $k = l$, as discussed in Section 3.

Moreover, in the MFI(1) case, $d^k_1$ is linked to the *complex cointegrating rank* $r_k$ at frequency $\omega_k$, defined in Johansen (1991) and Johansen and Schaumburg (1999) in the VECM case as the rank of the matrix $\Pi_k := -a(z_k)$. For VARMA processes with arbitrary integration orders the complex cointegrating rank $r_k$ at frequency $\omega_k$ is $r_k := \operatorname{rank}(-k^{-1}(z_k))$, where $k(z)$ is the transfer function, with $r_k = s - d^k_1$ in the MFI(1) case. Thus, in the MFI(1) case, determination of the state space unit root structure corresponds to determination of the complex cointegrating ranks in the VECM case.

In the VECM setting, the matrix $\Pi_k$ is usually factorized into $\Pi_k = \alpha_k \beta_k'$, as presented for the I(1) case in Section 2. For $\omega_k \in \{0, \pi\}$ the column space of $\beta_k$ gives the cointegrating space of the process at frequency $\omega_k$. For $0 < \omega_k < \pi$ the relation between the column space of $\beta_k$ and the space of CIVs and PCIVs at the corresponding frequency is more involved. The columns of $\beta_k$ are orthogonal to the columns of $\mathcal{C}_k$, the sub-block of $\mathcal{C}$ from a state space realization $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ in canonical form corresponding to the VAR process. Analogously, the column space of the matrix $\alpha_k$, containing the so-called *adjustment coefficients*, is orthogonal to the row space of the sub-block $\mathcal{B}_k$ of $\mathcal{B}$.

Both integers $d^k_1$ and $r_k$ are related to the dimensions of the static and dynamic cointegrating spaces in the MFI(1) case: For $\omega_k \in \{0, \pi\}$, the cointegrating rank $r_k = s - d^k_1$ coincides with the dimension of the static cointegrating space at frequency $\omega_k$. Furthermore, the dimension of the static cointegrating space at frequency $0 < \omega_k < \pi$ is bounded from above by $r_k = s - d^k_1$, since it is spanned by at most $s - d^k_1$ vectors $\beta \in \mathbb{R}^s$ orthogonal to the complex valued matrix $\mathcal{C}_k$. The dimension of the dynamic cointegrating space at $0 < \omega_k < \pi$ is equal to $2r_k = 2(s - d^k_1)$. Identifying again $\beta(z) = \beta_0 + \beta_1 z$ with the vector $[\beta_0', \beta_1']'$, a basis of the dynamic cointegrating space at $0 < \omega_k < \pi$ is then given by the column space of the product

$$\begin{bmatrix} \gamma_0 & \tilde\gamma_0 \\ \gamma_1 & \tilde\gamma_1 \end{bmatrix} := \begin{bmatrix} I_s & 0_{s\times s} \\ -\cos(\omega_k) I_s & -\sin(\omega_k) I_s \end{bmatrix} \begin{bmatrix} \mathcal{R}(\beta_k) & \mathcal{I}(\beta_k) \\ \mathcal{I}(\beta_k) & -\mathcal{R}(\beta_k) \end{bmatrix},$$

with the columns of $\beta_k \in \mathbb{C}^{s \times (s-d^k_1)}$ spanning the orthogonal complement of the column space of $\mathcal{C}_k$, i.e., $\beta_k$ is of full rank and $\beta_k' \mathcal{C}_k = (\mathcal{R}(\beta_k)' - i\mathcal{I}(\beta_k)')\mathcal{C}_k = 0$. This holds true since both factors are of full rank and $[\gamma_0', \gamma_1']'$ satisfies $(z_k \gamma_0' + \gamma_1')\mathcal{C}_k = 0$, which corresponds to the necessary condition given in Example 2 for the columns of $[\gamma_0', \gamma_1']'$ to be PCIVs. The latter implies that $(z_k \tilde\gamma_0' + \tilde\gamma_1')\mathcal{C}_k = 0$ also holds for $[\tilde\gamma_0', \tilde\gamma_1']'$, highlighting again the additional structure of the cointegrating space emanating from the complex conjugate pairs of eigenvalues (and matrices) as discussed in Example 2.
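This construction can be checked numerically. The following Python sketch (our own verification, assuming the convention $z_k = e^{-i\omega_k}$ and using the product representation with second row blocks $[-\cos(\omega_k)I_s, -\sin(\omega_k)I_s]$ and $[\mathcal{I}(\beta_k), -\mathcal{R}(\beta_k)]$; dimensions and seed are arbitrary) draws a random complex $\mathcal{C}_k$, computes $\beta_k$ as a basis of the orthocomplement, and verifies both the full column rank of the candidate PCIV matrix and the condition $(z_k \gamma_0' + \gamma_1')\mathcal{C}_k = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
s, d, omega = 4, 1, 2 * np.pi / 3
z_k = np.exp(-1j * omega)            # assumed convention z_k = e^{-i omega_k}

# random complex C_k with full column rank
C = rng.standard_normal((s, d)) + 1j * rng.standard_normal((s, d))
U = np.linalg.svd(C, full_matrices=True)[0]
beta = U[:, d:]                      # beta_k: beta^H C = 0, orthonormal columns
Re, Im = beta.real, beta.imag

# candidate matrix [gamma_0', gamma_1']' of dynamic cointegrating vectors
F = np.block([[np.eye(s), np.zeros((s, s))],
              [-np.cos(omega) * np.eye(s), -np.sin(omega) * np.eye(s)]])
G = F @ np.vstack([np.hstack([Re, Im]), np.hstack([Im, -Re])])
g0, g1 = G[:s, :], G[s:, :]          # 2(s - d) columns in total

resid = (z_k * g0.T + g1.T) @ C      # should vanish for PCIVs
print(np.max(np.abs(resid)))
print(np.linalg.matrix_rank(G))
```

The residual is zero up to floating point error and the matrix has full column rank $2(s - d^k_1)$, confirming that the columns span the dynamic cointegrating space under the stated convention.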

Please note that the relations between $r_k$ and $d^k_1$ discussed above only hold in the MFI(1) and I(1) special cases. For higher integration orders no such simple relations exist.

In the MFI(1) setting the deterministic component typically includes a constant, seasonal dummies and a linear trend. As discussed in Remark 6, a sufficiently rich set of deterministic components allows one to absorb non-zero initial values $x_{1,u}$.

#### 5.1.1. Testing Hypotheses on the State Space Unit Root Structure

Using the generic sets of transfer functions $M_{\Gamma_g}$ presented in Theorem 4, we can construct pseudo likelihood ratio tests for different hypotheses $H_0 : (\Omega_S, n_\bullet) = (\Omega_{S,0}, n_{\bullet,0})$ against chosen alternatives. Note, however, that by the results of Theorem 5 the null hypothesis includes all pairs $(\tilde\Omega_S, \tilde n_\bullet) \in \mathcal{A}(\Omega_{S,0}, n_{\bullet,0})$ as well as all pairs $(\check\Omega_S, \check n_\bullet)$ that are smaller than a pair $(\tilde\Omega_S, \tilde n_\bullet) \in \mathcal{A}(\Omega_{S,0}, n_{\bullet,0})$.

As common in the VECM setting, first consider hypotheses at a single frequency $\omega_k$. For an MFI(1) process, the hypothesis of a state space unit root structure equal to $\Omega_{S,0} = ((\omega_k, d^k_{1,0}))$ corresponds to the hypothesis of the (complex) cointegrating rank $r_k$ at frequency $\omega_k$ being equal to $r_0 = s - d^k_{1,0}$. Maximization of the pseudo likelihood function over the set $M(((\omega_k, d^k_{1,0})), n - \delta_k d^k_{1,0})$—with a suitably chosen order $n$—leads to estimates that may be arbitrarily close to transfer functions with different state space unit root structures $\tilde\Omega_S$. These include $\tilde\Omega_S$ with additional unit root frequencies $\tilde\omega_{\tilde k}$, with the integers $\tilde d^{\tilde k}_1$ restricted only by the order $n$. Therefore, focusing on a single frequency $\omega_k$ does not rule out a more complicated true state space unit root structure. Assume $n \ge \delta_k s$ with $\delta_k = 1$ for $\omega_k \in \{0, \pi\}$ and $\delta_k = 2$ otherwise. Corollary 4 shows that

$$\overline{M(\{\}, n)} \supset \overline{M(((\omega_k, 1)), n - \delta_k)} \supset \cdots \supset \overline{M(((\omega_k, s)), n - s\delta_k)},$$

since, e.g., $(((\omega_k, 1)), n - \delta_k) \in \mathcal{A}(\{\}, n)$.

Analogously to the procedure for testing the complex cointegrating rank $r_k$ in the VECM setting, these inclusions can be employed to test for $d^k_1$: Start with the hypothesis $d^k_1 = s$ against the alternative $0 \le d^k_1 < s$ and decrease the hypothesized $d^k_1$ consecutively until the test does not reject the null hypothesis.
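The top-down logic of this sequential procedure can be summarized in a few lines of code. The sketch below (Python; the function name and the p-values are hypothetical illustrations of ours—computing the actual pseudo likelihood ratio statistic is beyond this sketch) starts at $d^k_1 = s$ and stops at the first non-rejection:

```python
def select_unit_root_dim(pvalues, s, alpha=0.05):
    """Sequential testing sketch: pvalues[i] is the (hypothetical) p-value of
    the test of H0: d_1^k = s - i against the alternative d_1^k < s - i.
    Decrease the hypothesized d_1^k until H0 is not rejected."""
    for i, p in enumerate(pvalues):
        if p >= alpha:          # H0 not rejected: stop and select s - i
            return s - i
    return 0                    # every hypothesis rejected

# hypothetical p-values for d_1^k = 3, 2, 1, 0 (s = 3)
print(select_unit_root_dim([0.001, 0.002, 0.21, 0.63], s=3))  # selects 1
```

The same loop applies unchanged to the joint hypotheses across several frequencies mentioned below, with the inclusions of Corollary 4 guaranteeing that the nested null hypotheses are ordered.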

Furthermore, one can formulate hypotheses on $d^k_1$ jointly at different frequencies $\omega_k$. Again, there exist inclusions based on the definition of the set of attainable state space unit root structures and Corollary 4, which can be used to consecutively test hypotheses on $\Omega_S$.

#### 5.1.2. Testing Hypotheses on CIVs and PCIVs

Johansen (1995) considers in the I(1) case three types of hypotheses on the cointegrating space spanned by the columns of $\beta$, each motivated by examples from economic research. The different cases correspond to different types of restrictions implied by economic theory.

(i) $H_0: \beta = H\varphi$, $\beta \in \mathbb{R}^{s\times r}$, $H \in \mathbb{R}^{s\times t}$, $\varphi \in \mathbb{R}^{t\times r}$, $r \leq t < s$: The cointegrating space is known to be a subspace of the column space of $H$ (which is of full column rank).

(ii) $H_0': \beta = [b, \varphi]$, $b \in \mathbb{R}^{s\times t}$, $\varphi \in \mathbb{R}^{s\times(r-t)}$, $0 < t \leq r$: A known subspace, the column space of $b$, is part of the cointegrating space.

(iii) $H_0'': \beta = [H_1\varphi_1, H_2\varphi_2]$, $H_j \in \mathbb{R}^{s\times t_j}$, $\varphi_j \in \mathbb{R}^{t_j\times r_j}$, $r_j \leq t_j \leq s$, $r_1 + r_2 = r$: Different parts of the cointegrating space are known to lie in given column spaces.

As discussed in Example 1, cointegration at $\omega_k = 0$ occurs if and only if a vector $\beta_j$ satisfies $\beta_j'\mathcal{C}_1 = 0$. In other words, the column space of $\mathcal{C}_1$ is the orthocomplement of the cointegrating space spanned by the columns of $\beta$, and hypotheses on $\beta$ restrict entries of $\mathcal{C}_1$.

The first type of hypothesis, $H_0$, implies that the column space of $\mathcal{C}_1$ equals the orthocomplement of the column space of $H\varphi$. Assume w.l.o.g. $H \in O_{s,t}$, $\varphi_\perp \in O_{t,t-r}$ and $H_\perp \in O_{s,s-t}$, such that the columns of $[H\varphi_\perp, H_\perp]$ form an orthonormal basis of the orthocomplement of the cointegrating space. Consider now the mapping:

$$\mathcal{C}_1^r(\check\theta_L, \theta_R) := \left[ H \check{R}_L(\check\theta_L)' \begin{bmatrix} I_{t-r} \\ 0_{r \times (t-r)} \end{bmatrix}, \; H_\perp \right] R_R(\theta_R), \tag{15}$$

where $\check{R}_L(\check\theta_L) := \prod_{i=1}^{t-r} \prod_{j=1}^{r} R_{t,i,t-r+j}(\theta_{L,r(i-1)+j}) \in \mathbb{R}^{t\times t}$ and $R_R(\theta_R) \in \mathbb{R}^{(s-r)\times(s-r)}$ as in Lemma 1. From this one can derive a parameterization of the set of matrices $\mathcal{C}_1^r$ corresponding to $H_0$, analogously to Lemma 1. The difference between the number of free parameters under the null hypothesis and under the alternative is the difference between the number of free parameters in $\theta_L \in [0,2\pi)^{r(s-r)}$ and $\check\theta_L \in [0,2\pi)^{r(t-r)}$, implying a reduction of the number of free parameters by $r(s-t)$ under the null hypothesis. This necessarily coincides with the number of degrees of freedom of the corresponding test statistic in the VECM setting (cf. Johansen 1995, Theorem 7.2).
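A minimal numerical sketch of the mapping (15) may clarify the construction. The Givens rotation $R_{t,i,j}(\theta)$ and the product $\check{R}_L$ follow the definitions above; for illustration only, $R_R$ is taken to be a single Givens rotation rather than the full product of Lemma 1 (an assumption of this sketch), and $H$ is a randomly drawn orthonormal matrix.

```python
import numpy as np

def givens(t, i, j, theta):
    # Givens rotation R_{t,i,j}(theta): identity except in rows/columns i, j
    # (1-based indices, matching the notation in the text).
    R = np.eye(t)
    c, s = np.cos(theta), np.sin(theta)
    R[i - 1, i - 1] = c; R[j - 1, j - 1] = c
    R[i - 1, j - 1] = -s; R[j - 1, i - 1] = s
    return R

def R_L_check(t, r, theta_L):
    # \check{R}_L: product over i = 1..t-r, j = 1..r of R_{t,i,t-r+j}.
    R = np.eye(t)
    for i in range(1, t - r + 1):
        for j in range(1, r + 1):
            R = R @ givens(t, i, t - r + j, theta_L[r * (i - 1) + j - 1])
    return R

def C1_restricted(H, H_perp, r, theta_L, theta_R):
    # Mapping (15): [ H \check{R}_L' [I_{t-r}; 0], H_perp ] R_R(theta_R).
    # Assumption of this sketch: R_R is a single Givens rotation.
    s, t = H.shape
    E = np.vstack([np.eye(t - r), np.zeros((r, t - r))])
    left = H @ R_L_check(t, r, theta_L).T @ E
    return np.hstack([left, H_perp]) @ givens(s - r, 1, s - r, theta_R)

# Example: s = 5, t = 3, r = 2; H a random orthonormal s x t matrix.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
H, H_perp = Q[:, :3], Q[:, 3:]
C1 = C1_restricted(H, H_perp, 2, theta_L=[0.3, 1.1], theta_R=0.7)
# C1 is orthonormal and its column space contains col(H_perp), so the implied
# cointegrating space (the orthocomplement of col(C1)) lies inside col(H).
assert np.allclose(C1.T @ C1, np.eye(3))
assert np.allclose(C1 @ C1.T @ H_perp, H_perp)
```

The two assertions verify exactly the property the hypothesis $H_0$ encodes: every candidate $\mathcal{C}_1^r$ is orthonormal, and its orthocomplement, the cointegrating space, is a subspace of the column space of $H$.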

The second type of hypothesis, $H_0'$, is also straightforwardly parameterized: in this case a subspace of the cointegrating space is known and given by the column space of $b \in \mathbb{R}^{s\times t}$. Assume w.l.o.g. $b \in O_{s,t}$. The orthocomplement of $\beta = [b, \varphi]$ is given by the set of matrices $\mathcal{C}_1$ satisfying the restriction $b'\mathcal{C}_1 = 0$, i.e., the set $O_{s,d_1}(b)$ defined in (13). The parameterization of this set has already been discussed. The reduction of the number of free parameters under the null hypothesis is $t(s-r)$, which again coincides with the number of degrees of freedom of the corresponding test statistic in the VECM setting (cf. Johansen 1995, Theorem 7.3).

Finally, the third type of hypothesis, $H_0''$, is the most difficult to parameterize in our setting. As an illustrative example consider the case $H_0'': \beta = [H_1\varphi_1, H_2\varphi_2]$, $\beta \in \mathbb{R}^{s\times r}$, $H_1 \in \mathbb{R}^{s\times t_1}$, $H_2 \in \mathbb{R}^{s\times t_2}$, $\varphi_1 \in \mathbb{R}^{t_1\times r_1}$, $\varphi_2 \in \mathbb{R}^{t_2\times r_2}$, $r_j \leq t_j \leq s$ and $r_1 + r_2 = r$. W.l.o.g. choose $H_b \in O_{s,t_b}$ such that its columns span the $t_b$-dimensional intersection of the column spaces of $H_1$ and $H_2$, and choose $\tilde H_j \in O_{s,\tilde t_j}(H_b)$, $j = 1,2$, such that the columns of $\tilde H_j$ and $H_b$ span the column space of $H_j$. Define $\tilde H := [\tilde H_1, \tilde H_2, H_b] \in O_{s,\tilde t}$, with $\tilde t = \tilde t_1 + \tilde t_2 + t_b$. Let w.l.o.g. $\tilde H_\perp \in O_{s,s-\tilde t}(\tilde H)$ and define $p_j := \min(r_j, \tilde t_j)$, $q_j := \max(r_j, \tilde t_j)$ for $j = 1,2$ and $p_b := q_1 - \tilde t_1 + q_2 - \tilde t_2$. A parameterization of $\beta^r \in O_{s,r}$ satisfying the restrictions under the null hypothesis can be derived from the following mapping:

$$\beta^r(\theta_H, \theta_{R,\beta}) := \tilde H \cdot R_H(\theta_H) \begin{bmatrix} I_{p_1} & 0_{p_1\times p_2} & 0_{p_1\times p_b} \\ \mathbf{0}_{(q_1-r_1)\times p_1} & 0_{(q_1-r_1)\times p_2} & 0_{(q_1-r_1)\times p_b} \\ 0_{p_2\times p_1} & I_{p_2} & 0_{p_2\times p_b} \\ 0_{(q_2-r_2)\times p_1} & \mathbf{0}_{(q_2-r_2)\times p_2} & 0_{(q_2-r_2)\times p_b} \\ 0_{p_b\times p_1} & 0_{p_b\times p_2} & I_{p_b} \\ \mathbf{0}_{(\tilde t-q_1-q_2)\times p_1} & \mathbf{0}_{(\tilde t-q_1-q_2)\times p_2} & \mathbf{0}_{(\tilde t-q_1-q_2)\times p_b} \end{bmatrix} \cdot R_R(\theta_{R,\beta}),$$

where $R_R(\theta_{R,\beta}) \in \mathbb{R}^{r\times r}$ as in Lemma 1 and $R_H(\theta_H) := R_H(\theta_{H_1}, \theta_{H_2}, \theta_{H_b}) := R_{H_1}(\theta_{H_1}) R_{H_2}(\theta_{H_2}) R_{H_b}(\theta_{H_b}) \in \mathbb{R}^{\tilde t\times\tilde t}$ is a product of Givens rotations corresponding to the entries in the blocks highlighted in bold. The three matrices are defined as follows:

$$\begin{aligned} R_{H_1}(\theta_{H_1}) &:= \prod_{i=1}^{p_1} \prod_{j=1}^{\tilde t - q_2 - r_1} R_{\tilde t, i, \delta_{H_1}(j)+j}(\theta_{H_1,(\tilde t - q_2 - r_1)(i-1)+j}), \quad \delta_{H_1}(j) := \begin{cases} p_1 & \text{if } j \leq q_1 - r_1 \\ \tilde t_1 + \tilde t_2 + p_b & \text{else,} \end{cases} \\ R_{H_2}(\theta_{H_2}) &:= \prod_{i=1}^{p_2} \prod_{j=1}^{\tilde t - q_1 - r_2} R_{\tilde t, p_1+i, \delta_{H_2}(j)+j}(\theta_{H_2,(\tilde t - q_1 - r_2)(i-1)+j}), \quad \delta_{H_2}(j) := \begin{cases} \tilde t_1 + p_2 & \text{if } j \leq q_2 - r_2 \\ \tilde t_1 + \tilde t_2 + p_b & \text{else,} \end{cases} \end{aligned}$$

$$R_{H_b}(\theta_{H_b}) := \prod_{i=1}^{p_b} \prod_{j=1}^{\tilde t - q_1 - q_2} R_{\tilde t, p_1+p_2+i, \tilde t_1+\tilde t_2+p_b+j}(\theta_{H_b,(\tilde t - q_1 - q_2)(i-1)+j}).$$

Consequently, a parameterization of the orthocomplement of the cointegrating space is based on the mapping:

$$\mathcal{C}_1^r(\theta_H, \theta_{R,\mathcal{C}}) := \left[ \tilde H \cdot R_H(\theta_H) \begin{bmatrix} 0_{p_1\times(q_1-r_1)} & 0_{p_1\times(q_2-r_2)} & 0_{p_1\times(\tilde t-q_1-q_2)} \\ I_{q_1-r_1} & 0_{(q_1-r_1)\times(q_2-r_2)} & 0_{(q_1-r_1)\times(\tilde t-q_1-q_2)} \\ 0_{p_2\times(q_1-r_1)} & 0_{p_2\times(q_2-r_2)} & 0_{p_2\times(\tilde t-q_1-q_2)} \\ 0_{(q_2-r_2)\times(q_1-r_1)} & I_{q_2-r_2} & 0_{(q_2-r_2)\times(\tilde t-q_1-q_2)} \\ 0_{p_b\times(q_1-r_1)} & 0_{p_b\times(q_2-r_2)} & 0_{p_b\times(\tilde t-q_1-q_2)} \\ 0_{(\tilde t-q_1-q_2)\times(q_1-r_1)} & 0_{(\tilde t-q_1-q_2)\times(q_2-r_2)} & I_{\tilde t-q_1-q_2} \end{bmatrix}, \; \tilde H_\perp \right] \cdot R_R(\theta_{R,\mathcal{C}}),$$

where $R_H(\theta_H) \in \mathbb{R}^{\tilde t\times\tilde t}$ as above and $R_R(\theta_{R,\mathcal{C}}) \in \mathbb{R}^{(s-r)\times(s-r)}$ as in Lemma 1. Please note that for all $\theta_H$, $\theta_{R,\beta}$ and $\theta_{R,\mathcal{C}}$ it holds that $\beta^r(\theta_H, \theta_{R,\beta})'\mathcal{C}_1^r(\theta_H, \theta_{R,\mathcal{C}}) = 0_{r\times(s-r)}$. The number of parameters restricted under $H_0''$ equals $r_1(q_1 - r_1) + r_2(q_2 - r_2) + (r_1 + r_2)(\tilde t - q_1 - q_2) + (s-r)(s-r+1)/2$ and thus, through $q_1$ and $q_2$, depends on the dimension $t_b$ of the intersection of the column spaces of $H_1$ and $H_2$. The reduction of the number of free parameters matches the degrees of freedom of the test statistic in Johansen (1995, Theorem 7.5) if $\beta$ is identified, which is the case if $r_1 \leq \tilde t_1$ and $r_2 \leq \tilde t_2$.

Using the mapping $\beta^r(\cdot)$ as a basis for a parameterization allows one to introduce another type of hypothesis of the form:

(iv) $H_0''': \beta_\perp = \mathcal{C}_1 = [H_1\varphi_1, \ldots, H_c\varphi_c]$, $\beta_\perp \in \mathbb{R}^{s\times(s-r)}$, $H_j \in O_{s,t_j}$, $\varphi_j \in O_{t_j,r_j}$, $r_j \leq t_j \leq s$, for $j = 1, \ldots, c$ such that $\sum_{j=1}^c r_j = s - r$. The orthocomplement of the cointegrating space is contained in the column spaces of the (full rank) matrices $H_j$.

This type of hypothesis allows one, e.g., to test for the presence of cross-unit cointegrating relations (cf. Wagner and Hlouskova 2009, Definition 1) in, e.g., multi-country data sets.

Hypotheses on the cointegrating space at frequency *ω<sup>k</sup>* = *π* can be treated analogously to hypotheses on the cointegrating space at frequency *ω<sup>k</sup>* = 0.

Testing hypotheses on cointegrating spaces at frequencies $0 < \omega_k < \pi$ has to be discussed in more detail, as one also has to consider the space spanned by PCIVs, compare Example 2. There are $2(s - d_1^k)$ linearly independent PCIVs of the form $\beta(z) = \beta_0 + \beta_1 z$. Every PCIV corresponds to a vector $z_k\beta_0 + \beta_1 \in \mathbb{C}^s$ orthogonal to $\mathcal{C}_k$, and consequently hypotheses on the space spanned by PCIVs can be transformed into hypotheses on the complex column space of $\mathcal{C}_k \in \mathbb{C}^{s\times d_1^k}$.

Consider, e.g., an extension of the first type of hypothesis of the form

$$\begin{aligned} H_0^k : \begin{bmatrix} \gamma_0 & \tilde\gamma_0 \\ \gamma_1 & \tilde\gamma_1 \end{bmatrix} &= \begin{bmatrix} I_s & 0_{s\times s} \\ -\cos(\omega_k)I_s & \sin(\omega_k)I_s \end{bmatrix} \begin{bmatrix} (\tilde H_0\tilde\varphi_0 - \tilde H_1\tilde\varphi_1) & (\tilde H_0\tilde\varphi_1 + \tilde H_1\tilde\varphi_0) \\ -(\tilde H_0\tilde\varphi_1 + \tilde H_1\tilde\varphi_0) & (\tilde H_0\tilde\varphi_0 - \tilde H_1\tilde\varphi_1) \end{bmatrix} \\ &= \begin{bmatrix} I_s & 0_{s\times s} \\ -\cos(\omega_k)I_s & \sin(\omega_k)I_s \end{bmatrix} \begin{bmatrix} \tilde H_0 & \tilde H_1 \\ -\tilde H_1 & \tilde H_0 \end{bmatrix} \begin{bmatrix} \tilde\varphi_0 & \tilde\varphi_1 \\ -\tilde\varphi_1 & \tilde\varphi_0 \end{bmatrix} \end{aligned}$$

with $\tilde H_0, \tilde H_1 \in \mathbb{R}^{s\times t}$, $\tilde\varphi_0, \tilde\varphi_1 \in \mathbb{R}^{t\times r}$, $r \leq t < s$, which implies that the column space of $\mathcal{C}_k$ is equal to the orthocomplement of the column space of $(\tilde H_0 + i\tilde H_1)(\tilde\varphi_0 + i\tilde\varphi_1)$. This general hypothesis encompasses, e.g., the hypothesis $[\gamma_0', \gamma_1']' = H\varphi = [H_0', H_1']'\varphi$, with $H \in \mathbb{R}^{2s\times t}$, $H_0, H_1 \in \mathbb{R}^{s\times t}$, $\varphi \in \mathbb{R}^{t\times r}$, by setting $\tilde\varphi_0 := \tilde\varphi_1 := \varphi$, $\tilde H_0 := H_0$ and $\tilde H_1 := -(\cos(\omega_k)H_0 + H_1)/\sin(\omega_k)$. The extension is tailored to include the pairwise structure of PCIVs and to simplify the transformation into hypotheses on the complex matrix $\mathcal{C}_k$ used in the parameterization. The parameterization of the set of matrices corresponding to $H_0^k$ is derived from a mapping of the form given in (15), with $\check R_L(\check\theta_L)$ and $R_R(\theta_R)$ replaced by $\check Q_L(\check\varphi_L) := \prod_{i=1}^{t-r}\prod_{j=1}^{r} Q_{t,i,t-r+j}(\varphi_{L,r(i-1)+j}) \in \mathbb{R}^{t\times t}$ and $D_d(\varphi_D)Q_R(\varphi_R)$ as in Lemma 2.

Similarly, the other three types of hypotheses on the cointegrating spaces considered above can be extended to hypotheses on the space of PCIVs in the MFI(1) case. They translate into hypotheses on complex-valued matrices $\beta_k$ orthogonal to $\mathcal{C}_k$. To parameterize the set of matrices restricted according to these null hypotheses, Lemma 2 is used. Thus, the restrictions implied by the extensions of all four types of hypotheses to hypotheses on the dynamic cointegrating spaces at frequencies $0 < \omega_k < \pi$ for MFI(1) processes can be implemented using Givens rotations.

A different case of interest is the hypothesis of at least $m$ linearly independent CIVs $b_j \in \mathbb{R}^s$, $j = 1, \ldots, m$, with $0 < m \leq s - d_1^k$, i.e., an $m$-dimensional static cointegrating space at frequency $0 < \omega_k < \pi$, which we discuss as another illustrative example of the procedure for the case of cointegration at complex unit roots.

For the dynamic cointegrating space, this hypothesis implies the existence of $2m$ linearly independent PCIVs of the form $\beta_1(z) = b_j$ and $\beta_2(z) = b_j z$, $j = 1, \ldots, m$. In light of the discussion above, the necessary condition for these two polynomials to be PCIVs is equivalent to $b_j'\mathcal{C}_k = 0$ for $j = 1, \ldots, m$. This restriction is similar to $H_0'$ discussed above, except that the cointegrating vectors $b_j$ are not fully specified. The hypothesis is equivalent to the existence of an $m$-dimensional real kernel of $\mathcal{C}_k$. A suitable parameterization is derived from the following mapping

$$\mathcal{C}(\theta_b, \varphi) := R_L(\theta_b) \begin{bmatrix} 0_{m\times d_1^k} \\ C_U(\varphi) \end{bmatrix},$$

where $\theta_b \in [0, 2\pi)^{m(s-m)}$ and $C_U(\varphi) := C_U(\varphi_L, \varphi_D, \varphi_R) \in U_{s-m,d_1^k}$ as in Lemma 2. The difference in the number of free parameters without and with the restrictions is equal to $m(s-m)$.
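The construction can be verified numerically in a small sketch. The exact ordering of the Givens rotations inside $R_L(\theta_b)$ and the parameterization $C_U(\varphi)$ of Lemma 2 are replaced here by simple stand-ins (an assumed rotation ordering and a random complex matrix with orthonormal columns), so only the structural property — an $m$-dimensional real space orthogonal to $\mathcal{C}$ — is illustrated.

```python
import numpy as np

def givens(n, i, j, theta):
    # Real Givens rotation acting on coordinates i and j (1-based).
    R = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    R[i - 1, i - 1] = c; R[j - 1, j - 1] = c
    R[i - 1, j - 1] = -s; R[j - 1, i - 1] = s
    return R

def C_with_real_kernel(s, m, d, theta_b, C_U):
    # Sketch of C(theta_b, phi) = R_L(theta_b) [0_{m x d}; C_U(phi)].
    # R_L is taken as the product of the m(s-m) rotations mixing the first m
    # with the last s-m coordinates (an assumption on the exact ordering).
    R_L = np.eye(s)
    k = 0
    for i in range(1, m + 1):
        for j in range(m + 1, s + 1):
            R_L = R_L @ givens(s, i, j, theta_b[k]); k += 1
    return R_L, R_L @ np.vstack([np.zeros((m, d)), C_U])

# Example: s = 4, m = 2, d_1^k = 1; C_U a complex (s-m) x d matrix with
# orthonormal columns, standing in for C_U(phi) from Lemma 2.
rng = np.random.default_rng(1)
Z = rng.standard_normal((2, 1)) + 1j * rng.standard_normal((2, 1))
C_U = Z / np.linalg.norm(Z)
theta_b = rng.uniform(0, 2 * np.pi, size=4)
R_L, C = C_with_real_kernel(4, 2, 1, theta_b, C_U)
# The real vectors b_j = R_L e_j, j = 1..m, satisfy b_j' C = 0: they span an
# m-dimensional static cointegrating space at the complex unit root.
b = R_L[:, :2]
assert np.allclose(b.T @ C, 0)
```

The free angles $\theta_b$ rotate the location of the real kernel, while $C_U$ parameterizes the remaining complex directions, mirroring the split of parameters in the mapping above.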

The hypotheses can also be tested jointly for the cointegrating spaces of several unit roots.

#### 5.1.3. Testing Hypotheses on the Adjustment Coefficients

As in the case of hypotheses on the cointegrating spaces $\beta_k$, hypotheses on the adjustment coefficients $\alpha_k$ are typically formulated as hypotheses on the column spaces of $\alpha_k$. We focus only on hypotheses on the real-valued $\alpha_1$ corresponding to frequency zero. Analogous hypotheses may be considered for $\alpha_k$ at frequencies $\omega_k \neq 0$, using the same ideas.

The first type of hypothesis on $\alpha_1$ is of the form $H_\alpha: \alpha_1 = A\psi$, $A \in \mathbb{R}^{s\times t}$, $\psi \in \mathbb{R}^{t\times r}$, and can therefore be rewritten as $\mathcal{B}_1 A_\perp = 0$. W.l.o.g. let $A \in O_{s,t}$ and $A_\perp \in O_{s,s-t}$. We deal with this type of hypothesis as with $H_0: \beta = H\varphi$ in the previous section by simply reversing the roles of $\mathcal{C}_1$ and $\mathcal{B}_1$. We therefore consider the set of feasible matrices $\mathcal{B}_1'$ as a subset of $O_{s,s-r}$ and use the mapping $\mathcal{B}_1'(\check\theta_L, \theta_R) = [A\check R_L(\check\theta_L)'[I_{t-r}, 0_{(t-r)\times r}]', A_\perp]R_R(\theta_R)$ to derive a parameterization, while $\mathcal{C}_1'$ is restricted to be a p.u.t. matrix and the set of feasible matrices $\mathcal{C}_1'$ is parameterized accordingly.

As a second type of hypothesis, Juselius (2006, sct. 11.9, p. 200) discusses $H_\alpha': \alpha_{1,\perp} = H\psi$, $H \in \mathbb{R}^{s\times t}$, $\psi \in \mathbb{R}^{t\times(s-r)}$, linked to the absence of permanent effects of the shocks $H_\perp'\varepsilon_t$ on any of the variables of the system. Assume w.l.o.g. $H_\perp \in O_{s,s-t}$. Using the parameterization of $O_{s-r}(H_\perp)$ defined in (13) for the set of feasible matrices $\mathcal{B}_1'$ and the parameterization of the set of p.u.t. matrices for the set of feasible matrices $\mathcal{C}_1'$ implements this restriction.

The restrictions implied by $H_\alpha$ reduce the number of free parameters by $r(s-t)$ and the restrictions implied by $H_\alpha'$ lead to a reduction by $t(s-r)$ free parameters compared to the unrestricted case; in both cases this matches the number of degrees of freedom of the corresponding test statistic in the VECM framework.

#### 5.1.4. Restrictions on the Deterministic Components

Including an unrestricted constant in the VECM equation $\Delta_0 y_t = \varepsilon_t + \Phi_0$ leads to a linear trend in the solution process $y_t = \sum_{j=1}^{t}(\varepsilon_j + \Phi_0) + y_1 = \sum_{j=1}^{t}\varepsilon_j + y_1 + \Phi_0 t$, for $t > 1$. If one restricts the constant to $\Phi_0 = \alpha\tilde\Phi_0$, $\tilde\Phi_0 \in \mathbb{R}^r$, in a general VECM equation as given in (4), with $\Pi = \alpha\beta'$ of rank $r$, no summation to linear trends in the solution process occurs, while a constant non-zero mean is still present in the cointegrating relations, i.e., in the process $\{\beta' y_t\}_{t\in\mathbb{Z}}$. Analogously, an unrestricted linear trend $\Phi_1 t$ in the VECM equation leads to a quadratic trend of the form $\Phi_1 t(t-1)/2$ in the solution process, which is excluded by the restriction $\Phi_1 t = \alpha\tilde\Phi_1 t$.
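The effect of restricting the constant can be seen in a deterministic (noise-free) toy recursion $\Delta y_t = \Pi y_{t-1} + \Phi_0$; the particular $\alpha$, $\beta$ and horizon below are illustrative choices, not taken from the text.

```python
import numpy as np

# Noise-free illustration: an unrestricted constant produces a linear trend in
# the common-trend direction, while Phi_0 = alpha * Phi0_tilde only shifts the
# mean of the cointegrating relation beta' y_t. (alpha, beta, T illustrative.)
alpha = np.array([[-0.5], [0.0]])
beta = np.array([[1.0], [-1.0]])
Pi = alpha @ beta.T
T = 200

def simulate(Phi0):
    y = np.zeros((T, 2))
    for t in range(1, T):
        y[t] = y[t - 1] + Pi @ y[t - 1] + Phi0
    return y

y_unres = simulate(np.array([0.0, 0.1]))   # constant outside col(alpha)
y_res = simulate((alpha * 2.0).ravel())    # restricted: Phi_0 = alpha * 2
# Unrestricted: the common-trend direction alpha_perp = (0, 1)' drifts linearly.
assert abs(y_unres[-1, 1] - 0.1 * (T - 1)) < 1e-8
# Restricted: no trend, but beta' y_t settles at a non-zero constant mean (-2).
assert abs(y_res[-1, 1]) < 1e-8
assert abs((beta.T @ y_res[-1])[0] - (-2.0)) < 1e-6
```

With noise switched back on, the same mechanism adds the stochastic trends on top of these deterministic components.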

In the VECM framework, compare Johansen (1995, sct. 5.7, p. 81), five restrictions related to the coefficients corresponding to the constant and the linear trend are commonly considered:


with $\Phi_0, \Phi_1 \in \mathbb{R}^s$ and $\tilde\Phi_0, \tilde\Phi_1 \in \mathbb{R}^r$, and the following consequences for the solution processes: Under $H(r)$ the solution process contains a quadratic trend in the direction of the common trends, i.e., in $\{\beta_\perp' y_t\}_{t\in\mathbb{Z}}$, and a linear trend in the direction of the cointegrating relations, i.e., in $\{\beta' y_t\}_{t\in\mathbb{Z}}$. Under $H^*(r)$ the quadratic trend is not present. $H_1(r)$ features a linear trend only in the directions of the common trends, $H_2(r)$ a constant only in these directions. Under $H_1^*(r)$ the constant is also present in the directions of the cointegrating relations.

In the state space framework the deterministic components can be added in the output equation $y_t = \mathcal{C}x_t + \Phi d_t + \varepsilon_t$, compare (9). Consequently, the hypotheses considered above can be imposed by formulating linear restrictions on $\Phi$. These can be directly parameterized by including the following deterministic components in the five considered cases:


where $\Phi_0, \Phi_1 \in \mathbb{R}^s$ and $\tilde\Phi_0, \tilde\Phi_1, \tilde\Phi_2 \in \mathbb{R}^{d_1^1}$. The component $\mathcal{C}_1\tilde\Phi_0$ captures the influence of the initial value $\mathcal{C}_1 x_{1,1}$ in the output equation.

In the VECM framework for the seasonal MFI(1) case, with $\Pi_k = \alpha_k\beta_k'$ of rank $r_k$ for $0 < \omega_k < \pi$, the deterministic component usually includes restricted seasonal dummies of the form $\alpha_k\tilde\Phi_k z_k^t + \bar\alpha_k\bar{\tilde\Phi}_k\bar z_k^t$, $\tilde\Phi_k \in \mathbb{C}^{r_k}$, to avoid summation in the directions of the stochastic trends. The state space framework allows one to include seasonal dummies in the output equation straightforwardly in the form $\Phi_k z_k^t + \bar\Phi_k\bar z_k^t$, $\Phi_k \in \mathbb{C}^s$. Again, it is of interest whether these components are unrestricted or whether they take the form $\mathcal{C}_k\tilde\Phi_k z_k^t + \bar{\mathcal{C}}_k\bar{\tilde\Phi}_k\bar z_k^t$, $\tilde\Phi_k \in \mathbb{C}^{d_1^k}$, similarly allowing for a reinterpretation of these components as the influence of the initial values $x_{1,k}$ on the output.

Please note that $\Phi_k z_k^t + \bar\Phi_k\bar z_k^t$ is equivalently given by $\check\Phi_{k,1}\sin(\omega_k t) + \check\Phi_{k,2}\cos(\omega_k t)$ using real coefficients $\check\Phi_{k,1}, \check\Phi_{k,2} \in \mathbb{R}^s$, and the desired restrictions can be implemented accordingly.
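This equivalence is a direct consequence of $\Phi_k z_k^t + \bar\Phi_k\bar z_k^t = 2\,\mathrm{Re}(\Phi_k z_k^t)$, giving $\check\Phi_{k,1} = -2\,\mathrm{Im}(\Phi_k)$ and $\check\Phi_{k,2} = 2\,\mathrm{Re}(\Phi_k)$. A quick numerical check (with illustrative coefficients and the quarterly frequency as an example):

```python
import numpy as np

# Check that the complex seasonal dummy pair Phi z^t + conj(Phi) conj(z)^t
# equals the real combination -2 Im(Phi) sin(w t) + 2 Re(Phi) cos(w t).
omega = 2 * np.pi / 4                        # e.g., quarterly unit root frequency
z = np.exp(1j * omega)
Phi = np.array([0.3 + 0.7j, -1.2 - 0.4j])    # illustrative complex coefficients
for t in range(1, 9):
    complex_form = Phi * z**t + np.conj(Phi) * np.conj(z)**t
    real_form = -2 * Phi.imag * np.sin(omega * t) + 2 * Phi.real * np.cos(omega * t)
    assert np.allclose(complex_form.real, real_form)
    assert np.allclose(complex_form.imag, 0)  # the pair is real-valued
```

Restrictions formulated on $(\check\Phi_{k,1}, \check\Phi_{k,2})$ and on $\Phi_k$ are therefore interchangeable.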

#### *5.2. The I*(2) *Case*

The state space unit root structure of I(2) processes is of the form $\Omega_S = ((0, d_1^1, d_2^1))$, where the integer $d_1^1$ equals the dimension of $x_{t,1}^E$, and $d_2^1$ equals the dimension of $[(x_{t,2}^G)', (x_{t,2}^E)']'$. Recall that the solution for $t > 0$ and $x_{1,u} = 0$ of the system in canonical form in this setting is given by

$$\begin{split} y\_t &= \quad \mathcal{C}\_{1,1}^E \mathbf{x}\_{t,1}^E + \mathcal{C}\_{1,2}^G \mathbf{x}\_{t,2}^G + \mathcal{C}\_{1,2}^E \mathbf{x}\_{t,2}^E + \mathcal{C}\_{\bullet} \mathbf{x}\_{t,\bullet} + \Phi d\_t + \varepsilon\_t \\ &= \quad \mathcal{C}\_{1,1}^E \mathcal{B}\_{1,2,1} \sum\_{k=1}^{t-1} \sum\_{j=1}^k \varepsilon\_{t-j} + \left( \mathcal{C}\_{1,1}^E \mathcal{B}\_{1,1} + \mathcal{C}\_{1,2}^G \mathcal{B}\_{1,2,1} + \mathcal{C}\_{1,2}^E \mathcal{B}\_{1,2,2} \right) \sum\_{j=1}^{t-1} \varepsilon\_{t-j} \\ &\quad + \mathcal{C}\_{\bullet} \sum\_{j=1}^{t-1} \mathcal{A}\_{\bullet}^{j-1} \mathcal{B}\_{\bullet} \varepsilon\_{t-j} + \mathcal{C}\_{\bullet} \mathcal{A}\_{\bullet}^{t-1} \mathbf{x}\_{1,\bullet} + \Phi d\_t + \varepsilon\_t. \end{split}$$

For VAR processes integrated of order two, the integers $d_1^1$ and $d_2^1$ of the corresponding state space unit root structure are linked to the ranks of the matrices $\Pi = \alpha\beta'$ (denoted $r = r_0$) and $\alpha_\perp'\Gamma\beta_\perp = \xi\eta'$ (denoted $m = r_1$) in the VECM setting, as discussed in Section 2. It holds that $r = s - d_2^1$ and $m = d_2^1 - d_1^1$. The relation of the state space unit root structure to the cointegration indices $r_0, r_1, r_2$ was also discussed in Section 3.
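The translation between the two sets of indices is pure arithmetic and can be written down directly; the identity $r_2 = s - r_0 - r_1$ used below follows from the standard I(2) index decomposition $s = r_0 + r_1 + r_2$ (an assumption of this sketch, not restated in the paragraph above).

```python
# Translate the state space unit root structure ((0, d1, d2)) of an I(2)
# process into the VECM ranks, per the relations in the text:
# r = s - d2 and m = d2 - d1; r2 then follows from s = r0 + r1 + r2.
def vecm_ranks(s, d1, d2):
    r = s - d2          # rank of Pi = alpha beta'            (r = r0)
    m = d2 - d1         # rank of alpha_perp' Gamma beta_perp (m = r1)
    r2 = s - r - m      # equals d1: number of I(2) common trends
    return r, m, r2

# Example: s = 4 variables, d1 = 1, d2 = 3 gives r = 1, m = 2, r2 = 1.
assert vecm_ranks(4, 1, 3) == (1, 2, 1)
```

Note that $r_2 = d_1^1$ always holds under these relations, i.e., the number of I(2) common trends equals the first index of the state space unit root structure.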

Again, both the integers $d_1^1$ and $d_2^1$ and the ranks $r$ and $m$, and consequently also the indices $r_0, r_1$ and $r_2$, are closely related to the dimensions of the spaces spanned by CIVs and PCIVs. In the I(2) case the static cointegrating space of order $((0,2),(0,1))$ is the orthocomplement of the column space of $\mathcal{C}_{1,1}^E$ and thus of dimension $s - d_1^1$. The dimension of the space spanned by CIVs of order $((0,2),\{\})$ is equal to $s - d_2^1 - r_{c,G}$, where $r_{c,G}$ denotes the rank of $\mathcal{C}_{1,2}^G$, since this space is the orthocomplement of the column space of $[\mathcal{C}_{1,1}^E, \mathcal{C}_{1,2}^G, \mathcal{C}_{1,2}^E]$. The space spanned by the PCIVs $\beta_0 + \beta_1 z$ of order $((0,2),\{\})$ is of dimension smaller than or equal to $2s - d_1^1 - d_2^1$, due to the orthogonality constraint on $[\beta_0', \beta_1']'$ given in Example 3.

Consider the matrices $\beta, \beta_1$ and $\beta_2$ as defined in Section 2. From a state space realization $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ in canonical form corresponding to a VAR process it immediately follows that the columns of $\beta_2$ span the same space as the columns of the sub-block $\mathcal{C}_{1,1}^E$. The same relation holds true for $\beta_1$ and the sub-block $\mathcal{C}_{1,2}^E$. With respect to polynomial cointegration, Bauer and Wagner (2012) show that the rank of $\mathcal{C}_{1,2}^G$ determines the number of minimum degree polynomial cointegrating relations, as discussed in Example 3. If $\mathcal{C}_{1,2}^G = 0$, then there exists no vector $\gamma$ such that $\{\gamma' y_t\}_{t\in\mathbb{Z}}$ is integrated and cointegrated with $\{\beta_2'\Delta_0 y_t\}_{t\in\mathbb{Z}}$. In this case $\{\beta' y_t\}_{t\in\mathbb{Z}}$ is a stationary process.

The deterministic components included in the I(2) setting are typically a constant and a linear trend. As in the MFI(1) case, identifiability problems occur if we consider a non-zero initial state $x_{1,u}$: the solution to the state space equations for $t > 0$ and $x_{1,u} \neq 0$ is given by:

$$y\_t = \sum\_{j=1}^{t-1} \mathcal{C} \mathcal{A}^{j-1} \mathcal{B} \varepsilon\_{t-j} + \mathcal{C}\_{1,1}^{\operatorname{E}} (\mathbf{x}\_{1,1}^{\operatorname{E}} + \mathbf{x}\_{1,2}^{\operatorname{G}}(t-1)) + \mathcal{C}\_{1,2}^{\operatorname{G}} \mathbf{x}\_{1,2}^{\operatorname{G}} + \mathcal{C}\_{1,2}^{\operatorname{E}} \mathbf{x}\_{1,2}^{\operatorname{E}} + \mathcal{C}\_{\bullet} \mathcal{A}\_{\bullet}^{t-1} \mathbf{x}\_{1,\bullet} + \Phi d\_t + \varepsilon\_t.$$

Hence, if $\Phi d_t = \Phi_0 + \Phi_1 t$, the output equation contains the terms $\mathcal{C}_{1,1}^E x_{1,1}^E + \mathcal{C}_{1,2}^G x_{1,2}^G + \mathcal{C}_{1,2}^E x_{1,2}^E - \mathcal{C}_{1,1}^E x_{1,2}^G + \Phi_0$ and $(\mathcal{C}_{1,1}^E x_{1,2}^G + \Phi_1)t$. Again, this implies non-identifiability, which is resolved by assuming $x_{1,u} = 0$, compare Remark 6.

#### 5.2.1. Testing Hypotheses on the State Space Unit Root Structure

To simplify notation we use

$$
\overline{M}(d\_1^1, d\_2^1) \quad := \begin{cases}
\overline{M(((0, d\_1^1, d\_2^1)), n - d\_1^1 - d\_2^1)} & \text{if } d\_1^1 > 0, \\
\overline{M(((0, d\_2^1)), n - d\_2^1)} & \text{if } d\_1^1 = 0, d\_2^1 > 0, \\
\overline{M\_{\bullet, n}} & \text{if } d\_1^1 = d\_2^1 = 0,
\end{cases}
$$

with $n \geq d_1^1 + d_2^1$. Here $\overline{M}(d_1^1, d_2^1)$ for $d_1^1 + d_2^1 > 0$ denotes the closure of the set of transfer functions of order $n$ that possess a state space unit root structure $\Omega_S = ((0, d_1^1, d_2^1))$, or $\Omega_S = ((0, d_2^1))$ in case $d_1^1 = 0$, while $\overline{M}(0,0)$ denotes the closure of the set of all stable transfer functions of order $n$.

Considering the relations between the different sets of transfer functions given in Corollary 4 shows that the following inclusions hold (assuming $s \geq 4$; the columns are arranged so that transfer functions with the same dimension of $\mathcal{A}_u$ are aligned):

$$\begin{array}{ccccccccc}
\overline{M}(0,0) & \supset & \overline{M}(0,1) & \supset & \overline{M}(1,0) & & & & \\
 & & & & \cup & & & & \\
 & & & & \overline{M}(0,2) & \supset & \overline{M}(1,1) & \supset & \overline{M}(2,0) \\
 & & & & & & \cup & & \cup \\
 & & & & & & \overline{M}(0,3) & \supset & \overline{M}(1,2) \\
 & & & & & & & & \cup \\
 & & & & & & & & \overline{M}(0,4)
\end{array}$$

Please note that $\overline{M}(d_1^1, d_2^1)$ corresponds to $H_{s-d_2^1, d_2^1-d_1^1} = H_{r,r_1}$ in Johansen (1995). Therefore, the relationships between the subsets match the ones in Johansen (1995, Table 9.1) and the ones found by Jensen (2013). The latter type of inclusion appears, for instance, for $\overline{M}(0,2)$, containing transfer functions corresponding to I(1) processes, which is a subset of the set $\overline{M}(1,0)$ of transfer functions corresponding to I(2) processes.

The same remarks as in the MFI(1) case also apply in the I(2) case: when testing $H_0: \Omega_S = ((0, d_{1,0}^1, d_{2,0}^1))$, all attainable state space unit root structures $\mathcal{A}(((0, d_{1,0}^1, d_{2,0}^1)))$ have to be included in the null hypothesis.

#### 5.2.2. Testing Hypotheses on CIVs and PCIVs

Johansen (2006) discusses several types of hypotheses on the cointegrating spaces of different orders. These concern properties of $\beta$, joint properties of $[\beta, \beta_1]$, or the occurrence of non-trivial polynomial cointegrating relations. Boswijk and Paruolo (2017), moreover, discuss testing hypotheses on the loading matrices of common trends (corresponding in our setting to hypotheses on $\mathcal{C}_1$).

We commence with hypotheses of the form $H_0: \beta = K\varphi$ and $H_0': \beta = [b, \varphi]$, just as in the MFI(1) case at unit root one, since hypotheses on $\beta$ correspond to hypotheses on its orthocomplement, spanned by $[\mathcal{C}_{1,1}^E, \mathcal{C}_{1,2}^E]$ in the VARMA framework:

Hypotheses of the form $H_0: \beta = K\varphi$, $K \in \mathbb{R}^{s\times t}$, $\varphi \in \mathbb{R}^{t\times r}$ imply $\varphi' K'[\mathcal{C}_{1,1}^E, \mathcal{C}_{1,2}^E] = 0$. W.l.o.g. let $K \in O_{s,t}$ and $K_\perp \in O_{s,s-t}$. As in the parameterization under $H_0$ in the MFI(1) case at unit root one, compare (15), use the mapping

$$[\mathcal{C}_{1,1}^{E,r}, \mathcal{C}_{1,2}^{E,r}](\check\theta_L, \theta_R) := \left[ K \check{R}_L(\check\theta_L)' \begin{bmatrix} I_{t-r} \\ 0_{r\times(t-r)} \end{bmatrix}, \; K_\perp \right] R_R(\theta_R),$$

to derive a parameterization of the set of feasible matrices $[\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}]$, i.e., a joint parameterization of both sets of matrices $\mathcal{C}^E_{1,1}$ and $\mathcal{C}^E_{1,2}$, where $[\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}] \in O_{s,s-r}$.

Hypotheses of the form $H_0' : \beta = [b, \phi]$, $b \in \mathbb{R}^{s \times t}$, $\phi \in \mathbb{R}^{s \times (r-t)}$, $0 < t \le r$ are equivalent to $b' [\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}] = 0$. Assume w.l.o.g. $b \in O_{s,t}$ and parameterize the set of feasible matrices $\mathcal{C}^E_{1,1}$ using $O_{s,d^1_1}(b)$ as defined in (13) and the set of feasible matrices $\mathcal{C}^E_{1,2}$ using $O_{s,d^1_2-d^1_1}([b, \mathcal{C}^E_{1,1}])$. Alternatively, parameterize the set of feasible matrices jointly as elements $[\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}] \in O_{s,s-r}(b)$.

Applications using the VECM framework allow for testing hypotheses on $[\beta, \beta_1]$. In the VARMA framework, these correspond to hypotheses on the orthogonal complement of $[\beta, \beta_1]$, i.e., $\mathcal{C}^E_{1,1}$. Implementation of different types of hypotheses on $[\beta, \beta_1]$ proceeds as for similar hypotheses on *β* in the MFI(1) case at unit root one, replacing $[\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}]$ by $\mathcal{C}^E_{1,1}$.

The hypothesis of no minimum degree polynomial cointegrating relations implies the restriction $\mathcal{C}^G_{1,2} = 0$, compare Example 3. Therefore, we can test all hypotheses considered in Johansen (2006) also in our more general setting.

#### 5.2.3. Testing Hypotheses on the Adjustment Coefficients

Hypotheses on *α* and *ξ* as defined in (6) and (7) correspond to hypotheses on the spaces spanned by the rows of $\mathcal{B}_{1,2,1}$ and $\mathcal{B}_{1,2,2}$. For VAR processes integrated of order two, the row space of $\mathcal{B}_{1,2,1}$ is equal to the orthogonal complement of the column space of $[\alpha, \alpha_\perp \xi]$, while the row space of $\mathcal{B}_{1,2} := [\mathcal{B}_{1,2,1}', \mathcal{B}_{1,2,2}']'$ is equal to the orthogonal complement of the column space of *α*. The restrictions corresponding to hypotheses on *α* and *ξ* can be implemented analogously to the restrictions corresponding to hypotheses on $\alpha_1$ in Section 5.1.3, reversing the roles of the relevant sub-blocks in $\mathcal{B}_u$ and $\mathcal{C}_u$ accordingly.

#### 5.2.4. Restrictions on the Deterministic Components

The I(2) case is, with respect to the modeling of deterministic components, less well studied than the MFI(1) case. In most theory papers they are simply left out, with the notable exception of Rahbek et al. (1999), dealing with the inclusion of a constant term in the I(2)-VECM representation. The main reason for this appears to be the way deterministic components in the defining vector error correction representation translate into deterministic components in the corresponding solution process. An unrestricted constant in the VECM for I(2) processes leads to a linear trend in $\{\beta_1' y_t\}_{t \in \mathbb{Z}}$ and a quadratic trend in $\{\beta_2' y_t\}_{t \in \mathbb{Z}}$, while an unrestricted linear trend results in quadratic and cubic trends in the respective directions. Already in the I(1) case discussed above, five different cases—with respect to integration and asymptotic behavior of estimators and tests—need to be considered separately. An all-encompassing discussion of the restrictions on the coefficients of a constant and a linear trend in the I(2) case requires the specification of even more cases. As an alternative approach in the VECM framework, deterministic components could be dealt with by replacing $y_t$ with $y_t - \Phi d_t$ in the VECM equation. This has recently been considered in Johansen and Nielsen (2018) and is analogous to our approach in the state space framework.

As before in the MFI(1) or I(1) case, the analysis of (the impact of) deterministic components is straightforward in the state space framework, which effectively stems from their additive inclusion in the Granger-type representation, compare (9). Choose, e.g., $\Phi d_t = \Phi_0 + \Phi_1 t$, as in the I(1) case. In analogy to Section 5.1.4, linear restrictions of deterministic components in relation to the static and polynomial cointegrating spaces can be embedded in a parameterization. Focusing on $\Phi_0$, e.g., this is achieved by

$$\Phi_0 = [\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}]\varphi_0 + \tilde{\mathcal{C}}_{1,2}\tilde{\varphi}_0 + \mathcal{C}_\perp \check{\varphi}_0,$$

where the columns of $\tilde{\mathcal{C}}_{1,2}$ are a basis for the column space of $\mathcal{C}^G_{1,2}$, which does not necessarily have full column rank, and the columns of $\mathcal{C}_\perp$ span the orthocomplement of the column space of $[\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}, \tilde{\mathcal{C}}_{1,2}]$. The matrix $\Phi_1$ can be decomposed analogously. The corresponding parameterization then allows us to consider different restricted versions of deterministic components and to study the asymptotic behavior of estimators and tests for these cases.
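The decomposition above can be sketched numerically. The following minimal example uses purely illustrative matrices of our own choosing (the names `C_E`, `Ct`, `C_perp` merely stand in for $[\mathcal{C}^E_{1,1}, \mathcal{C}^E_{1,2}]$, a basis of the column space of $\mathcal{C}^G_{1,2}$, and its orthocomplement): stacking the three blocks gives an invertible basis of $\mathbb{R}^s$, so the coefficient blocks $\varphi_0$, $\tilde{\varphi}_0$, $\check{\varphi}_0$ are recovered by solving one linear system.

```python
import numpy as np

# Illustrative sketch only: the matrices below are randomly generated stand-ins,
# not objects from the paper. We decompose a "constant" Phi0 into the three
# directions of the display above and recover the coefficient blocks.
rng = np.random.default_rng(1)
s = 5
C_E = np.linalg.qr(rng.standard_normal((s, 2)))[0]   # stand-in for [C^E_{1,1}, C^E_{1,2}]
Ct = rng.standard_normal((s, 1))                     # stand-in basis of col space of C^G_{1,2}
# orthocomplement of [C_E, Ct]: trailing columns of a complete QR decomposition
Q_full = np.linalg.qr(np.column_stack([C_E, Ct]), mode="complete")[0]
C_perp = Q_full[:, 3:]

basis = np.column_stack([C_E, Ct, C_perp])           # invertible s x s matrix (a.s.)
Phi0 = rng.standard_normal(s)
coeffs = np.linalg.solve(basis, Phi0)                # stacked (phi0, phi~0, phi^0)
assert np.allclose(basis @ coeffs, Phi0)             # decomposition reproduces Phi0
```

Note that no orthogonality between the three blocks is needed for the decomposition to be unique; only joint invertibility of the stacked basis matters.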

#### **6. Summary and Conclusions**

Vector autoregressive moving average (VARMA) processes, which can be cast equivalently in the state space framework, may be useful for empirical analysis compared to the more restrictive class of vector autoregressive (VAR) processes for a variety of reasons. These include invariance with respect to marginalization and aggregation, parsimony, as well as the fact that the log-linearized solutions to DSGE models are typically VARMA processes rather than VAR processes. Realizing the potential of these advantages necessitates, in our view, developing cointegration analysis for VARMA processes to a similar extent as it is developed for VAR processes. The necessary first steps of this research agenda are to develop a set of structure theoretical results that subsequently allow developing statistical inference procedures. Bauer and Wagner (2012) provides the very first step of this agenda, a *canonical form* for unit root processes in the state space framework, which is shown in that paper to be very convenient for cointegration analysis.

Based on that canonical form, this paper derives a *parameterization* of VARMA processes with unit roots in the state space framework. The canonical form, and a fortiori the parameterization based on it, are constructed to facilitate the investigation of the unit root and (static and polynomial) cointegration properties of the considered process. Furthermore, the paper shows that the framework allows testing a large variety of hypotheses on cointegrating ranks and spaces, clearly a key aspect for the usefulness of any method of cointegration analysis. In addition to providing general results, throughout the paper all results are discussed in detail for the multiple frequency I(1) and I(2) cases, which cover the vast majority of applications.

Given the fact that (as shown in Hazewinkel and Kalman 1976) VARMA unit root processes cannot be continuously parameterized, the set of all unit root processes (as defined in this paper) is partitioned according to a multi-index Γ that includes the state space unit root structure. The parameterization is shown to be a diffeomorphism on the interior of the considered sets. The topological relationships between the sets forming the partitioning of all transfer functions considered are studied in great detail for three reasons: First, pseudo maximum likelihood estimation effectively amounts to maximizing the pseudo likelihood function over the closures of sets of transfer functions, $\overline{M_\Gamma}$ in our notation. Second, related to the first item, the relations between subsets of $\overline{M_\Gamma}$ have to be understood in detail as knowledge concerning these relations is required for developing (sequential) pseudo likelihood-ratio tests for the numbers of stochastic trends or cycles. Third, of particular importance for the implementation of, e.g., pseudo maximum likelihood estimators, we discuss the existence of *generic pieces*.

In this respect we derive two results: First, for correctly specified state space unit root structure and system order of the stable subsystem—and thus correctly specified system order—we explicitly describe generic indices $\Gamma_g(\Omega_S, n_\bullet)$ such that $M_{\Gamma_g(\Omega_S, n_\bullet)}$ is open and dense in the set of all transfer functions with state space unit root structure $\Omega_S$ and system order of the stable subsystem $n_\bullet$. This result forms the basis for establishing consistent estimators of the transfer functions—and via continuity of the parameterization—of the parameter estimators when the state space unit root structure and system order are known. Second, in case only an upper bound on the system order is known (or specified), we show the existence of a generic multi-index $\Gamma_{\alpha_\bullet,g}(n)$ for which the set of corresponding transfer functions $M_{\Gamma_{\alpha_\bullet,g}(n)}$ is open and dense in the set $M_n$ of all non-explosive transfer functions whose order (or McMillan degree) is bounded by *n*. This result is the basis for consistent estimation (on an open and dense subset) when only an upper bound on the system order is known. In turn, this estimator is the starting point for determining $\Omega_S$, using the subset relationships alluded to above in the second point. For the MFI(1) and I(2) cases we show in detail that similar subset relations (concerning cointegrating ranks) as in the cointegrated VAR MFI(1) and I(2) cases hold, which suggests constructing similar sequential test procedures for determining the cointegrating ranks as in the VAR cointegration literature.

Section 5 is devoted to a detailed discussion of testing hypotheses on the cointegrating spaces, again for both the MFI(1) and the I(2) case. In this section, particular emphasis is put on modeling deterministic components. The discussion details how all usually formulated and tested hypotheses concerning (static and polynomial) cointegrating vectors, potentially in combination with (un-)restricted deterministic components, in the VAR framework can also be investigated in the state space framework.

Altogether, the paper sets the stage for developing pseudo maximum likelihood estimators, investigating their asymptotic properties (consistency and limiting distributions), and constructing tests based on them for determining cointegrating ranks, thereby allowing cointegration analysis for cointegrated VARMA processes. The detailed discussion of the MFI(1) and I(2) cases benefits the development of the statistical theory for these cases undertaken in a series of companion papers.

**Author Contributions:** The authors of the paper have contributed equally, via joint efforts, regarding ideas, research, and writing. Conceptualization, all authors; methodology, all authors; formal analysis, P.d.M.R. and L.M.; investigation, all authors; writing—original draft preparation, P.d.M.R. and L.M.; writing—review and editing, all authors; project administration, D.B. and M.W.; funding acquisition, D.B. and M.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, Projektnummer 276051388), which is gratefully acknowledged. We acknowledge support for the publication costs by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University.

**Acknowledgments:** We thank the editors, Rocco Mosconi and Paolo Paruolo, as well as anonymous referees for helpful suggestions. The views expressed in this paper are solely those of the authors and not necessarily those of the Bank of Slovenia or the European System of Central Banks. On top of this the usual disclaimer applies.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Proofs of the Results of Section 3**

#### *Appendix A.1. Proof of Lemma 1*

(i) Let $C_j$ be a sequence in $O_{s,d}$ converging to $C_0$ for $j \to \infty$. By continuity of matrix multiplication

$$C_0' C_0 = \big(\lim_{j \to \infty} C_j\big)' \lim_{j \to \infty} C_j = \lim_{j \to \infty} (C_j' C_j) = I_d.$$

Thus, $C_0 \in O_{s,d}$, which shows that $O_{s,d}$ is closed. By construction $[C'C]_{i,i} = \sum_{j=1}^{s} c_{j,i}^2$. Since $[C'C]_{i,i} = 1$ for all $C \in O_{s,d}$ and $i = 1, \dots, d$, the entries of *C* are bounded.
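The closedness and boundedness argument above can be illustrated numerically. The following sketch uses a deliberately simple, self-chosen family of plane rotations in $O_{3,2}$ (not an object from the paper): a convergent sequence stays orthonormal, its limit is again orthonormal, and all entries are bounded since every column has unit norm.

```python
import numpy as np

# Numerical sketch of Lemma 1(i): O_{s,d} = {C in R^{s x d} : C'C = I_d} is
# closed and bounded. Illustrative family: plane rotations embedded in O_{3,2},
# evaluated along the convergent angle sequence t_j = 1 + 1/j -> 1.
def C_of(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)],
                     [0.0,        0.0      ]])

Cs = [C_of(1.0 + 1.0 / j) for j in range(1, 200)]
C0 = C_of(1.0)                                            # the limit of the sequence
assert all(np.allclose(C.T @ C, np.eye(2)) for C in Cs)   # sequence stays in O_{3,2}
assert np.allclose(Cs[-1], C0, atol=1e-2)                 # C_j approaches C0
assert np.allclose(C0.T @ C0, np.eye(2))                  # limit again in O_{3,2}: closedness
assert np.max(np.abs(C0)) <= 1.0                          # entries bounded (unit-norm columns)
```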


Clearly, the interior of $\Theta^R_O$ is open and dense in $\Theta^R_O$. By the definition of continuity, the pre-image of the interior of $\Theta^R_O$ is open in $O_{s,d}$. By (iii) there exists, for arbitrary $C_0 \in O_{s,d}$, a $\theta_0$ such that $C_O(\theta_0) = C_0$. Since the interior of $\Theta^R_O$ is dense in $\Theta^R_O$, there exists a sequence $\theta_j$ in the interior of $\Theta^R_O$ such that $\theta_j \to \theta_0$. Then $C_O(\theta_j) \to C_0$ because of the continuity of $C_O$. Since $C_O(\theta_j)$ is a sequence in the pre-image of the interior of $\Theta^R_O$, it follows that the pre-image of the interior of $\Theta^R_O$ is dense in $O_{s,d}$.


$O^+_{s,s}$. Clearly, the multiplication with $I^-_s$ is infinitely often differentiable with infinitely often differentiable inverse, which implies that $C^\pm_O(\cdot)$ on $O^-_{s,s}$ is infinitely often differentiable with infinitely often differentiable inverse on an open subset of $O^-_{s,s}$, from which the result follows.

#### *Appendix A.2. Proof of Lemma 2*

(i) Let $C_j$ be a sequence in $U_{s,d}$ converging to $C_0$ for $j \to \infty$. By continuity of matrix multiplication

$$C_0' C_0 = \big(\lim_{j \to \infty} C_j\big)' \lim_{j \to \infty} C_j = \lim_{j \to \infty} (C_j' C_j) = I_d.$$

Thus, $C_0 \in U_{s,d}$, which shows that $U_{s,d}$ is closed. By construction $[C'C]_{i,i} = \sum_{j=1}^{s} |c_{j,i}|^2$. Since $[C'C]_{i,i} = 1$ for all $C \in U_{s,d}$ and $i = 1, \dots, d$, the entries of *C* are bounded.


#### *Appendix A.3. Proof of Theorem 2*


contains the free parameters of $\mathcal{B}_u$, the mapping $\theta_{B,p} \times \theta_{B,f} \to \mathcal{B}_u$ is continuous. The mapping $\theta_\bullet \to (\mathcal{A}_\bullet, \mathcal{B}_\bullet, \mathcal{C}_\bullet)$ is continuous (cf. Hannan and Deistler 1988, Theorem 2.5.3 (ii)). The mapping $\theta_{C,E} \times \theta_{C,G} \to \mathcal{C}_u$ consists of iterated applications of $C_O$ and $C_U$ (compare Lemmas 1 and 2), which are differentiable and thus continuous, and iterated applications of the extensions of the mappings $C_{O,d_2-d_1}$ and $C_{O,G}$ (compare Corollaries 1 and 2) to general unit root structures and to complex matrices. The proof that these functions are differentiable is analogous to the proofs of Lemma 1 and Lemma 2.

(iii) The definitions of $\theta_{B,f}$ and $\theta_{B,p}$ immediately imply that they depend continuously on $\mathcal{B}_u$. The parameter vector $\theta_\bullet$ depends continuously on $(\mathcal{A}_\bullet, \mathcal{B}_\bullet, \mathcal{C}_\bullet)$ (cf. Hannan and Deistler 1988, Theorem 2.5.3 (ii)). The existence of an open and dense subset of matrices $\mathcal{C}_u$ such that the mapping attaching parameters to the matrices is continuous follows from arguments contained in the proofs of Lemmas 1 and 2.

#### **Appendix B. Proofs of the Results of Section 4**

#### *Appendix B.1. Proof of Theorem 3*

For the first inclusion the proof can be divided into two parts, discussing the stable and the unstable subsystem separately. The result with regard to the stable subsystem is due to Hannan and Deistler (1988, Theorem 2.5.3 (iv)). For the unstable subsystem, $(\tilde{\Omega}_S, \tilde{p}) \le (\Omega_S, p)$ implies the existence of a matrix *S* as described in Definition 9. Partition $S = [S_1', S_2']'$ such that $S_1 p = p_1 \ge \tilde{p}$. Let $\tilde{k}$ be an arbitrary transfer function in $M_{\tilde{\Gamma}} = \pi(\Delta_{\tilde{\Gamma}})$ with corresponding state space realization $(\tilde{\mathcal{A}}, \tilde{\mathcal{B}}, \tilde{\mathcal{C}}) \in \Delta_{\tilde{\Gamma}}$. Then, we find matrices $B_1$ and $C_1$ such that for the state space realization given by $\mathcal{A} = S' \begin{bmatrix} \tilde{\mathcal{A}} & \tilde{J}_{12} \\ 0 & \tilde{J}_2 \end{bmatrix} S$, $\mathcal{B} = S' \begin{bmatrix} \tilde{\mathcal{B}} \\ B_1 \end{bmatrix}$ and $\mathcal{C} = [\tilde{\mathcal{C}}, C_1] S$ it holds that $(\mathcal{A}, \mathcal{B}, \mathcal{C}) \in \Delta_\Gamma$. Then, $(\mathcal{A}_j, \mathcal{B}_j, \mathcal{C}_j) = (\mathcal{A},\, S' \operatorname{diag}(I_{n_1}, j^{-1} I_{n_2}) S \mathcal{B},\, \mathcal{C}) \in \Delta_\Gamma$, where $n_i$ is the number of rows of $S_i$ for $i = 1, 2$, converges for $j \to \infty$ to $\big(\mathcal{A},\, S' \begin{bmatrix} \tilde{\mathcal{B}} \\ 0 \end{bmatrix},\, \mathcal{C}\big) \in \overline{\Delta_\Gamma}$, which is observationally equivalent to $(\tilde{\mathcal{A}}, \tilde{\mathcal{B}}, \tilde{\mathcal{C}})$. Consequently, $\tilde{k} = \pi\big(\mathcal{A}, S' \begin{bmatrix} \tilde{\mathcal{B}} \\ 0 \end{bmatrix}, \mathcal{C}\big) \in \overline{\pi(\Delta_\Gamma)}$.

To show the second inclusion, consider a sequence of systems $(\mathcal{A}_j, \mathcal{B}_j, \mathcal{C}_j) \in \Delta_\Gamma$, $j \in \mathbb{N}$, converging to $(A_0, B_0, C_0) \in \overline{\Delta_\Gamma}$. We need to show $\bar{\Gamma} \in \bigcup_{\tilde{\Gamma} \in \mathcal{K}(\Gamma)} \{\check{\Gamma} \le \tilde{\Gamma}\}$, where $\bar{\Gamma}$ is the multi-index corresponding to $(A_0, B_0, C_0)$.

For the stable system we can separate the subsystem $(A_{j,s}, B_{j,s}, C_{j,s})$ remaining stable in the limit and the part with eigenvalues of $A_j$ tending to the unit circle. As discussed in Section 4.1.2, $(A_{j,s}, B_{j,s}, C_{j,s})$ converges to the stable subsystem $(A_{0,\bullet}, B_{0,\bullet}, C_{0,\bullet})$, whose Kronecker indices can only be smaller than or equal to $\alpha_\bullet$ (cf. Hannan and Deistler 1988, Theorem 2.5.3).

The remaining subsystem consists of the unstable subsystem of $(\mathcal{A}_j, \mathcal{B}_j, \mathcal{C}_j)$, which converges to $(A_{0,u}, B_{0,u}, C_{0,u})$, and the second part of the stable subsystem containing all stable eigenvalues of $A_j$ converging to the unit circle. The limiting combined subsystem $(A_{0,c}, B_{0,c}, C_{0,c})$ is such that $A_{0,c}$ is block diagonal. If the limiting combined subsystem is minimal and $B_{0,u}$ has a structure corresponding to *p*, this shows that the pair $(\bar{\Omega}_S, \bar{p})$ extends $(\Omega_S, p)$ in accordance with the definition of $\mathcal{K}(\Gamma)$.

Since the limiting subsystem is not necessarily minimal and $B_{0,u}$ does not necessarily have a structure corresponding to *p*, eliminating coordinates of the state and adapting the corresponding structure indices *p* may result in a pair $(\bar{\Omega}_S, \bar{p})$ that is smaller than the pair $(\tilde{\Omega}_S, \tilde{p})$ corresponding to an element of $\mathcal{K}(\Gamma)$.

#### *Appendix B.2. Proof of Theorem 4*

The multi-index Γ contains three components: $\Omega_S$, *p*, $\alpha_\bullet$. For given $\Omega_S$ the selection of the structure indices $p_{\max}$ introducing the fewest restrictions, such that in its boundary all possible p.u.t. matrices occur, was discussed in Section 4.2. Choosing this maximal element $p_{\max}$ then implies that

all systems of given state space unit root structure correspond to a multi-index that is smaller than or equal to $(\Omega_S, p_{\max}, \beta_\bullet)$, where $\beta_\bullet$ is a Kronecker index corresponding to state space dimension $n_\bullet$. For the Kronecker indices of order $n_\bullet$ it is known that there exists one index $\alpha_{\bullet,g}$ such that $M_{\alpha_{\bullet,g}}$ is open and dense in $M_{n_\bullet}$. The set $M_{\Omega_S, p_{\max}, \beta_\bullet}$ is, therefore, contained in $\overline{M_{\Omega_S, p_{\max}, \alpha_{\bullet,g}}}$, which implies (14) with $\Gamma_g(\Omega_S, n_\bullet) := (\Omega_S, p_{\max}, \alpha_{\bullet,g})$.

For the second claim choose an arbitrary state space realization $(\mathcal{A}, \mathcal{B}, \mathcal{C})$ in canonical form such that $\pi(\mathcal{A}, \mathcal{B}, \mathcal{C}) \in M(\Omega_S, n_\bullet)$ for arbitrary $\Omega_S$. Define the sequence $(A_j, B_j, C_j)_{j \in \mathbb{N}}$ by $A_j = (1 - j^{-1})\mathcal{A}$, $B_j = (1 - j^{-1})\mathcal{B}$, $C_j = \mathcal{C}$. Then $|\lambda_{\max}(A_j)| < 1$ holds for all *j*, which implies $\pi(A_j, B_j, C_j) \in M_{\Gamma_{\alpha_\bullet,g}(n)}$ for every $n \ge n_u(\Omega_S) + n_\bullet$ and every *j*. The continuity of $\pi$ implies $\pi(\mathcal{A}, \mathcal{B}, \mathcal{C}) = \lim_{j \to \infty} \pi(A_j, B_j, C_j) \in \overline{M_{\Gamma_{\alpha_\bullet,g}(n)}}$.
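The shrinking construction above can be checked numerically. The following sketch uses a small self-chosen system with one unit eigenvalue (the matrices are purely illustrative, not from the paper): each scaled system $((1-j^{-1})A, (1-j^{-1})B, C)$ is strictly stable, and its Markov parameters $C A_j^k B_j$ converge, for each fixed lag $k$, to those of the original unit root system, which is the transfer-function convergence used in the proof.

```python
import numpy as np

# Sketch of the shrinking argument: scale (A, B) by r = 1 - 1/j to obtain
# strictly stable systems whose impulse responses converge to those of the
# original system with a unit root. Matrices are illustrative only.
A = np.array([[1.0, 1.0], [0.0, 0.5]])   # one unit eigenvalue, one stable one
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

def markov(A_, B_, C_, k):
    # k-th Markov parameter (impulse response coefficient) C A^k B
    return (C_ @ np.linalg.matrix_power(A_, k) @ B_).item()

for j in (10, 100, 1000):
    r = 1.0 - 1.0 / j
    assert np.max(np.abs(np.linalg.eigvals(r * A))) < 1.0   # strictly stable for every j

# pointwise (in the lag) convergence of Markov parameters as j grows
j = 10_000
r = 1.0 - 1.0 / j
for k in range(5):
    assert abs(markov(r * A, r * B, C, k) - markov(A, B, C, k)) < 1e-2
```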

#### *Appendix B.3. Proof of Theorem 5*

(i) Assume that there exists a sequence $k_i \in M_\Gamma$ converging to a transfer function $k_0 \in M_\Gamma$. For such a sequence the sizes of the Jordan blocks for every unit root are identical from some $i_0$ onwards, since eigenvalues depend continuously on the matrices (cf. Chatelin 1993): thus, the stable part of the transfer functions $k_i$ must converge to the stable part of the transfer function $k_0$, since the sum of the algebraic multiplicities of all eigenvalues inside the open unit disc cannot drop in the limit. Since $V_{\alpha_\bullet}$ (the set of all stable transfer functions with Kronecker index $\alpha_\bullet$) is open in $\overline{V_{\alpha_\bullet}}$ according to Hannan and Deistler (1988, Theorem 2.5.3), this implies that the stable part of $k_i$ has Kronecker index $\alpha_\bullet$ from some $i_0$ onwards.

For the unstable part of the transfer function note that in $M_\Gamma$, for every unit root $z_j$ and every *r*, the rank of $(\mathcal{A} - z_j I_n)^r$ is identical for all transfer functions. Thus, the maximum over $\overline{M_\Gamma}$ cannot be larger due to lower semi-continuity of the rank. It follows that for $k_i \to k_0$ the ranks of $(\mathcal{A} - z_j I_n)^r$ for all $|z_j| = 1$ and for all $r \in \mathbb{N}_0$ are identical to the ranks corresponding to $k_0$ from some point onwards, showing that $k_i$ has the same state space unit root structure as $k_0$ from some $i_0$ onwards. Finally, the p.u.t. structure of sub-blocks of $\mathcal{B}_k$ clearly introduces an open set, being defined via strict inequalities. This shows that $k_i \in M_\Gamma$ from some $i_0$ onwards, implying that $M_\Gamma$ is open in $\overline{M_\Gamma}$.

(ii) The first inclusion was shown in Theorem 3. Comparing Definitions 10 and 11 we see $\bigcup_{\tilde{\Gamma} \in \mathcal{K}(\Gamma_g)} M_{\tilde{\Gamma}} \subset \bigcup_{(\tilde{\Omega}_S, \tilde{n}_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)} M(\tilde{\Omega}_S, \tilde{n}_\bullet)$. By the definition of the partial ordering (compare Definition 9), $\bigcup_{\tilde{\Gamma} \le \Gamma_g} M_{\tilde{\Gamma}} \subset \bigcup_{(\tilde{\Omega}_S, \tilde{n}_\bullet) \le (\Omega_S, n_\bullet)} M(\tilde{\Omega}_S, \tilde{n}_\bullet)$ holds. Together these two statements imply the second inclusion.

$\bigcup_{(\tilde{\Omega}_S, \tilde{n}_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)} \bigcup_{(\check{\Omega}_S, \check{n}_\bullet) \le (\tilde{\Omega}_S, \tilde{n}_\bullet)} M(\check{\Omega}_S, \check{n}_\bullet) \subset \overline{M_{\Gamma_g(\Omega_S, n_\bullet)}}$ is a consequence of the following two statements:

$$\text{(a)}\quad \text{If } M(\tilde{\Omega}_S, \tilde{n}_\bullet) \subset \overline{M(\Omega_S, n_\bullet)}, \text{ then } \bigcup_{(\check{\Omega}_S, \check{n}_\bullet) \le (\tilde{\Omega}_S, \tilde{n}_\bullet)} M(\check{\Omega}_S, \check{n}_\bullet) \subset \overline{M(\Omega_S, n_\bullet)}.$$

$$\text{(b)}\quad \text{If } (\tilde{\Omega}_S, \tilde{n}_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet), \text{ then } M(\tilde{\Omega}_S, \tilde{n}_\bullet) \subset \overline{M(\Omega_S, n_\bullet)}.$$

For (a) note that for an arbitrary transfer function $\check{k} \in M(\check{\Omega}_S, \check{n}_\bullet)$ with $(\check{\Omega}_S, \check{n}_\bullet) \le (\tilde{\Omega}_S, \tilde{n}_\bullet)$ there is a multi-index $\check{\Gamma}$ such that $\check{k} \in M_{\check{\Gamma}}$. By the definition of the partial ordering (compare Definition 9) we find a multi-index $\tilde{\Gamma} \ge \check{\Gamma}$ such that $M_{\tilde{\Gamma}} \subset M(\tilde{\Omega}_S, \tilde{n}_\bullet)$. By Theorem 3 and the continuity of $\pi$ we have $M_{\check{\Gamma}} \subset \overline{\pi(\Delta_{\tilde{\Gamma}})} \subset \overline{M_{\tilde{\Gamma}}}$. Since $M(\tilde{\Omega}_S, \tilde{n}_\bullet) \subset \overline{M(\Omega_S, n_\bullet)}$ by assumption, $\check{k} \in \overline{M_{\tilde{\Gamma}}} \subset \overline{M(\tilde{\Omega}_S, \tilde{n}_\bullet)} \subset \overline{M(\Omega_S, n_\bullet)}$, which finishes the proof of (a).

With respect to (b) note that by Definition 11, $\mathcal{A}(\Omega_S, n_\bullet)$ contains transfer functions with two types of state space unit root structures. First, $\tilde{\mathcal{A}}_u$ corresponding to the state space unit root structure $\tilde{\Omega}_S$ may be of the form

$$S\tilde{\mathcal{A}}_u S' = \begin{bmatrix} \mathcal{A}_u & J_{12} \\ 0 & J_2 \end{bmatrix}. \tag{A1}$$

Second, $\check{\mathcal{A}}_u$ corresponding to the state space unit root structure $\check{\Omega}_S$ may be of the form (A1) where off-diagonal elements of $\mathcal{A}_u$ are replaced by zero. To prove (b) we need to show that in both cases the corresponding transfer function is contained in $\overline{M(\Omega_S, n_\bullet)}$.

We start by showing that in the second case the transfer function $\check{k}$ is contained in $\overline{M(\tilde{\Omega}_S, \tilde{n}_\bullet)}$, where $\tilde{\Omega}_S$ is the state space unit root structure corresponding to $\tilde{\mathcal{A}}_u$ in (A1). For this, consider the sequence

$$A_j = \begin{bmatrix} 1 & j^{-1} \\ 0 & 1 \end{bmatrix}, \quad B_j = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad C_j = \begin{bmatrix} C_1 & C_2 \end{bmatrix}.$$

Clearly, every system $(A_j, B_j, C_j)$ corresponds to an *I*(2) process, while the limit for $j \to \infty$ corresponds to an *I*(1) process. This shows that it is possible in the limit to trade one *I*(2) component for two *I*(1) components, leading to more transfer functions in the $T_{pt}$ closure of $M_{\Gamma_g(\Omega_S, n_\bullet)}$ than only the ones included in $\overline{\pi(\Delta_{\Gamma_g(\Omega_S, n_\bullet)})}$, where the off-diagonal entry in $A_j$ is restricted to equal one and hence the corresponding sequence of systems in the canonical form diverges to infinity. In a sense these systems correspond to "points at infinity": for the example given above we obtain the canonical form

$$\mathcal{A}_j = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad \mathcal{B}_j = \begin{bmatrix} B_1 \\ B_2/j \end{bmatrix}, \quad \mathcal{C}_j = \begin{bmatrix} C_1 & jC_2 \end{bmatrix}.$$

Thus, the corresponding parameter vector for the entries in $\mathcal{B}_{j,2}$ converges to zero, while the one corresponding to $\mathcal{C}_{j,2}$ diverges to infinity.
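The trade of one *I*(2) component for two *I*(1) components in this example can be verified numerically: the Jordan chain structure at $z = 1$ is read off from the ranks of $(A - I_2)^r$. The following minimal sketch checks that every $A_j = \begin{bmatrix} 1 & 1/j \\ 0 & 1 \end{bmatrix}$ carries one chain of length two, while the limit $A_0 = I_2$ carries two chains of length one.

```python
import numpy as np

# Ranks of (A - I)^r for r = 1, 2 reveal the Jordan chain structure at z = 1:
# ranks [1, 0] correspond to one chain of length two (I(2)),
# ranks [0, 0] correspond to two chains of length one (I(1), two common trends).
def ranks(A):
    return [int(np.linalg.matrix_rank(np.linalg.matrix_power(A - np.eye(2), r)))
            for r in (1, 2)]

for j in (1, 10, 1000):
    Aj = np.array([[1.0, 1.0 / j], [0.0, 1.0]])
    assert ranks(Aj) == [1, 0]       # along the sequence: one chain of length two
A0 = np.eye(2)
assert ranks(A0) == [0, 0]           # in the limit: two chains of length one
```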

Generalizing this argument shows that every transfer function corresponding to a pair $(\check{\Omega}_S, \check{n}_\bullet)$ in $\mathcal{A}(\tilde{\Omega}_S, \tilde{n}_\bullet)$, where $\check{\mathcal{A}}_u$ can be obtained by replacing off-diagonal entries of $\mathcal{A}_u$ with zero, can be reached from within $M(\tilde{\Omega}_S, \tilde{n}_\bullet)$.

To prove $\tilde{k} \in \overline{M(\Omega_S, n_\bullet)}$ in the first case, where the state space unit root structure is extended as visible in Equation (A1), consider the sequence:

$$
\tilde{A}_j = \begin{bmatrix} 1 & 1 \\ 0 & 1 - j^{-1} \end{bmatrix}, \quad \tilde{B}_j = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad \tilde{C}_j = \begin{bmatrix} C_1 & C_2 \end{bmatrix},
$$

corresponding to the following system in canonical form (except that the stable subsystem is not necessarily in echelon canonical form)

$$
\tilde{\mathcal{A}}_j = \begin{bmatrix} 1 & 0 \\ 0 & 1 - j^{-1} \end{bmatrix}, \quad \tilde{\mathcal{B}}_j = \begin{bmatrix} B_1 + jB_2 \\ -jB_2 \end{bmatrix}, \quad \tilde{\mathcal{C}}_j = \begin{bmatrix} C_1 & C_1 - C_2/j \end{bmatrix}.
$$
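The equivalence of the two realizations above follows from a change of basis, which can be checked numerically. The following sketch uses scalar stand-ins for the blocks $B_1, B_2, C_1, C_2$ (illustrative values of our own choosing) and verifies that both systems share the same Markov parameters $C A^k B$, i.e., realize the same transfer function.

```python
import numpy as np

# Verify that ([[1, 1], [0, 1 - 1/j]], [B1; B2], [C1  C2]) and its diagonalized
# counterpart (diag(1, 1 - 1/j), [B1 + j*B2; -j*B2], [C1  C1 - C2/j]) are
# observationally equivalent: identical Markov parameters C A^k B for all k.
j = 7
B1, B2, C1, C2 = 0.3, -1.1, 2.0, 0.5
A  = np.array([[1.0, 1.0], [0.0, 1.0 - 1.0 / j]])
B  = np.array([[B1], [B2]])
C  = np.array([[C1, C2]])
At = np.diag([1.0, 1.0 - 1.0 / j])
Bt = np.array([[B1 + j * B2], [-j * B2]])
Ct = np.array([[C1, C1 - C2 / j]])

for k in range(8):
    m  = (C  @ np.linalg.matrix_power(A,  k) @ B ).item()
    mt = (Ct @ np.linalg.matrix_power(At, k) @ Bt).item()
    assert abs(m - mt) < 1e-9    # same impulse response at every lag
```

The diverging entries $jB_2$ and $-jB_2$ in the canonical-form realization illustrate why such limits again correspond to "points at infinity" of the parameter space.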

This sequence shows that there exists a sequence of transfer functions corresponding to *I*(1) processes with one common trend that converges to a transfer function corresponding to an *I*(2) system. Again, in the canonical form this cannot happen, as there the (1, 2) entry of $\tilde{A}_j$ would be restricted to be equal to zero. At the same time note that the dimension of the stable subsystem is reduced due to one component of the state changing from the stable to the unit root part.

Now, for a unit root structure $\tilde{\Omega}_S$ such that $(\tilde{\Omega}_S, \tilde{n}_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)$, satisfying

$$S\tilde{\mathcal{A}}_u S' = \begin{bmatrix} \mathcal{A}_u & J_{12} \\ 0 & J_2 \end{bmatrix},$$

the Jordan blocks corresponding to $\Omega_S$ are sub-blocks of the ones corresponding to $\tilde{\Omega}_S$, potentially involving a reordering of coordinates using the permutation matrix *S*. Taking as the approximating sequence transfer functions $\tilde{k}_j \in M_{\Gamma_g(\Omega_S, n_\bullet)} \to k_0 \in M_{\Gamma_g(\tilde{\Omega}_S, \tilde{n}_\bullet)}$ that have the same structure $\tilde{\Omega}_S$ but with $J_2$ replaced by $\frac{j-1}{j} J_2$ leads to processes with state space unit root structure $\Omega_S$.

For the stable part of $\tilde{k}_j$ we can separate the part containing poles tending to the unit circle (contained in $J_2$) and the remaining transfer function $\tilde{k}_{j,s}$, which has Kronecker indices $\tilde{\alpha} \le \alpha_\bullet$. However, the results of Hannan and Deistler (1988, Theorem 2.5.3) then imply that the limit remains in $\overline{M_{\alpha_\bullet}}$ and hence allows for an approximating sequence in $M_{\alpha_\bullet}$.

Both results combined constitute the whole set of attainable state space unit root structures in Definition 11 and prove (b).

As follows from Corollary 4, $\overline{M(\Omega_S, n_\bullet)} = \overline{M_{\Gamma_g(\Omega_S, n_\bullet)}}$. Thus, (b) implies $\bigcup_{(\tilde{\Omega}_S, \tilde{n}_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)} M(\tilde{\Omega}_S, \tilde{n}_\bullet) \subset \overline{M_{\Gamma_g(\Omega_S, n_\bullet)}}$, and (a) adds the second union, showing the subset inclusion.

It remains to show equality for the last set inclusion. Thus, we need to show that for $k_j \in M_{\Gamma_g(\Omega_S, n_\bullet)}$, $k_j \to k_0$, it holds that $k_0 \in M(\tilde{\Omega}_S, \tilde{n}_\bullet)$, where $(\tilde{\Omega}_S, \tilde{n}_\bullet) \le (\check{\Omega}_S, \check{n}_\bullet) \in \mathcal{A}(\Omega_S, n_\bullet)$. To this end note that the rank of a matrix is a lower semi-continuous function, such that for a sequence of matrices $E_j$ with limit $E_0$ we have

$$\operatorname{rank}(\lim_{j \to \infty} E_j) = \operatorname{rank}(E_0) \le \liminf_{j \to \infty} \operatorname{rank}(E_j).$$

Then, consider a sequence $k_j(z) \in M_{\Gamma_g}(\Omega_S, n_\bullet)$, $j \in \mathbb{N}$. We can find a converging sequence of systems $(A_j, B_j, C_j)$ realizing $k_j(z)$. Therefore, choosing $E_j = (A_j - z_k I_n)^t$ we obtain that

$$\operatorname{rank}\big((A_0 - z_k I_n)^t\big) \;\le\; n - \sum_{r=1}^{t} d^k_{j,\,h_k-r+1}$$

since $k_j(z) \in M_{\Gamma_g}(\Omega_S, n_\bullet)$ implies that the number $d^k_{j,\,h_k-r+1}$ of generalized eigenvalues at the unit roots is governed by the entries of the state space unit root structure $\Omega_S$. This implies that $\sum_{r=1}^{t} d^k_{j,\,h_k-r+1} \le \sum_{r=1}^{t} d^k_{0,\,h_k-r+1}$ for $t = 1, 2, \dots, n$. Consequently, the limit has at least as many chains of generalized eigenvalues of each maximal length as dictated by the state space unit root structure $\Omega_S$ for each unit root of the limiting system.

Rearranging the rows and columns of the Jordan normal form using a permutation matrix $S$, it is then obvious that either the limiting matrix $A_0$ has additional eigenvalues, in which case

$$S' A_0 S = \begin{pmatrix} \mathcal{A}_j & \tilde J_{12} \\ 0 & \tilde J_2 \end{pmatrix}$$

must hold, or upper-diagonal entries in $\mathcal{A}_j$ must be changed from ones to zeros in order to convert some of the chains to lower order. One example in this respect was given above: for

$$A_j = \begin{pmatrix} 1 & 1/j \\ 0 & 1 \end{pmatrix}$$

the rank of $(A_j - I_2)^r$ is equal to 1 for $r = 1$ and 0 for $r = 2$. For the limit we obtain $A_0 = I_2$ and hence the rank is zero for $r = 1, 2$. The corresponding indices are $d^1_{j,1} = 1$, $d^1_{j,2} = 1$ for the approximating sequence and $d^1_{0,1} = 0$, $d^1_{0,2} = 2$ for the limit, respectively. Summing these indices starting from the last, one obtains $d^1_{j,2} = 1 \le d^1_{0,2} = 2$ and $d^1_{j,1} + d^1_{j,2} = 2 \le d^1_{0,1} + d^1_{0,2} = 2$.
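This rank behaviour is easy to verify numerically. A minimal numpy sketch (illustrative only; the matrices are exactly the ones of the example above):

```python
import numpy as np

I2 = np.eye(2)

# Approximating sequence A_j = [[1, 1/j], [0, 1]]: one Jordan chain of length 2,
# so (A_j - I_2) has rank 1 while (A_j - I_2)^2 vanishes.
for j in (10, 100, 1000):
    Aj = np.array([[1.0, 1.0 / j], [0.0, 1.0]])
    assert np.linalg.matrix_rank(Aj - I2) == 1
    assert np.linalg.matrix_rank((Aj - I2) @ (Aj - I2)) == 0

# Limit A_0 = I_2: two chains of length 1, the rank already vanishes for r = 1,
# illustrating rank(A_0 - I_2) <= liminf_j rank(A_j - I_2).
A0 = I2
assert np.linalg.matrix_rank(A0 - I2) == 0
```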

Hence the state space unit root structure corresponding to (*A*0, *B*0, *C*0) must be attainable according to Definition 11. The number of stable state components must decrease accordingly.

Finally, the limiting system $(A_0, B_0, C_0)$ is potentially not minimal. In this case the pair $(\tilde\Omega_S, \tilde n_\bullet)$ is reduced to a smaller one, concluding the proof.

#### **References**

Amann, Herbert, and Joachim Escher. 2008. *Analysis III*. Basel: Birkhäuser Basel.


Aoki, Masanao. 1990. *State Space Modeling of Time Series*. New York: Springer.


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Cointegration, Root Functions and Minimal Bases**

**Massimo Franchi 1,\* and Paolo Paruolo <sup>2</sup>**


**Abstract:** This paper discusses the notion of cointegrating space for linear processes integrated of any order. It first shows that the notions of (polynomial) cointegrating vectors and of root functions coincide. Second, it discusses how the cointegrating space can be defined (i) as a vector space of polynomial vectors over complex scalars, (ii) as a free module of polynomial vectors over scalar polynomials, or finally (iii) as a vector space of rational vectors over rational scalars. Third, it shows that a canonical set of root functions can be used as a basis of the various notions of cointegrating space. Fourth, it reviews results on how to reduce polynomial bases to minimal order—i.e., minimal bases. The application of these results to Vector AutoRegressive processes integrated of order 2 is found to imply the separation of polynomial cointegrating vectors from non-polynomial ones.

**Keywords:** VAR; cointegration; I(d); vector spaces

#### **1. Introduction**

In their seminal paper, Engle and Granger (1987) introduced the notion of cointegration and of cointegrating (CI) rank for processes integrated of order 1, or I(1). They did this in the following way:<sup>1</sup>

DEFINITION: The components of the vector $x_t$ are said to be *co-integrated of order d, b*, denoted $x_t \sim CI(d, b)$, if (i) all components of $x_t$ are $I(d)$; (ii) there exists a vector $\beta$ ($\neq 0$) so that $z_t = \beta' x_t \sim I(d-b)$, $b > 0$. The vector $\beta$ is called the *co-integrating vector*.

[...] If *xt* has *p* components, then there may be more than one co-integrating vector *β*. It is clearly possible for several equilibrium relations to govern the joint behavior of the variables. In what follows, it will be assumed that there are exactly *r* linearly independent co-integrating vectors, with *r* ≤ *p* − 1, which are gathered together into the *p* × *r* array *β*. By construction the rank of *β* will be *r* which will be called the "co-integrating rank" of *xt*.

Engle and Granger (1987) did not define explicitly the notion of cointegrating space, but just the cointegrating rank, which corresponds to its dimension; explicit mention of the cointegrating space was first made in Johansen (1988).

The Granger representation theorem in Engle and Granger (1987) showed that the cointegration matrix $\beta$ needs to be orthogonal to the Moving Average (MA) impact matrix of $\Delta x_t$. More precisely, for $\Delta x_t = C(L)\varepsilon_t$, the MA impact matrix $C(1)$ has rank equal to $p - r$ and representation $C(1) = \beta_\perp a'$, where $\beta_\perp$ is a basis of the orthogonal complement of the space spanned by the columns of $\beta$ and $a$ has full column rank.

Johansen (1991, 1992) stated the appropriate conditions under which the Granger representation theorem holds for I(1) and I(2) Vector AutoRegressive (VAR) processes $A(L)x_t = \varepsilon_t$, where the AR impact matrix $A(1)$ has rank equal to $r < p$ and rank factorization $A(1) = -\alpha\beta'$, with $\alpha$ and $\beta$ of full column rank. He defined the cointegrating space as the vector space generated by the column vectors $\beta_j$ of $\beta$ over the field of real numbers $\mathbb{R}$.

**Citation:** Franchi, Massimo, and Paolo Paruolo. 2021. Cointegration, Root Functions and Minimal Bases. *Econometrics* 9: 31. https://doi.org/ 10.3390/econometrics9030031

Academic Editor: Rocco Mosconi

Received: 9 December 2019; Accepted: 13 August 2021; Published: 17 August 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Johansen (1991) noted that $B = \operatorname{row}_{\mathbb{R}}(\beta')$ is uniquely defined<sup>2</sup> by the rank factorization $A(1) = -\alpha\beta'$, but the choice of basis $\beta$ is arbitrary, i.e., $\beta$ is not identified. Hypotheses that do not constrain B are hence untestable. He proposed likelihood ratio tests on B and described asymptotic properties of a just-identified version of $\beta'$. Later Johansen (1995) discussed the choice of basis $\beta$ as an econometric identification problem of a system of simultaneous equations (SSE) of cointegrating relations describing the long-run equilibria in the process. He discussed identification using linear restrictions, along the lines of the classical identification problem of SSE studied in econometrics since the early days of the Cowles Commission.

The observation in Johansen (1988) that the cointegrating vectors formed a vector space B was an important breakthrough. For instance, it addressed the question: 'How many cointegrating vectors should one estimate in a given system of dimension *p*?'. A proper answer is in fact: A set of *r* linearly independent vectors, spanning the cointegrating space B, i.e., a basis of B.

Similarly, when assuming that a set of $p$ interest rates is described by an I(1) process, the notion of cointegrating space B enables one to discuss questions like 'How should one test that all interest rate spreads are stationary?'. In fact, if all $\binom{p}{2} = p(p-1)/2$ interest rate differentials were stationary, then one should have cointegrating rank $r = p - 1$, which gives a first testable hypothesis on the cointegrating rank. Moreover, there is no need to test all possible interest rate differentials for stationarity: if the cointegrating rank has been found to be $p - 1$, one can test that the cointegrating space is spanned by any set of $r$ linearly independent contrasts between pairs of interest rates. If the cointegrating rank is found to be $0 < r < p - 1$, one may still want to test the restriction that the cointegrating space B is a subspace of the linear space spanned by all contrasts.
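The counting argument above can be illustrated with a few lines of linear algebra. The sketch below (with a hypothetical $p = 4$ rates, not tied to any data) checks that the $\binom{p}{2}$ pairwise contrasts span only a $(p-1)$-dimensional space, so that $r = p - 1$ adjacent spreads already form a basis:

```python
import numpy as np
from itertools import combinations

p = 4                                   # hypothetical number of interest rates
e = np.eye(p)

# All p(p-1)/2 pairwise spreads (contrasts) e_i - e_j
contrasts = np.array([e[i] - e[j] for i, j in combinations(range(p), 2)])
assert contrasts.shape[0] == p * (p - 1) // 2

# The contrasts span only a (p-1)-dimensional space
assert np.linalg.matrix_rank(contrasts) == p - 1

# The p-1 adjacent spreads form a basis: any other spread lies in their span
basis = np.array([e[i] - e[i + 1] for i in range(p - 1)])
target = e[0] - e[2]                    # a non-adjacent spread
coef, *_ = np.linalg.lstsq(basis.T, target, rcond=None)
assert np.allclose(basis.T @ coef, target)
```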

These questions, and many more, found clear answers thanks to the introduction of the notion of cointegrating space. The recognition that the set of cointegrating vectors forms a vector space was then instrumental to represent *any* cointegrating vector as a linear combination of the ones in a basis of the vector space.

The notion of cointegrating space, together with the complementary notion of attractor space, has been recently discussed in the context of functional time series for infinite dimensional Hilbert space valued AR processes with unit roots, see Beare et al. (2017), Beare and Seo (2020), Franchi and Paruolo (2020), and for infinite dimensional Banach space valued AR processes with unit roots, see Seo (2019).

For systems with variables integrated of order $d$, $I(d)$, with $d = 2, 3, \dots$, Granger and Lee (1989) and Engle and Yoo (1991) introduced the related notions of multicointegration and polynomial cointegration; see also Engsted and Johansen (2000). However, no proper discussion of cointegrating spaces or of their corresponding bases has been proposed in the literature for higher order systems.

The present paper closes this gap, making use of classical concepts in local spectral theory, see Gohberg et al. (1993). A central role is played by canonical systems of root functions, which have already been exploited in Franchi and Paruolo (2011, 2016) to characterize the inversion of a matrix function, and used in Franchi and Paruolo (2019) to derive the generalization of the Granger-Johansen representation theorem for I(*d*) processes.

In order to simplify exposition, this paper focuses on unit roots at a single point $z_\omega$, indexed by frequency $\omega$. When $\omega \notin \{0, \pi\}$, the resulting matrices are complex-valued, and the symbol $F$ is taken to indicate $\mathbb{C}$. For $\omega \in \{0, \pi\}$, $F$ is taken instead to indicate $\mathbb{R}$. Unit roots at distinct seasonal frequencies different from 0 have been considered e.g., in Hylleberg et al. (1990), Gregoir (1999), Johansen and Schaumburg (1998), Bauer and Wagner (2012). Several of these papers paired frequencies $\pm\omega$ when $\omega \notin \{0, \pi\}$ to obtain real coefficient matrices for Equilibrium Correction (EC) representations; in order to keep exposition as simple as possible, this is not attempted in the present paper.

To the best of the authors' knowledge, local spectral theory tools are employed here for the first time to discuss the definition of cointegrating space for $I(d)$ processes, $d > 1$, and related bases. It is observed that several candidate cointegrating spaces exist, corresponding to different choices of the sets of vectors and scalars. The sets of vectors are chosen here to be either the set of polynomial vectors or the one of rational vectors, while the scalars are taken to be (i) the field $F = \mathbb{R}, \mathbb{C}$, (ii) the ring of polynomials with coefficients in $F$ (denoted $F[z]$) or (iii) the field of rational functions with coefficients in $F$ (denoted $F(z)$). The resulting spaces are either vector spaces, in cases (i) and (iii), or a free module in case (ii). The relationship among their bases is discussed following Forney (1975), whose results are used to derive a polynomial basis of minimal degree—i.e., a minimal basis.

The focus of this paper is on the parsimonious representation of the set of cointegrating vectors. As noted by a referee, the present results may find application also in the parametrization and estimation of *I*(*d*) EC systems. This, however, is beyond the scope of the present paper.

The rest of the paper is organised as follows. Section 2 provides the motivation for the paper. Section 3 reports definitions of integration and cointegration in $I(d)$ systems, where the cointegrating vectors $\zeta(z) = \sum_{j=0}^{\infty} (z - z_\omega)^j \zeta_j$ are allowed to be vector functions; here $(z - z_\omega)$ and its powers are associated with the difference operator and its powers. Section 4 defines root functions and canonical systems of root functions and Section 5 discusses possible definitions of the cointegration space. Section 6 discusses how to derive bases for the various notions of cointegrating space from VAR coefficients. Section 7 discusses minimal bases using results in Forney (1975) and Section 8 applies these results in order to obtain a minimal basis in the $I(2)$ VAR case. Section 9 concludes; Appendix A reports background results.

#### **2. Motivation**

This section motivates the study of the representation of cointegrating vectors in terms of bases of suitable spaces, for systems integrated of order two, which are more formally introduced in Section 3 below. Let $x_t$ be a $p \times 1$ vector process, and let $\Delta = 1 - L$ and $L$ be the (0-frequency) difference and the lag operators. Assume that $x_t$ is integrated of order 2, I(2), with $\Delta^j x_t$ nonstationary for $j < 2$ and stationary for $j \ge 2$.

Mosconi and Paruolo (2017) consider the identification problem for the following cointegrating SSE with *I*(2) variables

$$\mathrm{ecm}_t = \xi(\Delta)' x_t \qquad \text{with} \qquad \xi(\Delta)' := \begin{pmatrix} \beta' + \upsilon'\Delta \\ \gamma'\Delta \\ \beta'\Delta \end{pmatrix} \begin{matrix} \scriptstyle r_0 \\ \scriptstyle r_1 \\ \scriptstyle r_0 \end{matrix}$$

The first set of $r_0$ polynomial vectors has coefficient $\beta'$ of order 0 (i.e., multiplying $\Delta^0$) and coefficient $\upsilon'$ of order 1 (i.e., multiplying $\Delta^1$). The last $r_0 + r_1$ polynomial vectors have 0 coefficients of order 0 and $\gamma'$ and $\beta'$ coefficients of order 1. They discussed identification of the SSE with respect to transformations corresponding to pre-multiplication of $\xi(\Delta)'$ (or $\mathrm{ecm}_t$) by a block triangular, nonsingular matrix of the form

$$Q = \begin{pmatrix} Q_{00} & Q_{0\gamma} & Q_{0\beta} \\ 0 & Q_{\gamma\gamma} & Q_{\gamma\beta} \\ 0 & 0 & Q_{00} \end{pmatrix},$$

where $Q_{ab}$ are blocks of real coefficients, $a, b \in \{0, \gamma, \beta\}$, with $Q_{00}$ and $Q_{\gamma\gamma}$ nonsingular square matrices.

They show that $Q\xi(\Delta)' = \xi^{\circ}(\Delta)'$ has the same structure as $\xi(\Delta)'$ in terms of the null coefficients of order 0 in the last $r_1 + r_0$ equations, as well as the same $\beta'$ block as the coefficient of order 0 in the first $r_0$ rows and as the coefficient of order 1 in the last $r_0$ rows.


**Remark 1** (*F*-linear combinations)**.** *Note that the Q linear combinations have scalars taken from $F = \mathbb{R}$, and that any CI vector can be obtained as a linear combination with coefficients in F of the rows in $\xi(\Delta)'$, called in the following 'F-linear combinations'.*

The main motivation to study the notion of cointegration space for *I*(*d*) processes with *d* ≥ 2 comes from the following observation.

**Remark 2** (*F*[Δ]-linear combinations)**.** *The set of CI vectors obtained as F-linear combinations of the rows in $\xi(\Delta)'$ can also be obtained by considering the alternative set of cointegrating vectors*

$$\zeta(\Delta)' := \begin{pmatrix} \beta' + \upsilon'\Delta \\ \gamma' \end{pmatrix} \begin{matrix} \scriptstyle r_0 \\ \scriptstyle r_1 \end{matrix}$$

*and choosing linear combinations with scalars in the set of polynomials $F[\Delta]$, where $a(z) \in F[z]$ has the form $a(z) = \sum_{j=0}^{n} a_j z^j$ for some finite n.*

*To show that the set of $F[\Delta]$-linear combinations of the rows of $\zeta(\Delta)'$ is the same as the set of F-linear combinations of the rows of $\xi(\Delta)'$, it is sufficient to show that the rows of $\xi(\Delta)'$ can be obtained as $F[\Delta]$-linear combinations of the rows in $\zeta(\Delta)'$, possibly up to terms of the type $c'\Delta^2$ which generate stationary processes by definition.*

*Note first that $\beta' + \upsilon'\Delta$ is common to $\xi(\Delta)'$ and $\zeta(\Delta)'$. In order to obtain $\gamma'\Delta$ in $\xi(\Delta)'$ from $\zeta(\Delta)'$ one needs to select the scalar $\Delta$ from $F[\Delta]$ and multiply it by $\gamma'$. Similarly, in order to obtain $\beta'\Delta$ in $\xi(\Delta)'$ one only needs to select the scalar $\Delta \in F[\Delta]$ and multiply it by $\beta' + \upsilon'\Delta$ to obtain $\beta'\Delta + \upsilon'\Delta^2$. Because $\Delta^2 x_t$ is stationary by the assumption that $x_t$ is I(2), the term $\upsilon'\Delta^2$ can be discarded, and this completes the argument.*

The take-away from Remark 2 is that, if one allows the set of multiplicative scalars to contain polynomials, i.e., if one moves from $F$-linear combinations to $F[\Delta]$-linear combinations, then one can reduce the number of rows needed to generate the set of CI vectors: $\xi(\Delta)'$ in fact has $2r_0 + r_1$ rows, while the number of rows in $\zeta(\Delta)'$ is $r_0 + r_1$.
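The bookkeeping of Remark 2 can be checked mechanically. In the sketch below (with hypothetical two-dimensional $\beta$, $\upsilon$, $\gamma$, not taken from the paper) a polynomial vector is stored as its list of Δ-coefficients, multiplication by the scalar Δ shifts the list, and Δ² terms are discarded as stationary:

```python
import numpy as np

# Hypothetical coefficient vectors (p = 2), for illustration only
beta = np.array([1.0, -1.0])
ups  = np.array([0.5,  0.0])
gam  = np.array([0.0,  2.0])

def mul_by_delta(poly):
    """Multiply a polynomial vector by the scalar Delta: shift coefficients up one order."""
    return [np.zeros_like(poly[0])] + list(poly)

def truncate(poly, order=2):
    """Drop coefficients of order >= `order`; for an I(2) process they give stationary terms."""
    return poly[:order]

# Rows of zeta(Delta)': beta' + ups'*Delta (r0 rows) and gamma' (r1 rows)
row_beta = [beta, ups]
row_gam  = [gam]

# F[Delta]-linear combinations reproduce the extra rows of xi(Delta)':
gamma_delta = truncate(mul_by_delta(row_gam))   # gamma'*Delta
beta_delta  = truncate(mul_by_delta(row_beta))  # beta'*Delta, ups'*Delta^2 discarded

assert np.allclose(gamma_delta[0], 0) and np.allclose(gamma_delta[1], gam)
assert np.allclose(beta_delta[0], 0) and np.allclose(beta_delta[1], beta)
```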

The previous discussion shows that the two sets, $F$ and $F[\Delta]$, could be used as possible sets of scalars in taking linear combinations. The first one, $F$, is a field (i.e., a commutative division ring); the second one, $F[\Delta]$, is a ring but not a field, because its nonzero elements do not in general have multiplicative inverses in $F[\Delta]$.

Given that vector spaces require the set of scalars to be a field, one may also consider another possible set of scalars, namely $F(\Delta)$, the set of rational functions of the type $a(\Delta) = c(\Delta)/d(\Delta)$ with $c(\Delta), d(\Delta) \in F[\Delta]$ and $d(\Delta)$ not identically equal to 0, written $d(\Delta) \not\equiv 0$. This leads to three possible choices for the set of scalars: (i) the field $F$, (ii) the ring $F[\Delta]$ and (iii) the field $F(\Delta)$. The rest of the paper discusses the relative merits of using each of them.

The above discussion focused on unit roots at $z = 1$, which are associated to the long run behavior of the process. When data are observed every month or quarter, seasonal unit roots, seasonal cointegration and seasonal error correction have been shown to be useful notions, see Hylleberg et al. (1990). For instance, in the case of quarterly series, the relevant seasonal unit roots are at $z = -1$ and at $z = \pm i$, where $i$ is the imaginary unit. These roots are represented as $z_\omega = \exp(i\omega)$ with $0 \le \omega < 2\pi$, where $z_\omega = 1, i, -1, -i$ correspond to $\omega = 0, \frac{1}{2}\pi, \pi, \frac{3}{2}\pi$.

Johansen and Schaumburg (1998) showed that the conditions under which a VAR process allows for seasonal integration (and cointegration) of order 1 are of the same type as for roots at *z* = 1, except that expansions of the VAR polynomial are performed around each *zω*, see their Theorem 3. They also provided the corresponding EC form in their Corollary 2; see also Bauer and Wagner (2012) and the discussion in Remark 9 below.

In general, the conditions for integration of any order $d$ at a point $z_\omega$ on the unit circle can be shown to be of the same type. This paper hence considers the generic case of a linear process with a generic root on the unit circle $z_\omega = \exp(i\omega)$, and discusses the notions of cointegration, root functions and minimal bases in this general context. This allows one to show that the present results hold for generic frequency $\omega$, $0 \le \omega < 2\pi$.

Incidentally, the results presented below in Section 6 state the generalization of the Granger and the Johansen Representation Theorems presented in Franchi and Paruolo (2019) for a generic unit root *z<sup>ω</sup>* = exp(*iω*) at any frequency *ω*.

#### **3. Setup and Definitions**

This section introduces notation and basic definitions of integrated and cointegrated processes.

#### *3.1. Linear Processes*

Assume that $\{\varepsilon_t, t \in \mathbb{Z}\}$ is a $p \times 1$ i.i.d. sequence, called a noise process,<sup>3</sup> with $E(\varepsilon_t) = 0$ and $E(\varepsilon_t \varepsilon_s') = \Omega\, 1_{s=t}$, where $1_{\cdot}$ is the indicator function, and define the linear process $u_t = \mu_t + C(L)\varepsilon_t$, where $\mu_t$ is a nonstochastic $p \times 1$ vector and $C(z) = \sum_{j=0}^{\infty} z^j C_j^{\circ}$ is a $p \times p$ matrix function, with coefficient matrices $C_j^{\circ} \in \mathbb{R}^{p \times p}$. Note that the matrices $C_j^{\circ}$ are defined by an expansion of $C(z)$ around $z = 0$. The term $\mu_t$ is nonstochastic, i.e., $E(\mu_t) = \mu_t$, and can contain deterministic terms. Because $E(\varepsilon_t) = 0$, one sees that $E(u_t) = \mu_t$, and hence in the following $u_t$ is often written as $u_t = E(u_t) + C(L)\varepsilon_t$.

The matrix function $C(z) = \sum_{j=0}^{\infty} z^j C_j^{\circ}$ is assumed to be finite for $z$ inside the open disk $D(0, 1+\eta)$, $\eta > 0$, in $\mathbb{C}$ with center at 0 and radius $1 + \eta > 1$, i.e., $C(z)$ is assumed analytic on $D(0, 1+\eta)$. Here and in the following $|\cdot|$ indicates the modulus and $D(z, \rho) := \{s \in \mathbb{C} : |s - z| < \rho\}$ indicates the open disk with center $z \in \mathbb{C}$ and radius $\rho > 0$. In this paper $C(z)$ is assumed to be regular on $D(0, 1+\eta)$, i.e., $C(z)$ can lose rank only at a finite number of isolated points in $D(0, 1+\eta)$.

Because of analyticity of $C(z)$, it can be expanded around any interior point of $D(0, 1+\eta)$. In particular, define the point $z_\omega := e^{i\omega}$ on the unit circle at frequency $\omega$, $\omega \in [0, 2\pi)$, and observe that it lies inside $D(0, 1+\eta)$ because $\eta > 0$. Hence one can expand $C(z)$ as $C(z) = \sum_{j=0}^{\infty} (z - z_\omega)^j C_j$ on $D(z_\omega, \eta)$, $\eta > 0$. Note that the matrices $C_j$ are defined by an expansion of $C(z)$ around $z = z_\omega$, but that the dependence of $C_j$ on $\omega$ is not included in the notation for simplicity. The analysis of the properties of $C(z)$ is done locally around $z = z_\omega$ on $D(z_\omega, \eta)$, $\eta > 0$.

Similarly to $C(z)$, one can consider a scalar function of $z$, $a(z)$ say, or a $1 \times p$ vector function $b(z)'$, taken to be analytic on $D(z_\omega, \eta)$, $\eta > 0$. This means that $a(z)$ has representation $a(z) = \sum_{j=0}^{\infty} (z - z_\omega)^j a_j$ around $z_\omega$, and similarly for $b(z)'$. A special case is when $a(z)$ is a polynomial of degree $k$, $a(z) = \sum_{j=0}^{k} (z - z_\omega)^j a_j$, which corresponds to setting all $a_j = 0$ for $j > k$. Another special case is given by rational functions $a(z) = c(z)/d(z)$ with $c(z)$ and $d(z)$ polynomials, where $d(z) \not\equiv 0$ and $z_\omega$ is not a root of $d(z)$. Similarly for $b(z)'$.

#### *3.2. Integration*

The following definition specifies the $I_\omega(0)$ class of processes as a subset of all linear processes built from the noise sequence $\varepsilon_t$, and introduces the notion of $I_\omega(d)$ processes using the difference operator at frequency $\omega$, $\Delta_\omega := 1 - e^{-i\omega}L = 1 - z_\omega^{-1}L$. To simplify notation, the dependence of $\Delta_\omega$ on the lag operator $L$ is left implicit. Observe also that, because $z_\omega = e^{i\omega} \neq 0$, $z - z_\omega$ in the analytic expansions can be expressed as $(-z_\omega)(1 - z/z_\omega)$, where $(1 - z/z_\omega)$ corresponds to the operator $\Delta_\omega$.
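The action of $\Delta_\omega$ can be illustrated numerically. The sketch below (with the arbitrary choice $\omega = \pi/2$) simulates a complex-valued process with a single unit root at frequency $\omega$ and checks that $\Delta_\omega = 1 - z_\omega^{-1}L$ reduces it to the driving noise:

```python
import numpy as np

rng = np.random.default_rng(0)
T, omega = 500, np.pi / 2               # illustrative frequency: unit root at z = i
z_om = np.exp(1j * omega)

# Process with a single unit root at frequency omega: x_t = z_om^{-1} x_{t-1} + eps_t
eps = rng.standard_normal(T) + 1j * rng.standard_normal(T)
x = np.zeros(T, dtype=complex)
for t in range(1, T):
    x[t] = x[t - 1] / z_om + eps[t]

# Delta_omega x_t = x_t - z_om^{-1} x_{t-1} removes the root and returns the noise
dx = x[1:] - x[:-1] / z_om
assert np.allclose(dx, eps[1:])
```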

Next, the definition of order of integration is introduced; this is defined as the difference between two nonnegative integer exponents $d_1$ and $d_2$ of $\Delta_\omega$ in the representation that links the process $x_t$ with its driving linear process $u_t$. This definition allows $x_t$ to be integrated of negative order.

**Definition 1** (Integrated processes at frequency $\omega$)**.** *Let $C(z)$ be analytic on $D(0, 1+\eta)$, $\eta > 0$, and let $\varepsilon_t$ be a noise process. If $\{u_t, t \in \mathbb{Z}\}$ satisfies $u_t = E(u_t) + C(L)\varepsilon_t$, then $u_t$ is called a linear process; if, in addition,*

$$C(z_{\omega}) \neq 0, \qquad z_{\omega} = \exp(i\omega), \qquad 0 \le \omega < 2\pi,\tag{1}$$

*then ut is said to be integrated of order zero at frequency ω, indicated ut* ∼ *Iω*(0)*.*

*Let $d_1, d_2 \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$ be finite non-negative integers; if $\{x_t, t \in \mathbb{Z}\}$ satisfies $\Delta_\omega^{d_1}(x_t - E(x_t)) = \Delta_\omega^{d_2}(u_t - E(u_t))$, where $u_t \sim I_\omega(0)$, then $x_t$ is said to be integrated of order $d := d_1 - d_2$ at frequency $\omega$, indicated $x_t \sim I_\omega(d)$; in this case $x_t$ has representation*

$$
\Delta_{\omega}^{d_1}(\mathbf{x}_t - \mathbf{E}(\mathbf{x}_t)) = \Delta_{\omega}^{d_2} \mathbf{C}(L)\varepsilon_t, \tag{2}
$$

*where $C(z)$ is analytic on $D(0, 1+\eta)$, $\eta > 0$, and $C(z_\omega) \neq 0$.*

**Remark 3** (Negative orders)**.** *When $d_1 < d_2$, the integration order $d := d_1 - d_2$ is negative. Note also that Definition 1 avoids defining the operator $\Delta_\omega^{-1}$; see however Equations (5) and (6) below.*

**Remark 4** (Mean-0 linear process)**.** *The linear process $u_t$ in Definition 1 can have any expectation $E(u_t)$, which, however, does not play any role in the definition of the $x_t$ process. Hence, one can assume that $E(u_t) = 0$ in Definition 1 without loss of generality.*

**Remark 5** ($E(x_t)$ in Definition 1)**.** *Assume $x_t = \cos(2t) + \exp(-3t) + C(L)\varepsilon_t$ with $C(z)$ analytic on $D(0, 1+\eta)$, $\eta > 0$, and $C(z_\omega) \neq 0$, $z_\omega := e^{i\omega}$. Then $E(x_t) = \cos(2t) + \exp(-3t)$ and Definition 1 implies that $x_t$ is $I_\omega(0)$. This example shows that the presence of $E(x_t)$ in Equation (2) allows one to concentrate attention on the stochastic part of the process $x_t$.*

**Remark 6** (Preference for low $d_1$, $d_2$)**.** *Assume for instance that (2) is satisfied for $(d_1, d_2) = (1, 0)$, and observe that this implies that (2) is satisfied for $(d_1, d_2) = (1+m, m)$ for any $m \in \mathbb{N}$. In the following, preference is given to the minimal pair $(d_1, d_2)$ for which (2) is satisfied, i.e., to $(d_1, d_2) = (1, 0)$ in the example.*

*Leading cases are the ones where either $d_1$ or $d_2$ equals 0. Specifically, when $0 = d_1 < d_2$, $d = d_1 - d_2 = -d_2$ is negative, and (2) reads*

$$
\mathbf{x}_t - \mathbf{E}(\mathbf{x}_t) = \Delta_{\omega}^{d_2} \mathbf{C}(L)\boldsymbol{\varepsilon}_t. \tag{3}
$$

*When $d_1 \ge d_2 = 0$ and hence $d = d_1 - d_2 = d_1$ is nonnegative, (2) reads*

$$
\Delta\_{\omega}^{d\_1}(\mathbf{x}\_t - \mathbf{E}(\mathbf{x}\_t)) = \mathbf{C}(L)\varepsilon\_t. \tag{4}
$$

**Remark 7** (Example of $I_0(-1)$)**.** *As an example, consider the process $x_t = C(L)\varepsilon_t$ with $C(L) = 1 - L$. Setting $\omega = 0$ one finds that Equation (2) is satisfied with $d = -1$, i.e., that the process is $I_0(-1)$. Selecting any other frequency $0 < \omega < 2\pi$, one sees that Equation (2) is satisfied for $d = 0$, i.e., that the order of integration is 0, i.e., $I_\omega(0)$ for $0 < \omega < 2\pi$. This illustrates the fact that a process may have different orders of integration at different frequencies.*
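The over-differenced process of Remark 7 can be simulated directly. The sketch below checks that summing the $I_0(-1)$ process at frequency 0 yields $\varepsilon_t - \varepsilon_0$, a process integrated of order 0:

```python
import numpy as np

rng = np.random.default_rng(2)
eps = rng.standard_normal(301)          # eps_0, ..., eps_300
x = eps[1:] - eps[:-1]                  # x_t = (1 - L) eps_t, an I_0(-1) process

# Summation at frequency 0 (cumulative sum) undoes the over-differencing
s = np.cumsum(x)                        # sum_{j=1}^{t} x_j
assert np.allclose(s, eps[1:] - eps[0]) # = eps_t - eps_0
```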

**Remark 8** ($t \in \mathbb{Z}$ versus $t \in \mathbb{N}_0$)**.** *Consider the process $x_t = c + \sum_{j=1}^{t} \varepsilon_j$ defined only for $t \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$, which satisfies $\Delta_0(x_t - c) = \varepsilon_t$ for $t \in \mathbb{N}$. Consider another process $\{x_t', t \in \mathbb{Z}\}$ satisfying the same equation $\Delta_0(x_t' - c) = \varepsilon_t$ for $t \in \mathbb{Z}$ with $x_t = x_t'$ for $t \in \mathbb{N}_0$. The process $\{x_t', t \in \mathbb{Z}\}$ is $I_0(1)$ according to Definition 1, and it is suggested to extend this qualification to $x_t$, because it coincides with the $x_t'$ process on the non-negative integers, $x_t = x_t'$ for $t \in \mathbb{N}_0$.*

**Remark 9** (One or more frequencies)**.** *Definition 1 of integration refers to a single frequency $\omega$, but it can be used to cover multiple frequencies. In fact, consider the 'ARMA process with unit root structure', as defined in Bauer and Wagner (2012), i.e., a process $x_t$ satisfying $D(L)x_t = v_t$ where $D(L) := \prod_{j=1}^{n} \Delta_{\omega_j}^{m_j}$ for a (finite) set of frequencies $\omega_1, \dots, \omega_n$, with $v_t$ a stationary ARMA process $v_t = C(L)\varepsilon_t$ with $C(\exp(i\omega_j)) \neq 0$. They call $\{(\omega_j, m_j), j = 1, \dots, n\}$ the 'unit root structure' of $x_t$, see their Definition 2. This can be obtained using Definition 1 for each $\omega_j$ in turn, noting that $v_t$ being ARMA corresponds to a rational $C(z)$, which is a special case of the definition above.*

*Hylleberg et al. (1990), Gregoir (1999), Johansen and Schaumburg (1998) and Bauer and Wagner (2012) consider $x_t$ to be real-valued, which implies that integration frequencies $\pm\omega_j$ are 'paired', so that if $\exp(i\omega_j)$ is a unit root of the process, so is $\exp(-i\omega_j)$; in this case one can pair frequencies $\pm\omega_j$ with $0 < \omega_j < \pi$ and rearrange coefficients so as to obtain real coefficient matrices in EC representations. This is not done in this paper for reasons of simplicity.*

**Remark 10** (Relation with other definitions)**.** *The definition of an $I_\omega(0)$ (respectively an $I_\omega(d)$) process in the present Definition 1 coincides with Definition 3.2 (respectively Definition 3.3) in Johansen (1996) when setting $\omega = 0$ (respectively $\omega = 0$ and $d_2 = 0$). The present definition also agrees with Definitions 2.1 and 2.2 of integration in Gregoir (1999), both for positive and negative orders and any frequency $\omega$. The definition also agrees with the one in Franchi and Paruolo (2019) when applied to vector processes.*

**Remark 11** (Entries in $C(z)$)**.** *When $\omega$ differs from $0$ or $\pi$, the point $z_\omega = e^{i\omega}$ has a nonzero imaginary part; hence the matrix $C(z_\omega)$ in (1) has complex entries, and the coefficient matrices $C_j$ in the expansion $C(z) = \sum_{j=0}^{\infty}(z - z_\omega)^j C_j$ are complex even when the coefficients in the expansion around $z = 0$ are real.*

Following Gregoir (1999), the summation operator at frequency *ω* is defined as

$$\mathcal{S}_{\omega}u_{t} := \mathbf{1}_{t>0} \sum_{j=1}^{t} u_{j}\, e^{-i\omega(t-j)} - \mathbf{1}_{t<0} \sum_{j=t+1}^{0} u_{j}\, e^{-i\omega(t-j)}.\tag{5}$$

Basic properties of the operator are proved in Gregoir (1999); these include

$$
\Delta_{\omega} \mathcal{S}_{\omega} u_t = u_t, \qquad \mathcal{S}_{\omega} \Delta_{\omega} u_t = u_t - u_0 e^{-i\omega t}, \tag{6}
$$

where {*ut*, *<sup>t</sup>* <sup>∈</sup> <sup>Z</sup>} is any sequence over <sup>Z</sup>.
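The identities in (6) can be verified numerically on a finite sample. The following sketch (Python with NumPy; it assumes $\Delta_\omega u_t = u_t - e^{-i\omega}u_{t-1}$, as suggested by Remark 19, and works on the one-sided grid $t = 0, 1, \dots, T$, so the $t < 0$ branch of (5) is not exercised) implements $\mathcal{S}_\omega$ and $\Delta_\omega$ and checks both identities on simulated complex data.

```python
import numpy as np

def S_omega(u, omega):
    # summation operator (5) on the one-sided grid t = 0, 1, ..., T
    T = len(u) - 1
    out = np.zeros(T + 1, dtype=complex)
    for t in range(1, T + 1):
        j = np.arange(1, t + 1)
        out[t] = np.sum(u[j] * np.exp(-1j * omega * (t - j)))
    return out

def Delta_omega(u, omega):
    # difference operator at frequency omega: u_t - e^{-i omega} u_{t-1}
    out = np.empty(len(u), dtype=complex)
    out[0] = np.nan          # undefined without u_{-1}; never used below
    out[1:] = u[1:] - np.exp(-1j * omega) * u[:-1]
    return out

rng = np.random.default_rng(0)
omega = 2 * np.pi / 4        # a seasonal frequency
u = rng.standard_normal(51) + 1j * rng.standard_normal(51)
t = np.arange(51)

lhs1 = Delta_omega(S_omega(u, omega), omega)[1:]
lhs2 = S_omega(Delta_omega(u, omega), omega)[1:]
print(np.allclose(lhs1, u[1:]))                                        # first identity in (6)
print(np.allclose(lhs2, u[1:] - u[0] * np.exp(-1j * omega * t[1:])))   # second identity in (6)
```

Both checks print `True`; the second shows the initial-value term $u_0 e^{-i\omega t}$ that Remark 12 exploits below.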

**Remark 12** (Simplifications of $\Delta_\omega$ and initial values)**.** *Take $d_1 = d_2 = 1$ in (2), which in this case reads $\Delta_\omega x_t = \Delta_\omega u_t$ with $u_t \sim I_\omega(0)$. Applying the $\mathcal{S}_\omega$ operator on both sides one obtains $x_t - x_0 e^{-i\omega t} = u_t - u_0 e^{-i\omega t}$.<sup>4</sup> If one assigns the initial value $x_0$ equal to $u_0$, one obtains $x_t = u_t$, which corresponds to the cancellation of $\Delta_\omega$ from both sides of (2). The same reasoning applies for generic $d_1, d_2 > 0$ to the cancellation of $\Delta_\omega^{\min(d_1, d_2)}$ from both sides of (2). This shows that one can cancel powers of $\Delta_\omega$ on both sides of (2) by properly assigning initial values; this cancellation is always implicitly performed in the following, in line with the preference for minimal values of $d_1, d_2$ discussed in Remark 6.*

#### *3.3. Cointegration*

Cointegration is the property of (possibly polynomial) linear combinations of $x_t$ having a lower order of integration than the original order of integration of $x_t$ at frequency $\omega$. Specifically, consider a nonzero $1 \times p$ row vector function $\zeta(z)' = \sum_{j=0}^{\infty}\zeta_j'(z - z_\omega)^j$, analytic on a disk $D(z_\omega, \eta)$, $\eta > 0$. As in Engle and Granger (1987), the idea is to call $\zeta(L)'$ cointegrating if $\zeta(L)' x_t$ has lower order of integration than $x_t$, excluding cases such as $\zeta(L)' = \Delta_\omega a'$ where $a'$ by itself does not reduce the order of integration.

This leads to the following definition.

**Definition 2** (Cointegrating vector at frequency *ω*)**.** *Let xt* ∼ *Iω*(*d*) *be as in Definition* 1*, i.e.,*

$$
\Delta_{\omega}^{d_1}(x_t - \mathrm{E}(x_t)) = \Delta_{\omega}^{d_2} C(L)\varepsilon_t,
$$

*where $d := d_1 - d_2$, $C(z)$ is analytic on $D(0, 1 + \eta)$, $\eta > 0$, and $C(z_\omega) \neq 0$, see (2); let also $\zeta(z)' = \sum_{j=0}^{\infty}(z - z_\omega)^j \zeta_j'$ be a $1 \times p$ row vector function, analytic on $D(z_\omega, \eta)$ with $\zeta(z_\omega)' = \zeta_0' \neq 0'$. Then $\zeta(z)'$ is called a cointegrating vector at frequency $\omega$ if $\zeta(L)' x_t \sim I_\omega(d - s)$ for some $s \in \mathbb{N}$, i.e.,*

$$\zeta(L)'\Delta_{\omega}^{d_1}(x_t - \mathrm{E}(x_t)) = \Delta_{\omega}^{d_2 + s}\, g(L)'\varepsilon_t, \tag{7}$$

*where $g(z)'$ is analytic on $D(z_\omega, \eta)$, $\eta > 0$, and $g(z_\omega)' \neq 0'$. Given Equation (2), Equation (7) is equivalent to the condition*

$$\zeta(L)' C(L) = \Delta_{\omega}^{s}\, g(L)', \qquad g(z_{\omega})' \neq 0'. \tag{8}$$

*The positive integer $s \in \mathbb{N}$ is called the order of the cointegrating vector $\zeta(z)'$ of $C(z)$ at $z_\omega$. $x_t$ is said to be cointegrated at frequency $\omega$ if any cointegrating vector $\zeta(z)' = \sum_{j=0}^{\infty}(z - z_\omega)^j \zeta_j'$ can be replaced by $\zeta(z_\omega)' = \zeta_0'$ without decreasing the order $s$ in (8); otherwise $x_t$ is said to be multicointegrated at frequency $\omega$.*

**Remark 13** ($C(z)$ has full rank on $D(z_\omega, \eta)$, $\eta > 0$, except at $z = z_\omega$)**.** *Because cointegrating vectors are by definition different from zero at $z_\omega$, $x_t$ is cointegrated at frequency $\omega$ if and only if $C(z_\omega)$ has reduced rank. Moreover, because $C(z)$ is regular on $D(0, 1 + \eta)$, the point $z_\omega$ is isolated, i.e., $C(z)$ has full rank on $D(z_\omega, \eta)$, $\eta > 0$, except at $z = z_\omega$.*

**Remark 14** (Entries in cointegrating vectors)**.** *Similarly to Remark 11, the coefficient vectors $\zeta_j'$ in the expansion $\zeta(z)' = \sum_{j=0}^{\infty}(z - z_\omega)^j \zeta_j'$ are in general complex. Note that $\zeta(L)' = \Delta_\omega a'$ does not satisfy the definition, because the requirement $\zeta(z_\omega)' = \zeta_0' \neq 0'$ is not satisfied.*

**Remark 15** ($d$ and $s$)**.** *Recall that $d$ (the order of integration) is the difference between the exponents of $\Delta_\omega$ on the l.h.s. and r.h.s. of (2). When (2) is pre-multiplied by $\zeta(L)'$, the exponent on the r.h.s. increases by $s$ and the difference of the exponents on the l.h.s. and r.h.s. of (7) becomes $d - s$. Because $\zeta_0' \neq 0'$, this can only happen if $\zeta(L)'$ factors $\Delta_\omega^s$ from $C(L)$, see (8). The condition $g(z_\omega)' \neq 0'$ guarantees that no remaining additional power of $\Delta_\omega$ can be factored from $C(L)$ using $\zeta(L)'$.*

**Remark 16** (Examples of cointegrating vectors)**.** *Take $\zeta(L)' = \zeta_0'$ with $\zeta_0$ chosen in $(\operatorname{col} C(1))_\perp$, and note that this implies $s \geq 1$ in (7). This shows that the definition contains the $I_0(1)$ definition of cointegrating vectors as a special case.*

The usual definition of cointegration, see Definition 3.4 in Johansen (1996), considers a $p \times 1$ process $x_t \sim I_0(1)$ and defines $x_t$ as cointegrated with cointegrating vector $\zeta \neq 0$ if $\zeta' x_t$ "can be made stationary by a suitable choice of initial distribution". The following proposition clarifies that his definition coincides with the one in this paper.

**Proposition 1** (Relation with Definition 3.4 in Johansen (1996))**.** *$\zeta$ is a cointegrating vector in the sense of Definition 3.4 in Johansen (1996) if and only if Definition 2 is satisfied with $\omega = 0$, $\zeta(z)' = \zeta'$, $d = 1$, and $s \in \mathbb{N}$.*

**Proof.** For simplicity and without loss of generality, set $\mathrm{E}(x_t) = 0$ and omit the subscript $\omega = 0$. Assume Definition 2 is satisfied with $\omega = 0$, $\zeta(z)' = \zeta'$, $d = 1$, and $s \in \mathbb{N}$, i.e.,

$$
\Delta\, \zeta' x_t = \Delta^{s} g(L)' \varepsilon_t \tag{9}
$$

see Remark 12, and set $v_t := \Delta^{s-1} g(L)'\varepsilon_t$. Applying $\mathcal{S}$ to both sides of Equation (9) one finds $\zeta' x_t - \zeta' x_0 = v_t - v_0$. Note that $v_t$ is stationary for any $s \in \mathbb{N}$, and hence the initial value $\zeta' x_0$ can be chosen equal to $v_0$, so as to obtain $\zeta' x_t = v_t$, a stationary process.

Conversely, assume that $\zeta$ is a cointegrating vector in the sense of Definition 3.4 in Johansen (1996). Because $x_t \sim I(1)$, one has $\Delta x_t = C(L)\varepsilon_t$, see Definition 1, with $C(z)$ analytic on a disk $D(0, 1 + \eta)$, $\eta > 0$, which admits the expansion $C(z) = C + \widetilde{C}(z)(1 - z)$ around 1, where $\widetilde{C}(z)$ is analytic on the same disk. A necessary and sufficient condition for cointegration in the sense of Definition 3.4 in Johansen (1996) is that $\zeta' C = 0$, as shown in Johansen (1988), Equation (17); see also Engle and Granger (1987, p. 256).<sup>5</sup> Hence one finds $\zeta' \Delta x_t = \Delta g(L)'\varepsilon_t$ with $g(z)' := \zeta' \widetilde{C}(z)$, which is analytic on $D(0, 1 + \eta)$, $\eta > 0$, and hence also on $D(1, \eta)$, $\eta > 0$. By Corollary 1 below, $g(z)'$ satisfies $g(z)' = \Delta^m \widetilde{g}(z)'$ with finite $m \in \mathbb{N}_0$ and $\widetilde{g}(z_\omega)' \neq 0'$. This shows that Definition 2 is satisfied with $\zeta(z)' = \zeta'$, $d = 1$, and $s = m + 1 \in \mathbb{N}$.

**Remark 17** ($\zeta' x_t$ can have negative order of integration)**.** *Johansen (1996) makes the following observation just after his Definition 3.4: "Note that $\zeta' x_t$ need not be I(0)", which recognises that $\zeta' x_t$ can have negative order of integration. This is indeed the case when $s = 2, 3, \dots$ in Definition 2, because then $\zeta' x_t \sim I(1 - s)$.*

**Remark 18** (Relation to other definitions in the literature)**.** *The definition of cointegration in Engle and Granger (1987) reported in the introduction is a special case of the present one with $\zeta(z)' = \zeta_0'$ a constant vector and $\omega = 0$, under the additional requirement that all variables are integrated of the same order. For more details on this for the case $\omega = 0$, see Franchi and Paruolo (2019). When $s > 1$ and $\omega = 0$, Definition 2 covers the definitions of multicointegration and polynomial cointegration in Granger and Lee (1989), Engle and Yoo (1991), Johansen (1996). When $s = 1$ and $\omega = 2\pi j/n$ for $j = 1, \dots, n$, where $n$ is the number of seasons, the definition covers seasonal cointegration in Hylleberg et al. (1990), Johansen and Schaumburg (1998).*

**Example 1** (I(1) VAR)**.** *Following Johansen (1988), consider $A(L)x_t = \varepsilon_t$ with $A(z) = I - \sum_{j=1}^{k} A_j z^j$ analytic on $\mathbb{C}$. Assume also that $\det A(z) = 0$ has only solutions outside $D(0, 1 + \eta)$, $\eta > 0$, or at $z = 1$, where '$\det$' indicates the determinant of a matrix. Here and in the following, let $a_\perp$ indicate a basis of the orthogonal complement of the linear space spanned by the columns of the matrix $a$. Moreover, $P_a := a(a'a)^{-1}a'$ for a full-column-rank matrix $a$ is the orthogonal projection matrix onto $\operatorname{col}(a)$. Johansen (1991) (see his Equations (4.3) and (4.4) in Theorem 4.1) showed that for $x_t$ to be I(1) at frequency $\omega = 0$, a set of necessary and sufficient conditions is:*

*(i) $A(1) = -\alpha_0\beta_0'$ with $\alpha_0$, $\beta_0$ full column rank matrices of dimension $p \times r_0$, $r_0 < p$,*

*(ii) $P_{\alpha_{0\perp}} A_1 P_{\beta_{0\perp}} = -\alpha_1\beta_1'$ of maximal rank $r_1 = p - r_0$.*

*In this case $x_t$ satisfies (2) for $d_1 = 1$, $d_2 = 0$, and $\zeta(L)' = \zeta'$ can be taken to be any row vector in $\mathcal{B} = \operatorname{row}_F(\beta_0')$ with $F = \mathbb{R}$.*
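Conditions (i) and (ii) can be checked numerically for a concrete system. The sketch below (Python with NumPy) uses a hypothetical bivariate ECM $\Delta x_t = \Pi x_{t-1} + \varepsilon_t$ with $\Pi = \alpha_0\beta_0'$, so $A(z) = I - (I + \Pi)z$ and $A(1) = -\Pi$; the matrix $A_1$ of condition (ii) is taken here to be the derivative coefficient $\dot{A}(1) = -(I + \Pi)$, an assumption consistent with the expansion (15) around $z_\omega = 1$.

```python
import numpy as np

# Hypothetical ECM: Delta x_t = Pi x_{t-1} + eps_t with Pi = alpha0 beta0'
alpha0 = np.array([[-0.5], [0.0]])
beta0 = np.array([[1.0], [-1.0]])
Pi = alpha0 @ beta0.T

def perp(a):
    # orthonormal basis of the orthogonal complement of col(a)
    u = np.linalg.svd(a, full_matrices=True)[0]
    return u[:, a.shape[1]:]

def proj(a):
    # orthogonal projection onto col(a): P_a = a (a'a)^{-1} a'
    return a @ np.linalg.solve(a.T @ a, a.T)

# condition (i): A(1) = -Pi = -alpha0 beta0' with r0 = 1 < p = 2
r0 = np.linalg.matrix_rank(-Pi)

# condition (ii): P_{alpha0_perp} A_1 P_{beta0_perp} has maximal rank r1 = p - r0,
# with A_1 := dA(z)/dz at z = 1 equal to -(I + Pi) for A(z) = I - (I + Pi) z
A_1 = -(np.eye(2) + Pi)
M = proj(perp(alpha0)) @ A_1 @ proj(perp(beta0))
r1 = np.linalg.matrix_rank(M)
print(r0, r1)   # 1 1: both conditions hold, so x_t is I(1) at frequency 0
```

The same projection helpers can be reused for the I(2) conditions of the next example.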

**Example 2** (I(2) VAR)**.** *Following Johansen (1992), consider the same VAR process as in Example 1. Johansen (1992) showed that for xt to be I(2) at frequency ω* = 0*, a set of necessary and sufficient conditions are:*

*(i) $A(1) = -\alpha_0\beta_0'$ with $\alpha_0$, $\beta_0$ full column rank matrices of dimension $p \times r_0$, $r_0 < p$,*

*(ii) $P_{\alpha_{0\perp}} A_1 P_{\beta_{0\perp}} = -\alpha_1\beta_1'$ with $\alpha_1$, $\beta_1$ full column rank matrices of dimension $p \times r_1$, $r_1 < p - r_0$,*

*(iii) $P_{(\alpha_0,\alpha_1)_\perp}(A_2 + A_1\bar{\beta}_0\bar{\alpha}_0' A_1)P_{(\beta_0,\beta_1)_\perp} = -\alpha_2\beta_2'$ of maximal rank $r_2 = p - r_0 - r_1$.*

*In this case $x_t$ satisfies (2) for $d_1 = 2$, $d_2 = 0$, and $\zeta(L)' = \zeta_0' + \Delta\zeta_1'$ can be taken to be any row vector obtained as a linear combination of the rows in $\beta_0' + (1 - L)\bar{\alpha}_0' A_1$ and $\beta_1'$. The notion of cointegrating space for I(2) processes is discussed in detail below, where $\bar{\alpha}_0' A_1$ is called the 'multicointegrating coefficient'.*
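For a minimal numerical illustration of conditions (i)-(iii), one can take the hypothetical VAR(2) $x_t = 2x_{t-1} - x_{t-2} + \varepsilon_t$, i.e., $A(z) = (1-z)^2 I_2$; here $r_0 = r_1 = 0$, the matrices $\alpha_0, \beta_0, \alpha_1, \beta_1$ are empty, and all projections reduce to the identity, so the correction term $A_1\bar{\beta}_0\bar{\alpha}_0' A_1$ vanishes (a degenerate but convenient assumption).

```python
import numpy as np

# A(z) = (1 - z)^2 I_2 expanded around z = 1: A_0 = A_1 = 0, A_2 = I
p = 2
A0, A1, A2 = np.zeros((p, p)), np.zeros((p, p)), np.eye(p)

r0 = np.linalg.matrix_rank(A0)   # condition (i): r0 = 0 < p
r1 = np.linalg.matrix_rank(A1)   # condition (ii): r1 = 0 < p - r0 (projections are I)
# condition (iii): with alpha0 empty, the condition reduces to A2 having
# maximal rank r2 = p - r0 - r1 = 2
r2 = np.linalg.matrix_rank(A2)
print(r0, r1, r2)   # 0 0 2, so x_t ~ I(2) at frequency 0
```

The ranks $(r_0, r_1, r_2) = (0, 0, 2)$ confirm that each component of $x_t$ is I(2) and that no (non-trivial) cointegration occurs in this degenerate case.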

#### **4. Root Functions, Cointegrating Vectors and Canonical Systems**

This section introduces root functions and canonical systems of root functions, and their connection to cointegrating vectors, as defined in Definition 2 above.

#### *4.1. Root Functions*

Let *xt* ∼ *Iω*(*d*) be cointegrated at frequency *ω*, i.e., see Definition 2,

$$
\Delta_{\omega}^{d_1}(x_t - \mathrm{E}(x_t)) = \Delta_{\omega}^{d_2} C(L)\varepsilon_t,
$$

where *d* := *d*<sup>1</sup> − *d*<sup>2</sup> and *C*(*z*) has full rank on *D*(*zω*, *η*), *η* > 0, except at *z* = *zω*, see Remark 13.

The following definition of (left) root functions is taken from Gohberg et al. (1993); this definition is given in a neighborhood of *zω*.

**Definition 3** (Root function)**.** *A $1 \times p$ row vector function $\phi(z)'$ analytic on $D(z_\omega, \eta)$ is called a root function of $C(z)$ at $z_\omega$ if $\phi(z_\omega)' \neq 0'$ and if*

> $\phi(z)'\, C(z) = (z - z_\omega)^s\, \widetilde{\phi}(z)'$, $\qquad s \in \mathbb{N}, \qquad \widetilde{\phi}(z_\omega)' \neq 0'$. (10)

*The positive integer s is called the order of the root function ϕ*(*z*) *at zω.*

Observe that $\widetilde{\phi}(z)'$ is $1 \times p$ and analytic on $D(z_\omega, \eta)$, $\eta > 0$.

**Remark 19** (Factoring the difference operator)**.** *Definition 3 characterizes root functions by their ability to factor powers of $(z - z_\omega)$ from $C(z)$. Note that, because here $z_\omega = \exp(i\omega) \neq 0$, one can write $(z - z_\omega)$ as $(-z_\omega)(1 - z/z_\omega)$, where $(1 - z/z_\omega)$ corresponds to the difference operator $\Delta_\omega$ and $(-z_\omega)$ can be absorbed in $\widetilde{\phi}(z)'$ without affecting the property that $\widetilde{\phi}(z_\omega)' \neq 0'$.*

**Remark 20** (Local analysis)**.** *Note first that C*(*z*) *cannot be identically 0 in Definition 3, because C*(*z*) *is assumed to be regular. Next take for example the* 2×2 *matrix C*(*z*) = diag((1 − *z*),(1 + *z*)) *which has full rank on* <sup>C</sup>*, except at the two points z*<sup>0</sup> <sup>=</sup> <sup>1</sup> *and z<sup>π</sup>* <sup>=</sup> <sup>−</sup>1*, where it has rank 1.*

*Take first the point $z_0 = 1$; in this case one could choose a disk $D(1, \eta)$ with any $\eta < 2$, on which $C(z)$ is analytic and full rank except at $z_0 = 1$. One can verify that a root function is $\phi_1(z)' = (1, 0)$, which satisfies $\phi_1(z)'\, C(z) = (1 - z)\widetilde{\phi}_1(z)'$ with $\widetilde{\phi}_1(z)' = (1, 0)$. The same can be repeated for the other point $z_\pi = -1$, choosing a different disk $D(-1, \eta)$ with any $\eta < 2$, and a root function equal to $(0, 1)$.*

*The implication of this example is that one can have multiple separated points where C*(*z*) *has reduced rank, and apply the above definition to each point separately, using a different disk D for each point. In other words, the discussion of cointegration in this paper is local to a single unit root.*
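The verification in Remark 20 can be reproduced symbolically. The following sketch (Python with SymPy) checks that $\phi_1(z)' = (1, 0)$ factors $(1 - z)$ from $C(z) = \operatorname{diag}(1 - z, 1 + z)$ with a quotient that is nonzero at $z_0 = 1$, so the order is exactly $s = 1$.

```python
import sympy as sp

z = sp.symbols('z')
C = sp.diag(1 - z, 1 + z)     # full rank except at z0 = 1 and z_pi = -1
phi1 = sp.Matrix([[1, 0]])    # candidate root function at z0 = 1

prod = phi1 * C               # = (1 - z, 0)
phi_tilde = sp.simplify(prod / (1 - z))
print(phi_tilde)              # Matrix([[1, 0]]): nonzero at z = 1, so the order is s = 1
```

Replacing `phi1` by `(0, 1)` and the factor `1 - z` by `1 + z` reproduces the analogous check at $z_\pi = -1$.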

**Remark 21** (Order)**.** *A root function factorises $(z - z_\omega)^s$ from $C(z)$, and $s$ indicates the order. The condition $\phi(z_\omega)' \neq 0'$ guarantees that in the analytic expansion $\phi(z)' = \sum_{n=0}^{\infty}(z - z_\omega)^n \phi_n'$ the first term $\phi_0'$ is not the null vector. Note that the condition $\widetilde{\phi}(z_\omega)' \neq 0'$ makes sure that one cannot extract additional factors of $(z - z_\omega)$ from $C(z)$ using $\phi(z)'$.*

It is immediate to see that a cointegrating vector is a root function of *C*(*z*) and vice versa, as stated in the following theorem.

**Theorem 1** (Cointegrating vectors and root functions)**.** *ζ*(*z*) *is a cointegrating vector at frequency ω if and only if ζ*(*z*) *is a root function of C*(*z*) *at zω, and the order of the cointegrating vector and of the root function coincide.*

**Proof.** Observe that any root function satisfies Definition 2 of cointegrating vectors and vice versa, including the definition of their order.

Results in Gohberg et al. (1993) show that the order of a root function is finite, because it is bounded by the order of $z_\omega$ as a zero of $\det C(z)$; this result is reported in the next proposition.

**Proposition 2** (Bound on the order of a root function)**.** *The order of a root function of C*(*z*) *at z<sup>ω</sup> is at most equal to the order of z<sup>ω</sup> as a zero of* det *C*(*z*)*, which is finite because C*(*z*) *is regular.*

**Proof.** See Gohberg et al. (1993).

**Corollary 1** (Bound on the order of a cointegrating vector)**.** *The order of any cointegrating vector at frequency ω is finite.*

**Proof.** This follows from Proposition 2 because cointegrating vectors and root functions coincide by Theorem 1.

#### *4.2. Canonical Systems of Root Functions*

Next, canonical systems of root functions for $C(z)$ at $z_\omega$ are introduced, see Gohberg et al. (1993). Choose a root function $\phi_1(z)'$ of highest order $s_1$; since the orders of root functions are bounded by Proposition 2, such a function exists. Next proceed iteratively over $j = 2, \dots$, choosing the next root function $\phi_j(z)'$ to be of the highest order $s_j$ such that $\phi_j(z_\omega)'$ is linearly independent of $\phi_1(z_\omega)', \dots, \phi_{j-1}(z_\omega)'$. Because $m := \dim((\operatorname{col} C(z_\omega))_\perp) < \infty$, this process ends with $m$ root functions $\phi_1(z)', \dots, \phi_m(z)'$.

Note that the columns of $a := (\phi_1(z_\omega), \dots, \phi_m(z_\omega))$ span the finite dimensional space $(\operatorname{col} C(z_\omega))_\perp$, so that one can choose vectors $(\phi_{m+1}, \dots, \phi_p) = a_\perp$ that span its orthogonal complement. This construction leads to the following definition.

**Definition 4** ((Extended) canonical system of root functions)**.** *Let $\phi_1(z)', \dots, \phi_m(z)'$ and $\phi_{m+1}', \dots, \phi_p'$ be constructed as above; then*

$$\phi(z)' = \begin{pmatrix} \phi_1(z)'\\ \vdots\\ \phi_m(z)' \end{pmatrix} \qquad \text{and} \qquad \begin{pmatrix} \phi(z)'\\ a_\perp' \end{pmatrix} = \begin{pmatrix} \phi_1(z)'\\ \vdots\\ \phi_m(z)'\\ \phi_{m+1}'\\ \vdots\\ \phi_p' \end{pmatrix} \tag{11}$$

*are called a canonical system of root functions (respectively an extended canonical system of root functions) of C*(*z*) *at z<sup>ω</sup> of orders* (*s*1,*s*2, ... ,*sm*) *(respectively* (*s*1,*s*2, ... ,*sm*,*sm*+1, ... ,*sp*)*) with* ∞ > *s*<sup>1</sup> ≥ *s*<sup>2</sup> ≥···≥ *sm* > 0 = *sm*+<sup>1</sup> = ··· = *sp.*

Such a canonical system of root functions is not unique. To see this, one can show that the first row vector $\phi_1(z)'$ in (11) can be replaced by a combination of $\phi_1(z)'$ and $\phi_2(z)'$, called $\phi_1^{*}(z)'$, and the canonical system of root functions containing $\phi_1^{*}(z)'$ still satisfies the definition. More specifically, define $\phi_1^{*}(z)' := \phi_1(z)' + (z - z_\omega)^{s_1 - s_2}\phi_2(z)'$ and observe that, by Definition 3, $\phi_j(z)'\, C(z) = (z - z_\omega)^{s_j}\widetilde{\phi}_j(z)'$ with $\widetilde{\phi}_j(z_\omega)' \neq 0'$, $j = 1, 2$. Hence one has

$$
\phi_1^{*}(z)'\, C(z) = (z - z_\omega)^{s_1}\widetilde{\phi}_1(z)' + (z - z_\omega)^{s_1 - s_2 + s_2}\widetilde{\phi}_2(z)' = (z - z_\omega)^{s_1}\widetilde{\phi}^{*}(z)'
$$

where $\widetilde{\phi}^{*}(z)' := \widetilde{\phi}_1(z)' + \widetilde{\phi}_2(z)'$. Because $\widetilde{\phi}_j(z_\omega)' \neq 0'$, $j = 1, 2$, one has $\widetilde{\phi}^{*}(z_\omega)' \neq 0'$ unless $\widetilde{\phi}_1(z_\omega)' = -\widetilde{\phi}_2(z_\omega)'$. However, this last case is ruled out because it would contradict the fact that $s_1$ is maximal. Hence $\widetilde{\phi}^{*}(z_\omega)' \neq 0'$. This shows that $\phi_1^{*}(z)'$ satisfies the definition of a root function of order $s_1$, and hence it can replace $\phi_1(z)'$ in (11).

While a canonical system of root functions (and also an extended canonical system of root functions) is not unique, the orders *s*<sup>1</sup> ≥ *s*<sup>2</sup> ≥···≥ *sm* > 0 = *sm*+<sup>1</sup> = ··· = *sp* are uniquely determined by *C*(*z*) at *zω*, see Lemma 1.1 in Gohberg et al. (1993); they are called the *partial multiplicities* of *C*(*z*) at *zω*.

Finally, consider the local Smith factorization of *C*(*z*) at *z* = *zω*, see Gohberg et al. (1993), i.e., the factorization

$$C(z) = E(z)M(z)H(z),\tag{12}$$

where $M(z) = \operatorname{diag}((z - z_\omega)^{s_h})_{h=1,\dots,p}$ is uniquely defined and contains the partial multiplicities $s_1 \geq \dots \geq s_p$ of $C(z)$ at $z = z_\omega$; the matrices $E(z)$, $H(z)$ are analytic and invertible in a neighbourhood of $z = z_\omega$ and are non-unique. $M(z)$ is called the local Smith form of $C(z)$ at $z = z_\omega$.<sup>6</sup>
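A small symbolic example may clarify the factorization (12). Below (Python with SymPy), a hypothetical $C(z)$ is assembled from factors $E(z)$, $H(z)$ that are invertible at $z = 1$ and the local Smith form $M(z) = \operatorname{diag}((z-1)^2, 1)$ with partial multiplicities $(2, 0)$; the sum of the partial multiplicities is then recovered as the order of $z = 1$ as a zero of $\det C(z)$.

```python
import sympy as sp

z = sp.symbols('z')
E = sp.Matrix([[1, 1], [0, 1]])    # invertible at z = 1 (det E = 1)
M = sp.diag((z - 1)**2, 1)         # local Smith form: partial multiplicities (2, 0)
H = sp.Matrix([[1, 0], [z, 1]])    # invertible at z = 1 (det H = 1)
C = sp.expand(E * M * H)           # C(z) = E(z) M(z) H(z), see (12)

# The partial multiplicities are not visible from C(z) directly, but their
# sum equals the order of z = 1 as a zero of det C(z):
d = sp.factor(C.det())
print(d)                           # (z - 1)**2
```

Since $\det E(z) = \det H(z) = 1$, the determinant of $C(z)$ carries exactly the factor $(z-1)^{s_1 + s_2}$ contributed by $M(z)$.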

**Remark 22** (Extended canonical system of root functions in the I(1) VAR case)**.** *In the I(1) VAR case, see Example 1, the orders of an extended canonical system of root functions of $C(z)$ at $1$ are $(s_1, \dots, s_{r_0}, s_{r_0+1}, \dots, s_p) = (1, \dots, 1, 0, \dots, 0)$ and a possible choice of an extended canonical system of root functions corresponding to these unique orders is given by the $p$ rows in $(\beta_0, \beta_1)'$.*

**Remark 23** (Extended canonical system of root functions in the I(2) VAR case)**.** *In the I(2) VAR case, see Example 2, the orders of an extended canonical system of root functions of $C(z)$ at $1$ are $(s_1, \dots, s_{r_0}, s_{r_0+1}, \dots, s_{r_0+r_1}, s_{r_0+r_1+1}, \dots, s_p) = (2, \dots, 2, 1, \dots, 1, 0, \dots, 0)$ and a possible choice of an extended canonical system of root functions corresponding to these unique orders is given by the $p$ rows in $(\beta_0 + (1 - z)(\bar{\alpha}_0' A_1)', \beta_1, \beta_2)'$.*

#### **5. Cointegrating Spaces**

Let $\phi(z)'$ be a canonical system of root functions of $C(z)$ at $z_\omega$, see Definition 4. Appendix A.2 shows that $\operatorname{row}_G(\phi(z)')$ with $G = F$, $F[z]$, $F(z)$ are well defined sets of (generalized) root functions. This section argues that one could take any of them as a definition of 'cointegrating space' for multicointegrated systems. Note that

$$\mathsf{row}\_F(\phi(z)') \subset \mathsf{row}\_{F[z]}(\phi(z)') \subset \mathsf{row}\_{F(z)}(\phi(z)'),$$

so that the three definitions of cointegrating space are naturally nested. Remark that $\operatorname{row}_F(\phi(z)')$ is a vector space over $F$, $\operatorname{row}_{F[z]}(\phi(z)')$ is a free module over the ring $F[z]$ of polynomials in $z$ (which contains $\operatorname{row}_F(\phi(z)')$) and $\operatorname{row}_{F(z)}(\phi(z)')$ is a vector space over the field $F(z)$ of rational functions of $z$ (which contains $\operatorname{row}_{F[z]}(\phi(z)')$ and hence $\operatorname{row}_F(\phi(z)')$). Finally, note the central role played by the canonical system of root functions $\phi(z)'$ as a basis for these different spaces, which differ in the set of scalars allowed in linear combinations.

#### *5.1. The Cointegrating Space* row*F*(*φ*(*z*) ) *as a Vector Space over F*

The cointegrating space $\operatorname{row}_F(\phi(z)')$, where $F = \mathbb{R}, \mathbb{C}$, is a vector space. In fact, the set of all $F$-linear combinations of $\phi(z)'$ forms a vector space, because $\operatorname{row}_F(\phi(z)')$ is closed under multiplication by a scalar in $F$ by Proposition A1 and under vector addition, as a special case of Proposition A2.

In order to discuss the cointegrating spaces row*F*[*z*](*φ*(*z*) ) and row*F*(*z*)(*φ*(*z*) ), the notion of generalized cointegrating vector is introduced, as the counterpart of the notion of generalized root function, see Definition A1.

**Definition 5** (Generalized cointegrating vector at frequency $\omega$)**.** *Let $n \in \mathbb{Z}$ and let $\zeta(z)'$ be a cointegrating vector at frequency $\omega$ of order $s$, see Definition 2; then*

$$\xi(z)' := (1 - z/z\_{\omega})^n \zeta(z)'$$

*is called a generalized cointegrating vector at frequency ω with order s and exponent n.*

#### *5.2. The Cointegrating Space* row*F*[*z*](*φ*(*z*) ) *as a Free Module over F*[*z*]

Consider next $\operatorname{row}_{F[z]}(\phi(z)')$. $F[z]$ is the polynomial ring formed as the set of polynomials in $z$ with coefficients in $F$. As is well known, $F[z]$ is a ring but not a field (division ring), see e.g., Hungerford (1980), because polynomials, unlike rational functions, lack multiplicative inverses. The following proposition summarizes that $\operatorname{row}_{F[z]}(\phi(z)')$ is a free module over the ring $F[z]$ of polynomials.

**Proposition 3** ($\operatorname{row}_{F[z]}(\phi(z)')$ is an $F[z]$-module)**.** *Consider $\mathcal{G} = \operatorname{row}_{F[z]}(\phi(z)')$, where $\phi(z)'$ is a canonical system of root functions of $C(z)$ at $z_\omega$ with coefficients in $F$, and where $F[z]$ is the ring of polynomials in $z$ with coefficients in $F$; then $\mathcal{G}$ is closed with respect to the vector sum and under multiplication by a scalar polynomial in $F[z]$; hence $\mathcal{G}$ is a module over the ring $F[z]$ of polynomials.*

**Proof.** By Propositions A1 and A2, G is closed under addition and under multiplication by a scalar polynomial in *F*[*z*]. One needs to verify that, see e.g., Definition IV.1.1 in Hungerford (1980), for *ζ*(*z*), *ψ*(*z*) ∈ G and 1, *a*(*z*), *b*(*z*) ∈ *F*[*z*]

$$\begin{aligned} a(z) \cdot (\zeta(z)' + \psi(z)') &= a(z) \cdot \zeta(z)' + a(z) \cdot \psi(z)' \\ (a(z) + b(z)) \cdot \zeta(z)' &= a(z) \cdot \zeta(z)' + b(z) \cdot \zeta(z)' \\ (a(z)b(z)) \cdot \zeta(z)' &= a(z) \cdot (b(z) \cdot \zeta(z)') \\ 1 \cdot \zeta(z)' &= \zeta(z)' \end{aligned} \tag{13}$$

where · indicates multiplication by a scalar. The distributive properties in (13) are seen to be satisfied. This proves the statement.
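The closure property behind Proposition 3 can be illustrated on the example of Remark 20: multiplying a root function by a scalar polynomial $a(z) \in F[z]$ with $a(z_\omega) \neq 0$ yields again a root function of the same order. The sketch below (Python with SymPy) uses a hypothetical polynomial $a(z) = 1 + z^2$.

```python
import sympy as sp

z = sp.symbols('z')
C = sp.diag(1 - z, 1 + z)      # C(z) from Remark 20, z_omega = 1
phi = sp.Matrix([[1, 0]])      # root function of order 1 at z = 1
a = 1 + z**2                   # hypothetical scalar polynomial with a(1) != 0

prod = sp.expand(a * phi * C)  # (a(z) phi(z)') C(z) = (1 - z) a(z) (1, 0)
phi_tilde = sp.simplify(prod / (1 - z))
print(phi_tilde.subs(z, 1))    # Matrix([[2, 0]]) != 0: the order is still 1
```

If instead $a(z)$ had a root at $z_\omega$, the product would be a generalized root function with a higher exponent, which is where the module (rather than vector space) structure matters.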

#### *5.3. The Cointegrating Space* row*F*(*z*)(*φ*(*z*) ) *as a Vector Space over F*(*z*)

Finally consider $\operatorname{row}_{F(z)}(\phi(z)')$. The set of scalars $F(z)$ is the field of rational functions in $z$ with coefficients in $F$. As is well known, $F(z)$ is a field (division ring), see e.g., Hungerford (1980).

**Remark 24** (Rational vectors without poles at $z_\omega$)**.** *Take $\zeta(L)'$ to be a rational vector, i.e., of the form $\zeta(z)' = \frac{1}{d(z)} b(z)'$ where $d(z)$ is a monic polynomial and $b(z)'$ is a $1 \times p$ vector polynomial, with $d(z)$ and $b(z)'$ relatively prime, see Example A1. If $d(z)$ has no root equal to $z_\omega$, then $\zeta(z)'$ is an analytic function on $D(z_\omega, \eta)$, $\eta > 0$, see Remark A1 and Lemma A1; hence a rational vector with denominator $d(z)$ without roots equal to $z_\omega$ is a special case of an analytic vector function $\zeta(z)'$.*

**Remark 25** (Rational vectors with poles at $z_\omega$)**.** *If $d(z)$ has a root equal to $z_\omega$ with multiplicity $m$, then $\zeta(z)'$ has a pole of order $m$ and is not an analytic function on any $D(z_\omega, \eta)$, $\eta > 0$; hence Definition 2 cannot be applied, because it requires $\zeta(z)'$ to be analytic. However, one can remove the pole of order $m$ by defining $\xi(z)' := (1 - z/z_\omega)^m \zeta(z)'$ and applying Definition 2 to $\xi(z)'$, which is an analytic function, as done in Definition 5.*

**Remark 26** (Representation for generic rational vectors)**.** *In the following, when dealing with rational vectors of the type $\zeta(z)' = \frac{1}{d(z)} b(z)'$, it is sufficient to consider the case where $d(z)$ does not have a root at $z_\omega$, thanks to Definition 5. In fact, let $d(z)$ be decomposed as $d(z) = (1 - z/z_\omega)^m \widetilde{d}(z)$ with $\widetilde{d}(z_\omega) \neq 0$ and $m \geq 0$; in this representation, $z_\omega$ is a root of $d(z)$ if and only if $m > 0$ and it is not a root if and only if $m = 0$. By Remark 24, $\zeta(z)'$ is a (generalized) cointegrating vector if and only if $\xi(z)' := (1 - z/z_\omega)^m \zeta(z)' = \frac{1}{\widetilde{d}(z)} b(z)'$ is a cointegrating vector. Hence Definition 5 allows one to concentrate on the case where the denominator has no root at $z_\omega$.*
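Remarks 25 and 26 can be illustrated symbolically. In the sketch below (Python with SymPy, reusing the hypothetical $C(z)$ of Remark 20), a rational vector with a pole of order $m = 1$ at $z_\omega = 1$ is turned into an analytic cointegrating vector by multiplication with $(1 - z/z_\omega)^m$.

```python
import sympy as sp

z = sp.symbols('z')
C = sp.diag(1 - z, 1 + z)               # C(z) from Remark 20, z_omega = 1
zeta = sp.Matrix([[1 / (1 - z), 0]])    # rational vector with a pole of order m = 1 at z = 1

xi = sp.simplify((1 - z) * zeta)        # xi(z)' := (1 - z/z_omega)^m zeta(z)' removes the pole
prod = sp.simplify(xi * C)              # (1 - z, 0): xi(z)' is a root function of order 1
print(xi, prod)
```

Here `xi` is the analytic vector $(1, 0)$ to which Definition 2 applies, while the original `zeta` is only a generalized cointegrating vector in the sense of Definition 5.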

The following proposition summarizes that row*F*(*z*)(*φ*(*z*) ) is a vector space over the field *F*(*z*) of rational functions.

**Proposition 4** ($\operatorname{row}_{F(z)}(\phi(z)')$ is a vector space over $F(z)$)**.** *Let $\mathcal{H} = \operatorname{row}_{F(z)}(\phi(z)')$, where $\phi(z)'$ is a canonical system of root functions of $C(z)$ at $z_\omega$ with coefficients in $F$, and where $F(z)$ is the field of rational functions in $z$ with coefficients in $F$; then $\mathcal{H}$ is closed with respect to the vector sum and under multiplication by a scalar rational function in $F(z)$, and $\mathcal{H}$ is a vector space over the field $F(z)$ of rational functions.*

**Proof.** $\mathcal{H}$ is closed with respect to multiplication by a rational function in $F(z)$, see Proposition A1, and with respect to vector addition, see Proposition A2. One can verify for $\zeta(z)'$, $\psi(z)' \in \mathcal{H}$ and $1, a(z), b(z) \in F(z)$ that the equalities in (13) are satisfied. Because $F(z)$ is a field, $\mathcal{H}$ is then a vector space over $F(z)$.

#### **6. The Local Rank Factorization**

This section shows how to explicitly obtain a canonical system of root functions $\phi(z)'$ or an extended canonical system of root functions $(\phi(z), a_\perp)'$ for a generic VAR process

$$A(L)x_t = \varepsilon_t, \qquad A_0 \neq 0, \qquad \det A_0 = 0,\tag{14}$$

with $A(z)$ analytic for all $z \in D(0, 1 + \eta)$, $\eta > 0$, having roots at $z = z_\omega = e^{i\omega}$ and at $z$ with $|z| > 1$, see Remarks 1 and 2.

The derivation of the Granger representation theorem involves the inversion of the matrix function

$$A(z) = \sum_{n=0}^{\infty} (z - z_\omega)^n A_n, \qquad A_n \in \mathbb{C}^{p \times p}, \qquad A_0 \neq 0, \qquad \det A_0 = 0,\tag{15}$$

in *D*(*zω*, *η*). This includes the case of matrix polynomials *A*(*z*), in which the degree of *A*(*z*) is finite, *k* say, with *An* = 0 for *n* > *k*. 7

The inversion of $A(z)$ around the singular point $z = z_\omega$ yields an inverse with a pole of some order $d = 1, 2, \dots$ at $z = z_\omega$; an explicit condition on the coefficients $\{A_n\}_{n=0}^{\infty}$ in (15) for $A(z)^{-1}$ to have a pole of given order $d$ is described in Theorem 2 below; this is indicated as the POLE($d$) condition in the following. Under the POLE($d$) condition, $A(z)^{-1}$ has the Laurent expansion around $z = z_\omega$ given by

$$A(z)^{-1} =: (z - z_\omega)^{-d} C(z) = \sum_{n=0}^{\infty} (z - z_\omega)^{n-d} C_n, \qquad C_0 \neq 0, \qquad \det C_0 = 0. \tag{16}$$

Note that $C(z_\omega) = C_0 \neq 0$ and that $C(z)$ is expanded around $z = z_\omega$. In the following, the coefficients $\{C_n\}_{n=0}^{\infty}$ are called the Laurent coefficients. The first $d$ of them, $\{C_n\}_{n=0}^{d-1}$, make up the principal part and characterize the singularity of $A(z)^{-1}$ at $z = z_\omega$.

The following result is taken from Franchi and Paruolo (2019, Theorem 3.3).<sup>8</sup>

**Theorem 2** (POLE($d$) condition)**.** *Consider $A(z)$ defined in* (15)*; let $0 < r_0 := \operatorname{rank} A_0 < p$, $r_0^{\max} := p$, and define $\alpha_0, \beta_0$ by the rank factorization $A_0 = -\alpha_0\beta_0'$. Moreover, for $j = 1, 2, \dots$ define $\alpha_j, \beta_j$ by the rank factorization*

$$P_{a_{j\perp}} A_{j,1} P_{b_{j\perp}} = -\alpha_j\beta_j', \qquad a_j := (\alpha_0, \dots, \alpha_{j-1}), \qquad b_j := (\beta_0, \dots, \beta_{j-1}), \tag{17}$$

*where $P_x$ denotes the orthogonal projection onto the space spanned by the columns of $x$ and*

$$A_{h+1,n} := \begin{cases} A_n & \text{for } h = 0, \\ A_{h,n+1} + A_{h,1}\sum_{i=0}^{h-1}\bar{\beta}_i\bar{\alpha}_i' A_{i+1,n} & \text{for } h = 1, 2, \dots, \end{cases} \qquad n = 0, 1, \dots \tag{18}$$

*Finally, let*

$$r_j := \operatorname{rank}(P_{a_{j\perp}} A_{j,1} P_{b_{j\perp}}), \qquad r_j^{\max} := p - \sum_{i=0}^{j-1} r_i. \tag{19}$$

*Then, a necessary and sufficient condition for $A(z)$ to have an inverse with a pole of order $d = 1, 2, \dots$ at $z = z_\omega$, called the* POLE($d$) *condition, is that*

$$\begin{cases} r_j < r_j^{\max} & \text{(reduced rank condition) for } j = 0, \dots, d-1, \\ r_d = r_d^{\max} & \text{(full rank condition) for } j = d. \end{cases}$$

Observe that because $\operatorname{rank} P_{a_{j\perp}} A_{j,1} P_{b_{j\perp}} = \operatorname{rank} a_{j\perp}' A_{j,1} b_{j\perp}$, one has $r_j = \operatorname{rank} a_{j\perp}' A_{j,1} b_{j\perp}$; hence $d = 1$ if and only if

$$r_1 = r_1^{\max}, \quad \text{where} \quad r_1 = \operatorname{rank} \alpha_{0\perp}' A_1\beta_{0\perp} \quad \text{and} \quad r_1^{\max} = p - r_0.$$

This corresponds to the condition in Howlett (1982, Theorem 3) and to the $I(1)$ condition in Johansen (1991, Theorem 4.1). Similarly, one has $d = 2$ if and only if $r_1 < r_1^{\max}$ and

$$r_2 = r_2^{\max}, \quad \text{where} \quad r_2 = \operatorname{rank} a_{2\perp}'\,(A_2 + A_1\bar{\beta}_0\bar{\alpha}_0' A_1)\, b_{2\perp} \quad \text{and} \quad r_2^{\max} = p - r_0 - r_1,$$

which corresponds to the $I(2)$ condition in Johansen (1992, Theorem 3).

Theorem 2 is thus a generalization of Johansen's $I(1)$ and $I(2)$ conditions and shows that, in order to have a pole of order $d$ in the inverse, one needs $d + 1$ rank conditions on $A(z)$: the first $j = 0, \dots, d-1$ are reduced rank conditions, $r_j < r_j^{\max}$, which establish that the order of the pole is greater than $j$; the last one is a full rank condition, $r_d = r_d^{\max}$, which establishes that the order of the pole is exactly equal to $d$. These requirements make up the POLE($d$) condition.
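As an illustration, the recursion (18) and the rank conditions (19) can be implemented numerically. The following sketch (in Python with numpy; the function names and the numerical tolerance are choices of this illustration, not part of Theorem 2) computes the ranks $r_0, \dots, r_d$ and the pole order $d$ from the coefficients $A_0, \dots, A_k$ of $A(z)$ around $z_\omega$, using SVD-based rank factorizations and orthogonal projections:

```python
import numpy as np

def rank_factorization(M, tol=1e-9):
    """Factor M = -alpha @ beta.T with alpha, beta of full column rank r = rank(M)."""
    U, s, Vt = np.linalg.svd(M)
    r = int(np.sum(s > tol))
    return -U[:, :r] * s[:r], Vt[:r, :].T, r

def proj_perp(X, p):
    """Orthogonal projection onto the orthogonal complement of span(X)."""
    if X.shape[1] == 0:
        return np.eye(p)
    Q, _ = np.linalg.qr(X)
    return np.eye(p) - Q @ Q.T

def bar(x):
    """x_bar = x (x'x)^{-1}, so that x_bar' x = I."""
    return x @ np.linalg.inv(x.T @ x)

def local_rank_factorization(A, d_max=5, tol=1e-9):
    """Run the recursion (18) and the rank conditions (19) of Theorem 2 on the
    coefficients A = [A_0, A_1, ..., A_k] of A(z) around z_w, returning the
    ranks (r_0, ..., r_d) and the pole order d of A(z)^{-1}."""
    p = A[0].shape[0]
    N = len(A) + d_max
    lvl = {1: [A[n] if n < len(A) else np.zeros((p, p)) for n in range(N)]}
    alpha, beta, r0 = rank_factorization(lvl[1][0], tol)   # A_0 = -alpha_0 beta_0'
    alphas, betas, ranks = [alpha], [beta], [r0]
    for j in range(1, d_max + 1):
        if j not in lvl:                                   # build level j via (18)
            h = j - 1
            lvl[j] = [lvl[h][n + 1] + lvl[h][1] @ sum(
                          bar(betas[i]) @ bar(alphas[i]).T @ lvl[i + 1][n]
                          for i in range(h))
                      for n in range(len(lvl[h]) - 1)]
        Pa = proj_perp(np.hstack(alphas), p)               # P_{a_j perp}
        Pb = proj_perp(np.hstack(betas), p)                # P_{b_j perp}
        aj, bj, rj = rank_factorization(Pa @ lvl[j][1] @ Pb, tol)
        if rj == p - sum(ranks):                           # full rank: d = j
            return ranks + [rj], j
        ranks.append(rj); alphas.append(aj); betas.append(bj)
    raise RuntimeError("no pole of order <= d_max detected")
```

For instance, for $p = 2$, $A_0 = \operatorname{diag}(-1, 0)$ and $A_1 = I_2$ the full rank condition already holds at $j = 1$, so the inverse has a simple pole ($d = 1$), while inserting $A_1 = 0$ before $A_2 = I_2$ pushes the pole order to $d = 2$.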

The following result is also taken from Franchi and Paruolo (2019).<sup>9</sup>

**Theorem 3** (Local Smith factorization)**.** *Consider $A(z)$ and the other related quantities defined in Theorem 2; for $j = 0, \dots, d$, define the $r_j \times p$ matrix functions $\gamma_j(z)'$ as follows*

$$\gamma_{j,0}' := \beta_j', \qquad \gamma_{j,n}' := -\bar{\alpha}_j' A_{j+1,n}, \quad n = 1, 2, \dots, \qquad \gamma_j(z)' := \sum_{n=0}^{\infty} (z - z_\omega)^n \gamma_{j,n}', \tag{20}$$

*and define the p* × *p matrix functions* Γ(*z*) *and* Λ(*z*) *as follows*

$$\Gamma(z) := \begin{pmatrix} \gamma_0(z)' \\ \vdots \\ \gamma_d(z)' \end{pmatrix}, \qquad \Lambda(z) := \begin{pmatrix} (z - z_\omega)^0 I_{r_0} & & \\ & \ddots & \\ & & (z - z_\omega)^d I_{r_d} \end{pmatrix}. \tag{21}$$

*Then $\Gamma(z)$ and $\Xi(z) := A(z)\Gamma(z)^{-1}\Lambda(z)^{-1}$ are analytic and invertible on $D(z_\omega, \eta)$, $\eta > 0$, and $\Lambda(z)$ is the local Smith form of $A(z)$ at $z_\omega$, i.e., $A(z) = \Xi(z)\Lambda(z)\Gamma(z)$. Moreover, one can choose the factors $E(z)$, $M(z)$, $H(z)$ for the local Smith factorization of $C(z)$ defined in* (16)*, see* (12)*, as*

$$E(z) = \Gamma(z)^{-1}, \qquad M(z) = (z - z\_{\omega})^d \Lambda(z)^{-1}, \qquad H(z) = \Xi(z)^{-1}.$$

Theorem 3 shows that the LRF fully characterizes the elements of the local Smith factorization of $C(z)$ at $z_\omega$. In fact, the values of $j$ with $r_j > 0$ in the LRF provide the distinct partial multiplicities of $C(z)$ at $z_\omega$, and $r_j$ gives the number of partial multiplicities that are equal to a given $j$; this characterizes the local Smith form $\Lambda(z)$. Moreover, it also provides the construction of an extended canonical system of root functions.

Observe that the $j$-th block of rows in $\Gamma(z)C(z) = (z - z_\omega)^d\Lambda(z)^{-1}\Xi(z)^{-1}$ can be written as

$$\gamma_j(z)' C(z) = (z - z_\omega)^{d-j}\widetilde{\gamma}_j(z)', \qquad j = 0, \dots, d,\tag{22}$$

where $\gamma_j(z_\omega)' = \beta_j'$ and $\widetilde{\gamma}_j(z_\omega)'$ have full row rank; here $\widetilde{\gamma}_j(z)'$ denotes the corresponding block of rows in $\Xi(z)^{-1}$. This shows that the rows of $\gamma_j(z)'$ are $r_j$ root functions of order $d - j$ of $C(z)$.

The next result presents the Triangular representation as proved in Franchi and Paruolo (2019, Corollary 4.6).

**Proposition 5** (Triangular representation)**.** *Let xt in* (14) *satisfy the* POLE(*d*) *condition on A*(*z*) *and define*

$$\Gamma_\circ(L) := \begin{pmatrix} \Phi(L)' \\ \beta_d' \end{pmatrix}, \qquad \Phi(L)' := \begin{pmatrix} \gamma_0^{(d-1)}(L)' \\ \gamma_1^{(d-2)}(L)' \\ \vdots \\ \gamma_{d-1}^{(0)}(L)' \end{pmatrix} = \begin{pmatrix} \beta_0' + \sum_{k=1}^{d-1} (-z_\omega)^k\gamma_{0,k}'\Delta_\omega^k \\ \beta_1' + \sum_{k=1}^{d-2} (-z_\omega)^k\gamma_{1,k}'\Delta_\omega^k \\ \vdots \\ \beta_{d-1}' \end{pmatrix}, \tag{23}$$

*where $\gamma_j^{(d-j-1)}(z)' = \sum_{k=0}^{d-j-1}\gamma_{j,k}'(z - z_\omega)^k$ is the truncation at order $d - j - 1$ of the root function $\gamma_j(z)'$ in* (20)*. Then $x_t$ is $I(d)$ and it admits the Triangular Representation*

$$\Lambda(L)\Gamma_\circ(L)x_t \sim I(0)$$

*where no linear combination exists of the l.h.s. that is integrated of lower order.*

Observe that the canonical system of root functions *φ*(*z*) in (23) is not unique and not of minimal polynomial order, as discussed in the next section. The following example applies the above concepts in the *I*(2) VAR case.

**Example 3** (I(2) VAR example continued)**.** *Consider Example 2. Applying truncation to the rows of $(\beta_0' + \Delta\,\bar{\alpha}_0' A_1)$, see Propositions 5 and A3, one finds that the rows of $\beta_0'$ are root functions of $C(z)$ at $\omega = 0$ of order at least $\min(2, 1) = 1$. Consider now one row of $(\beta_0' + \Delta\,\bar{\alpha}_0' A_1 + \Delta^2 A^{\ast})$ for some matrix $A^{\ast}$; this root function is of order 2 by Remark 23, and its truncation to degree 1, i.e., the corresponding row of $(\beta_0' + \Delta\,\bar{\alpha}_0' A_1)$, is still of order 2 by Propositions 5 and A3. Finally, consider one row of $(\beta_0' + \Delta A^{\ast})$, which gives a root function of order at least 1; its truncation to a polynomial of degree 0 gives the corresponding row of $\beta_0'$, which has order at least 1 by Propositions 5 and A3. In fact, the rows of $\beta_0'$ give root functions of order equal to 1 or to 2; the order equals 2 when the corresponding entries of $\bar{\alpha}_0' A_1$ in $(\beta_0' + \Delta\,\bar{\alpha}_0' A_1)$ are equal to 0, as discussed below.*

#### **7. Minimal Bases**

This section describes the algorithm of Forney (1975) to reduce the basis $\varphi(z)'$ to minimal order, using the generic notation $b(z)'$ in place of $\varphi(z)'$. The generic basis $b(z)'$ is assumed to be rational and of dimension $r \times p$. The algorithm exploits the nesting $\operatorname{row}_F(b(z)') \subset \operatorname{row}_{F[z]}(b(z)') \subset \operatorname{row}_{F(z)}(b(z)')$. In the following, the $j$-th row of $b(z)'$ is indicated as $b_j(z)'$, which is the $j$-th element of the basis, $j = 1, \dots, r$. Various modifications of the original basis $b^{(0)}(z)' := b(z)'$ are indicated as $b^{(h)}(z)'$ for $h = 1, 2, 3$.

**Definition 6** (Degree of $b(z)'$)**.** *If $b(z)'$ is a polynomial basis, the degree $v_j$ of its $j$-th row, indicated as $v_j := \deg b_j(z)'$, is defined as the maximum degree of its elements, and the degree $v$ of $b(z)'$ is defined as $v := \deg b(z)' := \sum_{j=1}^{r} v_j$, i.e., the sum of the degrees of its rows.*
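As a small illustration of Definition 6 (a sketch assuming sympy; the function name is mine, not from the paper), the row degrees and the degree of a polynomial basis can be computed as:

```python
import sympy as sp

z = sp.symbols('z')

def row_degrees(b):
    """v_j = max degree over the entries of row j of the polynomial basis b."""
    return [max(sp.degree(e, z) if e != 0 else 0 for e in b.row(i))
            for i in range(b.rows)]

b = sp.Matrix([[1 + z, z**2],
               [1,     z   ]])
v = row_degrees(b)        # row degrees v_1, v_2
deg = sum(v)              # degree of the basis: sum of the row degrees
```

Here the first row has degree 2 (from the entry $z^2$), the second has degree 1, and the degree of the basis is their sum.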

The reduction algorithm proposed by Forney (1975, pp. 497–98) consists of the following 3 steps.

Step 1 If $b^{(0)}(z)'$ is not polynomial, multiply each row by its least common denominator to obtain a polynomial basis $b^{(1)}(z)'$.

Step 2 Reduce row orders in *b*(1)(*z*) by taking *F*[*z*]-linear combinations.

Step 3 Reduce *b*(2)(*z*) to a basis *b*(3)(*z*) with a full-row-rank high order coefficient matrix, i.e., a "row proper" basis.

This procedure gives a final basis *b*(3)(*z*) which has lowest degree, see Forney (1975) Section 3.

**Remark 27** (Spaces and algorithm)**.** *Step 1 works on* row*F*(*z*)(*φ*(*z*) )*, Step 2 works on* row*F*[*z*](*φ*(*z*) )*, Step 3 uses F-linear combinations on Q*(*z*)*φ*(*z*) *with appropriate square polynomial matrices Q*(*z*)*.*

#### *7.1. Step 1*

If $b^{(0)}(z)'$ is polynomial, the algorithm sets $b^{(1)}(z)' = b^{(0)}(z)'$; otherwise $b^{(0)}(z)'$ is rational, and its $j$-th row $b_j(z)'$ has representation $b_j(z)' = \frac{1}{a_j(z)}c_j(z)'$, where $c_j(z)'$ is a polynomial row vector, $a_j(z)$ is a scalar polynomial, and $c_j(z)'$ and $a_j(z)$ are relatively prime. The first step consists in computing $b^{(1)}(z)' = \operatorname{diag}(a_1(z), \dots, a_r(z))\, b^{(0)}(z)'$, where $Q(z) := \operatorname{diag}(a_1(z), \dots, a_r(z))$ is a square polynomial matrix of dimension $r$.
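Step 1 can be sketched as follows (assuming sympy; `clear_denominators` is a name chosen for this illustration): each row is multiplied by the least common multiple of the denominators of its entries, i.e., by its least common denominator $a_j(z)$:

```python
import sympy as sp
from functools import reduce

z = sp.symbols('z')

def clear_denominators(b):
    """Step 1: multiply each row of the rational basis b by its least common
    denominator, returning a polynomial basis (cf. Q(z) = diag(a_1,...,a_r))."""
    rows = []
    for i in range(b.rows):
        denoms = [sp.fraction(sp.cancel(e))[1] for e in b.row(i)]
        a_i = reduce(sp.lcm, denoms)               # least common denominator a_i(z)
        rows.append([sp.expand(sp.cancel(a_i * e)) for e in b.row(i)])
    return sp.Matrix(rows)

b0 = sp.Matrix([[1/(1 - z), z/(1 - z)**2]])        # rational 1 x 2 basis
b1 = clear_denominators(b0)                        # polynomial basis
```

In the example the least common denominator of the single row is $(1-z)^2$, and the resulting row is the vector polynomial $(1 - z,\; z)$.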

#### *7.2. Step 2*

The second step reduces the degree of the rows in $b^{(1)}(z)'$. This involves finding specific points $z_h$, $h = 1, \dots, k$, at which $\operatorname{rank}(b^{(1)}(z_h)') < r$. To find them, one can calculate the greatest common divisor $g(z)$ of all $r \times r$ minors of $b^{(1)}(z)'$. If $g(z) = 1$ this step is complete, and the algorithm sets $b^{(2)}(z)' = b^{(1)}(z)'$; otherwise one computes the zeros $z_1, \dots, z_k$ of $g(z)$, where $z_h \in \mathbb{C}$, $h = 1, \dots, k$. The following substep is then applied to each root $z_h$ sequentially, $h = 1, \dots, k$.

Denote by $w(z)'$ the current basis; it will be replaced by $\kappa(z)'$ at the end of this substep. For $h = 1$, one has $w(z)' = b^{(1)}(z)'$. At $z = z_h$, all minors of order $r$ of $w(z_h)'$ vanish, which means that $w(z_h)'$ is singular, i.e., it has reduced rank and rank factorization $w(z_h)' = \psi a'$, say, where $\psi$ and $a$ have full column rank. Let $c' := (c_1, \dots, c_r)$ be one row of $\psi_\perp'$. Indicate by $A_c := \{i : c_i \neq 0\}$ the set of its non-zero coefficients, and let $v_{i_0} := \max_{i \in A_c}\{v_i\}$ be the maximal degree of the rows of $w(z)'$ with nonzero coefficient in $c'$.

This substep consists of replacing row $i_0$ of $w(z)'$ with $c'w(z)'/(z - z_h)$, which is still a polynomial vector. In fact $c'w(z_h)' = 0'$, so that $c'w(z)'$ has representation $c'w(z)' = (z - z_h)\tau(z)'$ with $\tau(z)'$ a polynomial vector, and hence $c'w(z)'/(z - z_h) = \tau(z)'$. This defines $\kappa(z)'$ in terms of $w(z)'$ as $\kappa(z)' = B(z)^{-1}Qw(z)'$, where $Q$ is an $r \times r$ square matrix equal to $I_r$ except for row $i_0$, which equals $c'$, and where $B(z)$ is a diagonal matrix equal to $I_r$ except for having $z - z_h$ in its $i_0$-th position on the diagonal. Note that $Q$ is nonsingular because $c_{i_0} \neq 0$. The same procedure is applied to each row $c'$ of $\psi_\perp'$.

This substep is repeated for all $z_h$, $h = 1, \dots, k$. The condition on the minors is then recalculated and the substep repeated for the new roots, until the greatest common divisor $g(z)$ of all $r \times r$ minors of $\kappa(z)'$ equals 1. When this is the case, Step 2 sets $b^{(2)}(z)' = \kappa(z)'$.
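The substep can be sketched as follows (assuming sympy; `minor_gcd` and `step2_substep` are names chosen for this illustration). The gcd $g(z)$ of the maximal minors detects the points $z_h$ where the basis loses rank, and each left null vector $c$ of $w(z_h)'$ yields a row replacement from which $z - z_h$ is factored out:

```python
import sympy as sp
from functools import reduce
from itertools import combinations

z = sp.symbols('z')

def minor_gcd(b):
    """Greatest common divisor g(z) of all r x r minors of the r x p basis b."""
    r, p = b.shape
    minors = [b.extract(list(range(r)), list(cols)).det()
              for cols in combinations(range(p), r)]
    return reduce(sp.gcd, minors)

def row_deg(b, i):
    return max(sp.degree(e, z) if e != 0 else 0 for e in b.row(i))

def step2_substep(w, zh):
    """For each c with c' w(zh)' = 0', replace the highest-degree row entering
    c'w(z)' by c'w(z)'/(z - zh), which is again polynomial."""
    w = w.copy()
    for c in w.subs(z, zh).T.nullspace():          # left null vectors of w(zh)
        i0 = max((i for i in range(w.rows) if c[i] != 0),
                 key=lambda i: row_deg(w, i))
        w[i0, :] = ((c.T * w) / (z - zh)).applyfunc(sp.cancel)
    return w

w = sp.Matrix([[1, z], [1, 2*z]])    # gcd of the 2 x 2 minors is z, so z_1 = 0
w2 = step2_substep(w, 0)             # after the substep the gcd is constant
```

In the example the single $2 \times 2$ minor equals $z$, so the basis loses rank at $z_1 = 0$; one substep reduces the total degree from 2 to 1 and the gcd of the minors becomes constant.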

#### *7.3. Step 3*

The last step operates on the high order coefficient matrix, repeating the following substep. Let $w(z)'$ indicate $b^{(2)}(z)'$ at the beginning of the substep; it will be replaced by $\kappa(z)'$ at the end of it. Let $v_i$ be the order of the $i$-th row of $w(z)'$, indicated as $w_i(z)' = \sum_{j=0}^{v_i} (z - z_\omega)^j w_{i,j}'$. The high order coefficient matrix is defined as the $r \times p$ matrix $w_*'$ with rows $(w_{1,v_1}', \dots, w_{r,v_r}')$, i.e., composed of the coefficient of the highest degree of $(z - z_\omega)$ for each row of $w(z)'$.

A necessary and sufficient condition for $w_*'$ to be of full rank is that the order of $w(z)'$ is equal to the maximum order of its $r \times r$ minors. If this is not the case, $w_*'$ is singular, i.e., it has rank factorization $w_*' = \psi a'$ with $\psi$ and $a$ of full column rank. Hence one can choose a vector $c' := (c_1, \dots, c_r)$ as one row of $\psi_\perp'$, for which one has $c'w_*' = 0'$.

As before, let $A_c := \{i : c_i \neq 0\}$ and define $v_{i_0} := \max_{i \in A_c}\{v_i\}$. Let also $n_i := v_{i_0} - v_i$; note that $n_i \geq 0$ for $i \in A_c$, and let $Q(z) := \operatorname{diag}((z - z_\omega)^{n_1}, \dots, (z - z_\omega)^{n_r})$. Row $i_0$ of $w(z)'$ is replaced by

$$q(z)' := c'Q(z)w(z)' = \sum_{i \in A_c} c_i \sum_{j=0}^{v_i} (z - z_\omega)^{j + n_i} w_{i,j}' = \sum_{i \in A_c} c_i \sum_{s=n_i}^{v_{i_0}} (z - z_\omega)^s w_{i,s-n_i}' \tag{24}$$

$$= \sum_{j=0}^{v_{i_0}-1} (z - z_\omega)^j q_j' + (z - z_\omega)^{v_{i_0}} c'w_*' = \sum_{j=0}^{v_{i_0}-1} (z - z_\omega)^j q_j', \tag{25}$$

where $s$ in the last expression of the first line is defined as $s = j + n_i$ and $q_j' := \sum_{i \in A_c} c_i w_{i,j-n_i}'$, with the convention $w_{i,m}' := 0'$ for $m < 0$.

The central expression in (24) shows that $q(z)'$ is polynomial, because $n_i \geq 0$ in the exponents of $(z - z_\omega)$. In order to see that the degree of $q(z)'$ is also lower than $v_{i_0}$, note that the high order coefficient in (25), which corresponds to $s = v_{i_0}$ in (24), equals $\sum_{i \in A_c} c_i w_{i,v_i}' = c'w_*' = 0'$. This implies that the order of $q(z)'$ is lower than $v_{i_0}$, and that replacing row $i_0$ of $w(z)'$ with $q(z)'$ reduces the order of the basis.

This defines $\kappa(z)'$ in terms of $w(z)'$ as $\kappa(z)' = NQ(z)w(z)'$, where $N$ is an $r \times r$ square matrix equal to $I_r$ except for row $i_0$, which equals $c'$. Note that $N$ is nonsingular because $c_{i_0} \neq 0$. This process is repeated for all the rows $c'$ of $\psi_\perp'$. Next set $w(z)' = \kappa(z)'$ and repeat until the high order coefficient matrix has full rank. When this is the case, Step 3 sets $b^{(3)}(z)' = \kappa(z)'$.
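The Step 3 substep can be sketched as follows (assuming sympy and, for simplicity, the expansion point $z_\omega = 0$; function names are chosen for this illustration). The high order coefficient matrix collects the leading coefficient row of each row of $w(z)'$; when it is singular, one row can be replaced by a combination of strictly lower degree:

```python
import sympy as sp

z = sp.symbols('z')    # expansion point z_w = 0, so powers of (z - z_w) are powers of z

def row_deg(w, i):
    return max(sp.degree(e, z) if e != 0 else 0 for e in w.row(i))

def high_order_matrix(w):
    """Row i holds the coefficients of z**v_i in row i of w (the matrix w_*')."""
    return sp.Matrix([[sp.Poly(e, z).coeff_monomial(z**row_deg(w, i))
                       for e in w.row(i)] for i in range(w.rows)])

def step3_substep(w):
    """If w_*' is singular, reduce the degree of row i_0 using c'Q(z)w(z)';
    returns (new basis, True if w_*' was already of full row rank)."""
    ns = high_order_matrix(w).T.nullspace()        # vectors c with c' w_*' = 0'
    if not ns:
        return w, True                             # basis is already row proper
    w, c = w.copy(), ns[0]
    degs = [row_deg(w, i) for i in range(w.rows)]
    i0 = max((i for i in range(w.rows) if c[i] != 0), key=lambda i: degs[i])
    Qz = sp.diag(*[z**max(0, degs[i0] - degs[i]) for i in range(w.rows)])
    w[i0, :] = (c.T * Qz * w).applyfunc(sp.expand)
    return w, False

w = sp.Matrix([[1, z], [2, z]])      # both rows of degree 1, proportional leading rows
w1, done = step3_substep(w)          # the degree of one row drops from 1 to 0
```

In the example the high order coefficient matrix has identical rows, so one substep replaces one degree-1 row by a degree-0 row, after which the basis is row proper.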

#### **8. From a Canonical System of Root Functions to a Minimal Basis for** *I***(2) VAR**

This section applies the algorithm of Forney reviewed in Section 7 to $\varphi(z)'$ in (23), in order to reduce the basis to minimal order in the $I(2)$ VAR example at frequency $\omega = 0$. This application leads to the separation of the polynomial and the non-polynomial cointegrating vectors.
The process of obtaining minimal bases does not lead to a unique choice of basis; this leaves open the choice of how to further restrict the basis to obtain uniqueness. Forney (1975) obtains uniqueness by requiring the minimal basis to be in upper echelon form. Other sets of restrictions can also be considered. For the sake of brevity, the restrictions that deliver a unique minimal basis are not further discussed here.

#### *8.1. Step 1 in I*(2) *VAR*

Consider the triangular representation of an *I*(2) system, see (23):

$$\Gamma_\circ(z) := \begin{pmatrix} \Phi(z)' \\ \beta_2' \end{pmatrix}, \qquad \Phi(z)' := \begin{pmatrix} \gamma_0^{(1)}(z)' \\ \gamma_1^{(0)}(z)' \end{pmatrix} = \begin{pmatrix} \beta_0' + \gamma_{0,1}'(z - 1) \\ \beta_1' \end{pmatrix}, \tag{26}$$

and apply the algorithm of Forney (1975) to $b^{(0)}(z)' := \Phi(z)'$. Because $b^{(0)}(z)'$ is already polynomial, one has $b^{(1)}(z)' = b^{(0)}(z)' = \Phi(z)'$.

#### *8.2. Step 2 in I*(2) *VAR*

Next consider Step 2 and set $w(z)' = b^{(1)}(z)'$. One wishes to find some zero $z_h$ and some corresponding $c'$ such that $c'w(z_h)' = 0'$. Denoting $u = z_h - 1$, one hence needs to find a pair $(u, c')$ such that

$$
c'\left(\begin{pmatrix} \beta_0' \\ \beta_1' \end{pmatrix} + \begin{pmatrix} \gamma_{0,1}' \\ 0 \end{pmatrix} u\right) = 0',\tag{27}
$$

where $u$ is a scalar. Note that $u = 0$ is not a possible solution of (27), because $w(z_\omega)' = (\beta_0, \beta_1)'$ is of full row rank; hence $u \neq 0$. Post-multiplying (27) by the square non-singular matrix $(\bar{\beta}_0, \bar{\beta}_1, \bar{\beta}_2)$ one finds

$$
c'\left(\begin{pmatrix} I_{r_0} & 0 & 0 \\ 0 & I_{r_1} & 0 \end{pmatrix} + \begin{pmatrix} \gamma_{0,1}'\bar{\beta}_0 & \gamma_{0,1}'\bar{\beta}_1 & \gamma_{0,1}'\bar{\beta}_2 \\ 0 & 0 & 0 \end{pmatrix} u\right) = 0'. \tag{28}
$$

Hence, partitioning $c'$ as $c' = (\varsigma', \theta')$, where $\varsigma'$ is $1 \times r_0$, one finds that the second set of equations gives $\theta' = 0'$, and the first one, substituting the expression $\gamma_{0,1}' = -\bar{\alpha}_0' A_1$ given in Theorem 3, implies

$$
\varsigma'\bar{\alpha}_0' A_1\bar{\beta}_0 = \lambda\varsigma', \qquad \lambda := u^{-1} \neq 0,\tag{29}
$$

$$
\varsigma'\bar{\alpha}_0' A_1(\beta_1, \beta_2) = 0',\tag{30}
$$

where $\lambda = u^{-1} \neq 0$ in (29); note also that the nonzero factor $u$ has been cancelled in (30). This proves the following proposition.

**Proposition 6** (Step 2 condition in $I(2)$)**.** *A necessary and sufficient condition for Step 2 to be non-empty is that* (29) *and* (30) *hold simultaneously, i.e., that $(\lambda, \varsigma')$ is a non-zero eigenvalue–left eigenvector pair of $\bar{\alpha}_0' A_1\bar{\beta}_0$, and the left eigenvector $\varsigma'$ is orthogonal to $\bar{\alpha}_0' A_1(\bar{\beta}_1, \bar{\beta}_2)$. If this is the case, for each pair $(\lambda, \varsigma')$ one has*

$$
\varsigma'\bar{\alpha}_0' A_1 = \varsigma'\bar{\alpha}_0' A_1 P_{\beta_0} = \lambda\varsigma'\beta_0'. \tag{31}
$$

Observe that from (27), using $c' = (\varsigma', 0')$ and $z - 1 = (z - z_h) + u$ with $u = z_h - 1$, one finds

$$\begin{split} c'w(z)' &= \varsigma'\beta_0' - (z - z_h + u)\,\varsigma'\bar{\alpha}_0' A_1 \\ &= \varsigma'\left(\beta_0' - u\,\bar{\alpha}_0' A_1\right) - (z - z_h)\,\varsigma'\bar{\alpha}_0' A_1 = -(z - z_h)\,\varsigma'\bar{\alpha}_0' A_1, \end{split} \tag{32}$$

where the last equality follows from (31). This shows that, under the necessary and sufficient condition in Proposition 6, there is a linear combination $c'$ of the rows of $w(z)'$ from which one can factor $z - z_h$ out of $c'w(z)'$, reducing the order from 1 to 0. Here $c'w(z)'$, which has degree equal to 1, is replaced by $c'w(z)'/(z - z_h) = -\varsigma'\bar{\alpha}_0' A_1 = -\lambda\varsigma'\beta_0'$, which has degree 0. Note that from (31) the new cointegrating relation is in the span of $\beta_0'$.

This can be done for all pairs $(\lambda, \varsigma')$. Let $(\lambda_j, \varsigma_j')$, $j = 1, \dots, s$, be all the pairs $(\lambda, \varsigma')$ satisfying the assumptions of Proposition 6, and let $q := (\lambda_1\varsigma_1, \dots, \lambda_s\varsigma_s)$. Choose also some $r_0 \times (r_0 - s)$ matrix $a$ such that $(q, a)$ is square and nonsingular; many matrices satisfy this criterion, including $a = q_\perp$. The output of Step 2 can be expressed as the following choice of $b^{(2)}(z)'$:

$$b^{(2)}(z)' = \begin{pmatrix} a'\beta_0' - (z - 1)\,a'\bar{\alpha}_0' A_1 \\ q'\beta_0' \\ \beta_1' \end{pmatrix}.\tag{33}$$

**Remark 28** (CI(2,2) cointegration)**.** *This step brings out from $\varphi(z)'$ some cointegrating relations $q'\beta_0'$ that map the $I(2)$ variables directly to $I(0)$ without the help of first differences $\Delta$.*
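The condition of Proposition 6 can be checked numerically. The following sketch (assuming numpy; function names and tolerances are choices of this illustration) computes the left eigenpairs of $\bar{\alpha}_0' A_1\bar{\beta}_0$ and retains those whose eigenvector is orthogonal to $\bar{\alpha}_0' A_1(\beta_1, \beta_2)$:

```python
import numpy as np

def bar(x):
    """x_bar = x (x'x)^{-1}, so that x_bar' x = I."""
    return x @ np.linalg.inv(x.T @ x)

def step2_pairs(A1, alpha0, beta0, beta1, beta2, tol=1e-9):
    """Pairs (lambda, sigma) with sigma' abar0' A1 bbar0 = lambda sigma',
    lambda != 0, and sigma' abar0' A1 (beta1, beta2) = 0' (cf. Proposition 6)."""
    M = bar(alpha0).T @ A1 @ bar(beta0)
    lams, V = np.linalg.eig(M.T)          # columns of V: left eigenvectors of M
    pairs = []
    for lam, v in zip(lams, V.T):
        if abs(lam) > tol and np.allclose(
                v @ bar(alpha0).T @ A1 @ np.hstack([beta1, beta2]), 0, atol=tol):
            pairs.append((lam, v))
    return pairs
```

Each returned pair $(\lambda, \varsigma')$ identifies a combination $\varsigma'\beta_0'$ that cointegrates directly to $I(0)$; when no pair is returned, Step 2 is empty.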

#### *8.3. Step 3 in I*(2) *VAR*

Consider $b^{(2)}(z)'$ in (33) and its high order coefficient matrix

$$w_*' = \begin{pmatrix} -a'\bar{\alpha}_0' A_1 \\ q'\beta_0' \\ \beta_1' \end{pmatrix}.$$

Step 3 requires finding a nonzero vector $c'$ such that $c'w_*' = 0'$. Recall that $(\bar{\beta}_0, \bar{\beta}_1, \bar{\beta}_2)$ is square and nonsingular; hence $c'w_*' = 0'$ if and only if, partitioning $c'$ as $c' = (\zeta', \rho', \tau')$, one has

$$0' = c'w_*'\left(\bar{\beta}_0, \bar{\beta}_1, \bar{\beta}_2\right) = (\zeta', \rho', \tau')\begin{pmatrix} -a'\bar{\alpha}_0' A_1\bar{\beta}_0 & -a'\bar{\alpha}_0' A_1\bar{\beta}_1 & -a'\bar{\alpha}_0' A_1\bar{\beta}_2 \\ q' & 0 & 0 \\ 0 & I_{r_1} & 0 \end{pmatrix}.$$

This equality can be written as

$$
\zeta' a'\bar{\alpha}_0' A_1(\bar{\beta}_0, \bar{\beta}_1) = (\rho'q', \tau'),
\tag{34}
$$

$$
\zeta' a'\bar{\alpha}_0' A_1\bar{\beta}_2 = 0'. \tag{35}
$$

**Remark 29** (Further degree reductions)**.** *Equation* (35) *requires $\zeta$ to be orthogonal to the remaining part of the multicointegrating coefficient $a'\bar{\alpha}_0' A_1$ in the direction of $\beta_2$. In addition, $\zeta$ also needs to satisfy* (34)*. For some configurations of dimensions,* (34) *can be solved for $(\rho', \tau')$ in terms of the other quantities; in this case* (34) *does not impose further restrictions.*

Let also $\vartheta$ be any complementary matrix such that $(\zeta, \vartheta)$ is square and nonsingular; one possible choice of $\vartheta$ is $\zeta_\perp$. The output of Step 3 can be expressed as the following choice of $b^{(3)}(z)'$:

$$b^{(3)}(z)' = \begin{pmatrix} \vartheta' a'(\beta_0' - (z - 1)\bar{\alpha}_0' A_1) \\ \zeta' a'\beta_0' \\ q'\beta_0' \\ \beta_1' \end{pmatrix}. \tag{36}$$

**Remark 30** (Minimal basis)**.** *This step brings out from $\varphi(z)'$ some further cointegrating relations $\zeta' a'\beta_0'$ that map the $I(2)$ variables directly to $I(0)$ without the help of first differences $\Delta$. Equation* (36) *shows how the canonical system of root functions can be reduced to minimal order.*

**Example 4** (Multicointegration coefficient in the span of $\beta_2$)**.** *Consider the special case when the multicointegrating coefficient $\bar{\alpha}_0' A_1$ satisfies $\bar{\alpha}_0' A_1 = \bar{\alpha}_0' A_1 P_{\beta_2}$, i.e., it has components only in the direction of $\beta_2$. This special case is relevant, because $\beta_2'\Delta x_t \sim I(1)$, while $\beta_i'\Delta x_t \sim I(d)$ with $d \leq 0$ for $i = 0, 1$.*

*One can see that in this case the conditions in Proposition 6 are not satisfied. In fact,* (29) *cannot hold, because $\bar{\alpha}_0' A_1\bar{\beta}_0 = 0$. Step 2 is hence empty, which implies that the rows involving $q$ are absent and $a = I$ in $b^{(2)}(z)'$ in* (33) *and in* (36)*.*

*Applying Step 3, Equation* (34) *is always satisfied by the choice $\rho' = 0'$, $\tau' = 0'$, because $\bar{\alpha}_0' A_1(\bar{\beta}_0, \bar{\beta}_1) = 0$. Equation* (35) *then reads $\zeta'\bar{\alpha}_0' A_1\bar{\beta}_2 = 0'$, which has nonzero solutions if and only if $\delta := \bar{\alpha}_0' A_1\bar{\beta}_2$ has reduced rank. In this case, let the rank factorization be $\delta = \psi\eta'$, with $\psi$ and $\eta$ of full column rank. One can then let $\zeta' = \psi_\perp'$ and choose $\vartheta = \bar{\psi}$, so that*

$$b^{(3)}(z)' = \begin{pmatrix} \bar{\psi}'\beta_0' - (z - 1)\eta'\beta_2' \\ \zeta'\beta_0' \\ \beta_1' \end{pmatrix}. \tag{37}$$

*There are several examples of this separation in the I(2) VAR literature; for example Kongsted (2005) discusses this when r*<sup>0</sup> > *r*2*.*

#### **9. Conclusions**

This paper discusses the notion of cointegrating space for general $I(d)$ processes. The notion of cointegrating space was formally introduced in the literature by Johansen (1988) for the case of $I(1)$ VAR systems. The definition of the cointegrating space is simplest in the $I(1)$ case without multicointegration, because there is no need to consider vector polynomials in the lag operator.

Engle and Yoo (1991) introduced the notion of polynomial cointegrating vectors in parallel with the related one of multicointegration in Granger and Lee (1989). However, the literature has not yet discussed the notion of cointegrating space in the general polynomial case; this paper fills this gap.

In this context, this paper recognises that cointegrating vectors are in general root functions, which have been analysed at length in the mathematical and engineering literature, see e.g., Gohberg et al. (1993). This allows one to characterise a number of properties of cointegrating vectors.

Canonical systems of root functions are found to provide a basis of several notions of cointegration space in the multicointegrated case. The extended local rank factorization of Franchi and Paruolo (2016) can be used to explicitly derive a canonical system of root functions. This result is constructive, as it gives an explicit way to derive such a basis from the VAR polynomial.

The canonical system of root functions constructed in this way is not necessarily of minimal polynomial degree, however. The three-step procedure of Forney (1975) to reduce this basis to minimal-degree is reviewed and restated in terms of rank factorizations. The application of this procedure to I(2) VAR systems is shown to separate the polynomial and the non-polynomial cointegrating vectors.

**Author Contributions:** The authors contributed equally to the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

*Appendix A.1. Scalar, Vector, Matrix Analytic Functions*

Consider a rational function $a(z)$, defined as $a(z) = c(z)/d(z)$ with $c(z)$ and $d(z)$ polynomials, where $d(z) \not\equiv 0$. One can ask when $a(z)$ is analytic on $D(z_\omega, \eta)$, $\eta > 0$. The following remark states that this is the case provided $z_\omega$ is not a root of $d(z)$.

**Remark A1** (Rational scalars can be analytic on $D(z_\omega, \eta)$)**.** *Let $a(z)$ be a rational function, i.e., $a(z) = c(z)/d(z)$ with $c(z)$ and $d(z)$ polynomial; assume in addition that $d(z)$ has no root equal to $z_\omega$. Then $a(z)$ is analytic on $D(z_\omega, \eta)$ for some $\eta > 0$. In fact, let $q = \deg d(z)$ be the degree of $d(z)$, and decompose $d(z) = \sum_{j=0}^{q} d_j^\circ z^j$ as $d(z) = d_q^\circ\prod_{j=1}^{n}(z - u_j)^{k_j}$, where $u_j$ are the roots of $d(z)$ with multiplicities $k_j$, $j = 1, \dots, n$, using the factor theorem for polynomials, see e.g., Barbeau (1989, p. 56). Then each term $(z - u_j)^{-k_j}$ has an analytic representation on $D(z_\omega, \eta)$, $\eta > 0$, see Lemma A1 below. Note that this generates an infinite tail in $a(z)$, i.e., $a(z)$ is not polynomial in this case (unless $q = 0$).*

**Lemma A1** (The inverse of a polynomial is analytic away from its roots)**.** *Let $u_1, \dots, u_n \in \mathbb{C}$ be the distinct roots of a polynomial $d(z)$ with multiplicities $k_1, \dots, k_n$, $k_j \in \mathbb{N}$, and let $v \in \mathbb{C}$ be another point, distinct from the $u_j$; then one can pick some radius $\delta$ with $0 < \delta < \min_{j=1,\dots,n} |u_j - v|$ such that $d(z)^{-1}$ is analytic on $z \in D(v, \delta)$.*

**Proof.** The polynomial $d(z)$ can be decomposed as $d(z) = a\prod_{j=1}^{n}(z - u_j)^{k_j}$. Next consider each term $(z - u_j)^{k_j}$ in the product and observe that $z - u_j = (z - v) - (u_j - v) = (v - u_j)(1 - x_j)$, where $x_j := (z - v)/(u_j - v)$. Define $0 < \delta_j < |u_j - v|$ and note that $|x_j| < 1$ for $z \in D(v, \delta_j)$, so that $(1 - x_j)^{-1} = \sum_{s=0}^{\infty} x_j^s$ for $z \in D(v, \delta_j)$ and

$$(z - u_j)^{-k_j} = (v - u_j)^{-k_j}\left(\sum_{s=0}^{\infty}\left(\frac{z - v}{u_j - v}\right)^s\right)^{k_j}, \qquad z \in D(v, \delta_j).$$

Hence $(z - u_j)^{-k_j}$ is analytic on $z \in D(v, \delta_j)$ for every $j = 1, \dots, n$, and as a consequence also on $z \in D(v, \delta)$ with $0 < \delta < \min_{j=1,\dots,n}|u_j - v|$. This implies that $d(z)^{-1} = a^{-1}\prod_{j=1}^{n}(z - u_j)^{-k_j}$ is analytic on $z \in D(v, \delta)$.
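The geometric series argument in the proof of Lemma A1 can be checked numerically; in this sketch (plain Python; the specific numbers are arbitrary choices of this illustration) a single root $u = 2$ of multiplicity $k_j = 1$ is expanded around $v = 0$ and evaluated at a point inside the disk of convergence:

```python
# Check (z - u)^{-k} = (v - u)^{-k} (sum_s x^s)^k with x = (z - v)/(u - v),
# truncating the series at 50 terms; convergence needs |z - v| < |u - v|.
u, v, k = 2.0, 0.0, 1          # one root u of multiplicity k, expansion point v
z0 = 0.5                       # evaluation point with |z0 - v| < |u - v|
x = (z0 - v) / (u - v)
series = sum(x**s for s in range(50))
approx = (v - u)**(-k) * series**k
exact = (z0 - u)**(-k)
```

With $|x| = 0.25$ the truncated series is accurate to machine precision, and `approx` agrees with the direct evaluation `exact`.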

Similarly, consider a $1 \times p$ vector function $\zeta(z)'$ with rational entries. The denominator polynomials of all entries can be collected into a single one, the least common denominator, and hence $\zeta(z)'$ has representation $\zeta(z)' = \frac{1}{d(z)}b(z)'$, where $d(z)$ is a monic polynomial and $b(z)'$ is a $1 \times p$ vector polynomial, and $d(z)$ and $b(z)'$ are relatively prime. The same applies to $p \times p$ rational matrix functions $C(z)$.

**Example A1** (The least common denominator of bivariate rational vectors)**.** *The least common denominator can be illustrated as follows. Take a $1 \times 2$ rational row vector $a(z)' = (a_1(z), a_2(z)) = (c_1(z)/d_1(z), c_2(z)/d_2(z))$, where $c_i(z), d_i(z)$ are (nonzero) polynomials, $i = 1, 2$; then one can find a polynomial $d(z)$ of lowest degree such that $d(z) = h_1(z) d_1(z) = h_2(z) d_2(z)$ where $h_i(z)$ are polynomials, $i = 1, 2$; $d(z)$ is the least common multiple of the denominators, i.e., the least common denominator, and one has*

$$a(z)' = \left(\frac{c_1(z)}{d_1(z)}, \frac{c_2(z)}{d_2(z)}\right) = \left(\frac{c_1(z) h_1(z)}{d(z)}, \frac{c_2(z) h_2(z)}{d(z)}\right) =: \frac{1}{d(z)} b(z)'$$

*with $b(z)' := (b_1(z), b_2(z)) := (c_1(z) h_1(z), c_2(z) h_2(z))$, where $b_j(z) := c_j(z) h_j(z)$ are still polynomials, so that $b(z)'$ is a vector polynomial. The vector polynomial $b(z)'$ and the scalar polynomial $d(z)$ are relatively prime, because there is no scalar polynomial $g(z)$ that divides both $d(z)$ and all the elements of $b(z)'$. The polynomials in $b(z)'$ and $d(z)$ can still be divided by a scalar in $F$, so $d(z)$ can be assumed to be monic.*
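The construction of Example A1 can be mirrored symbolically. The following sketch (with hypothetical numerator and denominator polynomials sharing the factor $z - 1$) computes the least common denominator $d(z)$, the cofactors $h_i(z)$ and the vector polynomial $b(z)'$:

```python
import sympy as sp

z = sp.symbols('z')
# hypothetical entries a_i(z) = c_i(z)/d_i(z)
c1, d1 = z + 2, sp.expand((z - 1) * (z - 3))
c2, d2 = sp.Integer(1), sp.expand((z - 1) * (z + 5))

d = sp.lcm(d1, d2)        # least common denominator (monic for monic inputs)
h1 = sp.cancel(d / d1)    # polynomial cofactor with d = h1 * d1
h2 = sp.cancel(d / d2)    # polynomial cofactor with d = h2 * d2
b = [sp.expand(c1 * h1), sp.expand(c2 * h2)]  # vector polynomial b(z)

# each entry of a(z) is recovered as b_i(z)/d(z)
ok = sp.simplify(b[0] / d - c1 / d1) == 0 and sp.simplify(b[1] / d - c2 / d2) == 0
```

Here $d(z) = (z - 1)(z - 3)(z + 5)$ has degree 3 rather than 4, because the common factor $z - 1$ enters only once.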

**Remark A2** (Rational vectors and matrices)**.** *The $1 \times p$ analytic vector functions $\zeta(z)'$ and $p \times p$ analytic matrix functions $C(z)$ can be generated as rational vectors or matrices, as long as their denominator polynomial $d(z)$ has no root equal to $z_\omega$. When $d(z)$ has a root equal to $z_\omega$ with multiplicity $m > 0$, this implies that $\zeta(z)'$ or $C(z)$ has a pole of order $m > 0$ at $z_\omega$, and $\zeta(z)'$ or $C(z)$ is not analytic on any disk $D(z_\omega, \eta)$ centered at $z_\omega$.*

#### *Appendix A.2. Spans of Canonical Systems of Root Functions*

This section considers linear combinations of a canonical system of root functions $\phi(z)'$ with coefficients in $F$, $F[z]$ and $F(z)$. Attention is first given to the multiplication of a root function by a polynomial or rational scalar; next, generic linear combinations of the canonical system of root functions $\phi(z)'$ are considered.

In order to discuss results, the notion of generalized root function is introduced first.

**Definition A1** (Generalized root function)**.** *Let $n \in \mathbb{Z}$ and let $\zeta(z)'$ be a root function of $C(z)$ at $z_\omega$ of order $s$, see Definition 3; then*

$$\xi(z)' := (1 - z/z_\omega)^n \zeta(z)'$$

*is called a generalized root function of $C(z)$ at $z_\omega$ with order $s$ and exponent $n$.*

Observe that this is in line with Definition 5 of generalized cointegrating vectors for rational vectors. The reason for the introduction of the notion of generalized root function is provided by the next proposition.

**Proposition A1** (Multiplication by a scalar)**.** *Let $\zeta(z)'$ be a $1 \times p$ root function for $C(z)$ of order $s$ on $D(z_\omega, \eta)$, $\eta > 0$. Then*

*(i) if $a \in F$, $a \neq 0$, then $a \zeta(z)'$ is a root function on $D(z_\omega, \eta)$ of order $s$;*

*(ii) if $a(z) \in F[z]$, $a(z) \neq 0$, then $a(z) \zeta(z)'$ is a generalized root function on $D(z_\omega, \eta)$ of order $s$ and exponent $n \in \mathbb{N}_0$;*

*(iii) if $a(z) \in F(z)$, $a(z) \neq 0$, then $a(z) \zeta(z)'$ is a generalized root function on $D(z_\omega, \eta)$ of order $s$ and exponent $n \in \mathbb{Z}$.*

**Proof.** Consider first case (*iii*).

(*iii*). Write $a(z) = a_1(z)/a_2(z)$ where $a_1(z), a_2(z)$ are relatively prime polynomials. If $a_i(z)$ has root $z_\omega$, it admits the representation $a_i(z) = (1 - z/z_\omega)^{n_i} a_i^\circ(z)$ with $n_i \in \mathbb{N}_0$ and $a_i^\circ(z_\omega) \neq 0$, $i = 1, 2$. Hence $a(z) = a_1(z)/a_2(z) = (1 - z/z_\omega)^{n_1 - n_2} a^\circ(z)$ where $a^\circ(z) := a_1^\circ(z)/a_2^\circ(z)$ with $a^\circ(z_\omega) \neq 0$, and $a(z) \zeta(z)' = (1 - z/z_\omega)^{n_1 - n_2} a^\circ(z) \zeta(z)'$. The factor $(1 - z/z_\omega)^{n_1 - n_2}$ has exponent $n_1 - n_2$, which can be positive, negative or 0; because $a_1(z)$ and $a_2(z)$ are relatively prime, either $n_1 > 0$, or $n_2 > 0$, or $n_1 = n_2 = 0$. The factor $a^\circ(z) \zeta(z)'$ is a root function of order $s$, because $\zeta(z)'$ is a root function of order $s$ and the scalar factor $a^\circ(z)$ satisfies $a^\circ(z_\omega) \neq 0$, so that $a^\circ(z_\omega) \zeta(z_\omega)' \neq 0'$. This shows that $a(z) \zeta(z)'$ is a generalized root function of order $s$ and exponent $n = n_1 - n_2$.

(*ii*). Set $a_2(z) = 1$ in the proof of (*iii*), and note that the exponent is $n_1$, which is either 0 or positive.

(*i*). Set $a_1(z) = a$, $a_2(z) = 1$ in the proof of (*iii*), and note that the exponent is $n_1 = 0$.
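Case (*iii*) can be illustrated symbolically (hypothetical polynomials, with $z_\omega = 1$): the exponent $n = n_1 - n_2$ is read off as the difference of the multiplicities of $z_\omega$ in numerator and denominator, and the remaining factor $a^\circ(z)$ does not vanish at $z_\omega$:

```python
import sympy as sp

z = sp.symbols('z')
z_w = sp.Integer(1)                   # hypothetical unit root z_omega = 1
a1 = sp.expand((1 - z)**2 * (z + 3))  # numerator with n_1 = 2
a2 = z - 5                            # denominator with n_2 = 0
a = a1 / a2

n1 = sp.roots(a1, z).get(z_w, 0)      # multiplicity of z_w as a root of a_1(z)
n2 = sp.roots(a2, z).get(z_w, 0)      # multiplicity of z_w as a root of a_2(z)
n = n1 - n2                           # exponent of the generalized root function

a_circ = sp.cancel(a / (1 - z / z_w)**n)  # a(z) = (1 - z/z_w)^n * a_circ(z)
a_circ_at_root = a_circ.subs(z, z_w)      # nonzero by construction
```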

**Remark A3** (A generalized root function is meromorphic)**.** *A generalized root function $\xi(z)'$ is analytic on $D(z_\omega, \eta)$ except possibly for a pole at the isolated point $z_\omega$, i.e., it is a meromorphic function on $D(z_\omega, \eta)$.*

**Remark A4** (A generalized root function can be analytic)**.** *When the exponent n of ξ*(*z*) *is zero, the generalized root function ξ*(*z*) *coincides with the root function ζ*(*z*) *. When the exponent n of ξ*(*z*) *is positive, then the generalized root function ξ*(*z*) *has a zero at zω. In both cases ξ*(*z*) *is analytic. So a generalized root function can be analytic (with or without a zero at zω).*

**Remark A5** (Generalized root function and cointegration)**.** *Observe that Definition A1 implies the following: given a meromorphic function $\xi(z)'$, check whether it has a zero or a pole at $z_\omega$; this function is a generalized root function if, after removing the pole or the zero at $z_\omega$ by multiplying it by $(1 - z/z_\omega)^{-n}$, where $n$ is the order of the zero or of the pole, the resulting function is a root function, i.e., a cointegrating vector. This is in line with Definition 5.*

Attention is now turned to linear combinations of a canonical system of root functions *φ*(*z*) . The scalars of the linear combination can be in *F*, *F*[*z*] or *F*(*z*). The main result in Proposition A2 below is that *F*[*z*]-linear combinations of *φ*(*z*) generate a generalized root function possibly with a zero at *zω*, while *F*(*z*)-linear combinations of *φ*(*z*) generate a generalized root function possibly with a pole or a zero at *zω*.

In the following, let $v = (v_1, \dots, v_m) \in F^m$ be a $1 \times m$ vector with elements in $F$. Let $A_v$ be the set of indices of the non-zero entries of $v$, $A_v := \{i : v_i \neq 0\}$, with $n_v$ the cardinality of $A_v$ and $(i_1, \dots, i_{n_v})$ the ordered set of indices in $A_v$, $i_1 < \cdots < i_{n_v}$, $i_j \in A_v$. Similarly, let $w(z) = (w_1(z), \dots, w_m(z)) \in F[z]^m$ be a $1 \times m$ vector with polynomial elements $w_i(z) \in F[z]$, with $(j_1, \dots, j_{n_w})$ the ordered set of indices of the nonzero elements in $A_w := \{i : w_i(z) \neq 0\}$, and let finally $u(z) = (u_1(z), \dots, u_m(z)) \in F(z)^m$ be a $1 \times m$ vector with rational elements $u_i(z) \in F(z)$, with $(k_1, \dots, k_{n_u})$ the ordered set of indices of the nonzero elements in $A_u := \{i : u_i(z) \neq 0\}$.

**Proposition A2** (Linear combinations)**.** *Let $\phi(z)'$, with rows $\phi_1(z)', \dots, \phi_m(z)'$, be a canonical system of root functions of $C(z)$ on a disc $D(z_\omega, \eta)$, $\eta > 0$, with orders $s_1, \dots, s_m$; let also $v \in F^m$, $w(z) \in F[z]^m$ and $u(z) \in F(z)^m$ be nonzero vectors; one has:*

*(i) $v' \phi(z)' = \sum_{i=1}^{m} v_i \phi_i(z)'$ is a root function of order $s = \min_{i \in A_v} s_i > 0$;*

*(ii) $w(z)' \phi(z)' = \sum_{j=1}^{m} w_j(z) \phi_j(z)'$ is a generalized root function, possibly with a zero at $z_\omega$, with exponent $q = \min_{j \in A_w} (q_j) \in \mathbb{N}_0$ where $q_j$ is the order of $z_\omega$ as a zero of $w_j(z)$, and with order $s = \min_{j \in A_w} (q_j - q + s_j) > 0$;*

*(iii) $u(z)' \phi(z)' = \sum_{k=1}^{m} u_k(z) \phi_k(z)'$ is a generalized root function, possibly with a pole or a zero at $z_\omega$, with exponent $q = \min_{k \in A_u} (q_k) \in \mathbb{Z}$ where $q_k$ is the order of $z_\omega$ as a zero of $u_k(z)$, counted negatively when $z_\omega$ is a pole, and with order $s = \min_{k \in A_u} (q_k - q + s_k) > 0$.*

**Proof.** (*i*). By definition $\phi_i(z)' = \sum_{j=0}^{s_i} (z - z_\omega)^j \phi_{ij}'$, analytic on $D(z_\omega, \eta)$ with $\phi_{ij} \in F^p$. One finds $\sum_{i=1}^{m} v_i \phi_i(z)' = \sum_{j} (z - z_\omega)^j \varphi_j'$ with $\varphi_j' := \sum_{i=1}^{m} v_i \phi_{ij}' \in F^p$, because $F$ is a field and hence closed under multiplication and addition. Hence $v' \phi(z)'$ is a polynomial with coefficient vectors in $F^p$, of the same form as each $\phi_i(z)'$, and one finds that

$$v' \phi(z)' C(z) = \sum_{i=1}^{m} v_i \phi_i(z)' C(z) = \sum_{i \in A_v} v_i (z - z_\omega)^{s_i} \widetilde{\phi}_i(z)' = (z - z_\omega)^s \widetilde{\phi}(z)' \tag{A1}$$

where $s := \min\{s_{i_1}, \dots, s_{i_{n_v}}\}$ and $\widetilde{\phi}(z)' := \sum_{h=1}^{n_v} (z - z_\omega)^{s_{i_h} - s} v_{i_h} \widetilde{\phi}_{i_h}(z)'$ with $\widetilde{\phi}_{i_h}(z_\omega)' \neq 0'$. Note that because $v$ is a nonzero vector, the set $A_v$ is not empty. Next observe that $\widetilde{\phi}(z_\omega)' \neq 0'$, since otherwise this would contradict the property of $\phi_{i_h}(z)'$ being of maximal order and linearly independent from the previous root functions $\phi_i(z)'$ for $i < i_h$. This shows that $v' \phi(z)'$ is a root function of order $s$.

(*ii*). Consider $w(z)' \phi(z)' = \sum_{i=1}^{m} w_i(z) \phi_i(z)'$, where by Proposition A1.(*ii*) each $w_i(z) \phi_i(z)'$ is a generalized root function with representation $w_i(z) \phi_i(z)' = (1 - z/z_\omega)^{q_i} w_i^\circ(z) \phi_i(z)'$, say, with $q_i \geq 0$ and $w_i^\circ(z) \phi_i(z)'$ a root function of order $s_i$. Let $q := \min(q_{j_1}, \dots, q_{j_{n_w}})$, and note that $w(z)' \phi(z)' = (1 - z/z_\omega)^q \zeta(z)'$ with $\zeta(z)' := \sum_{h=1}^{n_w} (1 - z/z_\omega)^{q_{j_h} - q} w_{j_h}^\circ(z) \phi_{j_h}(z)'$. In order to show that $\zeta(z_\omega) \neq 0'$, let $B_w$ be the set of indices $j \in A_w$ with $q_j = q$, and observe that $\zeta(z_\omega)' = \sum_{j \in B_w} w_j^\circ(z_\omega) \phi_j(z_\omega)'$, where $w_j^\circ(z_\omega) \neq 0$ by construction and $\phi_j(z_\omega)' \neq 0'$ by the definition of a root function. If $\zeta(z_\omega) = 0'$, this would imply that there is a nonzero linear combination of the rows of $\phi(z_\omega)'$ equal to $0'$, i.e., that $\phi(z_\omega)'$ is not of full row rank, which contradicts the construction in Definition 4. This implies that $\zeta(z_\omega) \neq 0'$, and that $w(z)' \phi(z)'$ is a generalized root function with exponent $q$.

Next, because *φj*(*z*) is a root function of order *sj* one has

$$\begin{aligned} \zeta(z)' C(z) &= \sum_{h=1}^{n_w} (1 - z/z_\omega)^{q_{j_h} - q} w_{j_h}^\circ(z) \phi_{j_h}(z)' C(z) \\ &= \sum_{h=1}^{n_w} (1 - z/z_\omega)^{q_{j_h} - q + s_{j_h}} w_{j_h}^\circ(z) \widetilde{\phi}_{j_h}(z)' = (1 - z/z_\omega)^s \widetilde{\zeta}(z)' \end{aligned}$$

where $\widetilde{\zeta}(z)' := \sum_{h=1}^{n_w} (1 - z/z_\omega)^{q_{j_h} - q + s_{j_h} - s} w_{j_h}^\circ(z) \widetilde{\phi}_{j_h}(z)'$. Finally, in order to prove that the order of the generalized root function is $s$, one needs to show that $\widetilde{\zeta}(z_\omega) \neq 0'$. Let $C_w$ be the set of indices $j \in A_w$ with $q_j - q + s_j = s$, and observe that $\widetilde{\zeta}(z_\omega)' = \sum_{j \in C_w} w_j^\circ(z_\omega) \widetilde{\phi}_j(z_\omega)'$ where $w_j^\circ(z_\omega) \neq 0$ and $\widetilde{\phi}_j(z_\omega)' \neq 0'$ as above. If $\widetilde{\zeta}(z_\omega) = 0'$, there would exist a nonzero linear combination of the vectors $\widetilde{\phi}_j(z_\omega)'$, $j \in C_w$, equal to $0'$, which would imply the existence of a root function of higher order obtained by combining the rows of $\phi(z)'$ with indices in $C_w$, contradicting the fact that the orders are chosen to be maximal. This implies that the order of the generalized root function is equal to $s$.

(*iii*). The proof is the same as for (*ii*). Note that here $q_i$ may be negative.
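For case (*i*), a small symbolic check (hypothetical $C(z)$ with $z_\omega = 1$, whose canonical system of root functions can be taken as the unit vectors with orders $s_1 = 2$ and $s_2 = 1$) confirms that the combination with $v = (1, 1)$ has order $\min(s_1, s_2)$:

```python
import sympy as sp

z = sp.symbols('z')
z_w = sp.Integer(1)
# hypothetical C(z); the rows e_1' and e_2' are root functions of orders 2 and 1
C = sp.Matrix([[(1 - z)**2, 0],
               [0, 1 - z]])
phi = sp.eye(2)               # canonical system of root functions (rows)
v = sp.Matrix([[1, 1]])       # nonzero vector in F^2

row = v * phi * C             # v' phi(z)' C(z)
# order of the combination: minimal multiplicity of z_w across the nonzero entries
order = min(sp.roots(sp.expand(e), z).get(z_w, 0)
            for e in row if sp.expand(e) != 0)
```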

**Remark A6** (Closure with respect to linear combinations)**.** *Proposition A2 shows that $F[z]$-linear combinations and $F(z)$-linear combinations of a canonical system of root functions $\phi(z)'$ produce generalized root functions. Note that $\phi(z)'$ is itself a set of generalized root functions (with exponent 0). Hence, in this sense, generalized root functions are closed under $F[z]$-linear combinations and $F(z)$-linear combinations.*

**Remark A7** (Spans)**.** *Indicate the set of $G$-linear combinations of $\phi(z)'$ as $\text{row}_G(\phi(z)')$, where $G = F, F[z], F(z)$. It is simple to observe that*

$$\text{row}_F(\phi(z)') \subset \text{row}_{F[z]}(\phi(z)') \subset \text{row}_{F(z)}(\phi(z)'). \tag{A2}$$

**Remark A8** (Role of the characteristics of a canonical system of root functions)**.** *The proof of Proposition A2 reveals that, in order to conclude that an $F[z]$- or $F(z)$-linear combination of $\phi(z)'$ is a generalized root function, the property that $\phi(z_\omega)'$ is of full row rank plays a crucial role. In fact, when reaching the equality $w(z)' \phi(z)' = (1 - z/z_\omega)^q \zeta(z)'$, where $q$ is the exponent of the linear combination, one can show that $\zeta(z_\omega) \neq 0'$ by making use of this property only, without using the maximality of the orders of the root functions in $\phi(z)'$. This proves the following corollary.*

**Corollary A1** (Linear combinations of a set of root functions)**.** *Replace the canonical system of root functions φ*(*z*) *in Proposition A2 with a set of m root functions ξ*(*z*) *for C*(*z*) *on D*(*zω*, *η*)*, η* > 0 *such that ξ*(*zω*) *is of full row rank; then the F*[*z*]*- or F*(*z*)*-linear combinations w*(*z*) *ξ*(*z*) *and u*(*z*) *ξ*(*z*) *are generalized root functions with the same exponents as in Proposition A2.*

#### *Appendix A.3. Truncations of Root Functions*

This section discusses how the truncation of a root function still delivers a root function, possibly of lower order. The main implication of this property is that one can take any element in $\text{row}_G(\phi(z)')$ for $G = F, F[z], F(z)$ and obtain other root functions by truncation, thus enlarging the set of root functions that can be generated from $\text{row}_G(\phi(z)')$.

Let $\zeta(z)' := \sum_{j=0}^{\infty} (z - z_\omega)^j \zeta_j'$ be a $1 \times p$ root function of order $s$ of $C(z)$ on $D(z_\omega, \eta)$, $\eta > 0$, and indicate the truncation of $\zeta(z)'$ to a polynomial of degree $r$ as $\zeta_{(r)}(z)' := \sum_{j=0}^{r} (z - z_\omega)^j \zeta_j'$; the remainder $\zeta(z)' - \zeta_{(r)}(z)' = \sum_{j=r+1}^{\infty} (z - z_\omega)^j \zeta_j'$ is called the tail of $\zeta(z)'$. The following proposition clarifies that one can modify the tail of a root function without affecting its property of factoring some power of $(1 - z/z_\omega)$ from $C(z)$. One special case is that one can delete the tail after the order $s$ of the root function without changing its order.

**Proposition A3** (Truncations)**.** *Let $\zeta(z)' := \sum_{j=0}^{\infty} (z - z_\omega)^j \zeta_j'$ be a root function of order $s$ for $C(z) = \sum_{j=0}^{\infty} (z - z_\omega)^j C_j$ on $D(z_\omega, \eta)$, $\eta > 0$, $\zeta(z_\omega) \neq 0'$, and let $\psi(z)' := \sum_{j=0}^{\infty} (z - z_\omega)^j \psi_j'$ be a $1 \times p$ vector function, analytic on $D(z_\omega, \eta)$; then*

*(i) for an integer $\ell \geq 1$, the $1 \times p$ row vector $\xi(z)'$ with*

$$\xi(z)' := \zeta(z)' + (z - z_\omega)^{\ell} \psi(z)' \tag{A3}$$

*is still a root function on $D(z_\omega, \eta)$ of order $n \geq \min(\ell, s)$;*

*(ii) for an integer $\ell \geq 1$, the truncation $\zeta_{(\ell)}(z)'$ is still a root function on $D(z_\omega, \eta)$ of order $n \geq \min(\ell, s)$;*

*(iii) if $\zeta(z)'$ is of maximal order $s$, as for the root functions in a canonical system of root functions, then the truncation $\zeta_{(s-1)}(z)'$ is a root function of order equal to $s$.*

**Proof.** (*i*). By definition one has $\zeta(z)' C(z) = (z - z_\omega)^s \widetilde{\zeta}(z)'$ with $\widetilde{\zeta}(z_\omega)' = \sum_{h=0}^{s} \zeta_h' C_{s-h} \neq 0'$. Hence, setting $q := \min(\ell, s)$, one finds

$$
\xi(z)' C(z) = \zeta(z)' C(z) + (z - z_\omega)^{\ell} \psi(z)' C(z) = (z - z_\omega)^q \widetilde{\xi}(z)'
$$

where $\widetilde{\xi}(z)' = (z - z_\omega)^{s-q} \widetilde{\zeta}(z)' + (z - z_\omega)^{\ell-q} \psi(z)' C(z)$. If $\widetilde{\xi}(z_\omega) \neq 0'$, then $\xi(z)'$ is a root function of order $q$. If, instead, $\widetilde{\xi}(z_\omega) = 0'$, then $\xi(z)'$ is a root function of order $n$ greater than $q$; in any case $n \geq q$, with $n$ finite by Proposition 2.

(*ii*). Choose $\psi(z)' = (z - z_\omega)^{-\ell} (\zeta_{(\ell)}(z)' - \zeta(z)') = -\sum_{j=\ell+1}^{\infty} (z - z_\omega)^{j-\ell} \zeta_j'$ in (A3), which is analytic on $D(z_\omega, \eta)$, so that $\xi(z)' = \zeta_{(\ell)}(z)'$. The statement then follows from (*i*).

(*iii*). The functions $\zeta(z)'$ and $\zeta_{(s-1)}(z)'$ generate the same coefficients $W_h' := \sum_{j=0}^{h} \zeta_j' C_{h-j}$ for $h = 0, \dots, s-1$ in the convolution $\zeta(z)' C(z) = W(z)' := \sum_{h=0}^{\infty} (z - z_\omega)^h W_h'$, where $W_h' = 0'$ for $h = 0, \dots, s-1$ by the definition of order $s$, see (8). This implies that $\zeta_{(s-1)}(z)'$ is a root function of order at least $s$. However, because root functions in a canonical system of root functions are chosen of maximal order, the order of $\zeta_{(s-1)}(z)'$ is equal to $s$. This completes the proof.
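A symbolic sketch of case (*iii*) (hypothetical $2 \times 2$ matrix $C(z)$ with $z_\omega = 1$): $\zeta(z)' = (1, -1) + (z - 1)^2 \psi'$ is a root function of order $s = 1$, and its truncation to degree $s - 1 = 0$ factors the same power of $z - 1$ out of $\zeta(z)' C(z)$:

```python
import sympy as sp

z = sp.symbols('z')
z_w = sp.Integer(1)
C = sp.Matrix([[1, 1],
               [1, z]])      # hypothetical C(z), singular at z_w = 1

zeta = sp.Matrix([[1 + 3*(z - 1)**2, -1 + 5*(z - 1)**2]])  # root function with a tail
zeta_trunc = sp.Matrix([[1, -1]])                          # truncation to degree s - 1 = 0

def order_at_root(row_fn):
    # minimal multiplicity of z_w across the nonzero entries of row_fn(z)' C(z)
    prod = row_fn * C
    return min(sp.roots(sp.expand(e), z).get(z_w, 0)
               for e in prod if sp.expand(e) != 0)

s_full = order_at_root(zeta)         # order of the original root function
s_trunc = order_at_root(zeta_trunc)  # order of its truncation
```

Both orders equal 1, so the truncation preserves the order, in line with Proposition A3.(*iii*).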

**Remark A9** (Truncated cointegrating vectors)**.** *Proposition A3.*(*ii*) *implies that truncating a cointegrating vector to degree $\ell < s$ preserves the cointegrating property, but not necessarily the order $s$.*

**Remark A10** (Cointegrating vectors in *Iω*(1) VARs can be chosen not to be polynomial)**.** *Consider Example 1, where the orders of integration of (polynomial) linear combinations can be either 1 or 0. In this case, root functions are of order at most $s = 1$, and Proposition A3.*(*iii*) *ensures that the root functions can be truncated to degree $s - 1 = 0$, i.e., to non-polynomial linear combinations.*

**Remark A11** (A generic *Iω*(1) process may have polynomial cointegration relations)**.** *Consider now the generic case of an I(1) process. The orders of integration of (polynomial) linear combinations can be $0, -1, -2, \dots, -d$, say, with $d > 0$. In this case, root functions can be of order $s = 1, 2, \dots, d + 1$, and Proposition A3.*(*iii*) *ensures that the root functions can be truncated to degree $d$. If $d > 0$ this may require polynomial linear combinations also in the $I_\omega(1)$ case.*

**Remark A12** (Polynomial cointegrating vectors in *Iω*(2) VARs can be chosen of degree at most 1)**.** *Consider Example 2, where the orders of integration of (polynomial) linear combinations can be either 2, 1 or 0. In this case, root functions are of order at most $s = 2$, and Proposition A3.*(*iii*) *ensures that the root functions can be truncated to degree $s - 1 = 1$, i.e., to polynomial linear combinations of degree 1.*

**Remark A13** (Multicointegrated systems require polynomial cointegration relations)**.** *As shown in the previous three remarks, multicointegrated systems in general require the consideration of polynomial linear combinations.*

#### **Notes**


<sup>9</sup> The present statement follows from Franchi and Paruolo (2019, Theorem 3.5) with $F(z)$ and $\Phi(z)$ there equal to $A(z)$ and $\Xi(z)^{-1}$ here.

#### **References**

Barbeau, Edward J. 1989. *Polynomials*. Berlin and Heidelberg: Springer.


Franchi, Massimo, and Paolo Paruolo. 2020. Cointegration in functional autoregressive processes. *Econometric Theory* 36: 803–39.

