**1. Introduction**

High-frequency data are observations recorded at short time intervals. They are important for studying the microstructure of financial markets, and their use has become feasible thanks to increases in computational power and data storage.

Perhaps the most popular model for estimating the volatility of a financial time series is the GARCH(1,1) model; see Engle (1982) and Bollerslev (1986):

$$\begin{aligned} r_t &= \sigma_t \varepsilon_t, & \varepsilon_t &\sim \text{i.i.d.}\left(0, 1\right) \\ \sigma_t^2 &= \alpha_0 + \alpha_1 r_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \end{aligned} \tag{1}$$

with $\alpha_0 > 0$, $\alpha_1 \geq 0$, $\beta_1 \geq 0$ and $\alpha_1 + \beta_1 < 1$.
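To make the recursion in (1) concrete, the following is a minimal simulation sketch. It is only illustrative: the function name `simulate_garch11`, the Gaussian choice for the innovations, and the initialization of the variance at $\alpha_0 / (1 - \alpha_1 - \beta_1)$ are assumptions made here, not part of the model definition.

```python
import math
import random


def simulate_garch11(n, alpha0, alpha1, beta1, seed=0):
    """Simulate n >= 1 observations of r_t = sigma_t * eps_t under model (1).

    Assumptions of this sketch: eps_t is standard Gaussian, and the first
    variance is set to alpha0 / (1 - alpha1 - beta1).
    """
    if not (alpha0 > 0 and alpha1 >= 0 and beta1 >= 0 and alpha1 + beta1 < 1):
        raise ValueError("parameters violate the constraints stated for model (1)")
    rng = random.Random(seed)
    sigma2 = [alpha0 / (1.0 - alpha1 - beta1)]  # assumed starting variance
    r = [math.sqrt(sigma2[0]) * rng.gauss(0.0, 1.0)]
    for t in range(1, n):
        # sigma_t^2 = alpha0 + alpha1 * r_{t-1}^2 + beta1 * sigma_{t-1}^2
        s2 = alpha0 + alpha1 * r[t - 1] ** 2 + beta1 * sigma2[t - 1]
        sigma2.append(s2)
        r.append(math.sqrt(s2) * rng.gauss(0.0, 1.0))
    return r, sigma2


# Illustrative parameter values only (not estimates from any data set).
returns, variances = simulate_garch11(1000, alpha0=0.05, alpha1=0.10, beta1=0.85)
```

The parameter check mirrors the constraints listed above; any other innovation distribution with zero mean and unit variance could be substituted for the Gaussian draw.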

When GARCH models are applied to high-frequency data, they must be modified to reflect the microstructure of the financial market. In particular, they should capture the heterogeneity that arises when many traders operate in the same market with different time horizons.

The HARCH($n$) model was introduced by Müller et al. (1997) to address this problem. It incorporates the heterogeneous characteristics of high-frequency financial time series and is given by

$$\begin{aligned} r_t &= \sigma_t \varepsilon_t, \\ \sigma_t^2 &= c_0 + \sum_{j=1}^n c_j \left(\sum_{i=1}^j r_{t-i}\right)^2 \end{aligned} \tag{2}$$

where $c_0 > 0$, $c_n > 0$, $c_j \geq 0$ for all $j = 1, \dots, n-1$, and the $\varepsilon_t$ are independent and identically distributed (i.i.d.) random variables with zero expectation and unit variance.
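The variance equation in (2) squares the aggregated return over each horizon $j = 1, \dots, n$. A small sketch of that computation follows; the helper name `harch_variance` and the list layout (most recent return first) are hypothetical choices made for illustration.

```python
def harch_variance(past_r, c0, c):
    """Conditional variance sigma_t^2 of a HARCH(n) model, as in (2).

    past_r[i-1] holds r_{t-i} (most recent first); c[j-1] holds c_j for
    j = 1..n. This layout is an assumption of the sketch.
    """
    if len(past_r) < len(c):
        raise ValueError("need at least n past returns")
    sigma2 = c0
    partial = 0.0
    for cj, r_lag in zip(c, past_r):
        partial += r_lag              # r_{t-1} + ... + r_{t-j}
        sigma2 += cj * partial ** 2   # c_j * (aggregated return)^2
    return sigma2


# Illustrative values: r_{t-1} = 0.1, r_{t-2} = -0.2, r_{t-3} = 0.3.
example = harch_variance([0.1, -0.2, 0.3], c0=0.5, c=[0.2, 0.1, 0.1])
```

In this example the aggregated returns are 0.1, -0.1 and 0.2, so the variance is 0.5 + 0.2(0.01) + 0.1(0.01) + 0.1(0.04) = 0.507, up to floating-point rounding. Every horizon from 1 to $n$ carries its own coefficient, which is why the model has $n + 1$ parameters.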

However, fitting this model is computationally costly compared with GARCH models: because volatility has long memory, many aggregation terms are needed, so the number of parameters to be estimated ($n + 1$ in (2)) is usually large.

We propose a new model, the parsimonious heterogeneous autoregressive conditional heteroscedastic model, or PHARCH for short, as an extension of the HARCH model. Specifically, a PHARCH($m,p$) model with aggregations of different sizes $a_1, \dots, a_m$, where $m$ is the number of market components, is given by

$$\begin{aligned} r_t &= \sigma_t \varepsilon_t \\ \sigma_t^2 &= C_0 + C_1 \left( r_{t-1} + \dots + r_{t-a_1} \right)^2 + \dots + C_m \left( r_{t-1} + \dots + r_{t-a_m} \right)^2 \\ &\quad + b_1 \sigma_{t-1}^2 + \dots + b_p \sigma_{t-p}^2 \end{aligned} \tag{3}$$

where $\varepsilon_t \sim \text{i.i.d.}(0, 1)$, $C_0 > 0$, $C_j \geq 0$ for all $j = 1, \dots, m-1$, $C_m > 0$, and $b_j \geq 0$ for $j = 1, \dots, p$.
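As an illustration of how (3) can be evaluated and simulated, consider the sketch below. The function names, the Gaussian innovations, and the start-up values (zero pre-sample returns, pre-sample variances equal to $C_0$) are assumptions introduced here for the example; the sketch does not check any stationarity condition.

```python
import math
import random


def pharch_variance(past_r, past_sigma2, C0, C, a, b):
    """sigma_t^2 from equation (3).

    past_r[i-1] holds r_{t-i}; past_sigma2[k-1] holds sigma_{t-k}^2.
    C[j-1] and a[j-1] hold C_j and a_j; b[k-1] holds b_k.
    """
    # Each market component contributes C_j * (r_{t-1} + ... + r_{t-a_j})^2.
    arch_part = sum(Cj * sum(past_r[:aj]) ** 2 for Cj, aj in zip(C, a))
    garch_part = sum(bk * s2 for bk, s2 in zip(b, past_sigma2))
    return C0 + arch_part + garch_part


def simulate_pharch(n, C0, C, a, b, seed=0):
    """Simulate n observations of a PHARCH(m, p) process (Gaussian eps assumed)."""
    rng = random.Random(seed)
    p = len(b)
    past_r = [0.0] * max(a)  # assumption: zero pre-sample returns
    past_s2 = [C0] * p       # assumption: pre-sample variances equal to C0
    r, sigma2 = [], []
    for _ in range(n):
        s2 = pharch_variance(past_r, past_s2, C0, C, a, b)
        rt = math.sqrt(s2) * rng.gauss(0.0, 1.0)
        r.append(rt)
        sigma2.append(s2)
        past_r = [rt] + past_r[:-1]        # shift the return history
        if p > 0:
            past_s2 = [s2] + past_s2[:-1]  # shift the variance history
    return r, sigma2


# Illustrative PHARCH(2, 1) with aggregation sizes a = (1, 5).
returns, variances = simulate_pharch(500, C0=0.1, C=[0.1, 0.02], a=[1, 5], b=[0.3])
```

Only the $m$ chosen aggregation sizes carry coefficients, so (3) has $m + p + 1$ parameters, whereas a HARCH model covering the same longest horizon $a_m$ would have $a_m + 1$.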

HARCH models are important because they take into account the natural behavior of traders in the market. However, because financial time series have long memory, these models need to include many aggregations, so the number of parameters to estimate is large. The parsimonious HARCH model includes only the most important aggregations in its structure, which makes the model more realistic. Figure 1 shows some simulations, in which the characteristics of clustering and volatility are better represented by PHARCH processes than by ARCH or HARCH processes.

**Figure 1.** Simulations of ARCH, HARCH and PHARCH processes.

The organization of the paper is as follows. In Section 2 we provide some background information on Markov chains and give the necessary and sufficient conditions for the PHARCH model to be stationary. In Section 3 we obtain forecasts for the proposed model, and in Section 4 we introduce the data that will be used for illustrative purposes. The actual application is given in Section 5, and we close the paper with some conclusions in Section 6.
