1. Introduction
In econometric studies of time series composed of financial indicators, heteroskedasticity is usually observed in the residuals. This reflects the fact that some time periods are riskier than others, i.e., the random errors fluctuate more strongly in some periods than in others. This heteroskedasticity of the series and the clustering of large fluctuations in certain periods is referred to in the literature as volatility clustering. The parameter estimates obtained with the least squares method remain unbiased; however, the standard errors of the parameters are underestimated and the confidence intervals are too narrow, which in turn leads to erroneous decisions about the significance of the parameters.
Financial econometric models are based on different assumptions about the correlation relationships inherent in the time series under analysis, formed by the levels of certain financial indicators (prices, incomes, income growth, etc.). One group of models assumes that price increments are random processes similar in their properties to “white noise”. This assumption is at the heart of the so-called “random walk hypotheses” (RWH). Three versions of these hypotheses are described in the scientific literature, which differ from each other in the content they give to the concept of “white noise” (Tikhomirov and Dorokhina 2002, p. 347).
The versions of the RWH differ from each other in the presence of certain regularities in the series obtained by functional transformations of these increments (random errors) and in the presence of heteroskedasticity in the residuals. For RWH-1 and RWH-2, the complete absence of correlations between the squares of the random errors (and also of their third, fourth, etc. powers) is assumed. In RWH-2, uncorrelated changes in the variance are allowed, while under RWH-3, correlations are allowed both between the second, third and higher powers of the random errors and between series obtained as products of the random errors.
Price changes resulting from extraordinary events do not reflect the impact of a number of other more significant events. The information that is continuously released to the market influences, to a considerable extent, the quotation levels and can lead to drastic fluctuations within one business day. The conditional variance of the process can be defined as a random variable whose values at any point in time depend on a number of other variables reflecting the complex market situation. The process reflecting standard deviations in the price level of financial assets is characterised by a much wider range of patterns. Two main hypotheses regarding the variance volatility associated with RWH-3 are considered in the literature. According to the first hypothesis, the values of conditional variances can be viewed as a conditional standard deviation representing a function of previous values. According to the second hypothesis, the conditional variance is independent of the price level, but it is assumed that the variable can be represented as an autoregressive model with chained means, i.e., whereby the value of the indicator at time t is determined by the previous values and the random element.
The object of this study is to analyse the dynamics of the SOFIX prices. SOFIX is the main stock index that tracks the “blue chips” of the Bulgarian Stock Exchange. Since the blue chips are indicators for the whole market, a study of SOFIX can, in practice, substantiate conclusions concerning the Bulgarian economy as a whole. Blue chip stocks are the most liquid stocks on the securities market, and their main positive quality is associated with their liquidity, i.e., with the possibility to sell or buy a significant volume of these stocks at any moment of the trading session on the stock exchange without a significant loss in price.
Three stages can be distinguished in the dynamics of the SOFIX index since its inception on 20.10.2000 (Minchev 2024a, 2024b). The first stage covered the first seven years; it was very intense, and the index increased about 20 times. The second stage covered the period from 2007 to 2009, during which, as a result of the impact of the World Crisis, there was a sharp decline, and in February 2009 SOFIX dropped to 256 points, a level last seen in mid-2003. The third stage has been the longest and continues to the present day. After the crisis, the trend has been smooth and upward. The interesting thing about SOFIX is that the impact of the COVID-19 pandemic was not that significant: the index lost a little over 100 points of its value, but the recovery occurred very quickly, and at the end of the period it stood at 822 points, which practically corresponds to the level of 20.08.2005.
Predictions about future SOFIX levels and, in particular, about the probability of the index reaching its peak values of 2007 have so far been made in the Bulgarian literature only by financial analysts, while forecasts with scientifically based argumentation are missing.
Minchev (2024a, 2024b), in his analyses, makes a simple extrapolation forecast of its future quotes. Assuming that the third stage (the lost quinquennium for the Bulgarian Stock Exchange) is already over, and expecting an annual increase 50% higher than its average of the last twenty years, that is, up to 15% growth per year, he predicts that the SOFIX index will reach its peak of 2000 points in 6.9 years, or in 2031. This forecast assumes that the index will move into the next phase of long-term growth rather than relying on an analysis of the time series with appropriate econometric methods and models. In previous studies on the dynamics of SOFIX levels, the application of GARCH methods led to different conclusions.
For example, Petkov (2010) finds that GARCH(1,1) is not appropriate for describing SOFIX, while Tsenkov (2011) confirms the validity of the EGARCH model.
Petrică et al. (2017) have found EGARCH and PARCH models to be inappropriate. The authors argue that when studying emerging stock markets, it is more appropriate to use symmetric GARCH models. The results of the analysis obtained by Ugurlu et al. (2014) show that GARCH, GJR-GARCH and EGARCH effects do not exist in the Bulgarian Stock Exchange (SOFIX). Most researchers rely on the symmetric GARCH model or various asymmetric models from the GARCH family. This is the main reason why various other kinds of ARMA-GARCH models are used in this study to model SOFIX prices.
In that context, the paper addresses one main research hypothesis (RH): we test whether various ARMA-GARCH models are appropriate in the analysis of SOFIX prices. In order to verify the main hypothesis, empirical research was conducted based on the dataset of the SOFIX index. This study includes several novelties in comparison to other papers focused on similar topics. On the one hand, the data were collected on a daily basis over the period from 20.10.2000 (when the index was introduced) to 28.03.2024. Here, a much longer period is covered, containing 5780 valid observations. On the other hand, new variants with asymmetric GARCH models are also tested. A total of 240 variations of ARMA-GARCH models are evaluated. Finally, with the help of 5000 simulations, the time horizon in which the SOFIX index will reach its peak of 1976.73, realised on 08.10.2007, is predicted.
Considering that as a result of the impact of the World Crisis of 2008 and the COVID-19 pandemic of 2019, there were turbulences in the financial markets and a huge decline in the stock market not only in Bulgaria but also globally, the purpose of this paper is to model the daily returns of the oldest index of the Bulgarian Stock Exchange, SOFIX, since its launch using different variants of ARMA-GARCH models. Adapting the step-by-step guide developed by Perlin et al. (2021), based on 5000 simulations, it is predicted when SOFIX will reach the historical peak again and, accordingly, the expected value of the time it will take the market to recover from the current crisis. The extent to which the impact of the 2008 World Crisis and the impact of the COVID-19 pandemic are manifested is also assessed. The main differences between the algorithm applied in Perlin et al. (2021) and the one applied here are observed in the following directions. First, more variants of the GARCH models are tested in this study. In addition to GARCH, EGARCH, and GJR-GARCH, used in the original work, IGARCH and Component GARCH are also tested in this study. Second, instead of only normal and Student’s t-distributions of the residuals, six types of distributions are applied here—normal, Student’s t, GED and their skewed forms. Third, in computing the simulations, larger horizons are used, both in terms of the period covered in the past and in terms of the forecast horizon. In order to capture the potential impact of the mentioned crises, all information from the creation of the index to the present moment, i.e., 25 years, is used in the forecasting exercise. The forecast horizon is also much wider and, in this case, covers a period of 75 years. Fourth, two stages are added in the execution of the algorithm. First, the time series constructed from the logarithms of the daily SOFIX returns are subjected to a stationarity test, and second, in the selection of the best model, the results obtained after dividing the period into two sub-periods, training and testing, are considered.
2. Literature Review
The modelling of stock indices traded on the Bulgarian Stock Exchange by GARCH models has been the subject of research in a number of publications by Bulgarian and foreign authors (Gerunov 2023; Milinov and Kanaryan 2006; Patev et al. 2009).
Petkov (2010) tested the validity of the GARCH(1,1) model with respect to the daily returns of the four stock indices of the Bulgarian Stock Exchange—SOFIX, BG40, BGTR30 and BGREIT. The results of the empirical analysis show that the validity of the standard GARCH(1,1) model is confirmed only with respect to the returns of the BGREIT index. Mainly under the influence of new information, a high degree of persistence of volatility is found, i.e., the impact of shocks on quotes is manifested for a significant period of time after their occurrence. For the BG40 and BGTR30 indices, it is found that GARCH models postulating asymmetry of the series would be more appropriate, while for the SOFIX index, the data confirm the presence of only the ARCH effect.
Paskaleva and Stoykova (2021) investigate the impact of the world stock markets and the main indices traded on them (France—CAC 40, Germany—DAX, United Kingdom—FTSE 100, Belgium—BEL-20, Romania—BET, Greece—ATHEX20, Portugal—PSI-20, Ireland—ISEQ-20, Spain—IBEX35 and the USA—DJIA) on the Bulgarian stock market and the SOFIX index, respectively, as well as their performance for the period from 03.03.2003 to 30.06.2016. The research period is divided into three sub-periods, pre-crisis (from 03.03.2003 to 29.12.2006), crisis (from 02.01.2007 to 28.12.2012) and post-crisis (from 03.01.2013 to 30.06.2016), and the DCC-GARCH and TGARCH models are tested. The results of the empirical analysis show that the stock markets in the USA and Germany have the most significant impact on the Bulgarian stock market, and this is particularly pronounced during the global financial crisis. It is also found that the stock markets of the United Kingdom, Greece, Ireland, Portugal, Romania and Bulgaria are inefficient, while the German stock exchange could be defined as efficient.
Petrova and Todorov (2023) estimate and forecast the volatility of the net asset value of 42 Bulgarian investment funds based on daily data for the period 13.07.2020 to 13.07.2023. For this purpose, the authors fit GARCH, EGARCH, TGARCH and GARCH-M models with specification (1,1). The results of the empirical study show that the investment fund with the highest concentration of risk is the Golden Lion Index 30, a finding obtained from the GARCH, EGARCH and GARCH-M models. In fitting the different models, it was also found that the GARCH and EGARCH models successfully optimise the regression parameters of the final equation for all investment funds analysed, resulting in adequate predictions. GARCH-M and TGARCH models were found to be inapplicable for some of the investment funds due to the zero value of the parameters in the regression equations.
Tsenkov and Stoitsova-Stoykova (2017) estimate the market efficiency of eleven stock markets in Southeastern Europe (SEE)—Bulgaria, Croatia, Greece, Serbia, Slovenia, Turkey, Romania, Montenegro, Macedonia, Banja Luka and Sarajevo (Bosnia and Herzegovina) by applying GARCH, EGARCH, TGARCH and PGARCH models. The study covers the period from 01.01.2004 to 04.11.2015, focusing on the importance of the 2008 global financial crisis for the efficiency of the markets. The results of the empirical study show that eight of the eleven markets analysed were market-inefficient according to the Efficient Market Hypothesis (EMH) throughout the study period. From the pre-crisis to the crisis period, five of the analysed indices decreased their market efficiency. The number of indices with relatively high market efficiency was highest in the post-crisis period compared to the previous periods. Therefore, according to the authors of the study, stock markets in SEE countries are not homogeneous in the context of the Efficient Market Hypothesis.
The results in (Tsenkov 2011) show the leading influence of the DJIA relative to the dynamics of the other indices, which was especially pronounced in the period of the global financial crisis. It is also found that the EMH is rejected with respect to the Bulgarian capital market, and that developed capital markets exert a clear and directional influence on it. As a result of econometric modelling of the returns and volatility of the studied indices with EGARCH models, it is found that negative information has a more immediate and more pronounced effect on the values of the SOFIX index. Differences are also found in how the DAX and SOFIX reflect market information and the determining influence of the DJIA, with 47.28% of the volatility of the Bulgarian index explained by the impact of the US index.
Petrică et al. (2017) investigate the volatility of the stock indices of the London Stock Exchange (FTSE) and the Bulgarian Stock Exchange (SOFIX), i.e., a developed stock market and an emerging stock market, for the period from 04.01.2010 to 27.09.2016 using the asymmetric GARCH models EGARCH and PARCH. The results of the empirical study show that the FTSE index is characterised by a leverage effect, and the most appropriate model to account for the conditional variance is the PARCH(1,1) model, while for the SOFIX index, the EGARCH and PARCH models are found to be inappropriate. Given this, the authors believe that when studying emerging stock markets, which include the Bulgarian Stock Exchange, it is more appropriate to use symmetric GARCH models.
Ugurlu et al. (2014) model the volatility of stock market returns of the following stock market indices: the Bulgarian Stock Exchange (SOFIX), Prague Stock Exchange Index (PX), Warsaw Stock Exchange (WIG), Budapest Stock Index (BUX) and Istanbul Stock Exchange National 100 Index (XU100). The methodology of the econometric analysis includes GARCH, GJR-GARCH and EGARCH models, and the period of the study is from 08.01.2001 to 20.07.2012. The results of the analysis show that pronounced GARCH effects exist in all stock markets except the Bulgarian stock market and the SOFIX index. Therefore, the authors recommend testing GARCH models of different orders for Bulgaria in subsequent studies. For the other four markets, the authors find that shocks are characterised by persistence, and the impact of old news on volatility is significant, with the Warsaw Stock Exchange having the longest memory of the variance. The empirical results also show that bad news increases volatility and leverage effects in market returns. The authors recommend that multivariate dynamic models should be tested in the future, especially when examining the daily returns of emerging stock markets.
Arneric and Scrabic-Peric (2018) investigate the presence of weekday effects on major stock indices for the following 10 emerging European stock markets—Romania (BETI), Hungary (BUX), Croatia (CROBEX), Latvia (OMXRGI), Estonia (OMXTGI), the Czech Republic (PX), Slovenia (SBITOP), Bulgaria (SOFIX), Poland (WIG20) and Slovakia (SAX) for the period from 04.01.2007 to 13.05.2015. The results of the econometric analysis prove the significance of the common Monday effects in the mean and variance equations, while the Tuesday effect is significant only in the mean equation. Volatility persistence is also found in the observed emerging stock markets. Based on the results obtained, the authors believe that stock markets in Hungary, Romania, Poland, Slovakia and the Czech Republic are expected to be riskier in the future, while a low level of unconditional volatility characterises those in Estonia, Latvia, Slovenia, Croatia and Bulgaria. In the longer term, high positive cross-correlation is also expected between stock markets in Poland and the Czech Republic, Poland and Hungary, and Poland and Croatia, and negative, but close to zero, between Slovakia and the other countries.
In recent years, a significant number of authors have focused in their academic publications on the impact of the COVID-19 pandemic on the world economy and, in particular, on the development of financial markets.
Khan et al. (2023) examine the volatility of financial markets during the COVID-19 pandemic. The econometric analysis is based on daily data for the period from 27.11.2018 to 15.06.2021, and the models fitted are GARCH(1,1), GJR-GARCH(1,1) and EGARCH(1,1). The empirical results show high volatility persistence in all financial markets during the COVID-19 pandemic. EGARCH is derived as the most appropriate model to identify the volatility of financial markets before the COVID-19 pandemic, while during the pandemic, as well as for the entire study period, all three models are appropriate to model the financial markets included in the study.
Ouchen (2023) investigates the possibility of forecasting the daily returns of the major global stock indices (SSE, S&P 500, FTSE 100, DAX, CAC 40 and Nikkei 225) using GARCH and EGARCH models during two separate periods—before COVID-19 (from 01.06.2019 to 30.11.2019) and post-pandemic (from 31.12.2019 to 01.06.2020). The results of the study show that GARCH and EGARCH models are suitable for modelling conditional variance, whose values indicate an increase in stock market volatility during the post-COVID-19 pandemic period compared to the pre-pandemic period. This instability was most pronounced in early January 2020 for the Chinese stock market and in March 2020 for the other five stock markets. During both periods examined, the Frankfurt Stock Exchange was characterised by the highest volatility.
Ncube et al. (2024) investigate the impact of the COVID-19 pandemic on stock market volatility in sub-Saharan Africa, and more specifically on two large and two small stock exchanges in the region, the Johannesburg Stock Exchange (JSE) and Nigeria Stock Exchange (NGX) and the Zimbabwe Stock Exchange (ZSE) and Lusaka Stock Exchange (LUSE), respectively. The analysis is based on daily stock data, COVID-19 indicators and macroeconomic indicators for the period January 2019 to July 2022. GARCH volatility estimation models and Explainable Artificial Intelligence (XAI) in the form of SHapley Additive exPlanations (SHAP) are fitted to identify significant factors that influenced stock volatility during the pandemic. The empirical results find significant increases in volatility at the onset of the pandemic, with government measures leading to more pronounced effects in larger markets, while population vaccination programs had an impact on reducing volatility. Increased stock market volatility has also been found to result mainly from government measures to contain the spread of the pandemic rather than from its initial manifestation. The authors argue that to reduce the negative effects of the pandemic, governments needed to implement policies to limit the spread of the pandemic’s effects on economic stability and, for investors, policies to diversify investments.
Setiawan et al. (2021) examine financial market volatility resulting from the COVID-19 pandemic and its impact on returns in two types of economies, emerging (Indonesia) and developed (Hungary), using a GARCH model with specification (1,1). The empirical study covers the period from 27.09.2006 to 31.08.2021. The results show that the COVID-19 pandemic had a negative impact on stock market returns in both developing and developed economies. Stock market volatility is also found to have been more pronounced during the COVID-19 pandemic than during the global financial crisis. Higher negative stock market returns and volatility during the COVID-19 pandemic caused a contraction in economic activity, which has implications for supply and demand shocks. The authors find that advanced economies were able to more easily implement policies to limit the impact of the pandemic compared to countries with more limited financial resources. Fiscal and monetary policies are highlighted as a means of stabilising the economy and maintaining investor confidence in the two stock markets studied.
Tabash et al. (2024) compare stock returns and volatility in developed (Italy—FTSE Italia All Share, Canada—S&P/TSX, France—CAC 40, Germany—DAX, Japan—Nikkei 225, UK—FTSE 100 and the USA—Nasdaq 100) and emerging (Brazil—BOVESPA, China-SSE Composite Index, India—S&P BSE Sensex, Indonesia—IDX Composite, Mexico—S&P/BMV IPC, Russia—MOEX 10 and Turkey—BIST 100) stock markets between the 2008 financial crisis and the COVID-19 pandemic. The methodology of the study includes GARCH, EGARCH and TGARCH models, and the empirical results show that emerging and developed markets reacted differently to these two crises—while emerging markets reacted similarly in both crisis periods, developed markets responded differently. They were significantly more volatile and sensitive to the effects of the global pandemic of 2019 than to the financial crisis of 2008.
In addition to the application of conventional GARCH models in the study of stock index dynamics, machine learning (ML)-based approaches, as well as hybrid combinations of the two methods, have recently been found to be applicable. A detailed review of the scientific literature on the application of machine learning approaches is presented by Sezer et al. (2020) and Ge et al. (2022).
Chhajer et al. (2022) highlight artificial neural networks, support vector machines (SVMs) and long short-term memory (LSTM) as the most widely used machine learning approaches for stock market forecasting.
A significant advantage of ML, and LSTM models in particular, is the systematic ability to account for potential nonlinearity at different time periods. However, these approaches are also characterised by certain drawbacks.
Zhao et al. (2024) systematise them in three directions. First, the complexity of the models leads to difficult interpretation of the results, which reduces the confidence in their reliability. Second, the models perform very well on the training set, but because of more specific characteristics of the real data, such as nonstationarity, they are not applicable in the general case. Third, well-established neural network architectures are applicable to general problems and are not intentionally constructed to capture specific financial features, such as volatility. GARCH models, on the other hand, are characterised by simplicity, clear interpretation, and satisfactory forecasting performance. In light of this, current scientific research is focused on developing hybrid models that combine the advantages of both approaches. For instance, studying the S&P 500 index in the period from 03.01.2000 to 21.12.2023, Roszyk and Ślepaczuk (2024) found that the hybrid LSTM-GARCH model has better predictive abilities than both the classical GARCH model and the LSTM model. Moreover, the predictive ability of the hybrid LSTM-GARCH model improves when the VIX index (The Chicago Board Options Exchange’s Volatility Index), reflecting market sentiment, is included in the model.
Modelling, analysing and, consequently, forecasting the returns of various financial assets, including cryptocurrencies, is the basis for the development of various algorithmic investment strategies. For instance, Mustapa and Ismail (2019) chose the ARIMA(2,1,2)-GARCH(1,1) model estimated for the period 2001–2017 as the most suitable for forecasting the monthly values of the S&P 500 index for 2018. Identical results were also reached by Sun (2017), investigating the daily returns of various indices, including the S&P 500 index, for the period from 01.12.2006 to 01.12.2016.
Vo and Ślepaczuk (2022) enhance these results by comparing the forecasting performance of the daily returns of the S&P 500 index obtained from ARIMA-GARCH models with other models over the period 01.01.2000–31.12.2019. They find that in the long run, the hybrid ARIMA-GARCH models outperform the standard ARIMA models and the Buy-and-Hold strategy.
Bilyk et al. (2020) tested the forecasting abilities of different GARCH model variants on daily VIX data for the period 02.01.2013 to 03.10.2019. The results show that the best specifications are fGARCH-TGARCH and GJR-GARCH, and they are comparable to the results of the Buy-and-Hold S&P 500 strategy.
Regarding the development of algorithmic investment strategies based on ML approaches, Grudniewicz and Ślepaczuk (2023) investigated selected stock indices, including SOFIX, in the period between 01.01.2002 and 31.03.2023. The results show that ML approaches are more suitable compared to passive strategies in terms of risk-adjusted returns. The Linear Support Vector Machine and Bayesian Generalised Linear Model are identified as the best models. On the other hand, Ślepaczuk and Zenkova (2018), testing the capabilities of the SVM algorithm to create an investment strategy for cryptocurrency trading based on daily data for the period from 01.01.2015 to 01.08.2018, failed to propose a strategy that is better than the equally weighted portfolio strategy. Comparing the performance of the LSTM model and classical techniques on daily data for the S&P 500 index for the period 01.01.2000 to 02.05.2020, Kijewski and Ślepaczuk (2020) found potential for creating successful investment strategies as a result of combining ML techniques and classical techniques.
As an outcome of the review of the empirical research on the problem considered in the present study, it was found that symmetric and asymmetric GARCH models, machine learning methods, and hybrid combinations of them are applied in modelling stock indices. On the one hand, standard models are characterised by a straightforward interpretation, but on the other hand, they are also subject to drawbacks, some of which are overcome by ML approaches. This provides a background for scientific research to focus on hybrid models that combine the advantages of classical and ML methods. The correct specification of the different models is a prerequisite for the creation of algorithmic strategies for investing in financial assets.
The global financial crisis and the COVID-19 pandemic have had a significant impact on the stability of stock markets, but it has been characterised by different degrees of intensity in developed and emerging markets. Emerging markets’ instability was similarly demonstrated during both crisis periods, while developed markets’ instability was more pronounced during the COVID-19 pandemic than during the global financial crisis. Regarding the dependence of Bulgarian stock indices on global ones, it has been found that the stock markets in the USA and Germany have the greatest influence on the Bulgarian stock market. This dependence was most pronounced during the global financial crisis.
3. Materials and Methods
Models describing correlations between the squares of the deviations and the average price level have been proposed by Engle (1982) and are known in the literature as ARCH models (Autoregressive Conditional Heteroskedasticity models). They are used to predict volatility (variation) in time series and are widely used to assess financial risk by modelling the volatility inherent in financial markets. Periods of high volatility tend to be followed by further periods of high volatility, and, conversely, periods of low volatility tend to be followed by further periods of low volatility. This means that volatility manifests itself in the form of clusters, a property that is useful to investors when deciding to purchase relevant financial assets over a period of time. ARCH models are based on the idea that the standard deviation in the time series at the present time is affected by the forecast error in past periods. The higher the value of the error, the higher the value of the standard deviation, and vice versa.
ARCH models have been generalised in the literature in different ways. A useful variant from a practical point of view is the one proposed by Bollerslev (1986), called the GARCH model (Generalised Autoregressive Conditional Heteroskedasticity model). It measures, models and predicts the volatility (expressed by the standard deviation) in the series, which determines its application in risk management, in the selection of financial assets for the investment portfolio, or in secondary market pricing. The term “autoregressive conditional heteroskedasticity” means that, in addition to the mean, the model specifies a conditional variance that depends on prior periods.
The general form of a GARCH model of order (p, q) characterises the dependence of the current variance (i.e., in period t) on past changes in the indicators under study and on past variance estimates (“old news”):

h_t = ω + Σ_{i=1}^{q} α_i·ε²_{t−i} + Σ_{j=1}^{p} β_j·h_{t−j}        (1)

where p is the order of the GARCH effect of the conditional variance h_t, q is the order of the ARCH effect of the random errors ε_t, and ω is a constant term.
The GARCH(1,1) model can be represented by the following system of equations:

r_t = μ + φ_1·r_{t−1} + ε_t        (2)
h_t = ω + α_1·ε²_{t−1} + β_1·h_{t−1}        (3)

where μ is the mean of the return r_t, and ε_t is the random error (shock term), considered as an innovation process with a mean equal to zero and a conditional variance σ²_t (also denoted h_t) that depends on the information about returns available up to period t − 1. The shock term is calculated as follows:

ε_t = z_t·√h_t        (4)

where h_t is the conditional volatility and z_t is a white noise error with a mean equal to zero and unit variance, z_t ~ N(0, 1).
Equation (2) is the returns equation, known as the conditional mean equation, which specifies the return dynamics. Here, it includes the mean, an autoregressive (AR) component (reflecting the impact of the previous day’s returns) and the variance estimate, but in practice, it may also include a moving average (MA) component. In this case, we can say that the model is ARMA(1,0)-GARCH(1,1). The decomposition of daily returns into expected conditional mean returns and innovations, which are represented as a standardised “white noise” process with time-varying conditional variance, is very convenient and feasible. On the one hand, all the necessary information is used, and the model is correctly specified. On the other hand, the volatility is known or predetermined at time t − 1.
Equation (3) specifies the volatility (conditional variance), which includes the ARCH factor (the squared residuals of the previous day’s returns, i.e., the most recent information) and the GARCH factor (the autoregressive component characterising the old news). The coefficients α and β must satisfy the stationarity conditions, namely α ≥ 0, β ≥ 0 and α + β ≤ 1. The sum of the parameters α and β reflects the degree of stationarity preservation and is known as persistence.
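To make Equations (2)–(4) concrete, the following minimal Python sketch (our illustration, not code from the original estimation; a constant-mean version with arbitrary parameter values) filters a return series through the GARCH(1,1) variance recursion.

```python
import numpy as np

def garch11_variance(returns, mu, omega, alpha, beta):
    """Filter a return series through the GARCH(1,1) recursion
    h_t = omega + alpha * eps_{t-1}^2 + beta * h_{t-1}."""
    eps = returns - mu                       # shocks around the mean
    h = np.empty_like(eps)
    h[0] = eps.var()                         # initialise with the sample variance
    for t in range(1, len(eps)):
        h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
    return h

rng = np.random.default_rng(0)
r = rng.normal(0.0, 0.01, size=1000)         # stand-in for daily log returns
h = garch11_variance(r, mu=0.0, omega=1e-6, alpha=0.10, beta=0.85)
print("persistence:", 0.10 + 0.85)           # alpha + beta
```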
One of the main limitations of ARCH and GARCH models is their symmetry. They are influenced by the absolute values rather than the signs of the shocks, i.e., negative and positive changes of the same magnitude have an identical influence on the future values of the conditional variance. In asymmetric models, bad and good news have different impacts on future volatility. A comprehensive review of the different types of asymmetric GARCH models is given by Andersen et al. (2006). In the present study, only those models that have been validated in the following parts of the publication are considered.
Nelson (1991) proposed an exponential GARCH or EGARCH model that represents the asymmetric effect as a function of the standardised innovations. The logarithm of the conditional variance h_t is an asymmetric function of the lagged random errors ε_{t−i}:

ln h_t = ω + Σ_{i=1}^{q} [α_i·z_{t−i} + γ_i·(|z_{t−i}| − E|z_{t−i}|)] + Σ_{j=1}^{p} β_j·ln h_{t−j}        (5)

where α_i reflects the influence of the sign of the standardised innovations, γ_i reflects the effect of their absolute size, and E|z_t| is the expected value of the absolute standardised innovations z_t, which is calculated as follows:

E|z_t| = ∫ |z|·f(z; 0, 1, …) dz        (6)

The estimate of the persistence of the conditional variance is calculated as follows:

P̂ = Σ_{j=1}^{p} β_j        (7)
The main characteristics of the EGARCH model can be stated as follows (
Ghalanos 2022):
The function g(zt) is linear in zt with coefficient θ + 1 when zt is positive, and g(zt) is linear in zt with coefficient θ − 1 when zt is negative.
When θ = 0, large innovations lead to an increase in conditional variance when |zt| − E|zt | > 0 and to a decrease in conditional variance when |zt| − E|zt | < 0.
For θ < 1, the innovation g(zt) in the variance is positive when the innovation zt is less than . Hence, negative innovations in returns εt lead to positive innovations in conditional variance when θ takes values much smaller than 1.
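The asymmetry described above can be illustrated with a short sketch (ours, with arbitrary parameter values) that evaluates the EGARCH news-impact term α·z + γ·(|z| − E|z|) for positive and negative standardised shocks of equal size, assuming normally distributed innovations so that E|z| = √(2/π).

```python
import numpy as np

def egarch_news_impact(z, alpha, gamma):
    """News-impact term of the EGARCH log-variance equation."""
    e_abs_z = np.sqrt(2.0 / np.pi)           # E|z| under the standard normal distribution
    return alpha * z + gamma * (np.abs(z) - e_abs_z)

# With a negative alpha, a negative shock raises log-variance more than a positive shock of equal size
for z in (-2.0, 2.0):
    print(z, egarch_news_impact(z, alpha=-0.10, gamma=0.15))
```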
The condition Σ_{i=1}^{q} α_i + Σ_{j=1}^{p} β_j < 1 in the standard GARCH model implies that the process is weakly stationary since the mean, variance and autocovariance take finite and constant values over time. However, this condition is often not satisfied when there is autocorrelation in the residuals. As a result, for a large number of financial indicators, the estimate of the persistence of the conditional variance, α̂ + β̂, turns out to be close to unity. In such cases, the so-called integrated GARCH model, abbreviated IGARCH, proposed by Engle and Bollerslev (1986), is used. In particular, the IGARCH(1,1) process can be written as follows:

h_t = ω + α_1·ε²_{t−1} + (1 − α_1)·h_{t−1},  with α_1 + β_1 = 1        (8)
Glosten et al. (1993) propose a model (GJR-GARCH) in which positive and negative influences on the conditional variance are represented using the indicator function I:

h_t = ω + Σ_{i=1}^{q} (α_i·ε²_{t−i} + γ_i·I_{t−i}·ε²_{t−i}) + Σ_{j=1}^{p} β_j·h_{t−j}        (9)

where γ_i represents the “leverage” effect. The indicator function I_{t−i} takes the value 1 when ε_{t−i} ≤ 0 and the value 0 when ε_{t−i} > 0. The persistence of the conditional variance depends on the type of asymmetry of the conditional distribution used. It is calculated as follows:

P̂ = Σ_{i=1}^{q} α_i + Σ_{j=1}^{p} β_j + k·Σ_{i=1}^{q} γ_i        (10)

where k is the expected proportion of standardised residuals z_t that are negative:

k = E[I_t] = ∫_{−∞}^{0} f(z; 0, 1, …) dz        (11)

By f we denote the standardised conditional density, with parameters reflecting the skewness and shape of the distribution. For a symmetric distribution, k takes the value 0.5.
Engle and Lee (1999) propose a model (Component GARCH or CGARCH) in which the conditional variance is represented as the sum of two components, a permanent and a transitory one, which capture the long- and short-term effects of shocks, respectively. If q_t denotes the permanent component, i.e., the time-varying process representing the long-term component of the conditional variance, and s_t = h_t − q_t denotes the transitory component of the conditional variance, the CGARCH(1,1) model can be written as follows (Liu and Shi 2022):

q_t = ω + ρ·q_{t−1} + φ·(ε²_{t−1} − h_{t−1})        (12)
h_t = q_t + α·(ε²_{t−1} − q_{t−1}) + β·(h_{t−1} − q_{t−1})        (13)

The sum α + β measures the autoregressive persistence of the transitory component, and ρ measures the autoregressive persistence of the permanent component. The immediate impact of volatility shocks (ε²_{t−1} − h_{t−1}) on the short-run component is reflected by α and that on the long-run component by φ. The constraint (α + β) < ρ is imposed to distinguish between the two components: the persistence of q_t should be stronger than that of s_t. In order to guarantee that the immediate impact of volatility shocks on the long-run component is smaller than that on the short-run component, we let α ≥ φ. The CGARCH(1,1) process is stationary if 0 < (α + β) < ρ < 1, 0 < φ < β, 0 < φ < α and ω > 0.
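A minimal sketch of the two-component recursion in Equations (12) and (13) is given below (our illustration; the parameter values are chosen only to respect the stationarity constraints and are of the same order as the estimates reported in Section 4.6).

```python
import numpy as np

def cgarch11_filter(eps, omega, alpha, beta, rho, phi):
    """Permanent (q) and total (h) conditional variances of a CGARCH(1,1)."""
    T = len(eps)
    h = np.empty(T)
    q = np.empty(T)
    h[0] = q[0] = eps.var()
    for t in range(1, T):
        shock = eps[t - 1] ** 2 - h[t - 1]                    # volatility shock
        q[t] = omega + rho * q[t - 1] + phi * shock           # long-run component
        h[t] = q[t] + alpha * (eps[t - 1] ** 2 - q[t - 1]) \
                    + beta * (h[t - 1] - q[t - 1])            # adds the transitory part
    return h, q

rng = np.random.default_rng(1)
eps = rng.normal(0.0, 0.01, size=500)
h, q = cgarch11_filter(eps, omega=1e-7, alpha=0.24, beta=0.66, rho=0.998, phi=0.06)
```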
Testing the validity of GARCH models is reduced to verifying the existence of their characteristic roots. For this purpose, the LR test (LR = T·R²) is used, which follows a χ² distribution with degrees of freedom (φ) equal to the sum of the lagged variables used in the models for ε_{t−i} and h_{t−j}. T is the number of observations in the time series, and R² is the coefficient of determination corresponding to the applied conditional variance model (h_t). If the empirical value of the LR test is greater than the critical value of χ²(φ), the model is assumed to be adequate. The parameter estimates are computed using the maximum likelihood method by maximising the function (written here for normally distributed errors):

LL = Π_{t=1}^{T} (1 / √(2π·h_t)) · exp(−ε²_t / (2·h_t))        (14)

Taking the natural logarithm of LL, we obtain the log-likelihood function:

ln LL = −(1/2) · Σ_{t=1}^{T} [ln(2π) + ln h_t + ε²_t / h_t]        (15)
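As an illustration of how Equation (15) is used in estimation, the sketch below (ours; a deliberately simplified constant-mean GARCH(1,1) estimator, not the routine used in the paper) evaluates the Gaussian log-likelihood and maximises it numerically.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, r):
    """Negative Gaussian log-likelihood of a constant-mean GARCH(1,1)."""
    mu, omega, alpha, beta = params
    eps = r - mu
    h = np.empty_like(eps)
    h[0] = eps.var()                         # initialise the variance recursion
    for t in range(1, len(eps)):
        h[t] = omega + alpha * eps[t - 1] ** 2 + beta * h[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(h) + eps ** 2 / h)

rng = np.random.default_rng(2)
r = rng.normal(0.0, 0.01, size=2000)         # stand-in for daily log returns
res = minimize(neg_loglik, x0=[0.0, 1e-6, 0.05, 0.90], args=(r,),
               method="L-BFGS-B",
               bounds=[(None, None), (1e-12, None), (0.0, 1.0), (0.0, 1.0)])
mu_hat, omega_hat, alpha_hat, beta_hat = res.x
```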
Parameter estimates using the maximum likelihood method are obtained by maximising the specified likelihood function, which raises the question of the proper selection of the conditional distribution of ε_t. In the present study, the distributions most commonly used in the specialised literature, i.e., the normal, Student’s t-distribution, Generalised Error Distribution (GED) and their skewed variants, are applied (see Table 1).
The designations in the table are as follows:
4. Results
The empirical analysis proceeds through the following stages:
Initial data preparation;
Calculation of descriptive statistics for SOFIX;
Testing for stationarity;
Checking for ARCH effects;
Testing different variations of GARCH models;
Selection of the best ARMA-GARCH model for SOFIX analysis, including changes in the variance equation and distribution parameters;
Simulating and forecasting future SOFIX levels.
4.1. The Data
This study tests the validity of different versions of GARCH models with respect to the logarithmic daily returns of the SOFIX index traded on the Bulgarian Stock Exchange, which are calculated as follows:

r_t = ln(P_t / P_{t−1})        (16)

where P_t is the index price for day t at market close and P_{t−1} is the index price for the previous day at market close.
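In code, the transformation in Equation (16) is a one-liner; the sketch below (ours, on toy values rather than the actual SOFIX series) assumes a pandas series `close` of daily closing prices.

```python
import numpy as np
import pandas as pd

# `close` stands in for the SOFIX daily closing prices indexed by date
close = pd.Series([600.0, 605.3, 601.1, 610.8])
log_returns = np.log(close / close.shift(1)).dropna()   # r_t = ln(P_t / P_{t-1})
```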
The calculation of SOFIX started on 20.10.2000 with a base value of 100. It includes the most liquid issues of 15 Bulgarian joint-stock companies with a capitalisation of at least BGN 40 million and at least 500 shareholders. This study covers the period from the launch of the index until 28.03.2024. Data for the period from 26.11.2001 were obtained using the website https://stooq.com (accessed on 30 March 2024), while data for the first year were sourced from a previous study (Petkov 2010).
4.2. Descriptive Statistics and Figures for SOFIX
Figure 1 presents the value of the SOFIX index in Panel (a) and its corresponding log returns in Panel (b). The red circles in Panel (b) indicate the 10 largest absolute (positive or negative) price changes found in the time series of log returns. Panel (b) of Figure 1 shows that most of the return values hover around zero. Large price changes, both downward and upward, alternate within relatively short periods; this pattern of behaviour is referred to in the literature as volatility clustering. Of the ten largest absolute price changes, seven occurred shortly after the launch of the index, two (on 22.01.2008 and 19.11.2008) were recorded during the 2008–2009 financial crisis, and one at the start of the COVID-19 pandemic (10.03.2020, two days after the authorities declared a state of emergency). Only two of these changes, at the beginning of the period, have a positive sign; all the others are negative. These effects should be considered when modelling returns, which fully justifies the use of GARCH models.
Table 2 shows descriptive statistics of the SOFIX return. The most important values are the skewness, kurtosis and Jarque–Bera statistics. Positive skewness means that the distribution has a long right tail, and negative skewness implies that the distribution has a long left tail. If the kurtosis is more than 3, the distribution is peaked (leptokurtic), and if the kurtosis is less than 3, the distribution is flat (platykurtic) relative to the normal. The Jarque–Bera test is a goodness-of-fit test of whether the sample data have a normal distribution; it is asymptotically χ²-distributed with 2 degrees of freedom.
The SOFIX return has negative skewness and high positive kurtosis. These values define that its distribution has a long left tail and it is leptokurtic. Jarque–Bera (JB) statistics reject the null hypothesis of normal distribution at the 1% level of significance.
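The statistics discussed above can be reproduced from a return series with standard tools; the following sketch (ours, on simulated heavy-tailed data rather than the SOFIX sample) shows one way to do so.

```python
import numpy as np
from scipy import stats

r = 0.01 * np.random.default_rng(3).standard_t(df=4, size=5000)  # heavy-tailed toy returns
skewness = stats.skew(r)
excess_kurtosis = stats.kurtosis(r)            # 0 corresponds to a kurtosis of 3
jb_stat, jb_pvalue = stats.jarque_bera(r)      # asymptotically chi-squared with 2 df
```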
4.3. Testing for Unit Roots
The stationarity of the SOFIX index was investigated using the augmented Dickey and Fuller (Dickey and Fuller 1981) test (ADF), the Phillips and Perron (Phillips and Perron 1988) test (PP), the Elliott, Rothenberg and Stock (Elliott et al. 1996) test (ADF-GLS) and the Kwiatkowski, Phillips, Schmidt and Shin (Kwiatkowski et al. 1992) test (KPSS). These four tests differ with respect to the null and alternative hypotheses. In the ADF, PP and ADF-GLS tests, the null hypothesis is that the series has a unit root and the alternative is that it is stationary, while in the KPSS test the null and alternative hypotheses are reversed. The main difference between the Phillips and Perron (PP) test and the ADF test lies in the way in which serial correlation and heteroskedasticity in the errors are accounted for. While the ADF test uses parametric autoregression to approximate the ARMA structure of the errors in the test regression, the PP test ignores serial correlation in the test regression and corrects for it non-parametrically. The first advantage of the PP test is its robustness to general forms of heteroskedasticity in the error term. Its other main advantage is that the user does not have to specify a lag length for the test regression.
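As an illustration (ours; the paper reports results from its own software), the four tests can be run in Python with the `arch.unitroot` module, which implements ADF, PP, DF-GLS and KPSS.

```python
import numpy as np
from arch.unitroot import ADF, PhillipsPerron, DFGLS, KPSS

r = np.random.default_rng(4).normal(size=500)      # toy stationary series
# Null of ADF, PP and DFGLS: unit root; null of KPSS: stationarity
for test in (ADF(r), PhillipsPerron(r), DFGLS(r), KPSS(r)):
    print(type(test).__name__, round(test.stat, 3), round(test.pvalue, 3))
```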
Arltová and Fedorová (2016), in their study analysing the main problems in the use of unit root tests based on the statements of Pesaran (2015) and Zivot and Wang (2006), confirm that the main problem of all unit root tests is related to their dependence on the length of the time series under analysis. They point out that the other advantage of the PP test, due to asymptotic theory, is that the test is designed to test for unit roots in long time series. The main disadvantage of the ADF and PP tests is their low power when the autoregressive root approaches unity. In these cases, the probability of not rejecting the null hypothesis even when it is not true is high (Enders 1994, p. 251). Although it is argued that the KPSS test, due to the reversal of the null and alternative hypotheses, avoids the disadvantage associated with power in the ADF and PP tests (Charteris and Strydom 2011, p. 58), according to Caner and Kilian (2001), even the KPSS test suffers from a similar disadvantage. Moreover, the power of the three tests has been shown to be lower in the case where a linear deterministic trend is included in the test regression model. This is why Arltová and Fedorová (2016) recommend the ADF-GLS test and the NGP test proposed by Ng and Perron (1995), where the power problem is absent. Using simulation methods, the authors perform a comparative analysis of the appropriate tests for different lengths of time series and different values of the autoregressive parameter φ1. They find that in the case of very long time series (for T = 500), the results of all the analysed tests were very similar. The best results for φ1 < 0.9 were achieved by the NGP, ADF and PP tests, and for φ1 > 0.9 by the ADF-GLS and NGP tests.
As can be seen from the results presented in Table 3, all the tests conclude that the SOFIX series is stationary in first differences. In levels, the ADF, PP and ADF-GLS tests point to stationarity, while the KPSS test suggests that a unit root is present. Since the two tests (ADF and PP) in which the null hypothesis assumes a unit root suffer from power problems (a false null hypothesis may fail to be rejected), and in this case the situation is reversed, the log returns were used as the basis for estimating the conditional mean and variance equations. This conclusion is supported by the ADF-GLS test, for which the power problems are absent.
4.4. Testing for ARCH Effects
Testing for ARCH effects is performed using the Lagrange multiplier (LM) test (Engle 1982), which is based on a regression model describing the relationship between the residual components and their lags. The null hypothesis states that all coefficients of the lagged variables are statistically insignificant, i.e., there are no ARCH effects in the analysed time series.
Table 4 reports the results of the LM-ARCH test with a lag order of up to 5, the number of working days in the week, which is enough to capture the volatility memory in daily returns.
The p-value of the LM test for each of the lag orders tested is less than 0.01. This means that, with a 1% risk of error, the null hypothesis of no ARCH effects is rejected, i.e., ARCH effects are present in the time series of SOFIX returns.
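The LM test for ARCH effects can be reproduced with `statsmodels`, as in the sketch below (ours, shown on a placeholder return series).

```python
import numpy as np
from statsmodels.stats.diagnostic import het_arch

r = np.random.default_rng(5).normal(0.0, 0.01, size=2000)   # stand-in for SOFIX log returns
lm_stat, lm_pvalue, f_stat, f_pvalue = het_arch(r - r.mean(), nlags=5)
# A p-value below 0.01 would indicate the presence of ARCH effects at the 1% level
```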
Figure 2 exhibits the squared return series of SOFIX for the period from 23.10.2000 to 28.03.2024. The figure shows that there are periods when volatility is very high and periods when volatility is characterised by low values.
Figure 2. Daily squared log returns of the SOFIX index. Source: derived by the authors using Gretl 2023c for Windows.
Taking also into account the values of the autocorrelation function of the squared daily returns presented in Table 5 and Figure 3, we can draw a general conclusion about the presence of volatility clustering. From a statistical point of view, volatility clustering implies a strong correlation between the values of the squared returns.
The results of volatility clustering and the presence of an ARCH effect in the return series allow us to proceed to the next stage of the algorithm execution, namely analysing GARCH-type models, which involves estimating the conditional variance.
4.5. Estimating a GARCH Model
The estimation of GARCH models is performed by determining the number of lags in each part, specifying the mean and the variance equations, as well as the parameter distribution. In our case, five variants of the models are included, each of which is tested assuming the validity of six distributions—normal, Student’s t, GED and their skewed forms. A total of 30 models are evaluated, each with only a constant in the conditional mean equation (that is, with neither AR nor MA components included). The number of models tested corresponds to the combination of five variants of a pure GARCH model (i.e., not including lags in the AR and MA components of the mean equation but considering only a constant), each estimated under six different distributions. The best 10 models according to the BIC and AIC are presented in Table 6.
Analysing the values of the information criteria, we find that the top 10 models lie in the range of 0.008–0.009. Four of them are Component GARCH models estimated with the Student’s t-distribution and GED, as well as their skewed variants. The remaining six models are standard GARCH, EGARCH and IGARCH, estimated with the standard and skewed Student’s t-distributions. Focusing on the alpha1 and beta1 estimates, we can rule out the two IGARCH models because beta1 is statistically insignificant. In the standard GARCH model, the parameters are significant, but their sum is 0.999, raising reasonable doubts about whether the stationarity requirement is satisfied. In the two EGARCH models, the alpha1 parameters, in addition to being statistically insignificant, are also negative in sign, which does not satisfy the stationarity conditions, casting doubt on their validity. In all four versions of the CGARCH models, the alpha1 and beta1 parameters are statistically significant at a 1% risk of error, and their sum is less than 1. The parameters rho and phi are also statistically significant, which strengthens the conclusion about the adequacy of the CGARCH model in forecasting the volatility of SOFIX.
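A sketch of this model-comparison step using the Python `arch` package is shown below (our illustration only: the paper’s estimates come from other software, and `arch` covers just a subset of the five variants and six distributions considered here).

```python
import numpy as np
from arch import arch_model

r = np.random.default_rng(6).standard_t(df=5, size=3000)   # toy returns on a percent-like scale
results = {}
for vol in ("GARCH", "EGARCH"):                             # subset of the variants tested here
    for dist in ("normal", "t", "skewt", "ged"):
        am = arch_model(r, mean="Constant", vol=vol, p=1, q=1, dist=dist)
        res = am.fit(disp="off")
        results[(vol, dist)] = (res.bic, res.aic)

best_spec = min(results, key=lambda k: results[k][0])       # rank specifications by BIC
```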
4.6. Selection of the Best ARMA-GARCH Model
In this stage, when selecting the best model, the two components of the mean equation, AR and MA, are included in addition to the constant. Selection of the best ARMA-GARCH model for SOFIX analysis, including changes in the mean and variance equations and distribution parameters, is performed using the BIC and AIC. In the specification, one lag is chosen as the maximum order for the AR and MA components in the mean equation, while in the volatility equation the maximum lag order is two for the ARCH component and one for the GARCH component. A total of 240 models were fitted. The number of models tested corresponds to the combination of five GARCH model variants, each estimated under six different distributions, giving 30 combinations. When specifying the AR (lags 0 and 1) and MA (lags 0 and 1) components of the mean equation, as well as the ARCH (lags 1 and 2) and GARCH (lag 1) components of the variance equation, eight combinations (2 × 2 × 2 × 1) are possible, resulting in a total of 240 (30 × 8) combinations of tested models. A sample from the best model selection table that includes only the Student’s t-distribution is presented in Figure A1 of Appendix A. The best 10 models according to the BIC and AIC are presented in Table 7.
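The way the 240 specifications arise from this grid can be made explicit with a small enumeration (ours; the labels are illustrative):

```python
from itertools import product

garch_variants = ["GARCH", "EGARCH", "GJR-GARCH", "IGARCH", "CGARCH"]
distributions = ["norm", "snorm", "t", "st", "ged", "sged"]
ar_lags, ma_lags = [0, 1], [0, 1]            # mean equation
arch_lags, garch_lags = [1, 2], [1]          # variance equation

grid = list(product(garch_variants, distributions, ar_lags, ma_lags, arch_lags, garch_lags))
print(len(grid))                             # 5 * 6 * 2 * 2 * 2 * 1 = 240
```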
All models were estimated assuming the validity of a standard (seven models) or skewed (three models) Student’s t-distribution. Table 7 shows that the top five models, coloured red, are confirmed by both the BIC and AIC. For the rest, there are minor differences in the ordering. According to the AIC, the top ten models lie within a range of 0.006, while according to the BIC, the interval is 0.005. With the inclusion of ARMA components in the mean equation, seven of the ten models are CGARCH, and three are EGARCH. Among the top ten models is the best of the models estimated without the inclusion of AR and MA components, namely ARMA(0,0)-CGARCH, estimated under the assumption of a Student’s t-distribution.
Table 8 presents the parameter estimates of the models occupying the top six positions. Table A1 and Table A2 in Appendix B present the results of estimating two GARCH (EGARCH and CGARCH) models on the log returns of SOFIX. Each model has one lag in the AR and MA components of the conditional mean equation, the same variance equation (q = 2, p = 1 and q = 1, p = 1, respectively) and a different distribution of the residuals (Student’s t and skewed t).
The best six models consist of four CGARCH and two EGARCH models. Analysing the estimates of the alpha and beta parameters, we can reject the adequacy of the EGARCH models since the alpha1 and alpha2 parameters are negative and statistically insignificant. In the four CGARCH models, the constants in the mean and conditional variance equations are statistically insignificant, while the estimates of all other parameters, except the alpha2 parameters, are statistically significant at a 1% risk of error. This implies that the models that include a second lag order with respect to the ARCH component are inadequate. In the end, the SOFIX data show that ARMA(1,1)-CGARCH(1,1), estimated under the assumption of a standard or skewed Student’s t-distribution, turns out to be the most appropriate to describe its dynamics. The parameters rho and phi, those for shape and skewness, as well as those of the AR and MA components, are statistically significant at the 1% significance level. There are almost no differences between the estimates of the models with standard and skewed t-distributions of the residual component, and based on the BIC, the ARMA(1,1)-CGARCH(1,1) model estimated under the assumption of the standard Student’s t-distribution is chosen as the most appropriate one for prediction. The autoregressive persistence of the transitory component amounts to 0.904, and that of the permanent component amounts to 0.998. The immediate impact of volatility shocks (ε²_{t−1} − h_{t−1}) on the short-run component is 0.242, and that on the long-run component is equal to 0.060. The stationarity of the CGARCH(1,1) process is present because 0 < (α + β = 0.904) < ρ = 0.998 < 1, 0 < φ = 0.060 < β = 0.662, 0 < φ = 0.060 < α = 0.242 and ω does not take a negative value.
4.7. Forecasting SOFIX Index with ARMA(1,1)-CGARCH(1,1) Model
In order to confirm the correctness of the inference about the best model, the cross-validation methodology is applied. We divide the total period, which consists of 5780 observations, into two sets: training (2000–2020) and testing (2021–2024). The first sub-period covers 4980 observations and lasts from 23.10.2000 to 30.12.2020. The second period consists of 800 observations and runs from 04.01.2021 to 28.03.2024. Thus, forecasts for the test period (2021–2024), calculated based on the results of applying the tested five variants of the GARCH models to the training period (2000–2020), can be used as an alternative way to choose the best model. In this way, the forecasts obtained with the tested GARCH variants can be checked with respect to their quality, and the selection of the best model will not only be based on the information criteria but also on the forecast quality measures.
All models were fitted on the training period (2000–2020) with the inclusion of a constant and one lag for each of the AR and MA components in the mean equation, and one lag for each of the ARCH and GARCH components in the conditional variance equation. Given the shape of the distribution of the logarithms of the SOFIX returns, out of the six possible distributions only the Student’s t-distribution is used here.
Table 9 demonstrates the forecast performance of five tested GARCH variants made for the testing period (2021–2024). In terms of the error metrics MAE and MSE, the ARMA(1,1)-CGARCH(1,1) model outperforms the other specifications, again confirming the conclusion of the best model determined by the information criteria.
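The error metrics reported in Table 9 are straightforward to compute once out-of-sample forecasts for the test period are available; a sketch (ours, with placeholder arrays) is shown below.

```python
import numpy as np

# placeholder arrays: realised test-period returns and the corresponding one-step-ahead forecasts
actual = np.array([0.0012, -0.0008, 0.0005, 0.0019])
forecast = np.array([0.0009, -0.0001, 0.0004, 0.0011])

mae = np.mean(np.abs(actual - forecast))     # mean absolute error
mse = np.mean((actual - forecast) ** 2)      # mean squared error
```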
Finally, to confirm the robustness of the estimates obtained with the defined best model, we apply it to different sub-periods arising from the stages in the evolution of the index. The first period coincides with the period we have defined as training, while the second period covers the last two stages in the index dynamics, namely from 03.01.2007 to 28.03.2024. The results are presented in Table 10.
Tracking the parameter estimates for the three periods (full range, beginning period and ending period), we find that the significance of all parameters is maintained without exception, with only minor changes in their values. For example, the parameter estimates for the AR and MA components of 0.981 and −0.968 over the full range period are reduced to 0.974 and −0.955 in the beginning period and increase to 0.987 and −0.975 in the ending period, respectively. Similarly, the parameter estimates of the lag orders in the variance equation of 0.243 and 0.662 over the entire period change to 0.255 and 0.644 in the initial period and to 0.249 and 0.704 in the final period. The same conclusions can be drawn regarding the parameter estimates related to accounting for the impact of long-term shocks and short-term shocks. All these facts prove the robustness of the model and allow its use for forecasting using simulations.
The Ljung–Box tests (Ljung and Box 1978) and ARCH LM tests provide a means of testing for autocorrelation within the GARCH models’ standardised squared residuals. Ljung–Box statistics can also be used to test for serial autocorrelation among the standardised residuals. If the GARCH model has done its job, there should be no autocorrelation in the residuals. The null hypothesis of the Ljung–Box test is that there is no autocorrelation among the residuals for a given set of lags. The results show (the p-values > 0.05) that there is no autocorrelation among the squared residuals for different lags.
The ARCH LM test is applied to verify whether ARCH effects remain in the residuals; it tests the null hypothesis that the ARCH process is adequately fitted. The results show (the p-values for each of the lags are above 0.05) that the GARCH process is adequately fitted.
Engle and Ng sign bias tests (Engle and Ng 1993) provide a means of testing for mis-specification of conditional volatility models. Specifically, they examine whether the standardised squared residual is predictable using (dummy) variables indicative of certain information. The null hypothesis of these tests is that the parameters corresponding to the additional (dummy) variables are equal to 0, i.e., that there is no significant asymmetric reaction to negative and positive shocks. The alternative hypothesis is that the additional parameters are non-zero, which indicates a mis-specification of the model. The results (p-values > 0.05) show that there are no significant negative or positive sign and size biases.
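A minimal sketch of the auxiliary regression behind these tests, following the Engle and Ng (1993) formulation and reusing the residual objects from the earlier sketches; the variable names are ours, not the authors':

```python
import pandas as pd
import statsmodels.api as sm

eps = fit.resid.dropna()               # residuals from the mean equation
z2 = fit.std_resid.dropna() ** 2       # squared standardised residuals

# Dummy for a negative lagged shock and the signed size terms.
neg = (eps.shift(1) < 0).astype(float)
X = pd.DataFrame({
    "sign_bias": neg,                                # S_{t-1}^-
    "negative_size_bias": neg * eps.shift(1),        # S_{t-1}^- * eps_{t-1}
    "positive_size_bias": (1 - neg) * eps.shift(1),  # S_{t-1}^+ * eps_{t-1}
}).dropna()
y = z2.loc[X.index]

ols = sm.OLS(y, sm.add_constant(X)).fit()
print(ols.summary())  # individual sign and size bias t-tests
print(ols.f_test("sign_bias = 0, negative_size_bias = 0, positive_size_bias = 0"))  # joint test
```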
The robustness checks performed on the results allow us to use the model identified as the best GARCH model to make predictions via simulation. A number of simulations can be run to predict future levels of SOFIX returns. The simulations are generated by sequentially feeding the previous return level into the pre-specified best model and drawing random samples from the residual distribution. This approach allows us to generate time series with an arbitrary forecast horizon, with the simulated series having the same characteristics as the original model. Based on the computed simulations, the forecast probability of reaching the historical SOFIX peak is estimated. The forecast is calculated from 5000 simulations. The model used for the simulations is estimated on the whole period (2000–2024), consisting of 5780 observations.
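A sketch of how such simulation paths can be generated from the fitted illustrative model above (the forecast horizon below is a placeholder; the paper's horizon extends to the year 2100):

```python
import numpy as np

# Simulate 5000 future return paths from the fitted model and convert them
# back into index levels, starting from the last observed price.
horizon = 250 * 5   # placeholder horizon of roughly five trading years
fc = fit.forecast(horizon=horizon, method="simulation", simulations=5000, reindex=False)

sim_returns = fc.simulations.values[-1] / 100   # shape (5000, horizon); undo percent scaling
sim_prices = prices.iloc[-1] * np.exp(np.cumsum(sim_returns, axis=1))
```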
Since the purpose of the simulation is to predict the period in which the peak value can be reached, two scenarios are applied here. The first covers the entire period since the creation of the index, i.e., from 20.10.2000 to 28.03.2024; this scenario can be defined as realistic, and 28.03.2024 is chosen as the starting point for the simulations. In the second scenario, the sample begins at the start of the second stage in the development of SOFIX, namely 03.01.2007; this scenario can be described as pessimistic. A third (optimistic) scenario is not suitable, because it would require dropping the date on which the peak value was realised, whereas the goal of the simulations is precisely to predict when this peak is reached.
Figure 4 presents, in Panel (a), the simulation results under the realistic scenario using the GARCH model selected as the best fit and, in Panel (b), the probabilities of SOFIX reaching its peak (2024–2100).
First, the future values of the log returns are simulated, and these simulations are then used to calculate the predicted index values. The black line reflects the evolution of the index, while the grey area reflects the different simulations. The historical peak of SOFIX was reached on 08.10.2007, with a closing value of 1976.73. The last day of the price sample is 28.03.2024, which defines the starting point of the simulation. The simulation shows that the trend in SOFIX values is rather upward. In the long run, financial index prices tend to increase in value, and the GARCH model used is able to capture this effect. Although the simulated value of the index may decline in the short term, the probability of reaching the historical peak increases with time. The chart shows that the first simulated price to cross the peak does so very soon after 28.03.2024. For each future time point, the relative fraction of simulated paths in which the value of 1976.73 has been reached is calculated; repeating the calculation for all future dates yields a vector of probabilities.
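The probability vector described here can be obtained directly from the simulated price paths, for example as in this sketch (reusing `sim_prices` from the earlier sketch):

```python
import numpy as np

PEAK = 1976.73  # historical closing peak of SOFIX, reached on 08.10.2007

# For each path, mark whether the peak has been reached by each future day;
# averaging across paths yields the probability vector over the horizon.
reached_by_t = np.maximum.accumulate(sim_prices >= PEAK, axis=1)
prob_peak = reached_by_t.mean(axis=0)

# First horizon step at which the estimated probability exceeds 50%, if any.
hit_50 = np.argmax(prob_peak >= 0.5) if (prob_peak >= 0.5).any() else None
```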
Figure 4, Panel (b), presents the estimated probabilities that future values of SOFIX will reach the peak recorded before the global financial crisis of 2008. As might be expected, the probability increases over time. The first date on which the probability exceeds 0.1% is 08.05.2024. According to the model, the probability of the index peaking in the first two months of the forecast period is almost zero. The chance of reaching the peak value is estimated at 50% after 11.02.2034. The probability continues to rise over time, reaching 90% on 13.08.2087, more than sixty years after the last day in the price sample.
Figure 5 presents the simulation results according to the pessimistic scenario.
According to the pessimistic scenario, the probability of the index reaching the peak value is estimated at 50% only after 30.12.2045, i.e., 11 years later than in the realistic scenario. The probability increases over time but much more slowly: the chance of reaching the peak value of 1976.73 is estimated at only 70% as of 26.07.2087. In other words, under the pessimistic scenario a given probability of reaching the peak value of the index is expected to be attained at least 30 years later than under the realistic scenario.
5. Discussion
In the present study, the forecast is based on the ARMA(1,1)-CGARCH(1,1) model with a Student's t-distribution, identified as the best of the GARCH models applied here (standard GARCH, IGARCH, EGARCH, GJR-GARCH and CGARCH), each estimated under six distributional assumptions (Normal, Student's t, GED and their skewed forms). In estimating the forecast, it is found that the volatility of the SOFIX index can be described using two components, reflecting the short-run and long-run effects of shocks, respectively, which, according to Christoffersen et al. (2008), allows for its more accurate modelling. Long-term volatility is represented by a permanent component, and short-term volatility by a transitory component. According to the results, when predicting future values it is necessary to include both a first-order autoregressive and a first-order moving-average component in the mean equation.
The parameter α, reflecting the initial impact of shocks on the transitory component of volatility, is statistically significant at the 1% level and amounts to 0.242. Hence, the initial impact of shocks on the transitory component of SOFIX volatility is substantially positive. The parameter β, reflecting the degree of memory in the transitory component, is also statistically significant, with a value of 0.662. The sum of the two parameters equals 0.904, which means that the transitory (short-run) component of volatility decays toward its mean of zero at a geometric rate of 0.904.
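As an illustrative reading of this persistence (our own calculation, not reported in the results tables), the implied half-life of a shock to the transitory component follows from the geometric decay rate α + β:

\[
t_{1/2} = \frac{\ln 0.5}{\ln(\alpha + \beta)} = \frac{\ln 0.5}{\ln 0.904} \approx 6.9 \ \text{trading days}.
\]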
In the equation describing the long-run volatility, all parameters are significant except the constant ω. As expected, the parameter ρ is close to 1, in this case 0.998. This implies that the long-run volatility reverts to its mean very slowly, with the effect of historical shocks persisting over a long period of time. Since the parameter ρ is larger than the sum of the parameters α and β, we can conclude that persistence in volatility is greater in the long run. The parameter φ equals 0.060 and reflects the influence of the time-varying constant component.
From the results obtained, we can confirm for SOFIX the conclusion drawn by Charteris and Strydom (2011, p. 60) that the CGARCH specification provides a more appropriate description of volatility than a simple GARCH model. They employ the Component GARCH model to examine the volatility of South African and United States Treasury Bonds and Treasury Bills. In estimating the four treasury securities, an asymmetric Component GARCH was applied by incorporating the adjustment inherent in Threshold GARCH into the transitory component of the volatility equation. Asymmetry in volatility is present when the parameter γ is positive and statistically significant. Of the four assets, asymmetry is found only for the S.A. T-Bond. The authors find, as we do in this study, that long-term shocks have a more persistent impact on volatility than short-term shocks and, furthermore, that negative shocks of the same magnitude to South African Treasury Bonds have a more significant impact on volatility than positive shocks.
Naik and Padhi (2014), in their study, apply GARCH, EGARCH, GJR-GARCH and asymmetric CGARCH models to estimate the volatility of the S&P CNX Nifty index, one of the benchmark indices of the Indian equity markets. The GARCH(1,1), GJR-GARCH(1,1) and asymmetric CGARCH(1,1) models are estimated under a Gaussian (normal) distribution, while the EGARCH(1,1) model is estimated under a Generalised Error Distribution (GED). Based on the AIC and BIC, EGARCH(1,1) is determined to be the best of the asymmetric models. The results show that significant asymmetry in volatility is inherent in the index under study, confirming the leverage effect hypothesis.
In contrast to our study, Naik and Padhi (2014) do not focus on the components included in the conditional mean equation and only present results concerning the conditional volatility equation. Furthermore, they do not analyse the models under different random error distributions, relying only on the normal distribution and the Generalised Error Distribution (GED). Finally, only GARCH models of order (1,1) were tested; the possibility of other orders was not examined, as it was in our case.
However, as with any simulation, there is no guarantee that our forecast is the best answer or that the calculated probabilities are realistic, especially over such a long horizon of sixty years. It should be noted that the GARCH model is only a limited representation of the financial sustainability indicator, and no model can perfectly reproduce the real situation. Moreover, only some of the volatility modelling methods are covered in this study: not all asymmetric GARCH specifications are considered, and alternative approaches such as Bayesian estimation methods or multivariate GARCH models are not examined. Other methods may yield different results. These methods and models will be used in future research by the authors to model SOFIX dynamics. Future work will also apply sensitivity analysis to study how the share price of each company included in the SOFIX index influences the level of the index.