Econometric Analysis of SOFIX Index with GARCH Models

Petkov, Plamen; Shopova, Margarita; Varbanov, Tihomir; Ovchinnikov, Evgeni; Lalev, Angelin

doi:10.3390/jrfm17080346

Open AccessArticle

Econometric Analysis of SOFIX Index with GARCH Models

by

Plamen Petkov

^1,*

,

Margarita Shopova

¹

,

Tihomir Varbanov

¹,

Evgeni Ovchinnikov

¹

and

Angelin Lalev

²

¹

Department of Statistics and Applied Mathematics, Tsenov Academy of Economics, 5250 Svishtov, Bulgaria

²

Department of Business Informatics, Tsenov Academy of Economics, 5250 Svishtov, Bulgaria

^*

Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2024, 17(8), 346; https://doi.org/10.3390/jrfm17080346

Submission received: 26 June 2024 / Revised: 4 August 2024 / Accepted: 5 August 2024 / Published: 10 August 2024

(This article belongs to the Section Economics and Finance)

Download

Browse Figures

Review Reports Versions Notes

Abstract

This paper investigates five different Auto Regressive Moving Average (ARMA) and Generalized Auto Regressive Condition-al Heteroscedacity (GARCH models (GARCH, exponential GARCH or EGARCH, integrated GARCH or IGARCH, Component GARCH or CGARCH and the Glosten-Jagannathan-Runkle GARCH or GJR-GARCH) along with six distributions (normal, Student’s t, GED and their skewed forms), which are used to estimate the price dynamics of the Bulgarian stock index SOFIX. We use the best model to predict how much time it will take, after the latest crisis, for the SOFIX index to reach its historical peak once again. The empirical data cover the period between the years 2000 and 2024, including the 2008 financial crisis and the COVID-19 pandemic. The purpose is to answer which of the five models is the best at analysing the SOFIX price and which distribution is most appropriate. The results, based on the BIC and AIC, show that the ARMA(1,1)-CGARCH(1,1) specification with the Student’s t-distribution is preferred for modelling. From the results obtained, we can confirm that the CGARCH model specification supports a more appropriate description of SOFIX volatility than a simple GARCH model. We find that long-term shocks have a more persistent impact on volatility than the effect of short-term shocks. Furthermore, for the same magnitude, negative shocks to SOFIX prices have a more significant impact on volatility than positive shocks. According to the results, when predicting future values of SOFIX, it is necessary to include both a first-order autoregressive component and a first-order moving average in the mean equation. With the help of 5000 simulations, it is estimated that the chances of SOFIX reaching its historical peak value of 1976.73 (08.10.2007) are higher than 90% at 13.08.2087.

Keywords:

SOFIX; modelling; GARCH; EGARCH; IGARCH; Component GARCH; GJR-GARCH; world crisis; COVID-19 pandemic

1. Introduction

In econometric studies of time series composed of financial indicators, heteroskedasticity is usually observed in the residuals. This is determined by the fact that some time periods are riskier than others, i.e., fluctuations in terms of random error in one period are more significant compared to fluctuations in another period. This heteroskedasticity of the series and clustering of large fluctuations in certain periods is referred to as volatility clustering in the literature. The parameter estimates obtained with the least squares method are unbiased. However, the standard errors of the parameters are underestimated, and the confidence intervals are in a narrower range, which in turn leads to erroneous decisions about the significance of the parameters.

Financial econometric models are based on different assumptions about the correlation relationships inherent in the time series under analysis, formed by the levels of certain financial indicators (prices, incomes, income growth, etc.). One part of the models assumes that price increases are equivalent to random processes similar in properties to “white noise”. This assumption is at the heart of the so-called “random walk hypotheses” (RWH). Three versions of these hypotheses are described in the scientific literature, which differ from each other in their content of the concept of “white noise” (Tikhomirov and Dorokhina 2002, p. 347).

The versions of the RWH differ from each other in the presence of certain regularities in the series with functional transformations of these increments (random errors) and in the presence of heteroskedasticity in the residuals. For RWH-1 and RWH-2, the complete absence of correlations between the squares of the random errors (and also of their third, fourth, etc. powers) is assumed. In RWH-2, uncorrelated variance changes are assumed, and with RWH-3, the existence of correlated relationships is confirmed both between the second, third, and higher degrees of the random errors and with respect to the series obtained as the product of the random errors.

Price changes resulting from extraordinary events do not reflect the impact of a number of other more significant events. The information that is continuously released to the market influences, to a considerable extent, the quotation levels and can lead to drastic fluctuations within one business day. The conditional variance of the process can be defined as a random variable whose values at any point in time depend on a number of other variables reflecting the complex market situation. The process reflecting standard deviations in the price level of financial assets is characterised by a much wider range of patterns. Two main hypotheses regarding the variance volatility associated with RWH-3 are considered in the literature. According to the first hypothesis, the values of conditional variances can be viewed as a conditional standard deviation representing a function of previous values. According to the second hypothesis, the conditional variance is independent of the price level, but it is assumed that the variable can be represented as an autoregressive model with chained means, i.e., whereby the value of the indicator at time t is determined by the previous values and the random element.

The object of this study is to analyse the dynamics of the SOFIX prices. SOFIX is the main stock index that tracks the “blue chips” of the Bulgarian Stock Exchange. Since the blue chips are indicators for the whole market, a study of SOFIX can, in practice, substantiate conclusions concerning the Bulgarian economy as a whole. Blue chip stocks are the most liquid stocks on the securities market, and their main positive quality is associated with their liquidity, i.e., with the possibility to sell or buy a significant volume of these stocks at any moment of the trading session on the stock exchange without a significant loss in price.

Three stages can be distinguished in the dynamics of the SOFIX index since its inception on 20.10.2000 (Minchev 2024a, 2024b). The first stage was connected with the first seven years. It was very intense, and the index increased about 20 times. The second stage covered the period from 2007 to 2009, and during it, as a result of the impact of the World Crisis, there was a sharp decline, and levels of SOFIX dropped in February 2009 to 256, as they were in mid-2003. The third stage has been the longest and has continues until now. After the crisis, the trend has been smooth and upward. The interesting thing about SOFIX is that the impact of the COVID-19 pandemic was not that significant. The index lost a little over 100 units of its value, but the recovery occurred very quickly and at the end of the period amounted to 822 units, which practically reached the levels of 20.08.2005.

Predictions about the future SOFIX levels and, more in particular, about the probability of them reaching their peak values of 2007 in the Bulgarian literature have been made only by financial analysts, while those with scientifically based argumentation are missing. Minchev (2024a, 2024b) in his analyses, makes a simple extrapolation forecast of its future quotes. Assuming that the third stage (the lost quinquennium for the Bulgarian Stock Exchange) is already over, and expecting an annual increase of 50% compared to its average of the last twenty years—that is, up to 15% growth per year—he predicts that the SOFlX index will reach its peak of 2000 points in 6.9 years, or in 2031. This forecast assumes that the index will move into the next phase of long-term growth rather than relying on an analysis of time series with appropriate econometric methods and models. In previous studies on the dynamics of SOFIX levels, the application of GARCH methods led to different conclusions.

For example, Petkov (2010) finds that GARCH(1,1) is not appropriate for describing SOFIX, while Tsenkov 2011 finds the validity of the EGARCH model. Petrică et al. (2017) have found EGARCH and PARCH models to be inappropriate. The authors argue that when studying emerging stock markets, it is more appropriate to use symmetric GARCH models. The results of the analysis obtained by Ugurlu et al. (2014) show that GARCH, GJR-GARCH and EGARCH effects do not exist in the Bulgarian Stock Exchange (SOFIX). Most researchers rely on the symmetric GARCH model or various asymmetric models from the GARCH family. This is the main reason why various other kinds of ARMA-GARCH models are used in this study to model SOFIX prices.

In that context, the paper addresses one main research hypothesis (RH): we test whether various ARMA-GARCH models are appropriate in the analysis of SOFIX prices. In order to verify the main hypothesis, empirical research was conducted based on the dataset of the SOFIX index. This study includes several novelties in comparison to other papers focused on similar topics. On the one hand, the data were collected on a daily basis over the period from 20.10.2000 (when the index was introduced) to 28.03.2024. Here, a much longer period is covered, containing 5780 valid observations. On the other hand, new variants with asymmetric GARCH models are also tested. A total of 240 variations of ARMA-GARCH models are evaluated. Finally, with the help of 5000 simulations, the time horizon in which the SOFIX index will reach its peak of 1976.73, realised on 08.10.2007, is predicted.

Considering that as a result of the impact of the World Crisis of 2008 and the COVID-19 pandemic of 2019, there were turbulences in the financial markets and a huge decline in the stock market not only in Bulgaria but also globally, the purpose of this paper is to model the daily returns of the oldest index of the Bulgarian Stock Exchange, SOFIX, since its launch using different variants of ARMA-GARCH models. Adapting the step-by-step guide developed by Perlin et al. (2021), based on 5000 simulations, it is predicted when SOFIX will reach the historical peak again and, accordingly, the expected value of the time it will take the market to recover from the current crisis. The extent to which the impact of the 2008 World Crisis and the impact of the COVID-19 pandemic are manifested is also assessed. The main differences regarding the algorithm applied in Perlin et al. (2021) and here are observed in the following directions. First, more variants of the GARCH models are tested in this study. In addition to GARCH, EGARCH, and GJR-GARCH, used in the original work, IGARCH and Component GARCH are also approbated in this study. Second, instead of only normal and Student’s t-distributions of the residuals, six types of distributions are applied here—normal, Student’s t, GED and their skewed forms. Third, in computing simulations, larger horizons are used, both in terms of covering the period in the past and in terms of the forecast horizon. In order to capture the potential impact of the mentioned crises, all information from the creation of the index to the present moment, i.e., 25 years, is used in the forecasting exercise. The forecast horizon is also much wider and, in this case, covers a period of 75 years. Fourth, two stages are added in the execution of the algorithm. First, the time series constructed from the logarithms of the daily SOFIX returns are subjected to a stationarity test, and second, in the selection of the best model, the results obtained after dividing the period into two sub-periods, training and testing, are considered.

2. Literature Review

The modelling of stock indices traded on the Bulgarian Stock Exchange by GARCH models has been the subject of research in a number of publications by Bulgarian and foreign authors (Gerunov 2023; Milinov and Kanaryan 2006; Patev et al. 2009). Petkov (2010) tested the validity of the GARCH(1,1) model with respect to the daily returns of the four stock indices of the Bulgarian Stock Exchange—SOFIX, BG40, BGTR30 and BGREIT. The results of the empirical analysis show that the validity of the standard GARCH(1,1) model is confirmed only with respect to the returns of the BGREIT index. Mainly under the influence of new information, a high degree of persistence of volatility is found, i.e., the impact of shocks on quotes is manifested for a significant period of time after their occurrence. For the BG40 and BGTR30 indices, it is found that GARCH models postulating asymmetry of the series would be more appropriate, while for the SOFIX index, the data confirm the presence of only the ARCH effect.

Paskaleva and Stoykova (2021) investigate the impact of the world stock markets and the main indices traded on them (France—CAC 40, Germany—DAX, United Kingdom—FTSE 100, Belgium—BEL-20, Romania—BET, Greece—ATHEX20, Portugal—PSI-20, Ireland—ISEQ-20, Spain—IBEX35 and the USA—DJIA) on the Bulgarian stock market and the SOFIX index, respectively, as well as their performance for the period from 03.03.2003 to 30.06.2016. The research period is divided into three sub-periods, pre-crisis (from 03.03.2003 to 29.12.2006), crisis (from 02.01.2007 to 28.12.2012) and post-crisis (from 03.01.2013 to 30.06.2016), and the DCC-GARCH and TGARCH models are tested. The results of the empirical analysis show that the stock markets in the USA and Germany have the most significant impact on the Bulgarian stock market, and this is particularly pronounced during the global financial crisis. It is also found that the stock markets of the United Kingdom, Greece, Ireland, Portugal, Romania and Bulgaria are inefficient, while the German stock exchange could be defined as efficient.

Petrova and Todorov (2023) estimate and forecast the volatility of the net asset value of 42 Bulgarian investment funds based on daily data for the period 13.07.2020 to 13.07.2023. For this purpose, the authors fit GARCH, EGARCH, TGARCH and GARCH-M models with specification(1,1). The results of the empirical study show that the investment fund with the highest concentration of risk is the Golden Lion Index 30, which was found from the GARCH, EGARCH and GARCH-M models. In fitting the different models, it was also found that the GARCH and EGARCH models successfully optimise the regression parameters of the final equation for all investment funds analysed, resulting in adequate predictions. GARCH-M and TGARCH models were found to be inapplicable for some of the investment funds due to the zero value of the parameters in the regression equations.

Tsenkov and Stoitsova-Stoykova (2017) estimate the market efficiency of eleven stock markets in Southeastern Europe (SEE)—Bulgaria, Croatia, Greece, Serbia, Slovenia, Turkey, Romania, Montenegro, Macedonia, Banja Luka and Sarajevo (Bosnia and Herzegovina) by applying GARCH, EGARCH, TGARCH and PGARCH models. The study covers the period from 01.01.2004 to 04.11.2015, focusing on the importance of the 2008 global financial crisis for the efficiency of the markets. The results of the empirical study show that eight of the eleven markets analysed were market-inefficient according to the Efficient Market Hypothesis (EMH) throughout the study period. From the pre-crisis to the crisis period, five of the analysed indices decreased their market efficiency. The indices with relatively high market efficiency in the post-crisis period had the highest number compared to the previous periods. Therefore, according to the authors of the study, stock markets in SEE countries are not homogeneous in the context of the Efficient Market Hypothesis.

The results in (Tsenkov 2011) show the leading influence of the DJIA relative to the dynamics of other indices, which was especially pronounced in the period of the global financial crisis. It is also found that the EMH with respect to the Bulgarian capital market is rejected, as well as the presence and direction of influence on it by developed capital markets. As a result of econometric modelling of the returns and volatility of the studied indices by EGARCH models, it is found that the negative information manifests itself more recently and significantly on the values of the SOFIX index. Differences in the reflection of market information and the determinant influence of the DJIA by the DAX and SOFIX are also found, as well as that 47.28% of the volatility of the Bulgarian index is explained by the impact of the USA index.

Petrică et al. (2017) investigate the volatility of the stock indices of the London Stock Exchange (FTSE) and the Bulgarian Stock Exchange (SOFIX), i.e., a developed stock market and an emerging stock market, for the period from 04.01.2010 to 27.09.2016 using asymmetric GARCH models, EGARCH and PARCH. The results of the empirical study show that the FTSE index is characterised by a leverage effect, and the most appropriate model to account for conditional variance is the PARCH model(1,1), while for the SOFIX index, EGARCH and PARCH models are found to be inappropriate. Given this, the authors believe that when studying emerging stock markets, which include the Bulgarian Stock Exchange, it is more appropriate to use symmetric GARCH models.

Ugurlu et al. (2014) model the volatility of stock market returns of the following stock market indices: the Bulgarian Stock Exchange (SOFIX), Prague Stock Exchange Index (PX), Warsaw Stock Exchange (WIG), Budapest Stock Index (BUX) and Istanbul Stock Exchange National 100 Index (XU100). The methodology of the econometric analysis includes GARCH, GJR-GARCH and EGARCH models, and the period of the study is from 08.01.2001 to 20.07.2012. The results of the analysis show that pronounced GARCH effects exist in all stock markets except the Bulgarian stock market and the SOFIX index. Therefore, the authors recommend testing GARCH models of different orders for Bulgaria in subsequent studies. For the other four markets, the authors find that shocks are characterised by persistence, and the impact of old news on volatility is significant, with the Warsaw Stock Exchange having the longest memory of the variance. The empirical results also show that bad news increases volatility and leverage effects in market returns. The authors recommend that multivariate dynamic models should be tested in the future, especially when examining the daily returns of emerging stock markets.

Arneric and Scrabic-Peric (2018) investigate the presence of weekday effects on major stock indices for the following 10 emerging European stock markets—Romania (BETI), Hungary (BUX), Croatia (CROBEX), Latvia (OMXRGI), Estonia (OMXTGI), the Czech Republic (PX), Slovenia (SBITOP), Bulgaria (SOFIX), Poland (WIG20) and Slovakia (SAX) for the period from 04.01.2007 to 13.05.2015. The results of the econometric analysis prove the significance of the common Monday effects in the mean and variance equations, while the Tuesday effect is significant only in the mean equation. Volatility persistence is also found in the observed emerging stock markets. Based on the results obtained, the authors believe that stock markets in Hungary, Romania, Poland, Slovakia and the Czech Republic are expected to be riskier in the future, while a low level of unconditional volatility characterises those in Estonia, Latvia, Slovenia, Croatia and Bulgaria. In the longer term, high positive cross-correlation is also expected between stock markets in Poland and the Czech Republic, Poland and Hungary, and Poland and Croatia, and negative, but close to zero, between Slovakia and the other countries.

In recent years, a significant number of authors have focused in their academic publications on the impact of the COVID-19 pandemic on the world economy and, in particular, on the development of financial markets. Khan et al. (2023) examine the volatility of financial markets during the COVID-19 pandemic. The econometric analysis is based on daily data for the period from 27.11.2018 to 15.06.2021, and the models fitted are GARCH(1,1), GJR-GARCH(1,1) and EGARCH(1,1). The empirical results show high volatility persistence in all financial markets during the COVID-19 pandemic. EGARCH is derived as the most appropriate model to identify the volatility of financial markets before the COVID-19 pandemic, while during the pandemic, as well as for the entire study period, all three models are appropriate to model the financial markets included in the study.

Ouchen (2023) investigates the possibility of forecasting the daily returns of the major global stock indices (SSE, S&P 500, FTSE 100, DAX, CAC 40 and Nikkei 225) using GARCH and EGARCH models during two separate periods—before COVID-19 (from 01.06.2019 to 30.11.2019) and post-pandemic (from 31.12.2019 to 01.06.2020). The results of the study show that GARCH and EGARCH models are suitable for modelling conditional variance, whose values indicate an increase in stock market volatility during the post-COVID-19 pandemic period compared to the pre-pandemic period. This instability was most pronounced in early January 2020 for the Chinese stock market and in March 2020 for the other five stock markets. During both periods examined, the Frankfurt Stock Exchange was characterised by the highest volatility.

Ncube et al. (2024) investigate the impact of the COVID-19 pandemic on stock market volatility in sub-Saharan Africa, and more specifically on two large and two small stock exchanges in the region, the Johannesburg Stock Exchange (JSE) and Nigeria Stock Exchange (NGX) and the Zimbabwe Stock Exchange (ZSE) and Lusaka Stock Exchange (LUSE), respectively. The analysis is based on daily stock data, COVID-19 indicators and macroeconomic indicators for the period January 2019 to July 2022. GARCH volatility estimation models and Explainable Artificial Intelligence (XAI) in the form of SHapley Additive exPlanations (SHAP) are fitted to identify significant factors that influenced stock volatility during the pandemic. The empirical results find significant increases in volatility at the onset of the pandemic, with government measures leading to more pronounced effects in larger markets, while population vaccination programs had an impact on reducing volatility. Increased stock market volatility has also been found to result mainly from government measures to contain the spread of the pandemic rather than from its initial manifestation. The authors argue that to reduce the negative effects of the pandemic, governments needed to implement policies to limit the spread of the pandemic’s effects on economic stability and, for investors, policies to diversify investments.

Setiawan et al. (2021) examine financial market volatility resulting from the COVID-19 pandemic and its impact on returns in two types of economies, emerging (Indonesia) and developed (Hungary), using a GARCH model with specification(1,1). The empirical study covers the period from 27.09.2006 to 31.08.2021. The results show that the COVID-19 pandemic had a negative impact on stock market returns in both developing and developed economies. Stock market volatility is also found to have been more pronounced during the COVID-19 pandemic than during the global financial crisis. Higher negative stock market returns and volatility during the COVID-19 pandemic caused a contraction in economic activity, which has implications for supply and demand shocks. The authors find that advanced economies were able to more easily implement policies to limit the impact of the pandemic compared to countries with more limited financial resources. Fiscal and monetary policies are highlighted as a means of stabilising the economy and maintaining investor confidence in the two stock markets studied.

Tabash et al. (2024) compare stock returns and volatility in developed (Italy—FTSE Italia All Share, Canada—S&P/TSX, France—CAC 40, Germany—DAX, Japan—Nikkei 225, UK—FTSE 100 and the USA—Nasdaq 100) and emerging (Brazil—BOVESPA, China-SSE Composite Index, India—S&P BSE Sensex, Indonesia—IDX Composite, Mexico—S&P/BMV IPC, Russia—MOEX 10 and Turkey—BIST 100) stock markets between the 2008 financial crisis and the COVID-19 pandemic. The methodology of the study includes GARCH, EGARCH and TGARCH models, and the empirical results show that emerging and developed markets reacted differently to these two crises—while emerging markets reacted similarly in both crisis periods, developed markets responded differently. They were significantly more volatile and sensitive to the effects of the global pandemic of 2019 than to the financial crisis of 2008.

In addition to the application of conventional GARCH models in the study of stock index dynamics, machine learning (ML)-based approaches, as well as hybrid combinations of the two methods, have recently been found to be applicable. A detailed review of the scientific literature on the application of machine learning approaches is presented by Sezer et al. (2020) and Ge et al. (2022). Chhajer et al. (2022) highlight artificial neural networks, support vector machines (SVMs) and long short-term memory (LSTM) as the most widely used machine learning approaches for stock market forecasting.

A significant advantage of ML, and LSTM models in particular, is the systematic ability to account for potential nonlinearity at different time periods. However, these approaches are also characterised by certain drawbacks. Zhao et al. (2024) systematise them in three directions. First, the complexity of the models leads to difficult interpretation of the results, which reduces the confidence in their reliability. Second, the models perform very well on the training set, but because of more specific characteristics of the real data, such as nonstationarity, they are not applicable in the general case. Third, well-established neural network architectures are applicable to general problems and are not intentionally constructed to capture specific financial features, such as volatility. GARCH models, on the other hand, are characterised by simplicity, clear interpretation, and satisfactory forecasting performance. In light of this, current scientific research is focused on developing hybrid models that combine the advantages of both approaches. For instance, studying the S&P 500 index in the period from 03.01.2000 to 21.12.2023, Roszyk and Ślepaczuk (2024) found that the hybrid LSTM-GARCH model has better predictive abilities than both the classical GARCH model and the LSTM model. Moreover, the predictive ability of the hybrid LSTM-GARCH model improves when the VIX index (The Chicago Board Options Exchange’s Volatility Index), reflecting market sentiment, is included in the model.

Modelling, analysing and, consequently, forecasting the returns of various financial assets, including cryptocurrencies, is the basis for the development of various algorithmic investment strategies. For instance, Mustapa and Ismail (2019) chose the ARIMA(2,1,2)-GARCH(1,1) model estimated for the period 2001–2017 as the most suitable for forecasting the monthly values of the S&P 500 index for 2018. Identical results were also reached by Sun (2017), investigating the daily returns of various indices, including the S&P 500 index for the period from 01.12.2006 to 01.12.2016. Vo and Ślepaczuk (2022) enhance these results by comparing the forecasting performance of the daily returns of the S&P 500 index obtained from ARIMA-GARCH models with other models over the period 01.01.2000–31.12.2019. They find that in the long run, the hybrid ARIMA-GARCH models outperform the standard ARIMA models and the Buy-and-Hold strategy. Bilyk et al. (2020) tested the forecasting abilities of different variant GARCH models on daily VIX data for the period 02.01.2013 to 03.10.2019. The results show that the best specifications are fGARCH-TGARCH and GJR-GARCH, and they are comparable to the results of the Buy-and-Hold S&P 500 strategy.

Regarding the development of algorithmic investment strategies based on ML approaches, Grudniewicz and Ślepaczuk (2023) investigated selected stock indices, including SOFIX, in the period between 01.01.2002 and 31.03.2023. The results show that ML approaches are more suitable compared to passive strategies in terms of risk-adjusted returns. The Linear Support Vector Machine and Bayesian Generalised Linear Model are identified as the best models. On the other hand, Ślepaczuk and Zenkova (2018), testing the capabilities of the SVM algorithm to create an investment strategy for cryptocurrency trading based on daily data for the period from 01.01.2015 to 01.08.2018, failed to propose a strategy that is better than the equally weighted portfolio strategy. Comparing the performance of the LSTM model and classical techniques on daily data for the S&P 500 index for the period 01.01.2000 to 02.05.2020, Kijewski and Ślepaczuk (2020) found potential for creating successful investment strategies as a result of combining ML techniques and classical techniques.

As an outcome of the review of the empirical research on the problem considered in the present study, it was found that symmetric and asymmetric GARCH models, machine learning methods, and hybrid combinations of them are applied in modelling stock indices. On the one hand, standard models are characterised by a straightforward interpretation, but on the other hand, they are also subject to drawbacks, some of which are overcome by ML approaches. This provides a background for scientific research to focus on hybrid models that combine the advantages of classical and ML methods. The correct specifications of the different models is a prerequisite for the creation of algorithmic strategies for investing in financial assets.

The global financial crisis and the COVID-19 pandemic have had a significant impact on the stability of stock markets, but it has been characterised by different degrees of intensity in developed and emerging markets. Emerging markets’ instability was similarly demonstrated during both crisis periods, while developed markets’ instability was more pronounced during the COVID-19 pandemic than during the global financial crisis. Regarding the dependence of Bulgarian stock indices on global ones, it has been found that the stock markets in the USA and Germany have the greatest influence on the Bulgarian stock market. This dependence was most pronounced during the global financial crisis.

3. Materials and Methods

Models describing correlations between the squares of the deviations and the average price level have been proposed by Engle (1982) and are known in the literature as ARCH models (Autoregressive Conditional Heteroskedasticity models). They are used to predict volatility (variation) in time series and are widely used to assess financial risk by modelling the volatility inherent in financial markets. Periods of high volatility are followed by periods of even higher volatility, and vice versa—periods of low volatility are followed by periods of even lower volatility. This means that volatility manifests itself in the form of clusters that are useful to investors when deciding to purchase relevant financial assets over a period of time. ARCH models are based on the idea that the standard deviation in the time series at the present time is affected by the forecast error in past periods. The higher the value of the error, the higher the value of the standard deviation and vice versa.

ARCH models in the literature have been summarised in different ways. A useful one from a practical point of view is the variant proposed by Bollerslev (1986), called the GARCH model (Generalised Autoregressive Conditional Heteroskedasticity model). It measures, models and predicts the volatility (expressed by standard deviation) in the series, which determines their application in risk management areas, in the selection of financial assets in the investment portfolio, or in secondary market pricing. The term "autoregressive conditional heteroskedasticity" means that the model specifies, in addition to the mean, the prior-period dependent variance.

The general form of a GARCH model of order (p,q) characterises the dependence of current variance variability (i.e., in period t) on past changes in the indicators under study and past variance estimates (“old news”):

h_{t} = ω + \sum_{i = 1}^{q} α_{i} ε_{t - i}^{2} + \sum_{j = 1}^{p} β_{j} h_{t - j},

(1)

where p is the order of the GARCH effect of conditional variance h_t, q is the order of the ARCH effect of random variance ε_t, and

ω

is a constant term.

The GARCH(1,1) model can be represented by the following system of equations:

r_{t} = μ + a r_{t - 1} + ε_{t};

(2)

h_{t} = ω + α ε_{t - 1}^{2} + β h_{t - 1},

(3)

where

μ is the mean of the return r_t;

ε_t—the random error (shock term), considered as an innovation process with a mean equal to zero and a conditional variance σ_t² or h_t that is dependent on the available information on returns to period t − 1. The shock term is calculated as follows:

ε_{t} = \sqrt{h_{t}} z_{t},

(4)

where h_t is conditional volatility and z_t is a white noise error with a mean equal to zero and unit variance z_t~N(0,1).

Equation (2) specifies the returns equation, known as a conditional mean equation, that specifies return dynamics. Here, it includes the mean and autoregressive (AR) component (reflecting the impact of the previous day’s returns) and the variance estimate, but in practice, it may also include a moving average (MA) component. In this case, we can say that the model is ARMA(1,0)-GARCH(1,1). The decomposition of daily returns into expected conditional mean returns and innovations, which are represented as a standardised “white noise” process with time-varying conditional variance, is very convenient and feasible. On the one hand, all the necessary information is used, and the model is correctly specified. On the other hand, the volatility is known or predetermined at time t − 1.

Equation (3) specifies the volatility (conditional variance), which includes the ARCH (squared residuals of the previous day’s returns or the most recent information) and GARCH factors (autoregressive component characterising the new information). The coefficients α and β must satisfy the stationarity conditions, namely α ≥ 0, β ≥ 0 and α + β ≤ 1. The sum of the parameters α and β reflects the degree of stationarity preservation and is known as persistence.

One of the main limitations of ARCH and GARCH models is their symmetry. They are influenced by the absolute values rather than the actual signs of the variances, i.e., negative and positive changes of the same magnitude have identical influence on the future values of the conditional variance. In asymmetric models, bad and good news have different impacts on future actions. A comprehensive review of the different types of asymmetric GARCH models is given by (Andersen et al. 2006). In the present study, only those models that have been validated in the following parts of the publication are considered.

Nelson (1991) proposed an exponential GARCH or EGARCH model that represents the asymmetric effect as a function of standardised innovations. The conditional variance h_t is an asymmetric function of the lags of the random variances ε_t−i:

\ln (h_{t}) = ω + \sum_{i = 1}^{q} α_{i} g (z_{t - i}) + \sum_{j = 1}^{p} β_{j} \ln (h_{t - j}),

(5)

where

g (z_{t}) = θ z_{t} + γ [|z_{t}| - E |z_{t}|]

(6)

z_{t} = \frac{ε_{t}}{\sqrt{h_{t}}}

(7)

α_i reflects the influence of the sign of the standard deviations;

γ_i reflects the effect of the absolute size of the standard deviations;

E|z_t| is the expected value of the absolute standardised innovations z_t, which is calculated as follows:

E |z_{t}| = \int_{- \infty}^{\infty} |z| f (z, 0, 1, \dots) dz .

(8)

The estimate of the persistence of the conditional variance is calculated as follows:

\hat{P} = \sum_{j = 1}^{p} β_{j}

(9)

The main characteristics of the EGARCH model can be stated as follows (Ghalanos 2022):

The function g(z_t) is linear in z_t with coefficient θ + 1 when z_t is positive, and g(z_t) is linear in z_t with coefficient θ − 1 when z_t is negative.
When θ = 0, large innovations lead to an increase in conditional variance when |z_t| − E|z_t | > 0 and to a decrease in conditional variance when |z_t| − E|z_t | < 0.
For θ < 1, the innovation g(z_t) in the variance is positive when the innovation z_t is less than $\frac{\sqrt{2 π}}{(θ - 1)}$ . Hence, negative innovations in returns ε_t lead to positive innovations in conditional variance when θ takes values much smaller than 1.
The condition $\hat{P} = \sum_{i = 1}^{q} α_{i} + \sum_{j = 1}^{p} β_{j} < 1$ in the standard GARCH model implies that the process is weakly stationary since the mean, variance and autocovariance take finite and constant values over time. However, this condition is often not satisfied when there is autocorrelation in the residuals. As a result, for a large number of financial indicators, the estimate of the persistence of the conditional variance $\hat{P} = \sum_{i = 1}^{q} α_{i} + \sum_{j = 1}^{p} β_{j}$ turns out to be close to unity. In such cases, the so-called integrated GARCH model, abbreviated IGARCH, proposed by Engle and Bollerslev (1986) is used. In particular, the IGARCH(1,1) process can be written as follows:

h_{t} = ω + (1 - β) ε_{t - 1}^{2} + β h_{t - 1} .

(10)

Glosten et al. (1993) propose a model (GJR-GARCH) in which positive and negative influences on conditional variance are represented using the indicator function I:

h_{t} = ω + \sum_{i = 1}^{q} (α_{i} ε_{t - i}^{2} + γ_{i} I_{t - i} ε_{t - i}^{2}) + \sum_{j = 1}^{p} β_{j} h_{t - j},

(11)

where γ_i represents the “leverage” effect. The indicator function I takes value 1 when ε ≤ 0 and value 0 when ε > 0. The persistence of the conditional variance depends on the type of asymmetry in the conditional distribution used. It is calculated as follows:

\hat{P} = \sum_{i = 1}^{q} α_{i} + \sum_{i = 1}^{q} β_{i} k + \sum_{j = 1}^{p} γ_{j},

(12)

where k is the expected number of standardised residuals z_t that are negative:

k = E [I_{t - i} z_{t - i}^{2}] = \int_{- \infty}^{0} f (z, 0, 1 \dots) d z .

(13)

By f we denote the standardised conditional density with different parameters reflecting the skewness and shape of the distribution (…). For a symmetric distribution, k takes the value 0.5.

Engle and Lee (1999) propose a model (Component GARCH or CGARCH) in which the conditional variance is represented as a set of two components (permanent and transitory) that are used to capture short- and long-term effects of shock, respectively. If q_t denotes the permanent component, i.e., the time-varying process, which represents the long-term component of the conditional variance, and s_t denotes the transitory component of the conditional variance, the CGARCH(1,1) model can be written as follows (Liu and Shi 2022):

h_{t} = q_{t} + s_{t}

(14)

q_{t} = ω + ρ q_{t - 1} + φ (ε_{t - 1}^{2} - h_{t - 1})

(15)

s_{t} = (α + β) s_{t - 1} + α (ε_{t - 1}^{2} - h_{t - 1})

(16)

The sum of α + β measures the autoregressive persistence of the transitory component, and ρ measures the autoregressive persistence of the permanent component. The immediate impacts of volatility shocks (

ε_{t - 1}^{2} - h_{t - 1}

) on the short-run component is reflected by α and that on the long-run component by φ. It is constrained as (α + β) < ρ to distinguish between the two components. The persistence of q_t should be stronger than that of s_t. In order to guarantee that the immediate impact of volatility shocks on the long-run component is smaller than that on the short-run component, we let α ≥ φ. The stationarity of the CGARCH(1,1) process is present if 0 < (α+β) < ρ < 1, 0 < φ< β, 0 < φ < α and ω > 0.

Testing the validity of GARCH models is reduced to verifying the existence of their characteristic roots. For this purpose, the LR test (LR = TR²) is used, followed by a distribution with degrees of freedom (φ) equal to the sum of the lag variables used in the models for ε_t−i and h_t−j. T is the number of observations in the time series, and R² is the coefficient of determination corresponding to the applied conditional variance model (h_t). If the empirical value of the LR test is greater than the critical value of

χ_{m}^{2} (α, ϕ)

, the model is assumed to be adequate. The parameter estimates are computed using the maximum likelihood method by maximising the function:

L L = \prod_{t = 1}^{T} \frac{1}{\sqrt{2 π h_{t}}} \exp (- \frac{ε_{t}^{2}}{2 h_{t}}) .

(17)

Taking the natural logarithm of LL, we obtain the log-likelihood function:

l l = - \frac{1}{2} \sum_{t = 1}^{T} [\ln (2 π) + \ln (h_{t}) + \frac{ε_{t}^{2}}{h_{t}}] .

(18)

Parameter estimates using the maximum likelihood method are obtained by maximising the specified likelihood function, which raises the question of the proper selection of the conditional distribution of ε_t. In the present study, the most commonly used distributions in the specialised literature, i.e., normal, Student’s t-distribution, Generalised Error Distribution (GED) and their skewed variants, are applied (see Table 1).

The designations in the table are as follows:

K (v) = \frac{Γ [\frac{v + 1}{2}]}{\sqrt{π (v - 2)} Γ (\frac{v}{2})},

(19)

C (v) = \frac{v}{2} \sqrt{\frac{Γ [\frac{3}{v}]}{Γ {(\frac{1}{v})}^{3}}},

(20)

k_{v} = \sqrt{2^{- \frac{2}{v}} \frac{Γ [\frac{1}{v}]}{Γ (\frac{3}{v})}},

(21)

a = K (v) . 4 λ (\frac{v - 2}{v - 1}),

(22)

b = \sqrt{1 + 3 λ^{2} - a^{2}},

(23)

λ = \tanh (ξ),

(24)

ζ_{t} = \{\begin{matrix} \frac{b ε_{t} + a}{1 - λ} f o r ε_{t} < - a / b \\ \frac{b ε_{t} + a}{1 + λ} f o r ε_{t} > - a / b \end{matrix},

(25)

(v) = \frac{1}{Γ (\frac{1}{v})} \sqrt{\frac{Γ (\frac{3}{v})}{2 . Γ (\frac{1}{v})} {(\frac{1 + λ}{1 - λ})}^{3} - {[2 λ Γ (\frac{2}{v})]}^{2}},

(26)

4. Results

We adapt the algorithm in estimating GARCH models used by Perlin et al. (2021), and the programming code is available for download on GitHub (https://github.com/ip4-2023/garch-sofix/, accessed on 23 June 2024). The steps are as follows:

Initial data preparation;
Calculation of descriptive statistics for SOFIX;
Testing for stationarity;
Checking for ARCH effects;
Testing different variations of GARCH models;
Selection of the best ARMA-GARCH model for SOFIX analysis, including changes in the variance equation and distribution parameters;
Simulating and forecasting future SOFIX levels.

4.1. The Data

This study tests the validity of different versions of GARCH models with respect to the logarithms of daily return of the SOFIX index traded on the Bulgarian Stock Exchange, which is calculated as follows:

r_{t} = \ln (\frac{P_{t}}{P_{t - 1}})

(27)

where

P_t is the index price for day t at market close;

P_t−1 is the index price for the previous day at market close.

The calculation of SOFIX started on 20.10.2000 with a base value of 100. It includes the most liquid issues of 15 Bulgarian joint-stock companies with a capitalisation of at least BGN 40 million and at least 500 shareholders. This study covers the period from the launch of the index until 28.03.2024. Data for the period from 26.11.2001 were obtained using the website https://stooq.com (accessed on 30 March 2024), while data for the first year were sourced from a previous study (Petkov 2010).

4.2. Descriptive Statistics and Figures for SOFIX

Figure 1 shows the evolution of the SOFIX index, Panel (a), along with its daily log returns, Panel (b).

Figure 1 presents the value of the SOFIX index in Panel (a) and its corresponding log return in Panel (b). The red circles in Panel (b) indicate the largest 10 absolute (positive or negative) price changes found in the time series of log returns. Using Panel (b) in Figure 1, we can see that most of the values for the returns hover around zero. Large changes in prices, both in the downward and upward order, alternate in relatively short periods. That pattern of behaviour is noted in the literature as volatility clustering. Of the ten highest absolute values in price changes, seven occurred at the start of the index launch, two (on 22.1.2008 and 19.11.2008) were recorded in the 2009 financial crisis, and one at the start of the COVID-19 pandemic (10.3.2020, two days after the date of the authorities declaring an emergency). Only two absolute changes at the beginning of the period have a positive sign, and all others are negative. These effects should be considered when modelling returns, which fully justifies the use of GARCH models.

Table 2 shows descriptive statistics of the SOFIX return. The most important values are skewness, kurtosis and Jarque–Bera statistics. Positive skewness means that the distribution has a long right tail, and negative skewness implies that the distribution has a long left tail. If the kurtosis is more than 3, the distribution is peaked (leptokurtic), and if the kurtosis is less than 3, the distribution is flat (platykurtic) relative to the normal. The Jarque–Bera test is a goodness-of-fit test of whether sample data have a normal distribution. It is distributed with 2 degrees of freedom.

The SOFIX return has negative skewness and high positive kurtosis. These values define that its distribution has a long left tail and it is leptokurtic. Jarque–Bera (JB) statistics reject the null hypothesis of normal distribution at the 1% level of significance.

4.3. Testing for Unit Roots

The stationarity of the SOFIX index was investigated using the augmented Dickey and Fuller (Dickey and Fuller 1981) test (ADF), Phillips and Perron (Phillips and Perron 1988) test (PP), Elliott, Rothenberg and Stock (Elliott et al. 1996) test (ADF-GLS) and Kwiatkowski, Phillips, Schmidt and Shine (Kwiatkowski et al. 1992) test (KPSS). These four tests differ with respect to the null and alternative hypotheses. In the ADF, PP and ADF-GLS tests, the null hypothesis is that the series has a unit root, and the alternative is that it is stationary, compared to the KPSS test, where the null and alternative hypotheses are reversed. The main difference between the Phillips and Perron (PP) test and the ADF test lies in the way in which serial correlation and heteroskedasticity in the errors are accounted for. While the ADF test uses parametric autoregression to approximate the ARMA structure of the errors in the test regression, the PP test ignores serial correlation in the test regression. The first advantage of the PP test is its robustness in offering general forms of heteroskedasticity in the error term. Its other main advantage is that the user does not have to specify a lag length for the test regression. Arltová and Fedorová (2016), in their study analysing the main problems in the use of unit root tests based on the statements of Pesaran (2015) and Zivot and Wang (2006), confirm that the main problem of all unit root tests is related to their dependence on the length of the time series under analysis. They point out that the other advantage of the PP test, due to asymptotic theory, is that the test is designed to test unit roots in long time series. The main disadvantage of the ADF and PP tests is their power when the unit root value approaches unity. In these cases, the probability of not rejecting the null hypothesis even when it is not true is high (Enders 1994, p. 251). Although it is argued that the KPSS test, due to the reversal of the null and alternative hypotheses, avoids the disadvantage associated with power in the ADF and PP tests (Charteris and Strydom 2011, p. 58), according to Caner and Kilian (2001), even the KPSS test suffers from a similar disadvantage. Moreover, the power of the three tests has been shown to be lower in the case where a linear deterministic trend is included in the test regression model. This is why Arltová and Fedorová (2016) recommend the ADF-GLS test and the NGP test proposed by (Ng and Perron 1995), where the power problem is absent. Using simulation methods, the authors perform a comparative analysis for appropriate tests for different lengths of time series and values of the parameter in front of the unit root phi 1. They find that in the case of very long time series (for T = 500), the results of all the analysed tests were very similar. The best results for phi 1 < 0.9 were achieved by the NGP, ADF and PP tests, and for phi 1 > 0.9 by the ADF-GLS and NGP tests.

As can be seen from the results presented in Table 3, all the tests conclude that the SOFIX series are stationary with respect to the first differences. In terms of levels, ADF, PP and ADF-GLS confirm stationarity, while with KPSS, it is assumed that a unit root is present. Since the two criteria (ADF and PP) where the null hypothesis assumes that there is a unit root have problems in terms of their power, i.e., the null hypothesis is not true, but it is not rejected, and in this case, the situation is reversed, the log returns were used as the basis for estimating the conditional mean and variance equations. This conclusion is confirmed by the ADF-GLS test, in which the power problems are absent.

4.4. Testing for ARCH Effects

Testing for ARCH effects is performed using the Lagrange multiplier (LM) test (Engle 1982), which is based on a regression model describing the relationship between residual components and their lags. The null hypothesis states that all coefficients prior to lag variables are statistically insignificant, i.e., there are no ARCH effects in the analysed time series. Table 4 reports the results of the LM-ARCH test with a lag order of up to 5, the number of working days in the week, which is enough to capture the volatility memory in daily returns.

The p-value of the LM test in each of the lag orders tested is less than 0.01. This means that with a 1% risk of error, we can conclude that the probability of the existence of an ARCH effect in the time series composed of the SOFIX index values is too high.

Figure 2 exhibits the squared return series of SOFIX for the sample period stated above. The figure shows that there are periods of time when volatility is too high and periods when volatility is characterised by low values.

Figure 2. Daily squared log returns of SOFIX index. Source: derived by the authors using Gretl 2023 c for Windows. Figure 2 presents the squared return series of SOFIX for the period from 23.10.2000 to 28.03.2024.

Taking also into account the values of the autocorrelation function of the squared daily return that is presented in Table 5 and Figure 3, we can draw a general conclusion about the presence of volatility clustering. From a statistical point of view, the volatility clustering implies a strong correlation between the values of squared returns.

The results of volatility clustering and the presence of an ARCH effect in the return series allow us to proceed to the next stage of the algorithm execution, namely analysing GARCH-type models, which involves estimating the conditional variance.

4.5. Estimating a GARCH Model

The estimation of GARCH models1 is performed by determining the number of lags in each part, specifying the mean and the variance equations, as well as the parameter distribution. In our case, five variants of the models are included, each of which is tested assuming the validity of six distributions—normal, Student’s t, GED and their skewed forms. A total of 30 models with only a constant in the conditional mean equation included (that is, neither AR nor MA components are included) are evaluated. The number of models tested corresponds to the combination of five variants of a pure GARCH model (i.e., not including lags in the AR and MA components of the mean equation but considering only a constant), each estimated under six different distributions. The best 10 models according to the BIC and AIC are presented in Table 6.

Analysing the values of the information criteria, we find that the top 10 models lie in the range of 0.008–0.009. Four of them are Component GARCH models estimated with the Student’s t-distribution and GED, as well as their skewed variants. The remaining six models are standard GARCH, EGARCH and IGARCH, estimated with the standard and skewed Student’s t-distributions. By focusing on the alpha1 and beta1 estimates, we can eliminate the validity of the two IGARCH models because beta1 is statistically insignificant. In the standard GARCH model, the parameters are significant, but their sum is 0.999, raising reasonable doubts about satisfying the stationarity requirement. In the two EGARCH models, the alpha1 parameters, in addition to being statistically insignificant, are also negative in sign, which does not satisfy the stationarity conditions, casting doubt on their validity. In all four versions of the CGARCH models, the alpha1 and beta1 parameters are statistically significant at 1% risk of error, and their sum is less than 1. The parameters pho and phi are also statistically significant, which strengthens the conclusion about the adequacy of the CGARCH model in forecasting the volatility of SOFIX.

4.6. Selection of the Best ARMA-GARCH Model

In this stage, when selecting the best model, the two principal components, AR and MA, are included in the conditional mean equation in addition to the constant. Selection of the best ARMA-GARCH model for SOFIX analysis, including changes in the mean and variance equations and distribution parameters, is performed using the BIC and AIC. In the specification, one lag order is chosen as the maximum value for the AR and MA components in the equation of means, while in the volatility equation for the ARCH component, the maximum lag order is equal to two, and for the GARCH component, the maximum lag is one. A total of 240 models were fitted. The number of models tested corresponds to the combination of five GARCH model variants, each estimated at six different distributions, allowing for 30 combinations. When specifying the AR (lags 0 and 1) and MA (lags 0 and 1) components of the equation of mean, as well as the ARCH (lags 1 and 2) and GARCH (lag 1) components in the equation of variance, eight combinations (2 × 2 × 2 × 1) are possible, resulting in a total of 240 (30 × 8) combinations of tested models. A sample from the best model selection table that includes only the Student’s t-distribution is presented in Figure A1 of Appendix A. The best 10 models according to the BIC and AIC are presented in Table 7.

All models were estimated assuming the validity of a standard (seven models) or skewed (three models) Student’s t-distribution. Table 7 shows that the top five best models, coloured red, are confirmed by both the BIC and AIC. For the rest, there are minor differences in the ordering. According to AIC, the top ten models are located within 0.006, while according to the BIC, the interval is 0.005. With the inclusion of ARMA components in the mean equation, seven of the ten models are CGARCH, and three are EGARCH. Among the top ten models is the best of the models estimated without the inclusion of AR and MA components, namely ARMA(0,0)-CGARCH, estimated under the assumption of a Student’s t-distribution. Table 8 presents the parameter estimates of the models occupying the top six positions. In Table A1 and Table A2 in Appendix B are presented results of the estimation of two GARCH (EGARCH and CGARCH) models to the log returns of SOFIX. Each model has one lag at the AR and MA components in the conditional mean equation, the same variance equation (q = 2, p = 1 and q = 1, p = 1 respectively) and different distributions of residuals (Student’s t and skewed t).

The best six models consist of four CGARCH and two EGARCH models. Analysing the estimates of the alpha and beta parameters, we can reject the adequacy of the EGARCH models since the alpha1 and alpha2 parameters are negative and statistically insignificant. In the four CGARCH models, the constants in the mean and conditional variance equations are statistically insignificant, while the estimates of all other parameters, except the alpha2 parameters, are statistically significant at the 1% risk of error. This implies that the models that include a second lag order with respect to the ARCH component are inadequate. In the end, the SOFIX data show that ARMA(1,1)-CGARCH(1,1), estimated under the assumption of a standard or skewed Student’s t-distribution, turns out to be the most appropriate to describe its dynamics. The parameters pho and phi, for shape and skewness, as well as those in front of the AR and MA components, are statistically significant at the 1% significance level. There are almost no differences in the estimates of the models with standard and skewed t-distributions of the residual component, and starting from the BIC, the ARMA(1,1)-CGARCH(1,1) model estimated with the assumption of the standard Student’s t-distribution is chosen as the most appropriate one for prediction. The autoregressive persistence of the transitory component amounts to 0.904, and that of the permanent component amounts to 0.998. The immediate impact of volatility shocks (

ε_{t - 1}^{2} - h_{t - 1}

) on the short-run component is 0.242, and that on the long-run component is equal to 0.060. The stationarity of the CGARCH(1,1) process is present because 0 < (α + β = 0.904) < ρ = 0.998 < 1, 0 < φ = 0.060 < β = 0.662, 0 < φ = 0.060 < α = 0.242 and ω does not take a negative value.

4.7. Forecasting SOFIX Index with ARMA(1,1)-CGARCH(1,1) Model

In order to confirm the correctness of the inference about the best model, the cross-validation methodology is applied. We divide the total period, which consists of 5780 observations, into two sets: training (2000–2020) and testing (2021–2024). The first sub-period covers 4980 observations and lasts from 23.10.2000 to 30.12.2020. The second period consists of 800 observations and runs from 04.01.2021 to 28.03.2024. Thus, forecasts for the test period (2021–2024), calculated based on the results of applying the tested five variants of the GARCH models to the training period (2000–2020), can be used as an alternative way to choose the best model. In this way, the forecasts obtained with the tested GARCH variants can be checked with respect to their quality, and the selection of the best model will not only be based on the information criteria but also on the forecast quality measures.

All models were fitted on the training period (2000–2020) with the inclusion of a constant and one lag for each of the AR and MA components in the mean equation, and the ARCH and GARCH components in the conditional variance equation. Starting from the shape of the distribution of the logarithms of the SOFIX returns, out of the six possible distributions, only the Student’s t-distribution is used here. Table 9 demonstrates the forecast performance of five tested GARCH variants made for the testing period (2021–2024). In terms of the error metrics MAE and MSE, the ARMA(1,1)-CGARCH(1,1) model outperforms the other specifications, again confirming the conclusion of the best model determined by the information criteria.

Finally, to confirm the robustness of the estimates obtained with the defined best model, we apply it to different sub-periods arising from the stages in the evolution of the index. The first period coincides with the period we have defined as training, while the second period covers the last two periods in the index dynamics, namely from 3.1.2007 to 28.3.2024. The results are presented in Table 10.

Tracking the parameter estimates for the three periods (full range, beginning period and ending period), we find that the significance of all parameters is maintained without exception, with only minor changes in their values. For example, the parameter estimates for the AR and MA components of 0.981 and −0.968 over the full range period are reduced to 0.974 and −0.955 in the beginning period and increase to 0.987 and −0.975 in the ending period, respectively. Similarly, the parameter estimates of the lag orders in the variance equation of 0.243 and 0.662 over the entire period change to 0.255 and 0.644 in the initial period and to 0.249 and 0.704 in the final period. The same conclusions can be drawn regarding the parameter estimates related to accounting for the impact of long-term shocks and short-term shocks. All these facts prove the robustness of the model and allow its use for forecasting using simulations.

The Ljung–Box tests (Ljung and Box 1978) and ARCH LM tests provide a means of testing for autocorrelation within the GARCH models’ standardised squared residuals. Ljung–Box statistics can also be used to test serial autocorrelation among the standardised residuals. If the GARCH model has done its job, there should be no autocorrelation within the residuals. With the null hypothesis of the Ljung–Box test, it is argued that there is no autocorrelation between the residuals for a set of lags. The results show (the p-values > 0.05) that there is no autocorrelation among the squared residuals for different lags.

The LM ARCH test is actually applied to verify the presence of the ARCH effect. It is used to test the null hypothesis of the adequately fitted ARCH process. The results show (the p-values for each of the lags are above 0.05) that the GARCH process is adequately fitted.

Engle and Ng sign bias tests (Engle and Ng 1993) provide a means of testing for the mis-specification of conditional volatility models. Specifically, they examine whether the standardised squared residual is predictable using (dummy) variables indicative of certain information. The null hypothesis for these tests is that parameters corresponding to the additional (dummy) variables are equal to 0, i.e., no significant negative and positive reaction shocks. The alternative hypothesis is that the additional parameters are non-zero, which indicates a mis-specification of the model. The results show (the p-values > 0.05) that there are no significant negative or (and) positive shocks.

The robustness checks performed on the results allow us to use the model identified as the best GARCH model to make predictions using simulations. A number of simulations can be calculated to predict future levels of SOFIX returns. The simulations are run by sequentially imputing the first level of returns in the pre-specified best model and emitting random samples from the residual distributions. This approach allows us to generate time series with an arbitrary forecast horizon, with the simulated series having the same characteristics as the original model. Based on the computed simulations, the forecast probability of reaching the historical SOFIX peak is estimated. The forecast is calculated based on 5000 simulations. The simulations are done on the whole period (2020–2024), consisting of 5780 observations.

Since the purpose of the simulation is to predict the period in which the peak value can be reached, two scenarios are applied here. The first covers the entire period since the creation of the index, i.e., the period from 20.10.2000 to 28.03.2024. This scenario can be defined as realistic. Thus, 28.03.2024 is chosen as the starting point for the simulations. In the second scenario, the starting point coincides with the beginning of the second stage in the development of SOFIX, namely 03.01.2007. The scenario can be formulated as pessimistic. A third scenario (defined as optimistic) is not suitable because it would have to drop the date on which the peak value was realised, and in the simulations, the goal is to predict precisely the reaching of this peak value.

Figure 4 presents the simulation results according to the realistic scenario using the GARCH model selected as the best fit in Panel (a) and the probabilities of SOFIX reaching its peak (2024–2100) in Panel (b).

First, the future values of the return logarithms are simulated, and then these simulations are used to calculate the predicted index values. The black line reflects the evolution of the index, while the grey area reflects the different simulations. The historical peak of SOFIX was reached at 08.10.2007, with the closing value of 1976.73. The last day of the sample of prices is 28.03.2024, which defines the starting point of the simulation. The simulation shows that the trend in SOFIX values is rather upward. In the long run, financial index prices tend to increase in value, and the GARCH model used is able to capture this effect. Although the simulated value of the index may decline in the short term, the probability of reaching the historical peak increases with time. The chart shows that the first simulated price that crosses the peak is too soon after 28.03.2024. In calculating the simulations for each time point, the relative fraction of times in which a value of 1976.73 is reached is calculated. The calculations continue for all future dates, resulting in a vector of probabilities. Figure 4 Panel (b) presents the actual probabilities that future values of the SOFIX will reach the peak before the global financial crisis in 2008. As might be expected, the probability increases over time. The first date, 08.05.2024, shows the first probability, which is greater than 0.1%. According to the model, the probability of the index peaking in the first two months of the forecast period is almost zero. The chances of reaching the peak value are estimated at 50% after 11.02.2034. The probability increases over time, reaching a 90% probability of an event on 13.08.2087, more than sixty years after the last day in the price sample.

Figure 5 presents the simulation results according to the pessimistic scenario.

According to the pessimistic scenario, the probability of the index reaching the peak value is estimated at 50% after 30.12.2045, i.e., 11 years later than in the realistic scenario. The probability increases over time but much more slowly. The chances of reaching the peak values of 1976.73 are estimated to be only 70% as of 26.07.2087. In other words, the probability of reaching the peak value of the index with the pessimistic version is expected to be fulfilled at least 30 years later, compared to the probability accounted for by the realistic option.

5. Discussion

In the present study, the forecast is based on the ARMA(1,1)-CGARCH(1,1) model with a Student’s t-distribution identified as the best of the GARCH models applied here—standard GARCH, IGARCH, EGARCH, GJR-GARCH and CGARCH, estimated with the assumption of six distributions (Normal, Student’s t, GED and their skewed forms). In estimating the forecast, it is found that the volatility of the SOFIX index can be described using two components, reflecting the short-run and long-run effects of shocks, respectively, which, according to Christoffersen et al. (2008), allows for its more accurate modelling. Long-term volatility is represented by a permanent component, and a transitory component represents short-term volatility. According to the results, when predicting future values, it is necessary to include both an autoregressive component and a first-order moving average in the mean equation.

The parameter α, reflecting the initial impact of shocks on the transitory component of volatility, is statistically significant at the 1% level and amounts to 0.242. Hence, we can say that the initial impact of shocks on the transitory component in SOFIX volatility is substantially positive. The parameter β reflecting the degree of memory in the transitory component is also statistically significant, and its value is 0.662. The sum of the two parameters is equal to 0.904, which means that the mean of the short-run volatility tends toward zero at a geometric level of 0.904.

In the equation describing the long-run volatility, all parameters are significant except the constant ω. As required, the parameter ρ tends toward 1 and, in this case, amounts to 0.998. This implies that the long-run volatility reaches its mean very slowly, with the effect of historical shocks remaining constant over a long period of time. Since the parameter ρ is larger in value than the sum of the parameters α and β, we can conclude that the persistence in volatility is greater in the long run. The parameter φ is equal to 0.060 and reflects the influence of the time-varying constant component.

From the results obtained, we can confirm the conclusion drawn by Charteris and Strydom (2011, p. 60) that the CGARCH model specification supports a more appropriate description of SOFIX volatility than a simple GARCH model. They employ the Component GARCH model to examine the volatility of South African and United States Treasury Bonds and Treasury Bills. In estimating the four treasury securities, asymmetric Component GARCH was applied by incorporating the adjustment inherent in Threshold GARCH into the transitory component of the volatility equation. The asymmetry in volatility is present when the parameter γ is positive and statistically significant. Of the four assets, asymmetry is found only with respect to the S.A. T-Bond. The authors find, as we do in this study, that long-term shocks have a more persistent impact on volatility than the effect of short-term shocks, and furthermore, for the same magnitude, negative shocks to South African Treasury Bonds have a more significant impact on volatility than positive shocks.

Naik and Padhi (2014), in their study, apply GARCH, EGARCH, GJR-GARCH and asymmetric CGARCH models in estimating the volatility of the S&P CNX Nifty index, one of the benchmark indices of the Indian equity markets. All models are estimated assuming Gaussian normal distribution for GARCH(1,1), GJR-GARCH(1,1), asymmetric CGARCH(1,1) and Generalised Error Distribution (GED) for EGARCH(1,1) models. Based on the AIC and BIC, EGARCH(1,1) is determined to be the best of all asymmetric models. The results show that significant asymmetry in volatility is inherent in the index under study, confirming the leverage effect hypothesis.

In contrast to our study, Naik and Padhi (2014) do not focus on the components included in the conditional mean equation but only present results concerning the conditional volatility equation. Furthermore, they do not analyse the models assuming different random error distributions, relying only on the normal distribution and the Generalised Error Distribution (GED). Finally, only GARCH models of order(1,1) were tested; estimation of possibilities for other orders was not done, as in our case.

However, as with any simulation, there is no guarantee that our forecast is the best answer and that the calculated probabilities are realistic, especially for such a long time horizon of sixty years. It should be noted that the GARCH model is only a limited representation of the financial sustainability indicator, and no model can perfectly reproduce the real situation. Moreover, only some of the volatility modelling methods are covered in this study. Not all asymmetric GARCH models are considered, like Bayesian estimation methods or multivariate GARCH models, for example. With other methods, it is possible to obtain different results. These methods and models will be used in future research by the authors to model SOFIX dynamics. An object of future work is also the use of the methodology of Sensitivity Analysis, through which the influence of the share price of each company that is included in the scope of the SOFIX index on its price can be studied.

Author Contributions

Conceptualisation, P.P.; methodology, P.P.; software, P.P.; validation, P.P., E.O. and T.V.; formal analysis, P.P.; investigation, P.P.; resources, P.P.; data curation, P.P. and A.L.; writing—original draft preparation, P.P., E.O. and T.V.; writing—review and editing, P.P. and E.O.; visualisation, P.P.; supervision, P.P. and M.S.; project administration, P.P. and M.S.; funding acquisition, M.S. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the Special Account for the Institute for Scientific Research of the D. A. Tsenov Academy of Economics, Svishtov, Bulgaria, which funded the authors in the framework of the infrastructure research project IP4-2023 “Comparative analysis of conventional statistical methods and artificial neural networks in time series forecasting”.

Data Availability Statement

The data and the programming codes in R are publicly available for download on GitHub (https://github.com/ip4-2023/garch-sofix/ accessed on 29 July 2024).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A

Figure A1. Sample of selecting GARCH models using BIC and AIC, including only Student’s t-distribution.

Figure A1 reports AIC (left panel) and BIC (right panel) values for different ARMA-GARCH models, changing lags and variance equation, and only one distribution parameter—Student’s t-distribution. The best model is the one with the lowest values of AIC and BIC, highlighted by the blue star. All results of the figure can be replicated using R script 05-Find_Best_Garch_SOFIX_5X6_std.R. The results, as well as the figure with the selection of the best model among all 240 tested combinations (when all 6 distributions are estimated), can be obtained by running the script 05-Find_Best_Garch_SOFIX_5X6.R.

Appendix B

Table A1. Estimation results of the best EGARCH and CGARCH models (part 1).

Parameters	Model 1	Model 2	Model 3	Model 4
mu	0.000 **	0.000	0.000	0.000
	(0.000)	(0.000)	(0.000)	(0.000)
ar1	0.976 ***	0.973 ***	0.981 ***	0.981 ***
	(0.002)	(0.003)	(0.002)	(0.002)
ma1	−0.961 ***	−0.956 ***	−0.968 ***	−0.967 ***
	(0.000)	(0.001)	(0.001)	(0.001)
omega	−0.326 ***	−0.328 ***	0.000	0.000
	(0.051)	(0.050)	(0.000)	(0.000)
alpha1	−0.001	−0.002	0.243 ***	0.244 ***
	(0.025)	(0.025)	(0.029)	(0.029)
alpha2	−0.001	−0.001	0.000	0.000
	(0.026)	(0.026)	(0.035)	(0.032)
beta1	0.965 ***	0.964 ***	0.660 ***	0.660 ***
	(0.006)	(0.006)	(0.049)	(0.049)
gamma1	0.542 ***	0.543 ***
	(0.034)	(0.035)
gamma2	−0.208 ***	−0.208 ***
	(0.032)	(0.032)
shape	3.551 ***	3.547 ***	3.783 ***	3.783 ***
	(0.105)	(0.178)	(0.142)	(0.142)
skew		0.989 ***		0.996 ***
		(0.015)		(0.020)
rho			0.998 ***	0.998 ***
			(0.000)	(0.000)
phi			0.059 ***	0.059 ***
			(0.012)	(0.012)
Variance model	EGARCH	EGARCH	CGARCH	CGARCH
Distribution	std	sstd	std	sstd
Log likelihood	19,234.679	19,234.905	19,241.638	19,241.668
AIC	−6.652	−6.652	−6.655	−6.654
BIC	−6.641	−6.639	−6.643	−6.642

Note: ***, ** and * indicate significance at the 1%, 5% and 10% levels, respectively.

Table A1 presents the results of the estimation of two GARCH (EGARCH and CGARCH) models to the log returns of SOFIX. Each model has one lag at the AR and MA components in the conditional mean equation, the same variance equation (q = 2 and p = 1) and a different distribution of residuals (Student’s t and skewed t). The results were found executing the script named 04-Estimate_BEST_GARCH_SOFIX_2X2.R.

Table A2. Estimation results of the best EGARCH and CGARCH models (part 2).

	Model 1	Model 2	Model 3	Model 4
mu	0.000	0.000	0.000	0.000
	(0.000)	(0.000)	(0.000)	(0.000)
ar1	0.963 ***	0.961 ***	0.980 ***	0.981 ***
	(0.025)	(0.003)	(0.002)	(0.002)
ma1	−0.943 ***	−0.941 ***	−0.966 ***	−0.967 ***
	(0.033)	(0.001)	(0.001)	(0.001)
omega	−0.519 **	−0.522 ***	0.000	0.000
	(0.189)	(0.113)	(0.000)	(0.000)
alpha1	−0.006	−0.007	0.242 ***	0.243 ***
	(0.014)	(0.014)	(0.016)	(0.016)
beta1	0.944 ***	0.943 ***	0.662 ***	0.662 ***
	(0.021)	(0.012)	(0.052)	(0.051)
gamma1	0.439 ***	0.441 ***
	(0.057)	(0.037)
shape	3.493 ***	3.487 ***	3.780 ***	3.778 ***
	(0.180)	(0.004)	(0.134)	(0.135)
skew		0.987 ***		0.996 ***
		(0.015)		(0.019)
rho			0.998 ***	0.998 ***
			(0.000)	(0.000)
phi			0.060 ***	0.060 ***
			(0.015)	(0.015)
Variance model	EGARCH	EGARCH	CGARCH	CGARCH
Distribution	std	sstd	std	sstd
Log likelihood	19,221.011	19,221.303	19,241.142	19,241.165
AIC	−6.648	−6.648	−6.655	−6.654
BIC	−6.639	−6.637	−6.644	−6.643

*** p < 0.001; ** p < 0.01; * p < 0.05.

Table A2 presents the results of the estimation of two GARCH (EGARCH and CGARCH) models to the log returns of SOFIX. Each model has one lag at the AR and MA components in the conditional mean equation, the same variance equation (q = 1 and p = 1) and different distributions of residuals (Student’s t and skewed t). The results were found executing the script named 04-Estimate_BEST_GARCH_SOFIX_2X2_2.R.

Note

1	Analysis of the data was done using the rugarch package (Ghalanos 2023).

References

Andersen, Torben, Tim Bollerslev, Peter Christoffersen, and Francis Diebold. 2006. Volatility and Correlation Forecasting. In Handbook of Economic Forecasting. Oxford: Newnes, vol. 1, pp. 778–878. [Google Scholar]
Arltová, Markéta, and Darina Fedorová. 2016. Selection of Unit Root Test on the Basis of Length of the Time Series and Value of AR(1) Parameter. Statistika: Statistics & Economy Journal 96: 47–64. [Google Scholar]
Arneric, Josip, and Blanka Scrabic-Peric. 2018. Panel GARCH model with cross-sectional dependence between CEE emerging markets in trading day effects analysis. Romanian Journal of Economic Forecasting 21: 71–84. [Google Scholar]
Bilyk, Oleh, Pawel Sakowski, and Robert Ślepaczuk. 2020. Investing in VIX Futures Based on Rolling GARCH Models Forecasts. Working Papers 2020-10. Warsaw, Poland: Faculty of Economic Sciences, University of Warsaw. Available online: https://econpapers.repec.org/paper/warwpaper/2020-10.htm (accessed on 3 August 2024).
Bollerslev, Tim. 1986. Generalised autoregressive conditional heteroskedasticity. Journal of Econometrics 31: 307–27. [Google Scholar] [CrossRef]
Caner, Mehmet, and Lutz Kilian. 2001. Size Distortions of Tests of the Null Hypothesis of Stationarity: Evidence and Implications for the PPP Debate. Journal of International Money and Finance 20: 639–57. [Google Scholar] [CrossRef]
Charteris, Ailie, and Barry Strydom. 2011. An Examination of the Volatility of South African Risk-Free Rate Proxies: A Component GARCH Analysis. Studies in Economics and Econometrics 35: 49–64. [Google Scholar] [CrossRef]
Chhajer, Parshv, Manan Shah, and Ameya Kshirsagar. 2022. The applications of artificial neural networks, support vector machines, and long–short term memory for stock market prediction. Decision Analytics Journal 2: 100015. [Google Scholar] [CrossRef]
Christoffersen, Peter, Kris Jacobs, Chayawat Ornthanalai, and Yintian Wang. 2008. Option Valuation with Long-Run and Short-Run Volatility Components. Journal of Financial Economics 90: 272–97. [Google Scholar] [CrossRef]
Dickey, David A., and Wayne A. Fuller. 1981. Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root. Econometrica 49: 1057–72. [Google Scholar] [CrossRef]
Elliott, Graham, Thomas J. Rothenberg, and James H. Stock. 1996. Efficient Tests for an Autoregressive Unit Root. Econometrica 64: 813–36. [Google Scholar] [CrossRef]
Enders, Walter. 1994. Applied Econometric Time Series. New York: John Wiley & Sons. [Google Scholar]
Engle, Robert F. 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 50: 987–1008. [Google Scholar] [CrossRef]
Engle, Robert F., and Gary G. J. Lee. 1999. A permanent and transitory component model of stock return volatility. In Cointegration, Causality and Forecasting: A Festschrift in Honour of Clive W.J. Oxford: Oxford University Press, pp. 475–97. [Google Scholar]
Engle, Robert F., and Tim Bollerslev. 1986. Modelling the persistence of conditional variances. Econometric Reviews 1: 1–50. [Google Scholar] [CrossRef]
Engle, Robert F., and Victor K. Ng. 1993. Measuring and Testing the Impact of News on Volatility. The Journal of Finance 48: 1749–78. [Google Scholar] [CrossRef]
Ge, Wenbo, Pooia Lalbakhsh, Leigh Isai, Artem Lenskiy, and Hanna Suominen. 2022. Neural network–based financial volatility forecasting: A systematic review. ACM Computing Surveys (CSUR) 55: 1–30. [Google Scholar] [CrossRef]
Gerunov, Anton. 2023. Stock Returns Under Different Market Regimes: An Application of Markov Switching Models to 24 European Indices. Ikonomicheski izsledvaniq 1: 18–35. [Google Scholar]
Ghalanos, Alexios. 2022. Introduction to the Rugarch Package. Available online: https://cran.r-project.org/web/packages/rugarch/vignettes/Introduction_to_the_rugarch_package.pdf (accessed on 26 June 2024).
Ghalanos, Alexios. 2023. Rugarch: Univariate GARCH Models. Available online: https://cran.r-project.org/web/packages/rugarch/rugarch.pdf (accessed on 26 June 2024).
Glosten, Lawrence R., Ravi Jagannathan, and David E. Runkle. 1993. Relation between expected value and the nominal excess return on stocks. Journal of Finance 48: 127–38. [Google Scholar] [CrossRef]
Grudniewicz, Jan, and Robert Ślepaczuk. 2023. Application of machine learning in algorithmic investment strategies on global stock markets. Research in International Business and Finance 66: 102052. [Google Scholar] [CrossRef]
Khan, Maaz, Umar Nawaz Kayani, Mrestyal Khan, Khurrum Shahzad Mughal, and Mohammad Haseeb. 2023. COVID-19 pandemic & financial market volatility; evidence from GARCH models. Journal of Risk and Financial Management 16: 50. [Google Scholar] [CrossRef]
Kijewski, Mateusz, and Robert Ślepaczuk. 2020. Predicting Prices of S&P500 Index Using Classical Methods and Recurrent Neural Networks. Working Papers 2020-27. Available online: https://econpapers.repec.org/paper/warwpaper/2020-27.htm (accessed on 3 August 2024).
Kwiatkowski, Denis, Peter CB Phillips, Peter Schmidt, and Yongcheol Shin. 1992. Testing the null hypothesis of stationarity against the alternative of a unit root. Journal of Econometrics 54: 159–78. [Google Scholar] [CrossRef]
Liu, Tong, and Yanlin Shi. 2022. Innovation of the Component GARCH Model: Simulation Evidence and Application on the Chinese Stock Market. Mathematics 10: 1903. [Google Scholar] [CrossRef]
Ljung, Greta Marianne, and George Edward Pelham Box. 1978. On a measure of lack of fit in time series models. Biometrika 65: 297–303. [Google Scholar] [CrossRef]
Milinov, Valentin, and Nigokhos Kanaryan. 2006. The risk-volume relationship of the most liquid Bulgarian shares. Dialog 1: 44–54. [Google Scholar]
Minchev, Teodor. 2024a. When SOFlX Will Break Its 2007 Peak. Available online: https://money.bg/finance/koga-sofix-shte-probie-varha-si-ot-2007-a-godina.html (accessed on 1 June 2024).
Minchev, Teodor. 2024b. SOFlX at Its Highest Level Since 2008. Available online: https://money.bg/finance/sofix-pri-nay-visokoto-si-nivo-ot-2008-godina-nasam.html (accessed on 1 June 2024).
Mustapa, Farah Hayati, and Mohd Tahir Ismail. 2019. Modelling and forecasting S&P 500 stock prices using hybrid Arima-Garch Model. Journal of Physics: Conference Series 1366: 012130, IOP Publishing. [Google Scholar] [CrossRef]
Naik, Pramod Kumar, and Puja Padhi. 2014. Equity Trading Volume and its Relationship with Market Volatility: Evidence from Indian Equity Market. Journal of Asian Business Strategy 4: 108–24. [Google Scholar]
Ncube, Mbongiseni, Mabutho Sibanda, and Frank Ranganai Matenda. 2024. Investigating the Effects of the COVID-19 Pandemic on Stock Volatility in Sub-Saharan Africa: Analysis Using Explainable Artificial Intelligence. Economies 12: 112. [Google Scholar] [CrossRef]
Nelson, Daniel. 1991. Conditional heteroskedasticity in assets returns: A new approach. Econometrica: Journal of the Econometric Society 59: 347–70. [Google Scholar] [CrossRef]
Ng, Serena, and Pierre Perron. 1995. Unit Root Tests in ARMA Models with Data-Dependent Methods for the Selection of the Truncation Lag. Journal of the American Statistical Association 90: 268–81. [Google Scholar] [CrossRef]
Ouchen, Abdessamad. 2023. Econometric Modeling of the Impact of the COVID-19 Pandemic on the Volatility of the Financial Markets. Engineering Proceedings 39: 14. [Google Scholar] [CrossRef]
Paskaleva, Mariya, and Ani Stoykova. 2021. Globalisation effects on contagion risks in financial markets. Ekonomicko-Manazerske Spectrum 15: 38–54. [Google Scholar] [CrossRef]
Patev, Plamen, Nigokhos Kanaryan, and Katerina Lyroudi. 2009. Modelling and forecasting the volatility of thin emerging stock markets: The case of Bulgaria. Comparative Economic Research. Central and Eastern Europe 12: 47–60. [Google Scholar] [CrossRef]
Perlin, Marcelo, Mauro Mastella, Daniel Vancin, and Henrique Ramos. 2021. A GARCH tutorial with R. Revista de Administração Contemporânea 25. [Google Scholar] [CrossRef]
Pesaran, M. Hashem. 2015. Time Series and Panel Data Econometrics. Oxford: Oxford University Press. [Google Scholar]
Petkov, Plamen. 2010. Econometric analysis of the main Bulgarian indices with GARCH(1,1) models. Paper presented at Jubilee Scientific Conference Market Development of the Bulgarian Economy—Two Decades after the Change, Svishtov, Bulgaria, November 4–5; pp. 332–36, PH “Tsenov”. [Google Scholar]
Petrică, Andreea-Cristina, Stelian Stancu, and Virgil Ghițulescu. 2017. EGARCH versus PARCH approach in modeling developed and underdeveloped stock markets. Paper presented at New Trends in Sustainable Business and Consumption, Graz, Austria, May 31–June 3; pp. 513–21, BASIQ. [Google Scholar]
Petrova, Mariana, and Teodor Todorov. 2023. Empirical Testing of Models of Autoregressive Conditional Heteroscedasticity Used for Prediction of the Volatility of Bulgarian Investment Funds. Risks 11: 197. [Google Scholar] [CrossRef]
Phillips, Peter C. B., and Pierre Perron. 1988. Testing for a unit root in time series regression. Biometrika 75: 335–46. [Google Scholar] [CrossRef]
Roszyk, Natalia, and Robert Ślepaczuk. 2024. The Hybrid Forecast of S&P 500 Volatility Ensembled from VIX, GARCH and LSTM Models. Working Papers 2024-13. Available online: https://ssrn.com/abstract=4903194 (accessed on 3 August 2024).
Setiawan, Budi, Marwa Ben Abdallah, Maria Fekete-Farkas, Robert Jeyakumar Nathan, and Zoltan Zeman. 2021. GARCH (1,1) models and analysis of stock market turmoil during COVID-19 outbreak in an emerging and developed economy. Journal of Risk and Financial Management 14: 576. [Google Scholar] [CrossRef]
Sezer, Omer Berat, Mehmet Ugur Gudelek, and Ahmet Murat Ozbayoglu. 2020. Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Applied Soft Computing 90: 106181. [Google Scholar] [CrossRef]
Sun, Kaiying. 2017. Equity return modeling and prediction using hybrid ARIMA-GARCH model. International Journal of Financial Research 8: 154–61. [Google Scholar] [CrossRef]
Ślepaczuk, Robert, and Maryna Zenkova. 2018. Robustness of support vector machines in algorithmic trading on cryptocurrency market. Central European Economic Journal 5: 186–205. [Google Scholar] [CrossRef]
Tabash, Mosab, Neenu Chalissery, Mohamed Nishad, and Mujeeb Saif Mohsen Al-Absy. 2024. Market Shocks and Stock Volatility: Evidence from Emerging and Developed Markets. International Journal of Financial Studies 12: 2. [Google Scholar] [CrossRef]
Tikhomirov, Nikolay, and Elena Dorokhina. 2002. Ekonometrika. Moskva: Rossiĭskaya ekonomicheskaya akademiya. [Google Scholar]
Tsenkov, Vladimir, and Ani Stoitsova-Stoykova. 2017. The Impact of the Global Financial Crisis on the market efficiency of capital markets of south east Europe. International Journal of Contemporary Economics and Administrative Sciences 7: 31–57. [Google Scholar]
Tsenkov, Vladimir. 2011. The hypothesis of efficient markets and the global financial crisis—On the example of the SOFIX, DJIA and DAX indices. Economic Research 3: 53–88. [Google Scholar]
Ugurlu, Erginbay, Eleftherios Thalassinos, and Yusuf Muratoglu. 2014. Modeling volatility in the stock markets using GARCH models: European emerging economies and Turkey. International Journal in Economics and Business Administration 2: 72–87. [Google Scholar] [CrossRef]
Vo, Nguyen, and Robert Ślepaczuk. 2022. Applying hybrid ARIMA-SGARCH in algorithmic investment strategies on S&P500 index. Entropy 24: 158. [Google Scholar] [CrossRef]
Zhao, Pengfei, Haoren Zhu, Wilfred Siu Hung Ng, and Dik Lun Lee. 2024. From GARCH to Neural Network for Volatility Forecast. Proceedings of the AAAI Conference on Artificial Intelligence 38: 16998–7006. [Google Scholar] [CrossRef]
Zivot, Eric, and Jiahui Wang. 2006. Modeling Financial Time Series with S-PLUS^®, 2nd ed. New York: Springer. [Google Scholar] [CrossRef]

Figure 1. Prices and returns of the SOFIX index (2000–2024). Source: derived by the authors using adapted codes in R proposed in the publication of Perlin et al. (2021) and available for download on GitHub (https://github.com/ip4-2023/garch-sofix/, accessed on 23 June 2024). All price data are in the daily frequency, covering the period between October 2000 and March 2024. In (a), we use the consumer price index and the inflation benchmark to calculate real returns in the period. The red circles in (b) indicate the largest 10 absolute (positive or negative) price changes found in the sample of log returns. Figures can be replicated using the R script 02-Do_Descriptive_Figures_SOFIX.R.

Figure 3. Autocorrelation functions of daily squared log returns of SOFIX index. The autocorrelation at lag k (x-axis) shows the linear correlation between squared log returns at times t and t-k for the SOFIX index. The blue dashed lines represent numerical limits for the statistical significance of correlation. The autocorrelagram can be replicated using R script 02-Do_Descriptive_Figures_SOFIX.R.

Figure 4. Simulating the SOFIX index (2024-2029) and probabilities of SOFIX reaching its peak (2024–2100) using the realistic scenario (starting point at 20.10.2000). Panel (a) shows the simulation results using a GARCH model. After estimating the model selected by the BIC, we simulate its forward log returns and use them to construct projections of the future value of the index. The solid black line is the index itself, while the grey lines are the different paths of the simulation. Panel (b) shows the probabilities of SOFIX reaching its peak value once again, calculated using the simulated prices from the GARCH model. The x-axis represents the forward time period, while the y-axis is the probability itself. We calculate the probability as the proportion of times wherein the forward simulated index value is higher than the historical peak value of 1976.73. The simulations were calculated based on the SOFIX data covering the entire study period (2000–2024), consisting of 5780 observations. Figures can be replicated using the R script 06-Simulation_Forecasts_SOFIX.R.

Figure 5. Simulating the SOFIX index (2024–2029) and probabilities of SOFIX reaching its peak (2024–2100) using the pessimistic scenario (starting point at 03.01.2007). Panel (a) shows the simulation results using a GARCH model. After estimating the model selected by the BIC, we simulate its forward log returns and use them to construct projections of the future value of the index. The solid black line is the index itself, while the grey lines are the different paths of the simulation. Panel (b) shows the probabilities of SOFIX reaching its peak value once again, calculated using the simulated prices from the GARCH model. The x-axis represents the forward time period, while the y-axis is the probability itself. We calculate the probability as the proportion of times wherein the forward simulated index value is higher than the historical peak value of 1976.73. The simulations were calculated based on the SOFIX data covering the period from 03.01.2007 to 28.03.2024, consisting of 4257 observations. Figures can be replicated using the R script 06-Simulation_Forecasts_SOFIX-2007-2024.R.

Table 1. Conditional densities for ε_t.

Name of Distribution	Density	Parameters
Normal (norm)	$\frac{1}{\sqrt{2 π h_{t}}} e x p \{- \frac{ε_{t}^{2}}{2}\}$	none
Student’s t (std)	$\frac{K (v)}{\sqrt{h_{t}}} {[1 + \frac{ε_{t}^{2}}{v - 2}]}^{- (v + 1) / 2}$	$v > 2$
Generalised Error Distribution (GED)	$C (v) e x p \{- {\|\frac{ε_{t}}{k_{v}}\|}^{v}\}$	$v > 0$
Skewed normal (snorm)	$\frac{1}{\sqrt{2 π h_{t}}} e x p \{- \frac{ε_{t}^{2}}{2}\} \int_{- \infty}^{α ε_{t}} \frac{1}{\sqrt{2 π}} e x p \{- \frac{t^{2}}{2}\}$	$α \in R$
Skewed t (sstd)	$\frac{b K (v)}{\sqrt{h_{t}}} {[1 + \frac{ζ_{t}^{2}}{v - 2}]}^{- (v + 1) / 2}$	$v > 2, ξ \in R$
Skewed GED (sGED)	$\{\begin{matrix} D (v) e x p \{- β_{1} {\|ε_{t} - m\|}^{v} ε_{t} < m\} \\ D (v) e x p \{- β_{2} {\|ε_{t} - m\|}^{v} ε_{t} \geq m\} \end{matrix}$	$v > 0, ξ \in R$

Table 2. Descriptive statistics.

Statistic	SOFIX Return
Mean	0.00035776
Median	0.00032240
Minimum	−0.20899
Maximum	0.21073
Std. dev.	0.013688
Skewness	−0.75991
Kurtosis	39.625
Jarque–Bera test	378,703.0
p-Value	0.0000
Observations	5780

Source: derived by the authors using Gretl 2023 c for Windows. Table 2 reports the results of the descriptive statistics applied to the log returns of SOFIX.

Table 3. Unit root tests of logarithms of daily returns of SOFIX ¹.

	Level		First Difference
	Test	Lags/BW	Test	Lags/BW
ADF
No Constant	−49.71 ***	33	−91.82 ***	33
Constant, No Trend	−49.75 ***	33	−91.82 ***	33
Constant, Trend	−49.78 ***	33	−91.81 ***	33
PP
Constant, No Trend	−79.75 ***	33	−477.56 ***	33
Constant, Trend	−79.69 ***	33	−477.52 ***	33
KPSS
Constant, No Trend	0.371 *	33	0.0032	33
Constant, Trend	0.212 **	33	0.0032	33
ADF-GLS
Constant, No Trend	−25.46 ***	4	−3.432 ***	4
Constant, Trend	−29.12 ***	4	−6.886 ***	4

¹ Notes: 1. ***, ** and * indicate significance at the 1%, 5% and 10% levels, respectively. 2. BW stands for bandwidth. Level refers to the logarithm of the daily SOFIX return, while the first difference is obtained as the difference between the values of the logarithms of the daily returns. 3. Table 3 reports the results of the unit root tests applied to the log returns of SOFIX. The results can be replicated with R script 03-Do_ARCH_and_Unit_Roots_Tests-SOFIX.R.

Table 4. Results of ARCH LM test for SOFIX.

Lag	LM Test	p-Value
1	6035.227	2.9 × 10⁻¹³³
2	1225.037	9.7 × 10⁻²⁶⁷
3	1273.103	1 × 10⁻²⁷⁵
4	1286.057	3.5 × 10⁻²⁷⁷
5	1316.519	1.7 × 10⁻²⁸²

Table 4 reports the results of the ARCH LM test applied to the log returns of SOFIX. Each lag has a corresponding LM statistic and associated p-value. The results can be replicated with R script 03-Do_ARCH_and_Unit_Roots_Tests-SOFIX.R.

Table 5. Autocorrelation functions of daily squared log returns ¹.

LAG	ACF	PACF	Q-Stat.	[p-Value]
1	0.2711 ***	0.2711 ***	424.9105	[0.000]
2	0.0767 ***	0.0035	458.9282	[0.000]
3	0.0729 ***	0.0554 ***	489.7069	[0.000]
4	0.0420 ***	0.0088	499.9015	[0.000]
5	0.0517 ***	0.0377 ***	515.3868	[0.000]
6	0.0640 ***	0.0398 ***	539.1206	[0.000]
7	0.1012 ***	0.0758 ***	598.3805	[0.000]
8	0.0942 ***	0.0457 ***	649.7579	[0.000]
9	0.0247 *	−0.0231 *	653.2860	[0.000]
10	0.0254 *	0.0120	657.0365	[0.000]
11	0.1441 ***	0.1348 ***	777.2511	[0.000]
12	0.2263 ***	0.1646 ***	1073.9199	[0.000]
13	0.0804 ***	−0.0353 ***	1111.4117	[0.000]
14	0.0715 ***	0.0320 **	1141.0590	[0.000]
15	0.0528 ***	−0.0005	1157.2441	[0.000]
16	0.0745 ***	0.0557 ***	1189.4570	[0.000]
17	0.0464 ***	−0.0037	1201.9163	[0.000]
18	0.0286 **	−0.0177	1206.6699	[0.000]
19	0.0361 ***	−0.0198	1214.2409	[0.000]
20	0.0211	−0.0125	1216.8231	[0.000]
21	0.0181	0.0135	1218.7222	[0.000]
22	0.0770 ***	0.0560 ***	1253.1102	[0.000]
23	0.0313 **	−0.0602 ***	1258.8062	[0.000]
24	0.0383 ***	−0.0090	1267.3291	[0.000]
25	0.0519 ***	0.0312 **	1282.9468	[0.000]
26	0.0344 ***	0.0018	1289.8176	[0.000]
27	0.0480 ***	0.0203	1303.1928	[0.000]
28	0.1007 ***	0.0591 ***	1362.1560	[0.000]
29	0.0641 ***	0.0081	1386.0530	[0.000]
30	0.0841 ***	0.0588 ***	1427.2029	[0.000]
31	0.0433 ***	0.0037	1438.1152	[0.000]
32	0.0367 ***	0.0185	1445.9667	[0.000]
33	0.1222 ***	0.0932 ***	1532.8083	[0.000]
34	0.1453 ***	0.0789 ***	1655.5679	[0.000]
35	0.0791 ***	0.0173	1691.9488	[0.000]
36	0.0560 ***	−0.0017	1710.1945	[0.000]
37	0.0503 ***	0.0097	1724.8972	[0.000]

¹ Notes: 1. ***, ** and * indicate significance at the 1%, 5% and 10% levels, respectively, using standard error 1/T^0.5. 2. Calculations were made with Gretl 2023 c for Windows. Table 5 presents the autocorrelation function at lag k of squared log returns of the SOFIX index.

Table 6. Estimation results of the best 10 GARCH models.

Parameter	Model 27	Model 28	Model 9	Model 21	Model 10	Model 3	Model 22	Model 29	Model 4	Model 30
mu	0.000 ***	0.000 **	0.000 **	0.000 **	0.000 *	0.000 **	0.000 *	0.000 ***	0.000 *	0.000 *
	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)
omega	0.000	0.000	−0.514 ***	0.000 ***	−0.517 ***	0.000 **	0.000 ***	0.000	0.000 **	0.000
	(0.000)	(0.000)	(0.110)	(0.000)	(0.109)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)
alpha1	0.239 ***	0.241 ***	−0.001	0.271 ***	−0.001	0.270 ***	0.272 ***	0.229 ***	0.271 ***	0.228 ***
	(0.014)	(0.015)	(0.013)	(0.019)	(0.013)	(0.024)	(0.019)	(0.009)	(0.024)	(0.011)
beta1	0.668 ***	0.665 ***	0.944 ***	0.729	0.944 ***	0.729 ***	0.728	0.669 ***	0.728 ***	0.672 ***
	(0.055)	(0.052)	(0.012)		(0.012)	(0.026)		(0.059)	(0.027)	(0.054)
skew		0.995 ***			0.987 ***		0.990 ***		0.990 ***	1.000 ***
		(0.016)			(0.017)		(0.017)		(0.017)	(0.013)
shape	3.782 ***	3.784 ***	3.509 ***	3.651 ***	3.502 ***	3.657 ***	3.649 ***	1.059 ***	3.655 ***	1.060 ***
	(0.134)	(0.135)	(0.179)	(0.139)	(0.179)	(0.201)	(0.138)	(0.011)	(0.202)	(0.007)
gamma1			0.437 ***		0.439 ***
			(0.039)		(0.039)
pho	0.998 ***	0.998 ***						0.999 ***		0.999 ***
	(0.000)	(0.000)						(0.000)		(0.000)
phi	0.063 ***	0.062 ***						0.071 ***		0.072 ***
	(0.017)	(0.016)						(0.019)		(0.018)
Variance model	CGARCH	CGARCH	EGARCH	IGARCH	EGARCH	GARCH	IGARCH	CGARCH	GARCH	CGARCH
Distribution	std	sstd	std	std	sstd	std	sstd	GED	sstd	sGED
Log likelihood	19,218.624	19,218.661	19,200.838	19,188.160	19,201.114	19,188.101	19,188.327	19,198.237	19,188.27	19,198.235
AIC	−6.648	−6.647	−6.642	−6.638	−6.642	−6.638	−6.638	−6.641	−6.637	−6.640
BIC	−6.640	−6.638	−6.635	−6.634	−6.633	−6.632	−6.632	−6.632	−6.631	−6.631

Notes: ***, ** and * indicate significance at the 1%, 5% and 10% levels, respectively. The best model is indicated in bold. Table 6 presents the results of the estimation of the best 10 of 30 different GARCH models to the log returns of SOFIX. Each model has a different variance equation but the same assumed distribution of residuals. Likewise, all lags in the variance equation are the same. The results for all 30 models can be obtained by running the script 04-Estimate_SOFIX_5X6.R.

Table 7. Selection of the best 10 ARCH-GARCH models with the BIC and AIC.

No.	Lag AR	Lag MA	Lag ARCH	Lag GARCH	AIC	BIC	Type of Model	Type of Distribution	Model Name
1	1	1	1	1	−6.654720	−6.644347	CGARCH	std	ARMA(1,1)-CGARCH(1,1) std
2	1	1	2	1	−6.654546	−6.643020	CGARCH	std	ARMA(1,1)-CGARCH(2,1) std
3	1	1	1	1	−6.654382	−6.642856	CGARCH	sstd	ARMA(1,1)-CGARCH(1,1) sstd
4	1	1	2	1	−6.654210	−6.641532	CGARCH	sstd	ARMA(1,1)-CGARCH(2,1) sstd
5	1	1	2	1	−6.652138	−6.640612	EGARCH	std	ARMA(1,1)-EGARCH(2,1) std
6	0	0	1	1	−6.647621	−6.639553	CGARCH	std	ARMA(0,0)-CGARCH(1,1) std
7	1	1	2	1	−6.651870	−6.639191	EGARCH	sstd	ARMA(1,1)-EGARCH(2,1) sstd
8	1	0	1	1	−6.648376	−6.639155	CGARCH	std	ARMA(1,0)-CGARCH(1,1) std
9	0	1	1	1	−6.648280	−6.639059	CGARCH	std	ARMA(0,1)-CGARCH(1,1) std
10	1	1	1	1	−6.648101	−6.638880	EGARCH	std	ARMA(1,1)-EGARCH(1,1) std

Table 7 presents the results for the 10 best ARMA-GARCH models after applying the best model selection algorithm using Akaike’s information criteria (AIC) and the Bayesian information criterion (BIC). The best model is indicated in bold. The results of all 240 tested models can be replicated using R script 05-Find_Best_Garch_SOFIX_5X6.R.

Table 8. Estimation results of the best 6 ARMA-GARCH models ¹.

Parameter	Model 1	Model 2	Model 3	Model 4	Model 5	Model 7
mu	0.000	0.000	0.000	0.000	0.000 **	0.000
	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)	(0.000)
ar1	0.980 ***	0.981 ***	0.981 ***	0.981 ***	0.976 ***	0.973 ***
	(0.002)	(0.002)	(0.002)	(0.002)	(0.002)	(0.003)
ma1	−0.966 ***	−0.968 ***	−0.967 ***	−0.967 ***	−0.961 ***	−0.956 ***
	(0.001)	(0.001)	(0.001)	(0.001)	(0.000)	(0.001)
omega	0.000	0.000	0.000	0.000	−0.326 ***	−0.328 ***
	(0.000)	(0.000)	(0.000)	(0.000)	(0.051)	(0.050)
alpha1	0.242 ***	0.243 ***	0.243 ***	0.244 ***	−0.001	−0.002
	(0.016)	(0.029)	(0.016)	(0.029)	(0.025)	(0.025)
alpha2		0.000		0.000	−0.001	−0.001
		(0.035)		(0.032)	(0.026)	(0.026)
beta1	0.662 ***	0.660 ***	0.662 ***	0.660 ***	0.965 ***	0.964 ***
	(0.052)	(0.049)	(0.051)	(0.049)	(0.006)	(0.006)
gamma1					0.542 ***	0.543 ***
					(0.034)	(0.035)
gamma2					−0.208 ***	−0.208 ***
					(0.032)	(0.032)
shape	3.780 ***	3.783 ***	3.778 ***	3.783 ***	3.551 ***	3.547 ***
	(0.134)	(0.142)	(0.135)	(0.142)	(0.105)	(0.178)
skew			0.996 ***	0.996 ***		0.989 ***
			(0.019)	(0.020)		(0.015)
pho	0.998 ***	0.998 ***	0.998 ***	0.998 ***
	(0.000)	(0.000)	(0.000)	(0.000)
phi	0.060 ***	0.059 ***	0.060 ***	0.059 ***
	(0.015)	(0.012)	(0.015)	(0.012)
Variance model	CGARCH	CGARCH	CGARCH	CGARCH	EGARCH	EGARCH
Distribution	std	std	sstd	Sstd	std	sstd
Log likelihood	19,241.142	19,241.638	19,241.165	19,241.668	19,234.679	19,234.905
AIC	−6.655	−6.655	−6.654	−6.654	−6.652	−6.652
BIC	−6.644	−6.643	−6.643	−6.642	−6.641	−6.639

¹ Notes: ***, ** and * indicate significance at the 1%, 5% and 10% levels, respectively. Please note that instead of a model with only a constant included in the conditional mean equation, ordered sixth, an ARMA(1,1)-EGARCH(2,1) model with a skewed t-distribution of disturbances is presented here, occupying the seventh position, since the parameter estimates in the model without ARMA components are presented earlier. Table 8 presents the results for the best 6 of 240 different ARMA-GARCH models to the log returns of SOFIX after applying the best model selection algorithm using Akaike’s information criteria (AIC) and the Bayesian information criterion (BIC). Results of the table can be replicated using R scripts 04-Estimate_BEST_GARCH_SOFIX_2X2.R and 04-Estimate_BEST_GARCH_SOFIX_2X2_2.R.

Table 9. Forecast performance measures calculated for the testing period (2021–2024).

Error Metrics	GARCH	EGARCH	JGR-GARCH	IGARCH	CGARCH
MSE	0.0000587245	0.0000587847	0.0000587315	0.0000587260	0.0000587232
MAE	0.0052894390	0.0052938330	0.0052899690	0.0052895470	0.0052894160
DAC	0.5512500000	0.5512500000	0.5512500000	0.5512500000	0.5512500000

Note: in order to simplify the structure of the table, MSE is mean squared error; MAE: mean absolute error; DAC: Directional Accuracy; GARCH: ARMA(1,1)-GARCH(1,1) std; EGARCH: ARMA(1,1)-EGARCH(1,1) std; JGR-GARCH: ARMA(1,1)-JGR-GARCH(1,1) std; IGARCH: ARMA(1,1)-IGARCH(1,1) std; CGARCH: ARMA(1,1)-CGARCH(1,1) std. The numbers in bold indicate the best results. The results in the table refer to the testing period (2021–2024) and can be obtained after running the script 07-Forecasts_SOFIX_5models-2000-2020.R.

Table 10. Validation of the robustness of the parameter estimates in the model defined as best.

Parameter	2000–2020	2007–2024	2000–2024
mu	0.000162	0.000061	0.000248
	(0.000174)	(0.000196)	(0.000209)
ar1	0.974 ***	0.987 ***	0.981 ***
	(0.003)	(0.002)	(0.002)
ma1	−0.955 ***	−0.975 ***	−0.968 ***
	(0.002)	(0.000)	(0.001)
omega	0.000	0.000	0.000
	(0.000)	(0.000)	(0.000)
alpha1	0.255 ***	0.249 ***	0.243 ***
	(0.034)	(0.006)	(0.016)
beta1	0.644 ***	0.704 ***	0.662 ***
	(0.081)	(0.013)	(0.052)
rho	0.998 ***	1.000 ***	0.998 ***
	(0.000)	(0.000)	(0.000)
phi	0.065 ***	0.014 ***	0.060 ***
	(0.020)	(0.002)	(0.016)
shape	3.729 ***	4.180 ***	3.780 ***
	(0.171)	(0.235)	(0.134)
Variance model	CGARCH	CGARCH	CGARCH
Distribution	std	std	std
Log likelihood	16,366.41	14,609.42	19,241.14
AIC	−6.569	−6.860	−6.655
BIC	−6.558	−6.846	−6.644
Ljung–Box tests
Lag[1]	0.002	0.098	0.0004
p-Value	0.965	0.755	0.984
Lag[5]	1.173	0.272	1.196
p-Value	0.819	0.987	0.814
Lag[9]	2.301	1.008	2.054
p-Value	0.866	0.986	0.898
ARCH LM tests
Lag[3]	0.055	0.086	0.002
p-Value	0.814	0.769	0.964
Lag[5]	1.129	0.236	1.273
p-Value	0.695	0.957	0.654
Lag[7]	1.873	0.787	1.664
p-Value	0.744	0.946	0.788
Sign bias tests
Sign bias	0.423	0.671	0.222
p-Value	0.973	0.503	0.824
Negative sign bias	0.973	1.385	1.363
p-Value	0.331	0.166	0.173
Positive sign bias	0.869	1.091	0.660
p-Value	0.370	0.276	0.509
Join effect	2.017	3.382	2.293
p-Value	0.569	0.336	0.514

Notes: ***, ** and * indicate significance at the 1%, 5% and 10% levels, respectively. In order to simplify the structure of the table, Ljung–Box tests stands for Weighted Ljung–Box Test on Standardised Squared Residuals. ARCH LM test: Weighted ARCH LM Tests; sign bias tests: tests for mis-specification of conditional volatility models. In parentheses are given the standard errors of the estimates. Table 10 presents the results of the estimation of the ARMA(1,1)-CGARCH(1,1) model, defined as best, to the log returns of SOFIX during different time periods. The first period is at the beginning and covers the time from 23.10.2000 to 30.12.2020, while the second is at the end and lasts from 3.1.2007 to 28.3.2024. The results were found executing the script 08-Estimate_robustness_SOFIX.R.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Petkov, P.; Shopova, M.; Varbanov, T.; Ovchinnikov, E.; Lalev, A. Econometric Analysis of SOFIX Index with GARCH Models. J. Risk Financial Manag. 2024, 17, 346. https://doi.org/10.3390/jrfm17080346

AMA Style

Petkov P, Shopova M, Varbanov T, Ovchinnikov E, Lalev A. Econometric Analysis of SOFIX Index with GARCH Models. Journal of Risk and Financial Management. 2024; 17(8):346. https://doi.org/10.3390/jrfm17080346

Chicago/Turabian Style

Petkov, Plamen, Margarita Shopova, Tihomir Varbanov, Evgeni Ovchinnikov, and Angelin Lalev. 2024. "Econometric Analysis of SOFIX Index with GARCH Models" Journal of Risk and Financial Management 17, no. 8: 346. https://doi.org/10.3390/jrfm17080346

APA Style

Petkov, P., Shopova, M., Varbanov, T., Ovchinnikov, E., & Lalev, A. (2024). Econometric Analysis of SOFIX Index with GARCH Models. Journal of Risk and Financial Management, 17(8), 346. https://doi.org/10.3390/jrfm17080346

Article Menu

Econometric Analysis of SOFIX Index with GARCH Models

Abstract

1. Introduction

2. Literature Review

3. Materials and Methods

4. Results

4.1. The Data

4.2. Descriptive Statistics and Figures for SOFIX

4.3. Testing for Unit Roots

4.4. Testing for ARCH Effects

4.5. Estimating a GARCH Model

4.6. Selection of the Best ARMA-GARCH Model

4.7. Forecasting SOFIX Index with ARMA(1,1)-CGARCH(1,1) Model

5. Discussion

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Appendix A

Appendix B

Note

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI