*Article* **Return Based Risk Measures for Non-Normally Distributed Returns: An Alternative Modelling Approach**

**Eyden Samunderu 1,\* and Yvonne T. Murahwa 2**


**Abstract:** Developments in the world of finance have led the authors to assess the adequacy of relying on normal distribution assumptions alone when measuring risk. Cushioning against risk has always created a plethora of complexities and challenges; hence, this paper analyses the statistical properties of various risk measures under non-normal distributions and provides a financial blueprint for managing risk. It is argued that relying on the old assumption of normality alone is inaccurate and has led to the use of models that do not deliver accurate risk measures. Our empirical design first examined an overview of the use of returns in measuring risk and an assessment of the current financial environment. As an alternative to conventional measures, the paper employs a mosaic of risk techniques, reflecting the fact that there is no one universal risk measure. The next step involved examining the current risk proxy measures adopted, such as the Gaussian-based value at risk (VaR) measure. Furthermore, the authors analysed multiple alternative approaches that do not rely on the normality assumption, such as other variations of VaR, as well as econometric models that can be used in risk measurement and forecasting. Value at risk (VaR) is a widely used measure of financial risk, which provides a way of quantifying and managing the risk of a portfolio; arguably, it represents the most important tool for evaluating market risk, one of several threats to the global financial system. Following an extensive literature review, a data set composed of three main asset classes, bonds, equities and hedge funds, was analysed. The first step was to determine to what extent returns are not normally distributed. After testing the hypothesis, it was found that the majority of returns are not normally distributed but instead exhibit skewness and kurtosis greater or less than three. The study then applied various VaR methods to measure risk in order to determine the most efficient ones. Different timelines were used to calculate stressed value at risk (SVaR), and it was observed that during periods of crisis, the volatility of asset returns was higher. Subsequent steps examined the relationships between the variables through correlation tests and time series analysis, leading to the forecasting of returns. It was noted that these methods cannot be used in isolation. We therefore adopted a mosaic of all the VaR measures, which included studying the behaviour of assets and their relationships with each other. Furthermore, we examined the environment as a whole and then applied forecasting models to value returns accurately; this gave a much more accurate and relevant risk measure than the initial assumption of normality.

**Keywords:** risk; bonds; equities; hedge funds; forecasting; GARCH; value at risk

**Citation:** Samunderu, Eyden, and Yvonne T. Murahwa. 2021. Return Based Risk Measures for Non-Normally Distributed Returns: An Alternative Modelling Approach. *Journal of Risk and Financial Management* 14: 540. https://doi.org/10.3390/jrfm14110540

Academic Editors: Robert Brooks and Adrian Cantemir Calin

Received: 30 June 2021; Accepted: 1 November 2021; Published: 10 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Financial markets have always been prone to exogenous shocks, and as a result, the element of risk is the dominant factor when examining and observing global patterns of the financial world. In historical patterns of crises (for example, 2007–2008), the global financial meltdown had a massive impact on how firms could construct their portfolios in an effort to mitigate risk impact. It has become prudent to analyse current risk measures and find better ways of measuring risk to allow for more accurate risk management and decision making. Thus, the increased volatility of financial markets over the last decade has induced researchers, policy makers and practitioners to develop and design more sophisticated risk management tools. Value at risk (*VaR*) has become the most standard proxy used by financial analysts to measure and quantify risk impact. In mathematical terms, *VaR* is calculated as follows:

$$VaR_{\alpha} = \alpha \cdot \sigma \cdot W$$

where *α* reflects the selected confidence level, *σ* the standard deviation of the portfolio returns and *W* the initial portfolio (Jorion 2003).
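As a minimal illustration of the formula above, the following pure-Python sketch computes a parametric (Gaussian) VaR, taking α as the standard-normal z-score implied by the chosen confidence level; the portfolio value and volatility figures are hypothetical:

```python
from statistics import NormalDist

def parametric_var(confidence: float, sigma: float, w: float) -> float:
    """Parametric (Gaussian) VaR: z-score for the chosen confidence
    level, times return volatility, times initial portfolio value."""
    z = NormalDist().inv_cdf(confidence)  # e.g. about 1.645 at 95%
    return z * sigma * w

# A hypothetical portfolio of 1,000,000 with daily return volatility of 2%:
var_95 = parametric_var(0.95, 0.02, 1_000_000)
```

Under these illustrative inputs, the one-day 95% VaR is roughly 3.3% of the portfolio value; raising the confidence level raises the VaR.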

"Risk measurement is the foundation of risk management and hence of vital importance in any financial institution and investor" (Goodfellow and Salm 2016, p. 80). The universe of risk measures is large, and so, there is no single measure of investment risk that is correct for all purposes or for all investors. This differs based on investor needs and requirements. Risk tends to be measured in relation to price changes, which can take various forms such as relative, absolute, or log price changes. This price change is what this paper bases the risk measures on.

Conditional value at risk (CVaR), also known as Mean Excess Loss, Mean Shortfall or Tail VaR, is an alternative to VaR that attempts to deal with its predecessor's shortcomings. It measures the losses beyond the VaR point and has better coherence with the properties of a robust risk measure. It is also referred to as expected tail loss and satisfies the properties that define a coherent risk measure. Therefore, it can be argued that it is an ideal alternative that overcomes the shortfalls of VaR because it represents the loss that is expected beyond the loss given by the VaR. It can be argued that CVaR is a coherent risk measure exhibiting the following properties: translation-equivariance, positive homogeneity, convexity and monotonicity (Pflug 2000).
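The distinction can be sketched empirically: given a sample of returns, VaR is a quantile of the loss distribution, while CVaR averages the losses at and beyond that quantile. The sample below is illustrative, and this simple estimator is only one of several conventions for discrete samples:

```python
def historical_var_cvar(returns, alpha=0.95):
    """Historical VaR and CVaR (expected shortfall) from a return sample.
    Losses are negated returns; VaR is taken as the alpha-quantile of
    losses, and CVaR is the mean loss at or beyond the VaR point."""
    losses = sorted(-r for r in returns)  # losses in ascending order
    k = int(alpha * len(losses))          # index of the quantile (simple convention)
    var = losses[k]
    tail = losses[k:]                     # losses at and beyond the VaR point
    cvar = sum(tail) / len(tail)
    return var, cvar

sample = [0.01, -0.02, 0.005, -0.05, 0.015, -0.01, 0.02, -0.03, 0.0, -0.002]
var95, cvar95 = historical_var_cvar(sample, 0.95)
```

By construction CVaR is never smaller than VaR, which is exactly the sense in which it "measures the losses beyond the VaR point".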

VaR is such a prevalent method to estimate risk, partly due to the regulatory framework created by the Basel committee on Banking Supervision during the 1990s, which forces the banks to calculate risk adjusted measures of capital adequacy based on VaR for their portfolios. This requirement was put into place in order to mitigate banks from taking on too much financial risk (Basel Committee on Banking Supervision 2009).

A VaR estimate corresponds to a specific critical value of a portfolio's potential one-day profit and loss probability distribution. Given their function both as internal risk management tools and as potential regulatory measures of risk exposure, it is important to quantify the accuracy of an institution's VaR estimates. However, VaR has some undesirable mathematical characteristics, such as a lack of subadditivity and convexity.

VaR also tends to be popular due to its ease of use and implementation by average investors. However, emerging studies reveal that most data are not homoscedastic but instead exhibit heteroscedasticity in the variance. This calls for the use of the autoregressive conditional heteroscedasticity (ARCH) model and the generalized autoregressive conditional heteroscedasticity (GARCH) model to capture volatility that changes over time, since they explicitly model conditional heteroscedasticity (Orhan and Köksal 2011).
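To make the GARCH idea concrete, the following sketch traces the conditional variance recursion of a GARCH(1,1); the parameter values (omega, alpha, beta) are illustrative, not estimated:

```python
def garch11_variance(returns, omega=1e-6, alpha=0.08, beta=0.9):
    """Conditional variance path of a GARCH(1,1):
        sigma2_t = omega + alpha * r_{t-1}**2 + beta * sigma2_{t-1}.
    The path is initialised at the unconditional variance
    omega / (1 - alpha - beta)."""
    assert alpha + beta < 1, "stationarity requires alpha + beta < 1"
    sigma2 = [omega / (1 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2
```

A large return shock raises next-period conditional variance, and calm periods let it decay: this is precisely the volatility clustering that constant-variance Gaussian VaR ignores.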

This study initially looks at the normality/non-normality of most of the data of returns. If this is affirmed, incorporating non-normality in risk calculations should therefore provide a better way of understanding and measuring the downside risk that is in an asset. The rest of the paper assesses models that consider this and gives a glimpse into the many options available that can be used to assess risk in non-normal distributions.

#### **2. Literature Review**

Risk measurement has evolved over time, beginning with the introduction of probabilities. The discovery and estimation of probabilities paved the way towards the first step in risk quantification. Evidence from Marrison (2002) alludes to the fact that risk is unavoidable in the financial industry, and so the best way to deal with it is to manage how it is measured and how it can provide the greatest risk-adjusted return whilst minimizing the risk exposure of the asset. In their study, Lovreta and Pascual (2020) examined the impact of sovereign default risk by using a vector autoregressive (VAR) framework and applying Granger-causality tests to illustrate implications of parameter instability between banks and sovereign default risk. The findings revealed that structural dependence in the financial system extends between banks and sovereign default risk volatility. They further revealed that structural changes are present in the short-run dynamic relations between sovereign and banking sector default risk during the period under analysis.

A large strand of risk measurement has focused on risk aggregation concerning the risk implied by an aggregate financial position (Francq and Zakoïan 2018). Risk aggregation with dependence uncertainty commonly refers to the sum of individual risks with known marginal distributions and an unspecified dependence structure. In their study, Francq and Zakoïan (2018) investigated the estimation of both market risk and estimation risk in portfolios by observing individual returns that follow a semi-parametric multivariate dynamic model with a time-varying asset composition. They developed asymptotic theory used to estimate conditional VaR and proposed an alternative risk testing approach called Filtered Historical Simulation (FHS). They concluded that by neglecting estimation risk, practitioners might believe risk is controlled at a given level when it is not.

The universe of risk measures is large, and so there is no single measure of investment risk that is correct for all purposes or for all investors; as noted above, this differs based on investor needs and requirements. The vast number of financial risk measures in the literature can be broadly subsumed into two categories: 1. risk as the magnitude of deviations from a target (risk of the first kind), and 2. risk as necessary capital with respect to necessary premium (risk of the second kind). The analysis here covers various asset returns across different classes.

Generally, three specific measures are used to measure the risk of individual assets, namely standard deviation, beta and duration. The volatility of asset prices is captured by the standard deviation approach, beta tackles market risk and portfolio risk measures, whilst duration measures the sensitivity of debt security prices to changes in interest rates. As the topic suggests, this paper focuses on risk measures determined by the returns of assets, and so it first assesses standard deviation, its downfalls and what alternatives can be used to make the risk measure more accurate and reliable. It is also important to note that whilst standard deviation may be the appropriate measure of risk for high-volume markets (for example, Germany, the UK and France), it is not the most reliable risk measurement in terms of volatility when considering low-volume markets. It is also imperative to note that extreme external financial shocks lead to sharp spike jumps in stock prices and, subsequently, in volatility patterns.

Wang et al. (2020) adopted four Jump-GARCH models in order to forecast the jump diffusion volatility, which is used as a risk factor. The authors considered both linear and nonlinear effects and the VaR of financial institutions by using vector quantile regressions. Evidence from the investigation revealed three interesting findings. When observing the volatility process of bank stock prices, the jump diffusion GARCH model performed better than the continuous diffusion GARCH model. The jump behaviour of bank stocks was seen to be heterogeneous due to differences in sensitivity to abnormal information shocks. Finally, support vector regression was seen to be a better approach than parametric quantile regression and nonparametric quantile regression.
These findings yet again show that there is no "one size fits all" risk measurement model; thus, observing and measuring risk on asset returns requires a much more bespoke approach. Tang and Su (2017) conducted an analysis of the impact of financial events on systematic risk by observing the Chinese banking system. The authors criticize the methodological use of Contingent Claims Analysis (CCA) as a theoretical model to measure systematic risk (or the beta of an asset) due to its strict theoretical assumptions and single source of risk information, as well as the fact that it limits the asset value volatility of the macro sector to being a stationary stochastic process. Instead, they proposed an alternative model by relaxing the assumptions of CCA, replacing pure diffusion with jump diffusion, and introducing a macro-jump CCA that acts as a proxy to predict early warning effects of financial events.
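The three individual-asset measures introduced earlier (standard deviation, beta and duration) can each be sketched in a few lines; the annual-coupon bond convention and the sample data are hypothetical:

```python
from statistics import pstdev, mean

def beta(asset_returns, market_returns):
    """Beta: population covariance of asset and market returns,
    divided by the population variance of market returns."""
    ma, mm = mean(asset_returns), mean(market_returns)
    cov = sum((a - ma) * (m - mm)
              for a, m in zip(asset_returns, market_returns)) / len(asset_returns)
    var_m = pstdev(market_returns) ** 2
    return cov / var_m

def macaulay_duration(coupon, face, ytm, years):
    """Macaulay duration of an annual-coupon bond:
    present-value-weighted average time of the cash flows."""
    cfs = [(t, coupon) for t in range(1, years)] + [(years, coupon + face)]
    pv = sum(cf / (1 + ytm) ** t for t, cf in cfs)
    return sum(t * cf / (1 + ytm) ** t for t, cf in cfs) / pv
```

Standard deviation itself is just `pstdev(returns)`. Note that a zero-coupon bond's duration equals its maturity, while coupons pull duration below maturity, which is the interest-rate-sensitivity intuition the text describes.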

Liu et al. (2020) examined estimation for conditional volatility models and proposed a model-averaging estimator for conditional volatility under a framework of zero conditional mean. The three most popular univariate conditional volatility models are the autoregressive conditional heteroskedasticity (ARCH) model of Engle (1982), the generalized ARCH (GARCH) model of Bollerslev (1986) and the exponential GARCH (EGARCH) model of Nelson (1990). Using Monte Carlo experiments, Liu et al. (2020) showed that their model-averaging forecast leads to better forecast accuracy than other commonly used methods.

The Gaussian distribution introduced the assumptions of a normal distribution, which inevitably ushered in the mean variance framework; commonly used measures of risk are modelled around the normal distribution. However, it has been seen that most distributions are in fact not normally distributed, and continuing to ignore this fact by maintaining the assumptions used in the current risk measures distorts outcomes and leaves the financial markets unprepared to deal with and manage the degree of risk impact.

Indeed, most returns tend to be not normally distributed but instead have heavy or fat tails and tend to be skewed. As highlighted by the 2007/2008 financial crisis, there are serious deficiencies in the risk models used. Most of these stem from the assumptions used as inputs to the models; a model risk is therefore prevalent, and it is desirable to find a model with a lower level of model risk. Risk management models will need to be improved, putting greater emphasis on stress tests, scenario analysis and the use of forecasting. Bhowmik and Wang (2020) conducted a comprehensive literature review in order to ascertain different risk models ranging across GARCH family-based models and used stock market returns as a "barometer" and an "alarm for economic and financial activities". Therefore, in order to prevent market uncertainty and mitigate risk in the stock markets, it is fundamental that the volatility of stock market index returns is effectively measured. Financial market volatility is mainly reflected in the deviation of the expected future value of assets, indicating that volatility represents the uncertainty of the future price of an asset. This uncertainty is usually characterized by variance or standard deviation. However, it is imperative to note that measuring stock market volatility is a complex process for researchers because volatility tends to cluster. In fact, volatility is seen to be a permanent behaviour of stock markets around the globe. The presence of volatility in stock prices makes it possible for risk-seeking investors to earn abnormal profits. Kumar and Biswal (2019) employed GARCH econometric models in their study, and the results confirmed the presence of volatility clustering and leverage effects that affect the future of stock markets.
When using GARCH family models to analyse and forecast return volatility, the selection of input variables for forecasting is paramount, as essential conditions will be given for the method to have a stationary solution and perfect matching (Nelson 1990).
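The stationarity condition referred to above (alpha + beta < 1) also governs forecasting: multi-step GARCH(1,1) variance forecasts mean-revert geometrically toward the unconditional variance. A sketch, with illustrative parameters:

```python
def garch11_forecast(sigma2_now, h, omega=1e-6, alpha=0.08, beta=0.9):
    """h-step-ahead GARCH(1,1) variance forecast: mean-reverts from the
    current conditional variance to the unconditional variance
    omega / (1 - alpha - beta), at rate (alpha + beta) per step."""
    persistence = alpha + beta
    assert persistence < 1, "a stationary solution requires alpha + beta < 1"
    uncond = omega / (1 - persistence)
    return uncond + persistence ** h * (sigma2_now - uncond)
```

The closer alpha + beta is to one, the more slowly volatility forecasts decay toward the long-run level, which is why forecastability varies with horizon.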

Dixit and Agrawal (2019) also suggested that the P-GARCH model is the most suitable to predict and forecast stock market volatility for the Bombay Stock Exchange (BSE) and National Stock Exchange (NSE) of India. The authors also highlighted the fact that volatility in financial markets is reflected because of uncertainty in the price, return, unexpected events and non-constant variance, which can be measured through the GARCH models that will give insight for investment decisions.

Variance measures the average squared deviation from the expected value, but this measure does not give a full picture or description of the risk. Proponents of the normality approach, such as Bodie et al. (2001), suggest that even though returns are not normally distributed, the distribution of a larger portfolio's returns begins to resemble a normal distribution as the portfolio grows. However, stock prices cannot be negative, and so to say that the normal distribution accurately represents their distribution is a stretch.

The 2007/2008 financial crisis revealed that most risk managers are moving away from the historical treatment of risk and instead laying more emphasis on scenario analysis and stress testing (Crouchy and Mark 2014). Stress testing is a risk management tool used to evaluate the potential impact on portfolio values of unlikely, though plausible, events or movements in a set of financial variables (Lopez 2005). Stress tests are designed to explore the tails of the distribution of losses beyond the threshold (typically, 99%) used in the VaR analysis. For now, it is worth noting that risk managers are moving to this approach because they want to focus on the outcome of a given adverse stress/scenario on their company or portfolio.
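A stress test of this kind can be sketched as a simple scenario re-valuation; the portfolio weights and shock sizes below are purely illustrative:

```python
def stress_test(positions, scenarios):
    """Re-value a portfolio under each named scenario. A scenario maps an
    asset name to a shocked return; unshocked assets are left at 0."""
    results = {}
    for name, shocks in scenarios.items():
        pnl = sum(value * shocks.get(asset, 0.0)
                  for asset, value in positions.items())
        results[name] = pnl
    return results

positions = {"equities": 600_000, "bonds": 300_000, "hedge_funds": 100_000}
scenarios = {
    "2008-style crash": {"equities": -0.40, "bonds": 0.05, "hedge_funds": -0.20},
    "rate shock":       {"equities": -0.05, "bonds": -0.10},
}
scenario_pnl = stress_test(positions, scenarios)
```

Unlike VaR, the output is not a probability statement but a what-if loss per scenario, which is exactly its appeal after the crisis.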

Volatility as a risk measure is conditional on past information and, by definition, assumes a symmetric distribution; it is thus not capable of capturing any asymmetric aspect of risk. This makes volatility more useful during normal market conditions, but less helpful during crises, when the market does not behave according to the normality assumptions (Billio et al. 2016). However, the relevance of volatility forecasting for financial risk management has drawn a significant level of interest in the literature. Volatility forecastability tends to vary with horizon, and different horizons are relevant in different applications. The estimation of volatility is a key input for calculating VaR because VaR directly depends on the expected volatility, time horizon and confidence interval for the continuous returns under analysis (Lelasi et al. 2021).

The VaR framework can be defined as a risk-measuring framework developed by financial market professionals as a means of measuring and comparing the risk inherent in different markets. It describes the tail loss that can occur over a given time period resulting from exposure to market risk, at a given confidence level. "It is the loss in market value of an asset over a given time period that is exceeded with a given probability" (Baker and Filbeck 2015). In other words, at a given time, an investor wants to know what he may stand to lose at a set confidence level, and this is what VaR aims to answer. Thus, VaR models were sanctioned for determining market risk capital requirements for large banks and international banking authorities through the 1996 Market Risk Amendment to the Basel Accord. Spurred on by these developments, VaR has become a standard measure of financial market risk used by financial and even non-financial firms. VaR forecasts are often used as a testing ground when fitting alternative models for representing the dynamic evolution of time series of financial returns.

The downside of using VaR as a risk management tool is that it is most accurate for the evaluation of very short periods, such as a one-day period, and becomes more inaccurate the longer the time period (Krause 2003). Another limitation of VaR is that it gives no further detail on the losses beyond the VaR point. This is a limiting factor because such extreme risk, which is unaccounted for, can even result in the failure of the business or portfolio. VaR assumes that historical correlations remain unchanged in times of market stress, an incorrect assumption, as is shown in the data analysis (Baker and Filbeck 2015). VaR usually underestimates ex post volatility. Several studies have tested the performance of VaR in volatility forecasts and tested its limits against ex post benchmarks (Nieto and Ruiz 2016; Barns et al. 2017). These studies concluded that simpler volatility specifications produce better VaR forecasts. For example, the popular VaR methodology based on a normality assumption for the conditional distribution of returns often leads to the underestimation of financial losses, as fat tails are not accounted for.
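The horizon problem can be made concrete with the common square-root-of-time rule, which scales a one-day VaR to a longer horizon. The rule is valid only under i.i.d. normal returns, the very assumption that fails when volatility clusters, which is why the scaled figure grows less trustworthy as the horizon lengthens:

```python
import math

def scale_var(one_day_var, horizon_days):
    """Square-root-of-time scaling of a one-day VaR to a longer horizon.
    Only exact for i.i.d. normal returns; with volatility clustering or
    fat tails the scaled number can badly misstate the true risk."""
    return one_day_var * math.sqrt(horizon_days)
```

For example, a one-day VaR of 1000 scales to roughly 2000 over four days and about 3162 over ten, figures that should be read as rough i.i.d.-normal benchmarks rather than reliable long-horizon estimates.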

Jorion (2003) reiterates that VaR is an incomplete measure and may need to be used in conjunction with other risk measures to address the shortcomings highlighted in the above discussion. The prominent downside of VaR measurements is that they are more suitable and accurate for the evaluation of very short periods, such as one-day periods. VaR operates under the assumption that the distribution is normal, and so it becomes a more inaccurate measure as the period lengthens. Since it works under "normal market conditions", it can be misleading over longer time spans, which do not exhibit normal conditions during abnormal market periods. Other factors, such as volatility, can also give inaccurate data as they are subject to human bias, further making this proxy measure inaccurate.

As a decision tool, VaR lacks certain desirable mathematical properties, such as convexity and subadditivity, and it only gives the point at which the value at risk lies, failing to give further details beyond this VaR point. Properties of risk measures can be formulated in terms of preference structures induced by dominance relations (see Fishburn 1980). This is detrimental because such extreme risk, which is not accounted for, can have a severe impact on the business or lead to portfolio failure.

Stress testing is the second most commonly used risk management tool together with VaR, which is preferably used when assessing the investment strategy of a portfolio, to determine portfolio risk. It can also be used in determining hedging strategies, which will reduce portfolio losses (Van Dyk 2016). Thus, stress tests conducted in the context of a risk model can provide a useful alternative or complement to the current ad hoc methods of stress testing (Alexander and Sheedy 2008).

Areas neglected in stress testing that may have contributed to the recent crisis include the failure to capture risks such as securitization risk in the stress testing process. According to Van Dyk (2016), the stress tests that were carried out were unable to recognize the risk inherent in structured products. These risks differ across asset classes, such as between equities and bonds.

#### *2.1. Suggested Alternatives for Non-Normal Distributions—Unconditional Distributions*

There are two ways that modelling methods can be classified: as unconditional distributions, which are time independent, and conditional distributions, which are time dependent. Those classified as unconditional are used under the normality assumption; they assume that distributions of returns are independent of each other and of their past data. These include different variations of VaR. The conditional distribution approach is a newer modelling method designed to cater for non-normal distributions and address the shortcomings of the normal distribution in accurately modelling returns and risk. This method disputes the idea that returns are identical and independent; models in this category include GARCH and stochastic volatility, which are time-dependent processes. A benefit of this approach is that the models account for volatility clustering, a frequently observed phenomenon among return series. Hansen and Lunde (2005) compared ARCH-type models in terms of their ability to describe variance, testing different volatility models in order to ascertain a better and more robust description of financial time series. In total, 330 GARCH-type models were compared using DM–USD exchange rate data and daily IBM returns. Findings revealed no evidence to support the notion that GARCH is inferior to other models, and the models that performed better were the ones that accommodated a leverage effect. This argument is consistent with Kumar and Biswal (2019). This view is also mirrored in the work of Kilai et al. (2018), who acknowledge the shortcomings of GARCH-normal because it underestimates risk.

Conditional value at risk (CVaR), also called expected tail loss or expected shortfall (ES), is an alternative to VaR that tries to deal with its predecessor's shortcomings. It measures the losses beyond the VaR and has better coherence with the aforementioned desired properties of a good risk measure, two of which are convexity and monotonicity, which ensure a desirable optimum exists (Baker and Filbeck 2015). CVaR modelling is cast in terms of the regression quantile function, the inverse of the conditional distribution function. In principle, VaR and CVaR measure different properties of the distribution: VaR is a quantile and CVaR is a conditional tail expectation, and the two values only coincide if the tail is cut off. However, the problem of choosing between VaR and CVaR, especially in financial risk management, has been a popular basis for discussion. The reasons affecting this choice include differences in mathematical properties, the stability of statistical estimation, the simplicity of optimization procedures and acceptance by regulators.

In their early work, Rockafellar and Uryasev (2000) focussed on minimizing CVaR, using a technique for portfolio optimization which calculated VaR and optimized CVaR simultaneously. This technique is fundamental for investment companies, brokerage firms, mutual funds, etc., and uses a combination of analytical or scenario-based methods. Several case studies showed that risk optimization with the CVaR performance function and constraints can be carried out for large portfolios and a large number of scenarios with relatively small computational resources. It is also possible to dynamically hedge CVaR with options, because CVaR takes into account the average loss exceeding VaR. CVaR is considered a coherent risk measure (Acerbi and Tasche 2002) and has superior mathematical properties compared to VaR. CVaR is a so-called "coherent measure"; for example, the CVaR of a portfolio is a continuous and convex function with respect to positions in instruments, whereas the VaR may even be a discontinuous function (Sarykalin et al. 2014). The theory of coherence requires a measure of risk to satisfy four mathematical criteria: (1) translation invariance, (2) positive homogeneity, (3) monotonicity and (4) subadditivity (Artzner et al. 1999; Sheppard 2013). Risk management conducted using CVaR functions can be performed efficiently, and CVaR can be optimized and constrained with convex and linear programming methods. Overall, CVaR provides an adequate picture of the risks reflected in extreme tails, a very important property if the extreme tail losses are correctly estimated. Broadly speaking, problems of risk management with VaR and CVaR can be classified as falling under the heading of stochastic optimization.
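The Rockafellar and Uryasev (2000) construction can be sketched directly: CVaR at level alpha is the minimum over c of c + E[max(L − c, 0)] / (1 − alpha), and for a finite scenario set the objective is piecewise linear, so the minimiser can be sought among the observed losses. The scenario values below are illustrative:

```python
def cvar_rockafellar_uryasev(losses, alpha=0.95):
    """CVaR via the Rockafellar-Uryasev formula:
        CVaR_alpha = min over c of  c + E[max(L - c, 0)] / (1 - alpha).
    With discrete scenarios the objective is piecewise linear with
    breakpoints at the loss values, so we minimise over those."""
    n = len(losses)
    def objective(c):
        return c + sum(max(l - c, 0.0) for l in losses) / ((1 - alpha) * n)
    return min(objective(c) for c in losses)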

By using this measure, it is possible to handle extreme events and to measure risk in a more precise and easier way. Instead of focusing on what is likely to happen, this is achieved through the examination of conditional expectations, hence the name. Expected shortfall has been one of the methods proposed as an alternative to VaR. For instance, when imposing limits on traders, by using VaR the firm will assume too much risk because it holds less capital to cover this risk, which then puts the company at risk (Danielsson et al. 2005).

As the economic financial meltdown of 2008–2009 demonstrated, VaR fares poorly during periods of market distress (Chen 2014). Basel 2.5 addressed some of the shortcomings of VaR and added stressed VaR (SVaR), viewed as a response to the pro-cyclicality of classical VaR. This is "a forward-looking measure of portfolio risk that attempts to quantify extreme tail risk calculated over a long time horizon" (Berner 2012). Its aim is to provide a wider perspective on various inherent risks, such as market and credit risk, as well as gap risks and jumps, because it analyses a time period that experienced extremes and takes sudden changes into account. In short, stressed VaR is used to obtain an idea of the possible losses likely to occur given worse market conditions. SVaR corrects various deficits of ordinary VaR in times of market stress; it employs the Gaussian (normal) probability formalism in a completely different way than VaR and is designed to account for tail risk and collective behaviour (Billio et al. 2016). The two main properties of SVaR are "fat tail volatilities" that account for outlier events in the risk factors, and stressed correlations between risk factors that account for collective market participant behaviour in stressed markets (Dash 2012).

Lichtner (2019) emphasized that what is fundamental to any SVaR model is the choice of the return-type model for each risk factor. In his study, he proposed a generalized return model. This is because of the sensitive nature of SVaR numbers to the chosen return-type model, and researchers have to make prudent choices on which return type to adopt. The findings reveal interesting, different SVaR dynamics for each return type such as absolute, relative, shifted relative, log, etc.

It is a regulatory requirement under the Basel III framework that SVaR be calculated for banks and financial institutions (European Banking Authority 2012). A stated benefit of using it is that it counters the procyclicality of ordinary VaR, so applying it to asset classes or portfolios has advantages over the generic VaR. The European Banking Authority (2012) stated that historical data covering a continuous 12 months must be used or, if the data covers less, it must cover the period under which the portfolio was under stress. There is no weighting of historical data when determining the relevant historical period or calibrating the stressed VaR model. This is because the weighting of data in a stressed period would not result in a true reflection of the potential stressed losses that could occur for an asset or portfolio. It is assumed that a stressed VaR should only exceptionally be smaller than VaR, and this is tested as the authors compare several VaR variations. In the Basel framework, it is also recommended that there be ongoing monitoring of stressed VaR relative to VaR. The ratio between the two measures identified at the beginning of the monitoring period should be used as a reference for continued monitoring. Should the ratio decrease significantly, it should signal the potential need for a review of the stress period. Should the ratio between the stressed VaR and VaR fall below one, this should be taken as a warning signal that may warrant a review of the stressed period.
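The monitoring rule described above can be sketched as follows; note that what counts as a "significant decrease" is not prescribed by the framework, so the drop_tolerance parameter here is a purely hypothetical choice:

```python
def svar_monitor(svar, var, reference_ratio, drop_tolerance=0.5):
    """Basel-style monitoring sketch: flag a review of the stress period
    if SVaR falls below VaR (ratio < 1) or if the SVaR/VaR ratio drops
    significantly versus its reference level at the start of monitoring.
    drop_tolerance is an illustrative threshold, not a regulatory value."""
    ratio = svar / var
    review = ratio < 1.0 or ratio < reference_ratio * drop_tolerance
    return ratio, review
```

For example, an SVaR of 0.9 million against a VaR of 1.0 million yields a ratio below one and triggers the review flag.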

Marginal VaR (MVaR) is another variation of VaR that extends the notion of understanding the additional amount of risk introduced by new investments. It allows portfolio managers to understand the impact of adding or removing positions in their portfolios. Volatility measures the uncertainty of the return of an asset considered in isolation; when the asset belongs to a portfolio, what is of critical importance is its contribution to the portfolio risk. The effect of small changes in one part of the portfolio on the portfolio VaR is what marginal value at risk (MVaR) measures.
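A minimal finite-difference sketch of MVaR, assuming a historical-simulation VaR on synthetic returns (the function names are ours, not a standard API):

```python
import numpy as np

def portfolio_var(weights, returns, confidence=0.95):
    """Historical VaR of a weighted portfolio (returns: T x N matrix)."""
    portfolio = returns @ weights
    return -np.percentile(portfolio, 100 * (1 - confidence))

def marginal_var(weights, returns, i, eps=1e-4, confidence=0.95):
    """Finite-difference MVaR: sensitivity of portfolio VaR to asset i's weight."""
    bumped = weights.copy()
    bumped[i] += eps
    return (portfolio_var(bumped, returns, confidence)
            - portfolio_var(weights, returns, confidence)) / eps

rng = np.random.default_rng(0)
returns = rng.normal(0, [0.01, 0.03], size=(500, 2))  # asset 2 is three times as volatile
w = np.array([0.5, 0.5])
mvar = [marginal_var(w, returns, i) for i in range(2)]
```

As expected, the more volatile asset shows the larger marginal contribution, which is the property MVaR is designed to surface.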

Incremental value at risk (IVaR) is yet another approach adding onto the traditional VaR approach. It has become a standard tool for making portfolio-hedging decisions, in particular, hedging and speculating with options and reducing the risk in a risk return analysis (Mina 2002). In theory, IVaR is a metric that measures the contribution in terms of relative VaR of a position or a group of positions with respect to the total risk of a pre-existent portfolio. Its aim is to calculate the worst-case scenario most likely to occur for a portfolio in a particular given period. Investors can therefore be able to determine which investment to undertake based on how it affects the portfolio, choosing the one which has the least losses impact. More recently, Jain and Chakrabarty (2020) made further improvements upon the performance of a managed portfolio by proposing the use of MVaR in order to ascertain the desirability of assets for inclusion in a managed portfolio. They empirically conducted the study with S&P BSE 100 as the benchmark index with at least five different optimization problems. Thus, in order to capture the effect of the change in dollar exposure of assets on the overall risk portfolio, the use of VaR of the individual assets is not adequate, and therefore, MVaR provides a suitable fit. This means that MVaR is more amenable when it comes to identifying additional portfolio risk (Jain and Chakrabarty 2020).
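The IVaR idea, the change in total VaR from adding a position, can be sketched as follows, using synthetic returns and a hypothetical hedge position:

```python
import numpy as np

def historical_var(pnl, confidence=0.95):
    """Historical-simulation VaR of a return series."""
    return -np.percentile(pnl, 100 * (1 - confidence))

rng = np.random.default_rng(1)
portfolio = rng.normal(0.0, 0.02, 500)                 # existing portfolio returns
hedge = -0.5 * portfolio + rng.normal(0, 0.005, 500)   # position negatively correlated with it

# IVaR: VaR with the position minus VaR without it.
ivar = historical_var(portfolio + hedge) - historical_var(portfolio)
# A negative IVaR means the candidate position reduces total portfolio risk.
```

Here the hedge offsets half the portfolio exposure, so the incremental VaR is negative: the position would be chosen under the "least losses impact" criterion described above.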

Lower Partial Moment (LPM) is a measure of downside risk, calculated as the average of squared deviations below a given target return. It is also a supplement to VaR: the LPM applies specifically to the left tail of the distribution. It can be the average of all negative deviations from the VaR or the variance of the negative deviations from a pre-defined VaR. It is, therefore, a method that places heavier weight on larger deviations from the current VaR than on smaller ones (Goodfellow and Salm 2016). An investor can select or set the target (for example, the risk-free rate), and by selecting the degree of the moment, one can specify the risk levels that suit the portfolio or risk needs. Goodfellow and Salm (2016) compared three different risk measures based on the same stock return data, including the portfolio variance as in the seminal work of Markowitz (1952) and the VaR based on a *t* copula. They concluded that the normality assumption substantially underestimates the risk faced by an organization, and they therefore discredited the use of risk measures based on it.
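A lower partial moment of arbitrary order and target can be sketched in a few lines; the numbers below are illustrative only:

```python
import numpy as np

def lower_partial_moment(returns, target=0.0, order=2):
    """LPM of a given order: average of (target - r)^order over returns below target."""
    shortfall = np.maximum(target - np.asarray(returns), 0.0)
    return np.mean(shortfall ** order)

returns = np.array([0.04, -0.02, 0.01, -0.05, 0.03])
lpm1 = lower_partial_moment(returns, target=0.0, order=1)  # ≈ 0.014
lpm2 = lower_partial_moment(returns, target=0.0, order=2)  # ≈ 0.00058
```

With these five returns, only the two below the target contribute, and the order-2 moment penalizes the larger shortfall (0.05) disproportionately, which is exactly the heavier-weighting property described above.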

Multifactor and Proxy Models are another option. The difference between multifactor models and arbitrage pricing models is that the latter limit themselves to historic data, whilst the former expand the data to include other macro-economic factors in the model, such as ratios and market capitalization. Their assumption is that market prices usually fluctuate for no specific reason. They also assume that the returns of a stock are related to its riskiness: if a stock has high returns over a period of time, it must be riskier than one with lower returns.

Fama and French (1992) found that if markets are reasonably efficient in the long term, then market capitalization and book-to-price ratios are good stand-ins, or proxies, for risk measures. Multifactor models of this type will thus do better than conventional asset pricing models at explaining the differences in returns. The Capital Asset Pricing Model (CAPM) is based on the idea that not all risks should affect asset prices; the CAPM gives us insights into what kind of risk is related to return.

#### *2.2. Suggested Alternatives for Non-Normal Distributions—Conditional Distributions*

Serial correlation in asset return time series, assessed via the ACF and PACF to inform volatility forecasts, falls under the conditional distribution approach. The aim of time series analysis is to decompose the data to find a trend or seasonal component, which is carried out for assessment and prediction; serial correlation analysis is one such methodology. Serial correlation, also known as autocorrelation, occurs when returns in a given period depend on returns in a previous period, and it is used to describe the relationship between observations of the same variable over different periods. A zero correlation means the observations are independent; otherwise, they do not evolve as a random process but are related to their prior values (Napper 2008).

Having asset returns that are serially correlated tends to distort an asset class's actual risk, as it reduces risk estimates derived from the time series: serial correlation artificially smooths an asset class's volatility (J.P. Morgan Asset Management 2009). A positive autocorrelation will lead to a VaR that underestimates the actual volatility of the asset, whereas a negative autocorrelation can result in an overstated volatility (Baker and Filbeck 2015).

The copula (*t* or Gaussian) offers a more mathematical approach to risk measurement under the Integrated Risk Management (IRM) framework, measuring the error of the normality assumption and adjusting for the non-normality of a distribution (Embrechts et al. 2001). It takes into account events such as shocks: under this methodology, so-called fatal shock models describe shocks that destroy a component, while in non-fatal shock models there is a chance of the component surviving the shock. Shocks can result from natural catastrophes or underlying economic events.

This method allows for the construction of risk models that go beyond the standard ones at the level of dependence. It can be used to stress test different products and portfolios for extreme correlations and for the measurement of dependence, and it addresses the limitations of linear correlation. Linear correlation nevertheless tends to be preferred because of its simplicity, even though it assumes normality and does not capture joint movements in extreme values. The concept of copulas is rather abstract, so it is prudent to complement it with methods that take the joint movement aspects of a non-normal distribution into account. This can be achieved by looking at correlation in a portfolio context whilst allowing for the variations that may evolve across different periods and under differing market conditions.
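As an illustrative sketch of the copula idea (not the full IRM methodology), a Gaussian copula can be simulated by mapping correlated normal draws through the standard normal CDF, producing dependent uniform marginals that can then be paired with any non-normal marginal distributions:

```python
import numpy as np
from math import erf, sqrt

def gaussian_copula_sample(corr, n, seed=0):
    """Draw n samples with uniform marginals and Gaussian-copula dependence."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(corr)                    # factor the correlation matrix
    z = rng.standard_normal((n, corr.shape[0])) @ chol.T
    # Map each correlated normal to (0, 1) via the standard normal CDF.
    return 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))

corr = np.array([[1.0, 0.8], [0.8, 1.0]])
u = gaussian_copula_sample(corr, 10_000)
empirical_corr = np.corrcoef(u[:, 0], u[:, 1])[0, 1]   # dependence survives the mapping
```

The point of the construction is that the dependence structure is specified separately from the marginals, which is what lets copula models capture non-normal joint behaviour.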

Financial markets are dynamic and constantly changing, so it is critical to obtain data that are accurate and relevant to the period. In their BIS article, Loretan and English (2000a) suggested that when determining the appropriate time interval to use, risk managers should consider periods of relatively high or low volatility, because these periods contain relevant information on the true underlying asset relationships. If volatility and correlation are not adequately assessed, this can cause problems for risk managers when they stress test or examine worst-case scenarios. The thinking behind this is that higher correlations between assets generally come with increased volatility.

This can be performed by grouping periods of high volatility and assessing the historic correlation within them, which can then be applied to any future high-volatility scenario projections. A period under normal conditions must also be assessed to see how the correlations behave under normal volatility. It is then useful to determine whether the correlations in these periods increase or decrease with increased volatility. Thus, the main challenge is how to model risk so as to accommodate the time-varying nature of asset return volatilities (Loretan and English 2000b).

The main reasons for forecasting can be classified as asset allocation, risk management and taking bets on future volatility. In risk management, risk is measured to find the potential losses that an asset or portfolio may incur in the future (Reider 2009). This requires estimates of the future correlations of the assets, as described above in the discussion of correlations in risk measures. Following the seminal work of Box and Jenkins (1976), Autoregressive Moving Average (ARMA) models have become the standard tool for modelling and forecasting univariate time series, and the ARMA framework can be extended to allow for the possibility of heteroscedasticity. It is also necessary to obtain estimates of the future volatilities of the assets, which is where the ARMA and generalized autoregressive conditional heteroskedasticity (GARCH) family of models come into play. They come in various forms and hybrids, but their purpose is to forecast volatility. Volatility is not the same as risk; when interpreted as uncertainty, however, it becomes a key input to many investment decisions and portfolio constructions. Thus, a good forecast of the volatility of asset prices over the investment-holding period is a good starting point for assessing investment risk. As a concept, volatility is simple and intuitive: it measures variability or dispersion about a central tendency, and the greater the deviation, the greater the volatility. Volatility forecasts are, however, sensitive to the specification of the volatility model, and it is paramount to strike a balance between capturing the salient features of the data and overfitting them.

Value at risk forecasting with ARMA-family models is ideal in times of increased volatility. As previously introduced, autoregressive and moving average components can be used in time series analysis, and stemming from these are the models on which this paper mostly focusses for the risk forecasts of the asset classes: the ARMA model, which combines the autoregressive and moving average models, the GARCH model and the exponential generalized autoregressive conditional heteroscedasticity (EGARCH) model. These models are designed to deal with common characteristics of financial time series, such as thick tails, and can be used to forecast the returns. Under certain specifications, EGARCH models capture the main stylized features of stock volatility, such as volatility clustering, the negative correlation between volatility and returns, and lognormality (e.g., Bollerslev 1986).

While ARCH was developed to model the changing volatility of inflation series, the model and its later extensions were quickly adopted for use in modelling conditional volatility of financial returns. The GARCH technique is a more sensitive way to measure risk in a distribution. GARCH is useful for non-normal distributions to analyse an asset's risk as well as market risk.

It integrates the theory of dynamic volatilities and is used to describe variance in terms of what is currently observable (Engle 2004). The method takes a weighted average of past squared errors to form a weighted variance; with the most weight placed on the most recent information, current data are given more influence.

GARCH was developed to tackle the belief, and the fact, that returns tend to be unpredictable and have fat tails. The main advantage of GARCH models is that they capture both the heavy tails of a return series and the volatility clustering (Mabrouk 2016). According to the model, where volatility is high, it is likely to remain high, and where it is low, it will remain low, although these periods are time limited. GARCH builds on the ARCH model, in which the process produces more extremes than would naturally be expected from a normal distribution, since the extreme values during the high-volatility period are larger than could be anticipated from a constant-volatility process (Engle 2004).

In using the GARCH approach, the analysis uses the exponentially weighted moving average (EWMA), which solves the problem of weights, whereby the recent return has more influence on the variance than that from the previous month. EWMA as a measure of volatility uses a smoothing parameter called *Lambda,* which is there to ensure that the variance that is calculated is more biased towards recent data. The weights exponentially decline at a rate throughout time.

The use of EWMA in this model is fitting because it incorporates external shocks better than equally weighted moving averages, and its estimates are more efficient than standard simple moving averages because they provide a more realistic measure of current volatility. However, EWMA has downsides: it assumes that the trend will continue in the long term, so its forecasts become increasingly inaccurate as the forecast horizon expands. For scenarios with medium volatility of volatility, though, there is very little penalty for using EWMA regardless of the volatility-generating process, and its robust structure appears to contribute to greater forecasting accuracy than the more flexible GARCH. Brooks (2014) shared the logic that GARCH models are designed to capture the volatility clustering effects in the returns. In particular, he mentioned the Exponential GARCH (EGARCH) model, noting that, on the plus side, its formulation ensures that the conditional variance will always be positive.
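The EWMA recursion described above can be sketched as follows, seeding the variance with the first squared return (a common but not unique choice) and using the conventional RiskMetrics smoothing parameter:

```python
import numpy as np

def ewma_variance(returns, lam=0.94):
    """RiskMetrics-style EWMA: sigma2_t = (1 - lam) * r_{t-1}^2 + lam * sigma2_{t-1}."""
    returns = np.asarray(returns)
    sigma2 = np.empty(len(returns))
    sigma2[0] = returns[0] ** 2            # seed the recursion with the first squared return
    for t in range(1, len(returns)):
        sigma2[t] = (1 - lam) * returns[t - 1] ** 2 + lam * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(2)
r = np.concatenate([rng.normal(0, 0.01, 100),    # synthetic calm regime
                    rng.normal(0, 0.04, 100)])   # synthetic turbulent regime
sigma2 = ewma_variance(r)
# The estimate rises quickly after the volatility jump at t = 100,
# illustrating the exponentially declining weights on older data.
```

Because the weights decline exponentially, the estimate tracks the regime shift within a few observations, which is the responsiveness the text attributes to EWMA.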

The stochastic volatility (SV) model is similar to the GARCH model but differs in that it is parameter driven, with latent shocks, whereas GARCH is observation driven and thus easier to analyse (Diebold and Rudebusch 2012). The logic behind the SV model is that it considers issues such as time deformation (the idea that calendar time does not have to move in tandem with financial markets): calendar time is constant and predictable, unlike economic time, which may slow down or speed up. SV model forecasts are noticeably more accurate than GARCH in scenarios with a very high volatility of volatility and a stochastic volatility generating process.

Another variation is realized volatility (RV). This method differs from the GARCH and SV approaches, which look at conditional expectations of discrete-time squared returns: realized volatility looks at realizations rather than expectations. Realized volatility facilitates superior risk measurement, which can translate into superior risk management. The RV model is based on continuous time, which allows for more accurate volatility forecasts; it accommodates jumps and can be used to improve GARCH and SV models.

#### **3. Methodology**

Our research design attempted to answer the research questions proposed earlier in this study. In view of the research problem, the study applies a quantitative research design, drawing the analysis from multiple risk model testing methods. Through testing methods derived from the GARCH family, we ascertain that the traditional assumption of normality in a distribution does not assist financial decision makers in determining models that give accurate risk measures. The normal distribution assumption is based on forecasting future asset price changes on the belief that prices follow a random walk. A random walk is described as a movement of variables or numbers that do not follow a specific order, and two types of random walk theories can be identified: the first is based on single asset prices, and the other is the fundamental model of asset price dynamics. To give our research design more richness, we also employed measures of dispersion in order to ascertain the distance between values within the distribution (range, interquartile range). Overall, three major ways of measuring risk are discussed, namely the variance–covariance approach, historical data and Monte Carlo simulations.

#### Choice of Data sets

Our research study adopted an empirical approach by examining a data set that represents the following asset classes: equities (138), hedge funds (13) and bonds (+20). All the data used in this study were obtained from Bloomberg. There was a diverse range from different classifications, sectors, industry classes, asset types, geographic regions and so forth. However, from this initial data pool, there had to be a selection of only those assets that would exist over the selected test time period, as is explained in the time period cut off section that follows.

#### Backtesting VaR Models

Several backtests were conducted, although the discussion of these methods is by no means exhaustive; a thorough treatment of every backtesting method is beyond the scope of this study. VaR models are only useful if they predict future risks accurately, so in order to evaluate the quality of the estimates, the models should always be backtested with appropriate methods.

Unconditional coverage was adopted to statistically examine whether the frequency of exceptions over the time interval is in line with the selected confidence level. This was carried out to determine when portfolio losses exceeded VaR estimates.
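A standard formulation of this unconditional coverage check is Kupiec's proportion-of-failures test; the sketch below assumes the usual chi-squared(1) limit and uses its closed-form survival function, with illustrative exception counts rather than results from the study:

```python
from math import log, sqrt, erfc

def kupiec_pof(num_exceptions, num_obs, var_confidence=0.95):
    """Kupiec proportion-of-failures test for unconditional VaR coverage."""
    p = 1 - var_confidence                     # expected exception rate
    x, T = num_exceptions, num_obs
    pi_hat = min(max(x / T, 1e-10), 1 - 1e-10)  # observed rate, clamped to avoid log(0)
    lr = -2 * ((T - x) * log(1 - p) + x * log(p)
               - (T - x) * log(1 - pi_hat) - x * log(pi_hat))
    p_value = erfc(sqrt(lr / 2))               # survival function of chi-squared(1)
    return lr, p_value

# 252 trading days at 95% confidence: about 13 exceptions are expected.
lr_ok, p_ok = kupiec_pof(13, 252)    # close to expectation: not rejected
lr_bad, p_bad = kupiec_pof(30, 252)  # far too many exceptions: rejected
```

The test rejects the model when the observed exception frequency is statistically incompatible with the chosen confidence level, which is exactly the comparison of portfolio losses against VaR estimates described above.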

Conditional coverage was also examined through the clustering of exceptions, in order to assess not only the frequency of VaR violations but also the times at which they occur. For this part, both Christoffersen's (1996) interval forecast test and the mixed Kupiec test of Haas (2001) were conducted. As discussed earlier, the choice of parameters in VaR calculations is not arbitrary whenever backtesting is involved. To construct a solid view of the validity of the model, a relatively low confidence level was used; according to Jorion (2003), a confidence level of 95% is well suited to backtesting purposes. We argue that with this approach, it is possible to observe enough VaR violations within a time period. The software used a 95% confidence level by default; ideally, the authors would also have used levels of 90% and 99% and tested each individually, and we acknowledge that using more than one confidence level makes the backtesting more effective. The total number of trading days (observations) was 252, which was sufficient to produce statistically significant backtests and was also in line with the Basel backtesting framework.

As discussed earlier, single backtests can never be enough to evaluate the goodness of a VaR model. If one test yields a decent outcome, the result should always be confirmed by another test (Haas 2001).

Various tests were conducted in order to address the research question, "how to measure risk in a non-normal world". The data were clustered into generic groups based on certain characteristics, and a summary of each group was established whilst accommodating exogenous factors. The cut-off point was data existing from 1995; any asset whose data began after this point was filtered out. This was carried out to ensure homogeneity, which allowed for better comparisons. The returns were calculated as simple returns rather than log returns; the lognormal method is time additive and assumes a normal distribution. The skew of a lognormal distribution is positive, unlike market returns, which tend to be negatively skewed; this assumption first had to be verified in the normality tests. Associating log returns with fat-tail assumptions can also be problematic.

#### *3.1. Time Period*

The timeline used in this analysis was a 21-year period from 1995 to 2016. This time encompassed various periods of stress, ranging from the Mexican peso crisis to the EU debt crisis. These periods are very important, as they reflect the stress periods in which one can observe the returns and derive the most accurate risk measures for such situations. As mentioned in the literature review sections, calculating risk is not only based on models; one also has to take into account historic events and study trends, business cycles and other issues, so as to take a more comprehensive approach. The period of volatility resulting from COVID-19 has not been factored into the study, because the available data are inadequate while still being collected; we therefore minimized our focus on the effect of the pandemic, although this may be investigated in future studies. Iuga and Mihalciuc (2020), however, analysed the impact of global shocks and highlighted the global effects of COVID-19 by developing two regression models over the time frame 2001–2020 Q2. Nevertheless, we do acknowledge the adverse effect the pandemic has had on economies and financial markets at large.

#### *3.2. Tests*

The first test carried out was for non-normality (skewness, kurtosis) of the data, in order to see whether non-normality was present in the returns. If so, the next step was to determine the type of non-normality in the observed returns. To assess the normality or non-normality of the data sets, the Jarque Bera test was used, which is based on the classical measures of skewness and kurtosis (the third and fourth moments of a distribution). As these measures are based on moments of the data, this test has a zero breakdown value. The test is a goodness-of-fit test of whether sample data have skewness and kurtosis that match a normal distribution. Once the distributions were determined, it was easier to select the most appropriate measure or set of measures, based on the literature gathered, to accurately measure risk for a given data set with a given distribution. For any distribution *F* with finite central moments μ*k* (*k* ≤ 3), the skewness is defined as follows:

$$\gamma_1(F) = \mu_3(F) / \mu_2(F)^{3/2}$$

For any distribution *F* with finite central moments μ*k*(*k* ≤ 4), the kurtosis is defined as follows:

$$\gamma\_2(F) = \mu\_4(F) / \mu\_2(F)^2$$
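The two moment ratios above can be estimated directly from a sample; the sketch below uses synthetic normal and Student-*t* data purely for comparison:

```python
import numpy as np

def central_moment(x, k):
    """k-th central moment of the sample."""
    x = np.asarray(x)
    return np.mean((x - x.mean()) ** k)

def skewness(x):
    """gamma_1 = mu_3 / mu_2^(3/2); zero for a symmetric distribution."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5

def kurtosis(x):
    """gamma_2 = mu_4 / mu_2^2; equals 3 for a normal distribution."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2

rng = np.random.default_rng(3)
normal_sample = rng.normal(size=100_000)
fat_tailed = rng.standard_t(df=6, size=100_000)  # heavier tails than the normal
```

For the normal sample the kurtosis estimate sits near 3, while the Student-*t* sample exceeds it markedly, which is the leptokurtosis the paper tests for.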

The next step was to conduct risk measure tests. After the normality tests, VaR tests were carried out. The VaR concept condenses the entire distribution of the portfolio returns into a single number that investors have found useful and that is easily interpreted as a measure of market risk; the VaR is essentially a *p*-percentage quantile of the conditional distribution of portfolio returns. We selected two assets from each asset class, representing different characteristics, to make our study more comprehensive and inclusive of different assets. The first part examined the various ways of directly improving the standard VaR by using its hybrids. Thereafter, the analysis focused on applying other models to conduct volatility forecasts through modelling methods suitable for conditional distributions that are time dependent. The results were compared with those from another variation of VaR, the CVaR. In theory, the CVaR focusses on downside or tail risk, and so this measure is likely to be higher than the standard VaR. Risk managers have a plethora of volatility measures to choose from when calculating VaR: time series models of volatility range from exponentially smoothed and simple autoregressive models, through single-shock GARCH models, to two-shock stochastic volatility models. The benchmark measure, popularized in Morgan Guaranty Trust Company's (1996) RiskMetrics, sets the conditional mean constant and specifies the variance as an exponential filter:

$$
\sigma\_t^2 = (1 - \lambda)\varepsilon\_{t-1}^2 + \lambda\sigma\_{t-1}^2
$$

In the GARCH model, the most widely used form is GARCH(1,1), which has several extensions. In a simple GARCH model, the conditional variance is expressed as a function of the previous squared shock and the previous variance.
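A minimal GARCH(1,1) variance recursion might look as follows; the parameter values are hypothetical, whereas in practice they would be estimated by maximum likelihood:

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """GARCH(1,1): sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}."""
    returns = np.asarray(returns)
    sigma2 = np.empty(len(returns))
    sigma2[0] = omega / (1 - alpha - beta)   # start at the unconditional variance
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Hypothetical parameters satisfying alpha + beta < 1 (covariance stationarity).
rng = np.random.default_rng(4)
r = rng.normal(0, 0.01, 500)
sigma2 = garch11_variance(r, omega=1e-6, alpha=0.08, beta=0.90)
```

The high persistence (alpha + beta = 0.98) is what produces the volatility clustering described above: a large shock raises the conditional variance, which then decays only slowly.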

When applying the SVaR, according to the Basel III framework, this can be calculated on top of the VaR with a 99% confidence interval. Applying this to a more practical scenario, however, involves the selection of a stress period. For this analysis, the period adopted was monthly returns for a non-stress period of January 2003 to December 2005 (36 months), while the stressed period was the financial crisis from August 2007 to December 2009 (29 months). In order to broaden the scope of the analysis, two sets of data were used on the equities over the same periods: one was the monthly returns, similar to what was applied to the bonds and hedge funds, and the other was the daily returns of the equities. This was carried out to observe whether there was much difference when SVaR and VaR are calculated using monthly versus daily returns. Another dimension of this analysis was applied to the bonds in the normal period: VaR and stressed VaR were compared between one asset that is normally distributed and one that is not. The aim was to visualize the extent of the impact that the distribution of a data set had on the results of the tests conducted.
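The expectation stated earlier, that CVaR sits at or above VaR because it averages the losses beyond the VaR quantile, can be checked with a short historical-simulation sketch on synthetic fat-tailed monthly returns:

```python
import numpy as np

def var_cvar(returns, confidence=0.95):
    """Historical VaR and CVaR (expected shortfall beyond the VaR quantile)."""
    returns = np.sort(np.asarray(returns))
    cutoff = int(len(returns) * (1 - confidence))
    var = -returns[cutoff]
    cvar = -returns[:cutoff].mean() if cutoff > 0 else var
    return var, cvar

rng = np.random.default_rng(5)
monthly = rng.standard_t(df=5, size=360) * 0.02   # synthetic fat-tailed monthly returns
var95, cvar95 = var_cvar(monthly)
# CVaR averages the worst observations, so it is never below VaR.
```

Because CVaR conditions on the tail beyond the VaR cut-off, the gap between the two widens as the tails get fatter, which is why the paper treats CVaR as the more conservative downside measure.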

In practice, Beta is typically estimated using parametric estimators such as ordinary least squares (OLS), because OLS is the best linear unbiased estimator. Additionally, monthly, quarterly or annual returns data are very often employed (see Kamara et al. 2018), as they are more stable and closer to normally distributed. However, outliers, a very common problem in real asset returns, may seriously affect the performance of parametric estimators compared with non-parametric estimators.

The test for autocorrelation was another critical test undertaken because it helps in seeking dependence in returns over time. This was achieved by plotting the ACF and PACF to observe whether any correlation in an asset's return existed. This will further help in the application of models such as the GARCH to better forecast volatility in the future. If one examines empirical time series, it is easy to observe that large fluctuations in asset prices are more often followed by large ones, while small fluctuations are more likely followed by small ones. This stylized fact is known as volatility clustering. In order to view the properties and determine whether there were more large fluctuations than pure random processes, we looked at the autocorrelations of the returns.

The autocorrelation function *C*(*xt*, *xt*+*τ*) is defined as follows:

$$C(x_t, x_{t+\tau}) \equiv \frac{\langle (x_t - \langle x_t \rangle)(x_{t+\tau} - \langle x_{t+\tau} \rangle) \rangle}{\sqrt{\langle x_t^2 \rangle - \langle x_t \rangle^2}\,\sqrt{\langle x_{t+\tau}^2 \rangle - \langle x_{t+\tau} \rangle^2}}$$

where $\langle x \rangle$ denotes the expectation value of the variable *x*.
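A direct sample implementation of this autocorrelation function might look as follows; the white-noise and random-walk series are synthetic illustrations of the two extremes:

```python
import numpy as np

def autocorrelation(x, lag):
    """Sample autocorrelation at a given positive lag."""
    x = np.asarray(x, dtype=float)
    a, b = x[:-lag], x[lag:]
    num = np.mean((a - a.mean()) * (b - b.mean()))   # covariance of the lagged pairs
    den = a.std() * b.std()                          # product of the standard deviations
    return num / den

rng = np.random.default_rng(6)
noise = rng.standard_normal(2000)
walk = np.cumsum(noise)            # a random walk is highly autocorrelated

acf_noise = autocorrelation(noise, 1)  # near zero for white noise
acf_walk = autocorrelation(walk, 1)    # close to one for the random walk
```

A lag-1 value near zero indicates the independence described above, while a value near one indicates returns strongly related to their prior values.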

After this, autocorrelation of an asset with itself and from the portfolio perspective was examined. The most commonly used correlation test is Pearson's test, although it assumes normality. The hypothesis here was that there is a relationship between the correlation of a portfolio of assets and volatility, comparing normal periods against stress periods. The assumption is that correlations between asset classes tend to increase during high-volatility periods, unlike during normal market conditions. In a portfolio context, this could mean that the benefit of diversification may be nullified in high-stress conditions, leading to understated risk.

The next step was to assess the time series of the data to observe how the series moved and thus determine whether there was volatility clustering and what type of variance exists in each asset. In financial markets, prices of stocks and commodities fluctuate over time, producing financial time series that are of great interest to both practitioners and theoreticians when making inferences and predictions. The aim was to determine whether the volatility displayed was homoskedastic (constant variance) or heteroskedastic (time-varying variance), and hence whether the data behaved as is usually expected in financial markets, which are typically heteroskedastic. A characteristic of a robust model is that it should accurately show and capture the volatility clustering that exists in the financial time series (Lok 2015).

Regression analysis was adopted to predict future volatility. In the process of finding the ideal VaR forecaster, the first step was to calculate the AIC for each asset; the AIC indicates the best model to use in predicting volatility. The final step was to forecast risk using three models, namely the ARCH, GARCH and EGARCH models, for each asset class, to see how well each model forecasts volatility for the given asset class.

In carrying out this process, other software packages such as R were considered, but due to the complex nature of the methods, the authors opted to use the Excel add-in NumXL for the computations, as it was their main statistical analysis tool. Using this package, the parameters for each model were calculated and calibrated to find the most efficient values to apply. The forecasts were then plotted over a 12-month time horizon to assess and obtain the most likely volatility movements.

### **4. Results**

#### *4.1. Jarque Bera Test—Non-Normality Test*

Despite the remarkable qualities of the Shapiro–Wilk test, we adopted the Jarque Bera (JB) test, which can be applied to a variety of general models, such as nonlinear regressions and conditional heteroskedasticity models.

When using the JB test, the aim is to test the null hypothesis that the data are normally distributed. A *p*-value greater than 0.05 leads to a failure to reject the null hypothesis, whereas a *p*-value of less than 0.05 leads to the rejection of the null hypothesis and the conclusion that the data are not normally distributed. This was applied to the data below for all three asset classes using the following formula:

$$JB = \frac{n}{6} \left( S^2 + \frac{K^2}{4} \right) \sim \chi^2_{\nu=2}$$

where *n* is the number of observations, *S* is the sample skewness and *K* is the sample excess kurtosis.
In our analysis, we first derived the asymptotic distribution of the sample skewness and kurtosis coefficients of the model's standardized residuals. We then constructed an asymptotic χ2 test of normality based on these results.
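A sketch of the JB statistic, with *K* taken as the excess kurtosis and the chi-squared(2) survival function computed in closed form as exp(-JB/2); the samples below are synthetic, with the near-normal one built from normal quantiles:

```python
import numpy as np
from math import exp
from statistics import NormalDist

def jarque_bera(x):
    """JB = (n/6) * (S^2 + K^2/4), K = excess kurtosis; asymptotically chi-squared(2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    mu2 = np.mean(d ** 2)
    s = np.mean(d ** 3) / mu2 ** 1.5        # sample skewness
    k = np.mean(d ** 4) / mu2 ** 2 - 3.0    # sample excess kurtosis
    jb = n / 6.0 * (s ** 2 + k ** 2 / 4.0)
    p_value = exp(-jb / 2.0)                # chi-squared(2) survival function, closed form
    return jb, p_value

# A near-perfect normal sample (normal quantiles) versus a fat-tailed one.
nd = NormalDist()
normal_like = [nd.inv_cdf((i + 0.5) / 1000) for i in range(1000)]
rng = np.random.default_rng(8)
fat_tailed = rng.standard_t(df=5, size=1000)

jb_n, p_n = jarque_bera(normal_like)   # large p: fail to reject normality
jb_f, p_f = jarque_bera(fat_tailed)    # tiny p: reject normality
```

The decision rule matches the one stated above: the near-normal sample yields a *p*-value well above 0.05, while the fat-tailed sample is rejected.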

#### 4.1.1. Hedge Fund Indices

Upon carrying out the tests on the hedge fund asset class, a summary of the findings from the Jarque Bera test is shown in Appendix A. All 11 funds have returns that are not normally distributed. The skewness shows that the returns are mostly negatively skewed or close to zero, confirming the earlier hypothesis that most returns exhibit a negative skew and tend not to be normally distributed. The kurtosis of the distributions is positive, indicating that the data sets have fat tails. One asset with a very low skewness of 0.0002 is the Dow Jones Credit Suisse Long/Short Equity Hedge Fund index, which might suggest a normal distribution, but its kurtosis is 4. Even though its skew is close to zero, it is therefore leptokurtic, indicating a clustering of the distribution around the mean; this results in a higher peak and fat tails, since most of the returns are clustered around the mean. The HFRMI Index and the Dow Jones Credit Suisse Dedicated Short Bias Hedge Fund index have kurtoses of less than 3, at 0.83 and 1.53, respectively, indicating that the distributions are platykurtic, meaning that they have lighter tails.

#### 4.1.2. Equity Indices

The equity indices represented in Appendix B were selected to cover various forms of diversification, such as different geographic locations, and were either price weighted or market weighted. There were also stocks representing small, mid and large capitalizations. This ensured that the selection was broader, covered a wider range of indices and was thus more representative of the markets as a whole. In developing these classifications, the authors also applied the World Economic Situation and Prospects (WESP), which classifies all countries into three segments: developed economies, economies in transition and developing economies.

Twenty-three indices were used to carry out this test. The results show that (be it capital weighted, all index weighted or price weighted) the distributions are not normally distributed, as indicated by an N. The only anomaly regarding normality, marked with a Y for yes, was with the four Japanese stock indices: the TPX500 Index (high capitalization), TPXSM Index (small capitalization), TPXM400 Index (mid capitalization) and the TPX100 Index (high liquidity). For these, the null hypothesis is not rejected, implying that the distributions are normally distributed. This occurs in only 4 of the 23 indices.

For the remaining indices, the observations vary, but overall it can be seen that they do not have a normal distribution. The kurtosis is mixed, with some being leptokurtic and others platykurtic; regarding the tails, some are heavy, implying high tail risk, and some are light, implying less risk by comparison. The skewness results also differ among the indices classified as not normally distributed. Some exhibit approximately symmetric distributions (skewness between −0.5 and 0.5), some are moderately skewed (absolute skewness between 0.5 and 1), and others are highly skewed, with an absolute skewness greater than 1, such as the BGSMDC index. Out of these 19 indices, only 4 have a negative skew, whilst the rest are positively skewed. Generally, a negative skew implies a long left tail, which indicates a higher chance of extreme negative outcomes, while a positive skew implies a lower likelihood of poor outcomes.
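The skewness and kurtosis bands used in this classification can be sketched as a small helper, assuming the same cut-offs as in the text (0.5 and 1 for absolute skewness, 3 for kurtosis):

```python
def classify_skew(skew: float) -> str:
    """Rule of thumb used in the text: |S| < 0.5 approximately symmetric,
    0.5 <= |S| <= 1 moderately skewed, |S| > 1 highly skewed."""
    a = abs(skew)
    if a < 0.5:
        return "approximately symmetric"
    if a <= 1.0:
        return "moderately skewed"
    return "highly skewed"

def classify_kurtosis(kurt: float) -> str:
    """Kurtosis relative to the normal benchmark of 3."""
    if kurt > 3:
        return "leptokurtic (fat tails)"
    if kurt < 3:
        return "platykurtic (light tails)"
    return "mesokurtic"

print(classify_skew(0.0002))   # e.g. the DJCS Long/Short Equity index's skew
print(classify_kurtosis(4.0))  # its kurtosis of 4 is leptokurtic
```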

#### 4.1.3. Bond Indices

There were around 60 bond indices, but a cut-off was made to include only those that cover the test period, since most of them were more recently issued. Appendix C contains the summary of the results. The results from this data set indicate that all but one of the indices are not normally distributed, which supports the argument. Here, kurtosis is above 3 for almost all of the results, indicating the heavy presence of fat tails. Again, almost all of the skewness results indicate a strong negative skew. The JPM Global Aggregate Bond index is the only one with a normal distribution; its skewness and excess kurtosis are close to zero, indicating potential symmetry.

Since the initial tests proved the non-normality of the majority of asset returns, the analysis was narrowed down to analyse the three asset classes adequately, and two assets from each class were selected. This ensured diversity in the results, making them representative of the general population. Table 1 below summarizes the chosen assets and the descriptions used in the analysis.

The two indices selected to represent the hedge funds are the HFRI Fund of Funds Composite Index (HFRI) and the Dow Jones Credit Suisse Event Driven Hedge Fund Index (DJCS). They each look at a hedge fund where one is a composite whilst the other is specific (Hedge Fund Research Inc. 2017). The HFRI is a hedge fund that invests with multiple managers through funds or managed accounts. There is, therefore, a diversified portfolio, since the idea behind it is to lower risk significantly by gaining the advantage of investing with more than one investment manager.


**Table 1.** Summary of asset classes representatives.

Source: Authors.

Equity Index: In selecting the equity indices to use, the authors chose two based on different geographic locations. The first is an equity index from Finland called the OMX HELSINKI INDEX (HEX). This is an all-share index which reflects the current status of, and changes in, the stock market. The other index examined is the Botswana Domestic Companies Index (BGSMDC), which tracks companies traded in Botswana. The intention behind these choices was to obtain diverse representation: the HEX represents a stock market in the developed world, while the BGSMDC represents one from a developing nation, the two markets being on different continents, namely Europe and Africa. The classification of developed and developing nations follows the World Economic Situation and Prospects (WESP), which classifies all countries of the world into three broad categories, namely developing economies, economies in transition and developed economies.

Bond Index: The Merrill Lynch US High Yield Master II Index (H0A0) is the index that represents the corporate side of bonds because it is normally used as a benchmark index for high yield corporate bonds. This differs from the other bond index, which is the JP Morgan Government Bond Index-Emerging Markets (GBI-EM). This covers comprehensive emerging market debt benchmarks that track local currency bonds issued by emerging market governments.

#### *4.2. Summary Statistics on Asset Class*

In the first part of the analysis, an overview of the three asset classes was carried out to test for normality using the Jarque Bera test. This information is applied to these asset classes to estimate, model and forecast volatility, with the calculations carried out in Excel. In this section, the authors give an overview of the initial observations, which are further supported and justified using other models as the analysis continues.

The first step is the summary in Table 2, below, which was generated in NumXL for the assets under study.


**Table 2.** Summary statistics. Source: authors.


**Table 2.** *Cont.*

#### *4.3. Standard Deviation*

Regarding the equities, the HEX has a higher standard deviation than the BGSMDC, at 7.57% versus 4.29%, showing that it has higher volatility. The hedge funds have much lower standard deviations than the equities, at 1.65% and 1.80% for the HFRI and the DJCS, respectively. On the bonds, the GBI-EM is lower than the H0A0, since the former is based on government issues that tend to carry lower risk, whilst the latter holds corporate bonds, which carry more risks, such as credit and default risk, and is therefore justifiably riskier. Overall, it can be seen that equities carry higher risks than the other asset classes, making them a riskier asset class characterized by high volatility.

#### *4.4. Skew*

The next result on the distribution of the asset returns comes from the third central moment, skewness. Four out of the six assets have a negative skew. This supports the assumption that distributions of returns tend to be negatively skewed, and the presence of skewness shows that the distributions are non-normal. However, the government bond index GBI-EM and the Botswana BGSMDC have positive skews of 0.1 and 1.9, respectively. The GBI-EM is normally distributed, and its skew is negligible. For the BGSMDC, the positive skew needs further investigation. The asset has a positive mean, and a combination of a positive mean and positive skew implies that investing in this asset not only results in positive expected returns but also positive surprises on the upside. Investments of this kind are highly unstable, usually only in the short term until the market catches on. This result is significant given that this is a stock index in a growing market with high growth potential.

#### *4.5. Excess Kurtosis*

The fourth central moment measures the peakedness of a distribution. In Table 2, the NumXL statistics output reports excess kurtosis instead of kurtosis, where excess kurtosis is defined as kurtosis minus 3; a normal distribution should ideally have an excess kurtosis of 0. These tests further show that the returns are not only non-normal and skewed, but also possess long tails to the left. The negative skew in the returns implies frequent small gains and a few extreme losses. If the figure is above 0, the distribution is leptokurtic, signalling the presence of fat tails in the data set. Fat tails are considered undesirable because they imply additional risk.

In the tests, the returns from the equity and hedge fund indices all have an excess kurtosis greater than zero, implying high levels of tail risk. On the bonds, however, a different result is seen. The H0A0 index behaves similarly to the equities and hedge funds, although its excess kurtosis is unusually high. This is because its returns have one of the widest ranges, from −18% to 11%. This index represents high yield corporate bonds, and with high reward comes high risk, hence the fat tails. The government bond index GBI-EM, by contrast, has an excess kurtosis of 0.6 and was concluded to be normally distributed; its excess kurtosis is almost zero, reflecting the small risk exposure of this asset in comparison to its peers, which all showed non-normality of returns.
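Excess kurtosis as reported here can be computed directly with SciPy, whose `fisher=True` convention subtracts 3, matching the convention described above. The two series are simulated stand-ins (a fat-tailed series loosely mimicking a high-yield bond index and a normal one mimicking a government bond index), not the paper's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Illustrative stand-in return series (assumed data, not the paper's indices):
# a Student-t series for a fat-tailed high-yield index, a normal series for
# a government bond index.
high_yield_like = stats.t.rvs(df=4, size=264, random_state=7) * 0.02
govt_bond_like = rng.normal(0.004, 0.01, size=264)

for name, r in [("high-yield-like", high_yield_like), ("govt-bond-like", govt_bond_like)]:
    # fisher=True returns EXCESS kurtosis (kurtosis - 3)
    ek = stats.kurtosis(r, fisher=True, bias=False)
    print(f"{name}: excess kurtosis = {ek:.2f}")
```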

#### *4.6. White Noise and ARCH Effect*

White noise and the ARCH effect are explored further in the volatility forecasting section. As an overview, however, the ARCH effect gives details on the distribution's fat tails and excess kurtosis. All the assets with an excess kurtosis greater than 0 are assumed to be serially correlated and to have fat tails, further affirming the non-normality of the returns. White noise is defined as a series with zero mean, constant variance and no serial correlation. When applying this concept, the Autoregressive and Moving Average models are used to correct violations of the white noise assumption, according to Katchova (2013). Ideally, for the data to be applied to the model, they must first have the white noise characteristic. Its relevance will be shown in the section on the ACF and PACF.

#### *4.7. VaR Tests*

Given that the data sets exhibited non-normal distributions, with the exception of the GBI-EM bond index, the next step was to take an example from each asset class to first calculate the VaR and then compare it with the CVaR. This comparison sought to show how VaR underestimates risk and to determine whether CVaR is a better method. The calculations for all the VaR and CVaR figures were made at the 95%, 99% and 99.9% confidence levels. Below, Figure 1 shows the graphs plotted for the two equity indices under analysis.

Figure 1 illustrates that the conditional VaR has a higher risk measure, which indicates how these equities have a much higher downside risk impact when measuring risk. Thus, for equities, it would be ideal to use the CVaR to capture this very aspect.
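The VaR versus CVaR comparison can be reproduced with a short historical-simulation sketch. The fat-tailed returns below are simulated stand-ins for the index data, and the convention of expressing losses as negative returns is an assumption:

```python
import numpy as np

def historical_var(returns: np.ndarray, confidence: float) -> float:
    """Historical VaR: the return quantile at the given confidence level
    (a negative number representing a loss)."""
    return float(np.quantile(returns, 1 - confidence))

def historical_cvar(returns: np.ndarray, confidence: float) -> float:
    """CVaR / expected shortfall: the mean of the returns at or below
    the VaR cutoff, capturing the tail beyond VaR."""
    var = historical_var(returns, confidence)
    return float(returns[returns <= var].mean())

rng = np.random.default_rng(1)
r = rng.standard_t(df=4, size=5000) * 0.02  # fat-tailed stand-in returns

for c in (0.95, 0.99, 0.999):
    print(f"{c:.1%}: VaR={historical_var(r, c):.2%}, CVaR={historical_cvar(r, c):.2%}")
```

Because CVaR averages over the tail beyond the VaR cutoff, it is always at least as severe as VaR, which is the effect visible in Figure 1.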

Figure 2 illustrates results for this asset class similar to those of the equities. However, it can be seen that the hedge fund measurements for both the VaR and CVaR are generally lower than those of the equities, showing that this asset class generally carries lower risk relative to equities.

The results of the analysis of the bonds are depicted in Figure 3. From previous tests, it was noted that the sovereign bond index GBI-EM was normally distributed, and so one would expect that the ideal measure would be the standard VaR. This is because the distribution is assumed to have little tail risk, and its risk forecasts can be based on the Gaussian assumptions. The H0A0 index, which is not normally distributed, tends to have thicker tails and thus high tail risk. Therefore, to cater to the downside risk inherent in the tails, this index would be best suited to the CVaR measure (see Figure 3).

**Figure 1.** VaR versus CVaR for HEX and BGSMDC Equity Asset Index. Source: authors.

**Figure 2.** VaR vs. CVaR-HFRI and DJCS hedge fund.

**Figure 3.** VaR and CVaR for H0A0 and GBI-EM Bond Index Asset Class.

#### *4.8. Stressed VaR versus VaR*

In this case, the stressed VaR was calculated at a 99% confidence level, as required by the Basel III framework for banks and financial institutions. SVaR is calculated by computing the risk measure over a period of financial stress, and these tests were then used as robustness checks. The stressed VaR must cover a period exceeding 12 months, and this was taken into account: the normal period ran from January 2003 to December 2005, and the stressed period covered the financial crisis from August 2007 to December 2009.
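Operationally, SVaR differs from VaR only in the calendar window over which the historical quantile is taken. A minimal sketch with pandas, using the same normal and stress windows as above (the return series itself is simulated, with crisis-like losses injected into the stress window; it is not the paper's data):

```python
import numpy as np
import pandas as pd

def window_var(returns: pd.Series, start: str, end: str, confidence: float = 0.99) -> float:
    """Historical VaR computed over a chosen calendar window."""
    return float(np.quantile(returns.loc[start:end], 1 - confidence))

# Illustrative monthly return series (assumed data, not the paper's indices).
idx = pd.date_range("2003-01-31", "2009-12-31", freq="M")
rng = np.random.default_rng(3)
r = pd.Series(rng.normal(0.005, 0.03, len(idx)), index=idx)

# Inject crisis-like losses into the stress window.
stress_len = len(r.loc["2007-08":"2009-12"])
r.loc["2007-08":"2009-12"] += rng.normal(-0.03, 0.05, stress_len)

normal_var = window_var(r, "2003-01", "2005-12")    # calm period, as in the text
stressed_var = window_var(r, "2007-08", "2009-12")  # crisis period (SVaR)
print(f"VaR (normal): {normal_var:.2%}, SVaR (stressed): {stressed_var:.2%}")
```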

The results in Figure 4 give a visual depiction of what is observed. The first analysis looks at the equities data. Because the data set also included daily prices, the authors decided to calculate the normal and stressed VaR for both indices using the daily as well as the monthly prices, to compare and see whether the time interval used in the calculation has an impact.

**Figure 4.** VaR versus SVaR equities. Source: authors.

Firstly, by comparing the VaR and SVaR for both scenarios, it can be observed that the stressed VaR has a higher risk component than the normal period. The next step was to look at the different time horizons. When using daily returns, the VaR and SVaR both show a lower risk measure than the values computed monthly. A reason for this could be that daily observations capture smaller incremental changes, whereas data calculated over larger intervals show higher deviations between the starting price and the next price, leading to larger return movements.

The next step was to look at the hedge fund data observed over a monthly period. This also displays similar results to the equities, showing that the stressed value at risk has a higher risk measure in comparison to the VaR in the normal period.

In this scenario, the authors compared the VaR observed in Figure 5 over the total time (1995–2016) to a normal period. The previous VaR readings were −7% and −12% for the HFRI and DJCS, respectively, whilst the normal time VaR in this section was −1% and −2%, respectively, showing that it is much lower than the 1995–2016 period.

The final step in this analysis was to look at the stressed VaR for the bonds in Figure 6. It can be observed that the stressed VaR is higher than the normal period VaR, similar to the results for the other asset classes. To make the analysis more comprehensive, however, another angle is used to observe the data in Figure 6. When observing the H0A0, depicted in blue, the SVaR is significantly higher than the VaR. However, for the GBI-EM, which is normally distributed, the readings for the VaR and SVaR are very similar, with the VaR at −3.57% and the SVaR at −3.08%, a negligible difference of 0.49%. This shows that in the case of a normal distribution without fat tails, the returns are generally uniform even in stressed periods.

**Figure 6.** VaR versus SVaR bond index. Source: authors.

#### *4.9. Volatility Analysis/Time Series Analysis*

In this analysis, the aim is to calculate the weighted moving average (WMA) and the exponentially weighted moving average (EWMA), which gives the volatility of the returns. The weighted moving average examines the stationarity of the series. In this case, a 12-month equal-weighted moving average is calculated, and the forecast horizon is the current period. By analysing the time series, it is possible to take note of outliers, seasonal or cyclical trends and other patterns, all of which help in the prediction of future volatility in the asset returns. In addition, by looking at the time series, it is easier to identify the impact of the stress periods described earlier in this paper. The WMA is calculated using the formula below:

$$wma_t^k = \frac{\sum_{i=1}^{k} x_{t-i} w_i}{\sum_{i=1}^{k} w_i}$$

where:

*xt−i* = the value of the time series at time t − i.

*wi* = the weight applied to lag i.

*k* = the window length (here, 12 months).
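The WMA formula above can be sketched directly in NumPy; the short return vector and 3-period window are illustrative stand-ins (the paper uses a 12-month equal-weighted window):

```python
import numpy as np

def weighted_moving_average(x: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Rolling weighted mean: wma_t = sum_i(w_i * x_{t-i}) / sum_i(w_i), i=1..k."""
    k = len(weights)
    w = np.asarray(weights, dtype=float)
    out = np.full(len(x), np.nan)       # first k values undefined
    for t in range(k, len(x)):
        # x[t-k:t] holds x_{t-k}..x_{t-1}; reverse so weights[0] multiplies x_{t-1}
        out[t] = np.dot(w, x[t - k:t][::-1]) / w.sum()
    return out

returns = np.array([0.01, -0.02, 0.03, 0.00, 0.02, -0.01, 0.01, 0.02])
equal_w = np.ones(3)  # 3-period equal-weighted MA for illustration
print(weighted_moving_average(returns, equal_w))
```

With equal weights, this reduces to the simple moving average used in the paper's 12-month equal-weighted case.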
The EWMA is also calculated, and it shows the volatility over time of the asset and is calculated as follows:

$$
\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1 - \lambda) x_{t-1}^2
$$

where:

*xt* = the value of the time series at time t.

λ = the smoothing parameter (i.e., a non-negative constant between 0 and 1).
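The EWMA recursion above translates directly into a few lines of code. The returns below are simulated, and λ = 0.94 (the RiskMetrics choice for daily data) is an assumed illustrative value:

```python
import numpy as np

def ewma_volatility(returns: np.ndarray, lam: float = 0.94) -> np.ndarray:
    """EWMA variance recursion: sigma_t^2 = lam*sigma_{t-1}^2 + (1-lam)*x_{t-1}^2.
    lam=0.94 is the RiskMetrics convention for daily data."""
    var = np.empty(len(returns))
    var[0] = returns[0] ** 2                 # initialise with the first squared return
    for t in range(1, len(returns)):
        var[t] = lam * var[t - 1] + (1 - lam) * returns[t - 1] ** 2
    return np.sqrt(var)                      # return volatility, not variance

rng = np.random.default_rng(5)
r = rng.normal(0, 0.02, 250)                 # simulated daily returns
vol = ewma_volatility(r)
print(f"final EWMA volatility estimate: {vol[-1]:.4f}")
```

The recursion lets recent squared returns dominate the volatility estimate, which is why the EWMA lines in Figures 7-12 spike during the stress periods.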

The HEX equity index in Figure 7 tracks stocks in an EU nation, and it can be seen how volatile the returns were during the 2000 tech bubble and the 2008 financial crisis. These markets were directly affected by the state of the US economy, as their businesses and markets are interlinked. Generally, the trend, though volatile, is centred on the mean, signifying a stationary time series.

**Figure 7.** HEX monthly time series analysis, equity. Source: authors.

The time series in Figure 8 differs from the other equity returns, as it generally has stable volatility centred on the mean, with fewer spikes. The volatility during the 1997 Asian crisis could reflect the impact on the African nation of affected Asian investors. It was not really affected by the tech bubble, implying that most of its investors were probably not in the tech industry. The WMA is generally more stable and smoothed out.

**Figure 8.** BGSMDC monthly time series analysis, equity. Source: authors.

Figure 9 reveals that the HFRI hedge fund has a unique characteristic: it is structured with many fund managers and so tends to be more diversified, meaning that its volatility is around the mean and relatively stable. Clustering and large spikes appear in the returns line during the crisis periods.

**Figure 9.** HFRI monthly time series analysis, hedge fund. Source: authors.

The DJCS in Figure 10 is also generally stable, since it is a hedge fund, but it experiences spikes during the stress periods because it is based on liquid securities. This means that any change in the financial markets is reflected in the returns. The WMA is relatively stable over time.

**Figure 10.** DJCS monthly time series analysis, hedge fund. Source: authors.

Figure 11 shows that the returns and the volatility are generally stable and only cluster around the stress period. There is little volatility because of the nature of bond pricing and its general predictability in comparison to equities. The notably high volatility results from the overall financial markets being disrupted during periods of stress.

**Figure 11.** H0A0 monthly time series analysis, bonds. Source: authors.

The GBI-EM in Figure 12 tracks the emerging markets' government bonds; by virtue of the debt being sovereign, it is generally stable, and the returns and volatility (EWMA) only spike in the stress period. The WMA is generally smooth and centred around the mean.

**Figure 12.** GBI-EM monthly time series analysis, bond index. Source: authors.

Overall, with the exception of the GBI-EM index (which seems to show steady, constant volatility), most of the time series exhibit non-constant volatility, shown by separate periods of high and low volatility. This is consistent with heteroskedasticity or conditional heteroskedasticity. It is therefore ideal to apply more advanced financial forecasting models, because the series behave as financial markets are expected to, exhibiting alternating periods of high and low volatility.

In applying time series analysis, it becomes necessary to pay attention to the model order, and this is where tests such as the serial correlation test are required. The focus now is on serial correlation, and this test is conducted with correlograms. It is also critical in identifying the model order used in models such as ARMA.

The ACF and PACF are used to detect the form of the time series process. For forecasting models to be applicable, the series must be a stationary process, which is a desirable trait. This can be observed from the correlograms below (see Figure 13).

The log returns do not exhibit strong interdependency, though lag order 1 shows marginal significance. To shed more light on this, it is critical to assess whether the time series exhibits white noise, which implies no serial correlation. As shown in Table 2, the white noise test was conducted and it showed that both the HEX and BGSMDC tested false, implying that there is no white noise and so the series is serially correlated.

For the hedge fund plots (Figure 14), a similar pattern to that of the equities shows that the returns do not exhibit strong interdependency, though lag order 1 is significant for the ACF, and the PACF for the HFRI has three significant lags. The statistics summary in Table 2 shows that the white noise test also returned false for both assets, which means the returns of each asset are also serially correlated.

**Figure 13.** ACF and PACF plots for equities. Source: authors.

**Figure 14.** *Cont*.

**Figure 14.** ACF and PACF plot for hedge funds. Source: authors.

Figure 15 illustrates the bond correlograms. Firstly, it shows that the corporate bond index, H0A0, has one significant ACF lag and four significant PACF lags, namely lags 1, 2, 3 and 13. This result clearly shows that the asset exhibits serial correlation. To affirm this, Table 2 also shows false results for white noise, further confirming that it is serially correlated.

However, when looking at the GBI-EM index, there are only a few significant lags for the ACF (7 and 13) and PACF (7 and 12). This asset was already affirmed to be normally distributed, and the white noise test returned a true result, implying that the distribution is not serially correlated.

After looking at the correlation of the returns, it is also necessary to look at the squared values to assess whether they are correlated. This assists in deciding on the best model to use. Details on the ARCH effect are summarized in Table 2, which shows that for the bonds, the HFRI hedge fund and the HEX equity index, the ARCH effect is true, meaning that the squared values are correlated. The opposite holds for the BGSMDC and DJCS, which tested false.

**Figure 15.** *Cont*.

**Figure 15.** ACF and PACF plot for bonds. Source: authors.

#### *4.10. Correlation Test—Portfolio*

In this test, the authors carried out correlation tests between two time periods for each asset. For the purpose of uniformity, the same parameters applied in the stressed value at risk test are applied here as well.

By assessing Table 3, various pieces of information can be deduced, firstly by looking at the correlation of assets within the same asset class. The equity indices BGSMDC and HEX are positively correlated under normal conditions, moving together with a coefficient of 0.05, but during the stress period this relationship is reversed, and the correlation becomes negative at −0.1. This means that under normal market conditions they almost move together, though only weakly, and during stress they move away from each other.

The next class is the hedge funds, which both have a strong positive correlation that only strengthens during stress periods. In normal conditions, the correlation is 0.85, and it is 0.96 in a stressed scenario. For the bonds, it can be seen that the correlation coefficient is fairly stable and positive at 0.69 in normal conditions and 0.61 in the stressed period.
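Pairwise correlations over a normal versus a stress window, as in Table 3, can be sketched with pandas; the two-asset return frame below is simulated with a shared common factor (assumed data, not the paper's indices):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)

# Illustrative two-asset monthly returns driven by a common factor,
# loosely mimicking two hedge funds in the same asset class.
n = 120
common = rng.normal(0, 0.02, n)
fund_a = 0.8 * common + rng.normal(0, 0.01, n)
fund_b = 0.8 * common + rng.normal(0, 0.01, n)
returns = pd.DataFrame({"fund_a": fund_a, "fund_b": fund_b},
                       index=pd.date_range("2000-01-31", periods=n, freq="M"))

# Correlation matrices over the normal and stress windows used in Section 4.8.
normal_corr = returns.loc["2003-01":"2005-12"].corr()
stress_corr = returns.loc["2007-08":"2009-12"].corr()
print("normal-period correlation:\n", normal_corr.round(2))
print("stress-period correlation:\n", stress_corr.round(2))
```

Comparing the off-diagonal entries of the two matrices is exactly the normal-versus-stress comparison reported for each asset pair in Table 3.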

*J. Risk Financial Manag.* **2021**, *14*, 540


**Table 3.** Correlation matrix for all assets. Source: authors.

Looking at combinations across all assets, the most drastic changes in correlation were observed between the HEX-BGSMDC equity pair and the HEX equity-GBI-EM bond pair, for both of which the correlations flipped sign. However, for both combinations the initial coefficients were close to zero, so they did not have a strong relationship to start with. Even during the stress period, the coefficients remained close to zero, with the BGSMDC-HEX pair moving from a positive 0.05 to a negative 0.1 and the HEX-GBI-EM pair moving from a negative 0.01 to a positive 0.19.

Overall, the results from this test are inconclusive, showing that the behaviour of assets does not rely on the class but on how they relate on an individual level with each other. In coming up with a portfolio, one would have to test all the assets, as was carried out in this section, and observe how each asset relates to the other so that the best risk measures can be applied to cater for periods when the normal volatility of the assets shifts in stress periods.
