1. Introduction
Forecasting oil prices is an important problem in the energy market. It is crucially important for both oil-importing and oil-exporting countries. Moreover, oil prices are a key factor in many macroeconomic forecasts. Unfortunately, this task happens to be very hard. One of the reasons is the very high complexity of the oil market. As a result, there is no fixed or even commonly accepted forecasting technique [
1].
Various forecasting methods have been developed for oil prices, for example, time-series models, financial models, structural models, qualitative models, artificial neural network-based models, and many other sophisticated techniques. However, none of these has been found to be superior to the others. Therefore, many institutions (for example, the Eurosystem/ECB staff macroeconomic projections and the International Monetary Fund) focus mostly on predictions based just on futures contracts [
2]. Unfortunately, the predictive power of such a method is not satisfactory. Moreover, such forecasts are usually worse than naïve forecasts [
3,
4].
Therefore, developing a forecasting method for the oil price is still an open and challenging task. Herein, a novel Bayesian method, i.e., Dynamic Model Averaging (in short: DMA), is presented. This method starts from a considerable number of simple regression models (it is not known a priori which model is the best). Next, in each period, the forecasts produced by each of these regression models are given weights, and the weighted average forecast is computed [
5]. In particular, the important advantage of DMA is that both the state space model and models’ regression coefficients can vary in time.
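To make the weighting step concrete, the following is a minimal Python sketch of one recursive update of the model weights with a forgetting factor. It assumes the forgetting-based recursion commonly used for DMA (the previous weights are raised to the power alpha and renormalized, then multiplied by each model's predictive density). The function name `dma_step`, the Gaussian predictive density and the common standard deviation `sigma` are illustrative assumptions, not the implementation used in this paper, and the updating of regression coefficients inside each model is omitted.

```python
import math


def dma_step(weights, forecasts, y, sigma, alpha=0.99):
    """One recursive weight update for K candidate models (illustrative sketch).

    weights   -- model probabilities carried over from the previous period
    forecasts -- each model's one-step-ahead point forecast of y
    y         -- the value actually observed in this period
    sigma     -- assumed common predictive standard deviation (a simplification)
    alpha     -- forgetting factor; alpha = 1 means no forgetting
    """
    # Prediction step: raising weights to the power alpha flattens them slightly.
    powered = [w ** alpha for w in weights]
    total = sum(powered)
    predicted = [p / total for p in powered]

    # The combined forecast is the weighted average of the individual forecasts.
    combined = sum(w * f for w, f in zip(predicted, forecasts))

    # Update step: reweight each model by its (assumed Gaussian) density at y.
    densities = [
        math.exp(-0.5 * ((y - f) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        for f in forecasts
    ]
    unnormalized = [w * d for w, d in zip(predicted, densities)]
    total = sum(unnormalized)
    updated = [u / total for u in unnormalized]
    return combined, updated


# Two hypothetical models with equal starting weights; the observation is 11.
forecast, new_weights = dma_step([0.5, 0.5], [10.0, 20.0], 11.0, sigma=5.0)
```

In this made-up example the first model forecasts closer to the observed value, so its weight increases for the next period.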
Indeed, various studies have shown that the significant determinants of oil price might vary in time [
6,
7,
8,
9,
10]. Therefore, it seems interesting to consider a methodological framework in which several potentially important oil price drivers are examined. For example, DMA estimates certain time-varying posterior probabilities, which can be used to quantify the importance of the considered drivers in influencing the oil price.
It is worth noting that such an approach has recently been applied in economics and finance [
11,
12,
13,
14,
15,
16,
17]. However, DMA has not yet been studied extensively. Naser [
18] has applied DMA to oil prices, but she concluded that the quality of prediction was not very good. Herein, it is argued that DMA actually performs very well for oil prices. Moreover, this research extends the previous applications of DMA. In particular, a more thorough examination of DMA in the context of the oil price is performed herein, rather than just a simple model estimation and its diagnostics. Indeed, several remarks on data preselection, within the particular context of the oil market, are formulated. It is shown that they can significantly improve the quality of prediction given by DMA. Therefore, this paper is an attempt to fill the existing gap in the literature.
Amongst the various conclusions derived from this research are the following: the Chinese economy has been an important oil price driver since the 1990s; the impact of market stress on the oil price decreased in the 2000s; generally, stock market indices, rather than fundamentals like supply and demand, play important roles as oil price drivers; there are some weak arguments in favor of speculation during the oil price surge in 2007–2008, but they might also be applied to periods when the oil price was more stable. Generally, besides a better quality of forecast, DMA brings new knowledge about the oil market. Technically, it has been found that data normalization is highly beneficial, that a reduction of the number of drivers sometimes leads to better forecasts, and that DMA is robust to the calibration of initial parameters.
The structure of this paper is as follows. First, a two-part literature review is presented. The first part gives a short review with arguments why a new method, especially one like DMA, can be useful in forecasting the oil price. The second part is devoted to preselecting potential oil price drivers, i.e., to finding which drivers have already been found useful in forecasting oil prices in previous research. Next, a short reminder of the DMA methodology is provided. Finally, DMA models are estimated and the outcomes are discussed. For the reader’s convenience, a Glossary is added at the end of this paper.
3. Data
Based on the above literature review, ten potential oil price drivers have been initially proposed. They cover stock market indices, volatility of stock markets, interest rates, economic activity, exchange rates, supply and demand, and inventories. All of them are presented in
Table 1.
Monthly data beginning in January 1986 and ending in December 2015 have been analyzed. As a result, each of the initially considered time-series consists of 360 observations. The detailed data description is given in
Table 1. For more information on data and its methodology the reader is referred to the diverse data providers (see the References and Data Sources (
Appendix A) sections at the end of the paper). The data frequency has been chosen as the highest one that allows several economically justified drivers to be merged. For example, economic activity is measured by Kilian’s index [
101], which is available at monthly frequency. If GDP-based measures were used instead, at most quarterly frequency would be possible. On the other hand, daily (or even higher frequency) data are easily available for stock markets, but for several interesting oil market factors the data are available at no higher than monthly frequency [
103]. As a result, the monthly frequency has been chosen to obtain a consistent collection of time-series.
The spot price of crude oil was measured by WTI spot price (WTI), because according to Yu et al. [
34] this is the most common benchmark oil price. The originally obtained data on oil imports (IMP) are provided at weekly frequency. They have been aggregated to monthly frequency by taking mean values for the corresponding months. Following, for example, Bu [
104], Karali and Power [
105] and Kao and Wan [
106], strategic petroleum reserves (SPR) have been excluded from the data in order to measure private consumers’ demand only. Indeed, changes in SPR happen mostly for political reasons: they are influenced mainly by the occurrence of natural disasters and are, moreover, subject to geopolitical decisions. Generally, such an exclusion has been performed in numerous other studies of the oil market.
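The weekly-to-monthly aggregation described above can be sketched as follows. This is only an illustration: the function name `weekly_to_monthly`, the rule that each weekly observation belongs to the calendar month of its date, and all numbers are assumptions, since the text does not specify how weeks straddling two months were treated.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def weekly_to_monthly(observations):
    """Average weekly (date, value) pairs within each calendar month."""
    buckets = defaultdict(list)
    for day, value in observations:
        # Assumption: a week is assigned to the month of its reporting date.
        buckets[(day.year, day.month)].append(value)
    return {month: mean(values) for month, values in sorted(buckets.items())}


# Made-up weekly import figures, used only to show the mechanics.
weekly_imports = [
    (date(1986, 1, 3), 4.0),
    (date(1986, 1, 10), 5.0),
    (date(1986, 1, 17), 4.5),
    (date(1986, 1, 24), 4.5),
    (date(1986, 2, 7), 6.0),
    (date(1986, 2, 14), 5.0),
]

monthly_imports = weekly_to_monthly(weekly_imports)
```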
The equity market stress has been measured by VXO (the volatility index based on trading of S&P 100 options,
http://www.cboe.com/micro/vxo). Nowadays, such a measurement would rather be done with the help of VIX (i.e., a measure of market expectations of near-term volatility conveyed by S&P 500 stock index option prices,
http://www.cboe.com/micro/vix). However, the analysis provided in this paper dates back to 1986. The calculation methodology of VXO was changed in 2004, and the new index, namely VIX, would not be consistent with data for the whole analyzed period.
The variable (driver) CHI has been constructed in the following way. Shanghai Composite Index (
http://english.sse.com.cn/home) has been used from December 1990 onwards. It has been rescaled at December 1990 so as to join smoothly with the Hang Seng Index (
http://www.hsi.com.hk), which has been used for the period before December 1990. The Shanghai Composite Index is not available before December 1990. Fortunately, the Hang Seng Index is commonly seen as a back door to Chinese markets. Therefore, it can serve as a proxy for the Chinese economy before December 1990. However, for the period when the Shanghai Composite Index is available, it is better to use this index directly. Indeed, looking, for example, at data from 2012, it can clearly be seen that the Shanghai Composite Index and the Hang Seng Index behave somewhat differently. On the other hand, it seems that there is no better alternative for measuring the Chinese economy before 1990 for the purposes of this paper.
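One way to read the construction of CHI is as a simple splice of two series at a link month. The sketch below is an assumption about the mechanics (a single multiplicative factor that makes the two indices coincide in December 1990); the function name `splice_indices` and all index values are hypothetical.

```python
def splice_indices(early, late, link):
    """Glue two index series keyed by period labels such as '1990-12'.

    early -- proxy series (here: Hang Seng), used before the link period
    late  -- target series (here: Shanghai Composite), used from the link on
    link  -- period at which the late series is rescaled to match the early one
    """
    # Assumed rule: one multiplicative factor equates both series at the link.
    factor = early[link] / late[link]
    spliced = {period: value for period, value in early.items() if period < link}
    spliced.update(
        {period: value * factor for period, value in late.items() if period >= link}
    )
    return dict(sorted(spliced.items()))


# Made-up index levels, used only to show the mechanics.
hang_seng = {"1990-10": 2900.0, "1990-11": 2950.0, "1990-12": 3000.0}
shanghai = {"1990-12": 100.0, "1991-01": 110.0}

chi = splice_indices(hang_seng, shanghai, link="1990-12")
```

With labels in the `YYYY-MM` form, plain string comparison orders the periods correctly, which keeps the sketch free of date parsing.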
Generally, a lot of data are easily available for the U.S. In contrast, corresponding time-series for the whole world are usually missing. Therefore, following, for example, Hamilton [
91] and Kilian and Murphy [
94], U.S. data have been taken as satisfactory proxies.
Futures (NFP) have not been included in the constructed models, with the exception of one model, for the following two main reasons. First of all, following Alquist and Kilian [
4], they were used as an alternative forecast.
Such a practice is common in several financial institutions [
107]. Secondly, the initial test simulations of various models (not included herein) have indicated that the inclusion of NFP in DMA models does not significantly improve the predictions. As clarified later in the text, the data are kept in levels unless stated otherwise. However, a certain class of models is described later for which the data were rescaled to fit between 0 and 1.
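Rescaling a series to fit between 0 and 1 is usually done with min-max normalization, which the following sketch illustrates. Whether the minimum and maximum were taken over the whole sample is an assumption here, and the function name `min_max_normalize` and the numbers are hypothetical.

```python
def min_max_normalize(series):
    """Rescale a series linearly so that its minimum maps to 0 and its maximum to 1."""
    lowest, highest = min(series), max(series)
    if highest == lowest:
        # A constant series has no range to rescale.
        raise ValueError("constant series cannot be rescaled")
    return [(x - lowest) / (highest - lowest) for x in series]


# Made-up price levels, used only to show the mechanics.
normalized = min_max_normalize([10.0, 20.0, 15.0])
```

Because the transformation is linear, it changes the scale of each driver without altering the ordering or the timing of its movements.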
5. Results
According to
Section 4.4, “full” and “reduced” DMA models have been estimated. Drivers included in the “full” DMA models are presented in
Table 2, and the drivers retained in the “reduced” versions of the models are presented in
Table 3.
The mean squared errors (MSE) of all estimated models are presented in
Table 4. Additionally, for an easier overview, comparisons between “full” and “reduced” models, normalized and non-normalized models, and the ones with different forgetting factors are presented in
Figure 1,
Figure 2 and
Figure 3.
First of all, it should be noticed that in 75% of cases DMA has an advantage over BMA by producing forecasts of better quality (measured by MSE). However, in 25% of cases BMA is superior to any estimated DMA model. In 35% of cases DMA with
α = 0.95 is superior to DMA with
α = 0.99. But only in 25% of cases is DMA with
α = 0.99 superior to BMA and, simultaneously, DMA with
α = 0.95 superior to DMA with
α = 0.99. In other words, only in these cases does a smaller forgetting factor consistently lead to a smaller MSE. In 35% of cases DMA with
α = 0.99 produces a smaller MSE than BMA and a smaller one than DMA with
α = 0.95 (see
Figure 1 and
Table 4).
Secondly, if for a given forgetting factor the “reduced” and the “full” version of a model are compared, it turns out that in only 1/3 of cases the “reduced” version produces a smaller MSE than the “full” version. However, if only models which have produced an MSE smaller than the benchmark forecasts (i.e., the naïve forecast and 1-month futures) are considered, then 52% of them are models in the “reduced” version. Generally, only 38% of the constructed models have produced an MSE smaller than the benchmark forecasts (see
Figure 2 and
Table 4).
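The comparison against the naïve benchmark can be sketched as below. The naïve forecast is taken here to be the no-change forecast (the previous observed value), which is the usual meaning of the term; the function names and all numbers are hypothetical and serve only to show how the two MSE values are obtained.

```python
def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(series):
    """No-change forecast: each value is predicted by the previous one."""
    return series[:-1]


# Made-up prices and made-up model forecasts for periods 2 to 4.
prices = [20.0, 22.0, 21.0, 25.0]
model_forecasts = [21.5, 21.5, 24.0]

actual = prices[1:]
mse_naive = mse(actual, naive_forecast(prices))
mse_model = mse(actual, model_forecasts)
beats_benchmark = mse_model < mse_naive
```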
In 70% of cases a given model has produced smaller MSE for normalized data than for non-normalized data (see
Figure 3 and
Table 4). In various cases it has happened that a non-normalized model has MSE no smaller than benchmark forecasts, but a normalized version of this model has MSE smaller than benchmark forecasts. Therefore, the improvement gained from normalization can sometimes even decide whether a model can beat benchmark forecasts (i.e., produce smaller MSE).
It is worth noting that in the existing financial applications of DMA [
11,
13,
14,
15,
16,
17,
18] explicit data normalization has not been considered. The original time-series have usually been taken in 1st differences in order to obtain stationarity. Stationarity is a necessary assumption for ordinary regression, but it is not necessary for DMA. Although taking 1st differences of variables is a common practice in economics and finance, it should be stressed that in DMA it is not required from the theoretical point of view.
Of course, the outperformance of benchmark forecasts by selected DMA models is quite small (the best of the estimated DMA models lowers MSE by approximately 10% in comparison with the naïve forecast, and by 8% in comparison with the 1-month futures forecast). In particular, from comparing Model 2 and Model 3 it can be seen (
Table 4) that adding NFP (futures prices) does not improve the forecast quality. However, amongst all the considered models it is Model 4 with
α = 0.99 and with normalized data which is characterized by the smallest MSE (the difference between the “full” and the “reduced” version is in this case negligible).
However, outcomes are not robust to the selection of the forgetting factor
α. Amongst models with
α = 0.95 it is the “reduced” version of Model 5 with normalized data which minimizes MSE. For α = 0.99 it is both the “reduced” and the “full” version of Model 4 with normalized data. For
α = 1 it is the “reduced” version of Model 4 with normalized data. From
Table 3 it can also be seen that when working with non-normalized data, the algorithm described in
Section 4.4 tends to exclude drivers more often than if models with normalized data are applied. As the aim of this research is to find drivers of oil price, this serves as another argument in favor of normalizing data.
It would be desirable to select one model amongst all the estimated models, i.e., the one which behaves “the best”. First of all, it is reasonable to consider only models which outperformed the benchmark forecasts. Secondly, it would be desirable if the model’s errors (see
Table 4) would not depend on the forgetting factor
α. Indeed, there is such a model, i.e., Model 5 in the “reduced” version with normalized data. Clearly, it is the most robust model against changing the forgetting factor, even if all estimated models are considered (not only those outperforming naïve forecast and futures forecast). Interestingly, it is also the second “best” model under the criterion of minimizing MSE. (But the “best” one has only 1% smaller MSE.) It should be noticed that this model is the one, which, first, consists of normalized data; secondly, is in the “reduced” version; and, thirdly, has been obtained by the off-line structure estimation algorithm by Karny and Kulhavy [
111] with an assumption that regression coefficients do not vary (see
Section 4.4).
Finally, it is stressed that the model which minimizes MSE is the one with the forgetting factor α = 0.99 (as well as the above described, the “best”, model). Indeed, this observation supports the hypothesis that DMA can be a useful method in oil price forecasting.
Therefore,
Figure 4 presents the probabilities
pt(
X) described in
Section 4.5. In
Figure 4 they are presented for the above chosen, the “best” model, i.e., Model 5 with normalized data, in the “reduced” version, and with the forgetting factor
α = 0.99. These probabilities express the probability that a driver
X is useful for forecasting oil price at time
t based on weights attached by DMA to regression models which include this driver.
As a kind of robustness check, this model was also estimated with the WTI oil price replaced by the Brent oil price (BRENT). It can be seen that the outcomes are quite similar. Indeed, it should be noticed that
pt(
X) start from the same value of 0.5, i.e.,
p0(
X) = 0.5 for every
X. This is just a direct consequence of Equation (3). Afterwards, DMA “learns” from the incoming new data. Therefore, it is crucial to use DMA for sufficiently long time-series. Of course, this requirement is met in the analysis presented herein. A period of 30 years is covered, with 360 observations. Approximately the first 20% of observations play a “learning” role for the models. This can be seen in
Figure 4 as
pt(
X) adapt their values quickly. However, it is not the exact values of
pt(
X) which are important to interpret, but their time-paths. In other words, from the economic point of view it is interesting to observe how the probability that a given driver is important in forecasting the oil price varies in time.
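The probabilities discussed above can be illustrated by summing the weights of all regression models that contain a given driver. The sketch below assumes equal initial weights over all subsets of drivers, which is one setting that reproduces the starting value of 0.5; the function name `inclusion_probability` and the three-driver example are hypothetical.

```python
from itertools import product


def inclusion_probability(models, weights, driver):
    """Sum the weights of all models whose variable set contains the driver."""
    return sum(w for model, w in zip(models, weights) if driver in model)


# Every subset of three illustrative drivers defines one regression model.
drivers = ["MSCI", "VXO", "CHI"]
models = [
    tuple(d for d, keep in zip(drivers, mask) if keep)
    for mask in product([0, 1], repeat=len(drivers))
]

# Assumption: all models start with the same weight.
equal_weights = [1 / len(models)] * len(models)
p0 = {d: inclusion_probability(models, equal_weights, d) for d in drivers}
```

Each driver appears in exactly half of the subsets, so with equal weights every starting probability equals 0.5, and any later departure from 0.5 reflects what the weights have learned from the data.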
Usually, researchers divide samples into “learning” and “testing” periods. Herein, as already mentioned, the “learning” period consists of the first 20% of observations, and the “testing” period of the remaining 80%. However, DMA is a recursively estimated model, in which the model adaptation takes place every time new information is added. Therefore, no fixed coefficients are estimated during the “learning” period to be used in the “testing” period. The “learning” period is rather a period excluded from further evaluation, during which DMA adapts its parameters from the starting values. In other words, it is the time given to DMA to “catch the signal”. Later on, the model still continually changes its parameters. However, it then “catches the changes in the signal” rather than still trying to “catch the signal itself”.
First of all, it can be seen that stock markets played an important role as an oil price driver between 1992 and 2000. This observation is consistent with previous research. Later, this role was decreasing until around 2005. In 2008 it suddenly increased, but since around 2013 it has kept declining. Therefore, it can be seen that during the oil price surge in 2007–2008 stock market behavior played an important role.
Market stress played an important role until around 2005. Since then, its role as an important oil price driver started to decrease, and this continued until 2007. This means that before the beginning of the recent global financial crisis and the oil price surge, investors were not paying much attention to market risk. Indeed, many estimated DMA models gave marginal values of the posterior inclusion probabilities of the 1st and the 2nd lag of VXO shortly before 2007. Later, the role of this driver suddenly increased, and it kept increasing until around 2012. Recently, its role as an important oil price driver has been decreasing.
The role of the Chinese economy was systematically increasing between 1992 and 2000. Later, its role started to decline, but since around 2005 it has been increasing again. These observations confirm that China has become an important player on the oil market. Moreover, this importance was already present in the 1990s.
The role of interest rates was increasing between 1992 and 2000. Later, it started to decrease. Their role as an important oil price driver started to increase around 2009, but recently it has been declining again.
The role of exchange rates follows a rather stable time-path, with just some slight oscillations. For example, their role as an important oil price driver was increasing before 2000. Later, it was slightly decreasing until 2007.
The role of global economic activity as an important oil price driver was increasing between 1992 and 2000. According to DMA models, it was playing an important role until around 2010. Later, its role started to decrease.
The role of supply forces increased between 2000 and 2006. It can be seen that their role decreased shortly before the oil price surge in 2007–2008. During this surge their role suddenly increased, but recently it has started to decline again.
The demand forces (measured by consumption and import quotas) lead to conclusions similar to each other. Around 2000 their role as important oil price drivers increased. Later, their role decreased. Around the recent global financial crisis and the oil price surge, their role suddenly increased again. Recently, their role has also increased.
Interestingly, the role of inventories as an important oil price driver increased between 1995 and 2004. However, in 2005 its role decreased. Suddenly, in 2007 its role increased again, but since around 2009 it has been decreasing. Just recently, its role has started to increase again. This can serve as a weak argument in favor of the previously mentioned hypothesis about the role of speculation on the oil market in the late 2000s.
Futures prices played an important role as an oil price driver in the 1990s. However, in the 2000s (except for some small peaks around 2005) they did not play an important role. Since 2009, however, it can be observed that their role has been systematically increasing.
Finally, it can be seen that the autoregressive component, i.e., lags of WTI, plays an important role as an oil price driver. The posterior inclusion probability of the 1st lag of WTI decreased only occasionally, for example, around 2001, 2005 and 2010. However, all decreases (except the one around 2001) were compensated by increases of the posterior inclusion probability of the 2nd lag of WTI. Therefore, the autoregressive component has almost always played an important role as an oil price driver. If it has not been the 1st lag, then it has been the 2nd lag. For example, in the 1990s both lags played an important role. It is interesting to notice that a decrease in the importance of both lags has been observed around the oil price surge. In other words, autoregressive models, which are highly common and advocated in the literature, became less useful in this period, i.e., other drivers took the leading role then.
Similar interpretations can be based on selecting drivers for which pt(X) is greater than 0.5. It can be seen that during the 1990s the main drivers of the oil price were developed stock markets, the Chinese economy and autoregressive components. Later, in the 2000s, the importance of these drivers decreased. In particular, the market stress index became less important as an oil price driver. During the oil price surge the Chinese economy was an important driver. Later, its role decreased, but recently it has been increasing again. Recently, the role of futures prices has also been increasing. They played an important role in the 1990s, but later (in the 2000s) their role decreased.
Summarizing the above considerations, it can be seen that in different periods, different drivers play an important role in oil price forecasting. This is an important and very characteristic advantage of DMA models. Apart from the fact that some of them have produced smaller errors than the benchmark forecasts, DMA models dynamically change the weights ascribed to regression models. In other words, as the market situation changes, DMA is able to select the most important drivers for the modelled time-series.
The forecast from the selected model, i.e., Model 5 for normalized data and in the “reduced” version, was compared with forecasts from some other models. The other DMA models were taken with the same forgetting factor as Model 5 (i.e., 0.99) and also in the “reduced” versions. The comparison was done with the Diebold-Mariano test [
119]. This test was chosen because it relies on relatively few assumptions and is quite popular. The results are presented in
Table 5. The null hypothesis of this test is that the forecast accuracies of both methods are equal. The alternative hypothesis is presented in the rows of
Table 5. It can be seen that, assuming 5% significance level, it cannot be said that the selected model produced significantly more accurate forecast than BMA, 1-month futures or Model 4. However, it can be said that the selected model produced significantly more accurate forecast than Equal-Weighted Averaging, naïve method and Model 1, Model 2 and Model 3.
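The mechanics of such a forecast comparison can be sketched as follows. This is a simplified, textbook-style version of the Diebold-Mariano statistic for one-step-ahead forecasts under squared-error loss, with a normal approximation and no autocovariance correction; it is not claimed to be the exact variant used to obtain the reported results, and the function name and the forecast errors are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, pvariance


def diebold_mariano(errors_a, errors_b):
    """One-sided Diebold-Mariano test (simplified sketch).

    H0: both forecasts are equally accurate.
    H1: forecast A is more accurate than forecast B.
    Returns the test statistic and the p-value under a normal approximation.
    """
    # Loss differential under squared-error loss; negative values favor A.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    # For one-step-ahead forecasts no autocovariance terms are added here.
    statistic = mean(d) / sqrt(pvariance(d) / n)
    p_value = NormalDist().cdf(statistic)
    return statistic, p_value


# Made-up forecast errors of two competing methods.
errors_a = [0.1, -0.2, 0.1, 0.3, -0.1, 0.2]
errors_b = [1.0, -1.5, 0.8, 1.2, -0.9, 1.1]
statistic, p_value = diebold_mariano(errors_a, errors_b)
```

A p-value below the chosen significance level (5% in the text) rejects equal accuracy in favor of forecast A; with real data, small samples and longer horizons call for the usual corrections, which are deliberately left out of this sketch.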
Additionally, to illustrate the practical application, it was checked from the investor’s perspective whether DMA can be used as a kind of investment strategy. For this, Model 1 in the “reduced” version and with normalized variables was taken. The forgetting factor was set to
0.99. The simple strategy was constructed in the following way. If DMA predicted the oil price increase in the next month, the investor should buy oil. Otherwise, he or she should “buy” MSCI index. This strategy is called DMA in
Table 6.
The benchmark strategy is simply to buy oil and sell it one month later, i.e., it corresponds just to holding oil. This strategy is called “hold oil” in
Table 6. The third strategy considered was to buy oil if 1-month futures prices predicted its increase, and otherwise to “buy” the MSCI index.
The results are reported in
Table 6. First of all, it should be noticed that the strategy based on futures prices generates on average a loss.
Table 6 reports mean monthly returns from the given strategy, standard deviations of these returns, and the Sharpe ratio, i.e., the ratio of mean to standard deviation. Higher values of the Sharpe ratio are preferred, as they correspond to a higher expected return under the same risk, or to the same expected return under a smaller risk. It can be seen that the DMA-based strategy yields, first, slightly higher returns and, secondly, a smaller risk in comparison with the benchmark strategies. Consequently, the Sharpe ratio of this strategy is approximately 45% higher than that of the benchmark strategy “hold oil”.
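The switching rule and the Sharpe ratio described above can be sketched in a few lines. The sketch uses the definition given in the text (mean return divided by the standard deviation of returns, without a risk-free rate); the use of the sample standard deviation, the function names and all return figures are assumptions made for illustration.

```python
from statistics import mean, stdev


def switching_returns(predicted_up, oil_returns, index_returns):
    """Hold oil in months with a predicted price increase, else hold the index."""
    return [
        oil if up else index
        for up, oil, index in zip(predicted_up, oil_returns, index_returns)
    ]


def sharpe_ratio(returns):
    """Ratio of the mean return to the (sample) standard deviation of returns."""
    return mean(returns) / stdev(returns)


# Made-up monthly signals and returns, used only to show the mechanics.
predicted_up = [True, False, True, False]
oil_returns = [0.05, -0.04, 0.03, -0.02]
index_returns = [0.01, 0.02, -0.01, 0.01]

strategy = switching_returns(predicted_up, oil_returns, index_returns)
sharpe_strategy = sharpe_ratio(strategy)
sharpe_hold_oil = sharpe_ratio(oil_returns)
```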
6. Conclusions
In this paper it has been discussed how Dynamic Model Averaging can help in forecasting and finding drivers of crude oil prices in a time-varying context. In particular, this method allows for uncertainty about both the model’s state space and the models’ parameters (regression coefficients). It has been found that Dynamic Model Averaging can slightly improve the quality of prediction for the crude oil price in comparison with alternative methods. In particular, a 10% decrease in mean squared error (MSE) has been found. It should be stated that Dynamic Model Averaging has turned out, in most cases, to produce smaller errors than its predecessor, i.e., Bayesian Model Averaging. In other words, it has been found that forgetting usually reduces the size of the forecast error. Although the improvement is not spectacular, it still seems interesting, because there is no consensus amongst researchers and practitioners as to which forecasting method for the oil price is the best one.
Technically, it has also been found that data should be normalized (rescaled to fit between 0 and 1) before being fed into Dynamic Model Averaging. The common practice is to transform data in order to obtain stationarity. This is not required in Dynamic Model Averaging. On the other hand, arguments have been presented why normalization improves the Dynamic Model Averaging forecast. This might serve as non-trivial advice for future research, as the popularity of Dynamic Model Averaging has been growing in recent years.
Moreover, Dynamic Model Averaging produces certain time-varying weights (probabilities) which can be used to describe the importance of a given driver in oil price forecasting. In particular, this research has confirmed that (according to the Dynamic Model Averaging method) developed stock markets, market stress, Chinese economic growth, global economic activity, interest rates, exchange rates, oil futures prices and autoregressive behavior were the most important oil price drivers in the 1990s. The role of autoregressive behavior was especially important during the 1990s and the oil price surge in 2007–2008. In the 2000s the most important drivers were global economic activity, autoregressive behavior and, to a lesser degree, market stress and supply. During the oil price surge the important drivers were stock markets, market stress, production, consumption, inventories quotas and autoregressive behavior. Recently, the dominant drivers have been the Chinese economy, consumption, inventories quotas, futures prices and autoregressive behavior. It has been observed that the Chinese economy played an important role in impacting the crude oil price in the 1990s as well. Interestingly, the role of market stress was declining at the beginning of the 2000s, before the beginning of the recent global financial crisis.
As a result, it has been found that it is drivers from the equity market, rather than fundamental microeconomic or macroeconomic factors, which are useful in forecasting the crude oil price. It has also been found that adding autoregressive components is strongly preferred. This is an interesting result as, for example, in a recent study Kruse and Wegener [
120] indicated only Kilian’s index of global economic activity as a significant determinant of oil price persistence. Nevertheless, that research was focused on modelling the persistence. Moreover, the variables for further averaging were selected through initial testing of one-variable models, which could have excluded certain joint relations.
From the policymaking point of view, it seems that seeking a new swing producer of crude oil (after Saudi Arabia abdicated its role in 2014) is nowadays pointless. On the other hand, much of the volatility of the oil price can come from financial speculation. Therefore, ensuring regulation of commodity futures markets seems to be a reasonable direction. Still, the performed research has shown that the role of futures trading nowadays is similar to that in the 1990s. Paradoxically, shortly after the deregulatory Commodity Futures Modernization Act of 2000, speculative pressures on the oil price decreased. On the other hand, the connection between spot and futures prices clearly loosened greatly. Nevertheless, this research has shown rather a general financialization of the oil market through tight links with stock markets. Therefore, it is a rather complex relation, not one simply explained by speculation on futures. Also, for the U.S., the Strategic Petroleum Reserves should be kept at a relatively high level. On the other hand, monetary policies had a higher impact on the oil price in the 1990s than after 2000. It seems that oil price volatility should be confronted by reducing the overall demand for oil rather than by increasing demand elasticity. Within this context, the performed research advocates searching for alternative energy sources rather than extending drilling offshore or in wildlife areas (like, for example, the Arctic National Wildlife Refuge).
It has been found that in different periods, different drivers play a significant role as oil price drivers. The inclusion of various drivers is beneficial, because the averaging approach is then more flexible in capturing abrupt oil price changes. Initially, ten drivers have been considered in this research (without lags, therefore making 1024 regression models to be averaged). In contrast, the “best” estimated model consisted of three drivers, but with lags, making seven variables in total and 128 models being averaged. Within this context, it is clear that a prudent number of models has turned out to be preferable in the averaging procedure. However, this research has shown that Dynamic Model Averaging can produce interesting results starting from even three variables.
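The model counts quoted above follow from the fact that every subset of the candidate variables defines one regression model, which gives 2 to the power of the number of variables. A short sketch (with the hypothetical helper `count_models`) confirms the arithmetic by explicit enumeration.

```python
from itertools import combinations


def count_models(n_variables):
    """Count all subsets of n candidate variables, including the empty set."""
    return sum(
        1
        for size in range(n_variables + 1)
        for _ in combinations(range(n_variables), size)
    )


models_from_ten_drivers = count_models(10)
models_from_seven_variables = count_models(7)
```

The count doubles with every added variable, which is why a moderate reduction of the variable set shrinks the averaging problem so sharply.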
Although numerous variations of Dynamic Model Averaging have been applied, the results from single models are consistent with each other. In other words, Dynamic Model Averaging has turned out to be robust to different parameter settings, and even to changes in the initial set of potential oil price drivers. This shows that the considered method is worth further study.