*3.2. Method*

#### 3.2.1. Signals and Excess Return of Simple Moving Average Strategies

The simple moving average (SMA) crossover is by far the most widely used technical trading rule (TTR) [33]. The traditional SMA rule issues buy (sell) signals when the short-period moving average rises above (falls below) the long-period moving average by a pre-specified band, expressed as a level or percentage and often set to zero in investment practice. As such, when the short-period moving average (of length *S*) exceeds the long-period moving average (of length *L*), a buy signal is issued as follows:

$$\left[\sum\_{\lambda=1}^{S} P\_{t-(\lambda-1)}/S\right] > \left[\sum\_{\lambda=1}^{L} P\_{t-(\lambda-1)}/L\right] \Rightarrow \text{Buy at time } t \tag{2}$$

where *P*<sub>*t*</sub> is the price at time *t*, and the band equals zero.

Sell signals are generated when the short-period moving average (*S*) is below the long-period moving average (*L*):

$$\left[\sum\_{\lambda=1}^{S} P\_{t-(\lambda-1)}/S\right] < \left[\sum\_{\lambda=1}^{L} P\_{t-(\lambda-1)}/L\right] \Rightarrow \text{Sell at time } t \tag{3}$$
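The signal rules in Equations (2) and (3) can be illustrated with a minimal Python sketch. The function names, the made-up price list, and the convention of returning 0 when the two averages tie or when the long window is not yet filled are assumptions for illustration; the text only specifies the strict inequalities with a zero band.

```python
def sma(prices, window, t):
    """Simple moving average of the last `window` prices ending at index t."""
    return sum(prices[t - window + 1 : t + 1]) / window


def crossover_signals(prices, short, long):
    """Return +1 (buy), -1 (sell) or 0 (tie / not enough data) for each index."""
    signals = []
    for t in range(len(prices)):
        if t < long - 1:
            signals.append(0)  # assumption: no position until the long window is filled
            continue
        short_ma = sma(prices, short, t)
        long_ma = sma(prices, long, t)
        if short_ma > long_ma:
            signals.append(1)   # Equation (2): buy
        elif short_ma < long_ma:
            signals.append(-1)  # Equation (3): sell
        else:
            signals.append(0)   # assumption: a tie leaves the signal undefined
    return signals


prices = [10, 11, 12, 13, 12, 11, 10, 9]  # hypothetical prices
signals = crossover_signals(prices, short=2, long=4)
print(signals)
```

With these hypothetical prices the rule is long while prices rise and flips to short once the 2-period average drops below the 4-period average.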

Further, the excess return over a given benchmark produced by an SMA TTR is estimated as:

$$f\_{t+1} = \frac{1 + y\_{t+1} S\_1(X\_{1,t}, \beta\_1^\*)}{1 + y\_{t+1} S\_0(X\_{0,t}, \beta\_0^\*)} - 1 \tag{4}$$

where *S*<sub>1</sub> and *S*<sub>0</sub> are "signal functions" that take two permissible values, 1 for long trading positions and −1 for short trading positions. The signal value represents the total percentage of capital allocated at moment *t* in a trading position, which further implies a 100% allocation of capital at any moment in this trading system.

The signal function converts indicators *X*<sub>1,*t*</sub> or *X*<sub>0,*t*</sub> and parameters *β*<sub>1</sub><sup>∗</sup> or *β*<sub>0</sub><sup>∗</sup> in Equation (4) into trading positions. The numerator in the above equation represents the SMA technical rule to be tested, while the denominator represents the benchmark. Here, the buy-and-hold (BH) strategy, a traditional benchmark strategy in portfolio management, is the benchmark of choice.

Average excess return for a particular TTR is then estimated as:

$$\overline{f} = n^{-1} \sum\_{t=R}^{T} \hat{f}\_{t+1} \tag{5}$$

The parameters in Equation (4) are the lengths of the two moving averages (*n*<sub>1</sub> for the short MA and *n*<sub>2</sub> for the long MA, corresponding to *S* and *L* in Equations (2) and (3)). See Anghel and Tudor [34] for more detailed information on signals and excess returns of SMAs.
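As a rough illustration of Equations (4) and (5), the sketch below computes the per-period excess return of a rule over buy-and-hold and then averages it. The function names and the sample returns are hypothetical, and dividing by the number of computed excess returns is an assumption about *n* in Equation (5).

```python
def excess_returns(returns, rule_signals, benchmark_signals=None):
    """Equation (4): f_{t+1} = (1 + y_{t+1} * S_1) / (1 + y_{t+1} * S_0) - 1."""
    if benchmark_signals is None:
        # Buy-and-hold benchmark: always long, so S_0 = 1 at every t.
        benchmark_signals = [1] * len(rule_signals)
    f = []
    for t in range(len(returns) - 1):
        y_next = returns[t + 1]  # return realised after the signal at time t
        rule_leg = 1 + y_next * rule_signals[t]
        benchmark_leg = 1 + y_next * benchmark_signals[t]
        f.append(rule_leg / benchmark_leg - 1)
    return f


def mean_excess_return(f):
    """Equation (5): average of the per-period excess returns."""
    return sum(f) / len(f)


returns = [0.0, 0.02, -0.01, 0.03]   # hypothetical daily returns
rule_signals = [1, -1, 1, 1]         # hypothetical positions from an SMA rule
f = excess_returns(returns, rule_signals)
f_bar = mean_excess_return(f)
print(f, f_bar)
```

Whenever the rule is long, it holds the same position as the benchmark and the excess return is zero; the rule only differs from buy-and-hold on days when it is short.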

Although some pairs of parameters are popular in the literature and in practice, we avoid pre-setting them and instead run all rules using parameters ranging from 1–30 for *S* and 31–500 for *L* in the first subperiod, and parameters ranging from 1–15 for *S* and 16–120 for *L* in the subsequent pandemic subperiod. We restrict the parameter *n*<sub>2</sub> to a maximum value of 120 (representing approximately 6 months of trading) in the second subsample, which is consistent with practitioners' trading strategies based on TTRs (Menkhoff [17] showed that technical analysis is generally employed for trading decisions that do not exceed a horizon of 6 months). In the first subperiod, we permit a wider investigation and allow the second parameter to vary up to a maximum value of 500, which represents approximately two years of trading.

Thus, for the larger pre-pandemic window:

$$n\_1 \in \{1: 30\}$$

and

$$n\_2 \in \{31:500\}$$

and therefore the total number of SMA TTRs tested on the 21 years of pre-pandemic data equals length(*n*<sub>1</sub>) × length(*n*<sub>2</sub>) = 30 × 470 = 14,100 for the parameter *β*<sub>1</sub><sup>∗</sup> in Equation (4).

Subsequently, for the smaller pandemic interval:

$$n\_1 \in \{1: 15\}$$

and

$$n\_2 \in \{16:120\}$$

corresponding to a total number of SMA TTRs tested during the 1¼ years of the pandemic timeframe equal to length(*n*<sub>1</sub>) × length(*n*<sub>2</sub>) = 15 × 105 = 1575 for the parameter *β*<sub>1</sub><sup>∗</sup> in Equation (4).

Hence, we first test 14,100 and subsequently 1575 crossover trading rules based on simple moving averages, computed as:

$$\text{short } SMA\_t = \frac{1}{n\_1} \sum\_{i=t-n\_1+1}^{t} X\_i \tag{6}$$

$$\text{long } SMA\_t = \frac{1}{n\_2} \sum\_{i=t-n\_2+1}^{t} X\_i \tag{7}$$

The function *S*<sub>1</sub> in Equation (4) then dynamically converts these moving averages into trading positions (long or short) according to the specified 14,100/1575 SMA TTRs.
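The size of the two rule universes can be checked by enumerating the parameter grids directly. This is only a sketch: the helper name `rule_grid` is hypothetical, and Python is used for illustration even though the estimations in this study were run in R.

```python
from itertools import product


def rule_grid(short_lengths, long_lengths):
    """All (n1, n2) pairs, one per SMA crossover rule."""
    return list(product(short_lengths, long_lengths))


# Pre-pandemic window: n1 in 1..30, n2 in 31..500
pre_pandemic_rules = rule_grid(range(1, 31), range(31, 501))
# Pandemic window: n1 in 1..15, n2 in 16..120
pandemic_rules = rule_grid(range(1, 16), range(16, 121))

print(len(pre_pandemic_rules), len(pandemic_rules))
```

Because the largest short length is always below the smallest long length, every pair in the grid satisfies *n*<sub>1</sub> < *n*<sub>2</sub> and no filtering is needed.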

R software was used to implement the method and perform estimations.

#### 3.2.2. Robustness Checks

The first step in our estimations consists of computing the average excess returns over the benchmark buy-and-hold trading strategy, as in Equation (5), produced by the 14,100/1575 SMA TTRs for each of the three energy markets (WTI, Brent and XLE) in the two sample periods (pre-pandemic and pandemic).

Secondly, the significance of excess returns produced by the 14,100/1575 SMA rules is tested.

To accomplish this task accurately, we must account for the high non-normality of the three energy markets. For non-normal distributions, assuming normality could lead to serious inference errors when estimating classical statistical significance diagnostics. All three energy series are highly non-normal, presenting highly leptokurtic distributions (see Table 2). Although this is expected from daily returns, especially in the case of crude oil markets, the results are nonetheless striking and show very large excess kurtosis for all three markets, both before and during the COVID-19 pandemic, and especially during the pandemic period. Leptokurtosis signifies that extreme returns occur more often than a normal distribution would predict, and estimations confirm this is indeed the case for the crude oil market (both WTI and Brent) and also for the energy fund XLE. Further, the Anderson–Darling (A–D) test is used to test the normality assumption for the three energy markets in the two sample periods. Results presented in Table 2 allow us to reject the null hypothesis of normality for all markets and all time periods.

**Table 2.** Distribution characteristics.


\* significant at 1%.

Thus, to deal with non-normality in our data when testing for significance, we implement the popular bootstrapping methodology proposed by Brock et al. [25] to estimate *p*-values, under a random walk assumption for the distribution of returns [35] for all three energy series. The null model is first fit to the empirical data and its parameters are estimated. The residuals are then randomly resampled 1000 times (Brock et al. [25] generated 500 random series in their original study) and combined with the model parameters to generate random price series that have the same characteristics as the original series. According to Brock et al. [25], the results do not differ significantly irrespective of which null model is employed (random walk, AR(1), GARCH-M, or EGARCH). Thus, for the null hypothesis, we continue with the random walk assumption in this study.

Hence, we first test whether the 14,100 SMA TTRs can generate excess returns for traders in the three energy markets during the pre-pandemic period. After estimating the excess returns produced by the 14,100 trading rules for the three energy series over 1999–2019, the bootstrapping methodology allows us to compare the excess return produced by a particular TTR on the real time series with the empirical distribution of excess returns obtained by applying the same 14,100 trading rules to 1000 time series simulated under the null of a random walk. To build these series, we sample with replacement from the original return series 1000 times for each energy market (WTI, Brent and XLE), obtaining 1000 simulated series per market, each of the same length as the original series (i.e., 5367 observations for the pre-pandemic period). We therefore produce three data frames, each with dimensions 5367 × 1000, on which the significance of each of the 14,100 TTRs is tested. In other words, for each of the three energy markets, the 14,100 TTRs are first applied to the real return series and subsequently to the 1000 simulated return series for that market. Finally, the returns for each trading rule and the mean return across trading rules are estimated.

The procedure is then replicated for the shorter COVID-19 window, producing three simulated data frames with dimensions 319 × 1000 (where 319 is the number of observations in the original series and 1000 the number of simulated time series).
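The resampling step can be sketched as follows. This is a simplified illustration that draws returns with replacement, as described above for the random walk null; the function name, the fixed seed, and the short made-up return series are assumptions, and the sketch stores the simulated series as a list of lists rather than a data frame.

```python
import random


def bootstrap_return_series(returns, n_series, seed=0):
    """Draw `n_series` simulated return series, each as long as the original,
    by sampling the observed returns with replacement."""
    rng = random.Random(seed)  # assumption: fixed seed for reproducibility
    n_obs = len(returns)
    return [[rng.choice(returns) for _ in range(n_obs)] for _ in range(n_series)]


observed = [0.012, -0.008, 0.003, -0.021, 0.017]  # hypothetical daily returns
simulated = bootstrap_return_series(observed, n_series=1000)
print(len(simulated), len(simulated[0]))
```

Applied to the actual samples, the same call would yield 1000 series of length 5367 for the pre-pandemic window and 1000 series of length 319 for the COVID-19 window, per market.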

The average return *f̄*<sup>∗</sup><sub>*b*</sub> is thus obtained by applying the TTRs to the simulated series, where *b* = 1, ... , *B* indexes the simulation among the *B* simulations performed. Here, *B* = 1000.

Next, for the pre-pandemic period, the significance of results is tested by comparing excess returns obtained on each of the three real energy markets with excess returns produced on the 3 × 1000 simulated return series, each of length 5367. The main idea underlying this bootstrap methodology is that for a trading rule to be statistically significant at the α level, it must generate more profit on the bootstrapped series than on the original series in fewer than a fraction α of cases (e.g., on fewer than 1% of the bootstrapped series for α = 0.01). The bootstrap *p*-value is then the proportion of the 1000 random series on which the buy-sell profit for the rule is greater than on the original series.

The same method is applied to the COVID-19 interval, for which 3 × 1000 simulated series, each of length 319, are produced.

Therefore, the estimated bootstrap *p*-value results from comparing the average real return *f̄* with the quantiles of the average simulated returns *f̄*<sup>∗</sup> = {*f̄*<sup>∗</sup><sub>*b*</sub>}, *b* = 1, ... , *B*. Hence:

$$\text{B random bootstrap } p \text{-value} = \frac{\sum\_{b=1}^{B} \mathbf{1}\_{\{\overline{f} < \overline{f}\_b^\*\}}}{B} \tag{8}$$
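Equation (8) amounts to counting how often a simulated average return beats the real one. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
def bootstrap_p_value(f_real, f_simulated):
    """Equation (8): share of simulated average returns that exceed the real one."""
    exceed = sum(1 for f_b in f_simulated if f_real < f_b)
    return exceed / len(f_simulated)


f_real = 0.5                        # hypothetical average excess return on real data
f_simulated = [0.1, 0.6, 0.4, 0.7]  # hypothetical averages on B = 4 simulated series
p_value = bootstrap_p_value(f_real, f_simulated)
print(p_value)
```

In this toy case two of the four simulated averages exceed the real one, so the rule would not be judged significant at conventional levels.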

Finally, we account for the inherent data-snooping bias by following the standard Reality Check (RC) procedure for data snooping proposed by White [30].

White [30] develops the Reality Check test, applied to the best model (here, the best performing TTR) selected from a large sample of previously tested models. The algorithm starts from performance measured relative to the benchmark, expressed here as the average excess return over the BH return. Thus, the first step consists of computing *f̄*<sub>1</sub>, the average excess performance of rule 1, followed by computing *f̄*<sup>∗</sup><sub>1</sub> = {*f̄*<sup>∗</sup><sub>1,*b*</sub>}, *b* = 1, ... , *B*, which is a vector of length *B* (the number of simulations or bootstrapped samples, here again set to 1000) containing the average excess performances of rule 1 on the simulated (bootstrapped) time series. Up to this point, the procedure is identical to the earlier random bootstrap *p*-value estimation.

Next, White [30] sets *V̄*<sub>1</sub> = *f̄*<sub>1</sub> and *V̄*<sup>∗</sup><sub>1,*b*</sub> = *f̄*<sup>∗</sup><sub>1,*b*</sub> − *f̄*<sub>1</sub>, *b* = 1, ... , *B*, so that the performance of rule 1 relative to the benchmark is tested by comparing *V̄*<sub>1</sub> with the quantiles of *V̄*<sup>∗</sup><sub>1,*b*</sub>. Similarly, for rule 2:

$$\overline{V}\_2 = \max\left\{\overline{f}\_2, \ \overline{V}\_1\right\} \tag{9}$$

and

$$\overline{V}\_{2,b}^\* = \max \left\{ \overline{f}\_{2,b}^\* - \overline{f}\_2, \ \overline{V}\_{1,b}^\* \right\} \tag{10}$$

where, as before, *b* = 1, ... , *B*. To test whether the best of rules 1 and 2 is better than the benchmark, *V̄*<sub>2</sub> is compared with the quantiles of *V̄*<sup>∗</sup><sub>2,*b*</sub>.

Thus, there is a recursive process of testing whether the best of the first *k* rules is superior to the benchmark, where *k* = 3, ... , *l* and *l* is the number of rules to be tested (here, *l* equals 14,100 for the first subperiod and 1575 for the second). The method thus implies comparing:

$$\overline{V}\_k = \max \left\{ \overline{f}\_k, \ \overline{V}\_{k-1} \right\} \tag{11}$$

with the quantiles of:

$$\overline{V}\_{k,b}^\* = \max \left\{ \overline{f}\_{k,b}^\* - \overline{f}\_k, \ \overline{V}\_{k-1,b}^\* \right\} \tag{12}$$

where *b* = 1, ... , *B* for each of the *l* rules until a conclusion can be reached about the best performing trading rule.

Formally, the Reality Check *p*-value can be expressed as:

$$RC \ p\text{-value} = \frac{\sum\_{b=1}^{B} \mathbf{1}\_{\{\overline{V}\_l < \overline{V}\_{l,b}^\*\}}}{B} \tag{13}$$

#### **4. Results and Discussion**

In Table 3, we present the parameters and performance (excess returns over the benchmark BH returns) for the best performing TTRs encountered on the three energy markets across the two subperiods (results for the pre-pandemic period are presented in Panel A, while results for the pandemic period are reported in Panel B). Random bootstrapping *p*-values resulting from 1000 iterations, together with the number of signals generated by the optimal TTR are also presented. Note that transaction costs are not included in the first estimations.

**Table 3.** The best TTRs' parameters and performance with no transaction costs and BH returns as benchmark.


\*\* denotes significance at the 5% level, \*\*\* denotes significance at the 10% level. <sup>1</sup> To be more suggestive, daily returns have been annualized such that for every market: annual excess return = [(1 + daily excess return)^252 − 1]. The benchmark return is the buy-and-hold return. <sup>2</sup> This represents the random bootstrapping *p*-values resulting from 1000 iterations across the three energy markets and the two subperiods. Even without adjusting for data-snooping bias, this approach is nonetheless relevant not only for comparative purposes with previous studies, but also as it helps in identifying the total number of TTRs that are profitable prior to data-snooping bias adjustment. A tested TTR is statistically significant at the 5% level if excess returns on the 1000 random bootstrapped series exceed excess returns on the original series less than 5% of the time.
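The annualization in the first table note can be written as a one-line function. The function name and the sample daily value are hypothetical; the 252 trading days come from the note itself.

```python
def annualize(daily_excess_return, trading_days=252):
    """Annual excess return = (1 + daily excess return)^252 - 1."""
    return (1 + daily_excess_return) ** trading_days - 1


# Hypothetical example: a daily excess return of 0.1%
annual = annualize(0.001)
print(round(annual, 4))
```

Compounding makes the mapping strongly non-linear, which helps explain why modest daily excess returns during the pandemic window translate into very large annualized figures.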

Results in Table 3 indicate that technical analysis appears to be considerably more profitable over the pandemic period than over the pre-pandemic period. Excess returns achieved by all 14,100 SMA crossover TTRs are negative in the pre-pandemic period for the WTI and XLE markets, with some small profits only for the Brent crude oil market (the optimal TTR achieves an annualized excess return over the buy-and-hold benchmark of about 2%, which is statistically significant, with a 1000 random bootstrap *p*-value of 0.037). Figure 4 shows the excess returns for all 14,100 tested TTRs for the Brent market over the 21-year pre-pandemic period. We show only the Brent market as it is the only one for which some over-performing rules exist. The chart makes clear that only a small number of strategies gain excess returns over the benchmark BH strategy for the Brent market in the pre-pandemic period. Indeed, estimations confirm that only 7 rules out of the universe of 14,100 (or approximately 0.05%) are over-performing during 1999–2019.

We thus far conclude that none of the 14,100 moving average crossover TTRs can generate excess returns on the WTI and XLE markets, suggesting that these two energy markets are weak-form efficient over the 1999–2019 period with respect to these technical indicators. However, the same 14,100 rules were able to achieve a statistically significant excess return, albeit rather small in magnitude, for the Brent market over the 1999–2019 pre-COVID-19 period, indicating that this market might present weak-form inefficiency.

**Figure 4.** Annualized excess returns achieved by all 14,100 SMA crossover rules for the Brent market over the 1999–2019 period.

In turn, the pandemic period presents consistent excess returns achieved by the 1575 tested SMA trading strategies for all energy markets, and especially for the Brent market, where an annualized excess return of over 284% is achieved by the best performing TTR, SMA (12, 17). However, for the WTI and XLE markets, this over-performance (annualized excess return of 20% for WTI and 83.45% for XLE) is not statistically significant when tested via the standard bootstrapping methodology (1000 random *p*-values of 0.182 and 0.148, respectively). For the Brent market, TTRs are again able to achieve superior and statistically significant predictability (1000 random *p*-value of 0.064).

Figure 5 therefore shows the excess returns for all 1575 tested strategies for the Brent market over the 1¼-year pandemic period. Again, only the Brent market is shown, as it is the only one where signs of inefficiency are present. While the best rule's performance indicates that strategies over-performing the BH strategy exist for all markets in the second subperiod, the number of out-performing strategies is particularly high for the Brent market. More precisely, 1528 out of the 1575 TTRs (more than 97%) achieve positive excess returns. Figure 5 confirms this: most of the strategies gain abnormal returns during the COVID-19 pandemic, whereas only 7 TTRs were found to be over-performing over the pre-pandemic period.

As mentioned earlier, these excess returns are statistically significant for the Brent market during both the pre-pandemic and the COVID-19 period (with 1000 random bootstrap *p*-values of 0.037 and 0.064, respectively). We also notice that during the pandemic period, the most successful SMA TTRs are those with shorter horizons for the long-run moving average, while the horizon for the short-run moving average varies across the three markets. For example, *n*<sub>2</sub> equals 16 (XLE), 17 (Brent), and 28 (WTI) when it is allowed to vary over the interval (16:120) during the pandemic period.

**Figure 5.** Annualized excess returns \* for all 1575 SMA crossover rules for the Brent market over the January 2020–March 2021 COVID-19 pandemic period.

There is also some variation in the trading frequency of the best performing trading rule across the three energy markets over the two subperiods. For example, during the pandemic window, the most profitable TTR in the WTI market signals only 8 trades over the 1¼-year period, whilst 16 and 18 trades are generated for Brent and XLE, respectively. Surprisingly, the best performing TTRs over the 1999–2019 period do not signal significantly more trades than over the much shorter pandemic period for WTI and XLE. By contrast, for the Brent market, the optimal pre-pandemic TTR is the short-term moving average rule SMA (5, 33), which generates a total of 220 trading signals over the 21-year period, whilst the optimal pandemic TTR generates only 16 trading signals, as seen above.

Although the analysis is performed ex-post and transaction costs have not been included at this point, the above results still indicate some predictive ability of technical indicators for the Brent market, especially during the pandemic period, which needs further investigation.

So, we test next for the economic significance of results and find that excess returns during the pandemic period remain abnormal for Brent when we include transaction costs in estimations. For the pre-pandemic window, excess returns disappear with the inclusion of trading costs.

Table 4 presents the excess return net of transaction costs over the benchmark buy-and-hold strategy for the best performing TTR on the Brent market during the pandemic period, along with its corresponding 1000 bootstrapped *p*-value and the data-snooping-adjusted RC *p*-value. Meanwhile, Figure 6 shows annualized excess returns net of transaction costs for all 1575 technical rules applied to the Brent market over the same period. The graph confirms that an overwhelming 96.2% of TTRs still over-perform (1515 out of 1575 tested TTRs) after trading costs of 5 basis points (bps) are considered. This implies that only 13 rules switch from over-performance to under-performance once transaction costs are included. Moreover, the over-performance is high in magnitude, with more than 92% of rules (1452 TTRs) achieving annualized excess returns of over 10%, and more than 31% of TTRs (493 rules) achieving annualized excess returns higher than 50%, while the best performing rule gains a 270% annualized excess return net of transaction costs over the BH strategy. On the other hand, the under-performance is far less severe: only 60 TTRs out of the total universe of 1575 tested over the COVID-19 period (or 3.8%) have no economic value relative to the benchmark BH trading strategy, the majority of which (34 TTRs or 56.67%) under-perform by less than 10% in annualized terms.
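The shares quoted in the paragraph above follow directly from the rule counts, as the short check below illustrates. The dictionary labels are descriptive names chosen for this sketch; the counts are those reported in the text.

```python
TOTAL_RULES = 1575  # SMA rules tested on Brent in the pandemic window

counts = {
    "over-performing after costs": 1515,
    "excess return above 10%": 1452,
    "excess return above 50%": 493,
    "under-performing after costs": 60,
}

# Share of each group in the full rule universe, in percent.
shares = {label: round(100 * n / TOTAL_RULES, 2) for label, n in counts.items()}

# Rules that over-perform before costs (1528) but not after costs (1515).
rules_switching_sign = 1528 - 1515

# Among the 60 under-performers, 34 lose less than 10% annualized.
mild_share_of_underperformers = round(100 * 34 / 60, 2)

print(shares, rules_switching_sign, mild_share_of_underperformers)
```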

**Table 4.** Excess returns of the best performing trading rule on the Brent market during the COVID-19 pandemic (January 2020–March 2021) net of transaction costs \*.


\* Includes trading costs of 5 bps. \*\* The buy-and-hold strategy is the benchmark.

**Figure 6.** Excess returns\* net of transaction costs of 5 bps for all 1575 SMA cross-over rules for the Brent market over the January 2020–March 2021 COVID-19 pandemic period.

When it comes to the statistical significance of the best TTR's performance, results hold strong when the bootstrapping methodology is applied (the 1000 random bootstrapping *p*-value of 0.064 is not affected by the inclusion of transaction costs in the estimation), but in turn the *p*-value resulting from the Reality Check test is no longer significant (RC *p*-value = 0.406). This indicates that the adjustment for data-snooping bias still has an important impact on the significance of results.

Overall, excess returns gained by the optimal TTR on the Brent market during COVID-19 do not remain significant after accounting for data-snooping bias via White's Reality Check test, but we feel the adjustment of the *p*-value via the RC procedure might be too severe and the procedure too conservative in this particular situation. We argue that the vast number of over-performing rules encountered on the Brent crude oil market over the COVID-19 pandemic period, together with the magnitude of this over-performance, compared with the small number of underperforming rules (60 TTRs out of 1575) and the "mild" relative underperformance (only 7 rules, or 0.44% of all TTRs, achieve relative losses higher than 50%, while the majority encounter losses of less than 10% relative to the benchmark), already mitigates the data-snooping bias.

Consequently, in light of the aforementioned arguments, one cannot completely exclude the possibility that this adjustment via the RC procedure might be too severe and thus we should not be too quick to eliminate the possibility that over-performing TTRs might exist on the Brent market during the COVID-19 pandemic.
