3.1. Data Processing
To test the benefits of SRI in the investment policy of the SIEFOREs, we used the investment policy of the S&P Mexican target risk index methodology [6] which, as S&P Dow Jones LLC states in the index presentation, tries to replicate the multi-asset performance of the four basic types of SIEFOREs, excluding the international debt, real estate, and commodity portfolio components presented in the CONSAR’s investment policy of Table 1.
As noted, this multi-asset policy excludes these types of securities, and the S&P Dow Jones methodology document does not clearly explain this exclusion, even though the company publishes benchmarks that measure the performance of these markets. To be consistent with the Mexico target risk methodology, we used the investment levels presented in Table 2 and Table 3 and set aside these types of securities herein.
Departing from this, we simulated the performance of the Mexican S&P target risk indexes with a different starting date than 28 August 2018. For this purpose, we used the investment levels of each index presented in Table 3. This is a simplified version of the investment levels suggested in the S&P Dow Jones [6] methodology of Table 2. As noted, we replaced the four redemption-term benchmarks of both the M government bonds (bonos M) and the Udibonos (inflation-linked bonds) with the two all-redemption-term benchmarks. The main reasons for these changes were:
- (1) We tested the performance of the investment policy since 17 January 2005. That is the starting date when, as mentioned in the introduction section of this study, the SIEFOREs started to invest in equities. The S&P target risk indexes started their values on 31 December 2008 and some benchmarks, such as the 10–20 and 20+ year M bonds, started their historical values on 1 October 2008 and 24 October 2006, respectively.
- (2) The same issue happens with the 10–20 and 20+ year Udibono indexes, which started to be measured on 4 January 2006.
As noted in Table 3, the S&P Global 100 was used instead of the S&P Global 1200. The main reason for this change is that the S&P Global 100 measures the performance of the 100 largest and most liquid companies of the S&P Global 1200. The drawback of the S&P Global 1200 is the possibility that some stocks might not be as liquid as expected.
From these indexes, we used the investment levels presented in Table 2 to simulate the performance of base 100 indexes (Bi,t) starting on 17 January 2005.
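As an illustration of how a base 100 index Bi,t can be built from component benchmarks and fixed investment levels, the following is a minimal Python sketch. It assumes daily rebalancing back to the target weights, which is a simplifying assumption and is not stated in the methodology as described here; the function name, the weights, and the price series are hypothetical.

```python
import numpy as np

def blended_base100(levels, weights):
    """Blend component index levels into one base 100 index.

    levels  : array of shape (T, K), one column per component benchmark
    weights : array of shape (K,), investment levels that sum to 1
    Assumes (hypothetically) daily rebalancing back to the target weights.
    """
    levels = np.asarray(levels, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("investment levels must sum to 1")
    # Simple daily return of each component benchmark.
    simple_returns = levels[1:] / levels[:-1] - 1.0
    # The blended return is the weighted sum of the component returns.
    blended_returns = simple_returns @ weights
    # Chain the returns, starting from 100 on the first date.
    return 100.0 * np.concatenate(([1.0], np.cumprod(1.0 + blended_returns)))

# Hypothetical two-component example (80% in one benchmark, 20% in another).
levels = np.array([[50.0, 200.0], [51.0, 190.0], [52.02, 209.0]])
index = blended_base100(levels, [0.8, 0.2])
```

A different rebalancing rule (for example, periodic instead of daily) would change the chaining step but not the overall structure.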
One of the drawbacks that we faced with the historical data of the S&P/BMV IPC sustainable (IPCS) is that the index has its first historical value on 28 November 2008, so the series does not show the performance that the index would have had during the financial crisis period of 2007–2008. In addition, using this shorter time series would have shortened the length of the simulation period. To solve this issue, we made a backward calculation of the index from 17 January 2005 by fixing the constituents as of the first day of calculation (28 November 2008) and using their market capitalization values. With this “synthetic” IPCS index, we could perform a historical simulation of the Mexican IPCS from 17 January 2005 to 31 July 2018.
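The backward calculation can be sketched as a capitalization-weighted index over a fixed set of constituents. The Python sketch below is only an approximation under stated assumptions: it ignores free-float factors, weight caps, and divisor adjustments, none of which are described in the text, and the function name and data are hypothetical.

```python
import numpy as np

def synthetic_cap_index(market_caps, base=100.0):
    """Back-calculate a cap-weighted index for a fixed constituent list.

    market_caps : array of shape (T, N) with the market capitalization of
                  each of the N fixed constituents on each of the T dates.
    The index level is the total capitalization relative to the first date.
    """
    caps = np.asarray(market_caps, dtype=float)
    total = caps.sum(axis=1)
    return base * total / total[0]

# Hypothetical capitalizations for two constituents over three dates.
caps = np.array([[10.0, 30.0], [11.0, 33.0], [9.0, 27.0]])
synthetic = synthetic_cap_index(caps)
```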
For simplicity of exposition, we summarize the simulated benchmarks in Table 4, giving the SIEFORE type and the Mexican equity index used in each.
For all the market indexes, we retrieved the historical index price data, Pt, from the databases of Reuters Eikon [45], Bank of Mexico [46], and VALMER [47] (the Mexican Stock Exchange price vendor) and calculated their base 100 values at the simulation’s start date (Bi,t), given the investment levels of each index in Table 3. We also downloaded the yearly historical rate in the secondary fixed income market of the 28-day CETES, a security that is considered the risk-free asset in Mexico. Once we had these historical simulated index values, Bi,t, we calculated the continuous time return, Δ%Bi,t, time series by following the next expression:
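Under the usual definition of a continuously compounded return, which is assumed here to be the one intended, expression (1) can be written as:

```latex
\Delta\% B_{i,t} = \ln\left(B_{i,t}\right) - \ln\left(B_{i,t-1}\right) \qquad (1)
```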
With this historical return data, Δ%Bi,t, we estimated the mean (μi) and standard deviation (σi) values and the next Gaussian log-likelihood function (LLF):
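Assuming the standard Gaussian log-likelihood over T return observations, with the symbols defined above, expression (2) takes the form:

```latex
LLF = -\frac{T}{2}\ln(2\pi) - \frac{T}{2}\ln\left(\sigma_i^{2}\right)
      - \frac{1}{2\sigma_i^{2}}\sum_{t=1}^{T}\left(\Delta\% B_{i,t} - \mu_i\right)^{2} \qquad (2)
```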
We assumed that the returns were normally distributed because we used daily simulations and because, even though a Student-t distribution could be more appropriate for financial time series, we were estimating the observed parameters over the full time series of our simulations (a total of 3407 daily observations). Had we executed a quantitative investment decision process, such as portfolio selection, algorithmic trading, or financial risk measurement, we would have tested the goodness of fit of other probability functions. Because we are measuring only performance and because we want to be consistent with the method used by Areal, Cortez, and Silva [34], we used a Gaussian LLF.
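The single-regime estimation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming maximum-likelihood estimates of the mean and standard deviation; the function names and the index levels are hypothetical and do not come from the study.

```python
import numpy as np

def log_returns(index_levels):
    """Continuously compounded returns from a series of index levels."""
    levels = np.asarray(index_levels, dtype=float)
    return np.diff(np.log(levels))

def gaussian_llf(returns):
    """Return (mean, std, log-likelihood) under a Gaussian assumption."""
    r = np.asarray(returns, dtype=float)
    n_obs = r.size
    mu = r.mean()
    sigma = r.std()  # maximum-likelihood estimate (divides by the sample size)
    llf = (-0.5 * n_obs * np.log(2.0 * np.pi)
           - n_obs * np.log(sigma)
           - ((r - mu) ** 2).sum() / (2.0 * sigma ** 2))
    return mu, sigma, llf

# Hypothetical base 100 index levels.
levels = [100.0, 101.0, 99.5, 100.2, 102.0]
mu, sigma, llf = gaussian_llf(log_returns(levels))
```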
Following this last assumption, we also assumed that the performance of the simulated benchmark could be modeled with a two-regime (S = 1,2) Gaussian Markov-switching model that uses the next log-likelihood function:
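A common way to write the log-likelihood of a two-regime Gaussian Markov-switching model, assumed here to correspond to expression (3), weights each regime density by the probability of that regime. In this sketch, π_{s,t} denotes the probability of regime s at time t given the information available up to t − 1, which evolves according to the transition probability matrix P:

```latex
LLF = \sum_{t=1}^{T}\ln\left[\sum_{s=1}^{2}\pi_{s,t}\,
      \frac{1}{\sqrt{2\pi\sigma_{i,s}^{2}}}
      \exp\left(-\frac{\left(\Delta\% B_{i,t}-\mu_{i,s}\right)^{2}}{2\sigma_{i,s}^{2}}\right)\right] \qquad (3)
```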
In this two-regime model, we denote S = 1 as the good-performing or normal time periods (or regime), in which there is a lower standard deviation or volatility than in S = 2, which will be known as the bad-performing, distress, or crisis regime. This leads to the expectation that σs=2 > σs=1.
With the application of Hamilton’s [48,49,50] filter for the inference of the Markov-switching model (3), we determined the mean (μi,s=i) and standard deviation (σi,s=i) for each regime in each simulated benchmark and also the LLF and the Akaike [51], Schwarz [52], and Hannan-Quinn [53] information criteria. This was done to determine whether a single-regime Gaussian model or a two-regime one fits the time series better.
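The three information criteria can be computed from the log-likelihood, the number of estimated parameters, and the number of observations. The Python sketch below uses the textbook definitions, in which lower values indicate a better fit; the log-likelihood values and parameter counts in the example are hypothetical and are not results from this study.

```python
import math

def information_criteria(llf, n_params, n_obs):
    """Akaike, Schwarz (BIC), and Hannan-Quinn criteria; lower is better."""
    aic = -2.0 * llf + 2.0 * n_params
    bic = -2.0 * llf + n_params * math.log(n_obs)
    hqc = -2.0 * llf + 2.0 * n_params * math.log(math.log(n_obs))
    return {"AIC": aic, "BIC": bic, "HQC": hqc}

# Hypothetical values, for illustration only.
# Single regime: mean and standard deviation (2 parameters).
single = information_criteria(llf=1000.0, n_params=2, n_obs=3407)
# Two regimes: 2 means, 2 standard deviations, 2 transition probabilities.
two_regime = information_criteria(llf=1100.0, n_params=6, n_obs=3407)
```

Some software reports these criteria divided by the number of observations; the ranking of models is unaffected by that scaling.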
As a methodological note, we estimated Hamilton’s filter with the Expectation-Maximization (E-M) algorithm of Dempster, Laird, and Rubin [54] by using the MSwM R library of Sanchez-Espigares and Lopez-Moreno [55].
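To make the filtering step concrete, the following Python sketch runs the filter recursion for a two-regime Gaussian model with known parameters. It is not the MSwM implementation and it does not perform the E-M estimation; it only shows, under the assumption that the parameters are given, how the regime probabilities and the log-likelihood are obtained. All names and numerical values are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Gaussian density evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def hamilton_filter(returns, mu, sigma, P, pi0):
    """Filtered regime probabilities and log-likelihood for given parameters.

    mu, sigma : per-regime means and standard deviations (length 2)
    P         : transition matrix with P[j, k] = Prob(S_t+1 = k | S_t = j)
    pi0       : initial regime probabilities (length 2)
    """
    returns = np.asarray(returns, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    P = np.asarray(P, dtype=float)
    predicted = np.asarray(pi0, dtype=float)
    filtered = np.empty((returns.size, 2))
    llf = 0.0
    for t, r_t in enumerate(returns):
        # Joint density of the observation and each regime.
        joint = predicted * gaussian_pdf(r_t, mu, sigma)
        likelihood = joint.sum()
        llf += np.log(likelihood)
        # Update step: probability of each regime given data up to t.
        filtered[t] = joint / likelihood
        # Prediction step: regime probabilities for t + 1.
        predicted = filtered[t] @ P
    return filtered, llf

# Hypothetical returns and parameters (regime 2 is the high-volatility one).
returns = np.array([0.001, 0.002, -0.030, 0.025, 0.001])
P = np.array([[0.95, 0.05], [0.10, 0.90]])
probs, llf = hamilton_filter(returns, mu=[0.0005, -0.001],
                             sigma=[0.005, 0.02], P=P, pi0=[0.5, 0.5])
```

In an E-M estimation, a recursion of this kind would be embedded in the expectation step, with the parameters re-estimated at each iteration.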
We also used the values of μi,s=i and σi,s=i to determine whether, in a normal or distress scenario, the expected return (μi,s=i) and the risk exposure (σi,s=i) were higher or lower if the investment was made in the IPCS. This required testing the next null hypotheses in each of the four pairs of simulated benchmarks (or portfolios), one pair for each type of target-risk index summarized in Table 3:
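Based on the description that follows, the nine hypotheses can be read as three groups of inequalities between the IPCS and IPC versions of each benchmark. The grouping below is only a sketch of that reading; in particular, the assumption that each group is stated once for the single-regime case and once for each of the two regimes is an interpretation and is not taken from the original expressions.

```latex
\begin{aligned}
H_0 &: \mu_{IPCS} \geq \mu_{IPC}       &\quad& \text{(expected return, (4) to (6))} \\
H_0 &: \sigma_{IPCS} \leq \sigma_{IPC} &\quad& \text{(risk exposure, (7) to (9))} \\
H_0 &: SR_{IPCS} \geq SR_{IPC}         &\quad& \text{(Sharpe ratio, (10) to (12))}
\end{aligned}
```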
The first three hypotheses, (4) to (6), present our expected position that SRI pays a return better than or at least equal to that of the conventional investment, and the second three, (7) to (9), present our position that the risk exposure is lower if SRI is used in the Mexican pension funds.
Finally, with hypotheses (10) to (12), we tested whether there is a better mean-variance efficiency by measuring the Sharpe ratio [56] (either in a single or a two-regime scenario), computed with the annualized mean return of each simulated benchmark, the mean secondary market yearly rate of the 28-day CETES (μrf), and the annualized observed standard deviation.
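The Sharpe ratio described above can be sketched in Python as the annualized mean return in excess of the mean risk-free rate, divided by the annualized standard deviation. The annualization factor of 252 trading days and the use of the sample standard deviation are assumptions made for this illustration; the text does not state which conventions were used. The function name and data are hypothetical.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualization factor, not stated in the text

def annualized_sharpe(daily_log_returns, mean_rf_annual):
    """Sharpe ratio from daily log returns and a mean annual risk-free rate."""
    r = np.asarray(daily_log_returns, dtype=float)
    annual_mean = r.mean() * TRADING_DAYS
    annual_std = r.std(ddof=1) * np.sqrt(TRADING_DAYS)
    return (annual_mean - mean_rf_annual) / annual_std

# Hypothetical daily returns and a hypothetical 5% mean risk-free rate.
daily = [0.001, -0.001, 0.002, 0.0]
sharpe = annualized_sharpe(daily, mean_rf_annual=0.05)
```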
Test Results Review
As a first result, we present in Figure 1 the historical (simulated) performance of the conservative (SB1) and moderate (SB2) target risk indexes given the investment levels of Table 3.
As noted, there was no significant difference in the performance of the simulated indexes for both types of SIEFOREs. Only the moderate-IPC showed differences, such as underperformance against the moderate-IPCS during the financial or Euro zone debt crisis time periods (mid 2006 to March 2008 and February 2010 to January 2014) and a marginal overperformance at the end of the simulation.
In practical terms, the conservative index that invested in the IPCS showed an accumulated return of 108.57% and the one that invested in the IPC, 108.27%. This result suggests that it is advisable to use SRI in this specific case or risk profile without a significant loss of performance. We state this because the difference in accumulated return was marginal and slightly in favor of the conventional investment style. Also, referring to the performance in Table 5, we observe that the yearly accumulated return, standard deviation, and mean-variance efficiency (Sharpe ratio) were practically the same, leading us to suggest that it is preferable to have an SRI-only investment policy in the local equity component of SB1 SIEFOREs.
For type 2 SIEFOREs (SB2), the moderate indexes, the performance showed some short-term differences, but at the end of the simulation, the simulated indexes showed an accumulated return of 175.51% for the case that invests in conventional Mexican equities and 173.14% for the SRI case. Similar to the previous type of index, we found no considerable difference in the mean return paid or in the level of risk exposure (standard deviation) between the index that invested only in SRI stocks and the one that invested with a conventional strategy. Despite this, there was also a non-significant but larger difference between the mean-variance efficiency of these two simulated indexes. The Sharpe ratio difference between the moderate-IPC and the moderate-IPCS was 0.0896. That means that for each extra 1% of risk exposure in the moderate-IPCS, we would earn approximately 0.09% less return above the risk-free rate than with the moderate-IPC. As noted, the mean-variance efficiency loss was still small, as in the case of the conservative indexes, which had a Sharpe ratio difference of 0.0061% of extra return lost for each extra 1% of risk exposure.
In Figure 2, along with the results shown in Table 5, we show the performance of the SB3 type SIEFOREs (growth indexes) and the SB4 (aggressive indexes) ones. In the first case, the short-term differences increased in relation to the SB2 SIEFORE type, as did the difference in accumulated return at the end of the simulation. The accumulated gains were 173.98% for the SB3 (growth index) that invested in the IPC index and 171.02% for the case that invested in the IPCS (a difference of 2.96 percentage points). Here, the short-term return loss was wider than in the previous type of SIEFOREs, but was still relatively low because the risk exposure (standard deviation) was just 0.2668% higher (in annual terms) in the growth-IPCS than in the growth-IPC. This result led to a Sharpe ratio or mean-variance efficiency loss of only 0.1039% of extra return lost in the growth-IPCS (against the growth-IPC), given an extra 1% of risk exposure. As noted, in these three types of indexes, we cannot accept all of our particular hypotheses, (4) to (12), if we are numerically strict. If we review our analysis in practical terms, however, the increase in risk exposure and the loss of expected (accumulated) return and Sharpe ratio were marginal and would not have a significant impact if these three types of public pension funds in Mexico invested only in socially responsible stocks in their local equity component.
The same result, but with slightly wider differences in the short-term, is noted for SB4s (aggressive indexes) in the lower panel of Figure 2. The fluctuation of the SB4 that invested in the IPCS index was wider than that of the previously simulated ones. The accumulated earnings were 184.50% in the aggressive-IPC and 178.5% for the aggressive-IPCS (a more noticeable difference of 6 percentage points), and the difference in risk exposure, in yearly terms, was 0.72%. With this wider difference in accumulated return, but a marginally different risk exposure, the mean-variance efficiency loss of investing only in Mexican SRI stocks was only 0.16%.
A result of interest from the previous results was the level of risk exposure of all the types of SIEFOREs that invested in Mexican SRI stocks (Table 5). These show a higher risk exposure in the cases that invested in the IPCS, a result that goes against the conclusions of De la Torre and Martinez [57], De la Torre et al. [58], and De la Torre and Macias [5], who suggest that the performance of the IPC sustainable index is statistically equal to that of the IPC comp and the IPC index. Even if this result is notable, the differences were practically marginal. We can summarize that even if we do not have strong evidence to accept the null hypotheses (4) to (12), we also do not have strong evidence to accept their alternatives, leading us to suggest that the observed differences could hold only in the short-term.
A potential source of difference between the indexes that invested in the IPCS and the ones that invested in the IPC is the fact that the base year in which De la Torre and Macias [5] recalculated the IPC sustainable index in a backward simulation was older than that used herein. This leads us to note that the accumulated return of the IPC index in the simulated period was 278.7238% vs. that observed in the IPCS of 258.0864%. If we multiply the difference of these accumulated returns by 20% (the investment level in the local equity factor for the aggressive or SB4 index), we arrive at the 5.92% difference between these two simulated benchmarks.
The other potential source of difference is the fact that the IPCS is more diversified in market capitalization size than the IPC, which is a blue-chip index. With this in mind, it is expected that the potential source of under-performance of the IPCS (that leads to this difference) is that some small or mid-market capitalization stocks could have lagged behind the performance of the IPCS (and its corresponding target-risk indexes). To support our position on this matter, we present, in Table 6, a regression of the performance attribution test in which we regressed (from 25 October 2006 to 31 July 2018, with a total of 2955 observations) the historical return time series of each simulated target risk index on the historical returns of each asset type index. For the local equity component (either IPC or IPCS), we divided it into three market cap indexes: the IPC large, mid, and small-cap benchmarks. These three benchmarks integrate the 60 stock members of the IPCcomp, which is the market index benchmark from which the 35 members of the IPCS are screened. For the specific case of the simulated indexes that followed a conventional investment style (IPC), we performed the regression only with the large-cap index because the IPC has, as previously mentioned, only large-cap stocks.
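A performance attribution regression of this kind can be sketched as an ordinary least squares fit of the benchmark returns on the returns of the component indexes. The Python sketch below is a simplified illustration with synthetic data: it assumes a plain OLS specification with an intercept, returns only the point estimates, and omits the significance tests on which the discussion of Table 6 relies. The function name and coefficient values are hypothetical.

```python
import numpy as np

def attribution_betas(benchmark_returns, factor_returns):
    """OLS coefficients (intercept first) of benchmark returns on factors."""
    y = np.asarray(benchmark_returns, dtype=float)
    X = np.asarray(factor_returns, dtype=float)
    # Add a column of ones so that the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(y)), X])
    coefficients, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefficients

# Synthetic returns for three hypothetical factors (large, mid, small cap).
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.01, size=(500, 3))
true_coefficients = np.array([0.0001, 0.15, -0.03, -0.02])
benchmark = true_coefficients[0] + factors @ true_coefficients[1:]
betas = attribution_betas(benchmark, factors)
```

Because the synthetic benchmark contains no noise, the fit recovers the hypothetical coefficients; with real data, standard errors would be needed to judge whether a negative β is significant.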
As noted, only the growth and the aggressive target risk indexes showed significant and negative β values. This was observed in either the mid cap or the small cap indexes.
It is also noted that the riskiest simulated indexes that invested in the IPCS (growth or SB3 and aggressive or SB4) had a lower mean-variance efficiency and accumulated return. Even so, given the analysis of Table 5 and Table 6, we can suggest that this difference is due only to the performance of mid and small cap stocks and holds only in the short-term.
The accumulated returns of the SB3 and SB4 types of SIEFOREs are in line with the results and conclusions on SRI given in the work of Derwall, Koedijk, and Ter Horst [26], who suggest that the shunned-stock hypothesis holds. This is because there was an observable underperformance in some simulated SRI portfolios against the ones that hold shunned, non-SRI stocks. Despite this, there is still evidence to support the use of an SRI-only strategy in Mexican pension funds. This may not lead to the creation of alpha, but to the value-driven quality of SRI (even if this means sacrificing performance marginally).
Given the last statement, we reviewed the performance not in a single-regime time series, but in a two-regime one. The reason for this test was that we wanted to review the performance of these simulated indexes in bad-performing, distress, or crisis time periods. We used these terms interchangeably (following Hamilton [48,49,59,60,61]) to refer to the time periods in which the volatility levels in the financial markets of interest were higher. Our position is that the previous results do not isolate the performance that each type of SIEFORE (target risk index) would have had, had they invested in the IPCS in distress time periods.
Departing from this, we present in Table 7 the goodness of fit of the one and two-regime log-likelihood functions, (2) and (3), along with their information criteria. As noted, in all the cases there was a better fit to the data if the simulated index time series were described with a two-regime Markov-switching model. With this goodness of fit result, we present in Table 8 the max drawdown and mean-variance efficiency results in a single (as in Table 5) and two-regime scenario.
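For reference, the max drawdown of an index can be computed as the largest percentage fall from a running peak. The Python sketch below follows that common definition, which is assumed here because the text does not spell out the calculation; the function name and the index levels are hypothetical.

```python
import numpy as np

def max_drawdown(index_levels):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    levels = np.asarray(index_levels, dtype=float)
    # Highest level observed up to and including each date.
    running_peak = np.maximum.accumulate(levels)
    drawdowns = levels / running_peak - 1.0
    return drawdowns.min()

# Hypothetical index path: the fall from 120 to 90 is the largest decline.
mdd = max_drawdown([100.0, 120.0, 90.0, 110.0, 130.0, 117.0])
```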
In order to determine if a given realization belonged to a given regime, we used the next rule: Δ%Bi,t,s=2 if P(s = 2|rt, μs=i, σs=i, πs=i, P) > 0.5 or Δ%Bi,t,s=1 if P(s = 2|rt, μs=i, σs=i, πs=i, P) ≤ 0.5. Please refer to Hamilton [39,41] and Hauptman et al. [62] for further reference.
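The classification rule above amounts to a threshold on the probability of the second regime. A minimal Python sketch, with hypothetical probabilities and a hypothetical function name, is:

```python
import numpy as np

def classify_regimes(prob_regime2, threshold=0.5):
    """Assign regime 2 when its probability exceeds the threshold, else 1."""
    p = np.asarray(prob_regime2, dtype=float)
    return np.where(p > threshold, 2, 1)

# Hypothetical regime-2 probabilities for four observations.
regimes = classify_regimes([0.10, 0.50, 0.51, 0.97])
```

Note that a probability of exactly 0.5 is assigned to regime 1, in line with the "≤ 0.5" branch of the rule.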
We also show, in the same table, the risk or standard deviation levels, along with the Sharpe ratio as in (13) and mean expected returns.
In a two-regime perspective, the picture looks similar to the previous review in Table 5. The main difference arises in the second regime (the bad-performing, distress, or crisis one). In this specific scenario, practically all the simulated indexes or SIEFORE types had a better expected return if they used the IPCS instead of the IPC. This last result is in line with Areal, Cortez, and Silva [34], De la Torre and Martinez [57], and De la Torre and Macias [5] in that the IPCS (SRI) had a better performance in crisis time periods.
Finally, we present in Figure 3 a box plot of the percentage variations or returns of the simulated types of SIEFOREs (or indexes). From a paired perspective in each SIEFORE type and by paying attention to the 95% confidence interval boxes, it is noted that the use of the IPCS did not dramatically change the behavior of the performance of SIEFOREs of types 1 to 4. Reviewing each pair of box plots, it is noted that they have practically the same shape, but the individual returns showed more extreme outliers on some specific dates. These extreme returns give stronger support to our position that the observed differences between the four pairs of simulated indexes hold only in the short-term and that the expected (mean) return and risk (standard deviation) differences are due to these short-term outliers and not to a systematic poor performance of SRI.
As a corollary of results, we can mention that even if there is weak evidence to accept all the proposed null hypotheses (if we are numerically strict), the results of Table 5 and Table 7, the performance attribution test of Table 6, the Markov-switching model of Table 8, and Figure 3 suggest that even if there is a marginal “under-performance” in the short-term, this does not hold in the long-term. Also, we can suggest that investing in SRI is much better for Mexican public pension funds in distress or bad-performing time periods, such as the ones observed in 2007–2008 (just to give a significant example in our simulation period).