#### **1. Introduction**

The last three decades brought unprecedented growth in exchange-traded funds (ETFs) and index funds, which enable investors to quickly move their capital around the world. More easily than ever before, investors can now shift their equity allocation from Germany to Brazil or from Japan to South Africa. Not surprisingly, the ETF industry has been growing very rapidly. Already in 2017, the assets under management of ETFs exceeded five trillion U.S. dollars, and the compound annual growth rate over the past four years amounted to almost 19% (Lord 2018). The growth of ETFs coincides with a structural change in asset management and a shift from active investing to passive investing. As of December 2017, passive funds accounted for 45% of the aggregate assets under management in U.S. equity funds, compared to less than 5% in 1995 (Anadu et al. 2018). This profound shift requires a whole new set of tools for equity investors, who now focus much less on which stocks to choose than on which countries to allocate money to.

The asset pricing literature has produced an abundance of trading signals that help to predict the cross-section of individual stock returns. Recent surveys documented literally hundreds of different equity anomalies (e.g., Harvey et al. 2016; Hou et al. 2018). Notably, many of these cross-sectional patterns, such as value, momentum, or seasonality, have their parallels at the inter-market level and could potentially be used for country allocation. The last 30 years of asset pricing research produced mounting evidence regarding the cross-sectional predictability of country equity returns. The studies documenting numerous country-level equity anomalies not only provide new insights into international asset pricing but can also be translated into efficient country allocation strategies. Moreover, they are invaluable to practical investors.

The studies of the cross-section of country equity returns not only examined different return patterns but also employed different methodologies and data sources. Issues such as choice of the index provider, return computation methodology, or portfolio formation can visibly influence the results. The diversity of empirical designs, data sources, and preparation methods calls for a systematic review and for introducing a structure into the methodological choices in the field of country-level asset pricing.

The major objective of this article is to provide a comprehensive review of the current state of literature on the cross-section of country equity returns. In particular, our survey considers data sources and preparation, research methods, and, last but not least, the cross-sectional return patterns documented in the country-level equity returns. The cross-section of stock-level returns is summarized in many excellent surveys, concerning both the anomalies themselves (e.g., Nagel 2013; Harvey et al. 2016; Hou et al. 2018; Bali et al. 2016) and the methodological and data choices (Jagannathan et al. 2010; Waszczuk 2014a, 2014b). For country-level cross-sectional asset pricing, such surveys are clearly missing. To the best of our knowledge, no such review has yet been presented. This work aims to fill this gap. We not only review the current state of country-level asset pricing literature but also structure it and introduce some order into it.

The article reviews three aspects of the studies of the cross-section of country equity returns. First, we focus on the choice of data and the underlying asset universe as well as on dataset preparation. At the same time, we review the approaches regarding the country coverage, study period, return measurement, currency unit, and asset universe. Second, we survey some common methodological choices in the asset pricing literature, such as the number of portfolios, return calculation, and portfolio weighting scheme. Third, we examine the current state of knowledge on country-level cross-sectional return patterns. We review the most prominent of such patterns, such as momentum, value, long-run reversal, size, seasonality, and price and non-price risk, as well as a basket of minor anomalies. We also discuss several additional aspects of these return patterns, including their fundamental sources and implementation details. Finally, we consider additional practical aspects of country-level return patterns: the role of trading costs and strategy timing.

The remainder of the article proceeds as follows. Section 2 focuses on datasets, data preparation, and asset universe. Section 3 focuses on some specific methodological choices. Section 4 reviews the documented cross-sectional patterns in country equity returns. Finally, Section 5 concludes the article.

#### **2. Datasets and Sample Preparation**

This section concentrates on the choice of dataset representing country equity returns and preparation of the sample. We survey the approaches to selection of country coverage, study period, return measurement period, currency unit, and asset universe.

#### *2.1. Country Coverage*

The datasets used in examinations of the cross-sectional patterns in country index returns are obviously smaller than in the stock-level studies, which often encompass several thousand companies. Naturally, the scope in this case is limited to the countries with operating stock markets. The early studies usually focused on fewer than 20 developed markets. For example, Keppler (1991a), Ferson and Harvey (1994a), and Richards (1995) considered 18 developed markets. Modern studies usually concentrate on about 40 countries selected on the basis of classification into developed and emerging by one of the major index providers. For instance, Clare et al. (2016) investigate 40 markets, and Fisher et al. (2017) examine 37. The broadest studies also take into account less tradable frontier markets, and their sample size can exceed 70. The article by Avramov et al. (2012), investigating 75 equity markets, may serve as an example of such an approach. Perhaps one of the broadest studies was conducted by Suleman et al. (2017), who took into consideration 83 countries. The detailed outline of the sample size in selected studies is presented in Table 1.


**Table 1.** Research samples in the studies of the country equity returns.

Note. The table summarizes the data samples used in selected studies of the cross-section of country equity returns, indicating the number of countries covered, the length of the study period, and the asset universe.

In general, a larger sample size increases the power of statistical tests and allows additional insights on the examined return pattern. However, some studies may deliberately limit the sample of considered countries. One reason may be a focus on a particular geographical region. For example, Grobys (2016) concentrates solely on the European Monetary Union. Another motivation to limit the number of countries in the sample may be alignment of the study with investment practice. Since very liquid futures or ETFs cover only a small number of countries, the examinations may be reduced to just 10–20 of the most tradable markets. For example, the highly influential studies of Asness et al. (2013) and Keloharju et al. (2016) examined samples of only 18 and 16 equity indices, respectively. Furthermore, the studies utilizing early security data sometimes tend to be quite narrow due to data unavailability. For example, Hurst et al. (2017), who investigated more than a century of evidence of trend-following profits, constrained their scope to only 11 countries.

#### *2.2. Study Period*

The study period is usually dictated by the index data availability. Thus, numerous studies focusing on the most prominent cross-sectional patterns start in the years 1969 or 1970, when the coverage of many developed markets by MSCI begins (e.g., Balvers and Wu 2006; Bhojraj and Swaminathan 2006; Muller and Ward 2010; ap Gwilym et al. 2010). Consequently, the research period usually encompasses three to four decades. If the study period is shorter, this is usually due to the inability to collect some additional data for the 1970s or 1980s. For example, Berkman and Yang (2019), who focus on country-level analysts' recommendations, reduced their study period to the years 1994–2015. Finally, as an alternative to equity indices, some studies proxy the equity markets with respective ETFs. In such cases, the price history available is, naturally, shorter. Smith and Pantilei (2015), who test the "Dogs of the World" strategy in ETFs, examine their returns for the years 1997–2012.

A separate and rapidly growing field encompasses studies of early security data that allow insights into the long-run nature of the financial market phenomena. In asset pricing studies in particular, examinations of the close-to-century long datasets make it possible to check the true robustness of the return patterns and secure against the risk of false discoveries and data mining. Some data providers, like Global Financial Data, offer their own proprietary indices going back to the 19th or even 18th century. A representative study of this type could be Geczy and Samonov (2017), who examine the momentum effect in the returns on major asset classes for the years 1800–2014. Baltussen et al. (2019a) research several major anomalies for the years 1799–2016. Other studies researching similar long-run data sets include Ilmanen et al. (2019), Hurst et al. (2017), or Spierdijk et al. (2012). For a detailed outline of different lengths of the study period, see Table 1.

#### *2.3. Return Measurement Periods*

The most common choice in individual stock studies is to use monthly returns. The motivation is that this choice forms a compromise that allows the large number of observations necessary for statistical tests to be accumulated and, at the same time, mitigates the influence of microstructure effects (Waszczuk 2014a). Nearly all country-level studies take a similar approach and utilize monthly returns (e.g., Richards 1997; Chan et al. 2000; Blitz and van Vliet 2008). This applies, in particular, to the studies of early security data (e.g., Geczy and Samonov 2017; Baltussen et al. 2019a), where more frequent observations are hardly available.

The use of different return intervals is rather infrequent and usually limited to examinations of alternative holding periods, as in Andreu et al. (2013) or Kasa (1992). On the other hand, Vu (2012) is one of the very few studies that rely on weekly returns to amass a larger number of observations.

#### *2.4. Currency Unit*

Asset pricing studies of firm-level data frequently focus on single countries (e.g., Fama and French 2015) or replicate analyses in multiple individual markets (e.g., Chui et al. 2010). Therefore, the role of currency unit is of lesser importance and the calculations oftentimes rely on local currencies. On the other hand, in the cross-country analysis the volatile foreign exchange rates and inflation rates—especially in emerging and frontier markets—play a significant role. Consequently, the majority of cross-country studies set a common currency as a unit of calculations, and the most obvious and common choice is the U.S. dollar. Dobrynskaya (2015), Clare et al. (2016), Keppler and Encinosa (2011), or Smith and Pantilei (2015) may serve as examples of papers that denominate all the prices in U.S. dollars. This currency is also a default choice in the studies that utilize futures or ETFs as representation of the country exposure, as it directly expresses the perspective of a U.S. investor (e.g., Andreu et al. 2013; Daniel and Moskowitz 2016; Smith and Pantilei 2015).

The use of returns calculated on the basis of local currency prices is rather rare. Usually such a framework is applied as a sort of robustness check or is examined explicitly to evaluate the role of currencies in return predictability (Chan et al. 2000; Bhojraj and Swaminathan 2006). For example, Jordan et al. (2015) examine empirically the importance of the currency numeraire for stock return predictability. They argue that, for instance, the presence (absence) of predictability for an American investor does not need to imply the existence (absence) of predictability for other international investors. Sometimes the local currency returns are also used in the studies of early security data to alleviate the problem of the reliability of foreign exchange rates that are more than a century old (Geczy and Samonov 2017). Nonetheless, even in the studies of early asset prices the U.S. dollar is a very common choice (Baltussen et al. 2019a).
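The difference between local-currency and common-currency returns can be illustrated with a short sketch. It assumes the standard compounding identity between an index return in local currency and the change in the exchange rate; the function name and the numbers are hypothetical and are not taken from any of the cited studies.

```python
def usd_return(local_return: float, fx_return: float) -> float:
    """Compound a local-currency index return with the currency return.

    fx_return is the relative change in the USD price of one unit of the
    local currency over the same period (positive = local currency
    appreciated against the dollar).
    """
    return (1.0 + local_return) * (1.0 + fx_return) - 1.0


# Hypothetical example: the local index gains 5% while the local
# currency loses 10% against the U.S. dollar.
r_usd = usd_return(0.05, -0.10)
print(f"USD return: {r_usd:.4f}")
```

In this illustration a positive local-currency return turns into a loss for a dollar-based investor, which is why the choice of numeraire can matter for measured predictability.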

#### *2.5. Asset Universe*

Examination of cross-sectional patterns in country equity markets requires some representation of the market return. In country-level studies the asset universe comprises usually one of two types of instruments: either equity indices or some real investable instruments.

The major benefit of equity indices is that they provide a broad and accurate representation of the local equity markets. The articles investigating samples of international stock market indices basically follow one of two options. Most commonly, the studies are based on indices from a single provider. Alternatively, a study can rely on an amalgamation of local indices computed by national stock exchanges or local companies.

The use of indices from a single provider certainly has some benefits. They include calculation transparency, result comparability, and consistency in index calculation across many countries. Indices provided by MSCI are the most popular choice in country-level asset pricing. MSCI indices represent value-weighted equity portfolios covering approximately 85% of the largest and most liquid companies in each country. They also form the basis for multiple investment products, including popular iShares ETFs. MSCI estimated that more than 7 trillion US dollars were benchmarked to MSCI indices as of June 2011 (Cenedese et al. 2016). Furthermore, importantly from a practitioner's perspective, MSCI usually does not apply any retroactive changes to the reported returns of its indices, so it reduces the risk of potential biases.

The current coverage encompasses 85 countries, including developed, emerging, frontier, and so-called standalone markets. The data period dates back to December 1969. The additional benefit of the MSCI indices is that they are calculated in several different ways, including different currencies, controlling for taxes, accounting for dividends, etc. For example, the MSCI indices were used by Dobrynskaya (2015), Clare et al. (2016), Fisher et al. (2017), Keppler and Encinosa (2011), Richards (1997), Balvers and Wu (2006), Keimling (2016), Keloharju et al. (2016), Malin and Bornholt (2013), Ferson and Harvey (1994b), and many others.

Datastream Global Equity Indices are the second most popular index choice. These cover currently 64 countries and go back in time to January 1973. Notably, the Datastream indices also assure a broad and consistent international representation, and at certain periods in the past their coverage may be better than in the case of MSCI. This index provider was selected, for example, by Bali and Cakici (2010), Umutlu (2015, 2019), and Zaremba (2019).

The studies of datasets spanning more than a century usually take advantage of indices computed by Global Financial Data (GFD). GFD provides time-series going back to the 19th century for numerous developed and emerging markets. Obviously, such long-run datasets are not free from different biases or omissions, and the index portfolios frequently contain very few securities, but they certainly provide a unique look into the past data. The GFD indices were employed by Geczy and Samonov (2017) and Baltussen et al. (2019a), among others.

One of the drawbacks of the indices obtained from different providers is that their coverage may differ; some countries may be taken into account by one provider but not considered by others. Consequently, to maximize the size of the research sample, some studies merge indices from different sources. Erb et al. (1995), in one of the first studies of this type, represent the developed markets by the MSCI indices and the emerging ones by the portfolios calculated by the International Finance Corporation (IFC). Avramov et al. (2012) use MSCI indices and supplement the coverage of missing countries with Datastream portfolios. Geczy and Samonov (2017) blend Bloomberg and GFD indices. Finally, Baltussen et al. (2019b) collect data from Bloomberg, with gaps filled in by Datastream data, spliced with index-level data, as in Baltussen et al. (2019b), and, eventually, backfilled data downloaded from Global Financial Data.

Besides using the indices from acknowledged providers, there are also several other options. Ellahie et al. (2019) use aggregated stock-level data from CRSP and Compustat. In other words, they calculate the country portfolios themselves instead of obtaining them from external sources. The final variant is to use national indices computed by local providers, such as the DAX, Nikkei, or S&P indices. This approach is employed by Chan et al. (2000) and Vu (2012), among others, and has two major benefits. First, it may help to increase the dataset, because these local indices may have a longer history available than their counterparts offered by MSCI or Datastream. Second, the most liquid equity index futures are oftentimes linked with local indices rather than with international ones. For instance, in Poland the most liquid equity index future is based on the WIG20 index computed by the Warsaw Stock Exchange. Consequently, the use of local indices may be more aligned with investment practice. Nevertheless, the major shortcoming of relying on local indices is the lack of computational consistency. Different indices rely on different selection and weighting methods, so the study outcomes may be influenced by the index calculation methodology, resulting in misleading conclusions. For example, better performance of an index in a certain market may stem from its bigger exposure to small-cap companies rather than from a true factor examined by the researcher.

Instead of investigating "paper" equity indices, some studies focus on actual investment instruments providing exposure to international markets. While this framework may potentially limit the size of the dataset, certainly it reflects most closely the investor's practical perspective. Following this reasoning, Daniel and Moskowitz (2016), Moskowitz et al. (2012), and Hurst et al. (2017) base their computations on futures markets. Alternatively, Andreu et al. (2013), Breloer et al. (2014), and Smith and Pantilei (2015) focus on single-country ETFs.

For more examples of different asset universes used by the studies of the cross-section of country equity returns, see Table 1.

#### **3. Methodological Choices**

The country-level asset pricing studies strongly rely on econometric and statistical toolsets very similar to those used in the regular studies applied to the individual firms. The two most common approaches are cross-sectional (or panel) regressions and portfolio sorts. These two complementary approaches are frequently used jointly, as recommended by Fama (2015), and their benefits and shortcomings are discussed in detail by Fama and French (2008).

In the most typical applications of the cross-sectional regressions following Fama and MacBeth (1973), the future returns are regressed against a number of return-predicting variables, i.e., characteristics. Cross-sectional regressions are used, for instance, by Bali and Cakici (2010), Fisher et al. (2017), and Stocker (2016). Sometimes this approach is supplemented with different types of panel regressions, as in Hjalmarsson (2010), Lawrenz and Zorn (2017), and Bali and Cakici (2010). Wisniewski and Jackson (2018) apply pooled ordinary least squares and two-way fixed-effects regressions.
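The two-step logic of a Fama and MacBeth (1973)-style regression can be sketched as follows. This is a minimal illustration under simplifying assumptions: a single characteristic, a balanced panel, and a plain standard error of the average slope without any autocorrelation adjustment. The function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def fama_macbeth(signals, next_returns):
    """Average cross-sectional slope and its t-statistic.

    signals[t][i]      characteristic of country i observed at time t
    next_returns[t][i] return of country i over the following period
    """
    # Step 1: one cross-sectional regression per period.
    slopes = [ols_slope(x, y) for x, y in zip(signals, next_returns)]
    # Step 2: time-series average of the slopes and its standard error.
    avg = mean(slopes)
    se = stdev(slopes) / sqrt(len(slopes))
    return avg, avg / se


# Toy panel: 3 periods, 4 hypothetical countries.
signals = [[1.0, 2.0, 3.0, 4.0]] * 3
next_returns = [[b * s for s in signals[0]] for b in (0.1, 0.2, 0.3)]
avg_slope, t_stat = fama_macbeth(signals, next_returns)
print(f"average slope = {avg_slope:.3f}, t-statistic = {t_stat:.2f}")
```

With only a few dozen countries in each cross-section, the per-period slopes are estimated from far fewer observations than in stock-level applications, so the time-series dimension carries most of the statistical power.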

Portfolio sorts are the second most popular tool. In this framework all the considered assets—which are in this case country equity markets—are ranked based on certain empirical characteristics, such as past returns or valuation ratios. Subsequently, they are grouped into subsets, and portfolios are formed. Finally, the performance of the cross-sectional portfolios is evaluated on the basis of mean returns, volatilities, Sharpe ratios, and with factor pricing models and monotonicity checks in the style of Patton and Timmermann (2010). The portfolio sorts reduce the cross-sectional dimension of the joint distribution of returns and also help to reduce the impact of measurement error (Waszczuk 2014a).

In the evaluation of the portfolios from one-way sorts, also called single sorts, it is also common practice to calculate the returns on a differential portfolio (or spread portfolio, long-short portfolio, zero-investment portfolio), which takes long and short positions in the two most extreme quantiles of assets from one-way sorts. The performance of such portfolios is then evaluated. Importantly, this exercise frequently serves as a quick check of monotonicity rather than a reflection of actual investment performance. Due to tradability and short-sale limitations, forming and rebalancing zero-investment portfolios across many countries is not always possible, unless they are made of liquid futures, as in Daniel and Moskowitz (2016) or Moskowitz et al. (2012). Some further discussion of the details of sorting methods in asset pricing studies is provided in Bali et al. (2016), Vaihekoski (2004), Van Dijk (2011), and Waszczuk (2014a).
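The mechanics of a one-way sort and the associated long-short portfolio can be sketched in a few lines. The example below is only illustrative: it assumes equal weighting within each group, at least as many countries as groups, and uses hypothetical country labels, signals, and returns.

```python
def sort_into_portfolios(signal, n_groups=3):
    """Rank countries on a characteristic and split them into groups.

    signal: dict mapping country -> characteristic value.
    Returns a list of n_groups lists, ordered from the lowest to the
    highest values of the characteristic.
    """
    ranked = sorted(signal, key=signal.get)
    n = len(ranked)
    bounds = [k * n // n_groups for k in range(n_groups + 1)]
    return [ranked[bounds[k]:bounds[k + 1]] for k in range(n_groups)]


def equal_weighted_return(countries, returns):
    """Equal-weighted average return of the listed countries."""
    return sum(returns[c] for c in countries) / len(countries)


def long_short_return(signal, returns, n_groups=3):
    """Return of the top group minus the return of the bottom group."""
    groups = sort_into_portfolios(signal, n_groups)
    return (equal_weighted_return(groups[-1], returns)
            - equal_weighted_return(groups[0], returns))


# Hypothetical past returns (the sorting signal) and next-month returns.
signal = {"A": 0.10, "B": -0.05, "C": 0.02, "D": 0.07, "E": -0.01, "F": 0.04}
returns = {"A": 0.03, "B": -0.01, "C": 0.00, "D": 0.01, "E": 0.01, "F": 0.02}

print(sort_into_portfolios(signal))
print(f"long-short return: {long_short_return(signal, returns):.4f}")
```

With six countries and three groups, each portfolio holds only two markets, which illustrates why country-level studies tend to prefer a small number of groups.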

The outcomes of the cross-sectional analysis based on portfolio sorts are sensitive to several methodological choices made by the researcher. Importantly, some of the country-level practices may differ from stock-level studies due to the different number of assets, data availability, liquidity considerations, etc. We therefore focus on several of the most important methodological choices.

#### *3.1. Number of Portfolios*

The studies of the cross-section of returns on common stocks rely on datasets of hundreds or thousands of companies. Therefore, decile (e.g., Jegadeesh and Titman 1993, 2001; Lakonishok et al. 1994) or quintile (Banz 1981; Chan et al. 1998) groupings are among the most common choices. At the country level the number of assets is more limited, so this type of study also requires a smaller number of portfolios. Otherwise, the grouping could result in portfolios containing only a few markets (or even one), and hence being susceptible to the noise in returns. The most popular choices include tertiles (e.g., Daniel and Moskowitz 2016; Geczy and Samonov 2017; Asness et al. 2013; Atilgan et al. 2019), quartiles (Richards 1997; Blitz and van Vliet 2008; Macedo 1995a; Malin and Bornholt 2013; Erb et al. 1995), or quintiles (Clare et al. 2016). Alternatively, some studies that assume different portfolio formation methodologies consider only two portfolios: long and short (e.g., Moskowitz et al. 2012). Bali and Cakici (2010) consider portfolio groupings including 30%, 40%, and 30% of the markets.

Finally, a number of studies, instead of assuming a certain quantile cut-off point, focus only on the extreme portfolios from single-sorts and assume a fixed number of countries included. For instance, Kortas et al. (2005) include 11 of the most extreme countries in each portfolio. On the other hand, Keloharju et al. (2016) test cross-sectional seasonality based on portfolios including the three equity indices with the highest or lowest average return in the past.

#### *3.2. Portfolio Weighting Scheme*

Once the portfolios are formed, the next important step is the selection of the weighting scheme. The most common choice is between value-weighted and equal-weighted portfolios. In the first framework, the returns are weighted according to market capitalization. In the equal-weighted approach, on the other hand, all the positions are assigned an equal dollar value. At the stock level, the value-weighting approach is markedly more popular, and there are several reasons for that. The equal-weighted portfolios may tend to assume very large positions in small and micro companies, which would be unrealistic in practice, due to liquidity or market capacity issues, for example. In addition, the equal-weighted portfolios have a built-in rebalancing assumption, which may distort the results (Willenbrock 2011). Finally, value-weighting deemphasizes observations that are more likely to suffer from data errors, thus reducing the variation in average returns. Nevertheless, at the country level the choice is not that obvious. Indeed, the equal-weighted portfolios may gravitate towards small and illiquid frontier markets, where any large exposure or frequent share purchases may be unrealistic. However, in the case of a limited sample size of just 30–40 countries, the value-weighted portfolios may be strongly dominated by only a few of the largest countries. Furthermore, the aggregate market value may not always be available, or it may not have any intuitive equivalent, as is the case with futures or ETFs. Consequently, the equal-weighted portfolios are much more common, or at least used along with the value-weighted portfolios. The equal-weighted portfolios are used, for example, by Geczy and Samonov (2017), Clare et al. (2016), Hurst et al. (2017), and Balvers and Wu (2006). The value-weighted strategies, on the other hand, are analyzed by Chan et al. (2000) and Rikala (2017).

Besides the classical value- or equal-weighted portfolios, some articles pursue alternative frameworks. Clare et al. (2016) and Moskowitz et al. (2012) use so-called risk parity, i.e., they weight the portfolio components by their inverse volatility. On the other hand, Ilmanen et al. (2019) and Asness et al. (2013) link the weight with the value or rank of the underlying characteristic so that the absolute weight increases when the sorting variables take more extreme values.
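The weighting schemes discussed in this subsection can be compared in a compact sketch. The functions below are illustrative only: the rank-based scheme is one simple way to make absolute weights grow with the extremity of the signal (weights proportional to the demeaned rank, scaled to one unit long and one unit short), and it is an assumption of this example rather than the exact procedure of any cited study. All inputs are hypothetical.

```python
def equal_weights(countries):
    """Every country receives the same weight."""
    w = 1.0 / len(countries)
    return {c: w for c in countries}


def value_weights(market_caps):
    """Weights proportional to market capitalization."""
    total = sum(market_caps.values())
    return {c: cap / total for c, cap in market_caps.items()}


def inverse_volatility_weights(vols):
    """Risk-parity-style weights proportional to 1 / volatility."""
    inv = {c: 1.0 / v for c, v in vols.items()}
    total = sum(inv.values())
    return {c: x / total for c, x in inv.items()}


def rank_weights(signal):
    """Zero-investment weights proportional to the demeaned rank.

    Long weights sum to +1 and short weights sum to -1.
    """
    ranked = sorted(signal, key=signal.get)
    mean_rank = (len(ranked) + 1) / 2
    raw = {c: (i + 1) - mean_rank for i, c in enumerate(ranked)}
    scale = sum(abs(x) for x in raw.values()) / 2
    return {c: x / scale for c, x in raw.items()}


# Hypothetical inputs for three markets.
caps = {"X": 30.0, "Y": 6.0, "Z": 4.0}
vols = {"X": 0.10, "Y": 0.20, "Z": 0.20}
signal = {"X": 0.01, "Y": 0.08, "Z": -0.03}

print(equal_weights(list(caps)))
print(value_weights(caps))
print(inverse_volatility_weights(vols))
print(rank_weights(signal))
```

In the hypothetical example the largest market receives three quarters of the value-weighted portfolio, which mirrors the concentration concern raised above for samples of only 30–40 countries.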

#### *3.3. Return Calculation: The Treatment of Dividends and Taxes*

The index-level return calculations face two major methodological choices. The first issue refers to the treatment of dividends. Most of the studies are based on total return indices, which include reinvested dividends, regardless of the particular index provider (e.g., Richards 1997; Balvers and Wu 2006; Bali and Cakici 2010). Accounting for dividends reflects the investor's perspective well; nonetheless, sometimes the coverage and the length of the time-series may be bigger for the price returns. Therefore, price indices, which do not account for dividends, are employed by Keppler (1991a), for instance. On the other hand, ap Gwilym et al. (2010) and Geczy and Samonov (2017) use both price and return indices. Finally, some examinations use the two types of measures in combination as different inputs. For example, Clare et al. (2016) measure portfolio performance with the total return indices but compute return predictive signals based on price indices.

Total return indices include dividends, which are taxed in various ways in the majority of countries. Importantly, the dividend tax rates may vary both across time and countries, affecting the net portfolio performance. Some groups of investors, like mutual funds, may be exempted from taxation on dividends in many countries. Nonetheless, this is not true for all the countries, at all times, and for all the groups of investors. Consequently, the taxes may still potentially affect the cross-section of country equity returns. The majority of the country-level asset pricing studies use gross returns, not accounting for taxation. On the other hand, Zaremba (2016) also uses MSCI Net Return indices, which account for dividend tax rates within the particular countries.
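The distinction between price, gross total, and net total returns can be made concrete with a stylized single-period calculation. This is a simplified sketch that assumes one dividend paid at the end of the period and a flat withholding tax rate; actual index providers apply more detailed reinvestment and tax conventions. The function name and all numbers are hypothetical.

```python
def period_returns(price_start, price_end, dividend, tax_rate=0.0):
    """Return (price return, gross total return, net total return).

    dividend is the cash dividend per index unit paid during the period;
    tax_rate is the share of the dividend withheld as tax.
    """
    price_ret = price_end / price_start - 1.0
    gross_ret = (price_end + dividend) / price_start - 1.0
    net_ret = (price_end + dividend * (1.0 - tax_rate)) / price_start - 1.0
    return price_ret, gross_ret, net_ret


# Hypothetical index: level rises from 100 to 104, pays a dividend of 2,
# and 15% of the dividend is withheld.
price_ret, gross_ret, net_ret = period_returns(100.0, 104.0, 2.0, tax_rate=0.15)
print(f"price: {price_ret:.4f}, gross: {gross_ret:.4f}, net: {net_ret:.4f}")
```

Because dividend yields and tax rates differ across countries, the gap between the three measures is not constant in the cross-section, which is why the choice of return definition can affect country rankings.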

#### **4. Cross-Sectional Patterns in Country-Level Returns**

We now turn to the review of patterns demonstrated in the cross-section of country equity returns. We begin by focusing on the most prominent and best-established ones, such as momentum, size, and value and, subsequently, carry on with more minor return regularities. In addition, we consider different types of risk that influence future index-level returns. To introduce some order, we arbitrarily classify these risks into the ones that can be derived from prices (price based), and others, that is, non-price risks such as credit or political risks. Eventually, we survey the studies' treatment of some additional aspects of the country-level anomalies, such as factor timing and the role of trading costs.

#### *4.1. Momentum*

The momentum effect, which is the tendency of assets with high (low) past returns to continue to overperform (underperform) in the future, is one of the most robust and pervasive asset pricing anomalies ever documented. It has been demonstrated in U.S. and international stocks (including developed, emerging, and frontier markets), commodities, bonds, currencies, and also in equity market indices.

**Index-level evidence.** The first empirical evidence for the momentum effect in country equity indices may be found in Ferson and Harvey (1994b), Macedo (1995a, 1995b), Richards (1997), and Asness et al. (1997). Other researchers have continued the examinations of country-level momentum in the subsequent years. Balvers and Wu (2006) investigate a Jegadeesh and Titman (1993)-style portfolio based on stock market indices from 18 developed equity markets within the years 1969–1999. They demonstrate strong momentum effects, which worked particularly well in combination with the mean-reversion patterns. In the same year, Bhojraj and Swaminathan (2006) published a paper which examined a broader sample of 38 country indices within the same period. The authors document that the quintile of the best performing countries over the previous 6 months continued to significantly outperform the laggard indices during the next three quarters. The mean return on the long/short portfolio within a year after its formation amounted to 7.65%.

The following years saw further examinations of the momentum effect that extended the study sample both in terms of number of countries and the length of the study period. Muller and Ward (2010) investigated 70 countries and Zaremba (2016) researched 74. In terms of the sample length, several studies extended the time-series back to the 19th century and researched approximately 200 years of returns (Geczy and Samonov 2017; Hurst et al. 2017; Baltussen et al. 2019a). The momentum effect is robust to many considerations and could be successfully implemented with the use of ETFs (Andreu et al. 2013). Angelidis and Tessaromatis (2018) argue that "country-based factor portfolios offer a viable alternative implementation of factor investing in a world of illiquidity, transaction costs, and capacity constraints." Some other studies that investigated the momentum effect at the country level are Chan et al. (2000), Daniel and Moskowitz (2016), Grobys (2016), Guilmin (2015), Ilmanen et al. (2019), Breloer et al. (2014), Nijman et al. (2004), Clare et al. (2017), L'Her et al. (2004), Vu (2012), Kortas et al. (2005), and Shen et al. (2005).

**Formation and holding periods.** The seminal study of Jegadeesh and Titman (1993) considered 3–12-month-long sorting and holding periods. Numerous country-level studies, including the early ones, take a similar approach (e.g., Balvers and Wu 2006; Andreu et al. 2013). Later studies frequently used the approach advocated by Fama and French (1996), i.e., a 1-month holding period and a 12-month sorting period with the most recent month skipped (e.g., Dobrynskaya 2015; Blitz and van Vliet 2008; Asness et al. 2013). The 1-month skip period is usually applied in order to disentangle momentum from the short-term reversal effect discovered by Rosenberg et al. (1985), Jegadeesh (1990), and Lehmann (1990). Nonetheless, at the country level no similar one-month reversal effect has been documented, and Zaremba et al. (2019) argue that the returns display rather a short-term continuation. Consequently, the country-level studies do not always assume the one-month skip period, and if they do, this is usually motivated by liquidity and implementation issues (Asness et al. 2013; Baltussen et al. 2019b). For this reason, Geczy and Samonov (2017), who study early security data, decided to skip even two months in part of their tests.
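The effect of the skip period on the sorting signal can be illustrated with a short sketch. It assumes that the signal is the compounded return over the last `lookback` months excluding the most recent `skip` months, so that the default corresponds to months t-12 through t-2; the function name, the parameter names, and the data are hypothetical.

```python
from math import prod


def momentum_signal(monthly_returns, lookback=12, skip=1):
    """Compounded return over the sorting window.

    monthly_returns is ordered from oldest to newest. The window covers
    the last `lookback` months but leaves out the most recent `skip`
    months.
    """
    n = len(monthly_returns)
    if n < lookback:
        raise ValueError("not enough observations for the sorting period")
    window = monthly_returns[n - lookback:n - skip]
    return prod(1.0 + r for r in window) - 1.0


# Hypothetical index: eleven months of +1% followed by a +50% jump.
history = [0.01] * 11 + [0.50]
print(f"with skip:    {momentum_signal(history, skip=1):.4f}")
print(f"without skip: {momentum_signal(history, skip=0):.4f}")
```

In this toy case the most recent month dominates the signal when it is not skipped, which shows how strongly the ranking of a market can depend on this single design choice.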

**Momentum improvements and alternative implementations.** While classical momentum sorts indices on raw past returns, a number of studies offer alternative but closely related approaches. Notably, although some are conceptually very close to momentum, more detailed tests show that they provide incremental information about future returns. Moskowitz et al. (2012) and Hurst et al. (2017) evaluate so-called "time-series momentum". This strategy places markets in the long or short portfolio depending on whether the excess return in the sorting period was positive or negative. ap Gwilym et al. (2010), Clare et al. (2017), and Baltussen et al. (2019b) test trend-following strategies that focus on whether the most recent index value is above or below its moving average. Bornholt and Malin (2010, 2011) research the 52-week high strategy, whereby the return-predicting signal is the distance to the 52-week maximum index value. Avramov et al. (2018) concentrate on the distance between short- and long-run moving averages of prices. Finally, several studies demonstrate that the momentum effect could be efficiently combined with long-run reversal to augment the performance of the strategy (Balvers and Wu 2006; Asness et al. 2013; Bornholt and Malin 2014).
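
The alternative signals described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the exact definitions (for example, whether the moving average includes the latest observation, or whether "distance" is measured as a ratio) are assumptions that vary across the cited studies.

```python
from typing import Sequence


def time_series_momentum(excess_returns: Sequence[float]) -> int:
    """+1 (long) if the cumulative excess return is positive, -1 (short) if negative, 0 if flat."""
    growth = 1.0
    for r in excess_returns:
        growth *= 1.0 + r
    return (growth > 1.0) - (growth < 1.0)


def _mean_of_last(levels: Sequence[float], window: int) -> float:
    if window <= 0 or len(levels) < window:
        raise ValueError("window must be positive and no longer than the series")
    return sum(levels[-window:]) / window


def moving_average_signal(index_levels: Sequence[float], window: int) -> int:
    """+1 if the latest index level is above its simple moving average, -1 if below, 0 if equal.

    Assumption: the average is taken over the last `window` levels, latest included.
    """
    average = _mean_of_last(index_levels, window)
    latest = index_levels[-1]
    return (latest > average) - (latest < average)


def distance_to_high(index_levels: Sequence[float], window: int = 52) -> float:
    """Latest level divided by the maximum of the last `window` levels (1.0 means at the high).

    Assumption: one observation per week, so window=52 approximates a 52-week high.
    """
    if window <= 0 or len(index_levels) < window:
        raise ValueError("window must be positive and no longer than the series")
    return index_levels[-1] / max(index_levels[-window:])


def moving_average_distance(index_levels: Sequence[float], short: int, long: int) -> float:
    """Short-run moving average relative to the long-run one, expressed as a ratio minus one."""
    if short >= long:
        raise ValueError("short window must be shorter than long window")
    return _mean_of_last(index_levels, short) / _mean_of_last(index_levels, long) - 1.0
```

Each function maps a single market's history to one number, so any of them can replace the raw past return as the sorting variable in a cross-sectional ranking.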

**Sources of the momentum effect.** The stock-level momentum studies highlight a number of different explanations of the momentum effect, such as risk premium, behavioral underreaction or overreaction, herding, or confirmation. For example, Bhojraj and Swaminathan (2006) highlight the potential overreaction to news about macroeconomic conditions. In addition, Cenedese et al. (2016) link the momentum effect with the tendency of investors to increase their holdings in markets that have recently outperformed (Froot et al. 1992; Bohn and Tesar 1996; Griffin et al. 2004; Chabot et al. 2014). Other studies offer some alternative explanations. Balvers and Wu (2006) link the momentum effect with production-based asset pricing concepts. From the risk-based perspective, Asness et al. (2013) argue that global funding liquidity risk is a partial source of the momentum pattern. Cooper et al. (2019) demonstrate that momentum returns are explained by the portfolio loadings on global macroeconomic risk factors. Finally, Evans and Schmitz (2015) link the global momentum effect with data mining for anomalies, calling it a likely example of a selection bias.

#### *4.2. Size Effect*

The country-level size effect is a phenomenon parallel to the firm-level size effect discovered by Banz (1981). Keppler and Traub (1993) were the first to demonstrate that low-capitalization equity markets outperform large equity markets. The authors found that the smaller national equity markets in the MSCI Developed Markets universe produced an average annual return of 19.19% over the years 1975–1992. This outcome compared favorably with the 12.67% total compound return on the MSCI World Index. Furthermore, the small markets displayed lower downside characteristics. The outperformance of the small markets was later also confirmed by Asness et al. (1997) and by Keppler and Encinosa (2011). Size, measured by market capitalization, was also among the risk attributes examined by Harvey (2000).

The size effect was further demonstrated in several more recent studies. Fisher et al. (2017) show that stocks from small equity markets tend to have higher average returns than stocks from large countries. Notably, they emphasize that the country size effect is largely independent of the firm size effect and of other country-level quantitative factors such as the momentum or value effects. Zaremba and Umutlu (2018) also demonstrate the size effect in a large international sample, and Li and Pritamani (2015) show that it drives the returns on emerging and frontier markets. Similarly, Pungulescu (2014) points out that market size effects account for up to 1% per year in terms of expected returns in emerging countries. In contrast, Rikala (2017) focuses solely on European markets and finds no consistent evidence that small countries outperform large ones.

**Sources of the country size premium.** The firm-level size effect is frequently linked to additional risk factors, such as liquidity or information risk (see Norges Bank (2012) for a comprehensive review of the sources of the small firm effect). While Fisher et al. (2017) provide evidence that the country-size effect is not simply a firm-size effect "in disguise" (the effect does not arise because smaller markets are populated by smaller firms), the potential explanations usually revolve around the concept of risk. Rikala (2017) writes that "Intuitively, small countries producing higher returns is logical because of the widely acknowledged return profile of small stocks; investing in small firms produces higher returns in exchange for greater volatility and possibly even a return premium; a return in excess of the required compensation for additional risk." Fisher et al. (2017) conjecture that the small-country effect is due to home bias, but they provide mixed evidence in support of this conjecture. They also demonstrate that the country size effect does not simply stem from lower analyst coverage. Zaremba (2016) shows that accounting for country-specific risks (sovereign, political, etc.) can largely explain the abnormal returns for small markets. Pungulescu (2014), similarly to Zaremba (2016), demonstrates that the size effect is more pronounced in emerging countries than in developed countries, and that the size premium exists independently of the segmentation premium documented in the literature. Zaremba and Umutlu (2018) provide evidence that the country size premium is strongly concentrated in January, as in the case of the firm size effect (Keim 1983; Lamoureux and Sanger 1989; Daniel and Titman 1997). Finally, a white paper by Evans and Schmitz (2015) argues that the cross-sectional pattern related to market capitalization may simply be a statistical artifact that cannot be confirmed in recent data.

#### *4.3. Value Effect*

The value effect refers to the tendency of stocks with low valuation ratios, such as the price-to-earnings ratio or price-to-book ratio, to outperform stocks with high valuation ratios. For individual stocks, this phenomenon has been well known for about six decades (Nicholson 1960; Basu 1975, 1977, 1983; Reinganum 1981), but in equity indices it was documented only in the 1990s (Keppler 1991a, 1991b). In one of the earliest studies, Macedo (1995a, 1995b, 1995c) researches the performance of country portfolios based on 18 country equity indices. She forms quartile portfolios from sorts on three different indicators (the book-to-market ratio, dividend yield, and earnings yield) and tests their performance over an almost 20-year period. She concludes that the "cheap" countries outperformed the "expensive" markets, and that the differential annual return between the cheapest and the most expensive countries ranged from 1.25% to 8.54%, depending on the ratio selection, rebalancing frequency, and hedging approach.
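
The quartile sorts used in such studies can be sketched as follows. This is a minimal illustration under stated assumptions, not a reconstruction of any cited author's procedure: the function names are hypothetical, the signal is assumed to be a yield-type indicator (higher means cheaper, as with earnings yield or book-to-market), and portfolios are equal-weighted.

```python
from typing import List, Mapping, Sequence


def quartile_portfolios(signal: Mapping[str, float]) -> List[List[str]]:
    """Sort markets from cheapest (highest signal) to most expensive and split into four groups.

    Group sizes differ by at most one when the number of markets is not divisible by four.
    """
    n = len(signal)
    if n < 4:
        raise ValueError("need at least four markets to form quartiles")
    ranked = sorted(signal, key=lambda name: signal[name], reverse=True)
    bounds = [i * n // 4 for i in range(5)]
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(4)]


def equal_weighted_return(members: Sequence[str],
                          next_returns: Mapping[str, float]) -> float:
    """Average next-period return of the markets in one portfolio."""
    return sum(next_returns[name] for name in members) / len(members)


def value_spread(signal: Mapping[str, float],
                 next_returns: Mapping[str, float]) -> float:
    """Return of the cheapest quartile minus the return of the most expensive quartile."""
    groups = quartile_portfolios(signal)
    return (equal_weighted_return(groups[0], next_returns)
            - equal_weighted_return(groups[-1], next_returns))
```

Repeating the sort at each rebalancing date and averaging the spread over time gives the kind of differential return reported above; the rebalancing frequency and any currency hedging are left outside this sketch.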

The valuation effect was also confirmed in more recent studies that use broader data samples and longer timespans. For example, Angelidis and Tessaromatis (2018) investigated the performance of 23 developed markets over the 1980–2014 period. They found that value portfolios clearly outperformed market portfolios, delivering information ratios ranging from 0.27 to 0.39, depending on the weighting scheme. Further evidence for the value effect across countries was provided by Faber (2012), Klement (2012), Angelini et al. (2012), Ellahie et al. (2019), Novotny and Gupta (2015), Keimling (2016), Kim (2012), Heckman et al. (1996), Ferson and Harvey (1994b, 1998), Kortas et al. (2005), Lawrenz and Zorn (2017), Ferreira and Santa-Clara (2011), Desrosiers et al. (2007), Zaremba and Szczygielski (2019), Asness et al. (1997), and, finally, L'Her et al. (2004). Furthermore, Baltussen et al. (2019b) included the value effect in their two-century study, confirming its pervasive and robust character. However, Kim (2012) and Zaremba (2016) show that the effect is stronger in emerging markets than in developed countries.

**Valuation ratios.** The value effect in country equity indices can be examined with different valuation ratios. The majority of them parallel similar ratios or techniques used at the firm level. The most popular include the price-to-earnings (P/E) ratio (e.g., Ellahie et al. 2019; Kim 2012; Keimling 2016), the price-to-book (P/B) ratio (Ellahie et al. 2019; Angelidis and Tessaromatis 2018; Kortas et al. 2005), and the dividend yield (Keimling 2016; Hjalmarsson 2010; Keppler 1991a). Some articles also focus on modified versions of these valuation ratios. For instance, Kortas et al. (2005) use forward P/E ratios, and Lawrenz and Zorn (2017) concentrate on conditional price-to-fundamental ratios. Other ratios in use include the price-to-cash flow ratio (e.g., Keppler 1991a; Keimling 2016). Desrosiers et al. (2007) offer an alternative framework based on residual income. Zaremba and Szczygielski (2019) review several popular valuation ratios and conclude that the EBITDA-to-EV signal seems to be the most effective predictor of future cross-sectional returns. In addition, Ferreira and Santa-Clara (2011) show that several ratios can be combined to obtain superior performance.

Finally, there is one valuation ratio that was designed specifically for country-level predictions: the cyclically adjusted price-to-earnings ratio, abbreviated CAPE. This technique can be traced back to the seminal work "Security Analysis" by Graham and Dodd (1940). The authors put forward the idea of smoothing earnings over the previous few years in order to calculate valuation ratios. Nevertheless, the true father of the application of CAPE to equity premium predictions is Robert Shiller, the Nobel laureate of 2013. In their 1988 study, Campbell and Shiller (1988, p. 675) demonstrated that "a long moving average of real earnings helps to forecast future real dividends." Consequently, it might also be used to predict future returns. CAPE, also called the Shiller P/E, is computed as an index value divided by the average of trailing 10-year earnings adjusted for inflation. Numerous studies demonstrate that CAPE could also be successfully applied to country selection. For example, Faber (2012) examines the role of CAPE in a sample of 30 countries' equity markets for the years 1980–2011. He provides evidence that an equal-weighted portfolio of the quarter of countries with the lowest CAPE produces a mean yearly return of 13.5%, whereas the most expensive markets deliver only 4.3% per year. At the same time, the equal-weighted portfolio of all of the countries in the sample returned 9.4% per year. Klement (2012) demonstrates that CAPE can predict returns even within a five- to ten-year horizon. The efficiency of CAPE as a predictor of future returns was later verified and confirmed by Angelini et al. (2012), Novotny and Gupta (2015), Keimling (2016), and Ilmanen et al. (2019).
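
The CAPE definition given above (index value divided by the average of trailing 10-year inflation-adjusted earnings) can be illustrated with a short Python sketch. The function name and inputs are hypothetical, and two simplifications are assumed: annual rather than monthly observations, and earnings restated at the latest price level using a consumer price index.

```python
from typing import Sequence


def cape(index_level: float,
         earnings: Sequence[float],
         cpi: Sequence[float],
         years: int = 10) -> float:
    """Cyclically adjusted price-to-earnings ratio from annual data.

    `earnings` are nominal index-level earnings per year and `cpi` is the matching
    price index, both ordered oldest to newest. Each year's earnings are restated
    at the latest price level before averaging.
    """
    if len(earnings) != len(cpi):
        raise ValueError("earnings and cpi must have the same length")
    if years <= 0 or len(earnings) < years:
        raise ValueError("not enough history for the averaging window")
    latest_cpi = cpi[-1]
    real_earnings = [e * latest_cpi / c
                     for e, c in zip(earnings[-years:], cpi[-years:])]
    average_real_earnings = sum(real_earnings) / years
    if average_real_earnings <= 0:
        raise ValueError("average real earnings must be positive")
    return index_level / average_real_earnings
```

Computing this ratio for each country at a given date yields the sorting variable for the country portfolios discussed above, with low values indicating cheap markets.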

**Sources of the value effect across countries.** The common reasoning regarding the value effect is similar to the parallel effect at the firm level, linking it either to behavioral mispricing or to some risk factors not captured by the established asset pricing models. Nonetheless, the catalogue of risks may be slightly different due to differences in the nature of the asset class. Ellahie et al. (2019) find that low P/B countries face temporarily depressed current earnings, and their recovery in future earnings growth is uncertain. Moreover, the markets with low P/B also exhibit greater downside sensitivity to global earnings growth. Ferson and Harvey (1998) argue, for instance, that the P/B ratio has cross-sectional explanatory power at the global level, mainly because it contains information about global market risk exposures. Zaremba (2016) also shows that country-specific risk explains a large part of the country-level value premium.
