#### *4.5. Incremental Regression R<sup>2</sup>*

#### 4.5.1. The Determinants of Spread Benchmarks

The correlation analysis offers some interesting results. However, these results alone may not fully reveal how much of the variation in a liquidity benchmark is captured by variation in a liquidity proxy. A covariate can affect both the benchmark and the proxy, spuriously increasing or decreasing the correlation between them. One therefore needs to control for such covariates when examining the association between liquidity benchmarks and liquidity proxies.

One variable known to affect measured liquidity is price. Another is firm size. We include both the log of the stock price, log(*PRICE*), and the log of firm size, log(*SIZE*), as control variables in a cross-sectional regression framework. The dependent variable is one of the high-frequency liquidity benchmarks, while the independent variables are the control variables and a low-frequency liquidity proxy, with the proxies added one at a time. We run the regressions using all sample stocks. We first run the following benchmark regression without any proxy:

$$SPREAD\_i = \varphi\_0 + \varphi\_1 \log(PRICE\_i) + \varphi\_2 \log(SIZE\_i) + \sum CDUMs + \sum IDUMs + \varepsilon\_i, \tag{1}$$

where *SPREAD* is *QS*, *ES*, or *RS*, in turn. Subscript *i* refers to individual stocks, *i* = 1, ..., 1183. Country dummies and industry dummies are included. We obtain the adjusted *R*<sup>2</sup>, denoted *R*<sup>2</sup><sub>*ADJ*1</sub>, from the above regressions. Subsequently, we run the following regressions, adding spread proxies:

$$\begin{array}{ll}SPREAD\_i = & \varrho\_0 + \varrho\_1 \log(PRICE\_i) + \varrho\_2 \log(SIZE\_i) + \varrho\_3 SPREAD\\_PROXY\_i \\ & + \sum CDUMs + \sum IDUMs + \varepsilon\_i, \end{array} \tag{2}$$

where *SPREAD\_PROXY* is *ROLL*, *HASB*, or *LOT*, in turn. The corresponding adjusted *R*<sup>2</sup> is denoted *R*<sup>2</sup><sub>*ADJ*2</sub>. The incremental adjusted *R*<sup>2</sup> is calculated as:

$$Incremental \,\, R^2 = R\_{ADJ2}^2 - R\_{ADJ1}^2. \tag{3}$$

Panel A of Table 5 summarizes the regression results. The estimated coefficient of log(*PRICE*) is insignificant, whereas the estimated coefficient of log(*SIZE*) has the predicted negative sign and is highly significant in all regressions. When *ES* is the dependent variable, the benchmark regression of Equation (1) has an adjusted *R*<sup>2</sup> of 0.325. When *ROLL*, *HASB*, and *LOT* are added one at a time as in Equation (2), the corresponding coefficient estimates (*t*-statistics) are 0.296 (3.84), 0.256 (4.73), and 0.122 (3.94), respectively, and the adjusted *R*<sup>2</sup> increases to 0.361, 0.343, and 0.464, respectively. The incremental *R*<sup>2</sup> values are therefore 0.036, 0.018, and 0.139, respectively. The largest incremental *R*<sup>2</sup>, by a wide margin, comes from adding *LOT* to the regression. The results obtained using *QS* and *RS* as dependent variables are similar: the largest incremental *R*<sup>2</sup> values, 0.134 and 0.082, respectively, again come from adding *LOT* to the regression. The conclusion from the incremental *R*<sup>2</sup> analysis is fully consistent with the conclusions drawn from the measurement error and correlation structure analyses.
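The incremental adjusted *R*<sup>2</sup> of Equation (3) can be illustrated with a minimal sketch. The code below is not the estimation code behind Table 5: it uses synthetic data, the function names `adjusted_r2` and `incremental_r2` are hypothetical, and the country and industry dummies are omitted for brevity (they would simply enter as additional columns of `controls`).

```python
import numpy as np

def adjusted_r2(y, X):
    """Adjusted R^2 from an OLS fit of y on X.

    X must already contain an intercept column; k counts all columns of X.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

def incremental_r2(y, controls, proxy):
    """Adjusted R^2 with the proxy minus adjusted R^2 without it (Equation (3))."""
    X1 = np.column_stack([np.ones(len(y)), controls])  # benchmark regression
    X2 = np.column_stack([X1, proxy])                   # benchmark plus one proxy
    return adjusted_r2(y, X2) - adjusted_r2(y, X1)

# Synthetic illustration only: these numbers are made up, not sample data.
rng = np.random.default_rng(0)
n = 500
log_price = rng.normal(size=n)
log_size = rng.normal(size=n)
proxy = rng.normal(size=n)
spread = 0.5 - 0.3 * log_size + 0.8 * proxy + rng.normal(scale=0.5, size=n)

controls = np.column_stack([log_price, log_size])
gain = incremental_r2(spread, controls, proxy)
print(f"incremental adjusted R^2: {gain:.3f}")
```

Because the two regressions are nested, the difference isolates the explanatory power that the proxy adds beyond price and size, which is the quantity compared across proxies in Table 5.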




#### 4.5.2. The Determinants of Price Impact Benchmarks

Now, we examine the determinants of price impact benchmarks. We first run the following regression:

$$PRICE\\_IMPACT\_i = \varphi\_0 + \varphi\_1 \log(PRICE\_i) + \varphi\_2 \log(SIZE\_i) + \sum CDUMs + \sum IDUMs + \varepsilon\_i, \tag{4}$$

where *PRICE\_IMPACT* is *LAMBDA*, *IMP*, or *ASC*, in turn. Subscript *i* refers to individual stocks, *i* = 1, ..., 1183. Country dummies and industry dummies are included. Thereafter, we run the following regressions, adding price impact proxies:

$$\begin{array}{ll} PRICE\\_IMPACT\_i = & \varphi\_0 + \varphi\_1 \log(PRICE\_i) + \varphi\_2 \log(SIZE\_i) + \varphi\_3 PRICE\\_IMPACT\\_PROXY\_i \\ & + \sum CDUMs + \sum IDUMs + \varepsilon\_i, \end{array} \tag{5}$$

where *PRICE\_IMPACT\_PROXY* = *AMIHUD*, 1/*AMIVEST*, and *PASTOR*, respectively. The incremental adjusted *R*<sup>2</sup> is calculated as in Equation (3).

Panel B of Table 5 summarizes the regression results. Overall, the results for price impacts are weaker than those for spreads. When *LAMBDA* is the dependent variable, the benchmark regression in Equation (4) yields an adjusted *R*<sup>2</sup> of 0.087. When *AMIHUD*, 1/*AMIVEST*, and *PASTOR* are added one at a time as in Equation (5), the corresponding coefficient estimates (*t*-statistics) are 0.047 (1.51), 0.403 (3.12), and 0.914 (1.42), respectively. The corresponding adjusted *R*<sup>2</sup> increases to 0.160, 0.671, and 0.114, respectively, so the incremental *R*<sup>2</sup> values are 0.073, 0.584, and 0.027, respectively. The largest incremental *R*<sup>2</sup>, by a wide margin, comes from adding 1/*AMIVEST* to the regression.

To understand why the results from adding price impact proxies are weaker when *LAMBDA* is used as the price impact benchmark, we run the same regressions for each of the four country groups partitioned by turnover. Group G1 has a total of 530 stocks. The regression results for the G1 group are much stronger. When *AMIHUD*, 1/*AMIVEST*, and *PASTOR* are added one at a time, the corresponding coefficient estimates (*t*-statistics) are 1.097 (13.21), 1.988 (10.15), and 5.045 (7.12), respectively. The incremental *R*<sup>2</sup> values become 0.389, 0.396, and 0.319, respectively. Therefore, for the more liquid emerging markets in the G1 group, all three price impact proxies do a very good job of predicting the price impact benchmark, *LAMBDA*. The weak full-sample results are driven by less liquid markets. The notably large full-sample incremental *R*<sup>2</sup> value of 0.584 from adding 1/*AMIVEST* to the regression is driven by stocks in the G3 group.

The results using *IMP* and *ASC* as dependent variables yield a different conclusion. Panel B shows that the largest incremental *R*<sup>2</sup> values, 0.018 and 0.098, respectively, come from adding *PASTOR* to the regression. Both of these results are driven by the stocks in the G3 group, where the markets are less liquid. To some extent, the conclusion regarding *PASTOR* is consistent with the measurement error analysis for price impact in Panel B of Table 3, where *PASTOR* turns out to be the better proxy for the less liquid markets in the G3 and G4 groups.

#### *4.6. Firm and Market Characteristics and Accuracy of Liquidity Proxies*

The analysis so far focuses on how accurately various low-frequency liquidity proxies predict high-frequency liquidity variables. Here, we examine which firm and market characteristics determine the effectiveness of liquidity proxies. Specifically, we run regressions to see whether the accuracy of individual liquidity proxies depends on firm and market characteristics that are known to affect liquidity. The dependent variable in the regression is calculated as:

$$ACCURACY\_i = \log(1/|\text{Low-Frequency Proxy}\_i - \text{High-Frequency Benchmark}\_i|) \tag{6}$$

Since the denominator inside the logarithm of *ACCURACY<sub>i</sub>* is the absolute measurement error of a liquidity proxy, the smaller the error, the larger the value of *ACCURACY<sub>i</sub>*. We apply the log transformation because the untransformed inverse error exhibits an extreme distribution. The regression is run as follows:

$$\begin{aligned} ACCURACY\_i &= \phi\_0 + \phi\_1 \log(PRICE\_i) + \phi\_2 Turnover\_i + \phi\_3 Stock\ Volatility\_i + \phi\_4 \log(SIZE\_i) \\ &+ \phi\_5 Investability\_i + \phi\_6 Market\ Volatility\_i + \phi\_7 Legal\ Origin\_i \\ &+ \phi\_8 Trading\ Mechanism\_i + \sum CDUMs + \sum IDUMs + \varepsilon\_i. \end{aligned} \tag{7}$$

In constructing the dependent variables, spread proxies include *ROLL*, *HASB*, and *LOT*. Spread benchmarks include *ES*, *QS*, and *RS*. Furthermore, price impact proxies include *AMIHUD*, 1/*AMIVEST*, and *PASTOR*, while price impact benchmarks include *LAMBDA*, *IMP*, and *ASC*. The independent variables include five firm characteristic variables and three market characteristic variables. The firm characteristics are stock price, turnover, return volatility, firm size, and investability. Investability measures accessibility to foreign investors and takes a value between zero (not accessible to foreigners) and one (fully accessible).

The variables that capture market characteristics are market volatility, legal origin, and trading mechanism. Market volatility is the standard deviation of daily returns on the leading market index in each market. A country's legal origin is taken from La Porta et al. (1998). The legal origin variable takes the value of one if the country's legal system is based on common law, and zero otherwise. La Porta et al. (1998) report that countries with common law systems generally have stronger investor protection than those with other legal systems. The degree of information asymmetry is accordingly lower in these countries. The data on trading mechanisms are from Jain (2005). We assign a value of one for a pure limit-order system, and zero for a dealer or a hybrid system.
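As a hedged illustration of how the dependent variable in Equation (6) behaves, the sketch below computes the accuracy measure for a few made-up proxy and benchmark values. The function name `accuracy` and the input numbers are assumptions for illustration only, and the handling of exact matches (zero error) is not specified in the text.

```python
import numpy as np

def accuracy(proxy, benchmark):
    """Accuracy as in Equation (6): log(1 / |proxy - benchmark|).

    A zero error would give +inf; how such cases are treated is not
    described in the text, so this sketch leaves them unhandled.
    """
    err = np.abs(np.asarray(proxy, dtype=float) - np.asarray(benchmark, dtype=float))
    return -np.log(err)  # identical to log(1 / err)

# Hypothetical spreads for three stocks (illustrative values, not sample data).
proxy_spread = np.array([0.020, 0.011, 0.050])
benchmark_spread = np.array([0.010, 0.010, 0.010])

acc = accuracy(proxy_spread, benchmark_spread)
# The second stock has the smallest error and hence the highest accuracy.
print(acc)
```

The log transformation compresses the inverse error, so a stock whose proxy is ten times closer to the benchmark gains a fixed increment of log(10) in accuracy rather than a tenfold jump.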

The regression results of Equation (7) for spread proxies appear in Panel A of Table 6. For brevity, we only report the results when the spread benchmark is *ES*. The evidence shows that volatility is significantly related to the accuracy of spread proxies. Individual firms' return volatilities have significant and negative coefficients for all three spread proxies. Volatility adds noise to the proxy estimates, so high volatility is associated with lower accuracy. Firm size has a positive and highly significant coefficient: spread proxies are more accurate for larger firms. Investability also has a positive and highly significant coefficient, which suggests that spread proxies are more accurate when firms are more accessible to foreign investors. When *LOT* is used as a proxy for *ES*, the model performs best, with an *R*<sup>2</sup> exceeding 0.40. *LOT* tracks the spread better when a firm has a higher stock price, higher turnover, lower return volatility, larger market capitalization, and greater accessibility to foreigners.

Among the market characteristics, market volatility displays a positive and highly significant coefficient in all three models, in which *ROLL*, *HASB*, and *LOT* serve as the spread proxy, respectively. The results show that, once individual stock volatility is accounted for, greater market volatility is associated with higher accuracy of a spread proxy. One possible explanation is that higher market volatility is an indicator of market development (Bekaert and Harvey 1995). The legal origin variable exhibits a significant and positive coefficient in the models where *ROLL* and *HASB* are used as proxies for *ES*.

Now, we turn to the regressions for price impact proxies in Panel B of Table 6. For brevity, we only report the results when the price impact benchmark is *LAMBDA*. Overall, the set of firm and market characteristics explains the variation in the accuracy of price impact proxies reasonably well. The *R*<sup>2</sup> values range from 0.529 to 0.816, much higher than those from the regressions for spread proxies, which range from 0.316 to 0.409. In general, accuracy increases with turnover, firm size, and investability, and decreases with individual stock return volatility. The indicator variable for legal origin is positive and highly significant in all three regressions. This suggests that all three price impact proxies are more effective in markets with common law legal systems. The indicator variable for the trading mechanism is also positive and highly significant in all three regressions. This suggests that all three price impact proxies work better in a limit-order-based system than in dealer or hybrid systems.


**Table 6.** Determinants of accuracy of Liquidity Proxies. The table presents the coefficients (*t*-statistics) from the cross-sectional regressions of the accuracy measures of liquidity proxies on firm and market characteristics. The accuracy measure is calculated as log(1/|Proxy − Benchmark|). The spread benchmark is *ES*. The spread
