#### *3.6. Contagion*

Contagion is an important topic in both economics and finance. Wan and Wong (2001) provide a simple example of a refinancing game with incomplete information, in which the lack of transparency is both necessary and sufficient for the propagation of local financial distress across disjoint financial networks.

Several tests for contagion are available, including those developed by Fry et al. (2010) and Fry-McKibbin and Hsiao (2015). These tests can be used for big data, as well as for large data sets that might not be characterized as big data.
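
As a hedged illustration of how such a test can be set up, the sketch below implements the volatility-adjusted correlation test of Forbes and Rigobon (2002), a well-known correlation-based contagion test; the tests of Fry et al. (2010) and Fry-McKibbin and Hsiao (2015) extend this logic to higher-order comoments such as coskewness. All variable names are illustrative.

```python
# A minimal sketch of a correlation-based contagion test in the spirit of
# Forbes and Rigobon (2002); not the Fry et al. (2010) coskewness test itself.
import numpy as np
from scipy import stats

def forbes_rigobon_test(src_tranquil, tgt_tranquil, src_crisis, tgt_crisis):
    """Test for contagion from a source to a target market.

    H0: no contagion, i.e. the crisis correlation, adjusted for the higher
    volatility of the source market, equals the tranquil-period correlation.
    """
    rho_t = np.corrcoef(src_tranquil, tgt_tranquil)[0, 1]
    rho_c = np.corrcoef(src_crisis, tgt_crisis)[0, 1]
    # Volatility-adjust the crisis correlation (Forbes-Rigobon correction).
    delta = np.var(src_crisis, ddof=1) / np.var(src_tranquil, ddof=1) - 1.0
    rho_c_adj = rho_c / np.sqrt(1.0 + delta * (1.0 - rho_c**2))
    # Fisher z-transform both correlations and compare.
    z = (np.arctanh(rho_c_adj) - np.arctanh(rho_t)) / np.sqrt(
        1.0 / (len(src_crisis) - 3) + 1.0 / (len(src_tranquil) - 3))
    return rho_c_adj, 1.0 - stats.norm.cdf(z)  # one-sided p-value
```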

#### *3.7. Technical Analysis*

The new financial indicator introduced by Wong et al. (2001) to test the performance of stock market forecasts can be classified as technical analysis. Substantial research has been undertaken in technical analysis. For example, Wong et al. (2003) use technical analysis to signal the timing of stock market entry and exit.

The authors introduce test statistics to evaluate the performance of the most established of the trend followers, namely the Moving Average (MA), and the most frequently used counter-trend indicator, namely the Relative Strength Index (RSI). Using Singapore data, the empirical results indicate that these indicators can be used to generate significantly positive returns. It is found that member firms of the Singapore Stock Exchange tend to enjoy substantial profits by applying relatively simple technical indicators.
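
For concreteness, a minimal sketch of the two indicators follows; the window lengths are illustrative defaults, not necessarily those used in the cited studies.

```python
# A minimal sketch of an MA crossover rule and the RSI; windows are illustrative.
import pandas as pd

def ma_signal(price: pd.Series, short: int = 5, long: int = 20) -> pd.Series:
    """Trend-following rule: long (+1) when the short MA is above the long MA."""
    fast = price.rolling(short).mean()
    slow = price.rolling(long).mean()
    return (fast > slow).astype(int).replace(0, -1)

def rsi(price: pd.Series, window: int = 14) -> pd.Series:
    """Counter-trend indicator: RSI = 100 - 100 / (1 + avg gain / avg loss)."""
    change = price.diff()
    gain = change.clip(lower=0).rolling(window).mean()
    loss = (-change.clip(upper=0)).rolling(window).mean()
    return 100 - 100 / (1 + gain / loss)
```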

Wong et al. (2005) examine the profitability of applying technical analysis that signals entry to and exit from the stock market in three Greater China stock markets, namely the Shanghai, Hong Kong, and Taiwan stock exchanges. Applying the trading signals generated by the MA family to these markets generates significantly positive returns that outperform the buy-and-hold strategy. The cumulative wealth obtained also surpasses that of the buy-and-hold strategy, regardless of transaction costs.

In addition, the authors analyse the performance of the MA family before and after the 1997 Asian Financial Crisis, and find that the MA family works well in both sub-periods, as well as in different market conditions of bull runs, bear markets, and mixed markets. The empirical observation that technical analysis can forecast the directions of these markets implies that the three Greater China stock markets are not efficient. Lam et al. (2007) examine whether a day's surge or plummet in stock prices serves as a market entry or exit signal. Returns of five trading rules based on one-day and intraday momentum are estimated for several major world stock indices. It is found that the trading rules perform well for the Asian indices, but not for those of Europe and the USA.

Kung and Wong (2009a) investigate whether the measures implemented in Singapore after the 1997 Asian Financial Crisis have led to lower profitability for those investors who employ technical rules for trading stocks. Their results show that the three trading rules consistently generate higher annual returns for 1988–1996 than for 1999–2007. Furthermore, they generally perform better than the buy-and-hold (BH) strategy for 1988–1996, but no better than the BH strategy for 1999–2007. These findings suggest that the efficiency of the Singapore stock market has been considerably enhanced by the measures implemented after the financial crisis.

Kung and Wong (2009b) use two popular technical trading rules to assess whether the gradual liberalization of Taiwan's securities markets has improved the efficiency of its stock market. The results show that the two rules have considerable predictive power for 1983–1990, become less predictive for 1991–1997, and cannot predict the market for 1998–2005. These empirical results indicate that the efficiency of the Taiwan stock market has been greatly enhanced by the liberalization measures implemented over the past 20 years. The above studies examine technical analysis for reasonably big data sets. In addition, academics and practitioners can apply technical analysis to examine the performance of a larger number of stock markets, as well as other financial markets, for larger data sets.

#### *3.8. Cost of Capital*

Gordon and Shapiro (1956) develop the dividend yield plus growth model for individual firms, while Thompson (1985) improves the theory by combining the model with an analysis of past dividends to estimate the cost of capital and its 'reliability'. Thompson and Wong (1996) extend the theory by obtaining estimates of the cost of equity capital and its reliability.
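
For reference, the dividend yield plus growth estimate of the cost of equity takes the simple form sketched below; the numbers in the example are hypothetical.

```python
# The dividend yield plus growth model of Gordon and Shapiro (1956):
# cost of equity k = D0 * (1 + g) / P0 + g, where D0 is the current dividend,
# P0 the current share price, and g the expected dividend growth rate.
def cost_of_equity(d0: float, p0: float, g: float) -> float:
    """Gordon growth estimate of the cost of equity capital."""
    return d0 * (1 + g) / p0 + g

# Hypothetical example: $2 dividend, $50 price, 4% growth.
print(f"{cost_of_equity(2.0, 50.0, 0.04):.4f}")  # 0.0816, i.e. 8.16%
```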

Wong and Chan (2004) extend the theory by developing estimators of the reliability, and prove that the estimators are consistent. Estimation of the cost of equity capital and its reliability can be used for both big data and large data sets that might not be classified as big data.

#### *3.9. Robust Estimation*

Bian and Dickey (1996) develop a robust Bayesian estimator for the vector of regression coefficients using a Cauchy-type g-prior. This estimator is an adaptive weighted average of the least squares estimator and the prior location, and is robust with respect to fat-tailed sample distributions.

Bian and Wong (1997) develop an alternative approach to estimating the regression coefficients. Wong and Bian (2000) introduce the robust Bayesian estimator developed by Bian and Dickey (1996) to the estimation of the Capital Asset Pricing Model (CAPM), in which the distribution of the error component is widely known to be fat-tailed.

In order to support their proposal, the authors apply both the robust Bayesian estimator and the least squares estimator in simulations of CAPM, and also in the analysis of CAPM for US annual and monthly stock returns. The simulation results show that the Bayesian estimator is robust and superior to the least squares estimator when the CAPM is contaminated by large normal and non-normal disturbances, especially with Cauchy disturbances.

In their empirical study, the authors find that the robust Bayesian estimate is uniformly more efficient than the least squares estimate in terms of the relative efficiency of one-step-ahead forecast mean square errors, especially in small samples. They adopt the Bian and Dickey (1996) estimator because it is adaptive and robust with respect to fat-tailed sample distributions. However, few papers have used this estimator in practice.

This estimator is adaptive and robust in the sense that if the sample does not contain outliers, the estimator will rely more on the sample information. On the other hand, if there are many outliers in the sample, the robust Bayesian estimator will use more information arising from the prior. To the best of our knowledge, only the estimator in Bian and Dickey (1996) has this feature, and so this estimator is recommended. It should be noted that the robust Bayesian estimator can be used for big data, and for large data sets that might not be interpreted as such.
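
A stylized sketch of this adaptive weighted-average structure follows. It is not the Bian and Dickey (1996) estimator itself, whose weights follow from the Cauchy-type g-prior; the weight rule below is only an illustration of the idea that the estimator leans on the prior when the sample appears outlier-contaminated, and on least squares otherwise.

```python
# Illustrative adaptive shrinkage between least squares and a prior location;
# NOT the Bian-Dickey estimator, whose weights derive from a Cauchy-type g-prior.
import numpy as np

def adaptive_shrinkage(X, y, prior_location, c: float = 2.0):
    beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_ls
    # Compare a robust scale (MAD) with the usual standard deviation:
    # heavy contamination inflates the ratio, shifting weight to the prior.
    mad = np.median(np.abs(resid - np.median(resid))) / 0.6745
    ratio = resid.std(ddof=X.shape[1]) / max(mad, 1e-12)
    w = 1.0 / (1.0 + max(ratio - 1.0, 0.0) / c)  # weight on least squares
    return w * beta_ls + (1.0 - w) * np.asarray(prior_location)
```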

#### *3.10. Unit Roots, Cointegration, Causality Tests, and Nonlinearity*

We have applied several tests related to unit roots, cointegration, and causality, including tests for higher moments, specifically a simple test for causality in volatility (see Chang and McAleer 2017), and discuss a few of these innovations below.

Tiku and Wong (1998) develop a unit root test to accommodate data that follow an AR(1) process. They use the three-moment chi-square and four-moment F approximations to test for unit roots in an AR(1) model when the innovations have one of a wide family of symmetric Student's *t*-distributions.
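
The Tiku-Wong test itself is not available in standard software, but the minimal sketch below illustrates unit root testing in an AR(1) model with heavy-tailed innovations, using the standard augmented Dickey-Fuller test from statsmodels as a stand-in.

```python
# Unit root testing on a simulated random walk with Student's t innovations;
# the ADF test stands in for the Tiku-Wong MML-based test.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
e = rng.standard_t(df=5, size=500)   # heavy-tailed innovations
y = np.cumsum(e)                     # random walk: unit root under H0
stat, pvalue, *_ = adfuller(y, maxlag=1, regression="c")
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")  # fail to reject H0
```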

In cointegration analysis, vector error-correction models (VECMs) have become an important means of analysing long-run cointegrating equilibrium relationships. The usual full-order VECMs assume nonzero entries in all of their coefficient matrices. However, applications of VECMs to economic and financial time series data have revealed that zero entries are indeed possible. If indirect causality or Granger non-causality exists among the variables, the use of a full-order VECM will incorrectly conclude only the existence of Granger causality among these variables.

In addition, the statistical and numerical accuracy of the cointegrating vectors estimated in a misspecified full-order VECM will be problematic. It has been argued that the zero–non-zero (ZNZ) patterned VECM is a more straightforward and effective means of testing for both indirect causality and Granger non-causality. Wong et al. (2004) present simulations and an application that demonstrate the usefulness of the ZNZ patterned VECM.
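The ZNZ-patterned VECM is not implemented in standard libraries; the sketch below shows only the full-order baseline (on simulated cointegrated data) that the ZNZ approach refines.

```python
# Full-order VECM estimation in statsmodels, with the cointegrating rank
# selected by the Johansen trace test; data are simulated.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM, select_coint_rank

rng = np.random.default_rng(1)
common = np.cumsum(rng.normal(size=400))   # shared stochastic trend
data = np.column_stack([common + rng.normal(size=400),
                        0.5 * common + rng.normal(size=400)])
rank = select_coint_rank(data, det_order=0, k_ar_diff=1)  # Johansen trace test
res = VECM(data, k_ar_diff=1, coint_rank=rank.rank, deterministic="co").fit()
print(res.beta)   # estimated cointegrating vector(s)
```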

Lam et al. (2006) derive some properties of the autocorrelation of the k-period returns for the general mean reversion (GMR) process, in which the stationary component is not restricted to the AR(1) process but takes the form of a general autoregressive–moving-average (ARMA) process. The authors derive further properties of the GMR process and three new nonparametric tests that compare the relative variability of returns over different horizons to validate the GMR process as an alternative to a random walk. The authors examine the asymptotic properties of the new tests, which can be used to distinguish random walk models from GMR processes.

The traditional linear Granger causality test has been widely used to examine linear causality among several time series in bivariate settings, as well as in multivariate settings. Hiemstra and Jones (1994) develop a nonlinear Granger causality test in a bivariate setting to investigate the nonlinear causality between stock prices and trading volume. Bai et al. (2010) extend the work by developing a nonlinear causality test in multivariate settings.

Bai et al. (2011b) discuss linear causality tests in multivariate settings, and thereafter develop a multivariate nonlinear causality test. A Monte Carlo simulation demonstrates the superiority of the proposed multivariate test over its bivariate counterpart. In addition, the authors illustrate the applicability of the proposed test by analyzing the relationships among different Chinese stock market indices.
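
The multivariate and nonlinear tests of Bai et al. (2010, 2011b) are not available in standard libraries, but the bivariate linear Granger causality test on which they build can be run directly in statsmodels, as in the sketch below on simulated data.

```python
# Bivariate linear Granger causality test; data are simulated so that x causes y.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.5 * x[t - 1] + rng.normal()   # y depends on lagged x
# Convention: the test asks whether the SECOND column Granger-causes the first.
res = grangercausalitytests(np.column_stack([y, x]), maxlag=2)
```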

Hui et al. (2017) propose a simple and efficient method to examine whether a time series process possesses any nonlinear features by testing for dependence remaining in the residuals after fitting the data with a linear model. The advantage of the proposed nonlinearity test is that one does not need to know the exact nonlinear features or the detailed nonlinear form of the time series process. It can also be used to test whether a hypothesized model, including linear and nonlinear components of the variable being examined, is appropriate, as long as the residuals of the model being used can be estimated.

The simulation study shows that the proposed test is stable and powerful. The authors apply the proposed statistic to test whether there is any nonlinear feature in sunspot data, and whether the S&P 500 index follows a random walk. The conclusion drawn from the proposed test is consistent with results that are available from alternative tests.
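
The Hui et al. (2017) statistic is not part of standard software, but the underlying idea, testing for dependence remaining in the residuals of a linear fit, can be illustrated with the BDS test, as in the sketch below.

```python
# Fit a linear AR(1) model to a nonlinear series, then apply the BDS test to
# the residuals; remaining dependence signals a neglected nonlinear feature.
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.stattools import bds

rng = np.random.default_rng(3)
e = rng.normal(size=600)
y = np.zeros(600)
for t in range(1, 600):
    y[t] = 0.3 * y[t - 1] + 0.6 * e[t - 1] ** 2 + e[t]   # nonlinear in e
resid = AutoReg(y, lags=1).fit().resid
stat, pvalue = bds(resid, max_dim=2)
print(f"BDS statistic = {stat}, p-value = {pvalue}")  # small p => nonlinearity
```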

An early development in testing for causality (technically, Granger non-causality) in the conditional variance (or volatility) associated with financial returns was the portmanteau statistic for non-causality in the variance of Cheung and Ng (1996). A subsequent development was the Lagrange Multiplier (LM) test of non-causality in the conditional variance by Hafner and Herwartz (2008), who provided simulation results to show that their LM test was more powerful than the portmanteau statistic for sample sizes of 1000 and 4000 observations.

Although the LM test for causality proposed by Hafner and Herwartz (2008) is an interesting and useful development, it is nonetheless arbitrary. In particular, the specification on which the LM test is based does not rely on an underlying stochastic process, so the alternative hypothesis is also arbitrary, which can affect the power of the test.

Chang and McAleer (2017) derive a simple test for causality in volatility that provides regularity conditions arising from the underlying stochastic process, namely a random coefficient AR process, and a test for which the (quasi-) maximum likelihood estimates have valid asymptotic properties under the null hypothesis of non-causality. The simple test is intuitively appealing as it is based on an underlying stochastic process, is sympathetic to Granger's (1969, 1988) notion of time series predictability, is easy to implement, and has a regularity condition that is not available in the LM test.
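
As a hedged illustration of the general approach, the sketch below implements the idea behind the Cheung and Ng (1996) statistic: standardize each return series by its fitted GARCH volatility, then form a portmanteau statistic from the cross-correlations of the squared standardized residuals. It uses the third-party `arch` package; the lag length is illustrative, and the exact construction in the cited papers may differ in detail.

```python
# A sketch of a Cheung-Ng style portmanteau test for causality in variance.
import numpy as np
from arch import arch_model

def squared_std_resid(returns):
    """Fit a GARCH(1,1) model and return squared standardized residuals."""
    res = arch_model(returns, vol="GARCH", p=1, q=1).fit(disp="off")
    return np.asarray(res.std_resid) ** 2

def cheung_ng_stat(r_cause, r_effect, lags=5):
    """Portmanteau statistic for causality in variance from r_cause to r_effect."""
    v, u = squared_std_resid(r_cause), squared_std_resid(r_effect)
    n = len(u)
    # Correlate today's squared residual of the effect series with lagged
    # squared residuals of the candidate causal series.
    ccf = [np.corrcoef(u[k:], v[:n - k])[0, 1] for k in range(1, lags + 1)]
    return n * float(np.sum(np.square(ccf)))  # approx. chi-squared(lags) under H0
```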

We note that cointegration, causality, and nonlinearity tools are very useful in analyzing many important issues, and explain many financial and economic phenomena well. For example, using these tools, Batai et al. (2017) examine the factors that have a long-run equilibrium relationship, short-run impact, and causality with the exchange rate of Mongolia against China, to shed light on exchange rate determination.

The authors find that, in the long run, the gross domestic product (GDP) of China and the index of world price have significantly positive effects, while Mongolia's GDP and the Shanghai stock index have significantly negative effects on the Mongolian exchange rate.

The research also reveals the existence of short-run dynamic interactions, and highly significant linear and nonlinear multivariate causality from all the explanatory variables to the Mongolian exchange rate. The authors observe strong linear causality from each of the GDPs of Mongolia and China and from the index of world price to the Mongolian exchange rate, but not from the Shanghai stock index. Moreover, there is strongly significant nonlinear causality from the Shanghai stock index to the Mongolian exchange rate, and weakly significant nonlinear causality from both the GDP of China and the index of world price, but not from Mongolia's GDP. The empirical findings are useful for investors, manufacturers, and traders in their investment decision-making, and for policy makers in their decisions regarding both monetary and fiscal policies that could affect the Mongolian exchange rate.

Academics and practitioners can apply unit root, cointegration, causality, and nonlinearity tests in many different areas involving big data and large data sets, as in empirical finance using nano-tick data, and in dynamic panel data models with large cross-section and time-series dimensions. The literature applying unit root, cointegration, causality, and nonlinearity tests includes Wong et al. (2004, 2006), Qiao et al. (2007, 2008a, 2008b, 2009, 2011), Foo et al. (2008), Chiang et al. (2009), Vieito et al. (2015), and Chang and McAleer (2017), among many others.

#### *3.11. Confidence Intervals*

Homm and Pigorsch (2012) use the Aumann and Serrano index, which is well known to have advantages over alternative measures, to develop a new economic performance measure (EPM). Niu et al. (2018) extend the theory by constructing a one-sample confidence interval for the EPM, and confidence intervals for the difference of the EPMs of two independent samples. The authors also derive the asymptotic distribution of the EPM and of the difference of two EPMs when the samples are independent. They conduct simulations to show that the proposed theory performs well for one and two independent samples.

The simulations show that the proposed approach is robust in the dependent case. The theory developed is used to construct both one-sample and two-sample confidence intervals of EPMs for the Singapore and USA stock indices. It is worth noting that estimation of the confidence intervals can be used for big data, and large finite samples that are not regarded as big data.

The theory of confidence intervals for the EPM developed in Niu et al. (2018) can be extended to construct confidence intervals for any risk measure or economic indicator, which, in turn, could be applied to big data, as well as to large finite data samples that are not otherwise classified as big data.
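
The closed-form intervals of Niu et al. (2018) are specific to the EPM, but the general point can be illustrated with a simple percentile bootstrap for an arbitrary performance measure; in the sketch below, the Sharpe ratio stands in for the measure of interest, and the data are simulated.

```python
# Percentile bootstrap confidence interval for a generic performance measure.
import numpy as np

def bootstrap_ci(returns, measure, n_boot=5000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    stats = [measure(rng.choice(returns, size=len(returns), replace=True))
             for _ in range(n_boot)]
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

sharpe = lambda r: r.mean() / r.std(ddof=1)              # stand-in measure
r = np.random.default_rng(4).normal(0.05, 1.0, size=250)  # simulated returns
print(bootstrap_ci(r, sharpe))                            # 95% interval
```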

#### *3.12. Other Econometrics Models and Tests*

The literature provides numerous alternative econometric and statistical models and tests, several of which have been used in a number of cognate disciplines, including economics, finance, management, marketing, and statistics. Some of these are discussed below.

Wong and Miller (1990) develop a theory and methodology for repeated time series (RTS) measurements on an autoregressive integrated moving average noise (ARIMAN) process. The theory enables a relaxation of the normality assumption in the ARIMAN model, and the identification of appropriate models for each component series of the relevant stochastic process. The authors discuss the properties, estimation, and forecasting of RTS ARIMAN models, and provide illustrative examples.

Wong et al. (2001) extend the theory and methodology of Wong and Miller (1990) by allowing the error variance, as well as the number of repetitions, to change over time. They show that the model is identified, and derive the maximum likelihood estimator using the Kalman filter technique.

Tiku et al. (2000) consider AR(q) models in time series with non-normal innovations represented by a member of a wide family of symmetric (Student's *t*) distributions. Since the maximum likelihood (ML) estimators are intractable, the authors derive modified maximum likelihood (MML) estimators of the parameters and show that they are remarkably efficient. They use these estimators for hypothesis testing, and show that the resulting tests are robust and powerful.

Tiku et al. (1999a) extend the methods by considering AR(q) models in time series with asymmetric innovations represented by two families of distributions, namely (i) gamma, with support (0, ∞), and (ii) generalized logistic, with support (−∞, ∞). As the maximum likelihood estimators (MLE) are intractable, the authors derive modified maximum likelihood (MML) estimators of the parameters and show that they are both easy to compute and efficient. The authors also investigate the efficiency properties of the classical least squares (LS) estimators, and find that their efficiencies relative to the proposed MML estimators are very low.

Tiku et al. (1999b) estimate the coefficients in a simple regression model in the presence of autocorrelated errors. The underlying distribution is assumed to be symmetric, with the Student's *t* family used for illustration. Closed-form estimators are obtained, and shown to be remarkably efficient and robust.

Wong and Bian (2005) extend these results to the case where the underlying distribution is a generalized logistic distribution. The generalized logistic family covers a very wide range of skewed distributions, from highly right-skewed to highly left-skewed. Analogously, the authors develop MML estimators, as the ML estimators are intractable for generalized logistic data. The authors examine the asymptotic properties of the proposed estimators, and conduct simulations showing that, in small samples, the associated tests have good size and high power.
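
The MML estimators discussed above are not available in standard libraries; as a baseline illustration of estimating a simple regression with AR(1) errors, the sketch below uses feasible generalized least squares (GLSAR in statsmodels) on simulated fat-tailed data.

```python
# Simple regression with AR(1), fat-tailed errors, estimated by GLSAR;
# a baseline stand-in for the MML estimators of the cited papers.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.normal(size=300)
e = np.zeros(300)
for t in range(1, 300):
    e[t] = 0.6 * e[t - 1] + rng.standard_t(df=4)   # autocorrelated, fat-tailed
y = 1.0 + 2.0 * x + e
model = sm.GLSAR(y, sm.add_constant(x), rho=1)      # AR(1) error structure
res = model.iterative_fit(maxiter=10)               # alternate rho and beta
print(res.params, model.rho)
```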

As discussed in Section 3.9, Wong and Bian (2000) introduce the robust Bayesian estimator of Bian and Dickey (1996), an adaptive weighted average of the least squares estimator (LSE) and the prior location, to the estimation of the CAPM, and find in both simulations and empirical applications that it is robust and more efficient than the LSE, especially in small samples. Bian et al. (2013) develop a modified maximum likelihood (MML) estimator for the multiple linear regression model with an underlying Student's *t*-distribution.

The authors obtain a closed-form solution for the estimators, derive their asymptotic properties, and demonstrate that the MML estimator is more appropriate for estimating the parameters of the CAPM by comparing its performance with that of the LSE for monthly returns of US portfolios. The empirical results reveal that the MML estimators are more efficient than the LSE in terms of the relative efficiency of one-step-ahead forecast mean square errors in small samples.

Bian et al. (2011) develop a new test, namely the trinomial test, for paired ordinal data samples to improve the power of the sign test by modifying its treatment of zero differences between observations, thereby increasing the use of sample information. Simulations demonstrate the power superiority of the proposed trinomial test over the sign test in small samples in the presence of tied observations.

The authors also show that the proposed trinomial test has substantially higher power than the sign test in large samples in the presence of tied observations, as the sign test ignores the information in observations resulting in ties.
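
A hedged sketch of the idea follows: unlike the sign test, ties are retained, and the null distribution of the difference between positive and negative counts is taken to be trinomial, with the tie probability estimated from the data. The exact construction in Bian et al. (2011) may differ in detail.

```python
# Illustrative trinomial test for paired data with ties (two-sided p-value).
import numpy as np
from scipy.stats import multinomial

def trinomial_test(x, y):
    d = np.asarray(x) - np.asarray(y)
    n = len(d)
    n_plus, n_minus = int((d > 0).sum()), int((d < 0).sum())
    p0 = (d == 0).mean()                      # estimated tie probability
    p = [(1 - p0) / 2, p0, (1 - p0) / 2]      # H0: P(+) = P(-)
    # Two-sided p-value: P(|N+ - N-| >= |observed|) under the trinomial law.
    obs = abs(n_plus - n_minus)
    pval = 0.0
    for i in range(n + 1):                    # i = number of positive signs
        for j in range(n - i + 1):            # j = number of negative signs
            if abs(i - j) >= obs:
                pval += multinomial.pmf([i, n - i - j, j], n=n, p=p)
    return n_plus - n_minus, pval
```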

It is worth noting that all of the above estimation and testing procedures can be used for big data, as well as for finite samples that might not be classified as big data.
