**3. Research Methods**

This study proceeded in four steps. First, an entropy index of economic diversification was applied to determine the level of diversification of Macao's economy. Second, DEA combined with a bootstrapping model was used to calculate the efficiency of Macao's gambling industry. Third, the transition probability matrices for three scenarios were derived through interviews with local economic experts and senior executives of large-scale gambling companies. Last, a Markov chain was employed to forecast the appropriate scale of Macao's gambling industry. This section introduces the measurement methods, data sources and chosen indicators.

#### *3.1. Entropy Index of Economic Diversification*

Measuring the diversification of Macao's economy amounts to measuring the diversification of its industrial structure. Industrial structure refers to the composition of industries in an economy and the proportion of each industry's added value. Industrial added value can generally be calculated at the producer's price (including product tax) or at the basic price (excluding product tax). The franchise tax (part of product tax) of Macao's gambling industry should be considered part of total industrial added value, because it is deducted from total industrial revenue or turnover, is generated by the industry's own economic activities rather than levied by additional means, and is regarded as part of output. Moreover, the gambling industry plays a leading role in Macao's economy and makes large tax contributions, so counting the gambling tax as industrial output reflects the proportion of the gambling industry in Macao's overall economic structure more accurately. From the viewpoint of appropriate diversification of Macao's economic development, GDP calculated at the producer's price has a higher reference value in input-output analysis than GDP calculated at the basic price. Thus, this study used total industrial added value (at the producer's price) in chained prices to calculate the entropy index of economic diversification.

The Entropy Index of Diversification (EID) [28] is one of the indicators used in the academic literature to measure economic diversification. It is calculated as

$$EID = \sum_{i=1}^{N} S_i \ln\left(\frac{1}{S_i}\right) \tag{1}$$

In Equation (1), $N$ is the number of industries, $S_i$ is the share of the $i$th industry in total added value, and $\ln$ is the natural logarithm. If all added value is concentrated in one industry, the entropy index is 0; if added value is spread evenly across all $N$ industries, the index reaches its maximum of $\ln(N)$, which grows with the number of industries. Thus, a larger entropy index indicates a higher level of economic diversification, while a smaller index reflects a relatively high level of economic concentration.
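As an illustration of Equation (1), the following minimal Python sketch computes the entropy index from a list of industry added values. The function name `entropy_index` and the added-value figures are hypothetical and are not taken from the data used in this study.

```python
import math

def entropy_index(added_values):
    """Entropy index of diversification: sum of S_i * ln(1 / S_i)."""
    total = sum(added_values)
    # Industries with zero added value contribute nothing (0 * ln(1/0) is taken as 0).
    shares = [v / total for v in added_values if v > 0]
    return sum(s * math.log(1.0 / s) for s in shares)

# Hypothetical added values for four industries (illustrative only).
concentrated = entropy_index([100.0, 0.0, 0.0, 0.0])  # one industry holds everything
even = entropy_index([25.0, 25.0, 25.0, 25.0])        # equal shares
skewed = entropy_index([70.0, 10.0, 10.0, 10.0])      # one dominant industry
```

In this sketch the fully concentrated case gives 0, the even case gives ln(4), and the skewed case falls between the two, which matches the interpretation of the index given above.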

#### *3.2. Data Envelopment Analysis (DEA)*

This study applied Data Envelopment Analysis (DEA) to measure the efficiency of Macao's gambling industry. DEA is a nonparametric frontier method that measures the production efficiency of Decision-Making Units (DMUs). The basic model evaluates each DMU's technical efficiency (TE) under constant returns to scale (CRS). Suppose there are $K$ DMUs to be evaluated, each with $L$ input indicators and $M$ output indicators. Then, the DEA model for the $n$th DMU ($n = 1, 2, \dots, K$) is

$$\min \left[ \theta - \varepsilon \left( e_1^T \mathbf{s}^- + e_2^T \mathbf{s}^+ \right) \right];$$

$$\text{s.t.} \quad \sum_{i=1}^{K} x_{il} \lambda_i + \mathbf{s}^- = \theta x_l^n, \quad l = 1, 2, \dots, L; \quad n = 1, 2, \dots, K;$$

$$\sum_{i=1}^{K} y_{im} \lambda_i - \mathbf{s}^+ = y_m^n, \quad m = 1, 2, \dots, M; \quad \lambda_i \ge 0, \quad i = 1, 2, \dots, K. \tag{2}$$

In Equation (2), $\theta$ is the TE value, $0 \le \theta \le 1$; $\varepsilon$ is a non-Archimedean infinitesimal; $\mathbf{s}^- \ge 0$ and $\mathbf{s}^+ \ge 0$ are slack variables; $e_1^T$ is an $L$-dimensional unit vector and $e_2^T$ is an $M$-dimensional unit vector, matching the dimensions of $\mathbf{s}^-$ and $\mathbf{s}^+$; $\lambda_i \ge 0$ is the weight variable; $x_{il}$ is the $l$th ($l = 1, 2, \dots, L$) resource input of the $i$th DMU; and $y_{im}$ is the $m$th ($m = 1, 2, \dots, M$) output of the $i$th DMU. When production technology is considered under variable returns to scale (VRS), the constraint $\sum_{i=1}^{K} \lambda_i = 1$ is added to Equation (2) to obtain the DEA-BCC model (proposed by Banker, Charnes and Cooper) [29].
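To make the meaning of the TE value concrete, the sketch below covers only the special case of one input and one output under CRS. In that case the linear program in Equation (2) reduces to comparing each DMU's output/input ratio with the best observed ratio, so no solver is needed. The function name `ccr_efficiency_single` and the data are hypothetical; the general multi-input, multi-output model requires a linear programming solver and is not shown here.

```python
def ccr_efficiency_single(inputs, outputs):
    """Input-oriented CRS efficiency for DMUs with one input and one output.

    Special case only: theta_n = (y_n / x_n) / max_i (y_i / x_i).
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)  # productivity of the frontier DMU(s)
    return [r / best for r in ratios]

# Hypothetical data for four DMUs (illustrative only).
x = [10.0, 20.0, 30.0, 40.0]  # single input per DMU
y = [20.0, 30.0, 60.0, 40.0]  # single output per DMU
theta = ccr_efficiency_single(x, y)
```

DMUs with the best ratio obtain a score of 1 and lie on the frontier; every other DMU obtains a score below 1, which can be read as the proportion to which its input could be reduced while keeping its output.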

In summary, the DEA method suits settings with multiple inputs and multiple outputs and does not require a functional form relating input variables to output variables. However, Simar and Wilson [46] showed that the estimator $\hat{\theta}_{CCR}$ from the CCR model (proposed by Charnes, Cooper and Rhodes) [47] and the estimator $\hat{\theta}_{BCC}$ from the BCC model are both consistent under constant returns to scale (CRS), whereas under variable returns to scale (VRS) only $\hat{\theta}_{BCC}$ is consistent; using $\hat{\theta}_{CCR}$ in that case leads to calculation errors and is unsuitable for applied research. Studies by Zhang [48] and Yuan et al. [49] found that Macao's gambling industry does not exhibit constant returns to scale; thus, this study applied only the DEA-BCC method and the bootstrap method. The following text describes the bootstrapping-DEA model used to assess the efficiency of Macao's gambling industry and compares it with the traditional DEA-BCC method to reveal its advantages and disadvantages.

The basic concept of the bootstrap method is to obtain a known sample $\theta_0 = (\theta_1, \theta_2, \dots, \theta_K)$ by random sampling from a population with an unknown probability distribution $f$, and then to use a sample statistic $\hat{\phi} = \phi(\theta_0)$ computed from $\theta_0$ to estimate the population parameter $\phi = \phi(f)$. If the probability distribution of $\hat{\phi}$ is unknown, an empirical density function of the sample statistic, simulated by repeated bootstrap sampling, is needed to judge the error between the population parameter $\phi$ and the sample statistic $\hat{\phi}$. Based on this premise, the steps of bootstrapping-DEA are as follows:

Step 1: Based on each DMU's input-output set $(X_i, Y_i)$, $i = 1, 2, \dots, K$, compute the initial efficiency score $\hat{\theta}_i$; together these form the set $\hat{\theta}_0 = (\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_K)$ of efficiency scores of all DMUs;

Step 2: Apply the smoothed bootstrap method to obtain a bootstrap sample $\theta_b^* = (\theta_{1b}^*, \theta_{2b}^*, \dots, \theta_{Kb}^*)$ by repeated sampling from the efficiency score sample $\hat{\theta}_0 = (\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_K)$, where $b$ indexes the bootstrap iteration;

Step 3: Based on the smoothed bootstrap efficiency set $\theta_b^* = (\theta_{1b}^*, \theta_{2b}^*, \dots, \theta_{Kb}^*)$ obtained in Step 2, and holding output constant, adjust the initial input variable $X_i$ to obtain $X_{ib}^* = (\hat{\theta}_i / \theta_{ib}^*) \times X_i$, $i = 1, 2, \dots, K$;

Step 4: Based on the adjusted input-output data $(X_{ib}^*, Y_i)$, $i = 1, 2, \dots, K$, use the DEA method to compute each DMU's efficiency score $\hat{\theta}_{ib}^*$ again;

Step 5: Repeat Steps 2 to 4 $B$ times to obtain a series of efficiency scores $\hat{\theta}_{ib}^*$, $b = 1, 2, \dots, B$;

Step 6: Compute the bias $\overline{\text{Bias}}(\hat{\theta}_i)$ of the initial efficiency score $\hat{\theta}_i$ of each DMU and the bias-corrected efficiency score $\tilde{\theta}_i$:

$$\overline{\text{Bias}}(\hat{\theta}_i) = B^{-1} \sum_{b=1}^{B} \hat{\theta}_{ib}^{*} - \hat{\theta}_i$$

$$\tilde{\theta}_i = \hat{\theta}_i - \overline{\text{Bias}}(\hat{\theta}_i) = 2\hat{\theta}_i - B^{-1} \sum_{b=1}^{B} \hat{\theta}_{ib}^{*}$$

Step 7: Calculate the confidence interval of the bias-corrected efficiency score $\tilde{\theta}_i$ at confidence level $\alpha$. From $\Pr(-\hat{b}_\alpha \le \hat{\theta}_{ib}^* - \hat{\theta}_i \le -\hat{a}_\alpha) = 1 - \alpha$, the confidence interval is $\hat{\theta}_i + \hat{a}_\alpha \le \tilde{\theta}_i \le \hat{\theta}_i + \hat{b}_\alpha$.
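The seven steps can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not part of the study: (i) the DEA solve is replaced by the one-input, one-output CRS ratio, not the DEA-BCC model used here; (ii) the smoothing in Step 2 is approximated by adding Gaussian noise with a hypothetical bandwidth `h` and reflecting values above 1; (iii) Step 4 is interpreted as re-evaluating the original DMUs against the frontier formed by the adjusted inputs; and (iv) all function names, parameters and data are hypothetical.

```python
import random

def dea_scores(x, y, ref_x=None):
    """Efficiency of each (x_i, y_i) against the frontier spanned by (ref_x, y).

    Stand-in for the DEA solve: valid only for one input, one output and CRS.
    """
    ref_x = x if ref_x is None else ref_x
    best = max(yj / xj for xj, yj in zip(ref_x, y))
    return [(yi / xi) / best for xi, yi in zip(x, y)]

def bootstrap_dea(x, y, B=500, h=0.05, alpha=0.05, seed=0):
    rng = random.Random(seed)
    theta_hat = dea_scores(x, y)                                 # Step 1
    draws = [[] for _ in x]
    for _ in range(B):                                           # Step 5
        theta_star = []
        for _ in x:                                              # Step 2
            v = rng.choice(theta_hat) + h * rng.gauss(0.0, 1.0)  # resample plus noise
            v = 2.0 - v if v > 1.0 else v                        # reflect values above 1
            theta_star.append(max(v, 1e-6))                      # keep strictly positive
        x_star = [t / ts * xi                                    # Step 3
                  for t, ts, xi in zip(theta_hat, theta_star, x)]
        for i, s in enumerate(dea_scores(x, y, ref_x=x_star)):   # Step 4
            draws[i].append(s)
    out = []
    for t, d in zip(theta_hat, draws):
        bias = sum(d) / B - t                                    # Step 6
        diffs = sorted(v - t for v in d)
        q_lo = diffs[int(alpha / 2 * (B - 1))]                   # estimate of -b_alpha
        q_hi = diffs[int((1 - alpha / 2) * (B - 1))]             # estimate of -a_alpha
        out.append({"theta_hat": t, "bias": bias, "corrected": t - bias,
                    "ci": (t - q_hi, t - q_lo)})                 # Step 7
    return out

# Hypothetical data for four DMUs (illustrative only).
x = [10.0, 20.0, 30.0, 40.0]
y = [20.0, 30.0, 60.0, 40.0]
results = bootstrap_dea(x, y)
```

Each entry of `results` holds the initial score, the estimated bias, the bias-corrected score and a percentile interval. In this simplified setting the estimated bias is non-negative, so the corrected score never exceeds the initial score.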

The bootstrapping-DEA model can mitigate issues such as small sample size, sample sensitivity and outliers, correct the bias of the efficiency score, and make up for the shortcomings of the traditional DEA method. Because Macao's economy is small and its industrial system is incomplete, relevant data are difficult to obtain. By repeated sampling, the bootstrapping-DEA model can address the problem of insufficient samples, so that actual efficiency and future development tendencies can be better analyzed.

#### *3.3. Markov Chain Forecast*

The Markov chain, a widely used stochastic process model, supports quantitative analysis of how a system's state changes over time in a probabilistic manner, which allows it to account for the influence of earlier events on later ones. The Markov chain forecast method predicts the development of dynamic system data from the state transition probabilities and suits forecasting problems with large random fluctuations, but it requires the forecast object to have the Markov property and to behave as a process with a stable mean. A Markov chain forecast requires a state transition probability matrix, which can be estimated by market survey, expert interview or regression model. Let $x_1, x_2, \dots, x_n$ be the sequence of index values of the Markov chain and $E = \{1, 2, \dots, m\}$ be its state space.

First, the distribution vector in the initial state is computed to form the state vector; sequential values of the indicator are then calculated through transition matrix steps. Let $f_{ij}$ denote the number of times the sequence transfers from state $i$ to state $j$ in one step, $i, j \in E$. In practical applications, only the one-step transition probability matrix is considered. The matrix $(f_{ij})_{i,j \in E}$ is called the state transition frequency matrix. The transition probability $P_{ij}$ ($i, j \in E$) is obtained by dividing each matrix element by the sum of the row it is in:

$$P_{ij} = \frac{f_{ij}}{\sum_{j=1}^{m} f_{ij}}.\tag{3}$$
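A minimal Python sketch of Equation (3) and of a one-step forecast is given below. The transition counts are hypothetical and are not the matrices obtained from the interviews in this study; the function names `transition_matrix` and `step` are likewise illustrative. The sketch assumes every state has at least one observed outgoing transition, so that no row sum is zero.

```python
def transition_matrix(counts):
    """Row-normalise a state transition frequency matrix (Equation (3))."""
    return [[f / sum(row) for f in row] for row in counts]

def step(state, P):
    """One-step forecast: next_j = sum_i state_i * P_ij."""
    m = len(P)
    return [sum(state[i] * P[i][j] for i in range(m)) for j in range(m)]

# Hypothetical transition counts among three states (illustrative only).
f = [[6, 3, 1],
     [2, 6, 2],
     [1, 3, 6]]
P = transition_matrix(f)

# Start in state 1 with certainty and forecast the distribution one step ahead.
forecast = step([1.0, 0.0, 0.0], P)
```

Each row of `P` sums to 1, and repeated calls to `step` give forecasts further ahead under the assumption that the transition probabilities stay constant.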

The Markov chain model is used for forecasting, while the $\chi^2$ statistic can be used to test whether the random variable sequence of the system has the Markov property. With $f_{ij}$ ($i, j \in E$) denoting the number of one-step transitions of the indicator sequence from state $i$ to state $j$, the system's marginal probability is

$$P_{\cdot j} = \frac{\sum_{i=1}^{m} f_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{m} f_{ij}}.\tag{4}$$

It is the sum of the elements in the $j$th column divided by the sum of all elements in the state transition frequency matrix. When $n$ is large enough, the statistic $\chi^2 = 2 \sum_{i=1}^{m} \sum_{j=1}^{m} f_{ij} \left| \log \frac{P_{ij}}{P_{\cdot j}} \right|$ follows a $\chi^2$ distribution with $(m - 1)^2$ degrees of freedom. If the value of the statistic is larger than the critical value $\chi^2_\alpha\left((m - 1)^2\right)$, the sequence has the Markov property; otherwise, the sequence cannot be forecast by the Markov chain.
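The test can be sketched in Python as follows, assuming the statistic takes the form $2 \sum_i \sum_j f_{ij} \left| \log (P_{ij} / P_{\cdot j}) \right|$ with cells where $f_{ij} = 0$ contributing nothing. The function name `markov_chi2` and the counts are hypothetical. The critical value 9.488 is the standard tabulated value of the $\chi^2$ distribution with 4 degrees of freedom at the 0.05 level and is hard-coded to keep the sketch free of external libraries.

```python
import math

def markov_chi2(counts):
    """Return (statistic, degrees of freedom) for the Markov property test."""
    m = len(counts)
    total = sum(sum(row) for row in counts)
    # Marginal probabilities: column sums over the grand total (Equation (4)).
    marginal = [sum(counts[i][j] for i in range(m)) / total for j in range(m)]
    stat = 0.0
    for i in range(m):
        row_sum = sum(counts[i])
        for j in range(m):
            f = counts[i][j]
            if f > 0:  # empty cells contribute zero
                p_ij = f / row_sum  # transition probability (Equation (3))
                stat += f * abs(math.log(p_ij / marginal[j]))
    return 2.0 * stat, (m - 1) ** 2

# Hypothetical transition counts among three states (illustrative only).
f = [[6, 3, 1],
     [2, 6, 2],
     [1, 3, 6]]
stat, dof = markov_chi2(f)

CRITICAL_005_DF4 = 9.488  # chi-square critical value, alpha = 0.05, 4 degrees of freedom
has_markov_property = stat > CRITICAL_005_DF4
```

For a different number of states or significance level, the critical value would need to be taken from a $\chi^2$ table or a statistics library.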

Forecasting the development tendency of Macao's gambling industry by applying the Markov chain and examining the features of Macao's industrial structure has strong practical significance. Because the Markov chain forecasting method is well suited to short-term research and is relatively sensitive to short-term change, it was chosen to forecast changes in Macao's industrial diversification.
