**1. Introduction**

Since its appearance in 1928, the Cobb-Douglas function has been a crucial tool in economic research. This functional form became popular because it is easy to use and adapts well empirically to different data sets. Solow (1957) and his followers used the Cobb-Douglas function in their growth theories. However, the function is criticized for its rigid premises. One of them is the unit ES, which, according to many empirical results, does not agree with the facts. Moreover, the unit ES masks the role of the ES in economic growth processes. Several theoretical and empirical studies have explored this limitation. For example, Antrás (2004) stated that a unit ES is not appropriate for the US economy, Werf (2007) argued that the Cobb-Douglas function is not suitable for modeling climate change policies, and Young (2013) revealed that the ES of the aggregate production function and of the production functions of most U.S. industries could not be equal to one and had estimates less than 0.62. To relax this premise, the CES function, with an ES other than one, was introduced in 1961 (Arrow et al. 1961). Since then, an increasing number of studies around the world have used the CES function for economic analysis, while the number of works evaluating elasticities using the Cobb-Douglas function has decreased substantially. Specifically, Heubes (1972) theoretically argued that both the time path and the level of the output growth rate depend on the ES value. Among empirical studies, Ferguson (1965), La Grandville (1989), Klump and Grandville (2000), Pitchford (1960), Azariadis (1993), and Galor (1995) focused on the effects of the ES on economic growth. In Vietnam, to the knowledge of the author, the Cobb-Douglas function and its different modifications are commonly used, and at present, no empirical research on the CES function has been carried out.
Besides, most previous research on production functions relied mainly on traditional quantitative methods, such as the accounting method or the frequentist approach, which has been much criticized by modern statisticians for giving unreliable results in many cases (Briggs and Nguyen 2019; Anh et al. 2018; Kreinovich et al. 2019).

For the above reasons, the author conducted this study to estimate the ES by specifying an aggregate CES function and applying a non-frequentist method, namely Bayesian nonlinear mixed-effects regression.

The remainder of the paper is structured as follows. Section 2 introduces the theoretical framework of the ES and its relationship with economic growth. Section 3 provides the theoretical analysis of the ES in the CES. Empirical studies on the ES in the CES and its association with economic growth are reviewed in Section 4. Section 5 discusses the data and estimation method. Bayesian simulation results are provided in Section 6. Section 7 includes the conclusion.

#### **2. Theoretical Background of the ES**

#### *2.1. The ES*

Production functions are an important instrument of economic analysis in the neoclassical tradition. They are often used to analyze the economic performance of an economy, as well as that of enterprises, industries, and industrial complexes. Homogeneity and returns to scale characterize a neoclassical production function when all inputs change uniformly. When the inputs change at different rates, however, how does the function change? In this case, the nature of the production function varies depending on the ES. In general, the ES plays a significant role in the economic growth process.

The marginal rate of technical substitution between two inputs (*MRSij*) illustrates the rate at which one input must decrease to hold a production level unchanged when another input increases:

$$MRS\_{ij} = -\left(\frac{dx\_j}{dx\_i}\right) = \frac{f\_i}{f\_j}$$

where *xi*, *xj* are the first and second inputs, respectively, and *fi*, *fj* are their marginal products.

The limitation of this coefficient is that it is dependent on the measurement unit of resources. Therefore, the usage of the ES instead is more appropriate:

$$
\sigma\_{ij} = \frac{\partial \left( x\_j / x\_i \right)}{\partial MRS\_{ij}} \times \frac{MRS\_{ij}}{x\_j / x\_i},
$$

where σ*ij* is the ES of input *xi* for input *xj*.
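The definition can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: it approximates the MRS by finite differences and the ES as the ratio of log changes. It assumes a homothetic two-input function, so that the MRS depends only on the input ratio; the helper names and the parameter value 0.3 are illustrative assumptions.

```python
import math

def mrs(f, x_i, x_j, h=1e-6):
    """MRS_ij = f_i / f_j, with marginal products from central differences."""
    f_i = (f(x_i + h, x_j) - f(x_i - h, x_j)) / (2 * h)
    f_j = (f(x_i, x_j + h) - f(x_i, x_j - h)) / (2 * h)
    return f_i / f_j

def elasticity_of_substitution(f, x_i, x_j, step=1e-4):
    """sigma_ij = d ln(x_j / x_i) / d ln(MRS_ij).

    Assumes f is homothetic, so changing x_j alone traces out the
    relation between the input ratio and the MRS.
    """
    up, down = x_j * (1 + step), x_j * (1 - step)
    d_log_ratio = math.log(up / x_i) - math.log(down / x_i)
    d_log_mrs = math.log(mrs(f, x_i, up)) - math.log(mrs(f, x_i, down))
    return d_log_ratio / d_log_mrs

def cobb_douglas(x_i, x_j, alpha=0.3):
    """Illustrative Cobb-Douglas function; alpha = 0.3 is an arbitrary choice."""
    return x_i ** alpha * x_j ** (1 - alpha)

sigma_cd = elasticity_of_substitution(cobb_douglas, 2.0, 5.0)
print(sigma_cd)  # approximately 1 for a Cobb-Douglas function
```

Working in logs makes the result independent of the units in which the inputs are measured, which is the advantage of the ES over the MRS noted above.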

The ES denotes how the ratio of inputs changes if the marginal rate of technical substitution between them varies by one percent. Hicks (1932) first proposed this definition for the case of two inputs. In the case of n inputs, the method of calculating the ES is inconsistent. In a later work of Hicks and Allen (1934), a generalized ES was suggested. Accordingly, the formula for the two-input case is applied to any two inputs in a multivariate function with the assumption that other inputs remain unchanged. This is the Hicks Elasticity of Substitution (HES). However, the restriction of the HES is that because the optimal quantity of all inputs is simultaneously decided by enterprises, the ratio between any two inputs is affected not only by relative prices but also by the prices of other inputs. The optimization behavior of enterprises requires:

$$MRS\_{ij} = \frac{f\_i}{f\_j} = \frac{p\_j}{p\_i},$$

then

$$
\sigma\_{ij} = \frac{\partial \left( x\_j / x\_i \right)}{\partial \left( p\_j / p\_i \right)} \times \frac{p\_j / p\_i}{x\_j / x\_i},
$$

where *pj*, *pi* are the price of *xj*, *xi*, respectively.

Under the optimization condition, the ES indicates how the input ratio varies if the price ratio changes by one percent. Let us consider a function with three inputs, *f*(*x*1, *x*2, *x*3). In this setting, *MRS*12 = *p*2/*p*1. The HES between *x*1 and *x*2 shows how the ratio between them changes if *MRS*12 = *p*2/*p*1 changes by one percent, assuming a fixed amount of *x*3. However, a change in *p*2/*p*1 may make the amount of *x*3 vary because of variations in the ratios *p*2/*p*3 and *p*1/*p*3. Thus, the assumption of a fixed quantity of the third input is not always correct. The use of the HES is correct only for the Cobb-Douglas and the CES functions, because there a change in the third input does not affect the ratio between the first two inputs. For more general functions, the HES may yield biased results.

Hicks and Allen proposed a Partial Elasticity of Substitution to measure the ES. Later, this coefficient was studied in detail by Allen and Uzawa, so it was called the Allen-Uzawa Elasticity of Substitution (AUES). AUES is calculated by the following formula:

$$\sigma\_{ij} = \frac{x\_1 \times f\_1 + \dots + x\_n \times f\_n}{x\_i \times x\_j} \times \frac{F\_{ij}}{F},$$

where *F* is the determinant of the bordered Hessian,

$$
F = \det \begin{bmatrix}
0 & f\_1 & \cdots & f\_n \\
f\_1 & f\_{11} & \cdots & f\_{1n} \\
\vdots & \vdots & \ddots & \vdots \\
f\_n & f\_{n1} & \cdots & f\_{nn}
\end{bmatrix}, \quad f\_{ij} = \frac{\partial^2 f}{\partial x\_i \, \partial x\_j},
$$

and *Fij* denotes the cofactor of element *fij* in *F*.

In the two-input case, AUES is reduced to the HES. Nevertheless, Blackorby and Russell (1981) claim that deduction from the ES between two inputs to the ES between multiple inputs is not correct. They proved the non-informativeness of AUES in several cases. So, the Morishima Elasticity of Substitution (MES) was proposed instead:

$$M\_{ij}(y, p) = \frac{p\_i \times C\_{ij}(y, p)}{C\_j(y, p)} - \frac{p\_i \times C\_{ii}(y, p)}{C\_i(y, p)},$$

where *C*(*y*, *p*) is the minimum cost function, with partial derivatives

$$C\_i(y, p) = \frac{\partial C(y, p)}{\partial p\_i}, \quad C\_{ij}(y, p) = \frac{\partial^2 C(y, p)}{\partial p\_i \, \partial p\_j}.$$

McFadden (1963) created a new development in the elasticity theory showing the possibility of the ES to have different values for various input pairs. According to this author, it is not possible to construct a neoclassical production function with an arbitrary set of the ES when the number of inputs is more than two. That is, if we propose different ES for various input groups, it is necessary to use a different type of production function that may not be fixed at different input quantities and at various prices.

In this study, the author uses the ES between two inputs, capital and labor. In this case, the ES measures the ease of substitution between capital and labor, or their similarity from a technological point of view. When the ES is large, the inputs are similar to each other, so when one input increases, the technology allows it to be easily substituted for the input that remains constant. When the ES is small, the technology treats the inputs as dissimilar, so it is difficult to substitute one input for the other. In other words, as expressed by Nelson (1965), the ES can be regarded as an index of the rate at which diminishing marginal returns set in as one input increases relative to the other. If the ES is large, it is easy to substitute one input for the other or to increase output by increasing one input. Hence, diminishing marginal returns set in slowly or not at all. From this, we can conclude that the ES affects economic growth as long as inputs grow at different rates, so that their proportions change.

#### *2.2. Impact of the ES on Economic Growth*

In order to show the positive effect of the ES on economic growth, let us use a two-factor, linearly homogeneous production function with Hicks-neutral technical change (*A*):

$$y = A(t) \times F(K, L) \tag{1}$$

Differentiating (1), we get:

$$\frac{dy}{dt} = \frac{\partial A}{\partial t} \times F(K, L) + A \times \frac{\partial F}{\partial K} \times \frac{\partial K}{\partial t} + A \times \frac{\partial F}{\partial L} \times \frac{\partial L}{\partial t} \tag{2}$$

As is known, 1 − α = (∂*y*/∂*K*)(*K*/*y*) and α = (∂*y*/∂*L*)(*L*/*y*). Hence, the output growth rate is the following:

$$\frac{\Delta y}{y} = \frac{\Delta A}{A} + (1 - \alpha)\frac{\Delta K}{K} + \alpha \times \frac{\Delta L}{L} \tag{3}$$

Writing *gy*, *gA*, *gk*, and *gl* for the growth rates of output, technology, capital, and labor, we have:

$$g\_y = g\_A + g\_k + \alpha(g\_l - g\_k) \tag{4}$$
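The decomposition in Equations (3) and (4) can be illustrated with a short Python sketch. The growth rates and the labor elasticity below are hypothetical values chosen only for illustration, and the function names are not from the original study.

```python
def output_growth_shares(g_a, g_k, g_l, alpha):
    """Eq. (3): g_y = g_A + (1 - alpha) * g_k + alpha * g_l."""
    return g_a + (1 - alpha) * g_k + alpha * g_l

def output_growth(g_a, g_k, g_l, alpha):
    """Eq. (4): the same growth rate, written as g_A + g_k + alpha * (g_l - g_k)."""
    return g_a + g_k + alpha * (g_l - g_k)

# Hypothetical annual growth rates: technology 1%, capital 5%, labor 2%,
# with a labor elasticity of output alpha = 0.6.
g_y = output_growth(g_a=0.01, g_k=0.05, g_l=0.02, alpha=0.6)
print(g_y)
```

Both functions return the same value for any inputs, since (4) is only a rearrangement of (3).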

The elasticity of production with respect to labor is written as a function of the ES:

$$\alpha = (1 - \alpha) \frac{w/r}{K/L}, \quad w = \frac{\partial y}{\partial L}, \quad r = \frac{\partial y}{\partial K} \tag{5}$$

or, taking logs and differentiating with respect to time:

$$\frac{d\ln\alpha}{dt} = \frac{d\ln(1-\alpha)}{dt} + \frac{d\ln(w/r)}{d\ln(\text{K}/\text{L})} \times \frac{d\ln(\text{K}/\text{L})}{dt} - \frac{d\ln(\text{K}/\text{L})}{dt} \tag{6}$$

It is known:

$$\frac{d\ln(w/r)}{d\ln(\text{K}/\text{L})} = \frac{1}{\sigma} \tag{7}$$

Therefore

$$\frac{d\ln\alpha}{dt} = \frac{d\ln(1-\alpha)}{dt} + \frac{d\ln(\text{K}/\text{L})}{dt} \left(\frac{1-\sigma}{\sigma}\right) \tag{8}$$

and

$$\frac{\Delta\alpha}{\alpha} = -\frac{1}{1-\alpha} \times \Delta\alpha + \frac{1-\sigma}{\sigma} \left(\frac{\Delta K}{K} - \frac{\Delta L}{L}\right) \tag{9}$$

So, we get:

$$
\Delta \alpha = \alpha (1 - \alpha) \frac{\sigma - 1}{\sigma} (g\_l - g\_k) \tag{10}
$$

Assuming constant growth rates of technical progress and of the inputs, the output growth rate (*gy*) may vary only because of changes in α. Combining (4) with (10), we obtain:

$$\frac{dg\_y}{dt} = \alpha (1 - \alpha) \frac{\sigma - 1}{\sigma} (g\_l - g\_k)^2 \tag{11}$$

In case *gl* ≠ *gk*, the sign of (11) is positive if σ > 1 and negative if σ < 1. Thus, the magnitude of the ES effect depends on the difference between the growth rates of capital and labor. In case *gl* ≈ *gk*, the variation of *gy* over time is small, that is, the impact of the ES on the economic growth rate is weak.
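The sign pattern of Equations (10) and (11) can be verified with a few lines of Python. This is an illustrative sketch; the values of α and of the growth rates are hypothetical.

```python
def delta_alpha(alpha, sigma, g_k, g_l):
    """Eq. (10): change in the labor elasticity of output."""
    return alpha * (1 - alpha) * (sigma - 1) / sigma * (g_l - g_k)

def growth_rate_drift(alpha, sigma, g_k, g_l):
    """Eq. (11): time derivative of the output growth rate."""
    return alpha * (1 - alpha) * (sigma - 1) / sigma * (g_l - g_k) ** 2

# Hypothetical case: capital grows faster than labor (g_k = 5%, g_l = 2%).
for sigma in (0.5, 1.0, 2.0):
    drift = growth_rate_drift(alpha=0.6, sigma=sigma, g_k=0.05, g_l=0.02)
    print(f"sigma = {sigma}: d g_y / dt = {drift:+.6f}")
```

Because the growth-rate gap enters (11) squared, the sign of the drift depends only on whether σ is above or below one, while its size shrinks quickly as *gl* approaches *gk*.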

In addition, Heubes (1972) stated that not only the time path but also the level of the output growth rate is a function of the ES. Differentiating (4) with respect to time and σ, we get, for small *dt* and *d*σ:

$$dg\_y = \left(\frac{\partial \alpha}{\partial t}dt + \frac{\partial \alpha}{\partial \sigma}d\sigma\right)(g\_l - g\_k) = \left\{\alpha(1-\alpha)\frac{\sigma - 1}{\sigma}(g\_l - g\_k)dt - \frac{c\alpha}{\sigma^2}\frac{\ln(K/L)}{c + (K/L)^{\frac{1-\sigma}{\sigma}}}\,d\sigma\right\}(g\_l - g\_k) \tag{12}$$

In case *gl* > *gk* and *K*/*L* < 1 (or *gl* < *gk* and *K*/*L* > 1), a higher output growth rate is associated with a greater ES, that is, ∂*gy*/∂σ > 0. If the ES is low, the relatively scarce input has a strong impact on output because its elasticity of production is large. With a growing σ, the elasticity of production diminishes for the scarce input but increases for the relatively abundant factor. The impact of a change in the ES on the output growth rate becomes small at high levels of the ES. The growth rate is independent of the ES when *K*/*L* = 1.

#### **3. ES in the CES Function**

Before analyzing the ES in the CES, we consider the Cobb-Douglas function. The work of Cobb and Douglas (1928) is a turning point in the field of production functions. Although there had been some earlier studies on production functions (Schumpeter 1954; Stigler 1952; Barkai 1959; Lloyd 1969; Velupillai 1973; Samuelson 1979; Humphrey 1997), Cobb and Douglas (1928) were the first to formulate the relationship between inputs and output mathematically and to assess it empirically. During a vacation at Amherst, Paul Douglas asked mathematics professor Charles Cobb to suggest an equation describing the relationship among capital, labor, and output based on time series data on the U.S. manufacturing sector for the period 1889–1922. The result was a joint paper in which the authors concluded that their model fits the data well. The initial Cobb-Douglas function has the following form:

$$y = A \times x\_1^{\alpha} \times x\_2^{1-\alpha} \tag{13}$$

where *x*1 is capital (*K*), *x*2 is labor (*L*); *A*, α are parameters.

However, in later works, Douglas removed the assumption that the sum of the elasticities of output with respect to capital and labor equals one, and used the functional form (14):

$$y = A \times K^{a\_1} \times L^{a\_2} \tag{14}$$

where *A* denotes technical change; *a*1, *a*2 are the exponents, which are also the elasticities of output with respect to capital and labor, respectively.

The Cobb-Douglas function has several properties. First, it belongs to the neoclassical class when 0 < *a*1 < 1 and 0 < *a*2 < 1, and therefore reflects the law of positive and diminishing marginal productivity. Second, its degree of homogeneity is *a*1 + *a*2. If *a*1 + *a*2 = 1, we get a linearly homogeneous function with constant returns to scale. If *a*1 + *a*2 > 1, the function describes a growing economic system in which output grows faster than the inputs, so returns to scale (ε) increase. If *a*1 + *a*2 < 1, returns to scale decrease. Returns to scale thus coincide with the degree of homogeneity of the production function and equal *a*1 + *a*2:

$$
\varepsilon = \frac{dy/y}{dx/x} = a\_1 + a\_2 \tag{15}
$$

where *dy*/*y* = *a*1 × *dx*1/*x*1 + *a*2 × *dx*2/*x*2 and *dx*1/*x*1 = *dx*2/*x*2 = *dx*/*x*.
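As a quick check of Equation (15), the Python sketch below scales both inputs by the same factor and measures the resulting log change in output. The exponents 0.4 and 0.7 and the input levels are hypothetical, chosen so that *a*1 + *a*2 > 1.

```python
import math

def cobb_douglas(k, l, a=1.0, a1=0.4, a2=0.7):
    """Eq. (14) with illustrative parameter values."""
    return a * k ** a1 * l ** a2

def returns_to_scale(f, k, l, scale=1.01):
    """epsilon = d ln y / d ln x when all inputs grow by the same factor."""
    return math.log(f(scale * k, scale * l) / f(k, l)) / math.log(scale)

eps = returns_to_scale(cobb_douglas, k=10.0, l=20.0)
print(eps)  # equals a1 + a2 up to floating-point error
```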

As we know, in the Cobb-Douglas function, the ES equals one.

Although the Cobb-Douglas function is a powerful mathematical tool for describing production processes, as mentioned above, its premises are extremely rigid. This motivated the CES function, which was established by Arrow et al. (1961), or ACMS for short. The authors dedicated their analysis to the ES. The production functions of that time assumed that the ES takes a fixed value, such as zero for the Leontief function and one for the Cobb-Douglas function, which, in their view, is too rigid. Moreover, the CES is more appropriate for assessing the impact of economic policies on factor income (Miller 2008), and the Cobb-Douglas function hides the role of the ES in economic growth and technical progress (Pereira 2003).

To examine the goodness of fit of the Leontief and Cobb-Douglas functions, ACMS performed an econometric analysis of the behavior of the ratio of labor income to nominal output. As long as output and input prices remain unchanged, this proportion is fixed and determined only by the parameters of the function. The rejection of the Cobb-Douglas (and Leontief) functions is based on the arguments below.

The invariance of labor share in the Cobb-Douglas is expressed as follows:

$$\frac{p\_2 \times L}{y} = a\_2 \tag{16}$$

Equation (16) is rewritten in logs:

$$
\ln \frac{y}{L} = a + \ln(p\_2) \tag{17}
$$

where *a* = ln(1/*a*2).

For the Leontief function, the ratio between inputs is determined by the production process and is not influenced by prices, i.e.:

$$\frac{L}{y} = \gamma \tag{18}$$

Equation (18) takes the following form in logs:

$$\ln\left(\frac{y}{L}\right) = a \tag{19}$$

where *a* = ln(1/γ).

Hence, we need to analyze the following function:

$$
\ln\left(\frac{y}{L}\right) = c + b \times \ln(p\_2) + \varepsilon \tag{20}
$$

where ε is a random error.

It is necessary to test the hypotheses *b* = 0, *b* = 1. Investigating a data sample of 24 industries of 19 countries, ACMS came to the conclusion that, in most cases, the hypotheses *b* = 0, *b* = 1 are rejected.
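The test of *b* = 0 and *b* = 1 in Equation (20) amounts to a simple regression followed by two t statistics. The Python sketch below shows the mechanics on synthetic data generated with a known slope of 0.6; the data, the seed, and the function name are assumptions made for illustration and are unrelated to the ACMS sample.

```python
import math
import random

def ols_slope_test(x, y):
    """Simple OLS of y on x; returns intercept, slope, and the slope's standard error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope

# Synthetic data only: ln(y/L) = 0.3 + 0.6 * ln(p2) + noise.
rng = random.Random(42)
ln_p2 = [rng.uniform(0.0, 2.0) for _ in range(200)]
ln_y_per_l = [0.3 + 0.6 * w + rng.gauss(0.0, 0.05) for w in ln_p2]

c_hat, b_hat, se_b = ols_slope_test(ln_p2, ln_y_per_l)
t_b0 = (b_hat - 0.0) / se_b  # H0: b = 0, the Leontief case of Eq. (19)
t_b1 = (b_hat - 1.0) / se_b  # H0: b = 1, the Cobb-Douglas case of Eq. (17)
print(b_hat, t_b0, t_b1)
```

Large absolute values of both t statistics would reject the Leontief and the Cobb-Douglas restrictions at the same time, which is the situation ACMS report for most industries.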

The above finding encouraged the researchers to construct a new type of production function with a more flexible labor share, which is expressed in the following:

$$
\ln\left(\frac{y}{L}\right) = c + b \times \ln(p\_2) \tag{21}
$$

where parameter *b* can have any value, but not zero or one.

From (21), with no restrictions imposed on *b*, we can get a CES function. After some transformations, the final form of the CES is the following:

$$F(K, L) = \gamma\left(\delta \times K^{-\theta} + (1 - \delta)L^{-\theta}\right)^{-\frac{1}{\theta}} \tag{22}$$

where θ = (1 − *b*)/*b* is the substitution parameter, δ = *a*1 × γ<sup>θ</sup> is the distribution parameter, γ is the efficiency parameter, *a*1 + *a*2 = γ<sup>−θ</sup>, and the ES is σ = 1/(1 + θ).

For the CES function (22) to be neoclassical, the conditions 0 < δ < 1, γ > 0, and θ > −1 must hold. The premise of Hicks-neutral technical progress in the CES implies that the output produced by combining capital with labor is assumed to grow exponentially in a way that does not alter the marginal rate of technical substitution between the inputs. Therefore, the parameters of the production function will be stable over time.

In case σ > 1, i.e., −1 < θ < 0, capital and labor are substitutable, so rising *K*/*L* leads to an increase in capital share.

If σ < 1, i.e., 0 < θ < ∞, capital and labor are complementary, and thus, when *K*/*L* increases, labor share rises.

In case σ = 1 (θ = 0), then the Cobb-Douglas is obtained.
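The three cases can be reproduced with a small Python sketch of Equation (22). The parameter values are hypothetical, and the capital share formula assumes competitive factor pricing (each input paid its marginal product), which is an assumption of this illustration.

```python
import math

def ces(k, l, gamma=1.0, delta=0.4, theta=0.5):
    """Eq. (22); theta = 0 is handled by its Cobb-Douglas limit."""
    if abs(theta) < 1e-12:
        return gamma * k ** delta * l ** (1 - delta)
    return gamma * (delta * k ** -theta + (1 - delta) * l ** -theta) ** (-1 / theta)

def es_from_theta(theta):
    """sigma = 1 / (1 + theta), defined for theta > -1."""
    return 1 / (1 + theta)

def capital_share(k, l, delta, theta):
    """Capital income share F_K * K / F implied by Eq. (22)."""
    capital_term = delta * k ** -theta
    return capital_term / (capital_term + (1 - delta) * l ** -theta)

for theta in (-0.5, 1.0):
    low = capital_share(2.0, 1.0, 0.4, theta)
    high = capital_share(4.0, 1.0, 0.4, theta)
    print(f"sigma = {es_from_theta(theta):.2f}: share at K/L=2 is {low:.3f}, at K/L=4 is {high:.3f}")
```

With θ = −0.5 (σ = 2) the capital share rises as *K*/*L* doubles, and with θ = 1 (σ = 0.5) it falls, so the labor share rises.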

#### **4. Empirical Research on the Elasticity of Factor Substitution and Its Association with Economic Growth**

#### *4.1. Estimation of the ES*

Solow (1957) was a pioneer: he and his followers used the Cobb-Douglas function, in which technical change is treated as neutral, and therefore changes in the ES were completely ignored (the ES is always equal to one). In their models, technical change is called total factor productivity (TFP). Nevertheless, in many empirical studies, the ES varies. For example, in a survey, Nerlove (1967) found that changes in period or concept may generate different values of the ES. Comparing ES estimates from six alternative functional forms, five different measures of the rental price of capital, and two estimation techniques, Berndt (1976) came to a similar conclusion. McFadden (1978) tested the constancy of the ES for the steam-electric generating industry and revealed that the ES takes a value of approximately 0.75. Hamermesh (1993) showed that the ES varies from 0.32 to 1.16 in the US and from 0.49 to 6.86 in the UK.

An examination of the U.S. processing industry over a 200-year period indicates that ES values tend to change. The evidence shows that the ES was close to zero in the 19th century (Asher 1972; Uselding 1972; Schmitz 1981), close to one in the mid-20th century (Zarembka 1970), and greater than one in the late 20th century (Blair and Kraft 1974; Hsing 1996). Duffy and Papageorgiou (2000) estimated the ES based on a CES function on a cross-section of 82 countries and found the ES to be greater than one for developed economies and lower than one for developing economies. These authors concluded that the ES level is related to a country's stage of development. Using a Variable Elasticity of Substitution (VES) function for 12 OECD countries (1965–1986), Genç and Bairam (1998) revealed that the average ES is greater than one. It is noteworthy that the diversity of results stems from differences in data sets and estimation techniques. The above analyses also revealed that the ES is stable within a sample period but rises with economic development.

#### *4.2. Impact of the ES on Economic Growth*

Theoretically, in early growth theory, some authors attempted to prove the significance of the ES. Solow (1957), Pitchford (1960), and Sato (1963) stated that allowing the ES to take any value will generate multiple growth paths, and some of them will be unbalanced. Recently, Azariadis (1993), using the overlapping generations model of growth, showed the possibilities of poverty traps depending on the values of the ES.

Ferguson (1965) showed that in the case of a non-unitary ES, the output growth rate depends on the ES, as well as on the growth rate of the savings ratio. La Grandville (1989), making use of the Slutsky equation, provided further evidence of the positive relationship between the ES and output: the larger the ES, the higher the production level that can be obtained. Barro and Sala-i-Martin (1995) found that under certain conditions, a large ES generates endogenous, steady-state growth. Later, Klump and Grandville (2000) proved that a greater ES makes endogenous growth more probable and leads to higher long-term growth rates. Also, the greater the ES, the higher the steady-state income per capita. If the ES is more than one, a unique steady state and the possibility of endogenous growth can be achieved (Barro and Sala-i-Martin 1995). Meanwhile, Pitchford (1960), Azariadis (1993), and Galor (1995), among others, considered that an ES lower than one in a CES function indicates multiple steady states and poverty traps for per capita output. Two studies relying on La Grandville, conducted by Yuhn (1991) and Cronin et al. (1997), attempted to test the relationship between the ES and economic growth. Comparing the US with South Korea, Yuhn (1991) found that the ES was higher for South Korea, which helps explain the higher growth rates achieved in that country after the 1960s. Using a data set for the 1961–1991 period, Cronin et al. (1997) estimated an ES of 13.01 between telecommunication and capital. Changes in the ES affect the growth rate since production is an increasing function of the ES. In the CES case, the ES influences growth in almost every case, except when both inputs are increasing at the same rate (Kamien and Schwartz 1968).

Most studies on production functions in Vietnam have used frequentist methods or the accounting method to estimate the Cobb-Douglas function. As is known, this production function has an ES of one. Tu and Nguyen (2012) used the Cobb-Douglas function to analyze the impact of inputs on coffee productivity in DakLak province. Q.H. Nguyen (2013) applied the accounting method to build a Cobb-Douglas function for Hung Yen province in order to identify the sources of its economic growth. Khuc and Bao (2016) built an extended Cobb-Douglas function to identify factors contributing to the growth of Vietnamese industry. Using the accounting method, Le estimated Vietnam's Cobb-Douglas function based on enterprise data for mining, the processing industry, and electricity and water production and distribution. The results show that the proportions of labor and fixed assets in the total output of the studied sectors range from 0.11 to 0.39 and from 0.89 to 0.61, respectively.

For other types of the Cobb-Douglas function, Pham and Ly (2016) constructed a translog Cobb-Douglas function for the manufacturing enterprises of Vietnam, having net revenue as the output and capital, labor, and other costs as the inputs, based on data extracted from the 2010 Vietnam Enterprise Survey by the General Statistics Office. Huynh (2019) used the MLE method on a dataset extracted from the Enterprise Survey of the General Statistics Office for the period 2013–2016 to build a Battese-Coelli production function and analyze the factors affecting the technical efficiency of small and medium enterprises in Vietnam.

It is noted that in production function theory, many studies have tried to "soften" the premises of the Cobb-Douglas and the CES functions, but so far no other functions have surpassed them in popularity. Moreover, because of the very rigid premises of the Cobb-Douglas function, the CES is increasingly explored. Hence, in the present work, the CES is selected to estimate the ES based on the data set of Vietnamese nonfinancial enterprises.

#### **5. Methodology and Data**

#### *5.1. Method and Model*

There are several methods applied to estimate the ES, but the different techniques can be divided into two main groups: direct and indirect. A direct method allows for estimating the ES through the specification of a production function. The indirect method explores the link between the ES and factor shares to obtain the estimates. We can estimate the ES via the first-order profit maximization condition for labor employment. McFadden (1978) considered that the choice of estimation method depends on data availability, while Mizon (1977) preferred the direct method to the indirect one, as the former provides estimates for a large number of functional forms using a common estimation technique and data set. In this study, following Mizon (1977), the author chooses the direct method.

Note that most of the previous studies estimated the ES within the frequentist framework using the CES or the VES. However, in the last three decades, the Bayesian approach has become popular in the social sciences thanks to some of its important strengths (Nguyen et al. 2019; Briggs and Nguyen 2019; Thach et al. 2019; Thach 2019). The choice between Bayesian and frequentist analysis depends on the specific research problem. Firstly, if we would like to estimate the probability that a parameter belongs to a given interval, the Bayesian framework is appropriate, but if we want to perform a repeated-sampling inference about some parameter, the frequentist approach is needed. Secondly, and following from the first point, frequentist confidence intervals do not have a straightforward probabilistic interpretation, unlike Bayesian credible intervals. A 95% confidence interval can be explained as follows: if the same experiment is repeated many times and a confidence interval is computed for each experiment, then 95% of those intervals will contain the true value of the parameter. The probability that the true value falls in any given confidence interval is either one or zero, and we do not know which. Meanwhile, a 95% Bayesian credible interval has the straightforward interpretation that the probability that the parameter lies in the interval is 95%. Thirdly, frequentist analysis is performed to approximate the true values of unknown parameters, while Bayesian analysis provides the entire posterior distribution of the model parameters.
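The repeated-sampling reading of a confidence interval can be made concrete by simulation. The Python sketch below is illustrative only: it assumes normally distributed data with a known standard deviation, and the true mean, sample size, and number of replications are arbitrary choices.

```python
import math
import random

def interval_covers(true_mean, sd, n, rng):
    """Draw one sample and report whether its 95% interval contains true_mean."""
    sample = [rng.gauss(true_mean, sd) for _ in range(n)]
    sample_mean = sum(sample) / n
    half_width = 1.96 * sd / math.sqrt(n)  # known-variance normal interval
    return sample_mean - half_width <= true_mean <= sample_mean + half_width

rng = random.Random(0)
replications = 2000
coverage = sum(interval_covers(1.0, 2.0, 30, rng) for _ in range(replications)) / replications
print(coverage)  # share of intervals that contain the true mean
```

Each individual interval either contains the true mean or does not; the 95% figure describes the long-run share of intervals that do, not the probability attached to any single one.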

In the current study, making use of the direct method, the author estimates the ES through specifying an aggregate CES function. To estimate the CES function, the Bayesian nonlinear mixed-effects regression is performed. The Bayesian mixed-effects models with the grouping structure of the data consisting of multiple levels of nested groups contain both fixed effects and random effects. Our two-level mixed-effects model accounts for the variability between enterprises, which are identified by the id variable. According to Nezlek (2008), the results of analyses of multilevel data that do not take into account the multilevel nature of the data may (or perhaps will) be inaccurate. Based on Equation (22), our nonlinear model has the following expression:

$$
\ln y\_i = \beta\_0 - \frac{1}{\theta} \ln \left( \delta \times K\_i^{-\theta} + (1 - \delta) L\_i^{-\theta} \right) + \varepsilon\_i \tag{23}
$$

where *lnyi* is the natural logarithm of output, *Ki* and *Li* are the natural logarithms of capital and labor used, respectively, β0 is an intercept, θ is used to calculate σ = 1/(1 + θ), and ε*i* is a random error. The conditions 0 < δ < 1 and θ > −1 must be satisfied for Equation (23) to be a neoclassical function.
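A minimal Python sketch of the deterministic part of Equation (23) follows. The function names and the numerical values are illustrative assumptions; the inputs must be positive for the non-integer powers to be defined.

```python
import math

def ln_ces(k, l, beta0, delta, theta):
    """Right-hand side of Eq. (23) without the error term.

    k and l are the capital and labor regressors as defined in the text,
    and theta must be nonzero with theta > -1.
    """
    return beta0 - (1 / theta) * math.log(delta * k ** -theta + (1 - delta) * l ** -theta)

def implied_es(theta):
    """sigma = 1 / (1 + theta)."""
    return 1 / (1 + theta)

def residual_sum_of_squares(rows, beta0, delta, theta):
    """Sum of squared errors over (ln_y, k, l) rows for given parameters."""
    return sum((ln_y - ln_ces(k, l, beta0, delta, theta)) ** 2 for ln_y, k, l in rows)

# When both regressors are equal, the CES term collapses to their common log.
value = ln_ces(3.0, 3.0, beta0=0.5, delta=0.4, theta=0.7)
print(value, implied_es(0.7))
```

The collapse to β0 + ln(3) when *K* = *L* = 3 is a convenient sanity check of an implementation, since it holds for any admissible δ and θ.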

In Bayesian analysis, we use conditional probability:

$$p(A|B) = \frac{p(A, B)}{p(B)}\tag{24}$$

to derive Bayes's theorem:

$$p(B|A) = \frac{p(A|B) \times p(B)}{p(A)}\tag{25}$$

where *A*, *B* are random vectors.
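For two binary events, the relation between conditional probability and Bayes's theorem can be verified directly. The joint probabilities in this Python sketch are hypothetical.

```python
# Hypothetical joint distribution p(A, B) over two binary events.
joint = {
    (True, True): 0.20, (True, False): 0.10,
    (False, True): 0.20, (False, False): 0.50,
}

def marginal_a(a):
    return sum(p for (a_i, _), p in joint.items() if a_i == a)

def marginal_b(b):
    return sum(p for (_, b_i), p in joint.items() if b_i == b)

def cond_a_given_b(a, b):
    """Conditional probability: p(A|B) = p(A, B) / p(B)."""
    return joint[(a, b)] / marginal_b(b)

def cond_b_given_a_bayes(b, a):
    """Bayes's theorem: p(B|A) = p(A|B) * p(B) / p(A)."""
    return cond_a_given_b(a, b) * marginal_b(b) / marginal_a(a)

direct = joint[(True, True)] / marginal_a(True)  # p(B|A) from the joint table
via_bayes = cond_b_given_a_bayes(True, True)     # p(B|A) through Bayes's theorem
print(direct, via_bayes)
```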

Assuming that a data vector *y* is a sample from a probability model with the unknown parameter vector θ, this model is written using a likelihood function:

$$L(\theta; y) = f(y; \theta) = \prod\_{i=1}^{n} f(y\_i | \theta) \tag{26}$$

where *f*(*yi*|θ) is the probability density function of *yi* given θ.

Relying on the given data *y*, we infer some properties of θ. In Bayesian analysis, the model parameter θ is a random vector.

We begin Bayesian analysis by specifying a posterior model. The posterior model combines given data and prior information to present the probability distribution of all parameters. Therefore, the posterior distribution has two components: A likelihood function containing information about the model parameters based on observed data, and prior distribution, including known information about the model parameters. By Bayes' law, the likelihood function and priors are combined to form the posterior model:

$$\text{Posterior} \propto \text{Likelihood} \times \text{Prior} \tag{27}$$

Because both *y* and θ are random variables, we apply Bayes's theorem to obtain the posterior distribution of θ given *y*:

$$p(\theta|y) = \frac{p(y|\theta) \times p(\theta)}{p(y)} = \frac{f(y;\theta) \times \pi(\theta)}{m(y)}\tag{28}$$

where *m*(*y*) ≡ *p*(*y*) is known as the marginal distribution of *y*, which is formulated as follows:

$$m(y) = \int f(y;\theta) \times \pi(\theta) \, d\theta \tag{29}$$

where *f*(*y*; θ) is a likelihood function of *y* given θ, π(θ) is a prior distribution for θ, *m*(*y*) is also known as the prior predictive distribution.
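Equations (26) to (29) can be followed step by step in a one-parameter example. The Python sketch below uses a grid approximation for a normal mean with known standard deviation; the observations, the prior N(0, 10²), and the grid bounds are hypothetical choices made for illustration, not the model of this paper.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical observations and settings.
y = [1.2, 0.8, 1.5, 1.1]
data_sd = 1.0                      # sampling standard deviation, assumed known
prior_mean, prior_sd = 0.0, 10.0   # weakly informative normal prior

step = 0.01
grid = [-5.0 + step * i for i in range(1201)]  # theta from -5 to 7

def likelihood(theta):
    """f(y; theta) = prod_i f(y_i | theta), as in Eq. (26)."""
    return math.prod(normal_pdf(y_i, theta, data_sd) for y_i in y)

unnormalized = [likelihood(t) * normal_pdf(t, prior_mean, prior_sd) for t in grid]
m_y = sum(unnormalized) * step                   # Riemann sum for m(y), Eq. (29)
posterior = [u / m_y for u in unnormalized]      # p(theta | y), Eq. (28)

posterior_mean = sum(t * p for t, p in zip(grid, posterior)) * step
print(posterior_mean)
```

In this conjugate case the posterior mean is also available in closed form, which gives a direct check on the grid result; for models such as the nonlinear CES regression no such closed form exists.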

In cases when the posterior distribution is derived in closed form, we can proceed immediately to the inference step. However, except for some special models, the posterior distribution is rarely available in closed form and needs to be estimated through simulation. Markov chain Monte Carlo (MCMC) methods can be used to simulate many Bayesian models; they require efficient sampling and verification that the MCMC chains have converged to the stationary distribution.
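To illustrate what sampling and convergence checking involve, the Python sketch below runs a random-walk Metropolis sampler, one common MCMC algorithm, on a standard normal target and computes the Gelman-Rubin statistic across chains. This is a generic illustration and not necessarily the sampler used in this study; the target, step size, chain length, and starting points are all assumptions.

```python
import math
import random

def log_target(theta):
    """Log density (up to a constant) of a standard normal stand-in posterior."""
    return -0.5 * theta * theta

def metropolis(log_density, start, n_draws, step_sd, rng):
    """Random-walk Metropolis sampler for a scalar parameter."""
    draws, current, current_lp = [], start, log_density(start)
    for _ in range(n_draws):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    between = n * sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(1)
burn_in = 1000
starts = (-5.0, -2.0, 2.0, 5.0)  # dispersed starting points
chains = [metropolis(log_target, s, 6000, 2.4, rng)[burn_in:] for s in starts]
r_hat = gelman_rubin(chains)
pooled_mean = sum(sum(c) for c in chains) / sum(len(c) for c in chains)
print(r_hat, pooled_mean)
```

Values of the statistic close to one indicate that chains started from different points have reached the same distribution, which is the convergence requirement stated above.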

Experience of fitting Bayesian models shows that the specification of priors can rest on previous studies and expert knowledge. In our research, the properties of a neoclassical production function and previous research suggest how to specify the priors. To specify the CES, referring to Arrow et al. (1961), Afees (2015) or Lagomarsino (2017), we assign the normal N(1, 100) prior to parameter β0, the uniform(0, 1) prior to parameter δ, the gamma(1, 1) prior to parameter θ, and the Igamma(0.001, 0.001) prior to the variance component for *u*1*j* (σ<sup>2</sup><sub>id</sub>) and to the overall variance parameter (σ<sup>2</sup><sub>0</sub>).

Our Bayesian nonlinear mixed-effects regression model is as follows. The likelihood function:

$$
\ln y\_{ij} = \beta\_0 - \frac{1}{\theta} \ln \left( \delta \times (\ln k2010\_{ij})^{-\theta} + (1 - \delta) (\ln l\_{ij})^{-\theta} \right) + u\_{1j} + \varepsilon\_{ij} \tag{30}
$$

The priors:

$$
\beta\_0 \sim \text{N}(1, 100)
$$

$$
\delta \sim \text{uniform}(0, 1)
$$

$$
\begin{aligned}
\theta &\sim \text{gamma}(1, 1) \\
 u\_{1j} &\sim \text{N}\left(0, \sigma\_{\text{id}}^2\right) \\
\sigma\_{\text{id}}^2 &\sim \text{Igamma}(0.001, 0.001) \\
 \sigma\_0^2 &\sim \text{Igamma}(0.001, 0.001)
\end{aligned}
\tag{31}
$$

where *lnyij*, *lnk*2010*ij*, *lnlij* are the natural logarithms of output, capital, and labor employed, respectively, in constant 2010 prices; β0 is the efficiency parameter, θ is the substitution parameter, δ is the distribution parameter, *u*1*j* is the enterprise-level random effect, ε*ij* is the random error, year *i* = 2008, ..., 2018, and enterprise *j* = 1, 2, 3, ..., 227.
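The model in (30) and (31) can be written as an unnormalized log posterior, which is the quantity an MCMC sampler evaluates at each step. The Python sketch below is an illustration under stated assumptions: N(1, 100) is read as mean 1 and variance 100, Igamma(0.001, 0.001) as shape and scale, the errors are taken to be normal with variance σ<sup>2</sup><sub>0</sub>, and the three data rows and all parameter values are hypothetical.

```python
import math

def log_normal_pdf(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def log_igamma_pdf(x, shape, scale):
    """Inverse-gamma log density (shape-scale form, assumed parameterization)."""
    if x <= 0:
        return -math.inf
    return (shape * math.log(scale) - math.lgamma(shape)
            - (shape + 1) * math.log(x) - scale / x)

def log_posterior(params, u, data):
    """Unnormalized log posterior: log likelihood plus log priors.

    params = (beta0, delta, theta, var_id, var_0); u maps enterprise id to its
    random effect; data holds (id, ln_y, ln_k, ln_l) rows with positive ln_k, ln_l.
    """
    beta0, delta, theta, var_id, var_0 = params
    if not (0 < delta < 1) or theta <= 0 or var_id <= 0 or var_0 <= 0:
        return -math.inf                          # outside the prior support
    lp = log_normal_pdf(beta0, 1.0, 100.0)        # beta0 ~ N(1, 100)
    lp += 0.0                                     # delta ~ uniform(0, 1)
    lp += -theta                                  # theta ~ gamma(1, 1)
    lp += log_igamma_pdf(var_id, 0.001, 0.001)
    lp += log_igamma_pdf(var_0, 0.001, 0.001)
    lp += sum(log_normal_pdf(u_j, 0.0, var_id) for u_j in u.values())
    for firm, ln_y, ln_k, ln_l in data:
        inner = delta * ln_k ** -theta + (1 - delta) * ln_l ** -theta
        mean = beta0 - math.log(inner) / theta + u[firm]
        lp += log_normal_pdf(ln_y, mean, var_0)
    return lp

# Hypothetical records for two enterprises; not data from the study.
data = [(1, 3.1, 2.9, 2.2), (1, 3.3, 3.0, 2.3), (2, 2.4, 2.1, 1.9)]
u = {1: 0.1, 2: -0.1}
lp_valid = log_posterior((0.5, 0.4, 0.8, 0.2, 0.1), u, data)
lp_invalid = log_posterior((0.5, 1.2, 0.8, 0.2, 0.1), u, data)
print(lp_valid, lp_invalid)
```

Returning negative infinity outside the prior support enforces the neoclassical restrictions 0 < δ < 1 and θ > −1 automatically, since the gamma prior places θ on the positive half-line.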
