**3. Methodology**

#### *3.1. Models*

In this paper, we apply three different models to investigate energy use efficiency. In all models, we assume that the production technology consists of one output *Y* and a vector of four inputs *X* = (*L*, *K*, *NEM*, *E*), where *L* is labor, *K* is the capital stock, *NEM* is non-energy materials, and *E* is energy. The production technology, expressed as a transformation function (the form also used with multiple outputs), can be written in implicit form as

$$\mathcal{A}\mathcal{F}(\mathbf{Y}, \mathbf{X}) = 1. \tag{1}$$

If the manufacturing process does not experience production shocks, $\mathcal{A} = 1$ and $\mathcal{F}(Y, X) = 1$. However, since both positive and negative shocks hit production, the transformation function is made stochastic by setting $\mathcal{A} = \exp(v)$, where *v* can be either positive or negative. Moreover, if inputs are not used with 100% efficiency, the transformation function in (1) can be expressed as

$$\mathcal{A}\mathcal{F}(\mathbf{Y}, \theta \mathbf{X}; \boldsymbol{\beta}) = 1,\tag{2}$$

where *θ* ≤ 1 is the input technical efficiency (defined as the ratio of the minimum required amount of each input to the amount actually used) and *β* is the set of technology parameters of the function $\mathcal{F}$. Since the transformation function is homogeneous of degree 1 in inputs (see [35]), we can rewrite (2) as

$$\mathcal{A}\mathcal{F}(\mathbf{Y}, \lambda\theta \mathbf{X}; \boldsymbol{\beta}) = \lambda, \quad \lambda > 0. \tag{3}$$

Further, we can set $\lambda = (E\theta)^{-1}$, where *E* is the energy input. Note that any other input could have been chosen in place of *E*. Then (3) becomes

$$E^{-1} \theta^{-1} = f(\mathbf{Y}, \tilde{\mathbf{X}}\_{-E}; \boldsymbol{\beta}) \exp(v),\tag{4}$$

where $\tilde{X}\_{-E} = (L/E, K/E, NEM/E)$. Taking logs of both sides of (4) and denoting $u = -\log\theta \geq 0$, we obtain (Model 1)

$$-\log E = \log f(\mathbf{Y}, \tilde{\mathbf{X}}\_{-E}; \boldsymbol{\beta}) + v - u. \tag{5}$$
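The step from (3) to (5) can be spelled out as follows. This is a sketch under one notational assumption that is not stated explicitly in the text: *f* is taken to be $\mathcal{F}$ evaluated at the normalized input vector, whose energy element equals 1. Setting $\lambda = (E\theta)^{-1}$ in (3) gives $\lambda\theta \mathbf{X} = \mathbf{X}/E$, so that

$$\mathcal{A}\mathcal{F}(\mathbf{Y}, \mathbf{X}/E; \boldsymbol{\beta}) = (E\theta)^{-1}.$$

With $\mathcal{A} = \exp(v)$ and $f(\mathbf{Y}, \tilde{\mathbf{X}}\_{-E}; \boldsymbol{\beta}) \equiv \mathcal{F}(\mathbf{Y}, (\tilde{\mathbf{X}}\_{-E}, 1); \boldsymbol{\beta})$, this is (4), and taking logs yields

$$-\log E - \log\theta = \log f(\mathbf{Y}, \tilde{\mathbf{X}}\_{-E}; \boldsymbol{\beta}) + v.$$

Substituting $u = -\log\theta$ and moving *u* to the right-hand side reproduces (5).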

The stochastic frontier (SF) formulation in (5) is known as the input distance function formulation, where *u* is input-oriented inefficiency, which measures the percentage (when multiplied by 100) over-use of all the inputs. For small values of *u*, $e^{-u} \approx 1 - u$. That is, technical efficiency is approximately 1 minus technical inefficiency. It is important to keep this relationship in mind because we switch from one to the other quite frequently. Technical efficiency in this model refers to the efficiency of all inputs, including energy. That is, in this model, inefficiency, *u*, is interpreted as over-use of all inputs, including energy, at the same rate. The other two models focus exclusively on energy-use efficiency. Before we explain how *u* can be estimated, we introduce these two other approaches.
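To give a sense of how accurate the small-*u* approximation is, the following minimal Python sketch compares exp(−*u*) with 1 − *u*. The function names and the inefficiency values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math


def exact_efficiency(u: float) -> float:
    """Technical efficiency exp(-u) implied by inefficiency u >= 0."""
    if u < 0:
        raise ValueError("inefficiency u must be non-negative")
    return math.exp(-u)


def approx_efficiency(u: float) -> float:
    """First-order approximation 1 - u, adequate only for small u."""
    return 1.0 - u


# Illustrative (made-up) inefficiency values, not estimates from the paper.
for u in (0.01, 0.05, 0.10, 0.30):
    gap = exact_efficiency(u) - approx_efficiency(u)
    print(
        f"u={u:.2f}  exp(-u)={exact_efficiency(u):.4f}  "
        f"1-u={approx_efficiency(u):.4f}  gap={gap:.4f}"
    )
```

The gap grows roughly like *u*²/2, so at *u* = 0.30 the approximation understates efficiency by about four percentage points, while at *u* = 0.05 the difference is close to one tenth of a percentage point.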

The transformation function can also be written as a factor requirement function (see, e.g., [36]). Since the focus is on energy use, we can express the technology in terms of *E*, and write it as,

$$E = G(\mathbf{Y}, \mathbf{X}\_{-E}),\tag{6}$$

where $X\_{-E} = (L, K, NEM)$. Again, assuming that both positive and negative shocks $v'$ can influence the energy requirement and positing that energy is not used 100% efficiently, we can rewrite (6) as

$$E = g(\mathbf{Y}, \mathbf{X}\_{-E}; \gamma) \exp(v') \exp(u'),\tag{7}$$

where *γ* is the vector of parameters of the energy requirement function, $v'$ is a symmetric error term and $u'$ is the energy use inefficiency. Taking logs of both sides of (7) gives us the energy requirement function with inefficiency, viz., (Model 2)

$$\log E = \log g(\mathbf{Y}, \mathbf{X}\_{-E}; \gamma) + v' + u'. \tag{8}$$

This approach was, for example, applied by [6,29] to plant-level data using the second-generation SF model. Note that (8) has a stochastic cost function type formulation. Any inefficiency in the use of energy will increase cost.
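The mechanics of (8) can be illustrated with a small Python sketch. It assumes a Cobb-Douglas (log-linear) form for *g*, which the text does not specify, and all parameter values and input levels are made up for illustration; the function names are hypothetical as well.

```python
import math


def log_energy_requirement(log_y, log_l, log_k, log_nem, gamma):
    """Hypothetical log-linear form for log g(Y, X_{-E}; gamma)."""
    g0, g_y, g_l, g_k, g_nem = gamma
    return g0 + g_y * log_y + g_l * log_l + g_k * log_k + g_nem * log_nem


def observed_log_energy(log_g, v, u):
    """Equation (8): log E = log g + v' + u', with u' >= 0."""
    if u < 0:
        raise ValueError("energy use inefficiency u' must be non-negative")
    return log_g + v + u


def energy_overuse_share(u):
    """Share of observed energy that is excess use: 1 - exp(-u')."""
    return 1.0 - math.exp(-u)


# Made-up parameters and input levels, for illustration only.
gamma = (0.5, 0.8, -0.1, -0.2, -0.1)
log_g = log_energy_requirement(
    math.log(100.0), math.log(20.0), math.log(50.0), math.log(30.0), gamma
)
log_e = observed_log_energy(log_g, v=0.0, u=0.10)

print("observed / frontier energy:", math.exp(log_e - log_g))
print("share of energy overused:", energy_overuse_share(0.10))
```

With the noise term set to zero, the ratio of observed to frontier energy equals exp(*u*′), which is why any positive *u*′ raises energy use, and hence cost, above the frontier level.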

Finally, in our last model we recognize endogeneity of output *Y*. That is, we assume profit maximizing behavior to derive the energy demand function

$$E = H(w, \mathbf{X}\_{-E}),\tag{9}$$

where $w = w\_E/p$, $w\_E$ is the energy price and *p* is the output price. Similar to the factor requirement function, we can obtain energy use inefficiency from the demand function (Model 3)

$$\log E = \log h(w, \mathbf{X}\_{-E}; \boldsymbol{\delta}) + v^{\prime\prime} + u^{\prime\prime}, \tag{10}$$

where *δ* is the vector of parameters of the energy demand function, $v''$ is a symmetric error term and $u''$ is the energy use inefficiency.

The difference between (8) and (10) is that in the latter the energy input is chosen optimally by maximizing profit. In (8), energy overuse is measured taking output and all other inputs as given. That is, inefficiency in this model shows by how much energy is overused to produce a given level of output with given amounts of all other inputs. On the other hand, inefficiency in (10) comes from excess use of energy when all other inputs and output are chosen optimally instead of being taken as exogenously given. From an econometric estimation point of view, this means that *Y* and $X\_{-E}$ are exogenous in Model 2, whereas they are endogenous in Model 3.

In the next sub-section, we examine all three models in more detail in light of the panel stochastic frontier framework. In particular, we add firm heterogeneity and decompose inefficiency into persistent and transient components.

#### *3.2. Stochastic Frontier Approach with Panel Data*

The stochastic production frontier function approach was introduced for cross-sectional data independently by [37,38]. This is expressed as

$$\log q\_i = r(\mathbf{X}\_i; \boldsymbol{\omega}) + v\_i - u\_i, \tag{11}$$

where *r*(·) is the technology (namely, the production function in logarithmic form), $q\_i$ is the output, $X\_i$ is a vector of inputs (in logs) for production unit *i*, *ω* is a vector of parameters that define the technology, $v\_i$ is the usual error/noise term, and $u\_i \geq 0$ is the inefficiency. In this model, the data are cross-sectional, and hence the error components $v\_i$ and $u\_i$ represent cross-sectional shocks to production and production unit-specific inefficiency, respectively. When panel data are available, shocks and inefficiency can be both time-constant and time-varying. The authors of [22,39,40] were the first to recognize this and formulated the following 4-component stochastic frontier model for panel data. We use this framework for our Model 1, and write it as:

$$\log q\_{it} = r(\mathbf{X}\_{it}, trend; \omega) + v\_{0i} - u\_{0i} + v\_{it} - u\_{it} \tag{12}$$

where *t* is the time period in which production unit *i* is observed. In (12) we have two additional terms compared to (11). More specifically, $v\_{it}$ is the usual symmetric error term, $v\_{0i}$ is an individual (production unit) effect, also known to represent an individual production shock (or heterogeneity), $u\_{0i} \geq 0$ is the persistent or structural time-invariant inefficiency, and finally $u\_{it} \geq 0$ is the transient or short-term time-varying inefficiency. Thus, the overall inefficiency is the sum of persistent and transient inefficiency, and overall efficiency $TE^{overall}$ is decomposed into persistent $TE^{persistent}$ and transient $TE^{transient}$ components, i.e.,

$$TE^{overall} = TE^{persistent} \times TE^{transient}. \tag{13}$$

Note that persistent and transient efficiency ($TE^{persistent}$ and $TE^{transient}$) are defined as $e^{-u\_{0i}}$ and $e^{-u\_{it}}$, respectively. The originally proposed model assumed all 4 components to be random and homoskedastic, and did not include determinants of inefficiency. In our analysis, we use the model of [11], which introduces determinants of both types of inefficiency in (12).
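The decomposition in (13) follows directly from the additive inefficiency terms in (12), since exp(−(*u*0*i* + *uit*)) = exp(−*u*0*i*) × exp(−*uit*). A minimal Python sketch is given below; the function names and the inefficiency values for the single production unit are hypothetical and serve only as an illustration.

```python
import math


def efficiency(u: float) -> float:
    """Efficiency exp(-u) implied by a non-negative inefficiency term u."""
    if u < 0:
        raise ValueError("inefficiency must be non-negative")
    return math.exp(-u)


def overall_efficiency(u0_i: float, u_it: float) -> float:
    """Equation (13): overall = persistent * transient efficiency."""
    return efficiency(u0_i) * efficiency(u_it)


# Made-up values for one production unit observed over three periods.
u0_i = 0.15                    # persistent (time-invariant) inefficiency
u_it_path = [0.05, 0.10, 0.02]  # transient (time-varying) inefficiency

for t, u_it in enumerate(u_it_path, start=1):
    print(
        f"t={t}  TE_persistent={efficiency(u0_i):.4f}  "
        f"TE_transient={efficiency(u_it):.4f}  "
        f"TE_overall={overall_efficiency(u0_i, u_it):.4f}"
    )
```

Persistent efficiency is the same in every period, so all variation in overall efficiency over time for a given unit comes from the transient component.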

To estimate the parameters *ω* in (12), we assume that $v\_{it} \sim \mathcal{N}(0, \sigma\_{v\_{it}}^2)$, $v\_{0i} \sim \mathcal{N}(0, \sigma\_{v\_{0i}}^2)$, $u\_{it} \sim \mathcal{N}^+(0, \sigma\_{u\_{it}}^2)$, and $u\_{0i} \sim \mathcal{N}^+(0, \sigma\_{u\_{0i}}^2)$, where $\mathcal{N}^+$ denotes the positive part of the zero-mean normal distribution, making $u\_{it}$ and $u\_{0i}$ half-normally distributed. We assume that both the noise $v\_{it}$ and the individual effects $v\_{0i}$ are homoskedastic, so that $\sigma\_{v\_{it}} = \sigma\_v$ and $\sigma\_{v\_{0i}} = \sigma\_{v\_0}$. We introduce determinants of time-varying inefficiency via the pre-truncated variance of $u\_{it}$. More specifically, we assume

$$
\sigma\_{u\_{it}}^2 = \exp\left(z\_{u\_{it}}\boldsymbol{\psi}\_u\right), \quad i = 1, \cdots, n, \quad t = 1, \cdots, T\_i, \tag{14}
$$

where $z\_{u\_{it}}$ denotes the vector of covariates that explain time-varying inefficiency. Since $u\_{it}$ is half-normal, $E(u\_{it}) = \sqrt{2/\pi}\,\sigma\_{u\_{it}} = \sqrt{2/\pi}\,\exp\left(\tfrac{1}{2} z\_{u\_{it}} \boldsymbol{\psi}\_u\right)$, and therefore anything that affects $\sigma\_{u\_{it}}$ also affects time-varying inefficiency. The determinants of persistent inefficiency can be modeled similarly. However, because the data set does not provide natural determinants of persistent inefficiency, we leave it homoskedastic, i.e., $\sigma\_{u\_{0i}} = \sigma\_{u\_0}$.
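The link between the covariates in (14) and expected transient inefficiency can be checked numerically. The Python sketch below uses only the standard library; the coefficient vector, the covariate values, and the function names are hypothetical and are not taken from the estimation in this paper. It draws half-normal variates as absolute values of zero-mean normal draws and compares their sample mean with the closed-form half-normal mean.

```python
import math
import random


def sigma_u(z, psi):
    """Pre-truncation standard deviation from (14): exp(0.5 * z psi)."""
    return math.exp(0.5 * sum(z_k * psi_k for z_k, psi_k in zip(z, psi)))


def expected_inefficiency(z, psi):
    """Mean of a half-normal variable: sqrt(2 / pi) * sigma_u."""
    return math.sqrt(2.0 / math.pi) * sigma_u(z, psi)


random.seed(0)  # fixed seed so the illustration is reproducible

psi = [-2.0, 0.5]  # made-up coefficients: intercept and one determinant
z = [1.0, 1.2]     # made-up covariate vector (constant, determinant value)

sig = sigma_u(z, psi)
# |N(0, sigma^2)| is half-normal with pre-truncation variance sigma^2.
draws = [abs(random.gauss(0.0, sig)) for _ in range(200_000)]
simulated_mean = sum(draws) / len(draws)

print("closed-form E(u_it):", round(expected_inefficiency(z, psi), 4))
print("simulated mean     :", round(simulated_mean, 4))
```

Because the assumed coefficient on the determinant is positive, a larger covariate value raises the pre-truncation standard deviation and therefore the expected transient inefficiency; a negative coefficient would lower it.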

The parameters *ω*, as well as variances of the 4 components and their determinants, can be estimated by the single stage maximum simulated likelihood (MSL) method (see Appendix B and [11] for details of the estimation procedure). We follow [39] to calculate the persistent and transient efficiencies. The overall efficiency is then calculated as the product of the persistent and transient efficiencies.

We add firm heterogeneity and decompose inefficiency into persistent and transient components for Models 2 and 3, outlined in (8) and (10), in the same way as in Model 1. After adding these components, the models look mathematically quite similar to (12). Because of this, we skip the details to avoid repetition. However, note that the interpretation of inefficiency in these models is different. In Model 2, inefficiency refers to overuse of energy, given everything else. Consequently, persistent and transient inefficiency in Model 2 decompose energy overuse into a time-invariant and a time-varying component, *ceteris paribus*. Similar to Model 2, inefficiency in Model 3, described in (10), after adding firm heterogeneity and persistent inefficiency, is specifically related to energy overuse. However, unlike Model 2, it does not take other inputs as given. In Model 3, inputs are chosen optimally, and inefficiency in production is transmitted to overuse of inputs via the demand for energy. That is, we focus only on energy by examining the energy demand function.
