### **1. Introduction**

The evolution of water management is a key issue for human development, and delivering this service effectively is a challenge for the community. Designing a good management system requires considering different factors. In countries such as England and Wales, Portugal, Chile, or the Netherlands, the water industry operates as a monopoly, so companies and administrations focus their efforts on comparing the different processes within the industry. In general, benchmarking is widely considered a good strategy to control and supervise the performance of this service. Ref. [1] provides a rigorous evaluation of the growing number of benchmarking studies dealing with performance scores based on production or cost estimates. At the same time, the literature reveals frequent use of performance indicators (PIs) in benchmarking because of the multiple benefits they bring to the administrations, for instance, to contrast regulatory conditions, to compare and/or evaluate the quality of the service, and to establish fair tariff policies. To control these values, water utilities, following industry regulations, provide systematic reports on different PIs to the government or administrators. The information in these reports covers management, environmental, financial and, more recently, social aspects related to water operations. However, this set of indicators is difficult to interpret for several reasons; in particular, it does not offer a holistic view, since the individual PIs do not reflect a measure of general performance.

To overcome this difficulty, a common approach is to aggregate the PIs into a single indicator, called a composite indicator (CI). Although the literature offers a wide range of techniques to create a CI, most of them use methodologies from multi-criteria decision analysis (MCDA). They have been used to develop CIs applied to diverse sectors of services, activities, or processes [2–4]. In particular, methodologies based on goal programming (GP) are of great interest for the construction of CIs, and they have been successfully applied to diverse fields such as tourism [5,6], manufacturing [7], human sustainable development [8–10], and environmental sustainability [11,12]. The main advantages of using GP to develop CIs are: it is not necessary to normalize the initial set of PIs; the CI uses the complete information included in the initial set of PIs; and it does not require a large number of units in comparison with the number of initial indicators.

Another technique commonly used to create CIs is data envelopment analysis (DEA) [13]. DEA is a linear programming tool for evaluating the performance of a set of peer entities that use one or more inputs to produce one or more outputs. As pointed out by [14], the main advantages of using DEA to construct CIs are: it provides a measure of performance based on real data; DEA models do not require the normalization of the initial data; and DEA respects the individual characteristics of the units and their own particular value systems. Techniques based on DEA have been developed to create CIs in [6,15–17].

Since the 1990s, governments of many countries and organizations have emphasized the importance of the concept of sustainability [18]. There is no consensus on the definition of this concept, although it is widely agreed that it must incorporate social, environmental, and economic factors, which are interconnected [19,20]. The water industry has not ignored this trend and now widely recognizes its important role in establishing and operating sustainable water supplies and wastewater treatment systems [2,21]. There is clearly a need for a paradigm shift in water companies, so that social and environmental aspects, not just economic issues, are considered in the decision-making process [22,23]. In the framework of evaluating the sustainability of water companies, most of the literature focuses on the sustainability of physical and engineering aspects [24–26], on an environmental perspective [27], or on economic sustainability [28,29]. However, there is a lack of interest in assessing the sustainability of water companies themselves. In particular, only a few papers apply different techniques from MCDA to assess the sustainability of water companies from a multidimensional perspective. For instance, Ref. [30] constructs an index by aggregating the PIs as a linear combination of their normalized values. Also, the MACBETH (Measuring Attractiveness by a Categorical Based Evaluation Technique [31]) method is used to evaluate the sustainability of water supply systems [24]. As another example, Ref. [19] applies the ELECTRE TRI-Nc (Elimination and Choice Expressing Reality [32]) method as a tool to integrate the dimensions of a quality of service index. Additionally, Ref. [12] combines the PIs using an index based on distance-principal components and another based on GP.

In view of the above, this work proposes a method to assess the sustainability of water companies using the traditional approach to sustainability, which considers three dimensions: social, environmental, and economic. To this end, a two-phase method combining GP and DEA is proposed, in order to take advantage of both methodologies. A similar two-phase method is proposed in [6] to evaluate the sustainability of Cuban nature-based tourism destinations. In that work, however, the distance-principal component (DPC) composite indicator developed by [33], instead of GP, is used to summarize the initial PIs into the established dimensions (social, economic, and patrimonial; although "environmental dimension" is the usual term, [6] replaces it with "patrimonial dimension"). The choice of a GP-based technique here is motivated by its good properties, as previously mentioned.

In the first phase, a technique based on GP [5] is used to obtain the dimensional or partial CIs. In the field of water treatment, there is a lack of consensus on the appropriate criteria for selecting the PIs involved in evaluating water sustainability. To overcome this difficulty, as suggested by [34], our proposal groups the initial indicators into the dimensions that characterize the concept of sustainability: social, economic, and environmental. In this way, when the first phase is applied, three dimensional composite indicators (social, environmental, and economic) are obtained for each water company. This allows the performance of each water company to be analyzed independently in each of these three dimensions.

Later, in the second phase, the dimensional indicators are aggregated into a global composite indicator for evaluating the water companies' sustainability. At this point, a controversial question is the assignment of weights to each dimensional indicator. On the one hand, under some circumstances, it is not easy to obtain information from specialists to determine these weights. On the other hand, assigning the same weights to all the water companies could be problematic, as each of them might have its own particularities in terms of preferences. To overcome these issues, we have chosen, for the second phase, a DEA-based model known as "Benefit-of-the-Doubt" [32]. The values obtained in the first phase are used as the outputs of this "Benefit-of-the-Doubt" model.

This two-phase approach offers the possibility of considering the strengths and weaknesses of each water company, as well as providing the decision-makers with useful information.

The hypothesis behind this study is that water companies should manage their activity in as balanced a way as possible from the social, environmental, and economic points of view. In this sense, the approach proposed in this work makes it possible to evaluate and compare the performance of water companies in each sustainability dimension and, later, to identify whether or not these dimensions have a similar influence on the global score. This aspect is an advantage of the proposed approach in comparison with other procedures. In the first phase, an indicator is obtained for each sustainability dimension, and in the second phase the different dimensional indicators are aggregated to build a global indicator. In this aggregation, the weights of the different dimensional indicators are endogenously determined using a DEA-based model, allowing each water company to be assessed in the way most favorable to it. This is another advantage of our proposal, since it does not demand excessive information to obtain the global indicator.

This study, therefore, presents a novel approach to assess the sustainability of water companies. To the best of our knowledge, there is no theoretical development or empirical application that uses composite indicators to assess and/or compare the sustainability of water companies both for each dimension of sustainability and for all the dimensions simultaneously. Thus, the dimensional composite indicators, in the first phase, make it possible to evaluate the strengths and weaknesses of each water company in a particular dimension. The global indicator, in the second phase, provides a holistic performance perspective and allows the water companies to be ranked. Moreover, it provides information about the contribution of the different dimensions to the overall sustainability score.

In the next section, the proposed methodology is detailed. Section 3 introduces the case study, which covers 163 Portuguese water companies, together with the results obtained. Finally, the main conclusions derived from the research are presented in the last section.

### **2. A Two-Phase Evaluation Method**

This section describes the methodology developed to construct the sustainability composite indicator used to evaluate the performance of water companies.

As previously mentioned, a two-phase procedure is proposed. In the first phase, following the proposal by [5], the composite indicator (sub-indicator) for each dimension of sustainability is calculated: *PSUI<sup>d</sup>* (Partial Sustainability Indicator of dimension *d*). In the second phase, these partial indicators form the basis from which the overall composite indicator is obtained, applying a variant of DEA named the "Benefit-of-the-Doubt" approach. Figure 1 shows the general scheme of the proposed approach.

**Figure 1.** General scheme of the proposed approach.

To calculate $PSUI^d$, $d = 1, 2, \dots, D$, it is necessary to distinguish between positive PIs (a larger value implies an improvement in sustainability) and negative PIs (a larger value implies a decline in sustainability). Suppose that the initial set of PIs is divided into $D$ dimensions and that there are $M$ units to evaluate. For each dimension $d$, let $J^d$ and $K^d$ denote the number of positive and negative PIs, respectively, assigned to dimension $d$, and let $I^{+}_{ij^d}$ and $I^{-}_{ik^d}$ denote the value of the $i$-th unit with respect to the $j^d$-th positive and the $k^d$-th negative PI belonging to dimension $d$ ($i = 1, 2, \dots, M$; $j^d = 1, 2, \dots, J^d$; $k^d = 1, 2, \dots, K^d$; $d = 1, 2, \dots, D$).

Additionally, the performance of a unit with respect to each PI is evaluated using the concept of "aspiration level", that is, the achievement level desired for the corresponding PI. Thus, it is possible to obtain a set of goals in line with the basic ideas underlying the GP approach [35]. Accordingly, let us assume that, for each positive PI, it is possible to give an aspiration level (denoted by $u^{+}_{j^d}$). It corresponds to the minimum value from which a unit is considered to show a suitable performance regarding the aspect of sustainability evaluated by the PI. Thus, for the $i$-th unit, the goal corresponding to the $j^d$-th positive PI can be defined as follows:

$$I^{+}_{ij^d} + n^{+}_{ij^d} - p^{+}_{ij^d} = u^{+}_{j^d} \quad \text{with } n^{+}_{ij^d},\ p^{+}_{ij^d} \ge 0, \quad n^{+}_{ij^d} \cdot p^{+}_{ij^d} = 0 \tag{1}$$

where $n^{+}_{ij^d}$ and $p^{+}_{ij^d}$ represent the negative and positive deviation variables, respectively. Thus, if the goal is satisfied ($I^{+}_{ij^d} > u^{+}_{j^d}$), the negative deviation variable is zero and the positive deviation variable measures the over-achievement of the goal (strength). Otherwise, if the goal is not satisfied ($I^{+}_{ij^d} < u^{+}_{j^d}$), the positive deviation variable is zero and the negative deviation variable quantifies the under-achievement of the goal (weakness). It should be noted that at least one of the two deviation variables has to be zero (both are zero when the PI equals its aspiration level exactly). Consequently, for positive PIs, the negative deviation variables are considered unwanted variables because a better-positioned company will achieve the aspiration level or a higher value.

In a similar way, for each negative PI, we have the following goal:

$$I^{-}_{ik^d} + n^{-}_{ik^d} - p^{-}_{ik^d} = u^{-}_{k^d} \quad \text{with } n^{-}_{ik^d},\ p^{-}_{ik^d} \ge 0, \quad n^{-}_{ik^d} \cdot p^{-}_{ik^d} = 0 \tag{2}$$

Again, $n^{-}_{ik^d}$ and $p^{-}_{ik^d}$ represent the negative and positive deviation variables, respectively. Now, however, if the goal is satisfied ($I^{-}_{ik^d} < u^{-}_{k^d}$), the positive deviation variable is zero and the negative deviation variable quantifies the under-achievement of the goal (strength). Otherwise, if the goal is not satisfied ($I^{-}_{ik^d} > u^{-}_{k^d}$), the negative deviation variable is zero and the positive deviation variable quantifies the over-achievement of the goal (weakness). Consequently, for negative PIs, the positive deviation variables are considered unwanted variables because a better-positioned company will achieve the aspiration level or a lower value.
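The goals in Equations (1) and (2) can be illustrated with a minimal sketch. It assumes that each deviation variable is simply the shortfall or excess of the PI with respect to its aspiration level, which is the only choice satisfying the non-negativity and complementarity conditions. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
def deviations(value, aspiration):
    """Return (n, p) with value + n - p = aspiration, n, p >= 0 and n * p = 0."""
    n = max(aspiration - value, 0.0)  # negative deviation: shortfall below the aspiration level
    p = max(value - aspiration, 0.0)  # positive deviation: excess above the aspiration level
    return n, p


def classify(value, aspiration, positive_pi):
    """Split the deviations of one PI into (strength, weakness)."""
    n, p = deviations(value, aspiration)
    # Positive PI: the excess is a strength and the shortfall is unwanted.
    # Negative PI: the shortfall is a strength and the excess is unwanted.
    return (p, n) if positive_pi else (n, p)


# Hypothetical data: one positive PI and one negative PI.
print(classify(95.0, 90.0, positive_pi=True))   # (5.0, 0.0)
print(classify(30.0, 20.0, positive_pi=False))  # (0.0, 10.0)
```

In this reading no optimization is needed to obtain the deviation variables, since each goal involves a single unit and a single PI.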

From all the above, in each dimension $d$, the strengths of each unit can be calculated by aggregating the positive deviation variables of the positive PIs and the negative deviation variables of the negative PIs. These variables are normalized by their corresponding aspiration levels to avoid distortions caused by the different measurement scales of the initial set of PIs. Similarly, the weaknesses of each water company can be obtained as the sum of the normalized unwanted deviation variables (the negative deviations of positive PIs and the positive deviations of negative PIs, each divided by its corresponding aspiration level). Finally, the partial indicator for the $i$-th unit ($i = 1, 2, \dots, M$) in dimension $d$ ($d \in D$) is determined by the difference between the strengths and weaknesses of this unit as follows:

$$\widehat{PSUI}_i^d = \left(\sum_{j^d=1}^{J^d} \frac{p^{+}_{ij^d}}{u^{+}_{j^d}} + \sum_{k^d=1}^{K^d} \frac{n^{-}_{ik^d}}{u^{-}_{k^d}}\right) - \left(\sum_{j^d=1}^{J^d} \frac{n^{+}_{ij^d}}{u^{+}_{j^d}} + \sum_{k^d=1}^{K^d} \frac{p^{-}_{ik^d}}{u^{-}_{k^d}}\right) \tag{3}$$
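As a hedged illustration of Equation (3), the following sketch computes the unnormalized partial indicator of one unit in one dimension. It assumes strictly positive aspiration levels (they appear as denominators) and obtains the deviation variables as shortfalls and excesses with respect to those levels. The function name and the data are hypothetical.

```python
def raw_partial_indicator(pos_values, pos_aspirations, neg_values, neg_aspirations):
    """Normalized strengths minus normalized weaknesses for one unit and one dimension."""
    strengths = 0.0
    weaknesses = 0.0
    for value, u in zip(pos_values, pos_aspirations):
        strengths += max(value - u, 0.0) / u   # p+ / u+
        weaknesses += max(u - value, 0.0) / u  # n+ / u+ (unwanted)
    for value, u in zip(neg_values, neg_aspirations):
        strengths += max(u - value, 0.0) / u   # n- / u-
        weaknesses += max(value - u, 0.0) / u  # p- / u- (unwanted)
    return strengths - weaknesses


# Hypothetical unit: two positive PIs and one negative PI in the same dimension.
score = raw_partial_indicator([95.0, 60.0], [90.0, 80.0], [30.0], [20.0])
print(round(score, 3))  # -0.694
```

A negative value, as in this example, indicates that the weaknesses of the unit outweigh its strengths in that dimension.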

Additionally, two fictitious units are introduced in the sample, representing the best and worst situations within the database. For each positive PI $j^d$ and each negative PI $k^d$ in dimension $d$, the value of the "best" unit ($b$) will be:

$$I^{+}_{bj^d} = \max_{i \in M} \left\{ I^{+}_{ij^d} \right\}, \quad I^{-}_{bk^d} = \min_{i \in M} \left\{ I^{-}_{ik^d} \right\} \tag{4}$$

and the value of the worst unit (*w*):

$$I^{+}_{wj^d} = \min_{i \in M} \left\{ I^{+}_{ij^d} \right\}, \quad I^{-}_{wk^d} = \max_{i \in M} \left\{ I^{-}_{ik^d} \right\} \tag{5}$$

For these fictitious units, the corresponding partial sustainability indicators are calculated. Finally, we take the difference between $\widehat{PSUI}_i^d$ and the value reached by the worst unit, and normalize it by the difference between the partial sustainability indicators of the best and the worst units, that is:

$$PSUI_i^d = \frac{\widehat{PSUI}_i^d - \widehat{PSUI}_w^d}{\widehat{PSUI}_b^d - \widehat{PSUI}_w^d}, \quad i = 1, 2, \dots, M; \; d \in D \tag{6}$$

The advantage of using $PSUI_i^d$ instead of $\widehat{PSUI}_i^d$ is that it offers a relative value between 0 and 1. In fact, it represents how far a unit is from the worst situation relative to the distance between the best and the worst situations. Additionally, this normalization does not distort the previously obtained results, but allows a more homogeneous and simple analysis of the dimensional results obtained.
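The construction of the fictitious units and the normalization in Equations (4) to (6) can be sketched as follows for a single dimension. The sketch uses the fact that, by Equations (1) and (2), strengths minus weaknesses for one PI reduce to the signed relative gap between the PI and its aspiration level. It assumes strictly positive aspiration levels and that the best and worst units differ. Function names and data are hypothetical.

```python
def raw_score(pos, neg, u_pos, u_neg):
    """Unnormalized partial indicator: (p - n)/u = (I - u)/u for positive PIs
    and (n - p)/u = (u - I)/u for negative PIs."""
    return (sum((v - u) / u for v, u in zip(pos, u_pos))
            + sum((u - v) / u for v, u in zip(neg, u_neg)))


def normalized_partial_indicators(pos_rows, neg_rows, u_pos, u_neg):
    """One row per unit; returns the normalized partial indicator of every unit."""
    # Fictitious best and worst units: column-wise extremes of the sample.
    best_pos = [max(col) for col in zip(*pos_rows)]
    best_neg = [min(col) for col in zip(*neg_rows)]
    worst_pos = [min(col) for col in zip(*pos_rows)]
    worst_neg = [max(col) for col in zip(*neg_rows)]
    s_best = raw_score(best_pos, best_neg, u_pos, u_neg)
    s_worst = raw_score(worst_pos, worst_neg, u_pos, u_neg)
    # Distance to the worst unit, relative to the best-worst range.
    return [(raw_score(p, n, u_pos, u_neg) - s_worst) / (s_best - s_worst)
            for p, n in zip(pos_rows, neg_rows)]


# Hypothetical sample: three units, one positive PI and one negative PI.
scores = normalized_partial_indicators(
    pos_rows=[[95.0], [80.0], [90.0]],
    neg_rows=[[30.0], [10.0], [20.0]],
    u_pos=[90.0],
    u_neg=[20.0],
)
print([round(s, 3) for s in scores])  # [0.143, 0.857, 0.524]
```

Because the fictitious units take the best and worst observed value of every PI, no real unit can score outside the interval between them, which is why the normalized values lie between 0 and 1.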

Once the partial sustainability indicators for each dimension are obtained, the second phase consists of calculating the global sustainability indicator ($GSUI$). To do so, the "Benefit-of-the-Doubt" approach [36], which is rooted in DEA, is applied.

Now, for each unit $a$ ($a = 1, 2, \dots, M$), $GSUI_a$ represents the weighted average of the partial indicators $PSUI_a^d$ ($d \in D$), which is obtained by solving the following optimization problem:

$$GSUI_a = \max \sum_{d \in D} w_a^d \, PSUI_a^d$$

Subject to:

$$\begin{array}{ll} \sum_{d \in D} w_i^d \, PSUI_i^d \le 1, & i = 1, 2, \dots, a, \dots, M \\ L^d \le \dfrac{w_i^d \, PSUI_i^d}{\sum_{d \in D} w_i^d \, PSUI_i^d} \le U^d, & i = 1, 2, \dots, a, \dots, M; \ d \in D \\ w_i^d \ge 0, & i = 1, 2, \dots, a, \dots, M; \ d \in D \end{array} \tag{7}$$

where $U^d$ and $L^d$ are the upper and lower bounds allowed for the relative contribution of $PSUI^d$ to the global indicator. The aim of Equation (7) is to obtain the weights (assigned to the partial indicators) that maximize the global score ($GSUI_a$) for every unit $a$. Therefore, this model provides a relative, objective performance value for each unit without requiring prior knowledge of the weights for the partial indicators [37]. These weights are endogenously determined by solving Equation (7) and, by construction, $GSUI_a$ takes values between 0 (the worst situation) and 1 (the best situation).
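The ratio constraints in Equation (7) are not linear in the weights as written. A common way to handle this, sketched here under the assumption that the denominator is strictly positive (at least one partial indicator of the unit is positive and receives a positive weight), is to multiply through by the denominator. The summation index $d'$ is introduced only to avoid a clash with the dimension $d$ being bounded:

$$L^d \sum_{d' \in D} w_i^{d'} \, PSUI_i^{d'} \;\le\; w_i^d \, PSUI_i^d \;\le\; U^d \sum_{d' \in D} w_i^{d'} \, PSUI_i^{d'}, \quad d \in D$$

Both inequalities are linear in the weights, so under this assumption the model can be passed to any standard linear programming solver.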

In essence, Equation (7) is an output multiplier DEA model with multiple outputs (partial indicators) and a single "dummy input" with value equal to 1 for all the units [38]. In the DEA context, the contribution of each partial indicator to the value of the global indicator ($w_i^d \, PSUI_i^d$) is labelled as the "virtual output" of the corresponding dimension.

To avoid extreme situations, some constraints on the weights have been added to Equation (7). All partial indicators should make a relative contribution to the global indicator, that is, all the dimensions should be taken into account in the global score. For this reason, lower and upper bounds ($L^d$ and $U^d$) have been established on the relative contribution of each partial indicator ($PSUI^d$).
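The role of the virtual outputs and of the bounds can be illustrated with a small sketch that evaluates a candidate weight vector rather than solving Equation (7), which would require a linear programming solver. It assumes the usual Benefit-of-the-Doubt reading in which the weights chosen for the evaluated unit are applied to every unit in the normalization constraint, and it checks the bounds on the shares of the evaluated unit only. The function name, the data, and the bounds are hypothetical.

```python
def check_weights(psui, weights, a, lower, upper, tol=1e-9):
    """Evaluate a candidate weight vector for unit `a`.

    psui[i][d]   : partial indicator of unit i in dimension d
    weights[d]   : candidate weight of dimension d for unit a
    lower, upper : bounds on each dimension's share of the global score
    Returns (global score, shares of the virtual outputs, feasibility flag).
    """
    # Normalization: no unit may score above 1 under these weights.
    normalized = all(sum(w * y for w, y in zip(weights, row)) <= 1.0 + tol
                     for row in psui)
    # Virtual outputs of unit a and their shares of its global score.
    virtual = [w * y for w, y in zip(weights, psui[a])]
    score = sum(virtual)
    shares = [v / score for v in virtual] if score > 0 else [0.0] * len(virtual)
    balanced = all(lo - tol <= s <= hi + tol
                   for s, lo, hi in zip(shares, lower, upper))
    nonnegative = all(w >= 0 for w in weights)
    return score, shares, normalized and balanced and nonnegative


# Hypothetical partial indicators: three units, three dimensions.
psui = [[0.8, 0.6, 0.9], [0.5, 0.9, 0.4], [1.0, 0.7, 0.6]]
bounds_low, bounds_high = [0.2, 0.2, 0.2], [0.5, 0.5, 0.5]

print(check_weights(psui, [0.4, 0.4, 0.4], 0, bounds_low, bounds_high))
print(check_weights(psui, [1.0, 0.0, 0.0], 0, bounds_low, bounds_high))
```

The second call puts all the weight on a single dimension: it respects the normalization constraint but violates the bounds on the shares, which is the kind of extreme weighting the bounds are meant to exclude.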

Thus, the proposed approach offers a composite indicator that provides information about the contribution of each sustainability dimension to the global score. It makes it possible to take into account the special characteristics of the units considered, since the same importance does not need to be given to each dimension for the different units.
