1. Introduction
In public decision-making, factors such as personal values, cultural background and different individual perspectives play a central role in the policy cycle of designing, testing, implementation and review [
1]. To assist policy makers, analysts have used an array of qualitative and quantitative methods in all steps of the cycle.
However, the increasing use of sophisticated methods does not always fully address the needs of policy makers and their decision-making process; on the contrary, in many cases, it seems to attract criticism that is focused on their disadvantages [
2]. Furthermore, the rise of Artificial Intelligence and its expanding use in decision and/or policy making has brought forth several issues such as the interpretability of algorithms and whether their output can be trusted, or the availability and quality of the data that are used. Questions such as which specific feature made the model/algorithm reach the specific decision [
3] or how accurate are the data that were used, and hence issues of transparency, interpretability and data quality [
4], are becoming central issues of the critique on quantitative methods and algorithms.
This criticism is not without its merits. The complexity of contemporary problems means that there are issues about which an analyst can only make assumptions due to the existence of deep uncertainty [
5]. Moreover, amid such complexity, the perception of the analyst, along with the availability and quality of data, may limit the view of the policy cycle under study. As a result, the success of a quantitative method relies on all of the above choices being exactly correct [
2].
Sustainable development perfectly encapsulates these issues. It entered the sphere of public policy-making and analysis in the 1980s, when the Brundtland report defined sustainable development as: “the ability to meet the needs of the present without compromising the ability of future generations to meet their own needs” [
6]. In order to achieve sustainable development, public policies should have economic, social and environmental dimensions, while taking into account the current technological developments [
7], the cultural context and value system in which they are applied [
8]. Thus, sustainable development is a multi-dimensional concept and from early on, sustainability was used as a proxy to measure it.
Sustainability, a notion stemming from ecology, at its basic form is an indication of a system’s endurance and its ability to retain its essential properties [
9]. In human systems, sustainability is regarded as the ability to live without environmental degradation [
7], while encompassing all dimensions of human systems and processes [
10].
Hence, both sustainable development and sustainability have been characterized by multi-dimensionality and different perceptions on how to explicitly define them [
7,
11]. So far, all definitions fall into two categories: there is the three-dimensional approach that seeks to integrate an economic, social and environmental dimension and the dualistic approach that emphasizes the interlinked relationship between humans and nature [
7]. Lately, however, another category has emerged, one that focuses on technology and innovation as the means to achieve sustainable development [
11].
Complementary to the lack of a unified definition is also the absence of an official and unified methodological framework [
12]. Such a framework could be of great assistance, but it should have certain properties. First, the multi-dimensional nature of sustainability dictates that no quantitative method can rely solely on costs and benefits [
13]. Moreover, any such method should have integrating properties, since sustainability seeks to combine different dimensions into a single measure [
14], and finally it should be transparent, easy to communicate to non-experts and subject to the review of experts [
7].
One method that is being increasingly used is Data Envelopment Analysis (DEA). DEA is a non-parametric, mathematical programming technique that is used for the assessment of the technical efficiency of Decision-Making Units (DMUs) relative to one another, where technical efficiency can be viewed as the ability of a DMU to transform its inputs to outputs and is defined as the ratio of the sum of its weighted outputs over the sum of its weighted inputs [
15], as indicated in expression (1):
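Using notation that is assumed here for illustration rather than taken from the source ($x_{ij}$ for input $i$ of DMU $j$, $y_{rj}$ for output $r$ of DMU $j$, and non-negative weights $v_i$ and $u_r$), the ratio described above can be sketched for a DMU $j_0$ with $m$ inputs and $s$ outputs as:

```latex
E_{j_0} = \frac{\sum_{r=1}^{s} u_r \, y_{r j_0}}{\sum_{i=1}^{m} v_i \, x_{i j_0}},
\qquad u_r \ge 0, \quad v_i \ge 0 .
```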
The method was established in the seminal papers of Charnes, Cooper and Rhodes [
16] and Banker, Charnes and Cooper [
17]. It does not require the knowledge of price information [
18], and it requires knowledge neither of the relationship between inputs and outputs nor of the statistical distribution of the data that are used [
19]. Moreover, DEA is flexible enough to be combined with other methods [
20,
21,
22,
23], thus increasing its methodological robustness. These advantages were crucial in recognizing that DEA can be a suitable tool for assessing sustainable development, and as a result it has been increasingly used in sustainability policy-making [
9].
Zhou et al. [
9] performed a literature review on the use of Data Envelopment Analysis in regional sustainability studies, and their study covers the years until 2016. In their paper, the authors identified the trend of using DEA to measure sustainability; however, they also noted several gaps in the literature. Firstly, it appeared that the main focus of the studies was the economic and environmental dimensions of sustainability, while the inclusion of the societal aspect was not equally extensive. Secondly, the authors observed a trend of combining DEA with other methodologies in order to increase the robustness of the measurement by mitigating the methodological limitations of DEA. Moreover, the authors identified that while early studies tend to employ classic DEA models, in later years, more sophisticated versions are used. Nonetheless, Zhou et al. [
9] also identify that there is still the need to decide which parameters will be used in the model that best describe the multi-dimensional concept of sustainability.
Tsaples and Papathanasiou [
24] performed a literature review on DEA and sustainability for the years 2016–2020 and discovered that since 2016 the studies have made an effort to include parameters that represent the social dimension of sustainability. Moreover, there are efforts to include other aspects that represent technological advancement and innovation, despite the fact that the three-dimensional construct appears to be the preferred one. However, they also revealed the lack of a unified context in which sustainability is measured, in two forms. First, the choice of inputs and outputs (and intermediate measures) is, despite commonalities, unique to each research work. Second, the choice of DEA variation and/or combination with other methodologies implies that the perception of each analyst affects the final result of their work.
Consequently, DEA does not come without limitations. First, in its traditional form, the efficiency of Decision-Making Units is calculated with weights that are most favorable to themselves; i.e., each DMU is evaluated under the most favorable weighting scheme with the purpose of maximizing its own efficiency [
25]. Furthermore, Zhou et al. [
9] identified that there is the need to use DEA in the appropriate context, which means that there is the requirement to decide which parameters will best explain different dimensions of sustainability. This particular methodological limitation was not unknown; Moutinho, Madaleno and Robaina [
26] identified that DEA is sensitive to the choice of inputs and outputs, meaning that the calculated efficiency depends on what inputs and outputs will be chosen. Finally, the number of inputs and outputs that can be used is limited by the number of DMUs under evaluation for the measurement to be meaningful, otherwise there would be an increased number of efficient DMUs that would result in inconsistencies [
27]. Using appropriate inputs and outputs is an item of ongoing research in the DEA literature, with researchers attempting to utilize different techniques to select appropriate measures and increase the robustness of the method [
28]. For example, Benítez-Peña et al. [
29] proposed the use of Mixed Integer Programming in choosing the appropriate inputs and outputs.
Moreover, researchers understood that the robustness of a DEA model increases if the DMU under study is not considered a “black box”, and for that reason, the intermediate steps of the typical DEA model were increased [
24]. The addition of intermediate stages was considered a more accurate depiction of certain complex processes, and it allowed researchers to better track sources of inefficiencies [
30]; thus, network DEA models can capture the weights for the calculation of efficiency in a more appropriate manner. Furthermore, the inclusion of those intermediate steps could free the analysis from the limitation of how many inputs and outputs could be used [
27] and could better reflect the inner workings of complex processes. For that reason, two-stage (or network) DEA models have been increasingly used for sustainability assessments of different types of DMUs [
30]. A typical two-stage DEA model is presented in
Figure 1 below:
As can be observed, inputs enter stage 1 and produce the intermediate measures (which are considered the outputs of stage 1). Those intermediate measures are used as inputs for stage 2 of the process and produce the outputs. This structure of DMUs for DEA has proven very effective in increasing the robustness of efficiency measurements; however, as will become clearer in the next section, little attention has been paid to the weight distribution and weak discriminatory power of network models [
30].
Consequently, the power of DEA as a monitoring tool for sustainability is diminished by the same issue that was identified in the beginning of this section: different people (policy makers, analysts, the public, etc.) have different values and perceptions of what sustainable development means and what should be used to measure sustainability. Thus, there is a need to increase the robustness of DEA by incorporating as many perceptions as possible in the measurement of sustainability without losing the value of its advantages.
To summarize, the following gaps have been identified: first, the employment of two-stage (or network) DEA models for sustainability assessments, despite its increasing use, has not reached the levels of use of classic, one-stage DEA models. Second, more efforts are necessary in order to increase the discriminatory power of two-stage models, and finally, research efforts need to be directed towards including more and diverse perceptions for the measurement of sustainability.
The purpose of the current paper is to address the above gaps by proposing a computational framework with a twofold functionality. First, it uses a two-stage Data Envelopment Analysis model with an alternative optimization metric that attempts to intervene in the weights of the inputs, intermediate measures and outputs to better reflect their importance for the DMUs. Second, the framework accompanies the DEA model with a computational stage that attempts to incorporate different perceptions (meaning different combinations of inputs and outputs) and applies it to the measurement of sustainability of the EU28 countries using machine learning techniques. To achieve this objective, the framework relies on Exploratory Modeling and Analysis (EMA). EMA is a school of thought developed at the RAND Corporation [
31] that promotes the exploratory use of quantitative methods despite methodological limitations, uncertainties and different perceptions. Employing an exploratory approach to sustainability measurement could reveal unanticipated implications of the initial assumptions regarding inputs and outputs. The use of computational experimentation to explore conjectures, models and datasets is not new. It has been applied in simulation models [
5], mathematics [
32] and of course in various disciplines with the emergence of big data [
33]. The approach requires computational power, development of new algorithms and techniques to analyze the data that will be generated.
Thus, EMA relies on Machine Learning (ML) techniques, even though validation of the developed models may not be possible. However, even when it is not possible to validate a model, exploration could lead to insights on how the different perceptions of sustainability give rise to unexpected results. Moreover, the use of computational explorations could facilitate the explanation of known facts and the discovery of commonalities among different perceptions of sustainability, hence leading towards the development of a composite definition of sustainable development.
For that reason, the combination of DEA and ML has been gaining traction in the literature. For example, Samoilenko and Osei-Bryson [
34] combined DEA with clustering and Classification and Regression Trees (CART) to increase the discriminatory power of the method; Wu [
35] integrated DEA with data mining and CART to evaluate the efficiency of Brazilian companies. De Nicola et al. [
36] combined DEA with CART to evaluate the Italian health system. Nandy and Singh [
37] used DEA to evaluate the efficiency of farms in India and employed machine learning to gain insights into which variables are crucial in predicting performance. Aydin and Yurdakul [
38] separated countries in groups via clustering and then calculated the efficiency of how countries responded to COVID-19 in each cluster with DEA. Finally, Thaker et al. [
39] employed DEA to evaluate the efficiency of Indian banks and then used Random Forest Regression to analyze the impact of corporate governance (and other bank characteristics) on the calculated efficiencies. Consequently, combining DEA with ML offers an alternative approach to the issue of inputs and outputs selection.
However, all the above combinations of DEA with ML are limited by the recurring theme of this introduction: they do not incorporate different perceptions into the calculations. Furthermore, all the above attempts, in essence, used ML to reduce the size of the available data (e.g., through clustering). In the current paper, the opposite occurs: the calculations under different perceptions act as generators of new data, which serve as inputs to the ML stage of the model and add new layers of insight.
The novelties of the current paper are the following. The first is the proposal and development of an alternative two-stage DEA model with a different optimization metric, together with the proof of the associated lemmas and a theorem. The second is the integration of DEA with ML under an exploratory, multi-perspective design (similar to a full factorial experimental design) that not only calculates the performance of EU countries in terms of sustainability but also provides insights relevant to policy makers and the general public. Finally, several case studies are presented in the subsequent sections, and each can be considered an addition to the DEA literature.
The rest of the paper is structured as follows: in
Section 2, the issue of the weight flexibility in Data Envelopment Analysis is approached. The literature is reviewed, an alternative two-stage DEA model is proposed and it is applied in the calculation of the environmental performance of European countries. In
Section 3, a new computational framework is proposed and applied step by step to the calculation of the sustainability of European countries. Conclusions, contributions of the current paper and future research avenues are discussed in
Section 4.
3. Exploratory, Multi-Dimensional Data Envelopment Analysis
As was mentioned above, environmental performance is considered only one of the three (or more) dimensions of sustainability. Consequently, moving in the direction of adding more dimensions to measure sustainability, the need arises to move from a two-stage DEA model to a multi-level or multi-dimensional model that will allow the incorporation of these dimensions without succumbing to the methodological limitations of DEA. In the following sub-sections, a new framework is proposed for the incorporation of multiple dimensions.
3.1. Multi-Dimensional DEA for the Construction of Composite Indicators
The typical calculation of sustainability involves three dimensions: economic, environmental and social. Thus, the environmental performance calculated in the previous section covers only one part of sustainability, even though many of the inputs, intermediate measures and outputs used by the various authors resemble those used in the DEA literature for the calculation of sustainability.
However, for a more inclusive calculation of sustainability that is not limited by the number of inputs and outputs that can be used, the proposed alternative, two-stage model that was described by Equations (11)–(20) can be incorporated in the framework proposed by Tsaples and Papathanasiou [
79] and shown in
Figure 2.
Each sub-indicator/dimension is calculated using Equations (11)–(20) and the overall performance of each sub-indicator/dimension is used in a Benefit-of-the-Doubt (BoD) model to calculate the overall sustainability index. The BoD model is described by Equations (23)–(25) below [
83]:
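As a minimal sketch, and assuming the common textbook form of a BoD linear program (maximize the weighted sum of a unit's sub-indicators, subject to no unit scoring above one under the same non-negative weights), the aggregation can be illustrated in plain Python. The function names, the toy data and the vertex-enumeration solver are illustrative assumptions, not the authors' implementation; a real application would more likely call a linear programming library.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a small square system a x = b by Gauss-Jordan elimination.

    Returns None when the system is (numerically) singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def bod_score(scores, j0):
    """Assumed BoD form: max w.y_j0 subject to w.y_j <= 1 for all j, w >= 0.

    scores[j] holds the (strictly positive) sub-indicator values of unit j.
    """
    n = len(scores[0])
    # One constraint per unit (w.y_j <= 1) and one per weight (-w_r <= 0).
    rows = [(list(y), 1.0) for y in scores]
    rows += [([-1.0 if k == r else 0.0 for k in range(n)], 0.0) for r in range(n)]
    best = 0.0
    # An LP optimum lies at a vertex, where n constraints hold with equality.
    for active in combinations(rows, n):
        w = solve_linear([a for a, _ in active], [b for _, b in active])
        if w is None:
            continue
        feasible = all(
            sum(ai * wi for ai, wi in zip(a, w)) <= b + 1e-9 for a, b in rows
        )
        if feasible:
            best = max(best, sum(yi * wi for yi, wi in zip(scores[j0], w)))
    return best


# Toy data (made up): three units, two sub-indicators each.
toy_scores = [[1.0, 0.5], [0.5, 1.0], [0.5, 0.5]]
toy_index = [bod_score(toy_scores, j) for j in range(len(toy_scores))]
```

Enumerating vertices is only practical for a handful of sub-indicators, which matches the small number of dimensions aggregated here.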
The BoD model described by Equations (23)–(25) is a typical DEA model in which the input of every DMU is set equal to one. As a result, the model calculates the optimal weights allowing maximum flexibility. In contrast with the proposed two-stage alternative of Equations (11)–(20), the BoD model does not include any restrictions on the weights because the dimensions that are included are typical of sustainability (despite the differences in the underlying measures that are used to calculate those indicators) and limited in number. Moreover, the simplicity of the BoD model, the opportunities that it offers to account for different (countries’) backgrounds [
84], the fact that it has been used by numerous studies (see [
85] for an inclusive account) and has been proposed by OECD for the construction of composite indicators [
86] mean that it can be used without any intervention in the weights. As mentioned in the above paper, its main advantage is that “it results in idiosyncratic weights to aggregate sub-indicators that vary both across sub-indicators and evaluated decision-making units (DMUs)”. In other words, “each evaluated DMU is allowed to choose a set of weights that maximizes its performance in terms of the resulting value of the composite indicator under the restriction that if the same set of weights is used by any other evaluated DMU it will not result in a value of the composite indicator that is greater than one” [
85] (p. 1). The BoD model is not the only option, and alternative methods can be used equally successfully and efficiently (for example, Shannon’s entropy, as in [
87]). However, in the context of the current paper, the BoD approach is preferred because the overall proposed model remains a two-stage DEA: the first stage consists of two-stage DEA models that calculate the (more refined) dimensions, and the second stage is the BoD model, which brings the above desired properties to the construction of a final scalar index for each country. Thus, the framework is characterized by an elegant internal consistency.
In the current paper, the following are used for each dimension:
Economic
- Inputs: Gross fixed capital at current prices (PPS); Total Labor force (×1000 persons);
- Intermediate measures: GDP per capita in PPS Index (EU28 = 100);
- Outputs: Median equivalized net income [Purchasing power standard (PPS)]; Final consumption expenditure of households [Current prices, million euro].
Environmental
- Inputs: Population; Gross electricity production [Thousand tons of oil equivalent (TOE)];
- Intermediate measures: Final energy consumption (Terajoule);
- Outputs: Terrestrial protected area (km²); Share of renewable energy in gross final energy consumption (%); Greenhouse gas emissions (in CO₂ equivalent).
Social
- Inputs: Gross fixed capital at current prices (PPS); GDP per capita in PPS Index (EU28 = 100);
- Intermediate measures: Total expenditure (Euro per inhabitant);
- Outputs: Patent applications to the European patent office (EPO) by priority year; Overall life satisfaction; Satisfaction with living environment; Percentage of females in total labor population [79].
As can be observed in
Table 3, four countries are considered sustainable compared to the rest of the set: Germany, Estonia, Latvia and Malta. The rest of the countries can be grouped into two broad categories: those that have a sustainability index above 0.7 and those that have a sustainability index below 0.7, the latter including Belgium, Czech Republic, Ireland, Greece, Spain, Hungary, Netherlands, Portugal and Slovakia. Furthermore, the Spearman Correlation Coefficient was calculated for:
Sustainability–Economic sub-indicator: 0.635
Sustainability–Environmental sub-indicator: 0.627
Sustainability–Social sub-indicator: 0.616
The coefficients indicate that the sustainability index of each country is associated almost equally with each sub-indicator, with the economic sub-indicator having a slightly larger coefficient.
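For readers who want to reproduce this kind of check, the following is a minimal sketch of the Spearman coefficient as the Pearson correlation of average ranks. The helper names and the sample values are illustrative assumptions; the figures reported above come from the paper's data, which are not reproduced here.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: an overall index and one sub-indicator for four units.
overall_index = [0.9, 0.6, 0.7, 1.0]
sub_indicator = [0.8, 0.7, 0.5, 0.9]
rho = spearman(overall_index, sub_indicator)
```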
3.2. Proposed DEA-ML Computational Framework
Nonetheless, the above calculated sustainability index suffers from the same limitations that were identified in the Introduction and in the previous section: Since there is no unique, “correct” definition of sustainability, the same indicator can be calculated by using different variations of DEA and/or different combinations of inputs, intermediate measures and outputs.
Furthermore, the proposed two-stage DEA variation might not offer a unique solution, which could alter the final results of the calculated index. Finally, one could argue that the BoD model that was used to aggregate the individual dimensions into one sustainability index does not impose any restrictions on the weights similar to those proposed in the initial two-stage DEA model. Thus, methodological limitations might limit the value of the final results.
Consequently, there is the need to have an indicator of sustainability that will incorporate all these different perceptions that may arise—where perceptions mean different DEA and BoD variations and/or different combinations of inputs, intermediate measures and outputs—and at the same time limit the impact of methodological limitations. Such an indicator would be useful in policy design (and policy making in general) because, as Foster and Sen [
88] proposed, uniqueness is not a prerequisite to make agreed judgments. Hence, the proposed computational framework is based on this principle, and it consists of the following steps:
Step 1: Define different perceptions of sustainability and for each perception:
- (a) Define how many sub-indicators will be entailed in this perception’s sustainability index;
- (b) Define the inputs, intermediate measures and outputs that each sub-indicator will entail;
- (c) Repeat for all perceptions.
Step 2: Define the variation of DEA that will calculate the value of the sub-indicators.
- (a) Calculate the sub-indicators;
- (b) Calculate the perception’s sustainability index using model (23)–(25);
- (c) Once all sustainability indices for all perceptions are calculated, calculate the mean value for each country/DMU.
Step 3: Use machine learning to gain insights into the sustainability of each country under different perceptions.
Figure 3 below illustrates the proposed computational framework.
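The control flow of Steps 1 and 2 can be sketched as a small driver loop. Everything named here (`run_framework`, `toy_sub_indicator`, `toy_aggregate`, the data layout) is a hypothetical placeholder: the two callables stand in for the two-stage DEA model and the BoD aggregation, and are deliberately trivial so that only the structure of the loop is shown.

```python
from statistics import mean


def run_framework(perceptions, countries, sub_indicator_fn, aggregate_fn):
    """Steps 1-2 as a loop over perceptions.

    perceptions maps a perception name to a list of dimension specifications.
    Returns (index per perception and country, mean index per country).
    """
    indices = {}
    for name, dimensions in perceptions.items():
        # Step 2(a): one score per country for every sub-indicator.
        sub_scores = [sub_indicator_fn(spec, countries) for spec in dimensions]
        # Step 2(b): aggregate the sub-indicators into one index per country.
        indices[name] = aggregate_fn(sub_scores, countries)
    # Step 2(c): mean index of each country across all perceptions.
    means = {c: mean(indices[p][c] for p in indices) for c in countries}
    return indices, means


def toy_sub_indicator(spec, countries):
    """Placeholder for a two-stage DEA model: looks up precomputed scores."""
    return {c: spec[c] for c in countries}


def toy_aggregate(sub_scores, countries):
    """Placeholder for the BoD model: plain average of the sub-indicators."""
    return {c: mean(s[c] for s in sub_scores) for c in countries}


# Made-up example with two countries and two perceptions.
toy_perceptions = {
    "p1": [{"A": 1.0, "B": 0.5}, {"A": 0.5, "B": 0.5}],
    "p2": [{"A": 1.0, "B": 1.0}],
}
toy_indices, toy_means = run_framework(
    toy_perceptions, ["A", "B"], toy_sub_indicator, toy_aggregate
)
```

Passing the DEA variation and the aggregation model as arguments mirrors the point that both are choices of the analyst; the table of per-perception indices is what Step 3 would then feed to the machine learning stage.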
Consequently, blending DEA with ML expands the available data and analyses, which helps in investigating the topic under study (implicitly adding new layers to the initial problem) and reveals greater insights. Furthermore, the absence of a unique solution in the proposed alternative two-stage model of Equations (11)–(20) can be considered a methodological limitation. However, the issue is not of central importance per se: in the context of the current paper, the model is used repeatedly and with different data to generate different results, in accordance with the philosophy of Exploratory Modeling and Analysis, where methodological limitations lose their impact through the generation of numerous results under different assumptions. Hence, the exploratory framework offers not only a slight deviation from the typical way that DEA is used, but also a complementary research avenue on the issues of interpretability and transparency of algorithms and/or quantitative methods: by blending methodologies under a multi-perspective design, algorithms become more inclusive and democratic (in the sense that the Benefit-of-the-Doubt notion inherent in the aforementioned DEA formulations is further enriched). Thus, decision support can take a step towards the generation of collective knowledge that includes different values, perceptions and dimensions.
3.3. Illustration of the Proposed DEA-ML Computational Framework
Step 1: In the context of the current paper, the following dimensions are defined: three types of Economic (with measures including, for example, Total Labor Force and Gross Fixed Capital at current prices as inputs, GDP per capita as an intermediate measure and Median equivalized net income and Final consumption expenditure of households as outputs), three types of Environmental (with measures including, for example, Gross Electricity Consumption as inputs, Energy Consumption as intermediate measures and Greenhouse Gas Emissions as outputs), three types of Social (with measures including, for example, Overall life satisfaction and percentage of females in total labor population as outputs) and two types of Research and Development (R&D) (with measures including, for example, Intramural R&D expenditures and Patent applications to the EPO as outputs). These 11 different types of dimensions are combined in all the possible combinations of three and four dimensions, resulting in 135 different perceptions of sustainability. Consequently, in the context of the current paper, the choice of parameters for the models becomes secondary in importance, with the purpose of reducing the bias of the analyst or decision maker and the methodological limitations of DEA. All the parameters/variables that are used in the calculations along with summary statistics are presented in
Appendix B.
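The count of 135 can be reproduced under one reading of the text, stated here as an assumption: a perception picks three or four distinct dimensions out of the four available, and exactly one type for each chosen dimension. The names `TYPES` and `enumerate_perceptions` are illustrative.

```python
from itertools import combinations, product

# Number of alternative types defined for each dimension (from the text).
TYPES = {"economic": 3, "environmental": 3, "social": 3, "rnd": 2}


def enumerate_perceptions(types=TYPES, sizes=(3, 4)):
    """List every perception as a tuple of (dimension, type number) pairs.

    Assumption: each perception uses k distinct dimensions (k in sizes)
    and one type per chosen dimension.
    """
    perceptions = []
    for k in sizes:
        for dims in combinations(types, k):
            choices = (range(1, types[d] + 1) for d in dims)
            for variant in product(*choices):
                perceptions.append(tuple(zip(dims, variant)))
    return perceptions


all_perceptions = enumerate_perceptions()
three_dim = [p for p in all_perceptions if len(p) == 3]  # 27 + 18 + 18 + 18
four_dim = [p for p in all_perceptions if len(p) == 4]   # 3 * 3 * 3 * 2
```

Under this reading there are 81 three-dimensional and 54 four-dimensional perceptions, which together give the 135 reported above.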
Step 2: Each of these 135 perceptions is used with the proposed DEA model that is described by Equations (11)–(20) and (23)–(25). The mean sustainability of the countries with the proposed DEA variation is displayed in
Table 4 below:
Hence, the inclusion of different perceptions alters the results that were illustrated in
Table 3. Under multiple perceptions, Malta, Latvia and Luxembourg have the highest sustainability compared to the rest of the countries. Moreover, with all the different variations of sub-indicators, there are countries for which the mean sustainability falls below 0.5, such as the Czech Republic, Ireland, Italy and the Netherlands. Finally, there are no countries for which the mean sustainability increased with the inclusion of different parameters; only Malta managed to keep the sustainability at the value of one in both cases.
Figure 4 below illustrates the results of the 135 calculations of sustainability on violin plots.
The y axis indicates the measurement of sustainability, while the x axis indicates the distribution of the sustainability indices that were calculated; wider sections of the density indicate that there is a higher probability that data points will take the given value, while narrow sections indicate lower probability.
The first aspect to observe is that Latvia concentrates the majority of its values on the upper side of the violin plot, while Malta has a constant value of one for all calculations, indicating that compared to the rest, the two countries have a high sustainability no matter the perception; thus, the robustness of the conclusion increases. For the rest of the countries, the different perceptions create different situations for their sustainability. For example, Greece has a mean sustainability of 0.57; however, its values can range from 0.45 to 0.75 depending on the perception, with the majority of the values concentrated between 0.5 and 0.6. Thus, the sustainability of Greece changes significantly with different perceptions, which weakens any claim to a robust conclusion.
Apart from the calculation of the sustainability indices under different perceptions with the proposed two-stage DEA, Step 2 of the computational framework includes the use of different variations of DEA. In the present application, the classic two-stage model of Chen et al. [
70], and an adaptation of the Constant Returns to Scale (CRS) DEA [
16] and of the Variable Returns to Scale (VRS) [
17] are used (along with the proposed variation). For the last two DEA variations, the classic DEA models are used in a chained way to accommodate the two-stage nature of the models. More specifically, each combination of inputs, intermediate measures and outputs is used in the chained way of the classic one-stage models as follows: the efficiency of the first stage is calculated with the inputs and the intermediate measures as outputs, using either CRS or VRS DEA. The efficiency of the second stage is calculated with the intermediate measures as inputs and the outputs, using (similar to the first stage) either CRS or VRS DEA. The sustainability index is calculated by multiplying the efficiencies of the two stages. Finally, the sustainability index of each perception is calculated using the BoD model [
83] (
Table 5).
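The chained use of a one-stage model can be sketched for the simplest possible case. The sketch assumes a single input, a single intermediate measure and a single output, because only then does CRS efficiency reduce to a normalized output-to-input ratio; the multi-variable and VRS cases require a linear program and are not covered. Function names and data are illustrative.

```python
def crs_efficiency(inputs, outputs):
    """CRS efficiency with one input and one output per DMU.

    In this special case the efficiency of a DMU is its output/input ratio
    divided by the best ratio observed among all DMUs.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def chained_two_stage(inputs, intermediates, outputs):
    """Chain two one-stage models and multiply the stage efficiencies."""
    stage1 = crs_efficiency(inputs, intermediates)   # inputs -> intermediates
    stage2 = crs_efficiency(intermediates, outputs)  # intermediates -> outputs
    return [e1 * e2 for e1, e2 in zip(stage1, stage2)]


# Made-up data for three DMUs.
toy_overall = chained_two_stage(
    inputs=[2.0, 4.0, 5.0],
    intermediates=[4.0, 4.0, 10.0],
    outputs=[8.0, 4.0, 10.0],
)
```

With these toy numbers the first DMU is efficient in both stages, while the third is efficient only in the first stage, so its product is lower.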
The inclusion of different variations of DEA (which can be chosen by the analyst and/or the policy maker) with different combinations of inputs and outputs increases the robustness of the results, since many sustainability indices will be calculated that can capture different perceptions both methodologically (which method is the more “correct”) and context wise (which combination of inputs, outputs and intermediates is the more “correct”).
As can be observed in
Table 5, the mean sustainability changes again, indicating that the methodological framework that is used matters in the calculation of the final index. In the current illustration, there are countries where the mean sustainability increases with the inclusion of other DEA variations (such as Belgium, Luxembourg and the Netherlands), others where it is almost the same (such as Malta) and those for which the mean sustainability decreases (the rest of the countries). Moreover, the Spearman correlation coefficient for the two mean sustainability indices of
Table 4 and
Table 5 was calculated and found to be equal to 0.752, which indicates a strong positive correlation between the two indices.
These changes are also mirrored in the violin plots of the sustainability, displayed in
Figure 5.
Focusing again on the example of Greece, there are combinations of sub-indicators and DEA variations that produce high sustainability indices. Furthermore, the number of values below the mean increases with all DEA variations.
Step 3: The final step of the proposed computational framework is to apply machine learning techniques to the results of the above computations with the purpose of revealing insights into how the sustainability of countries behaves under different perceptions. For the current paper, Classification and Regression Trees (CART) are used because they are not computationally costly, can serve as tools for communicating with non-experts and offer deep interpretational capabilities [
89]. Consequently, the use of Machine Learning assists in identifying those features that affect the calculations of sustainability under different perceptions and methodological frameworks. However, CART trees tend to overfit their training data and are considered weak learners [
90], and for that reason, an additional ML technique will be used: boosting regression [
91]. Boosting regression is considered a slow learner, where each tree is generated using information from previous ones [
92]. Moreover, the technique will also reveal the relative influence of each individual sub-indicator on the index of sustainability, which could provide further insights into the analysis of the results. It is more robust than CART trees, but this robustness comes at the expense of the intuitive communication capabilities that are the main characteristic of CART trees. Consequently, the use of the two Machine Learning techniques will limit the methodological weaknesses of each method, while providing results and insights that are robust and independent of the technique used. Following the logic of the previous steps, the focus will be on Greece.
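To make the two techniques concrete, the sketch below implements, in plain Python, a single CART-style regression split and a simple boosting loop built from such splits. It is an illustration under stated assumptions and not the implementation used for the figures: the trees are restricted to one split (stumps), the learning rate and number of trees are arbitrary, the relative influence is taken as each feature's share of the total squared-error reduction, and the data are made up.

```python
def best_split(X, y):
    """Find the single split that most reduces the squared error.

    Returns (gain, feature, threshold, left_mean, right_mean); feature is
    None when no split improves on predicting the overall mean.
    """
    n = len(y)
    total_mean = sum(y) / n
    base_sse = sum((v - total_mean) ** 2 for v in y)
    best = (0.0, None, None, total_mean, total_mean)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left)
            sse += sum((v - rm) ** 2 for v in right)
            gain = base_sse - sse
            if gain > best[0]:
                best = (gain, f, t, lm, rm)
    return best


def boost(X, y, n_trees=50, learning_rate=0.1):
    """Fit stumps sequentially to the residuals of the previous ones."""
    n = len(y)
    pred = [sum(y) / n] * n
    influence = [0.0] * len(X[0])
    for _ in range(n_trees):
        residuals = [y[i] - pred[i] for i in range(n)]
        gain, f, t, lm, rm = best_split(X, residuals)
        if f is None:  # residuals cannot be reduced any further
            break
        influence[f] += gain
        for i in range(n):
            pred[i] += learning_rate * (lm if X[i][f] <= t else rm)
    total = sum(influence) or 1.0
    # Relative influence in percent, summing to 100 across features.
    return pred, [100 * g / total for g in influence]


# Made-up data: two sub-indicators per row; the target follows the first.
X = [[0.1, 0.9], [0.2, 0.1], [0.3, 0.8], [0.4, 0.2],
     [0.6, 0.7], [0.7, 0.3], [0.8, 0.6], [0.9, 0.4]]
y = [0.2, 0.2, 0.25, 0.25, 0.7, 0.7, 0.75, 0.75]

gain, feature, threshold, left_mean, right_mean = best_split(X, y)
pred, rel_influence = boost(X, y)
print(feature, threshold, [round(v, 1) for v in rel_influence])
```

The single split can be read directly as a rule (a threshold on one sub-indicator), which is what makes CART easy to communicate. The boosted model no longer reduces to one rule, but its relative influence ranks the sub-indicators.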
Figure 6 below illustrates the CART tree of Greece, along with the relative influence of the sub-indicators that were used in all perceptions.
As can be observed, the overall environmental performance is the sub-indicator that influences the overall sustainability the most. Furthermore, the CART tree illustrates that when the environmental performance of the second stage is larger than 0.59 and the overall environmental performance is missing, the sustainability of Greece takes its lowest values, further supporting the importance of the sub-indicator.
The final part of the analysis is to perform data mining on the countries when all DEA variations are used. For Greece (
Figure 7), the most important sub-indicators become the overall economic performance and the one representing research and development. According to the CART tree, if the overall economic performance is lower than 0.045 and the overall research and development index is lower than 0.59, the sustainability of Greece (under all DEA variations) takes its lowest values.
4. Conclusions
The purpose of this paper was to propose a computational framework with a twofold contribution. In its initial phase, it uses a two-stage Data Envelopment Analysis model with an alternative optimization metric that attempts to intervene on the weights of the inputs, intermediate measures and outputs to better reflect their importance for the DMUs. The model takes advantage of deviational variables to handle the variations attributed to the weight distribution. The deviational variables provide a vehicle for interventions on the weight distribution through the goal programming formulation inherent to DEA. The model was used to calculate the environmental performance of EU countries and a comparison was provided with the two-stage variations of Chen et al. [
70]. The results illustrate that the two variations share similarities but also show notable differences in environmental performance and, consequently, changes in the rankings. This is attributed to the fact that the alternative DEA variation uses a different optimization metric through the additional variables that impose limitations on the distributions of the calculated weights.
The second area of contribution of the framework is the integration of a computational stage, which attempts to incorporate different perceptions (that is, different combinations of inputs and outputs) and applies them to the measurement of sustainability of the EU 28 countries using machine learning techniques. This is particularly useful in the case of multi-dimensional constructs such as sustainability. The overall computational stage relies on the use of multi-level Data Envelopment Analysis in combination with classic DEA variations and on the application of these models for different combinations of inputs, intermediate measures and outputs that represent different perceptions of what sustainability is. Finally, the exploratory analysis of the outcomes with the use of Machine Learning methodologies such as CART decision trees concludes the computational part of the paper by proposing sustainability paths according to each country’s strengths and weaknesses, as is further summarized below.
In this direction, it is worth pointing out that this framework follows the school of thought of Exploratory Modeling and Analysis that supports the use of models and quantitative methods in an exploratory way, not to predict or monitor policy cycles accurately (which can be considered impossible) but to gain insights by incorporating different perceptions and methodological approaches towards the same problem, thus increasing the robustness of the results [
93]. In the current paper, we applied this approach to the measurement of sustainability of the EU 28 countries. Concretely, the computational experiments illustrated that the different perceptions of how sustainability is measured and the use of different DEA variations (hence different methodological frameworks) affect the final results.
Finally, the blend of DEA with machine learning (applied to the results of DEA for the various scenarios) revealed insights into the areas to which policy makers could direct investments to increase sustainability. In addition, the ML application contributed to the identification of the most important features of sustainability for the various countries, something that could have direct implications in the area of EU policy-making: for example, countries that share similar features driving the behavior of sustainability could be grouped together in clusters, and policies, laws, regulations, etc., could be adapted to those clusters in order to boost the particular features that would increase their sustainability. As a result, policy making has the potential to become customized (adapted to the specifics of each group) without losing its overall and principal theme of pursuing sustainable development. This adaptive and adaptable policy-making could be of great assistance, especially when new countries are negotiating their entry to the Union; based on the features that affect the sustainability of the new countries, they could follow the regulations and laws of the appropriate cluster. Moreover, the inclusion of new layers and perceptions renders the algorithms more inclusive and participatory, increasing their transparency and thus improving the trust in the final results.
However, the proposed framework is not without limitations. Regarding the definition and/or methodological framework for sustainability, a new approach could be taken: a bottom-up approach, where scientists propose a unified methodological and/or computational framework that attempts to mitigate the limitations of individual methods and integrates different and diverse definitions of sustainability into the same measurement.
Moreover, the addition of this new layer means that the process becomes more computationally costly and new conceptual questions arise; for example, when is it valid to break the inner loop, stop adding new perceptions and report the conclusions? How many new perceptions are necessary to get a clearer picture? Finally, the proposed methodological framework relies on an intrinsic assumption that the majority of perceptions drives the measure of sustainability towards its “real value”. However, this is not always the case, since the notion of sustainability constantly evolves, meaning that perceptions that currently represent the minority in the calculations could become more prevalent in the future. Hence, the proposed computational framework could be enriched further with notions and algorithms that represent values more clearly. Of course, such an inclusion is not limited to the current framework but is a problem that is central to the overall research of the Artificial Intelligence community.
Such questions will drive future research efforts of the current study. Further directions of research include the development of a user interface that could be used by non-experts, the inclusion of supplementary variations of DEA (for example, the VRS variation of the proposed two-stage model described by Equations (11)–(20), or a version of the BoD model with weight restrictions), and the generation of additional sub-indicators along with the use of various data sources. Finally, the framework could be enriched with methods other than DEA, which would allow the Machine Learning techniques to identify differences not only in the context (sub-indicators) but also in the method that was used.