Sustainability Performance Assessment Using Self-Organizing Maps (SOM) and Classification and Ensembles of Regression Trees (CART)

Nilashi, Mehrbakhsh; Asadi, Shahla; Abumalloh, Rabab Ali; Samad, Sarminah; Ghabban, Fahad; Supriyanto, Eko; Osman, Reem

doi:10.3390/su13073870

by Mehrbakhsh Nilashi ¹,*, Shahla Asadi ², Rabab Ali Abumalloh ³, Sarminah Samad ⁴, Fahad Ghabban ⁵, Eko Supriyanto ¹ and Reem Osman ³

¹ School of Biomedical Engineering and Health Sciences, Faculty of Engineering, Universiti Teknologi Malaysia, Skudai 81310, Johor, Malaysia
² Centre of Software Technology and Management, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
³ Computer Department, Community College, Imam Abdulrahman Bin Faisal University, Dammam 34212, Saudi Arabia
⁴ Department of Business Administration, College of Business and Administration, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
⁵ Faculty of Computer Science and Engineering, Information System Department, Taibah University, Madinah 41411, Saudi Arabia
* Author to whom correspondence should be addressed.

Sustainability 2021, 13(7), 3870; https://doi.org/10.3390/su13073870

Submission received: 6 February 2021 / Revised: 15 March 2021 / Accepted: 27 March 2021 / Published: 31 March 2021

(This article belongs to the Special Issue Data Analytics and Predictive Analytics for Sustainable Development)

Abstract:

This study aims to develop a new approach based on machine learning techniques to assess sustainability performance. Two main dimensions of sustainability, namely ecological sustainability and human sustainability, were considered. A set of sustainability indicators was used, and the research method was developed using cluster analysis and prediction learning techniques. A Self-Organizing Map (SOM) was applied for data clustering, while Classification and Regression Trees (CART) were applied to assess sustainability performance. The proposed method was evaluated on the Sustainability Assessment by Fuzzy Evaluation (SAFE) dataset, which comprises various indicators of sustainability performance in 128 countries. Eight clusters were found in the data through the SOM clustering technique. A prediction model was built in each cluster through the CART technique. In addition, an ensemble of CART was constructed in each cluster of SOM to increase the prediction accuracy of CART. All prediction models were assessed through the adjusted coefficient of determination. The results demonstrated that the prediction accuracy values were high in all CART models. The results also indicated that the method developed with ensembles of CART and clustering provides higher prediction accuracy than individual CART models. The main advantage of the proposed method is its ability to automatically discover decision rules from big data for prediction models. The method proposed in this study could be implemented as an effective tool for sustainability performance assessment.

Keywords:

Classification and Regression Trees (CART); clustering; decision making; ensemble learning; Self-Organizing Map (SOM); Sustainability Assessment by Fuzzy Evaluation (SAFE); sustainability assessment

1. Introduction

The term sustainability, which means “to hold up or support”, emerged in the 18th century in the context of forest management [1]. In recent decades, sustainability has been studied in various fields, such as the environment, agriculture, and social sciences. Sustainability assessment has played an important role in the improvement of the decision-making process. This process mainly involves intragenerational and intergenerational considerations, which lead to (1) enhanced monitoring and communication of results, (2) constructive interaction among stakeholders, (3) an integration of sustainability spheres, and (4) a consideration of their interdependencies.

Sustainability assessment is an important task [2,3,4,5] that is mainly conducted based on three main aspects of sustainability: environment, economy, and society. The efficient utilization of environmental resources is a basic goal of environmental sustainability. For economic sustainability, financial costs and benefits are important factors [6]. Meanwhile, social sustainability concentrates on individuals’ well-being [7].

In recent years, various assessment tools have been developed to evaluate aspects or pillars of sustainability in several contexts such as rice production [8], fashion business [9], clean technological innovation [10], and wastewater reuse [11]. The use of pillars of sustainability could vary due to the use of different sustainability tools. Several studies took all dimensions of sustainability (environmental, economic, and social) into account [12,13], while some other studies only concentrated on the economic and environmental aspects [9]. Several studies placed their focus solely on the environmental aspects [14,15,16]. Other studies concentrated on the economic aspects of sustainability [17,18].

Janeiro and Patel [19] defined sustainability assessment as an issue related to Multi-Criteria Decision Making (MCDM). Multi-Criteria Decision Analysis (MCDA) methods have been widely used for sustainability assessment. Considering sustainability pillars, these techniques are mainly used to determine the best alternatives for policymaking. Ness et al. [20] presented a classification of sustainability assessment tools based on product-related assessments, integrated assessments, and non-integrated indicator-based assessments. Cinelli and Coles [14] conducted a comprehensive analysis of MCDA methods applied to sustainability assessment. They found that the Dominance-Based Rough Set Approach (DRSA), the Preference Ranking Organization Method for Enrichment of Evaluations (PROMETHEE), ELECTRE, and Multi-Attribute Utility Theory (MAUT) could be used to manage uncertain information through the definition and use of thresholds and probability distributions.

Fuzzy logic can be used as a natural technical tool to assess sustainability. This technique is effective in emulating individuals’ skills and managing vague situations. Furthermore, it is also capable of managing complex and polymorphous concepts. Compared to traditional mathematical approaches, fuzzy logic is distinguished by its ability to utilize linguistic variables. In this technique, knowledge is represented by the IF-THEN linguistic rules. Following that, a fuzzification technique is implemented to transform real values into linguistic values. Identifying the IF-THEN rules is essential in designing assessment systems through this technique. These fuzzy rules are used during the fuzzy reasoning process of the system. The final output of the system is obtained through a defuzzification technique.
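The fuzzification, rule evaluation, and defuzzification steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three linguistic values, their piecewise-linear membership functions, the nine-rule table, and the weighted-average defuzzifier are hypothetical and are not the rule base or inference scheme of any published system.

```python
# Hypothetical linguistic values on [0, 1] with piecewise-linear memberships.
def fuzzify(x):
    """Map a real value in [0, 1] to membership degrees of three linguistic values."""
    return {
        "weak": max(0.0, 1.0 - 2.0 * x),
        "medium": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "strong": max(0.0, 2.0 * x - 1.0),
    }

# Hypothetical IF-THEN rules: IF ECOS is a AND HUMS is b THEN OSUS is RULES[(a, b)].
RULES = {
    ("weak", "weak"): "weak", ("weak", "medium"): "weak", ("weak", "strong"): "medium",
    ("medium", "weak"): "weak", ("medium", "medium"): "medium", ("medium", "strong"): "strong",
    ("strong", "weak"): "medium", ("strong", "medium"): "strong", ("strong", "strong"): "strong",
}

# Representative crisp value of each output linguistic value (assumed).
PEAK = {"weak": 0.0, "medium": 0.5, "strong": 1.0}

def assess(ecos, hums):
    """Fuzzy reasoning: fire every rule, then defuzzify by a weighted average."""
    mu_e, mu_h = fuzzify(ecos), fuzzify(hums)
    num = den = 0.0
    for (a, b), out in RULES.items():
        strength = min(mu_e[a], mu_h[b])   # AND is taken as the minimum
        num += strength * PEAK[out]
        den += strength
    return num / den                       # crisp output in [0, 1]

print(assess(0.3, 0.8))
```

The rule table is the part that an expert would normally have to write by hand, which is why the number of inputs drives the number of rules in such systems.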

Sustainability Assessment by Fuzzy Evaluation (SAFE) [21] was developed as a fuzzy rule-based system for measuring the overall sustainability of countries. Based on the basic indicators of sustainability, it focuses on two main dimensions, namely Human Sustainability and Ecological Sustainability. The initial SAFE model has been used to investigate sustainability problems and has been enhanced by other researchers, including Kouloumpis et al. [22], Kouikoglou and Phillis [23], and Andriantiatsaholiniaina et al. [24]. The fuzzy logic approach was used in SAFE, where the dimensions in this model comprise hierarchical Fuzzy Inference Systems (FISs). Furthermore, SAFE consists of 75 inputs. The fuzzy rules discovered from SAFE data are used to obtain the final output in each dimension and overall sustainability. The overall sustainability index takes values in [0, 1]. Notably, SAFE is a flexible technique as it accepts any number of inputs. With respect to the basic sustainability indicators, this technique can also manage several types of information, such as quantitative and qualitative information. The number of inputs in SAFE plays an important role in measuring the level of sustainability because it determines the number of fuzzy rules. Additionally, in SAFE, the overall sustainability of a country is identified using a combination of two main dimensions, namely societal/human sustainability and ecological sustainability.

SAFE is a comprehensive assessment system as it considers the main elements of ecological sustainability and societal/human sustainability, which are also known as indicators. These elements are land integrity, economic welfare, biodiversity, political aspects, health, water quality, air quality, and education. These indicators are in turn evaluated through more elementary variables, namely pressure, state, and response indicators. Due to limited time and human resources, a small subset of indicators is usually taken into account to optimize sustainability. Further information regarding the SAFE model can be found on the website http://www.sustainability.tuc.gr/ (accessed on 4 December 2019).

This study aims to extend previous literature on sustainability assessment by presenting a new method that uses machine learning techniques. Specifically, two types of machine learning techniques, namely supervised and unsupervised learning techniques, were applied to measure the sustainability performance of countries. The main reason for developing the method with these techniques was their ability to automatically discover decision rules from the data for the prediction models. Automatic discovery of decision rules regarding sustainability is important, as the manual construction of prediction models from the data is a challenging process due to the data’s complex nature. In addition, without automatic learning techniques, it would be time-consuming to manually determine sustainability performance from a large set of data. It was also found that the use of supervised and unsupervised learning techniques could yield highly effective measurements of countries’ sustainability performance. The outcome of this research can address the shortcomings of previous methods and enhance prediction accuracy.

Hence, in this research, a new method to measure the sustainability performance of countries was implemented, through using Self-Organizing Map (SOM) and Classification and Regression Trees (CART). SAFE data were used to evaluate this method. Following is a summary of the contributions of this research:

A new method to assess sustainability performance was implemented, where machine learning techniques were used. In contrast to previous studies on sustainability assessment which only relied on knowledge-based approaches, unsupervised learning and supervised learning techniques were used in this study to assess sustainability performance.
For an improved efficiency of the sustainability assessment, a clustering technique was applied to construct the groups of data which included similar cases based on sustainability features. In addition, this technique was based on SOM, a neural network approach that is used to identify the clusters of data for the assessment of sustainability performance.
A supervised learning technique was implemented to construct prediction models. These models were used to determine the level of sustainability. Furthermore, the CART technique, which is based on the regression and classification approaches, was applied so that an accurate evaluation of sustainability performance through a set of real-world data was gained. The CART models were developed for ensemble learning. To the best of our knowledge, ensemble learning approaches have not been used extensively to assess sustainability performance.
The proposed method was evaluated on a real-world dataset, which involved the data regarding sustainability assessment in 128 countries. The dataset consisted of two main dimensions of sustainability, namely human sustainability and ecological sustainability within a comprehensive set of indicators.

Overall, we hypothesize that the integration of clustering and supervised learning techniques with the aid of an ensemble learning approach can enhance the efficiency of the assessment systems for sustainability performance in terms of prediction accuracy.

In this article, Section 2 elaborates on studies related to sustainability assessment tools. The research method is presented in Section 3. This is followed by Section 4, in which data analysis and the method of sustainability assessment are presented. The discussions and recommendations from the research are provided in Section 5. This article ends with a conclusion, which is presented in Section 6.

2. Literature Review

Sustainability assessment has been investigated by several studies from different theoretical and methodological perspectives. In the rest of this literature review, we summarize many of the key studies that have influenced the use of big data, decision analysis, and sustainability assessment.

In the study by Wiek and Binder [25], a decision support tool for sustainability assessment was used. Through this tool, systemic knowledge and normative aspects were taken into account to achieve sustainable development in city-regions. Meanwhile, the authors in [26] developed a sustainability assessment tool based on a multi-criteria approach for the energy power system. This tool used several indicators, such as economic, environmental, social, and resource indicators. Zarghami and Azemati [27] used the Fuzzy Analytic Hierarchy Process (FAHP) to develop a sustainability assessment tool. This process involved Leadership in Energy and Environmental Design (LEED), Building Research Establishment Environmental Assessment Method (BREEAM), Comprehensive Assessment System for Built Environment Efficiency (CASBEE), and Sustainable Building Tool (SBTool) indicators in the assessment system. The research evaluated five categories of international assessment tools: water efficiency, materials and resources, energy efficiency, sustainable site, and quality of the indoor environment. As a result, it was found that energy efficiency was the most prominent category of the sustainability assessment tool in Iran. The authors in [5] proposed ANFIS as an approach to assess sustainability levels in countries. This approach was based on the sustainability dimensions and indicators that were used in SAFE.

For sustainable supplier selection, Amindoust et al. [28] developed a ranking model based on a fuzzy inference system. Economic, environmental, and social indicators were used in the assessment model in three stages of evaluation. To prove the feasibility of the method, an illustrative example for a company with five candidate suppliers was presented. Meanwhile, the authors in [29] used field scale indicators and fuzzy logic to evaluate the impacts of pesticides and tillage on agroecosystems. Toxicity and the dose applied were regarded as the main variables in deciding the final influence of a pesticide application, and Tillage Impact (TI) was evaluated in terms of its influence on the quantity of stubble left after tillage processes and the soil aggregates’ stability. As a result, it was found that fuzzy logic was beneficial for effective environmental analysis and evaluation. In a case study that was conducted by Azadi et al. [30] in Southwest Iran, fuzzy logic was applied to manage the vague and uncertain concept of sustainability. Triangular and trapezoidal membership functions were applied to construct the membership functions of the prediction model. In addition, 27 fuzzy rules were used to determine the overall equilibrium.

The authors in [31] developed a method through ensembles of neuro-fuzzy techniques for measuring country sustainability performance. They used SAFE model criteria to assess sustainability performance. The authors in [32] used the SAFE model to assess the sustainability performance of 128 countries. The study investigated the link between ecological sustainability, human sustainability, and overall sustainability performance by utilizing decision rules. The authors used fuzzy clustering and decision trees for measuring country sustainability performance. The result of this study demonstrated that a hybrid approach that combines clustering and prediction machine learning techniques can improve the prediction accuracy of ANFIS. The authors in [33] used the fuzzy Decision Making Trial And Evaluation Laboratory (DEMATEL) method for the assessment of sustainability indicators of green building manufacturing. The research outcomes showed that energy efficiency and quality of the indoor environment are the most significant indicators, whereas innovation and water efficiency are the least significant indices in evaluating green buildings in Malaysia. Li et al. [34] assessed the sustainability of hydrogen production technologies through the MCDM approach. They used objective grey relational analysis and the DEMATEL method to identify the criteria weights, with DEMATEL used to consider the causal relationships among criteria. Ren et al. [35] developed a two-stage MCDM method for the sustainability assessment of hydrogen production technologies (HPTs). Five aspects were adopted for the sustainability evaluation of HPTs: political, technological, social, environmental, and economic. The method was developed using the fuzzy best-worst method and fuzzy TOPSIS. The aim was to find the importance level of factors in the proposed model. The authors in [36] developed a method using an advanced hybrid MCDM approach to evaluate sustainable hydrogen production options. They used AHP as an MCDM technique to determine the weights of the criteria and sub-indicators in the model. The outcome of the study revealed that wind electrolysis was the preferred option for sustainable hydrogen production, followed by biomass gasification.

Streimikiene and Skulskis [37] developed a method for sustainability assessment in the green building context. They used the interval TOPSIS method for sustainability assessment. The outcomes of the multicriteria sustainability evaluation of inorganic and organic building insulation materials indicated that sheep wool and recycled glass are the most desirable choices in several contexts. The authors in [38] used the PROMETHEE method to assess the sustainability of large-scale composting technologies. They used social, economic, environmental, and technical criteria in the sustainability assessment. The outcomes of the study indicated that reactor techniques are more sustainable than enclosed techniques, which are ranked as more sustainable than open technologies. The results also indicated that the rotating drum is the most sustainable composting technique across the economic, environmental, technical, and social aspects. Akhanova and Nadeem [39] conducted a study on building sustainability assessment through an MCDM technique. They used step-wise assessment ratio analysis for weight allocation. The research identified the most common categories in globally accepted tools, among which are site selection, materials, energy efficiency, quality of the indoor environment, water efficiency, and waste.

The authors in [40] used multiple criteria decision analysis for assessing national energy sustainability. The proposed approach was based on various energy sustainability indicators that cover three main aspects: energy system, human system, and environment. The authors used the PROMETHEE method to evaluate the sustainability performance of 43 European countries. They found a significant relationship between geographical and income groupings and energy sustainability performance. The authors in [41] conducted a study on sustainability performance through a revised SAFE. In fact, the SAFE model was updated for the fourth time in this study. The aim was to perform a sensitivity analysis to show which indicators can improve sustainability the most. The study indicated that forest change, renewable energy production, corruption, and threatened species are the most important indicators globally. On the other hand, the CO2 emissions indicator is the most significant indicator in developed countries.

The authors in [42] developed a method for the sustainability assessment of rice production systems. They used fuzzy logic for sustainability assessment through agricultural and economic models. Various sources of energy and the sustainability and environmental loading indices of rice were inspected. The outcome of the study confirmed that the rice indices are not adequate. The authors in [43] conducted a study for assessing global environmental sustainability through an unsupervised clustering approach. They used a self-organizing map as a clustering technique. Focusing on the environmental dimension of sustainability, the authors presented a novel framework to allow countries to reach informed decisions and define efficient directions. The authors in [44] used fuzzy logic for sustainability performance evaluation through the SAFE model. According to the results of their study, the major factors that influence the sustainability of the Pacific Island countries were energy use, terrestrial protected areas, and political rights issues. The authors in [45] developed an integrated model through MCDM for the sustainability performance assessment of insurance companies. A group of 4 social, 3 environmental, and 8 economic indices was utilized in this study. The indices were categorized into two sets to assess the companies, focusing on the financial and managerial prospects. The authors used principal component analysis to reduce the number of evaluation indices and the analytic hierarchy process to rank the indices. Asrol and Papilo [46] presented a machine learning model to evaluate the sustainability performance of the bioenergy industry, focusing on the environmental dimension of sustainability. In a study by Attia and Alphonsine [47], the economic, social, and ecological aspects were incorporated to define the main performance indices for evaluating sustainable housing and to design a selection tool for student housing.

Based on previous literature on sustainability assessment, it was found that there is a limited number of studies for assessing sustainability performance by implementing clustering and prediction machine learning techniques. In addition, most of the studies on sustainability assessment relied on knowledge-based approaches. These approaches were based on fuzzy logic or MCDM techniques, where the experts’ knowledge and perspective were involved in the assessment. However, it is important to develop methods for the acquisition of a large set of data for this assessment which involves sustainability indicators and dimensions. Additionally, automatic data acquisition is not possible through the MCDM approaches. Overall, this is the main disadvantage of these approaches. In fact, the methods which are based on experts’ knowledge may not be efficient for large datasets as they require interventions from individuals to perform sustainability assessment. Accordingly, in this research we propose a new method for the assessment of sustainability performance using machine learning techniques. In the following section, we introduce the proposed method along with the techniques used in each step of data analysis for the performance assessment.

3. Methodology

Several methods involving supervised machine learning techniques have been applied in previous studies to measure country sustainability performance. However, the disadvantages of these methods became apparent when they were applied to large datasets. It was believed that clustering techniques could be useful in managing large datasets in sustainability assessment systems. Among the clustering techniques, SOM has been shown to be effective in clustering tasks. This study therefore applies this clustering technique in the context of country sustainability. Through this clustering technique, the data were clustered into different classes for a more efficient prediction task. To conduct the prediction task on the sustainability data, an effective supervised technique, CART, was implemented. This technique is based on regression and classification approaches [48]. In addition, an ensemble approach, Random Forest (RF), that relies on CART models was applied to each cluster of SOM. Thus, the proposed method aims to address the shortcomings of previous approaches and enhance the efficiency of assessment systems for sustainability performance in terms of prediction accuracy.

This study is the first to employ the SOM and CART techniques to assess country sustainability. It was believed that the combination of clustering and prediction machine learning techniques could be an effective method of measuring sustainability performance. It could also alleviate the shortcomings of the previous methods and enhance prediction accuracy.

The machine learning method applied in this study took two main components of the SAFE model into account, namely ecological sustainability (ECOS) and human sustainability (HUMS). ECOS covered four indicators, namely “water quality (WATER)”, “land integrity (LAND)”, “air quality (AIR)”, and “biodiversity (BIOD)”. Meanwhile, HUMS took four indicators into account, namely “political aspects (POLICY)”, “economic welfare (WEALTH)”, “health (HEALTH)”, and “education (KNOW)”. Each of these indicators in ECOS and HUMS was measured through more elementary variables including Response (RE), State (ST), and Pressure (PR) indicators. These elementary variables were previously used in the SAFE model. Figure 1 displays the aforementioned components, indicators, and elementary variables. The Overall Sustainability (OSUS) was computed by combining the results gained from the different levels of the proposed model.

Figure 1 also presents the hybrid method of clustering and supervised prediction techniques. It could be seen from the figure that before the prediction task, the data should be clustered into different groups, where each cluster comprised similar data regarding sustainability performance. Furthermore, the method in this study included the CART in four levels of sustainability assessment. The second level was developed to take the dimensions of sustainability into account. Sustainability indicators were used for the assessment according to each dimension in the third level. In the fourth level of this assessment, sustainability was evaluated through elementary variables. In the final level, the computation of the overall sustainability of countries was done. Therefore, for each cluster, a total of 35 CART models were developed, and each CART output in the lower level was identified as the upper level’s input. Notably, the total number of prediction models was influenced by the number of clusters generated by the SOM technique.

3.1. CART

CART has been utilized effectively for regression problems as it discovers nonlinear relationships without variable transformations [48,49]. This method is widely used in finding the relationship between inputs and output in decision-making systems. Each decision tree in CART is constructed through recursive binary partitioning [50]. In addition, it has been shown that outliers have limited impacts on results. Furthermore, collinearity among predictors has no significant impact on the accuracy of CART [51]. In the CART approach, the goal is to find (learn) the relationship between a set of predictor variables and a dependent variable through a learning algorithm that employs recursive partitioning. Although CART is considered an accurate method for prediction and classification tasks, an ensemble of different decision trees built through the bagging approach can present more effective results. In bagging, which is based on the bootstrapping approach, random subsets of the training data are repeatedly selected to develop multiple trees. This is called Random Forests, which is an ensemble approach that relies on CART models [52,53]. The structure of the bagging ensemble model proposed in this study is shown in Figure 1.
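To make recursive binary partitioning and bagging concrete, the following is a minimal pure-Python sketch, not the implementation used in the study. It assumes a regression tree that minimizes the sum of squared errors, a fixed maximum depth, and plain bootstrap aggregation; the random feature selection that Random Forests add on top of bagging is omitted. The toy data and all function names are hypothetical.

```python
import random

def sse(ys):
    """Sum of squared errors around the mean (impurity of a regression node)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((v - mean) ** 2 for v in ys)

def best_split(X, y):
    """Return the (feature, threshold) that most reduces SSE, or None."""
    best, best_cost = None, sse(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (j, t), cost
    return best

def grow(X, y, depth=0, max_depth=3):
    """Recursive binary partitioning; a leaf stores the mean response."""
    split = best_split(X, y) if depth < max_depth and len(y) >= 2 else None
    if split is None:
        return sum(y) / len(y)
    j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            grow([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            grow([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict(tree, row):
    """Follow the binary splits down to a leaf value."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def bagged_trees(X, y, n_trees=25, seed=0):
    """Bagging: grow one tree per bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(y)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        forest.append(grow([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def predict_bagged(forest, row):
    """Ensemble prediction: average of the individual tree predictions."""
    return sum(predict(tree, row) for tree in forest) / len(forest)

# Hypothetical toy data: the response steps up when the first feature passes 0.45.
X = [[i / 10, (i * 7 % 10) / 10] for i in range(10)]
y = [0.2 if row[0] < 0.45 else 0.8 for row in X]
tree = grow(X, y)
forest = bagged_trees(X, y)
print(predict(tree, [0.1, 0.5]), predict_bagged(forest, [0.9, 0.5]))
```

Averaging over bootstrap samples smooths the piecewise-constant output of a single tree, which is the usual motivation for an ensemble over an individual CART model.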

3.2. SOM

Clustering plays an important role in developing prediction methods. We used SOM for data clustering [54,55,56]. The clusters were discovered in different map spaces, which allowed us to transform higher-dimensional input spaces into lower-dimensional map space. The goodness of clustering algorithm results was evaluated by a technique for final clustering size and map.

In SOM, the inputs in the dataset are projected onto the neural net, with connections between the neurons, in the cortical area. In fact, in SOM, output neurons of the model are interconnected in a lower-dimensional space within a defined neighborhood (see Figure 2). In SOM the following main steps are performed:

Step 1: All data points x_j are compared with all nodes m_i to find the nearest node m_b, which is called the best-matching unit (BMU), for each data point;
Step 2: Each node m_i in the 2D space is updated to the average of the attracted data, including data located in a specified neighborhood σ;
Step 3: Step 1 and Step 2 are repeated a specified number of times.
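The three steps above can be sketched as a batch SOM in pure Python. This is an illustrative sketch under stated assumptions rather than the configuration used in the study: the neighborhood is taken to be a Gaussian weight of width sigma on the 2D grid, the nodes are initialized from randomly chosen data points, and the toy data, iteration count, and function names are hypothetical.

```python
import math
import random

def nearest_node(x, nodes):
    """Step 1: index of the best-matching unit (BMU) for data point x."""
    return min(range(len(nodes)), key=lambda i: math.dist(x, nodes[i]))

def train_batch_som(data, rows, cols, n_iter=20, sigma=1.0, seed=0):
    """Batch SOM on a rows x cols grid; returns one weight vector per node."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    nodes = [list(rng.choice(data)) for _ in grid]   # start from random data points
    dim = len(data[0])
    for _ in range(n_iter):                          # Step 3: repeat Steps 1 and 2
        bmus = [nearest_node(x, nodes) for x in data]
        new_nodes = []
        for ri, ci in grid:                          # Step 2: neighbourhood average
            num, den = [0.0] * dim, 0.0
            for x, b in zip(data, bmus):
                rb, cb = grid[b]
                # Gaussian weight by grid distance between this node and the BMU
                h = math.exp(-((ri - rb) ** 2 + (ci - cb) ** 2) / (2 * sigma ** 2))
                den += h
                for k in range(dim):
                    num[k] += h * x[k]
            new_nodes.append([v / den for v in num])
        nodes = new_nodes
    return nodes

# Hypothetical two-dimensional toy data forming two groups.
blob_a = [[0.10 + 0.01 * i, 0.20 - 0.01 * i] for i in range(6)]
blob_b = [[0.90 - 0.01 * i, 0.80 + 0.01 * i] for i in range(6)]
data = blob_a + blob_b
nodes = train_batch_som(data, rows=2, cols=4)        # 2 x 4 grid, i.e., 8 nodes
labels = [nearest_node(x, nodes) for x in data]      # cluster index per data point
print(labels)
```

After training, the BMU index of each data point serves as its cluster label, so a 2 × 4 map yields at most eight clusters.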

4. Data Analysis and Method Evaluation

This study aims to measure country sustainability performance by implementing the SOM and CART techniques through a set of input indicators in the SAFE dataset. SAFE data were used to evaluate the proposed method. Figure A1 and Figure A2 in Appendix A present the sustainability data of 128 countries based on ECOS and HUMS. It could be seen from the figures that, similar to ECOS and OSUS, HUMS and OSUS were correlated with each other. Furthermore, societal/human sustainability and ecological sustainability levels were also determined. The first step of this study was the clustering of data into several classes using SOM. In this process, different SOM sizes were tested, and the best number of clusters was selected based on the SOM map quality. Specifically, SOM 2 × 2, SOM 2 × 3, SOM 2 × 4, and SOM 3 × 4 were tested for SOM clustering. Figure 3 displays the SOM 2 × 4 clustering results, as the accuracy value of SOM 2 × 4 (8 clusters) clustering was higher than that of the other SOM sizes.
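The text does not state which map-quality measure was used to compare SOM sizes. As one common possibility, and purely as an assumption, the sketch below computes the quantization error, i.e., the mean distance between each data point and its nearest node. The data and the two hand-made sets of nodes are hypothetical.

```python
import math

def quantization_error(data, nodes):
    """Mean distance from each data point to its nearest node (lower is better)."""
    return sum(min(math.dist(x, m) for m in nodes) for x in data) / len(data)

# Hypothetical data and two candidate sets of trained nodes.
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
coarse_map = [[0.5, 0.5]]                                      # one node
fine_map = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]    # four nodes

print(quantization_error(data, coarse_map), quantization_error(data, fine_map))
```

Because the quantization error tends to fall as nodes are added, it is usually weighed against other criteria, such as how many data points fall in each cluster, rather than minimized on its own.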

In the next step of this research, prediction models were constructed using CART from the SAFE data by identifying the rules for decision-making. In fact, the use of CART allowed the identification of relationships between the inputs and outputs. Furthermore, CART was used to determine the relationship between X and Y based on Y = f(X₁, X₂, …, X_n). The sustainability of countries was measured through four levels of CART, as shown in Figure 1. Information about the indicators of the SAFE model can be found on the website http://www.sustainability.tuc.gr/ (accessed on 4 December 2019). Figure 1 also shows that each dimension in the SAFE model consists of several indicators, which are considered as the inputs for the CART models of this study. The main relationships between the inputs and outputs as per Figure 1 are presented in Equations (1)–(11). The functions in these equations are learned from the data by CART models using 10-fold cross-validation.

Y_{OSUS} = f (ECOS, HUMS)

(1)

Y_{ECOS} = f (BIOD, LAND, AIR, WATER)

(2)

Y_{HUMS} = f (HEALTH, KNOW, WEALTH, POLICY)

(3)

Y_{HEALTH} = f (STHEALTH, REHEALTH, PRHEALTH)

(4)

Y_{KNOW} = f (STKNOW, REKNOW, PRKNOW)

(5)

Y_{WEALTH} = f (STWEALTH, REWEALTH, PRWEALTH)

(6)

Y_{POLICY} = f (STPOLICY, REPOLICY, PRPOLICY)

(7)

Y_{BIOD} = f (STBIOD, REBIOD, PRBIOD)

(8)

Y_{LAND} = f (STLAND, RELAND, PRLAND)

(9)

Y_{AIR} = f (STAIR, REAIR, PRAIR)

(10)

Y_{WATER} = f (STWATER, REWATER, PRWATER)

(11)
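The four-level structure of Equations (1)–(11) can be written down as a small lookup table and evaluated bottom-up. The sketch below is illustrative only: `MODEL_INPUTS` is transcribed from the equations, while `evaluation_order`, `predict_hierarchy`, the constant indicator values, and the averaging placeholder models are hypothetical stand-ins for the trained CART models.

```python
COMPONENTS = ["BIOD", "LAND", "AIR", "WATER", "HEALTH", "KNOW", "WEALTH", "POLICY"]

# Inputs of each model, transcribed from Equations (1)-(11).
MODEL_INPUTS = {
    "OSUS": ["ECOS", "HUMS"],
    "ECOS": ["BIOD", "LAND", "AIR", "WATER"],
    "HUMS": ["HEALTH", "KNOW", "WEALTH", "POLICY"],
}
for comp in COMPONENTS:
    MODEL_INPUTS[comp] = [prefix + comp for prefix in ("ST", "RE", "PR")]

def evaluation_order(target):
    """List model outputs bottom-up so every model's inputs are computed first."""
    order = []
    def visit(name):
        for child in MODEL_INPUTS.get(name, []):
            visit(child)
        if name in MODEL_INPUTS and name not in order:
            order.append(name)
    visit(target)
    return order

def predict_hierarchy(basic, models, target="OSUS"):
    """Propagate basic indicator values up the hierarchy to the target."""
    values = dict(basic)
    for name in evaluation_order(target):
        values[name] = models[name](*[values[k] for k in MODEL_INPUTS[name]])
    return values[target]

# Hypothetical basic indicators (all 0.5) and placeholder models that average
# their inputs; real use would plug in one trained CART model per output.
basic = {prefix + comp: 0.5 for comp in COMPONENTS for prefix in ("ST", "RE", "PR")}
mean_model = lambda *xs: sum(xs) / len(xs)
models = {name: mean_model for name in MODEL_INPUTS}

print(evaluation_order("OSUS"))
print(predict_hierarchy(basic, models))
```

Keeping the hierarchy as data makes it straightforward to swap in a different indicator set without changing the evaluation logic.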

CART was applied to each cluster generated by SOM to determine the relationship between the inputs and outputs of the SAFE data, and the overall sustainability performance of a country was assessed through this relationship. The decision trees for sustainability performance were induced from the SAFE data, and the resulting decision rules, which determine country sustainability performance, are used in the proposed system for sustainability ranking. The decision trees discovered from Cluster 1 are shown in Figure 4 and Table 1. The decision trees for the other clusters are presented in Table A1, Table A2, Table A3, Table A4, Table A5, Table A6 and Table A7 in Appendix B. The results show that CART generated the decision rules for performance prediction effectively: the rules were developed automatically from the data and could accurately predict the output from the input. SOM also supported the development of the decision rules, because each cluster groups countries with similar sustainability data. For the first cluster of countries, namely Laos, India, Cambodia, Papua NG, Benin, Mali, Bangladesh, Niger, Pakistan, Yemen, Sudan, and Mauritania, eight decision rules were discovered from the data. These rules predict OSUS based on ECOS and HUMS. For the second cluster of countries, namely Gabon, Kenya, Malawi, Zambia, Nepal, Gambia, Rwanda, Congo, Mozambique, Guinea B, Burkina Faso, Côte d’Ivoire, Guinea, Angola, Chad, DR Congo, Burundi, Ethiopia, and Central African Rep., eight decision rules were likewise discovered for OSUS prediction based on ECOS and HUMS.
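How a regression tree turns data into rules of the form "ECOS < t then avg(OSUS) = v" can be sketched in a few lines. The code below is a simplified variance-reduction tree written for illustration; it omits pruning and is not the software used in the study. The function names, the stopping parameters, and the toy (ECOS, HUMS) to OSUS data are assumptions.

```python
def sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(rows, ys, min_leaf):
    """Return the (feature, threshold) pair with the lowest total SSE, or None."""
    best, best_cost = None, sse(ys)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold halfway between neighbours
            left = [y for r, y in zip(rows, ys) if r[j] < t]
            right = [y for r, y in zip(rows, ys) if r[j] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:  # keep only splits that reduce the error
                best, best_cost = (j, t), cost
    return best

def fit_tree(rows, ys, min_leaf=1, max_depth=4):
    """Leaves are floats (mean output); nodes are (feature, threshold, left, right)."""
    split = best_split(rows, ys, min_leaf) if max_depth > 0 else None
    if split is None:
        return sum(ys) / len(ys)
    j, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[j] < t]
    right = [(r, y) for r, y in zip(rows, ys) if r[j] >= t]
    return (j, t,
            fit_tree([r for r, _ in left], [y for _, y in left], min_leaf, max_depth - 1),
            fit_tree([r for r, _ in right], [y for _, y in right], min_leaf, max_depth - 1))

def predict(tree, x):
    """Follow the splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] < t else right
    return tree

# Hypothetical (ECOS, HUMS) -> OSUS samples, made up for this illustration.
X = [(0.74, 0.15), (0.75, 0.16), (0.73, 0.24), (0.76, 0.25), (0.74, 0.28), (0.75, 0.31)]
y = [0.41, 0.40, 0.49, 0.50, 0.51, 0.54]
tree = fit_tree(X, y, min_leaf=1, max_depth=2)
print(tree)
print(predict(tree, (0.74, 0.27)))
```

Each root-to-leaf path in the fitted tree reads as one decision rule, which is the form in which the appendix tables report the per-cluster models.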

In this study, we used the adjusted coefficient of determination (R_{adjusted}^{2}) to assess all CART models through a 10-fold cross-validation approach. The R_{adjusted}^{2} approach is presented in Equation (12).

R_{adjusted}^{2} = 1 - \left(1 - \left[\frac{\sum_{k = 1}^{N} (A_{k} - {\bar{A}}_{m}) (P_{k} - {\bar{P}}_{m})}{\sqrt{\sum_{k = 1}^{N} {(A_{k} - {\bar{A}}_{m})}^{2}} \times \sqrt{\sum_{k = 1}^{N} {(P_{k} - {\bar{P}}_{m})}^{2}}}\right]^{2}\right) \left(\frac{N - 1}{N - m - 1}\right) = 1 - (1 - R^{2}) \left(\frac{N - 1}{N - m - 1}\right)

(12)

In Equation (12), N, A_{k}, P_{k}, {\bar{A}}_{m}, {\bar{P}}_{m}, and m represent the number of observations, actual output, predicted value, actual mean value, predicted mean value, and the number of independent variables, respectively.
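As a worked illustration of Equation (12), the sketch below computes the adjusted coefficient of determination, assuming that R² is taken as the squared correlation between actual and predicted values. The function names and the sample numbers are hypothetical.

```python
import math

def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted outputs."""
    n = len(actual)
    a_mean = sum(actual) / n
    p_mean = sum(predicted) / n
    cov = sum((a - a_mean) * (p - p_mean) for a, p in zip(actual, predicted))
    ss_a = sum((a - a_mean) ** 2 for a in actual)
    ss_p = sum((p - p_mean) ** 2 for p in predicted)
    return (cov / (math.sqrt(ss_a) * math.sqrt(ss_p))) ** 2

def adjusted_r_squared(actual, predicted, m):
    """Adjust R^2 for N observations and m independent variables."""
    n = len(actual)
    if n - m - 1 <= 0:
        raise ValueError("need more observations than independent variables + 1")
    return 1.0 - (1.0 - r_squared(actual, predicted)) * (n - 1) / (n - m - 1)

# Hypothetical actual and predicted values with one independent variable.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 2.0, 4.0]
print(r_squared(actual, predicted))
print(adjusted_r_squared(actual, predicted, m=1))
```

For this sample the correlation is 0.8, so R² is 0.64 and the adjusted value is 1 − 0.36 × 3/2 = 0.46. The penalty factor (N − 1)/(N − m − 1) grows as m approaches N, which matters for small clusters.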

The accuracy of all CART models for the eight clusters of SOM is shown in Figure 5a–h. The results show that the CART prediction models provided high R_{adjusted}^{2} values in all clusters. Accordingly, this technique could be useful for building tools for sustainability performance evaluation.

This research also applied bagging, one of the most popular ensemble methods, to obtain the final results. Bagging combines bootstrap resampling with aggregation. Bootstrapped replicas of the original data in each cluster were derived by randomly drawing different training sub-datasets, with replacement, from the training dataset. A prediction model was then generated from each sub-dataset and applied to predict the entire data. Finally, the estimated models were aggregated to obtain the final results [57]. For ensemble learning in this research, the CART-building procedure was repeated 20 times to obtain 20 individual prediction models in each cluster. Each of these models was used to predict the output, and a linear combination of their predictions was used as the final prediction. The results for SOM+Ensembles of CART, CART [32], Adaptive Neuro-Fuzzy Inference System (ANFIS) [32], Neural Network (NN) [32], Multiple Linear Regression (MLR) [32], Fuzzy C-Means + CART [32], and Fuzzy C-Means + ANFIS [32] techniques are shown in Table 2. The analysis revealed that the combination of SOM and CART with the aid of ensemble learning outperformed CART, ANFIS, NN, MLR, Fuzzy C-Means + CART, and Fuzzy C-Means + ANFIS in the measurement of country sustainability performance. As shown in Figure 6, only minor differences were present between SAFE and the proposed method in the country sustainability rankings (see Table A8 in Appendix B).
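The bagging procedure can be sketched as follows. This is a simplified illustration rather than the study's implementation: the base learner is a one-split regression stump instead of a full CART tree, the linear combination is assumed to be an equal-weight average, and the data are hypothetical. Only the ensemble size of 20 comes from the text.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single feature; returns a predictor."""
    mean_all = sum(ys) / len(ys)
    best = (float("inf"), None, mean_all, mean_all)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        cost = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if cost < best[0]:
            best = (cost, t, ml, mr)
    _, t, ml, mr = best
    if t is None:                      # all x values identical: predict the mean
        return lambda x: mean_all
    return lambda x: ml if x < t else mr

def bagged_predictor(xs, ys, n_models=20, seed=0):
    """Train n_models stumps on bootstrap resamples and average their outputs."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)

# Hypothetical single-indicator data for one cluster.
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
ensemble = bagged_predictor(xs, ys)
print(ensemble(0.15), ensemble(0.85))
```

Averaging over resampled models mainly reduces the variance of an unstable learner such as a decision tree, which is the usual motivation for bagging CART.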

5. Discussion and Managerial Implications

In recent years, sustainability assessment and management have become increasingly important, and developing integrated and accurate tools to measure sustainability performance remains a challenging task. Several attempts have been made to develop measurement tools based on sustainability indicators to solve specific sustainability issues [31,37,40]. Furthermore, a growing number of approaches for sustainability assessment have been developed to support policy-makers and decision-makers in promoting global sustainable development [58]. Among these approaches, machine learning techniques have been utilized effectively to solve complex environmental issues. In line with sustainable development, this study developed a new scheme for assessing the sustainability of countries based on unsupervised (SOM) and supervised (CART) learning techniques with the aid of an ensemble learning approach. Country sustainability is considered a complex issue, particularly in the context of sustainable decision-making [59]. The scheme also provides prediction models for the non-linear relationships between the sustainability indicators.

One of the main advantages of the proposed method, which was confirmed by the results, is its ability to handle large datasets in sustainability assessment systems. The SOM method is capable of managing a significant number of tuples for different levels of sustainability assessment. It is a robust clustering technique, based on neural network learning, that identifies similar groups in the data. Based on the centroids of the clusters, new data can easily be assigned to the group to which they belong. The outcome of this study lends additional support to previous literature that indicated the effectiveness of using SOM to cluster large datasets for sustainability assessment [43]. CART, in turn, is capable of effectively identifying the non-linear relationships between the inputs and outputs of complex models. It was applied in this study to the sustainability performance data in each SOM cluster to determine the relationship between the inputs and outputs. CART was more accurate when applied to the clustered data, which indicates that combining CART with SOM clustering improves the construction of the prediction models. In addition, when a new case is presented for sustainability assessment, the method can identify the closest cluster through the cluster centroids and select the corresponding CART model to predict sustainability performance from a set of sustainability indicators.
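The routing of a new case to its cluster model can be sketched as below. The centroid coordinates, the cluster labels, and the two placeholder models are hypothetical; in the proposed method the centroids would come from the trained SOM and the models would be the per-cluster CART ensembles.

```python
import math

def nearest_cluster(x, centroids):
    """Return the label of the centroid closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

def predict_new_case(x, centroids, models):
    """Route x to its closest cluster and apply that cluster's model."""
    label = nearest_cluster(x, centroids)
    return label, models[label](x)

# Hypothetical (ECOS, HUMS) centroids for two clusters.
centroids = {"low HUMS": (0.74, 0.20), "high HUMS": (0.60, 0.80)}

# Placeholder per-cluster models; each takes an (ECOS, HUMS) pair.
models = {
    "low HUMS": lambda x: 0.41 if x[1] < 0.16 else 0.50,
    "high HUMS": lambda x: 0.70 if x[1] < 0.81 else 0.80,
}

print(predict_new_case((0.65, 0.75), centroids, models))
```

Because only the distance to each centroid is needed, adding a new country does not require re-clustering the existing data.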

This study extended previous studies and provided a new solution for sustainability assessment. As presented in [31], previous approaches that rely on manually discovering fuzzy rules and determining the membership functions are time-consuming. Compared to SAFE, the method implemented in this study automatically determines the relationship between the inputs and outputs, which is effective in identifying the decision-making rules. The proposed method could therefore complement previous assessment models based on knowledge-based approaches: where previous studies rely on fuzzy rules for sustainability performance assessment, those rules can be extracted through the CART technique. As a result, the time complexity could be improved. Overall, future studies on the rule induction module should address the limitations of the previous methods.

In this study, the indicators used in the SAFE model were emphasized. This study also took ecological sustainability and societal/human sustainability in the SAFE model into account. However, other indicators could also be included in the proposed assessment system to evaluate the country’s sustainability performance. The indicators highlighted in the literature were economic, environmental, social, resource, fuel, carbon steel, CO₂, SO₂, NOx, energy costs, investment, efficiency, job, and diversity indicators.

The method used in this study can benefit other assessment methods through the automatic acquisition of data for large datasets. In most cases the indicators of sustainability are not constant; methods that can accept new indicators are therefore more efficient than methods with fixed indicators. Accordingly, such methods can be effectively enhanced for real-world applications in sustainability development. Environmentalists, governmental authorities, and policy-makers of sustainability development should emphasize this enhancement so that more methods of sustainability performance evaluation can be developed.

6. Conclusions and Future Work

This study developed a new method for measuring the sustainability performance of countries using clustering and prediction machine learning techniques. SOM was used to cluster the sustainability data, and CART was used to predict country sustainability performance. Ensembles of CART models were also developed. The SAFE dataset was used to evaluate the method. The analysis showed that clustering could improve the readability of the data and improve the CART technique in its prediction of sustainability performance. When the data were clustered, the CART model could effectively perform the prediction task in each group, which consisted of similar data regarding sustainability performance. Moreover, the ensembles of CART could enhance the prediction accuracy of individual CART models. The results were compared with those of the CART, ANFIS, NN, MLR, Fuzzy C-Means + CART, and Fuzzy C-Means + ANFIS techniques. The combination of SOM and CART with the aid of ensemble learning outperformed all of these techniques in the measurement of country sustainability performance. The final results showed only minor differences between SAFE and the proposed method in the country sustainability rankings.

Several limitations were present in this study. First, only two main dimensions of sustainability were taken into account for performance evaluation, and the real-world dataset used in this study included a fixed number of indicators in each dimension. Therefore, it is suggested that the proposed method be evaluated on other datasets with different sustainability indicators. Second, non-incremental CART was used for the assessment of country sustainability performance. Non-incremental CART is not capable of online prediction: it cannot learn the models incrementally from the data and must reprocess all the training data to rebuild them. Large datasets often require real-time prediction, so the prediction models must be updated as new data arrive, and storing the trained models adds to the memory requirements. As a solution, the incremental version of CART (Crawford, 1989) may be a more suitable approach to construct the prediction models. Additionally, combining incremental CART with the clustering techniques would, with minimal computational burden, lead to improved performance of the sustainability assessment system. The proposed method could thus be further improved through incremental machine learning techniques. It is also recommended that the computation time of the proposed method and the complexity of the tree be investigated in future works. Furthermore, more studies using machine learning and big data decision analysis are needed to perform complex sustainability assessments at the country level. Such future work will advance the field and help practitioners and policy makers, while also improving our understanding of where countries should focus their efforts to become more sustainable.

Author Contributions

Conceptualization, M.N., S.A., R.A.A., F.G., E.S., S.S., and R.O.; methodology, M.N.; software, M.N.; validation, M.N., S.A., E.S., R.A.A., F.G., and S.S.; formal analysis, M.N.; investigation, M.N., S.S., R.A.A., F.G., and S.A.; resources, M.N.; data curation, M.N.; writing—original draft preparation, M.N., R.A.A.; writing—review and editing, M.N., S.A., R.A.A., E.S., F.G., R.O., and S.S.; visualization, M.N., S.S., R.A.A., F.G., S.A., and R.O.; supervision, M.N.; project administration, M.N.; funding acquisition, M.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-track Research Funding Program.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are available in a publicly accessible repository. The data presented in this study are openly available in Ref. [21]: Phillis, Y.A.; Grigoroudis, E.; Kouikoglou, V.S. Sustainability ranking and improvement of countries. Ecol. Econ. 2011, 70, 542–553, doi:10.1016/j.ecolecon.2010.09.037.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Figure A1. Visualizing the countries' sustainability based on ECOS and OSUS.

Figure A2. Visualizing the countries' sustainability based on HUMS and OSUS.

Appendix B

Table A1. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 2.

Decision Trees

HUMS < 0.1610
  ○ ECOS < 0.7420 then avg(OSUS) = 0.4150
  ○ ECOS ≥ 0.7420 then avg(OSUS) = 0.4010
HUMS ≥ 0.1610
  ○ HUMS < 0.2645
    ▪ HUMS < 0.2360 then avg(OSUS) = 0.4845
    ▪ HUMS ≥ 0.2360
      ▪ ECOS < 0.7495
        ▪ ECOS < 0.7415 then avg(OSUS) = 0.4925
        ▪ ECOS ≥ 0.7415 then avg(OSUS) = 0.4973
      ▪ ECOS ≥ 0.7495
        ▪ HUMS < 0.2555 then avg(OSUS) = 0.5000
        ▪ HUMS ≥ 0.2555 then avg(OSUS) = 0.5050
  ○ HUMS ≥ 0.2645
    ▪ HUMS < 0.3060
      ▪ HUMS < 0.2795 then avg(OSUS) = 0.5100
      ▪ HUMS ≥ 0.2795 then avg(OSUS) = 0.5150
    ▪ HUMS ≥ 0.3060 then avg(OSUS) = 0.5400

Table A2. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 3.

Decision Trees

ECOS < 0.7175
  ○ ECOS < 0.6665 then avg(OSUS) = 0.4630
  ○ ECOS ≥ 0.6665
    ▪ HUMS < 0.2480 then avg(OSUS) = 0.4760
    ▪ HUMS ≥ 0.2480
      ▪ ECOS < 0.6925 then avg(OSUS) = 0.4820
      ▪ ECOS ≥ 0.6925 then avg(OSUS) = 0.4810
ECOS ≥ 0.7175 then avg(OSUS) = 0.5020
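
To show how such a table is applied, the sketch below transcribes the Cluster 3 tree of Table A2 into nested tuples and walks it for a given case. The thresholds and leaf values are copied from Table A2; the tuple encoding, the function names, and the example inputs are assumptions made for illustration.

```python
# Node format: (feature, threshold, subtree_if_less, subtree_otherwise); leaves are avg(OSUS).
TREE_CLUSTER_3 = (
    "ECOS", 0.7175,
    ("ECOS", 0.6665,
        0.4630,
        ("HUMS", 0.2480,
            0.4760,
            ("ECOS", 0.6925, 0.4820, 0.4810))),
    0.5020,
)

def predict_osus(tree, case):
    """Follow the splits for a case given as {'ECOS': ..., 'HUMS': ...}."""
    while isinstance(tree, tuple):
        feature, threshold, below, otherwise = tree
        tree = below if case[feature] < threshold else otherwise
    return tree

def extract_rules(tree, conditions=()):
    """Flatten the tree into (condition string, avg(OSUS)) pairs, one per leaf."""
    if not isinstance(tree, tuple):
        return [(" and ".join(conditions), tree)]
    feature, threshold, below, otherwise = tree
    return (extract_rules(below, conditions + (f"{feature} < {threshold:.4f}",))
            + extract_rules(otherwise, conditions + (f"{feature} >= {threshold:.4f}",)))

for condition, value in extract_rules(TREE_CLUSTER_3):
    print(f"if {condition} then avg(OSUS) = {value:.4f}")

# Hypothetical country in Cluster 3.
print(predict_osus(TREE_CLUSTER_3, {"ECOS": 0.70, "HUMS": 0.30}))
```

The same encoding works for the other appendix trees, since every internal node has exactly two complementary branches.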

Table A3. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 4.

Decision Trees

HUMS < 0.4975
  ○ HUMS < 0.4150 then avg(OSUS) = 0.5620
  ○ HUMS ≥ 0.4150
    ▪ ECOS < 0.7260 then avg(OSUS) = 0.5770
    ▪ ECOS ≥ 0.7260
      ▪ ECOS < 0.7335 then avg(OSUS) = 0.6130
      ▪ ECOS ≥ 0.7335 then avg(OSUS) = 0.5920
HUMS ≥ 0.4975
  ○ HUMS < 0.5960
    ▪ ECOS < 0.7400
      ▪ ECOS < 0.6660 then avg(OSUS) = 0.6190
      ▪ ECOS ≥ 0.6660
        ▪ ECOS < 0.6775 then avg(OSUS) = 0.5990
        ▪ ECOS ≥ 0.6775 then avg(OSUS) = 0.6125
    ▪ ECOS ≥ 0.7400
      ▪ HUMS < 0.5225 then avg(OSUS) = 0.6248
      ▪ HUMS ≥ 0.5225 then avg(OSUS) = 0.6475
  ○ HUMS ≥ 0.5960 then avg(OSUS) = 0.6810

Table A4. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 5.

Decision Trees

HUMS < 0.7220
  ○ HUMS < 0.5775
    ▪ ECOS < 0.5090
      ▪ ECOS < 0.4905 then avg(OSUS) = 0.4910
      ▪ ECOS ≥ 0.4905 then avg(OSUS) = 0.5050
    ▪ ECOS ≥ 0.5090 then avg(OSUS) = 0.5230
  ○ HUMS ≥ 0.5775
    ▪ ECOS < 0.5010 then avg(OSUS) = 0.5630
    ▪ ECOS ≥ 0.5010 then avg(OSUS) = 0.5580
HUMS ≥ 0.7220
  ○ ECOS < 0.5010 then avg(OSUS) = 0.6210
  ○ ECOS ≥ 0.5010 then avg(OSUS) = 0.6260

Table A5. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 6.

Decision Trees

HUMS < 0.6055
  ○ ECOS < 0.5780 then avg(OSUS) = 0.5410
  ○ ECOS ≥ 0.5780 then avg(OSUS) = 0.5560
HUMS ≥ 0.6055
  ○ ECOS < 0.5830
    ▪ ECOS < 0.5605 then avg(OSUS) = 0.6450
    ▪ ECOS ≥ 0.5605 then avg(OSUS) = 0.6240
  ○ ECOS ≥ 0.5830
    ▪ ECOS < 0.6095 then avg(OSUS) = 0.6680
    ▪ ECOS ≥ 0.6095 then avg(OSUS) = 0.6780

Table A6. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 7.

Decision Trees

ECOS < 0.7475
  ○ ECOS < 0.7285 then avg(OSUS) = 0.8620
  ○ ECOS ≥ 0.7285 then avg(OSUS) = 0.8510
ECOS ≥ 0.7475
  ○ HUMS < 0.9970
    ▪ ECOS < 0.8190 then avg(OSUS) = 0.8800
    ▪ ECOS ≥ 0.8190 then avg(OSUS) = 0.8630
  ○ HUMS ≥ 0.9970
    ▪ ECOS < 0.8230
      ▪ ECOS < 0.7890 then avg(OSUS) = 0.8930
      ▪ ECOS ≥ 0.7890 then avg(OSUS) = 0.8960
    ▪ ECOS ≥ 0.8230 then avg(OSUS) = 0.9270

Table A7. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 8.

Decision Trees

HUMS < 0.8145
  ○ HUMS < 0.7155 then avg(OSUS) = 0.7048
  ○ HUMS ≥ 0.7155
    ▪ ECOS < 0.7190 then avg(OSUS) = 0.7280
    ▪ ECOS ≥ 0.7190
      ▪ HUMS < 0.7410 then avg(OSUS) = 0.7377
      ▪ HUMS ≥ 0.7410
        ▪ ECOS < 0.7635
          ▪ HUMS < 0.7665 then avg(OSUS) = 0.7505
          ▪ HUMS ≥ 0.7665 then avg(OSUS) = 0.7600
        ▪ ECOS ≥ 0.7635 then avg(OSUS) = 0.7610
HUMS ≥ 0.8145 then avg(OSUS) = 0.7975

Table A8. Countries' sustainability performance ranking by SAFE, SOM-CART, SOM-Ensemble of CART, and ANFIS.

Country	SAFE	SOM-CART	SOM-Ensemble of CART	ANFIS	Difference (ANFIS and SAFE)	Difference (SOM-CART and SAFE)	Difference (SOM-Ensemble of CART and SAFE)
Switzerland	2	2	2	3	1	0	0
Sweden	3	3	3	2	−1	0	0
Finland	5	5	5	6	1	0	0
Denmark	6	6	6	5	−1	0	0
Norway	4	4	4	4	0	0	0
Austria	7	7	7	8	1	0	0
France	10	10	10	11	1	0	0
Netherlands	8	8	8	7	−1	0	0
Germany	1	1	1	1	0	0	0
Belgium	9	9	9	9	0	0	0
Canada	13	13	13	13	0	0	0
New Zealand	11	11	11	10	−1	0	0
Latvia	18	17	17	21	3	−1	−1
Estonia	25	26	25	23	−2	1	0
Lithuania	15	16	15	16	1	1	0
Italy	17	18	17	19	2	1	0
Slovakia	20	20	20	17	−3	0	0
Czech Rep.	16	15	15	15	−1	−1	−1
Australia	14	15	15	14	0	1	1
Portugal	24	24	24	22	−2	0	0
Croatia	29	29	29	28	−1	0	0
UK	12	12	12	12	0	0	0
Poland	23	23	23	24	1	0	0
Hungary	33	32	32	32	−1	−1	−1
Greece	31	31	31	30	−1	0	0
Spain	21	21	21	25	4	0	0
Japan	28	28	28	31	3	0	0
Ireland	22	22	22	26	4	0	0
USA	32	33	33	33	1	1	1
Slovenia	19	19	19	18	−1	0	0
Uruguay	26	27	27	20	−6	1	1
Chile	45	45	45	44	−1	0	0
Bulgaria	36	37	37	37	1	1	1
Georgia	42	42	42	42	0	0	0
Israel	49	49	49	48	−1	0	0
South Korea	48	48	48	53	5	0	0
Panama	43	43	43	46	3	0	0
Malaysia	57	57	57	55	−2	0	0
Belarus	27	25	26	27	0	−2	−1
Albania	44	44	44	43	−1	0	0
Bolivia	56	55	55	57	1	−1	−1
Tunisia	55	56	55	58	3	1	0
Thailand	64	64	64	64	0	0	0
Venezuela	51	51	51	50	−1	0	0
Romania	30	30	30	29	−1	0	0
Paraguay	53	53	53	52	−1	0	0
Ukraine	39	38	38	38	−1	−1	−1
FYR Maced.	38	39	39	39	1	1	1
Peru	61	61	61	59	−2	0	0
El Salvador	58	58	58	56	−2	0	0
Brazil	35	35	35	35	0	0	0
Moldova	66	66	66	65	−1	0	0
Nicaragua	50	50	50	49	−1	0	0
Kazakhstan	40	41	41	40	0	1	1
Argentina	34	34	34	34	0	0	0
Kyrgyzstan	54	54	54	54	0	0	0
Ecuador	46	46	46	45	−1	0	0
Armenia	52	52	52	51	−1	0	0
Azerbaijan	68	68	68	67	−1	0	0
Russia	41	40	41	41	0	−1	0
Vietnam	81	80	80	82	1	−1	−1
Jordan	76	76	76	75	−1	0	0
Mongolia	75	75	75	77	2	0	0
Mexico	60	60	60	60	0	0	0
China	62	62	62	63	1	0	0
Syria	73	73	73	73	0	0	0
Kuwait	59	59	59	61	2	0	0
Turkey	37	36	36	36	−1	−1	−1
Saudi Arabia	79	79	79	78	−1	0	0
Botswana	71	71	71	72	1	0	0
Algeria	83	83	83	85	2	0	0
Morocco	47	47	47	47	0	0	0
Uzbekistan	80	81	81	79	−1	1	1
Gambia	90	90	90	90	0	0	0
Congo	95	95	95	92	−3	0	0
Gabon	82	82	82	81	−1	0	0
Colombia	105	105	105	105	0	0	0
Lebanon	93	92	92	94	1	−1	−1
Egypt	92	93	93	96	4	1	1
Zimbabwe	70	70	70	69	−1	0	0
Senegal	94	94	94	91	−3	0	0
Namibia	77	77	77	76	−1	0	0
Zambia	88	88	88	87	−1	0	0
Malawi	86	86	86	88	2	0	0
Papua NG	118	117	117	118	0	−1	−1
Oman	115	115	115	115	0	0	0
Ghana	69	69	69	70	1	0	0
Honduras	67	67	67	68	1	0	0
Sri Lanka	87	87	87	89	2	0	0
Kenya	84	84	84	84	0	0	0
Cambodia	117	118	118	117	0	1	1
Angola	101	101	101	104	3	0	0
Cote d’Ivoire	99	99	99	97	−2	0	0
Bangladesh	123	123	123	122	−1	0	0
Benin	120	120	120	120	0	0	0
Laos	112	112	112	111	−1	0	0
Guatemala	72	72	72	71	−1	0	0
South Africa	85	85	85	83	−2	0	0
Philippines	74	74	74	74	0	0	0
Chad	102	102	102	103	1	0	0
United Arab E	78	78	78	80	2	0	0
Niger	124	124	124	124	0	0	0
Tanzania	104	104	104	102	−2	0	0
Uganda	108	108	108	107	−1	0	0
Nigeria	110	110	110	112	2	0	0
Togo	111	111	111	110	−1	0	0
Tajikistan	63	63	63	62	−1	0	0
Indonesia	65	65	65	66	1	0	0
Guinea Bissau	97	97	97	100	3	0	0
Centr. Afr. R	121	119	120	123	2	−2	−1
Mozambique	96	96	96	93	−3	0	0
Rwanda	91	91	91	95	4	0	0
Madagascar	114	113	113	114	0	−1	−1
Burkina Faso	98	98	98	98	0	0	0
Cameroon	113	114	114	113	0	1	1
Nepal	89	89	89	86	−3	0	0
Mali	122	122	122	121	−1	0	0
Iran	103	103	103	101	−2	0	0
Guinea	100	100	100	99	−1	0	0
DR Congo	106	106	106	108	2	0	0
India	116	116	116	116	0	0	0
Yemen	126	126	126	125	−1	0	0
Ethiopia	119	121	119	119	0	2	0
Pakistan	125	125	125	126	1	0	0
Sierra Leone	109	109	109	109	0	0	0
Burundi	107	107	107	106	−1	0	0
Mauritania	128	128	128	127	−1	0	0
Sudan	127	127	127	128	1	0	0
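
The difference columns of Table A8 can be summarized with a mean absolute rank difference against SAFE. The sketch below uses only five rows copied from the table, picked to include some non-zero differences, so the printed averages are not representative of all 128 countries. The function name and the tuple layout are assumptions.

```python
# (country, SAFE rank, SOM-CART rank, SOM-Ensemble of CART rank), copied from Table A8.
ROWS = [
    ("Germany", 1, 1, 1),
    ("Latvia", 18, 17, 17),
    ("Estonia", 25, 26, 25),
    ("Belarus", 27, 25, 26),
    ("Ethiopia", 119, 121, 119),
]

def mean_abs_rank_diff(rows, column):
    """Average |method rank - SAFE rank| for the method stored at `column`."""
    return sum(abs(row[column] - row[1]) for row in rows) / len(rows)

print("SOM-CART vs SAFE:", mean_abs_rank_diff(ROWS, 2))
print("SOM-Ensemble of CART vs SAFE:", mean_abs_rank_diff(ROWS, 3))
```

On these five rows the averages are 1.2 for SOM-CART and 0.4 for the ensemble. Running the same function over the full table would give the overall agreement with the SAFE ranking.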

References

Brown, B.J.; Hanson, M.E.; Liverman, D.M.; Merideth, R.W., Jr. Global sustainability: Toward definition. Environ. Manag. 1987, 11, 713–719.
Haider, H.; Hewage, K.; Umer, A.; Ruparathna, R.; Chhipi-Shrestha, G.; Culver, K.; Holland, M.; Kay, J.; Sadiq, R. Sustainability assessment framework for small-sized urban neighbourhoods: An application of fuzzy synthetic evaluation. Sustain. Cities Soc. 2018, 36, 21–32.
Hou, D.; Ding, Z.; Li, G.; Wu, L.; Hu, P.; Guo, G.; Wang, X.; Ma, Y.; O’Connor, D.; Wang, X. A Sustainability Assessment Framework for Agricultural Land Remediation in China. Land Degrad. Dev. 2018, 29, 1005–1018.
Ibrahim, Y.; Arafat, H.A.; Mezher, T.; Almarzooqi, F. An integrated framework for sustainability assessment of seawater desalination. Desalination 2018, 447, 1–17.
Tan, Y.; Shuai, C.; Jiao, L.; Shen, L. An adaptive neuro-fuzzy inference system (ANFIS) approach for measuring country sustainability performance. Environ. Impact Assess. Rev. 2017, 65, 29–40.
Asadi, S.; Pourhashemi, S.O.; Nilashi, M.; Abdullah, R.; Samad, S.; Yadegaridehkordi, E.; Aljojo, N.; Razali, N.S. Investigating influence of green innovation on sustainability performance: A case on Malaysian hotel industry. J. Clean. Prod. 2020, 258, 120860.
Hakovirta, M.; Denuwara, N. How COVID-19 Redefines the Concept of Sustainability. Sustainability 2020, 12, 3727.
Lyu, Y.; Yang, X.; Pan, H.; Zhang, X.; Cao, H.; Ulgiati, S.; Wu, J.; Zhang, Y.; Wang, G.; Xiao, Y. Impact of fertilization schemes with different ratios of urea to controlled release nitrogen fertilizer on environmental sustainability, nitrogen use efficiency and economic benefit of rice production: A study case from Southwest China. J. Clean. Prod. 2021, 293, 126198.
Wong, D.T.; Ngai, E.W. Economic, organizational, and environmental capabilities for business sustainability competence: Findings from case studies in the fashion business. J. Bus. Res. 2021, 126, 440–471.
Ali, S.A.; Alharthi, M.; Hussain, H.I.; Rasul, F.; Hanif, I.; Haider, J.; Ullah, S.; Rahman, S.U.; Abbas, Q. A clean technological innovation and eco-efficiency enhancement: A multi-index assessment of sustainable economic and environmental management. Technol. Forecast. Soc. Chang. 2021, 166, 120573.
Lahlou, F.-Z.; Mackey, H.R.; Al-Ansari, T. Wastewater reuse for livestock feed irrigation as a sustainable practice: A socio-environmental-economic review. J. Clean. Prod. 2021, 294, 126331.
Roseland, M. Sustainable community development: Integrating environmental, economic, and social objectives. Prog. Plan. 2000, 54, 73–132.
Santoyo-Castelazo, E.; Azapagic, A. Sustainability assessment of energy systems: Integrating environmental, economic and social aspects. J. Clean. Prod. 2014, 80, 119–138.
Cinelli, M.; Coles, S.R.; Kirwan, K. Analysis of the potentials of multi criteria decision analysis methods to conduct sustainability assessment. Ecol. Indic. 2014, 46, 138–148.
Sala, S.; Farioli, F.; Zamagni, A. Life cycle sustainability assessment in the context of sustainability science progress (part 2). Int. J. Life Cycle Assess. 2013, 18, 1686–1697.
Wahab, M.A. Is an unsustainability environmentally unethical? Ethics orientation, environmental sustainability engagement and performance. J. Clean. Prod. 2021, 294, 126240.
Abad-Segura, E.; González-Zamar, M.-D. Sustainable economic development in higher education institutions: A global analysis within the SDGs framework. J. Clean. Prod. 2021, 294, 126133.
Dabbous, A.; Tarhini, A. Does sharing economy promote sustainable economic development and energy efficiency? Evidence from OECD countries. J. Innov. Knowl. 2021, 6, 58–68.
Janeiro, L.; Patel, M.K. Choosing sustainable technologies. Implications of the underlying sustainability paradigm in the decision-making process. J. Clean. Prod. 2015, 105, 438–446.
Ness, B.; Urbel-Piirsalu, E.; Anderberg, S.; Olsson, L. Categorising tools for sustainability assessment. Ecol. Econ. 2007, 60, 498–508.
Phillis, Y.A.; Grigoroudis, E.; Kouikoglou, V.S. Sustainability ranking and improvement of countries. Ecol. Econ. 2011, 70, 542–553.
Kouloumpis, V.D.; Kouikoglou, V.S.; Phillis, Y.A. Sustainability Assessment of Nations and Related Decision Making Using Fuzzy Logic. IEEE Syst. J. 2008, 2, 224–236.
Kouikoglou, V.S.; Phillis, Y.A. On the monotonicity of hierarchical sum–product fuzzy systems. Fuzzy Sets Syst. 2009, 160, 3530–3538.
Andriantiatsaholiniaina, L.A.; Kouikoglou, V.S.; Phillis, Y.A. Evaluating strategies for sustainable development: Fuzzy logic reasoning and sensitivity analysis. Ecol. Econ. 2004, 48, 149–172.
Wiek, A.; Binder, C. Solution spaces for decision-making—A sustainability assessment tool for city-regions. Environ. Impact Assess. Rev. 2005, 25, 589–608.
Begić, F.; Afgan, N.H. Sustainability assessment tool for the decision making in selection of energy system—Bosnian case. Energy 2007, 32, 1979–1985.
Zarghami, E.; Azemati, H.; Fatourehchi, D.; Karamloo, M. Customizing well-known sustainability assessment tools for Iranian residential buildings using Fuzzy Analytic Hierarchy Process. Build. Environ. 2018, 128, 107–128.
Amindoust, A.; Ahmed, S.; Saghafinia, A.; Bahreininejad, A. Sustainable supplier selection: A ranking model based on fuzzy inference system. Appl. Soft Comput. 2012, 12, 1668–1677.
Ferraro, D.O.; Ghersa, C.M.; Sznaider, G.A. Evaluation of environmental impact indicators using fuzzy logic to assess the mixed cropping systems of the Inland Pampa, Argentina. Agric. Ecosyst. Environ. 2003, 96, 1–18.
Azadi, H.; Berg, J.V.D.; Shahvali, M.; Hosseininia, G. Sustainable rangeland management using fuzzy logic: A case study in Southwest Iran. Agric. Ecosyst. Environ. 2009, 131, 193–200.
Nilashi, M.; Cavallaro, F.; Mardani, A.; Zavadskas, E.K.; Samad, S.; Ibrahim, O. Measuring Country Sustainability Performance Using Ensembles of Neuro-Fuzzy Technique. Sustainability 2018, 10, 2707.
Nilashi, M.; Rupani, P.F.; Rupani, M.M.; Kamyab, H.; Shao, W.; Ahmadi, H.; Rashid, T.A.; Aljojo, N. Measuring sustainability through ecological sustainability and human sustainability: A machine learning approach. J. Clean. Prod. 2019, 240, 118162.
Yadegaridehkordi, E.; Hourmand, M.; Nilashi, M.; Alsolami, E.; Samad, S.; Mahmoud, M.; Alarood, A.A.; Zainol, A.; Majeed, H.D.; Shuib, L. Assessment of sustainability indicators for green building manufacturing using fuzzy multi-criteria decision making approach. J. Clean. Prod. 2020, 277, 122905.
Li, W.; Ren, X.; Ding, S.; Dong, L. A multi-criterion decision making for sustainability assessment of hydrogen production technologies based on objective grey relational analysis. Int. J. Hydrogen Energy 2020, 45, 34385–34395.
Ren, X.; Li, W.; Ding, S.; Dong, L. Sustainability assessment and decision making of hydrogen production technologies: A novel two-stage multi-criteria decision making method. Int. J. Hydrogen Energy 2020.
Abdel-Basset, M.; Gamal, A.; Chakrabortty, R.K.; Ryan, M.J. Evaluation of sustainable hydrogen production options using an advanced hybrid MCDM approach: A case study. Int. J. Hydrogen Energy 2021, 46, 4567–4591.
Streimikiene, D.; Skulskis, V.; Balezentis, T.; Agnusdei, G.P. Uncertain multi-criteria sustainability assessment of green building insulation materials. Energy Build. 2020, 219, 110021.
Makan, A.; Fadili, A. Sustainability assessment of large-scale composting technologies using PROMETHEE method. J. Clean. Prod. 2020, 261, 121244.
Akhanova, G.; Nadeem, A.; Kim, J.R.; Azhar, S. A multi-criteria decision-making framework for building sustainability assessment in Kazakhstan. Sustain. Cities Soc. 2020, 52, 101842.
Phillis, A.; Grigoroudis, E.; Kouikoglou, V.S. Assessing national energy sustainability using multiple criteria decision analysis. Int. J. Sustain. Dev. World Ecol. 2021, 28, 18–35.
Grigoroudis, E.; Kouikoglou, V.S.; Phillis, Y.A. SAFE 2019: Updates and new sustainability findings worldwide. Ecol. Indic. 2021, 121, 107072.
Amini, S.; Rohani, A.; Aghkhani, M.H.; Abbaspour-Fard, M.H.; Asgharipour, M.R. Sustainability assessment of rice production systems in Mazandaran Province, Iran with emergy analysis and fuzzy logic. Sustain. Energy Technol. Assess. 2020, 40, 100744.
Kanmani, A.P.; Obringer, R.; Rachunok, B.; Nateghi, R. Assessing Global Environmental Sustainability Via an Unsupervised Clustering Framework. Sustainability 2020, 12, 563.
Shang, Y.; Liu, S.; Liu, C. Fuzzy Evaluation on Sustainability Performances of Selected Pacific Islands Countries. J. Coast. Res. 2020, 105, 165–170.
Gharizadeh Beiragh, R.; Alizadeh, R.; Shafiei Kaleibari, S.; Cavallaro, F.; Zolfani, S.H.; Bausys, R.; Mardani, A. An integrated multi-criteria decision making model for sustainability performance assessment for insurance companies. Sustainability 2020, 12, 789.
Asrol, M.; Papilo, P.; Gunawan, F.E. Support Vector Machine with K-fold Validation to Improve the Industry’s Sustainability Performance Classification. Procedia Comput. Sci. 2021, 179, 854–862.
Attia, S.; Alphonsine, P.; Amer, M.; Ruellan, G. Towards a European rating system for sustainable student housing: Key performance indicators (KPIs) and a multi-criteria assessment approach. Environ. Sustain. Indic. 2020, 7, 100052.
Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; CRC Press: Boca Raton, FL, USA, 1984.
Steinberg, D.; Colla, P. CART: Classification and Regression Trees; Salford Systems; CRC Press: San Diego, CA, USA, 1997.
Antipov, E.A.; Pokryshevskaya, E.B. Mass appraisal of residential apartments: An application of Random forest for valuation and a CART-based approach for model diagnostics. Expert Syst. Appl. 2012, 39, 1772–1778.
Ahani, A.; Nilashi, M.; Ibrahim, O.; Sanzogni, L.; Weaven, S. Market segmentation and travel choice prediction in Spa hotels through TripAdvisor’s online reviews. Int. J. Hosp. Manag. 2019, 80, 52–77.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792.
Moya-Anegón, F.; Herrero-Solana, V.; Jiménez-Contreras, E. A connectionist and multivariate approach to science maps: The SOM, clustering and MDS applied to library and information science research. J. Inf. Sci. 2006, 32, 63–77.
Liu, Y.-C.; Wu, C.; Liu, M. Research of fast SOM clustering for text information. Expert Syst. Appl. 2011, 38, 9325–9333.
Roh, T.H.; Oh, K.J.; Han, I. The collaborative filtering recommendation based on SOM cluster-indexing CBR. Expert Syst. Appl. 2003, 25, 413–423.
Vigneau, E.; Courcoux, P.; Symoneaux, R.; Guérin, L.; Villière, A. Random forests: A machine learning methodology to highlight the volatile organic compounds involved in olfactory perception. Food Qual. Prefer. 2018, 68, 135–145. [Google Scholar] [CrossRef]
De Olde, E.M.; Bokkers, E.A.; de Boer, I.J. The choice of the sustainability assessment tool matters: Differences in thematic scope and assessment results. Ecol. Econ. 2017, 136, 77–85. [Google Scholar] [CrossRef]
Hjorth, P.; Bagheri, A. Navigating towards sustainable development: A system dynamics approach. Future 2006, 38, 74–92. [Google Scholar] [CrossRef]

Figure 1. The proposed method for overall sustainability assessment.

Figure 2. The best-matching unit in the Self-Organizing Map (SOM).

Figure 3. SOM clusters for countries’ sustainability performance.

Figure 4. Visualizing decision trees for predicting OSUS based on ECOS and HUMS in Cluster 1.

Figure 5. The adjusted coefficient of determination for CART models in evaluating sustainability performance.

Figure 6. Ranking differences between ANFIS and SAFE, and between SOM-CART and SAFE, in assessing the sustainability performance of 128 countries.

Table 1. Decision trees for predicting OSUS based on ECOS and HUMS in Cluster 1.

Decision Trees

ECOS < 0.5660
  ○ HUMS < 0.1555 then avg(OSUS) = 0.3100
  ○ HUMS ≥ 0.1555
    ▪ HUMS < 0.2235 then avg(OSUS) = 0.3510
    ▪ HUMS ≥ 0.2235
      ▪ ECOS < 0.5340
        ▪ ECOS < 0.5080
          ▪ ECOS < 0.5005 then avg(OSUS) = 0.3730
        ▪ ECOS ≥ 0.5080 then avg(OSUS) = 0.3820
      ▪ ECOS ≥ 0.5340 then avg(OSUS) = 0.3990
ECOS ≥ 0.5660
  ○ ECOS < 0.5980
    ▪ ECOS < 0.5850 then avg(OSUS) = 0.4220
    ▪ ECOS ≥ 0.5850 then avg(OSUS) = 0.4450
  ○ ECOS ≥ 0.5980 then avg(OSUS) = 0.4710
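
The rules in Table 1 can be read as nested conditionals. The Python sketch below encodes them under stated assumptions. The function name `predict_osus_cluster1` is hypothetical. The 0.5850 threshold for the 0.4220 leaf is inferred from its sibling rule (ECOS ≥ 0.5850). The table lists no leaf for 0.5005 ≤ ECOS < 0.5080 when HUMS ≥ 0.2235, so the sketch returns `None` there rather than guessing a value. It illustrates how to read the table and is not the authors' implementation.

```python
from typing import Optional


def predict_osus_cluster1(ecos: float, hums: float) -> Optional[float]:
    """Return avg(OSUS) for the Table 1 leaf that matches (ECOS, HUMS).

    Returns None for the one region that Table 1 does not list.
    """
    if ecos < 0.5660:
        if hums < 0.1555:
            return 0.3100
        if hums < 0.2235:
            return 0.3510
        # HUMS >= 0.2235: the remaining splits are on ECOS only.
        if ecos >= 0.5340:
            return 0.3990
        if ecos >= 0.5080:
            return 0.3820
        if ecos < 0.5005:
            return 0.3730
        # 0.5005 <= ECOS < 0.5080: no leaf value appears in Table 1.
        return None
    # ECOS >= 0.5660
    if ecos >= 0.5980:
        return 0.4710
    if ecos < 0.5850:  # assumed threshold, inferred from the sibling rule
        return 0.4220
    return 0.4450
```

Because every leaf is a constant, the tree predicts a piecewise-constant OSUS surface over the (ECOS, HUMS) plane, which is why the rules stay readable as plain threshold tests.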

Table 2. Comparison of methods.

Method	Adjusted Coefficient of Determination
ANFIS	0.884
MLR	0.795
NN	0.813
CART	0.894
Ensembles of CART	0.907
Fuzzy C-Means + CART	0.923
Fuzzy C-Means + ANFIS	0.918
SOM + Ensembles of CART	0.936
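
Table 2 compares the methods by their adjusted coefficient of determination. As a reminder of what that metric computes, the following minimal Python sketch uses the standard definition, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The function name `adjusted_r2` and its argument names are illustrative assumptions. The sketch is not the authors' evaluation code and does not reproduce the values in the table.

```python
from typing import Sequence


def adjusted_r2(y_true: Sequence[float], y_pred: Sequence[float],
                n_predictors: int) -> float:
    """Adjusted coefficient of determination for a fitted regression model."""
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")

    mean_y = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    if ss_tot == 0:
        raise ValueError("y_true is constant; R^2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 for the number of predictors used by the model.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
```

For a model such as the one in Table 1, which predicts OSUS from ECOS and HUMS, `n_predictors` would be 2. The adjustment matters most when a cluster contains few countries, since the penalty grows as n approaches p + 1.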

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
