Unveiling the Correlation between Nonfunctional Requirements and Sustainable Environmental Factors Using a Machine Learning Model

Hassan, Shoaib; Li, Qianmu; Zubair, Muhammad; Alsowail, Rakan A.; Qureshi, Muaz Ahmad

doi:10.3390/su16145901


by Shoaib Hassan ¹,*, Qianmu Li ¹, Muhammad Zubair ², Rakan A. Alsowail ³,* and Muaz Ahmad Qureshi ⁴

¹ School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

² Faculty of Information Technology and Computer Science, University of Central Punjab, Lahore 54000, Pakistan

³ Computer Skills, Self-Development Skills Development, Deanship of Common First Year, King Saud University, Riyadh 11362, Saudi Arabia

⁴ Department of Computer Science, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan

* Authors to whom correspondence should be addressed.

Sustainability 2024, 16(14), 5901; https://doi.org/10.3390/su16145901

Submission received: 14 June 2024 / Revised: 2 July 2024 / Accepted: 7 July 2024 / Published: 11 July 2024


Abstract

Integrating environmental features into software requirements during the requirements engineering (RE) process is known as sustainable requirements engineering. Unlike previous studies, we found that there is a strong relationship between nonfunctional requirements and sustainable environmental factors. This study presents a novel methodology correlating nonfunctional requirements (NFRs) with precise, sustainable green IT factors. Our mapping methodology consists of two steps. In the first step, we link sustainability dimensions to the two groups of green IT aspects. In the second step, we connect NFRs to sustainability aspects. Our proposed methodology is based on the extended PROMISE_exp dataset in combination with the Bidirectional Encoder Representations from Transformers (BERT) language model. Moreover, we evaluate the model by inserting a new binary classification column into the dataset to classify the sustainability factors into socio-economic and eco-technical groups. The performance of the model is assessed using four performance metrics: accuracy, precision, recall, and F1 score. With 16 epochs and a batch size of 32, 90% accuracy was achieved. The proposed model indicates an improvement in performance metrics values yielding an increase of 3.4% in accuracy, 3% in precision, 3.4% in recall, and 16% in F1 score values compared to the competitive previous studies. This acts as a proof of concept for automating the evaluation of sustainability realization in software during the initial stages of development.

Keywords: sustainability; BERT; machine learning; PROMISE_exp dataset; nonfunctional requirements; green IT factors

1. Introduction

Implementing current information technology (IT) systems with a suitable information and communication technology (ICT) structure has become important in the current decade. The rapid advancement of technology allows various tasks to be completed simultaneously and delivers high performance and efficiency. The role that IT plays in several industries, including healthcare and decision-support systems, emphasizes its importance in a variety of sectors. To implement ICT, organizations use specific software that is crucial to service delivery and production measures. Businesses that produce software and hardware prioritize the development and application of these instruments, often ignoring the sustainability of the resources used in their manufacturing. This oversight has resulted in a notable problem that may be examined from two viewpoints: (1) the software within the information technology system, and (2) sustainability [1]. The result for IT systems is an exponential increase in software complexity, which raises costs and harms stakeholders and the owner [2]. On the other hand, IT systems negatively influence sustainability due to the amount of hardware, data centers, and energy required for these systems to be developed, maintained, and run. Additionally, it is important to consider the negative effects on social and economic features [3].

The demand for sustainable and eco-friendly software is rising, with implications for performance levels, network bandwidth, and hardware requirements. These factors also affect energy consumption and the use of natural resources. Research on greening software engineering appears to be a critical answer to this growing demand. The goal of this research is to improve software systems’ sustainability through a multiangle approach. To design software with as few negative impacts as possible on the environment, society, and the economy, it becomes imperative to adopt the viewpoint of sustainable software development. Numerous studies show the intersection of nonfunctional requirements, such as performance, maintainability, scalability, and usability, with sustainable environmental factors. This intersection demonstrates the relationship between green IT factors and NFRs. The key evidence showing the relationship between NFRs and green IT factors is given below:

Resource Consumption: Green IT factors focus on the best possible usage of resources, including software and hardware resources. Different NFRs, such as scalability and resource consumption, ensure that the system can handle additional load without adding unnecessary resources to the system. Research studies indicate efficient resource management in the software development process, which can lead to the better resource consumption of existing infrastructure.

Energy Efficiency: Minimizing energy consumption is one of the major green IT factors. Different NFRs, such as efficiency and performance, are directly related to these green IT factors. We can reduce energy consumption by decreasing the computational resources using optimized algorithms and efficient coding techniques [4].

Usability: Usability is a major NFR that can influence user behavior towards green IT practices. For example, a user-friendly interface that encourages users to power down the system when it is not in use can contribute to energy saving. Effective interface strategies can have a positive impact on sustainability factors.

Maintainability: A software system that follows green IT practices has a long lifecycle. Maintainability is an NFR that ensures the system can be easily maintained and updated. A well-maintained system is less likely to be replaced, which reduces electronic waste, one of the green IT factors.

Regulatory Compliance: NFRs that are related to regulatory compliance ensure that a system meets the applicable regulatory requirements, which often include energy efficiency and efficient waste management. By meeting these regulatory requirements, the system directly supports green IT practices.

The study presented here contributes to the convergence of software engineering and sustainability in several ways. First, we present a greening requirements engineering concept; requirements engineering is an important stage in the software development lifecycle (SDLC). Many studies have used machine learning and deep learning approaches for requirement classification [5,6]. In particular, our study defines a procedure to link nonfunctional requirements (NFRs) to characteristics of green software. Secondly, we extend the widely used Predictor Models in Software Engineering (PROMISE) dataset [7] with a new feature that captures the author-defined aspects of software sustainability. This dataset is important for training the language model that we created in this work. This study creates a link between NFRs and the sustainability dimensions in the PROMISE dataset [7].

Thirdly, to classify different kinds of nonfunctional requirements (NFRs) into relevant sustainability aspects, we build, implement, and evaluate a tailored Bidirectional Encoder Representations from Transformers (BERT) language model. This model is a cornerstone of our methodology, which measures the degree of sustainability perception and attention at the RE stage of the software development lifecycle (SDLC). Different pre-trained language models are used for different natural language processing (NLP) applications. For example, OpenAI’s Generative Pre-Trained Transformer (GPT) generates text using an unsupervised learning technique [8]. Using a uniform framework, Google’s Text-to-Text Transfer Transformer (T5) is trained on tasks including summarization and text translation [9]. Facebook presented the Robustly Optimized BERT Pretraining Approach (RoBERTa) as a BERT variant that improves overall performance by incorporating more training data and methodologies [10]. In this study, we employ the BERT model based on the features of the selected dataset and the proposed task of mapping NFRs to sustainability aspects of the software. BERT is a highly efficient pre-trained language model that performs well in various natural language processing tasks, especially fine-grained classification tasks that require the classification of text into discrete groups.

In contrast to previous language models, BERT’s attention mechanism and bidirectional nature provide extra advantages for the accuracy of fine-grained categorization tasks. Additionally, BERT is especially well suited for fine-tuning on a limited dataset, as used in this study [8]. The main motivation for studying sustainable requirements engineering is its potential impact on the environment. Sustainable requirements engineering helps us to design eco-friendly software systems that reduce resource and energy consumption. It can also make software systems more cost-effective, and incorporating sustainability into the software development process makes that process more efficient and competitive. Moreira [11] discussed the impact of sustainable requirements engineering on the software development process. The study built a sustainability requirements catalog with the help of a systematic mapping approach to show the sustainable characteristics that affect the development process. A different study [12] addressed the effectiveness of software sustainability through a systematic mapping approach. A third study [13] analyzed the impact of sustainable requirements on social and environmental issues and found them useful in all types of projects. The main contributions of our research are given below:

Focuses on the correlation of nonfunctional requirements with sustainable green IT factors.
Extends the PROMISE_exp dataset by adding 61 new instances and one column of sustainability class with the binary labels “socio-economic” and “eco-technical”.
Performs sustainability-factor labeling on all 1030 instances of the extended PROMISE_exp dataset.
Evaluates the BERT model for classifying sustainability factors within the context of NFRs.
Improves evaluation metrics’ values and accuracy by using the BERT language model for the classification of sustainability factors.

This paper is structured as follows: A literature review is presented in Section 2. The training dataset, the phases of dataset expansion and preprocessing, and the feature engineering procedure are covered in Section 3. The mapping approach suggested in this study is presented and discussed in Section 4. In Section 5, the evaluation of the experiment carried out to assess the suggested BERT model is shown, and the outcomes are discussed. The study’s conclusions and potential future research directions are covered in Section 6.

2. Literature Review

This section provides the foundation of this study. It explains the literature regarding sustainability in software engineering, especially requirements engineering and the classification of requirements through machine learning and deep learning.

2.1. Sustainability

Software sustainability is typically addressed through the concept of “greening” IT. The study and use of effective design, production, and disposal procedures for PCs, servers, and related subsystems are called “greening IT” [14]. Reducing or removing their environmental influence is the main objective of greening IT. To lessen software’s environmental impact, the green software development method places a strong emphasis on self-reflection. Software systems are the pillar of our increasingly technological world, performing a wide range of tasks. Software sustainability is crucial since people depend on software for their everyday tasks [15]. In the field of scientific and software development, sustainability is becoming an increasingly important concern as the world moves toward new standards in research and computer infrastructures. The Software Sustainability Institute defines “sustainability” as the availability of software that will be improved and maintained into the foreseeable future [16]. Although the concept is not fully clear, it suggests that software’s continuous availability, extensibility, and maintainability are closely related to sustainability. These features correspond to the definitions provided by the IEEE [17]. Within this framework, software development is a complex process carried out in an environment that is always changing and unpredictable [18]. Due to this changing environment, unsustainable software applications are produced. Therefore, the challenge is to create sustainable software, meaning software that can provide the desired results and carry out assigned tasks more efficiently over a longer period of time. This strategic method assures a long-lasting and important presence in the ever-changing field of technology while also cutting down on the time, labor, money, and computational resources required for development.

In 2015, to accomplish the 2030 Agenda, the United Nations (UN) adopted 17 Sustainable Development Goals (SDGs) with 169 targets [19]. These objectives cover the three dimensions of sustainability: social, economic, and environmental. The study [20] discussed the linkage between ICT and the SDGs. Various obstacles stand in the way of achieving these objectives across all three dimensions. It is very important to integrate sustainability into software development. To help achieve the SDGs by 2030, there is a growing need for more awareness and innovation in the field of green and sustainable software engineering. As a result, identifying sustainability requirements for software development is recognized as an important subject, and research on sustainability is being conducted in numerous countries, including Brazil [21] and Pakistan [22]. Studies [23,24] discussed the usage of sustainability in the context of RE. It is defined as an organized procedure that includes long-term software artifact design, implementation, and maintenance. It is anticipated that this procedure will exhibit accountability, efficiency, and flexibility to meet changing consumer needs and technological improvements. It should concurrently work to reduce adverse effects on the economy, the environment, and social institutions. The three aspects of sustainability have been studied in the context of greening the RE, design, and implementation phases of the software development process [25]. The majority of software experts agree that software sustainability is significant and useful. Additionally, [25] suggested that all sustainability dimensions, rather than just environmental consequences, should be considered when evaluating the influence of software systems.

Environmental and technical variables have a great impact on software sustainability [26]. Long-lasting software prioritizes capital preservation and financial value preservation, which has a key economic influence. It also ensures the continued operation of software systems, enabling them to properly adjust to the constantly evolving execution environment. In many different organizations, software-intensive solutions are important for directly supporting social groups. Sustainability is the protection of natural resources combined with human well-being, as seen from an environmental viewpoint [27]. As the resource base grows, this requires addressing ecological concerns, such as energy efficiency, and raising ecological awareness. There are four types of sustainability: economic, technical, social, and environmental. Economic sustainability refers to the development, implementation, and management of software that enhances financial feasibility and provides cost-effective solutions. For example, an open-source software system reduces licensing costs and improves innovation through a collaborative environment [25]. Technical sustainability ensures reliability and adaptability by using modularity and extensibility in software systems [25]. Social sustainability refers to the design and development of a software system that has a great social impact on communities and supports the accessibility of resources to people from all social backgrounds [28]. Environmental sustainability refers to the design and development of a software system that reduces its ecological effects and promotes resource efficiency [29]. We can increase environmental sustainability by using efficient algorithms [30]. In [31], standard guidelines are introduced to integrate sustainability into the system development process. Rusch [32] introduced a sustainability context engineering framework to analyze the relationship between sustainability and product development processes. In [33], the role of sustainability in the software developer’s approach to producing software is addressed. Silveira [34] focused on the sustainability factors to introduce creativity in requirements engineering.

The term “green IT” refers to the design and development of computers, software systems, and other equipment in a sustainable and environmentally friendly manner [25]. The major objectives of green IT are to reduce resource consumption, reduce negative environmental effects, and support the effective use of technology in a social context. Software sustainability may improve the green IT concept by supporting eco-friendly software solutions. Eco-friendly software solutions consume less energy and fewer resources. Eco-friendly software solutions encourage software developers and other stakeholders to implement social practices that help achieve sustainability objectives [35]. Table 1 presents green IT aspects and their relevance to sustainability factors, based on analysis of the literature.

2.2. Nonfunctional Requirements (NFRs)

Nonfunctional requirements are important requirements that describe the quality and features of a system [36]. These requirements do not describe the system’s functionality; they define how well the system performs its functional requirements. Nonfunctional requirements are also called quality attributes and can be expressed in multiple ways [25,28]. Nonfunctional requirements can be categorized in many ways. Each category addresses a distinct aspect of a software system. These categories include usability, which emphasizes learnability and availability; performance, which focuses on throughput and response time; reliability, which deals with software availability; maintainability, which focuses on testability and modifiability; and security, which deals with cohesion [25,37,38]. Table 2 describes the most common NFRs. In [39], sustainability in the context of software development is discussed. The study [39] used a sustainability analysis framework to identify the sustainability-related factors of the system. It also discussed the influence of those factors on the system. Nonfunctional requirements have been defined as software quality characteristics. Different architectures were developed to improve the quality of the system. NFRs are also used to validate the completion of the system. NFRs have a great influence on the whole software development lifecycle. Software systems that ignore NFRs during development face many issues, including low performance, poor security, and numerous reliability problems [40,41].

2.3. Quality Models

Software quality models are frameworks that are used to assess software quality [43]. These models make sure that software meets its desired quality standards and quality criteria. There are two major components of software quality: (1) process and (2) product. Many elements affect software quality, including people, technology, and tools. Different quality models are used in software organizations. One of the best-known quality models is the McCall quality model, which consists of 11 quality parameters. This quality model was adopted by the United States in 1977 to describe the quality of software [44,45]. In [44], the hierarchical factor criteria model is discussed. The main focus of this model was to close the gap between the software developer and the user to provide user-centric development. In 1978, Boehm presented another quality model. Boehm’s model addressed the issues of the previous quality model by prioritizing the users, testers, and maintainers [45]. The Software Engineering Institute (SEI) introduced the Capability Maturity Model Integration (CMMI) quality model, which provides guidelines for software process improvement. Six Sigma has also been used for process improvement and error reduction. In [46], the ISO/IEC 25010 systems and software quality models are discussed. This model provides guidelines for developing a model based on Boehm’s functional and McCall’s quality models. This model has six quality characteristics: maintainability (MN), reliability (RE), portability (PO), efficiency (EF), usability (US), and functionality (F). This study also discussed 21 different types of quality criteria [47,48]. Hewlett-Packard developed the FURPS model. FURPS stands for functionality, usability, reliability, performance, and supportability, and the model is based on these five quality attributes.

2.4. Linkage between NFRs and Green IT Factors

NFRs have a great impact on green IT factors, directly or indirectly. Ignoring NFRs in the software development process negatively affects the software product and makes the fulfillment of sustainable values impossible [49]. Pham addresses the differentiation between different NFRs to find the hidden sustainability requirements among them [50]. The study emphasizes implementing sustainability requirements in the software development process to make the product successful. Another study gives five dimensions of sustainability and finds the NFRs that best suit those dimensions [51]. It emphasizes adding those NFRs to the scrum framework of software development. Duboc proposed a question-based framework to show the effectiveness of green IT factors and their relationship with requirements engineering, especially NFRs [52]. In [4], a green software model based on NFRs is proposed that implements green IT practices and supports reduced energy consumption in the system. Alberto proposed a classification model based on quality attributes and suggested that quality attributes are the key components for software sustainability [53]. To accomplish sustainability goals, different organizations are integrating NFRs with green IT factors to reduce energy consumption and improve resource utilization. Table 3 presents different NFRs and their linkage with green IT factors.

2.5. Machine Learning in Requirements Engineering

Machine learning involves developing software applications that learn from data [54]. Numerous studies have used machine learning for multiple tasks of the SDLC, including requirement classification and prioritization [6,55,56]. In [6], a BERT-based classification model for requirements specification was used. The BERT model was used to train the classifier on software requirement specification documents, and it categorized the specification text into multiple classes, such as functional, quality, and nonfunctional requirements. In another study [56], a BERT model was used for classifying functional and nonfunctional requirements. This study addressed the issues that occur in classifying requirements and suggested that BERT is more efficient than other deep learning techniques for prediction analysis. Another study [57] described a nonfunctional requirement classifier that uses a weighted scheme for different types of quality attributes and classifies only those NFR types on which it is trained. In the first phase, a probabilistic weight is calculated from the training set for every possible indicator phrase of every NFR type. The weight of an indicator phrase indicates the kind of requirement it signals. In the second phase, the indicator terms are used to classify further statements and requirements: the probability values of the indicator terms in a new requirement are used to find its NFR type. In [6], a high threshold value was used to obtain high recall, precision, and accuracy values. Another study [58] discussed automatic requirement document classifiers. In [59], a general architecture based on a text-mining infrastructure was used for engineering systems. The classifier used in this system, a support vector machine, was trained on the PROMISE dataset. Most of the studies in the literature focused on nonfunctional requirement classification [6,56,57,58,59]. No study focused on the relationship between nonfunctional requirements and sustainability factors. Our proposed methodology focuses on the correlation between quality attributes and software sustainability factors with high accuracy.
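The two-phase indicator-term scheme described for [57] can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the exact weighting formula of [57]: the weight of a term for an NFR type is taken as the fraction of training requirements containing that term that carry that type, and the function names, the toy training sentences, and the 0.5 threshold are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a requirement and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_indicator_weights(labeled_requirements):
    """Phase 1: for each term, estimate the share of its occurrences per NFR type."""
    term_type_counts = defaultdict(Counter)
    for text, nfr_type in labeled_requirements:
        for term in set(tokenize(text)):
            term_type_counts[term][nfr_type] += 1
    weights = {}
    for term, counts in term_type_counts.items():
        total = sum(counts.values())
        weights[term] = {t: c / total for t, c in counts.items()}
    return weights


def classify(text, weights, threshold=0.5):
    """Phase 2: average the weights of known terms; return the best type or None."""
    terms = [t for t in set(tokenize(text)) if t in weights]
    if not terms:
        return None  # no indicator term seen during training
    scores = Counter()
    for term in terms:
        for nfr_type, weight in weights[term].items():
            scores[nfr_type] += weight
    best_type, best_score = scores.most_common(1)[0]
    # Hypothetical threshold: reject weak matches instead of guessing.
    return best_type if best_score / len(terms) >= threshold else None


# Toy, made-up training sentences (not taken from PROMISE).
toy_training = [
    ("The system shall encrypt all stored passwords", "SE"),
    ("Only authorized users shall access patient records", "SE"),
    ("The system shall respond to queries within two seconds", "PE"),
    ("The page shall load within three seconds", "PE"),
]
weights = train_indicator_weights(toy_training)
print(classify("Passwords shall be encrypted before storage", weights))
```

As the text notes, such a classifier can only assign NFR types that appear in its training set; a requirement with no known indicator term is left unclassified here.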

2.6. Python-Based Natural Language Processing

Many Python 3.7 libraries have been developed for the natural language processing field. The most important tools used for natural language processing are the Natural Language Toolkit (NLTK) and spaCy v3.4 [60,61]. NLTK is a rich library that provides text processing functions such as tokenization, lemmatization, parsing, and tagging. NLTK is an open-source library that is widely used in industry as well as in academia. It is a very good resource for understanding and teaching natural language processing (NLP) concepts. Another library that is widely used in NLP is spaCy v3.4, which has a very simple application programming interface. Nowadays, many organizations prefer spaCy for NLP tasks. The choice of an NLP library depends on the scope, complexity, and type of requirements [62].
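To make the preprocessing vocabulary concrete, the following sketch shows tokenization and stop-word removal on a requirement sentence. It deliberately uses only the Python standard library as a stand-in for NLTK or spaCy, so it does not show either library's API; the `preprocess` function and the tiny stop-word list are hypothetical.

```python
import re

# Tiny illustrative stop-word list; real NLP libraries ship much larger ones.
STOP_WORDS = {"the", "a", "an", "shall", "be", "to", "of", "and", "every"}


def preprocess(requirement):
    """Lowercase, split into letter/digit tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", requirement.lower())
    return [token for token in tokens if token not in STOP_WORDS]


print(preprocess("The system shall refresh the display every 60 s"))
# prints ['system', 'refresh', 'display', '60', 's']
```

Lemmatization, parsing, and tagging need linguistic resources that a regular expression cannot provide, which is where a dedicated library becomes necessary.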

3. Preprocessing and Dataset Training

We used an expanded version of PROMISE, the most commonly used dataset in software engineering research. The PROMISE_exp dataset is used as the base dataset in our research study. We added healthcare software system requirements to this dataset. Firstly, we analyzed healthcare system requirements. After analysis, we added those requirements to our repository. Experts validated these extracted requirements. This process improves the original dataset. The main reason for choosing healthcare system requirements is the diverse nature of the healthcare system. As the healthcare system is a critical and sociotechnical system, adding requirements from this system improves the dataset and may reveal other perspectives of the dataset. The PROMISE dataset has 625 labeled requirements, and PROMISE_exp has 969 labeled requirements. The requirements of both datasets are written in English. Numerous studies have used these datasets for requirements classification purposes [63]. Both datasets contain three major attributes: the project ID, the requirement text, and the class of the requirement. The project ID identifies the project from which the requirement was extracted. The requirement text is the text describing a specific requirement of the project. The class attribute defines the category to which the requirement belongs. PROMISE_exp contains functional requirements and 11 types of nonfunctional requirement. Those types are usability (US), legal (L), scalability (SC), portability (PO), maintainability (MN), availability (A), security (SE), look and feel (LF), fault tolerance (FT), performance (PE), and operational (O). Some other types of nonfunctional requirement were added when the healthcare system requirements were added to the PROMISE_exp dataset. Those types are discussed in the following sections.
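The three attributes described above can be modeled as a small record type. The sketch below is illustrative only: the field names, the `"F"` code for functional requirements, and the example requirement are assumptions made here, not the dataset's actual column names or contents.

```python
from dataclasses import dataclass

# The 11 nonfunctional class codes listed in the text for PROMISE_exp.
NFR_CLASSES = {
    "US": "usability", "L": "legal", "SC": "scalability", "PO": "portability",
    "MN": "maintainability", "A": "availability", "SE": "security",
    "LF": "look and feel", "FT": "fault tolerance", "PE": "performance",
    "O": "operational",
}
FUNCTIONAL = "F"  # assumed code for functional requirements


@dataclass(frozen=True)
class Requirement:
    project_id: int   # project the requirement was extracted from
    text: str         # the requirement sentence itself
    req_class: str    # FUNCTIONAL or one of the NFR_CLASSES keys

    def is_functional(self):
        return self.req_class == FUNCTIONAL

    def class_name(self):
        """Return the readable class name for this requirement."""
        if self.is_functional():
            return "functional"
        return NFR_CLASSES[self.req_class]


# Hypothetical example row, not taken from the dataset.
example = Requirement(project_id=1,
                      text="The system shall respond within two seconds",
                      req_class="PE")
print(example.class_name())  # prints performance
```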

3.1. Overview of Proposed Novel Methodology

The proposed methodology has several benefits over the existing literature studies. The proposed study maps NFRs to sustainable green IT factors and classifies them within the context of NFRs using the BERT model. Classification accuracy and other evaluation metrics have also been calculated. The details of every part of the proposed study will be discussed in the following sections. Figure 1 presents the basic workflow of the proposed methodology. The novel contributions of the proposed work are highlighted in red boxes in the figure.
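For reference, the four evaluation metrics used in this study (accuracy, precision, recall, and F1 score) can be computed for a binary labeling as in the sketch below. The function name and the toy label lists are hypothetical and are not the study's predictions; which class is treated as positive is also an assumption made here.

```python
def binary_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only.
y_true = ["eco-technical", "eco-technical", "socio-economic",
          "socio-economic", "socio-economic"]
y_pred = ["eco-technical", "socio-economic", "socio-economic",
          "socio-economic", "eco-technical"]
metrics = binary_metrics(y_true, y_pred, positive="eco-technical")
print(metrics)
```

On these toy labels there is one true positive, one false positive, one false negative, and two true negatives, so accuracy is 0.6 and precision, recall, and F1 are each 0.5.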

The highlighted boxes in Figure 1 show the major contributions of our proposed research study. In the first box, we extend the PROMISE_exp dataset by adding 61 new requirements from the healthcare system document. After extending the dataset, we map the NFRs to sustainable green IT factors. We form two sustainable green IT groups (socio-economic and eco-technical) by merging the four sustainability dimensions (economic, environmental, social, and technical). We add one column for the binary classification of green IT factors. After evaluating the 18 different NFR types in the dataset, software experts map each green IT factor to its relevant NFR. Each NFR is assessed based on its contribution to the socio-economic or eco-technical group and is labeled according to its nature. For example, one NFR from the dataset is “The system shall refresh the display every 60 s”, which falls under the eco-technical group due to its contribution to the environmental and technical sustainability dimensions. After mapping all NFRs to green IT factors, we apply the BERT model for feature engineering and then measure the classification accuracy of our proposed methodology. The BERT model gives us an accuracy of 0.90 and classifies 403 socio-economic and 149 eco-technical requirements. Our methodology differs from existing studies in the literature. One study identified hidden sustainability characteristics in the requirement change management process. Another described the different dimensions of sustainable software as a way to understand the NFRs in a system. A third addressed sustainable requirements in the scrum lifecycle, and a fourth addressed the importance of sustainability in the requirements engineering process. Our proposed study merges the four dimensions of sustainability into two green IT groups and maps all NFRs to those green IT factors according to their context. The details of every part of Figure 1 are given in the following sections.
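The merging of four sustainability dimensions into two green IT groups can be sketched as a simple lookup. The pairing below (social and economic into socio-economic; environmental and technical into eco-technical) follows the group names and the example in the text; the function name, the dictionary layout of a dataset row, and the choice to raise an error when dimensions span both groups are assumptions made here, since the text does not say how the experts resolved such cases.

```python
DIMENSION_TO_GROUP = {
    "social": "socio-economic",
    "economic": "socio-economic",
    "environmental": "eco-technical",
    "technical": "eco-technical",
}


def group_for(dimensions):
    """Return the single green IT group that covers all given dimensions."""
    groups = {DIMENSION_TO_GROUP[d] for d in dimensions}
    if len(groups) != 1:
        # Assumption: mixed cases need an expert decision, so do not guess.
        raise ValueError(f"dimensions span both groups: {sorted(dimensions)}")
    return groups.pop()


# The example requirement quoted in the text, with its stated dimensions.
row = {"text": "The system shall refresh the display every 60 s"}
row["sustainability"] = group_for(["environmental", "technical"])
print(row["sustainability"])  # prints eco-technical
```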

3.2. PROMISE_exp Dataset Expansion

We extended the PROMISE_exp dataset as a part of our research contribution and to cover all aspects of software quality attributes. All the steps executed for preprocessing, labeling, and extending the PROMISE_exp dataset are shown as an activity diagram in Figure 2.

We expanded the dataset in two different steps. Firstly, we added seven more types of nonfunctional requirement to the dataset: efficiency (EF), interoperability (IN), reliability (RE), accessibility (AC), reusability (REU), accuracy (ACU), and adaptability (AD). These nonfunctional requirements can also be used to determine the degree of sustainability of the software. The steps shown in Figure 2 are the preprocessing steps performed before adding the functional and nonfunctional instances to the repository. In the first stage, we searched for and retrieved a healthcare system requirement document. After a deeper analysis of the healthcare system requirements, functional and nonfunctional requirements were extracted and classified according to their type. After extraction, the software requirements were validated by experts, who checked the consistency and completeness of the requirements. After validation, all extracted requirements were added to the PROMISE_exp dataset. This step extended the PROMISE_exp dataset. The expanded PROMISE_exp dataset now contains 1030 requirements distributed over 50 software projects. There are a total of 478 functional and 552 nonfunctional requirements in the extended dataset. The distribution of all types of nonfunctional requirement is presented in Figure 3. Secondly, another column was created in the dataset to specify the sustainability factors as a part of our research contribution. At this stage, the new column of the expanded dataset was unlabeled. The next section describes the labeling process and sustainability factors.
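The counts reported for the extension can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are hypothetical.

```python
# NFR type codes: 11 in PROMISE_exp plus the 7 added in this study.
ORIGINAL_NFR_TYPES = ["US", "L", "SC", "PO", "MN", "A", "SE", "LF", "FT", "PE", "O"]
ADDED_NFR_TYPES = ["EF", "IN", "RE", "AC", "REU", "ACU", "AD"]
all_nfr_types = ORIGINAL_NFR_TYPES + ADDED_NFR_TYPES

promise_exp_size = 969       # requirements before the extension
added_requirements = 61      # healthcare requirements added
functional_count = 478
nonfunctional_count = 552

extended_size = promise_exp_size + added_requirements
nfr_share = nonfunctional_count / extended_size

print(len(all_nfr_types))                    # 18 NFR types in total
print(extended_size)                         # 1030 requirements
print(functional_count + nonfunctional_count == extended_size)
print(f"{nfr_share:.1%} of the extended dataset is nonfunctional")
```

The 552 nonfunctional requirements are the instances that receive a sustainability label, which matches the 403 + 149 split reported for the two green IT groups.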

From Figure 3, we can see that security (SE) is the most frequent NFR type in the extended dataset (128 requirements), while adaptability (AD), reliability (RE), interoperability (IN), and reusability (REU) have few instances. The large number of security requirements indicates that security is a major concern in the PROMISE_exp dataset. Before the extension, the dataset contained 969 requirements; we added 61 new requirements from the healthcare system. As security is a major concern in healthcare systems, security remains the largest category in the extended dataset. Many NFRs fall under the category of security requirements, such as authentication, authorization, integrity, confidentiality, immunity, and auditing [64]. Another reason is the capability of security to influence other NFRs: it can directly or indirectly affect the performance, efficiency, scalability, portability, usability, and availability of the system [65]. These are the reasons for the large share of security requirements in most software engineering–related datasets.

3.3. Dataset Labeling

According to the nature of the extended PROMISE_exp dataset, manual labeling was selected. A software domain specialist was selected to analyze each instance and assign appropriate labels to each instance. These labels serve as a good resource to show the context of sustainability factors related to green IT principles. The labeling process is presented in Figure 4.

To map sustainability dimensions to green IT factors, two labels were introduced to describe the categories of green IT factors: “eco-technical” and “socio-economic”. These two labels were used for the semantic classification of NFR text into two categories. The software expert was asked to decide on a label for each unlabeled NFR in the dataset. The expert analyzed every NFR and decided whether it contributes to technical or environmental sustainability (eco-technical) or to social or economic sustainability (socio-economic). For example, the requirement “The application gives output in the designated period” contributes to environmental and technical sustainability by limiting energy and power consumption. After this labeling, the distribution of green IT factors shows that the expanded dataset has more socio-economic than eco-technical labels: 403 socio-economic and 149 eco-technical. The evaluation process for labeling the NFRs consists of three steps. Firstly, we check the impact of each NFR on the sustainability dimensions. Then, we evaluate the semantics of each NFR and check its impact on the green IT factor. Finally, we add a new column to label each NFR with a specific green IT factor.
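The label distribution described above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it takes the two label counts from the text and derives each class's share, along with the accuracy that a trivial majority-class predictor would reach on data with this distribution. The function name `class_shares` is an assumption made for this example.

```python
from collections import Counter

# Label counts reported in the text for the 552 labeled NFRs.
label_counts = Counter({"socio-economic": 403, "eco-technical": 149})


def class_shares(counts):
    """Return each label's share of the total as a fraction."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


shares = class_shares(label_counts)
majority_label, _ = label_counts.most_common(1)[0]

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
# A predictor that always outputs the majority label sets a simple baseline.
print(f"majority-class baseline ({majority_label}): {shares[majority_label]:.1%}")
```

Because the classes are imbalanced, a baseline of this kind is a useful reference point when reading accuracy values.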

3.4. Feature Engineering

In this section, we describe the feature engineering used to obtain the language features of sustainability NFRs in the extended dataset. There are numerous feature-engineering techniques, but we selected BERT, which was introduced by Google AI. The BERT model captures syntactic relations and text semantics precisely and performs well in classification, question answering, and other natural language processing tasks [66]. BERT is a pre-trained model with two pre-training objectives: masked language modeling and next-sentence prediction. These objectives make BERT a popular and adaptable model for text classification, which is why we adopted it in our proposed study. BERT has many advantages. One of them is its transfer-learning ability: BERT replaces the traditional pipeline for natural language processing-based systems, and because the pre-trained model only needs to be fine-tuned, processing time can be reduced compared with training a deep learning model from scratch. The BERT model can process the tokens of long sentences in parallel and efficiently, and it can understand long sentences at a finer level than other models [67]. For these reasons, we preferred BERT over other techniques such as LSTM (Long Short-Term Memory), TF-IDF (Term Frequency–Inverse Document Frequency), and Bag of Words.

In this study, we used the scikit-learn library for feature extraction and pre-processing tasks. We can also use the PyTorch library of Python, a user-friendly library for tensor computation. PyTorch is an open-source library that offers compatibility with a broad range of other Python libraries [68]. Figure 5 shows the BERT model training process as an activity diagram. Firstly, the extended dataset is loaded into a data frame. Then, we load the BERT tokenizer and convert the NFR text into tokens. After tokenization, all labels are converted into numeric values, and the training and testing datasets are formed. Next, the BERT model is used for sequence classification. Finally, after the classification trainer is created and run, BERT’s accuracy is calculated.
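The label-conversion step in this pipeline can be sketched without any external library. The mapping below is an assumption made for illustration, since the integer ids used in the study are not stated. Two of the requirement texts are quoted from the article; the third row is invented so that both classes appear.

```python
# Hypothetical label-to-id mapping; the ids used in the study are not stated.
LABEL2ID = {"socio-economic": 0, "eco-technical": 1}
ID2LABEL = {idx: name for name, idx in LABEL2ID.items()}


def encode_labels(labels):
    """Map label strings to integer ids, failing loudly on unknown labels."""
    try:
        return [LABEL2ID[label.strip().lower()] for label in labels]
    except KeyError as err:
        raise ValueError(f"unknown sustainability label: {err}") from None


rows = [
    # Requirement texts quoted in the article.
    ("The system shall refresh the display every 60 s", "eco-technical"),
    ("The application gives output in the designated period", "eco-technical"),
    # Invented row, included only so that both classes appear.
    ("The product shall be affordable for small clinics", "Socio-economic"),
]
texts = [text for text, _ in rows]
label_ids = encode_labels(label for _, label in rows)
print(label_ids)
```

Normalizing case and whitespace before the lookup keeps small inconsistencies in manually entered labels from producing spurious classes.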

4. Proposed Methodology

In this section, we give an overview of the methodology to discuss the mapping. Our study is novel in terms of its linkage between NFRs and green IT factors. We execute our methodology in multiple steps. Firstly, we find the mapping between sustainability dimensions and different NFRs. Then, we discover the mapping between sustainability dimensions and green IT factors. Developing and training the BERT model is our third step. In the last step, we label and expand our dataset with green IT factors. According to a literature search, no study was found that used an expanded dataset for mapping the linkage between NFRs and green IT factors with high accuracy.

4.1. NFRs and Sustainability Dimensions Mapping

In this section, we discuss different NFRs and their impact on sustainability factors. NFRs have a significant effect on sustainability dimensions. By analyzing NFRs and their impact, we can develop software applications that are environmentally friendly. The next section describes the linkage between NFRs and sustainability dimensions. Only those NFRs that have a direct impact on environmental sustainability will be considered.

4.1.1. Mapping Relevant NFRs to Technical Sustainability

Nonfunctional requirements are very important for the technical feasibility of a software system. They make the system technically strong and reliable [38]. As software systems are used in every field, ensuring their technical feasibility is very important. We can provide technical feasibility by designing software systems to handle requirement changes. We can also ensure technical viability by supporting the reusability of existing resources. The nonfunctional requirements that have a direct impact on the technical capability of software systems are presented in Table 4.

4.1.2. Mapping Relevant NFRs to Environmental Sustainability

NFRs have a high impact on environmental sustainability factors. NFRs are directly related to environmental sustainability. If we want to make the system secure, this will make the system more resource efficient [69]. Due to the increase in environmental monitoring software applications, it is essential to make those applications energy and resource efficient. In this study, we show the mapping of different NFRs to environmental sustainability. Not all NFRs have a direct impact on environmental sustainability. Those NFRs that have a direct impact on environmental sustainability are given in Table 5.

4.1.3. Mapping Sustainability Aspects of Green IT Factors

Software sustainability dimensions have a great influence on green IT aspects. The presence of green IT factors in software systems makes them more cost effective and energy efficient [70]. Our proposed study focuses on four sustainability dimensions that are grouped into two groups: eco-technical and socio-economic. Eco-technical factors contain technical and environmental sustainability factors. These factors ensure that the software development supports environmental sustainability and efficiency. Socio-economic factors combine social and economic aspects of sustainability. These factors ensure that software should be financially feasible for individuals and communities. The sustainability dimensions used to train the BERT model are given in Figure 6.
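The grouping of the four sustainability dimensions into two green IT groups can be written as a small lookup table. This is a minimal sketch: the dictionary follows the grouping stated above, while the helper `groups_for` is hypothetical. The text does not say how experts handled an NFR that touches dimensions from both groups, so the helper simply returns every group involved.

```python
# Grouping of sustainability dimensions as described in the text.
DIMENSION_TO_GROUP = {
    "technical": "eco-technical",
    "environmental": "eco-technical",
    "social": "socio-economic",
    "economic": "socio-economic",
}


def groups_for(dimensions):
    """Return the sorted green IT groups touched by the given dimensions."""
    return sorted({DIMENSION_TO_GROUP[d.lower()] for d in dimensions})


# An NFR contributing to environmental and technical sustainability.
print(groups_for(["environmental", "technical"]))
# An NFR contributing to social and economic sustainability.
print(groups_for(["social", "economic"]))
```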

5. Results and Model Evaluation

In this section, we discuss the results and evaluate the performance of the BERT model. The BERT model performs well in classification tasks for big datasets [71]. We fine-tuned our pre-trained BERT model on the extended dataset that contains NFRs that are categorized into two green IT factors labels: socio-economic and eco-technical. Our extended dataset is not balanced; it contains 552 NFRs, of which 403 were labeled as socio-economic and 149 were labeled as eco-technical.

5.1. BERT Model Evaluation

Our proposed methodology is designed to measure the ability of the BERT model to perform binary classification over the two labels that relate to green IT factors. The main purpose of our research is to achieve the highest possible scores on the performance metrics discussed in the next section. To evaluate the model, we divided our extended dataset into two parts: training (70%) and testing (30%) datasets. The number of epochs and the batch size are the two parameters that we considered for hyperparameter tuning to increase the performance of the BERT model. The batch size defines how many training samples are processed in a single training update, that is, before the model’s parameters change, while the epoch number defines how many complete passes the model makes over the training dataset [72]. Both affect the BERT model’s performance. We trained the model over several epoch numbers (1–8 and 16) and two batch sizes (16 and 32). To measure the ability of the BERT model to make correct predictions, we used four common metrics: F1 score, recall, precision, and accuracy. We used a fixed random seed to control the random split of the training and testing data so that results are comparable across runs.
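The relationship between the split, the batch size, and the number of parameter updates can be made concrete with a short sketch. It assumes the 70/30 split of the 552 NFRs stated above. The seed value 42 and both function names are assumptions made for illustration, since the study does not report its seed.

```python
import math
import random


def train_test_split_indices(n_samples, test_fraction=0.30, seed=42):
    """Shuffle indices with a fixed seed and split them into train/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # same seed gives the same split
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def updates_per_epoch(n_train, batch_size):
    """Number of parameter updates needed for one pass over the training set."""
    return math.ceil(n_train / batch_size)


train_idx, test_idx = train_test_split_indices(552)
print("train:", len(train_idx), "test:", len(test_idx))

epochs = 16
for batch_size in (16, 32):
    steps = updates_per_epoch(len(train_idx), batch_size)
    print(f"batch size {batch_size}: {steps} updates/epoch, {steps * epochs} in total")
```

Under these assumptions, doubling the batch size roughly halves the number of updates per epoch, which is one reason the two settings can behave differently for the same number of epochs.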

5.2. Results Analysis

We used a confusion matrix to describe the performance of the BERT model. In Figure 7, we present a confusion matrix with 16 epochs and a batch size of 16. We chose the socio-economic label as positive and the eco-technical label as negative. After assigning positive and negative labels, we obtain four values: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). As shown in Figure 7, we have 75 instances of eco-technical that were accurately identified by the classification model (TP). We have seven instances that were predicted as eco-technical but that were socio-economic (FP). We have 23 instances of socio-economic that were accurately identified by the BERT classification model (TN). And there were six instances that were identified as socio-economic but that were from eco-technical (FN). In Figure 8, we present a confusion matrix with 16 epochs and a batch size of 32. We chose the socio-economic label as positive and the eco-technical label as negative. As shown in Figure 8, we have 77 instances of eco-technical that were accurately identified by the classification model (TP). We have five instances that were predicted as eco-technical but that were socio-economic (FP). We have 23 instances of socio-economic that were accurately identified by the BERT classification model (TN). And there were six instances identified as socio-economic but that were from eco-technical (FN). The model’s performance appears comparatively stable across the two classes, with slightly higher accuracy in identifying eco-technical instances than socio-economic ones. This analysis of the confusion matrices provides valuable insight into the model’s strengths and weaknesses in distinguishing between the eco-technical and socio-economic classes, which can help to further improve its predictive abilities.

We can also evaluate the results by analyzing the performance metric values. Performance metrics contain precision, recall, accuracy, and F1 score values. These evaluation metrics are very useful to show the performance of the model in classification tasks [73]. Table 6 presents the basic formulas used to find the values of the performance metrics.
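As a minimal sketch, the standard single-class definitions of these metrics can be computed directly from the four confusion-matrix counts. The counts below are the ones quoted in the text for Figure 8 (16 epochs, batch size 32). The function name is an assumption, and the per-class precision, recall, and F1 obtained this way need not equal the tabulated values if those were averaged across both classes, which the text does not specify.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts quoted in the text for Figure 8.
metrics = binary_metrics(tp=77, fp=5, tn=23, fn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, the accuracy is 100 correct predictions out of 111 test instances, which agrees with the reported 90%.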

Table 7 and Table 8 report the performance metrics of the BERT model with batch sizes of 16 and 32, respectively, over multiple epoch numbers. The tables show that precision, recall, accuracy, and F1 score increase as we increase the epoch number from 1 to 16 with batch sizes of 16 and 32. We achieved the highest accuracy (89.1%) on three epochs with batch size 16, as shown in Table 7. We see that accuracy gradually increases (from 73.8% to 89.1%) as the epoch number increases. In Table 8, we achieve the highest accuracy (90%) at epoch number 16 with batch size 32. To our knowledge, this is the highest accuracy reported in the literature so far; a recent research study did not achieve this level of accuracy [63]. Here, accuracy increases from 73.8% to 90% as we increase the epoch number from 1 to 16. The high accuracies in both tables show that the BERT model accurately predicts the green IT classes.

Our BERT model accurately predicts both classes—eco-technical and socio-economic. As seen in the confusion matrix, the highest accuracy is achieved in the eco-technical class, which is labeled as positive. Our model’s performance increases gradually with batch size 16 and epoch number 16, as precision increases from 54.5% to 88.9%. Recall value also increased from 73.8% to 89.1%, which means that our model detects the eco-technical accurately 89.1% of the time. The F1 score can also be considered the best parameter to analyze the model performance. From Table 6, we can see that the F1 score is the harmonic mean of recall and precision [74]. F1 score value increased from 62.7% to 88.91% with batch size 16 and epoch number 16. Likewise, the highest accuracy was achieved with 16 epochs and a batch size of 32; the accuracy increased from 73.8% to 90%. Furthermore, the precision also increases from 54.5% to 89.9%. Recall value also increased from 73.8% to 90% with epoch numbers 1–16. F1 score increased from 88.91% to 90% as we increased the batch size. From Table 7 and Table 8, we can see that accuracy increased as batch size increased from 16 to 32. Our BERT model gives the highest accuracy (90%) for the binary classification of green IT factors.

Our results reveal a unique relationship between nonfunctional requirements and sustainable environmental factors. Unlike previous research studies, our study establishes the classification mechanism of sustainable factors that improve the sustainable requirements-engineering paradigm. By comparing our results to the findings of other studies, our proposed model gives unique results. The proposed model achieves an accuracy of 90%, which is 3.4% higher than the latest research study to date [63]. The proposed study gains a 0.899 precision value, which is 3% higher than the previous study. Similarly, Table 8 shows the values of Recall = 0.90 and F1 Score = 0.90, which are 3.4% higher and 16% higher, respectively, than the competitive research study to date [63]. Using a big batch size and large epoch numbers, we observed higher metric values, which are different from the previous study. We identified 18 different nonfunctional requirements in the extended PROMISE_exp dataset and mapped them to two green IT factors that were not analyzed in the previous study. The improvement in evaluation metric values shows the major research contribution towards mapping the nonfunctional requirements to their designated sustainable factors and classifying those factors using the BERT model. The exclusive findings of our proposed model indicate a revolution in sustainable requirements engineering, which could lead to potential applications in green computing and quantum computing in the future.
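The 3.4% accuracy improvement appears to be a relative gain rather than a difference in percentage points. The sketch below shows both readings, assuming the 87% baseline accuracy that the Discussion attributes to [63]. Baseline values for precision, recall, and F1 are not given in this section, so only accuracy is computed, and the function names are hypothetical.

```python
def absolute_gain(new, old):
    """Difference between two scores, in the same unit as the inputs."""
    return new - old


def relative_gain(new, old):
    """Improvement expressed as a fraction of the old score."""
    return (new - old) / old


proposed_accuracy = 0.90  # reported for 16 epochs, batch size 32
baseline_accuracy = 0.87  # reported in the Discussion for study [63]

print(f"absolute gain: {absolute_gain(proposed_accuracy, baseline_accuracy):.3f}")
print(f"relative gain: {relative_gain(proposed_accuracy, baseline_accuracy):.1%}")
```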

5.3. Cross-Validation through K-fold

After evaluating the BERT model and finding the accuracy, we validate our accuracy again through cross-validation. For cross-validation, we are using the most common method, named k-fold. k-fold is a machine learning technique for model evaluation. It is the most common cross-validation technique used for classification tasks [75]. In the k-fold method, we divide the dataset into k parts and validate the model k number of times. Each time, we take k-1 folds as training data and the remaining fold as testing data. After k number of iterations, we evaluate the model by taking an average of all parameter values calculated in each fold. To validate our proposed methodology, we take the value of k = 5. We use 5 folds for the cross-validation of our BERT model results. By doing k-fold cross-validation on our evaluation results, we find an average accuracy of 94% across all folds for batch size 16. For batch size 32, we find an average accuracy of 93% across all folds. This cross-validation method improves the reliability of our proposed methodology. By using cross-validation, we can reduce the problems of overfitting and underfitting. k-fold is a useful method for any machine learning model that performs classification tasks [76].
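The k-fold procedure described above can be sketched in plain Python. This is an illustrative outline, not the authors' implementation: `evaluate` stands in for fine-tuning the model on the training indices and scoring it on the held-out fold, and the `dummy_evaluate` placeholder used here only demonstrates the mechanics. The seed and all function names are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(n_samples, evaluate, k=5, seed=0):
    """Average the score returned by evaluate(train, test) over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores), scores


def dummy_evaluate(train, test):
    """Placeholder score: the share of samples held out in this fold."""
    return len(test) / (len(train) + len(test))


mean_score, fold_scores = cross_validate(552, dummy_evaluate, k=5)
print([round(score, 3) for score in fold_scores], round(mean_score, 3))
```

Each sample is used for testing exactly once and for training in the other k-1 folds, so the averaged score depends less on one particular split.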

5.4. Discussion

Using the BERT model, our novel study focuses on the binary classification of the green IT factors for sustainability. For this purpose, we used the PROMISE_exp dataset that is widely used in software engineering research. PROMISE_exp is a well-tested dataset by many research studies [62]. To make a novel contribution to research, we extended the PROMISE_exp dataset. PROMISE_exp dataset contains 969 requirements. We extended this dataset by adding 61 new instances from the healthcare system project. The extended dataset contains 1030 requirements, with 478 functional and 552 different nonfunctional requirements. Before the extension, the dataset contained 11 different types of nonfunctional requirements, discussed above. After the extension, the dataset contained 18 different types of nonfunctional requirements. We analyzed sustainability dimensions that directly influence green IT aspects. After conducting literature research, we found four sustainability dimensions. By merging them into two groups, we identified eco-technical and socio-economic sustainability classes. We added one extra column in an extended dataset with the title “sustainability class”. Experts analyzed all NFRs and suggested good sustainability labels for specific NFRs. After labeling, the dataset contained 403 socio-economic and 149 eco-technical labels. After the labeling process, we conducted feature engineering of our extended dataset. We applied the BERT model for the accurate classification of sustainability factors. The BERT model tokenized the dataset and created training and testing datasets. We fine-tuned the model on the extended dataset to find the accuracy of the model. We selected 16 epochs with batch sizes of 16 and 32. Our model is not overfit due to this epoch number. To assess whether our model was overfitting or not, we examined the training and validation accuracies/losses across epochs. 
Overfitting happens when the model learns to perform well on the training data but fails to generalize to unseen data. We trained the model for 16 epochs and recorded the training and validation accuracies and losses at each epoch. The training accuracy gradually increases from 0.63 to 0.99, which is promising, and the training loss reduces continuously, which indicates that the model is learning from the training data. The testing accuracy gradually increases from 0.71 to 0.91. The testing loss reduces at the start but then slightly increases towards the end. Based on these observations, we conclude that our model is not overfit. We created a confusion matrix to analyze the performance. The confusion matrix shows that the model achieves high accuracy. According to the performance metrics, our study achieved its highest accuracy of 90% with 16 epochs and a batch size of 32, a level of accuracy not achieved before. One study [63] used the BERT model for sustainability factor classification in the PROMISE_exp dataset and achieved an overall accuracy of 87%. Our model improves the accuracy to 90% by using 16 epochs and a batch size of 32, which is the main contribution of our research study. Overall, the proposed model improves the performance metric values by 3.4% in accuracy, 3% in precision, 3.4% in recall, and 16% in F1 score compared to the previous study [63].
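The overfitting check described in this paragraph compares training and testing curves across epochs. A minimal sketch of two such diagnostics follows. Only the first and last accuracy values are taken from the text. The intermediate points and all loss values are invented placeholders, and the function names are hypothetical.

```python
def generalization_gap(train_acc, val_acc):
    """Per-epoch difference between training and validation accuracy."""
    return [t - v for t, v in zip(train_acc, val_acc)]


def epochs_with_rising_val_loss(val_loss):
    """Return 1-based epoch numbers whose validation loss exceeds the previous epoch's."""
    return [i + 1 for i in range(1, len(val_loss)) if val_loss[i] > val_loss[i - 1]]


# Endpoints (0.63 -> 0.99 and 0.71 -> 0.91) come from the text; the
# intermediate accuracies and every loss value are illustrative only.
train_acc = [0.63, 0.85, 0.95, 0.99]
val_acc = [0.71, 0.84, 0.90, 0.91]
val_loss = [0.60, 0.40, 0.35, 0.38]

gaps = generalization_gap(train_acc, val_acc)
print("final train/validation gap:", round(gaps[-1], 2))
print("epochs with rising validation loss:", epochs_with_rising_val_loss(val_loss))
```

A widening gap together with a sustained rise in validation loss is the usual warning sign; a small, late increase such as the one described in the text is a borderline case that is worth monitoring.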

Incorporating sustainability factors into requirements engineering is very beneficial in terms of reducing resource and energy consumption. In terms of energy consumption, sustainable requirements engineering may help to introduce algorithms that can save energy and resources. The integration of sustainability into requirements engineering makes software projects more cost and resource effective [77]. Sustainable requirements engineering is useful for making business processes that are eco-friendly, such as reducing paper usage by automating manual processes. We can follow environmental standards and rules by incorporating sustainability into our development processes. The integration of sustainability into requirements engineering helps us classify the nonfunctional requirements into sustainable and non-sustainable requirements categories.

NFRs such as reliability, performance, and maintainability are the major requirements that affect sustainability goals. Performance is directly related to energy consumption and resource utilization. Performance optimization helps us to reduce energy consumption and improve the efficiency of resource management. Reliability ensures that a system requires less maintenance, which helps to reduce electronic waste. Scalability helps a system avoid adding extra resources, which also reduces energy consumption. Mapping NFRs to sustainable requirements makes a system more efficient and more eco-friendly. Through sustainable requirements, we can build systems that fulfill the socio-economic and eco-technical objectives of software projects. Risk management can also be handled through efficient, sustainable requirements engineering, because sustainability addresses the risks of software systems related to energy and resource consumption [78]. Incorporating sustainability factors in the software development process helps us to create systems that not only meet functional requirements but also have an impact on social, economic, environmental, and technical factors. Nowadays, many organizations follow sustainability practices in their software projects to gain maximum benefit. Sustainability is not limited to software projects; it can be applied in all fields. Sustainable requirements engineering fosters technical contributions in software development areas such as energy efficiency, sustainable design, green IT practices, eco-friendly tools, community management, and regulatory compliance. Sustainable requirements engineering not only increases sustainability but also fulfills the software’s social, environmental, economic, and technical needs.

5.5. Threats to Validity

Many threats to validity exist in this study.

5.5.1. Internal Validity

The selection of the BERT model and the green IT factors produced threats to the internal validity of the research [63]. We tried to deal with this by giving the reasons for choosing that model and those factors. In this way, we have also reduced the threat of personal biases in model selection.

5.5.2. External Validity

External validity concerns how far the proposed methodology, results, and outcomes generalize. To reduce this threat, we implemented the BERT model on the most commonly used dataset, PROMISE_exp. The results obtained on this dataset reduce this threat.

5.5.3. Evaluation Validity

Our study used four evaluation metrics (accuracy, precision, recall, and F1 score). Evaluation metrics indicate the quality of a machine learning model [79]. However, these metrics may not fully capture the effectiveness of gathering and classifying sustainable requirements. This threat can be mitigated by using robust evaluation metrics that go beyond the traditional ones and give effective results.

5.5.4. Stakeholders’ Involvement Validity

Obtaining and integrating sustainable requirements with the BERT model is difficult. Different stakeholders have different environments and priorities. Stakeholders have different requirements and preferences regarding software systems [80]. This threat can be removed by taking continuous feedback from all stakeholders during the development process. However, this process consumes time and resources.

6. Conclusions

In this research study, we present an innovative two-step methodology that focuses on connecting NFRs to sustainability aspects affecting green IT factors. In contrast to studies that use machine learning methods only to classify nonfunctional requirements, our proposed study focuses on understanding the true semantics of NFRs and interpreting their sustainability goals. Two labels, “socio-economic” and “eco-technical”, were created and added to the extended PROMISE_exp dataset to support the binary text classification. We labeled all nonfunctional requirements of the extended PROMISE_exp dataset. Moreover, we used the BERT model with its learning capabilities for binary classification and to identify the NFRs’ semantics within the context of green requirements engineering. We analyzed and monitored the performance of the model to assess its impact on classification. Our model gives good results. We achieved 89.1% accuracy with batch size 16 and 16 epochs. The confusion matrix showed an acceptable degree of accuracy in the predictions for both the socio-economic and eco-technical cases. To enhance the accuracy, we increased the batch size to 32. With batch size 32 and 16 epochs, we achieved 90% accuracy, which is the highest accuracy achieved to date. The performance metrics show improvement in all parameters (accuracy = 3.4%; precision = 3%; recall = 3.4%; and F1 score = 16%) compared to the previous study. Moreover, the confusion matrix shows a reduction in false positives, which indicates an improved predictive capability. These results show the potential of the BERT model, which achieves high values across the performance metrics. This study serves as a base for research in sustainable requirements engineering. 
The accurate classification of sustainability factors through the BERT model shows the effectiveness of the model with the PROMISE_exp dataset. Our research study established a relationship between NFRs and green IT factors. By integrating NFRs with sustainable green IT factors, organizations can reduce the environmental impact of their systems. We can achieve sustainable goals by integrating relevant NFRs into the system. Our proposed model ensures that systems meet technical as well as environmental constraints. It will open new horizons for a sustainable technological future. In the future, we can improve the accuracy further through a hybrid approach or take multiple datasets to show the effectiveness of the BERT model across different datasets. We can also perform further model tuning, depending on how significant each statistic is in a given application.

Author Contributions

S.H.: conceptualization, methodology, writing—original draft preparation. Q.L.: supervision, writing—review and editing, project administration. M.Z.: formal analysis, writing—review and editing. R.A.A.: data visualization, writing—review and editing, funding acquisition. M.A.Q.: formal analysis, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

The authors of this study extend their appreciation to the Researchers Supporting Project number (RSPD2024R544), King Saud University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare that they have no conflicts of interest to report.

References

Manotas, I.; Bird, C.; Zhang, R.; Shepherd, D.; Jaspan, C.; Sadowski, C.; Pollock, L.; Clause, J. An Empirical Study of Practitioners’ Perspectives on Green Software Engineering. In Proceedings of the 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), Austin, TX, USA, 14–22 May 2016.
Mireles, G.A.G.; Moraga, M.Á.; García, F.; Piattini, M. A classification approach of sustainability aware requirements methods. In Proceedings of the 2017 12th Iberian Conference on Information Systems and Technologies (CISTI), Lisbon, Portugal, 14–17 June 2017.
Kern, E.; Dick, M.; Naumann, S.; Guldner, A.; Johann, T. Green software and green software engineering–definitions, measurements, and quality aspects. In Proceedings of the First International Conference on Information and Communication Technologies for Sustainability (ICT4S2013), Zurich, Switzerland, 14–16 February 2013.
Singh, S.; Tiwari, A.; Rastogi, S.; Sharma, V. Green and Sustainable Software Model for IT Enterprises. In Proceedings of the 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2–4 December 2021.
Dias Canedo, E.; Cordeiro Mendes, B. Software requirements classification using machine learning algorithms. Entropy 2020, 22, 1057.
Hey, T.; Keim, J.; Koziolek, A.; Tichy, W.F. NoRBERT: Transfer learning for requirements classification. In Proceedings of the IEEE 28th International Requirements Engineering Conference, Zurich, Switzerland, 31 August–4 September 2020.
Quba, G.Y.; Al Qaisi, H.; Althunibat, A.; AlZu’bi, S. Software Requirements Classification using Machine Learning algorithm’s. In Proceedings of the 2021 International Conference on Information Technology (ICIT), Amman, Jordan, 14–15 July 2021.
Qasim, R.; Bangyal, W.H.; Alqarni, M.A.; Ali Almazroi, A. A finetuned BERT-based transfer learning approach for text classification. J. Healthc. Eng. 2022, 2022, 3498123.
Pipalia, K.; Bhadja, R.; Shukla, M. Comparative analysis of different transformer based architectures used in sentiment analysis. In Proceedings of the 2020 9th International Conference System Modeling and Advancement in Research Trends (SMART), Moradabad, India, 4–5 December 2020.
Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv 2019, arXiv:1907.11692.
Moreira, A.; Araújo, J.; Gralha, C.; Goulão, M.; Brito, I.S.; Albuquerque, D. A social and technical sustainability requirements catalogue. Data Knowl. Eng. 2023, 143, 102107.
Venters, C.C.; Capilla, R.; Nakagawa, E.Y.; Betz, S.; Penzenstadler, B.; Crick, T.; Brooks, I. Sustainable software engineering: Reflections on advances in research and practice. Inf. Softw. Technol. 2023, 164, 10731.
Ferreira, M.D.A.V.; Morgado, C.D.R.V.; Lins, M.P.E. Organizations and stakeholders’ roles and influence on implementing sustainability requirements in construction projects. Heliyon 2024, 10, e23762.
Murugesan, S. Harnessing green IT: Principles and practices. IT Prof. 2008, 10, 24–33.
Ayoola, B. Evaluation of Stakeholder Mapping and Personas for Sustainable Software Development. In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), Melbourne, Australia, 17–19 May 2023.
Penzenstadler, B. Towards a definition of sustainability in and for software engineering. In Proceedings of the 28th Annual ACM Symposium on Applied Computing, Coimbra, Portugal, 18–22 March 2013.
Malik, M.N.; Khan, H.H. Investigating software standards: A lens of sustainability for software crowdsourcing. IEEE Access 2018, 6, 5139–5150.
Bambazek, P.; Groher, I.; Seyff, N. Sustainability in Agile Software Development: A Survey Study among Practitioners. In Proceedings of the 2022 International Conference on ICT for Sustainability (ICT4S), Plovdiv, Bulgaria, 14–16 June 2022.
United Nation. Department of Economic and Social Affairs, the 17 Goals. Available online: https://sdgs.un.org/goals (accessed on 2 September 2023).
Wu, J.; Guo, S.; Huang, H.; Liu, W.; Xiang, Y. Information and communications technologies for sustainable development goals: State-of the art, needs and perspectives. IEEE Commun. Surv. Tuts. 2018, 20, 2389–2406.
Karita, L.; Mourão, B.C.; Martins, L.A.; Soares, L.R.; Machado, I. Software industry awareness on sustainable software engineering: A Brazilian perspective. J. Softw. Eng. Res. Dev. 2021, 9, 2–15.
Javeed, A.; Khan, M.Y.; Rehman, M.; Khurshid, A. Tracking sustainable development goals—A case study of Pakistan. J. Cult. Herit. Manag. Sustain. Dev. 2021, 12, 478–496.
Bambazek, P.; Groher, I.; Seyff, N. Requirements engineering for sustainable software systems: A systematic mapping study. Requir. Eng. 2023, 28, 481–505.
Silveira, C.; Reis, L. Sustainability in software engineering: A design science research approach. In Proceedings of the ERAZ, Prague, Czech Republic, 1–3 July 2022. [Google Scholar]
Penzenstadler, B.; Bauer, V.; Calero, C.; Franch, X. Sustainability in software engineering: A systematic literature review. In Proceedings of the 16th International Conference Evaluation & Assessment Software Engineering (EASE), Ciudad Real, Spain, 14–15 May 2012. [Google Scholar]
Bambazek, P.; Groher, I.; Seyff, N. Application of the Sustainability Awareness Framework in Agile Software Development. In Proceedings of the 2023 IEEE 31st International Requirements Engineering Conference (RE), Hannover, Germany, 4–8 September 2023. [Google Scholar]
Andrikopoulos, V.; Boza, R.D.; Perales, C.; Lago, P. Sustainability in Software Architecture: A Systematic Mapping Study. In Proceedings of the 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Gran Canaria, Spain, 31 August–2 September 2022. [Google Scholar]
Kern, E.; Dick, M.; Johann, T.; Naumann, S. Green Software and Green IT: An End Users Perspective. In Green IT Engineering: Concepts, Models, Complex Systems Architectures; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
Bambazek, P.; Groher, I.; Seyff, N. Requirements Engineering Knowledge as a Foundation for a Sustainability-Aware Scrum Framework. In Proceedings of the 2023 IEEE 31st International Requirements Engineering Conference (RE), Hannover, Germany, 4–8 September 2023. [Google Scholar]
Noman, H.; Mahoto, N.A.; Bhatti, S.; Abosaq, H.A.; Al Reshan, M.S.; Shaikh, A. An exploratory study of software sustainability at early stages of software development. Sustainability 2022, 14, 8596. [Google Scholar] [CrossRef]
Quernheim, N.; Schleich, B. Integrating Sustainability Requirements into Product Development Based on Sustainability Reporting Frameworks. Procedia CIRP 2024, 122, 551–556. [Google Scholar] [CrossRef]
Rusch, F.; Demke, N.; Willems, W.; Mantwill, F. Context-based Derivation of Holistic Sustainability Requirements in the Early Phase of Product Development. Procedia CIRP 2024, 122, 306–311. [Google Scholar] [CrossRef]
Petersen, M. How Corporate Sustainability Affects Product Developers’ Approaches toward Improving Product Sustainability. IEEE Trans. Eng. Manag. 2021, 68, 955–969. [Google Scholar] [CrossRef]
Silveira, C.; Santos, V.; Reis, L.; Mamede, H. A new Approach to Sustainability and Creativity in Requirements Engineering. In Proceedings of the 16th Iberian Conference on Information Systems and Technologies (CISTI), Chaves, Portugal, 23–26 June 2021. [Google Scholar]
McGuire, S.; Schultz, E.; Ayoola, B.; Ralph, P. Sustainability is Stratified: Toward a Better Theory of Sustainable Software Engineering. In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, 14–20 May 2023. [Google Scholar]
Jabborov, A.; Kharlamova, A.; Kholmatova, Z.; Kruglov, A.; Kruglov, V.; Succi, G. Taxonomy of Quality Assessment for Intelligent Software Systems: A Systematic Literature Review. IEEE Access 2023, 11, 130491–130507. [Google Scholar] [CrossRef]
Ameller, D.; Franch, X.; Cabot, J. Dealing with Nonfunctional Requirements in model-driven development: A Survey. IEEE Trans. Softw. Eng. 2021, 47, 818–835. [Google Scholar] [CrossRef]
Jarzębowicz, A.; Weichbroth, P. A qualitative study on non-functional requirements in agile software development. IEEE Access 2021, 9, 40458–40475. [Google Scholar] [CrossRef]
Kumar, M.S.; Harika, A.; Sushama, C.; Neelima, P. Automated extraction of non-functional requirements from text files: A supervised learning approach. In Handbook of Intelligent Computing and Optimization for Sustainable Development; Wiley: Hoboken, NJ, USA, 2022; pp. 149–170. [Google Scholar]
Kopczyńska, S.; Ochodek, M.; Nawrocki, J. On importance of non-functional requirements in agile software projects—A survey. In Germany: Integrating Research and Practice in Software Engineering; Springer: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
Gezici, B.; Tarhan, A.K. Systematic literature review on software quality for AI-based software. Empirical Softw. Eng. 2022, 27, 66. [Google Scholar] [CrossRef]
Lago, P.; Koçak, S.A.; Crnkovic, I.; Penzenstadler, B. Framing sustainability as a property of software quality. Commun. ACM 2015, 58, 70–78. [Google Scholar] [CrossRef]
Suman, M.W.; Rohtak, M.D.U. A comparative study of software quality models. Int. J. Comput. Sci. Inf. Technol. 2014, 5, 5634–5638. [Google Scholar]
Adams, K.M. Non-Functional Requirements in Systems Analysis and Design; Springer: Cham, Switzerland, 2015; Volume 28. [Google Scholar]
Ali, M.A.; Yap, N.K.; Ghani, A.A.A.; Zulzalil, H.; Admodisastro, N.I.; Najafabadi, A.A. A systematic mapping of quality models for AI systems, software and components. Appl. Sci. 2022, 12, 8700. [Google Scholar] [CrossRef]
Miguel, J.P.; Mauricio, D.; Rodríguez, G. A review of software quality models for the evaluation of software products. Int. J. Softw. Eng. Appl. 2014, 5, 31–53. [Google Scholar] [CrossRef]
Kaur, A. A systematic literature review on empirical analysis of the relationship between code smells and software quality attributes. Arch. Comput. Methods Eng. 2020, 27, 1267–1296. [Google Scholar] [CrossRef]
Al Hinai, M.; Chitchyan, R. Engineering requirements for social sustainability. In ICT Sustainability; Atlantis Press: Amsterdam, The Netherlands, 2016. [Google Scholar]
Saher, N.; Baharom, F.; Romli, R. Identification of Sustainability Characteristics and Sub-Characteristics as Non-Functional Requirement for Requirement Change Management in Agile. Int. J. Sci. Technol. Res. 2020, 9, 5727–5733. [Google Scholar]
Pham, Y.D.; Bouraffa, A.; Maalej, W. ShapeRE: Towards a Multi-Dimensional Representation for Requirements of Sustainable Software. In Proceedings of the IEEE 28th International Requirements Engineering Conference (RE), Zurich, Switzerland, 31 August–4 September 2020. [Google Scholar]
Garscha, P. From Sustainability in Requirements Engineering to a Sustainability-Aware Scrum Framework. In Proceedings of the IEEE 29th International Requirements Engineering Conference (RE), Notre Dame, IN, USA, 20–24 September 2021. [Google Scholar]
Duboc, L.; Penzenstadler, B.; Porras, J.; Akinli Kocak, S.; Betz, S.; Chitchyan, R.; Leifler, O.; Seyff, N.; Venters, C.C. Requirements engineering for sustainability: An awareness framework for designing software systems for a better tomorrow. Requir. Eng. 2020, 25, 469–492. [Google Scholar] [CrossRef]
Garcia-Mireles, G.A. Exploring sustainability from the software quality model perspective. In Proceedings of the 2018 13th Iberian Conference on Information Systems and Technologies (CISTI), Caceres, Spain, 13–16 June 2018. [Google Scholar]
Gjorgjevikj, A.; Mishev, K.; Antovski, L.; Trajanov, D. Requirements Engineering in Machine Learning Projects. IEEE Access 2023, 11, 72186–72208. [Google Scholar] [CrossRef]
Kici, D.; Malik, G.; Cevik, M.; Parikh, D.; Basar, A. A BERT-based transfer learning approach to text classification on software requirements specifications. In Proceedings of the 34th Canadian Conference AI, Vancouver, BC, Canada, 25–28 May 2021. [Google Scholar]
St-Louis, D.; Suryn, W. Enhancing ISO/IEC 25021 quality measure elements for wider application within ISO 25000 series. In Proceedings of the 38th Annual Conference on IEEE Industrial Electronics Society (IECON), Montreal, QC, Canada, 25–28 October 2012. [Google Scholar]
Binkhonain, M.; Zhao, L. A review of machine learning algorithms for identification and classification of non-functional requirements. Expert Syst. Appl. 2019, 1, 100001. [Google Scholar]
Rashwan, A.; Ormandjieva, O.; Witte, R. Ontology-based classification of non-functional requirements in software specifications: A new corpus and SVM-based classifier. In Proceedings of the IEEE 37th Annual Computer Software Applications Conference, Washington, DC, USA, 22–26 July 2013. [Google Scholar]
Ding, J.; Li, Y.; Ni, H.; Yang, Z. Generative text summary based on enhanced semantic attention and gain-benefit gate. IEEE Access 2020, 8, 92659–92668. [Google Scholar] [CrossRef]
Bird, S.; Klein, E.; Loper, E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit; O’Reilly Media: Sebastopol, CA, USA, 2009. [Google Scholar]
Honnibal, M.; Montani, I. spaCy 2: Natural language understanding with Bloom embeddings. Convolutional Neural Netw. Increm. Parsing 2017, 7, 411–420. [Google Scholar]
Lima, M.; Valle, V.; Costa, E.; Lira, F.; Gadelha, B. Software engineering repositories: Expanding the PROMISE database. In Proceedings of the 33rd Brazilian Symposium Software Engineering, Salvador, Brazil, 23–27 September 2019. [Google Scholar]
Subahi, A.F. BERT-Based Approach for Greening Software Requirements Engineering through Non-Functional Requirements. IEEE Access 2023, 11, 103001–103013. [Google Scholar] [CrossRef]
Kadebu, P.; Sikka, S.; Tyagi, R.K.; Chiurunge, P. A classification approach for software requirements towards maintainable security. Sci. Afr. 2023, 19, e01496. [Google Scholar] [CrossRef]
Kadebu, P.; Thada, V.; Chiurunge, P. Security Requirements Extraction and Classification: A Survey. In Proceedings of the 2018 3rd International Conference on Contemporary Computing and Informatics (IC3I), Gurgaon, India, 10–12 October 2018. [Google Scholar]
Gomes, L.; da Silva Torres, R.; Côrtes, M.L. BERT- and TF-IDF-based feature extraction for long-lived bug prediction in FLOSS: A comparative study. Inf. Softw. Technol. 2023, 160, 107217. [Google Scholar] [CrossRef]
Khadhraoui, M.; Bellaaj, H.; Ammar, M.B.; Hamam, H.; Jmaiel, M. Survey of BERT-Base Models for Scientific Text Classification: COVID-19 Case Study. Appl. Sci. 2022, 12, 2891. [Google Scholar] [CrossRef]
Ketkar, N.; Moolayil, J.; Ketkar, N.; Moolayil, J. Introduction to PyTorch. In Deep Learning with Python; Apress: Berkeley, CA, USA, 2021. [Google Scholar]
Penzenstadler, B.; Raturi, A.; Richardson, D.; Tomlinson, B. Safety, Security, Now Sustainability: The Nonfunctional Requirement for the 21st Century. IEEE Softw. 2014, 31, 40–47. [Google Scholar] [CrossRef]
Condori-Fernandez, N.; Lago, P. Characterizing the contribution of quality requirements to software sustainability. J. Syst. Softw. 2018, 137, 289–305. [Google Scholar] [CrossRef]
Bilal, M.; Almazroi, A.A. Effectiveness of Fine-tuned BERT Model in Classification of Helpful and Unhelpful Online Customer Reviews. Electron. Commer. Res. 2023, 23, 2737–2757. [Google Scholar] [CrossRef]
Sun, J.W.; Bao, J.Q.; Bu, L.P. Text Classification Algorithm Based on TF-IDF and BERT. In Proceedings of the 11th International Conference of Information and Communication Technology (ICTech), Wuhan, China, 4–6 February 2022. [Google Scholar]
GeeksforGeeks. Evaluation Metrics in Machine Learning. Sanchhaya Education Private Limited. Available online: https://www.geeksforgeeks.org/metrics-for-machine-learning-model/ (accessed on 26 June 2024).
Ul Hassan, I.; Ali, R.H.; Ul Abideen, Z.; Khan, T.A.; Kouatly, R. Significance of machine learning for detection of malicious websites on an unbalanced dataset. Digital 2022, 2, 501–509. [Google Scholar] [CrossRef]
Wong, T.T. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognit. 2015, 48, 2839–2846. [Google Scholar] [CrossRef]
Wong, T.T.; Yeh, P.Y. Reliable Accuracy Estimates from k-Fold Cross Validation. IEEE Trans. Knowl. Data Eng. 2020, 32, 1586–1594. [Google Scholar] [CrossRef]
Saputri, T.R.D.; Lee, S.W. Addressing sustainability in the requirements engineering process: From elicitation to functional decomposition. J. Softw. Evol. Proc. 2020, 32, e2254. [Google Scholar] [CrossRef]
Schulte, J.; Knuts, S. Sustainability impact and effects analysis—A risk management tool for sustainable product development. Sustain. Prod. Consum. 2022, 30, 737–751. [Google Scholar] [CrossRef]
Zhou, J.; Gandomi, A.H.; Chen, F.; Holzinger, A. Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics. Electronics 2021, 10, 593. [Google Scholar] [CrossRef]
Shen, Y.; Breaux, T. Stakeholder Preference Extraction From Scenarios. IEEE Trans. Softw. Eng. 2024, 50, 69–84. [Google Scholar] [CrossRef]

Figure 1. The overall workflow of the proposed methodology.

Figure 2. Dataset extension and labeling.

Figure 3. Distribution of NFRs in the extended dataset.

Figure 4. Labeling process in the extended dataset.

Figure 5. Training process and BERT model development.

Figure 6. Sustainability dimensions for BERT model training.

Figure 7. Confusion matrix with epoch number 16 and batch size 16.

Figure 8. Confusion matrix with epoch number 16 and batch size 32.

Table 1. Relationship between green IT aspects and sustainability concerns.

Sr. No.	Green IT Aspects	Sustainability Concerns
1	Sustainable Procurement	Creating procedures to permit organizations to improve their productivity, increase their revenue, reduce costs, and maximize the benefits of employees. Increasing computer device usage and the removal of IT equipment. It supports software systems that are scalable, expandable, and upgradable. Adopting procurement practices that are environmentally friendly. Selecting IT services and systems that are long lasting.
2	Energy Efficient	Increasing the energy efficiency of computer services and operations. This can be achieved in data centers by installing cooling fans and using energy-efficient computer systems. Reducing the environmental influence of electronic waste. This can be achieved by disposing of old IT equipment such as batteries, screens, and processors. Training employees with the knowledge and skills to adopt social and eco-friendly software practices. This includes increasing employee productivity and reducing costs.
3	Resource Utilization	Increasing computer device usage and the removal of IT equipment. It supports software systems that are scalable, expandable, and upgradable. Adopting procurement practices that are environmentally friendly. Selecting those seller companies that support environmental aspects.
4	E-waste Management and Recycling	Minimizing the environmental influence of electronic waste. This can be achieved by disposing of old IT equipment. Increasing the energy efficiency of computer services and operations. This can be achieved in data centers by server virtualization and using energy-efficient computer systems.

Table 2. Common nonfunctional requirements with description [38,39,42].

Sr. No.	NFRs	Description
1	Availability (A)	It is defined as how much time the software is ready to use for its users. Availability can be calculated by the amount of time that the system is available for users.
2	Legal (L)	It is defined as a software characteristic that fulfills industry standards and organizational rules and minimizes legal disputes.
3	Look & Feel (LF)	It is defined as a software property that shows the software to be attractive and worthwhile to users.
4	Performance (PE)	It is a software property that deals with many factors, such as response time, throughput, and time constraints of the software system.
5	Fault Tolerance (FT)	The property of the software that deals with the detection of and recovery from faults. It enables the software to work properly despite the presence of errors and failures.
6	Operational (O)	The property of the software system that enables the software to work properly in its environment according to all stakeholders.
7	Security (SE)	It is a quality attribute that deals with the security and privacy of the system from unauthorized access, malware, and hacker attacks. This property ensures the confidentiality of the system.
8	Scalability (SC)	The property that enables the software to increase its functions and operations without compromising its performance.
10	Maintainability (MN)	It is defined as a software property that deals with the traceability of the system. It finds how much time a system utilizes to identify errors and manage requirement changes.
11	Accessibility (AC)	It is defined as how much a software system is accessible to specific users with efficiency.
12	Portability (PO)	The property that enables the software to run under different computing environments.
13	Efficiency (EF)	It is defined as how much time a system requires to execute a specific function.
14	Interoperability (IN)	It is defined as the capability of a software system to integrate all its sub-components. It focuses on software compatibility and coupling.
15	Usability (US)	It is defined as how user-friendly a system is. It deals with ease of use and effectiveness for users within the system.
16	Reliability (RE)	It deals with the software system’s ability to perform its operations without errors and failure under different conditions.

Table 3. Linkage between nonfunctional requirements and green IT factors.

Sr. No.	NFRs	Green IT Factors	Impact on Green IT Factors
1	Scalability	Resource Utilization	Scalability ensures that a system can handle excessive load without adding extra resources. Scalability can reduce waste by proper resource management. Cloud computing and server virtualization are important techniques for efficient resource management.
2	Performance	Energy Consumption	Good performance can minimize the energy consumption of the software system. Optimized algorithms and proper resource management ensure the system performs faster and consumes less energy.
3	Usability	Resource Utilization	User-friendly software systems can minimize time and resources and contribute to sustainability.
4	Maintainability	Environment Friendly	Easily maintainable software systems require low effort and resources to update, which enhances sustainability.
5	Availability	Resource Utilization, Environment Friendly	Making systems available all the time can increase their energy consumption. We can reduce energy consumption by integrating smart energy management practices.
6	Security	Energy-intensive Recovery	Secure and privacy-preserving software systems can handle cyber-attacks easily, which is useful for the energy-intensive recovery process.
7	Efficiency	Energy Consumption, Resource Utilization	Efficiency is directly related to green IT factors. Efficient software systems can cause reduced energy consumption and better resource management.
8	Interoperability	Resource Management	Software systems that interact with other systems can mitigate redundant resources and improve the system’s efficiency.
9	Portability	Energy-efficient Environment	Portable systems run in different environments, consuming less energy.
10	Reliability	Energy Consumption	Reliable systems consume less energy due to reduced downtime and minimize the need for maintenance.

Table 4. Nonfunctional requirements and their impact on technical sustainability.

Sr. No.	NFR	Impact
1	O	Software should operate in such a manner that it improves productivity and satisfaction in an organization.
2	SE	Enabling the software system to be used in a safe and secure environment. Securing the system from unauthorized access and attacks [57].
3	PO	Confirm that the majority of people can benefit from the software system regardless of its working environment. Ensuring access to technology and service for a varied range of people.
4	F	Ensuring the usage of technology and resources efficiently.
5	AC	Ensure that the system can be accessible and usable to a variety of users.
6	L	Building confidence and trust among people to use the system for a long period.
7	LF	Ensuring user interaction with the system efficiently. Ensuring that the system’s interface is attractive and useful for people coming from different regions around the globe.
8	US	Confirming that the system is usable and accessible to a variety of users, including those who have little knowledge of software systems.
9	A	Ensuring users’ confidence, trust, and ownership of the system. Improving productivity by giving access to all users regardless of time, location, or knowledge level.

Table 5. Nonfunctional requirements and their impact on environmental sustainability.

Sr. No.	NFR	Impact
1	IN	Reducing the duplication of hardware and software ensures less energy consumption.
2	EF	Through efficiency, we can reduce energy consumption.
3	FT	Endorsing an environmentally friendly structure. Ensuring the proper use of resources during system failure or error.
4	PE	Decreasing the time required to complete operations. Reducing memory, CPU, and resource usage.
5	MN	Managing the software properly by decreasing the demand for software replacement.
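The grouping in Tables 4 and 5 can be read as a lookup from NFR abbreviation to sustainability dimension. The following is a minimal sketch of that lookup, not the authors' labeling code: the label strings `"technical"` and `"environmental"`, the row keys `"text"`, `"nfr"`, and `"sustainability"`, and the sample requirement sentences are all assumptions made for illustration. The code `F` is kept exactly as it is printed in Table 4.

```python
# NFR abbreviations copied from Table 4 (technical) and Table 5 (environmental).
# "F" is reproduced as printed in Table 4; it is not expanded here.
TECHNICAL_NFRS = {"O", "SE", "PO", "F", "AC", "L", "LF", "US", "A"}
ENVIRONMENTAL_NFRS = {"IN", "EF", "FT", "PE", "MN"}


def sustainability_label(nfr_code):
    """Return the sustainability dimension for an NFR abbreviation, or None."""
    code = nfr_code.strip().upper()
    if code in TECHNICAL_NFRS:
        return "technical"        # assumed label name
    if code in ENVIRONMENTAL_NFRS:
        return "environmental"    # assumed label name
    return None                   # abbreviation not covered by Tables 4 and 5


def add_label_column(rows):
    """Return copies of the rows with an added 'sustainability' key (hypothetical name)."""
    return [dict(row, sustainability=sustainability_label(row["nfr"])) for row in rows]


# Hypothetical requirement rows, not taken from the PROMISE_exp dataset.
sample_rows = [
    {"text": "The system shall respond within two seconds.", "nfr": "PE"},
    {"text": "Only authorized users shall access the records.", "nfr": "SE"},
]
labeled_rows = add_label_column(sample_rows)
```

A dictionary-free set lookup keeps the two tables visibly separate, so an abbreviation that appears in neither table is returned as `None` instead of being forced into a group.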

Table 6. Basic formulas to find values of performance metrics [5].

Sr. No.	Performance Metric	Equation
1	Precision	$\frac{TP}{TP + FP}$
2	Recall	$\frac{TP}{TP + FN}$
3	Accuracy	$\frac{TP + TN}{TP + FN + TN + FP}$
4	F1 Score	$\frac{2 \times (\mathrm{Precision} \times \mathrm{Recall})}{\mathrm{Precision} + \mathrm{Recall}}$
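The formulas in Table 6 can be sketched directly from the four confusion-matrix counts. The function name `binary_metrics` and the counts in the example are hypothetical, chosen only to make the arithmetic easy to follow; they are not results from this study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute the four Table 6 metrics from confusion-matrix counts."""
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


# Hypothetical counts for illustration only.
example = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

With these counts, precision is 40/50, recall is 40/45, accuracy is 85/100, and the F1 score reduces to 80/95.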

Table 7. Performance metrics with batch size 16 and epoch number 16. The best-achieved values are in bold, while the second-highest values are underlined.

Sr. No.	Epoch Number	Accuracy	Precision	Recall	F1 Score
1	1	0.738	0.545	0.738	0.627
2	2	0.837	0.846	0.837	0.841
3	3	0.891	0.889	0.891	0.889
4	4	0.882	0.881	0.882	0.882
5	5	0.855	0.859	0.855	0.857
6	6	0.837	0.858	0.837	0.843
7	7	0.846	0.845	0.846	0.845
8	8	0.864	0.875	0.864	0.868
9	16	0.883	0.884	0.882	0.883

Table 8. Performance metrics with batch size 32 and epoch number 16. The best-achieved values are in bold, while the second-highest values are underlined.

Sr. No.	Epoch Number	Accuracy	Precision	Recall	F1 Score
1	1	0.738	0.545	0.738	0.627
2	2	0.747	0.811	0.747	0.648
3	3	0.864	0.866	0.864	0.865
4	4	0.864	0.870	0.864	0.866
5	5	0.828	0.853	0.828	0.831
6	6	0.756	0.838	0.756	0.771
7	7	0.891	0.893	0.891	0.885
8	8	0.882	0.884	0.882	0.875
9	16	0.900	0.899	0.900	0.900
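In Tables 7 and 8 the recall column matches the accuracy column in nearly every row. That pattern is what support-weighted averaging over classes produces, because weighted recall is algebraically equal to accuracy for single-label classification. Whether the authors used weighted averaging is an inference from the numbers, not something stated in these tables. The sketch below demonstrates the identity on made-up labels; the function name and the label lists are hypothetical.

```python
def weighted_recall_and_accuracy(y_true, y_pred):
    """Return (support-weighted recall, accuracy) for single-label predictions."""
    n = len(y_true)
    weighted_recall = 0.0
    for label in sorted(set(y_true)):
        # Support: number of true instances of this class.
        support = sum(1 for t in y_true if t == label)
        # Hits: instances of this class that were predicted correctly.
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        # Per-class recall (hits / support) weighted by class share (support / n).
        weighted_recall += (support / n) * (hits / support)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return weighted_recall, accuracy


# Made-up labels for illustration: 6 instances of class 0 and 4 of class 1.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
w_recall, acc = weighted_recall_and_accuracy(y_true, y_pred)
```

Here class 0 contributes 0.6 × 5/6 and class 1 contributes 0.4 × 2/4, so both quantities come out to 7/10. Weighted precision and weighted F1 have no such identity, which is why those columns can differ from accuracy.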

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
