Decision-Making and Data Mining for Sustainable Computing

Journal Name	Impact Factor	CiteScore	Launched Year	First Decision (median)	APC
Algorithms	2.1	4.5	2008	17.8 Days	CHF 1800
Data	2.0	5.0	2016	25.2 Days	CHF 1600
Information	2.9	6.5	2010	18.6 Days	CHF 1800
Mathematics	2.2	4.6	2013	18.4 Days	CHF 2600
Symmetry	2.2	5.3	2009	17.1 Days	CHF 2400

22 pages, 6282 KB

Open Access Article

CropsDisNet: An AI-Based Platform for Disease Detection and Advancing On-Farm Privacy Solutions

by Mohammad Badhruddouza Khan, Salwa Tamkin, Jinat Ara, Mobashwer Alam and Hanif Bhuiyan

Data 2025, 10(2), 25; https://doi.org/10.3390/data10020025 - 18 Feb 2025

Cited by 1 | Viewed by 2958

Abstract

Crop failure is defined as crop production that is significantly lower than anticipated, resulting from plants that are harmed, diseased, destroyed, or influenced by climatic circumstances. With rising concern over global food security, early detection of crop diseases has proven pivotal for agricultural industries in addressing both the global food crisis and on-farm data protection, needs that can be met with a privacy-preserving deep learning model. However, deep learning models are largely black boxes that are hard to interpret, which makes groundwork on model interpretability a prerequisite. Considering this, the aim of this study was to establish a robust custom deep learning model named CropsDisNet, evaluated on a large-scale dataset named “New Bangladeshi Crop Disease Dataset (corn, potato and wheat)”, which contains a total of 8946 images. Integrating a differential privacy algorithm into the CropsDisNet model could deliver the benefits of automated crop disease classification without compromising on-farm data privacy, by reducing training data leakage. To classify corn, potato, and wheat leaf diseases, we used three representative CNN models for image classification (VGG16, Inception Resnet V2, Inception V3) along with our custom model, and the classification accuracy for these three different crops varied from 92.09% to 98.29%. In addition, a demonstration of the model’s interpretability gave us insight into the model’s decision making and classification results, which can allow farmers to understand them and take appropriate precautions in the event of early widespread harvest failure and food crises.
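
As a rough illustration of how a differential privacy algorithm can limit training data leakage, the sketch below clips each per-example gradient to a fixed L2 norm and adds Gaussian noise before averaging, in the style of DP-SGD. The abstract does not say which algorithm CropsDisNet uses, so the function name `private_mean_gradient` and the parameters `clip_norm` and `noise_multiplier` are assumptions made purely for illustration.

```python
import math
import random


def private_mean_gradient(per_example_grads, clip_norm=1.0,
                          noise_multiplier=1.1, rng=None):
    """Average per-example gradients after L2 clipping and Gaussian noising.

    Hypothetical helper: a DP-SGD-style step, not the CropsDisNet code.
    """
    rng = rng or random.Random()
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down any gradient whose L2 norm exceeds clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale
    # The noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]
```

Clipping bounds how much any single training image can move the model, and the noise masks what remains. A formal privacy guarantee additionally needs a privacy accountant that tracks the budget over all training steps, which this sketch leaves out.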

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

23 pages, 7192 KB

Open Access Article

Data Decomposition Modeling Based on Improved Dung Beetle Optimization Algorithm for Wind Power Prediction

by Jiajian Ke and Tian Chen

Data 2024, 9(12), 146; https://doi.org/10.3390/data9120146 - 9 Dec 2024

Cited by 2 | Viewed by 1247

Abstract

Accurate wind power forecasting is essential for maintaining the stability of a power system and enhancing scheduling efficiency in the power sector. To enhance prediction accuracy, this paper presents a hybrid wind power prediction model that integrates the improved complementary ensemble empirical mode decomposition (ICEEMDAN), the RIME optimization algorithm (RIME), sample entropy (SE), the improved dung beetle optimization (IDBO) algorithm, the bidirectional long short-term memory (BiLSTM) network, and multi-head attention (MHA). In this model, RIME is used to optimize the parameters of ICEEMDAN, reducing data decomposition complexity and effectively capturing the original data information. The IDBO algorithm is then used to optimize the hyperparameters of the MHA-BiLSTM model. The proposed RIME-ICEEMDAN-IDBO-MHA-BiLSTM model is compared with ten other models in ablation experiments to validate its performance. The experimental findings show that the proposed model achieves MAPE values of 5.2%, 6.3%, 8.3%, and 5.8% across four datasets, confirming its superior predictive performance and higher accuracy.
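
For readers unfamiliar with the metric reported above, the following minimal sketch computes the mean absolute percentage error (MAPE) using its standard definition. The function name `mape` and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # MAPE divides by the actual value, so zeros make it undefined.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For example, `mape([100, 200, 400], [110, 190, 400])` averages relative errors of 10%, 5%, and 0%, giving about 5%. Because the metric divides by the actual value, periods with zero measured output have to be excluded or handled separately.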

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

24 pages, 830 KB

Open Access Systematic Review

Evolving Strategies in Machine Learning: A Systematic Review of Concept Drift Detection

by Gurgen Hovakimyan and Jorge Miguel Bravo

Information 2024, 15(12), 786; https://doi.org/10.3390/info15120786 - 7 Dec 2024

Cited by 3 | Viewed by 9807

Abstract

In this comprehensive literature review, we rigorously adhere to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for our process and reporting. This review employs an innovative method integrating the advanced natural language processing model T5 (Text-to-Text Transfer Transformer) to enhance the accuracy and efficiency of screening and data extraction processes. We assess strategies for handling concept drift in machine learning using high-impact publications from notable databases that were made accessible via the IEEE and Science Direct APIs. The chronological analysis covering the past two decades provides a historical perspective on methodological advancements, recognizing their strengths and weaknesses through citation metrics and rankings. This review aims to trace the growth and evolution of concept drift mitigation strategies and to provide a valuable resource that guides future research and deepens our understanding of this rapidly changing field. Key findings highlight the effectiveness of diverse methodologies such as drift detection methods, window-based methods, unsupervised statistical methods, and neural network techniques. However, challenges remain, particularly with imbalanced data, computational efficiency, and the application of concept drift detection to non-tabular data like images.
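
To make the idea of a window-based method concrete, here is a minimal sketch that compares the error rate in a fixed reference window with the error rate in a sliding window of recent predictions. It is not one of the detectors surveyed in the review; the class name `WindowDriftDetector` and its `window_size` and `threshold` parameters are assumptions chosen for illustration.

```python
from collections import deque


class WindowDriftDetector:
    """Flag drift when the recent error rate exceeds the reference rate."""

    def __init__(self, window_size=50, threshold=0.2):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = []                      # errors seen first
        self.recent = deque(maxlen=window_size)  # sliding window

    def update(self, error):
        """Feed one 0/1 prediction error; return True when drift is flagged."""
        if len(self.reference) < self.window_size:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window_size:
            return False
        ref_rate = sum(self.reference) / self.window_size
        recent_rate = sum(self.recent) / self.window_size
        if recent_rate - ref_rate > self.threshold:
            # Treat the recent window as the new reference after a drift.
            self.reference = list(self.recent)
            self.recent.clear()
            return True
        return False
```

A fixed threshold keeps the sketch short. Practical detectors usually replace it with a statistical test so that the false-alarm rate can be controlled.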

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

16 pages, 624 KB

Open Access Article

Flexible Techniques to Detect Typical Hidden Errors in Large Longitudinal Datasets

by Renato Bruni, Cinzia Daraio and Simone Di Leo

Symmetry 2024, 16(5), 529; https://doi.org/10.3390/sym16050529 - 28 Apr 2024

Viewed by 1398

Abstract

The increasing availability of longitudinal data (repeated numerical observations of the same units at different times) requires the development of flexible techniques to automatically detect errors in such data. Besides standard types of errors, which can be treated with generic error correction techniques, large longitudinal datasets may present specific problems not easily traceable by the generic techniques. In particular, after applying those generic techniques, time series in the data may contain trends, natural fluctuations and possible surviving errors. To study the data evolution, one main issue is distinguishing those elusive errors from the rest, which should be kept as they are and not flattened or altered. This work responds to this need by identifying some types of elusive errors and by proposing a statistical-mathematical approach to capture their complexity that can be applied after the above generic techniques. The proposed approach is based on a system of indicators and works at the formal level by studying the differences between consecutive values of data series and the symmetries and asymmetries of these differences. It operates regardless of the specific meaning of the data and is thus applicable in a variety of contexts. We apply this approach to a relevant database of European Higher Education institutions (ETER) by analyzing two key variables, “Total academic staff” and “Total number of enrolled students”, which are among the most important variables, are often used in empirical analysis as a proxy for size, and are considered by policymakers at the European level. The results are very promising.
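
The idea of studying differences between consecutive values and their symmetry can be sketched as follows. This is not the authors' system of indicators: the function `flag_isolated_spikes` and its `min_jump` and `tolerance` parameters are hypothetical, and the rule shown (a large jump that is immediately cancelled by a nearly equal jump in the opposite direction) is only one simple pattern that such an approach might look for.

```python
def flag_isolated_spikes(series, min_jump, tolerance=0.5):
    """Return indices whose value jumps away and straight back again."""
    # First differences between consecutive observations.
    diffs = [b - a for a, b in zip(series, series[1:])]
    flagged = []
    for t in range(1, len(series) - 1):
        before, after = diffs[t - 1], diffs[t]
        opposite = before * after < 0
        large = min(abs(before), abs(after)) >= min_jump
        # Symmetric pair: the two jumps nearly cancel each other out.
        symmetric = abs(before + after) <= tolerance * max(abs(before), abs(after))
        if opposite and large and symmetric:
            flagged.append(t)
    return flagged
```

A steady trend or a lasting level shift produces differences that do not cancel, so those values are left untouched. That matches the stated goal of keeping genuine trends and fluctuations as they are.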

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

18 pages, 2117 KB

Open Access Article

A Novel Approach for Data Feature Weighting Using Correlation Coefficients and Min–Max Normalization

by Mohammed Shantal, Zalinda Othman and Azuraliza Abu Bakar

Symmetry 2023, 15(12), 2185; https://doi.org/10.3390/sym15122185 - 11 Dec 2023

Cited by 74 | Viewed by 6892

Abstract

In the realm of data analysis and machine learning, achieving an optimal balance of feature importance, known as feature weighting, plays a pivotal role, especially when considering the nuanced interplay between the symmetry of data distribution and the need to assign differential weights to individual features. Avoiding the dominance of large-scale features is also essential in data preparation, which makes choosing an effective normalization approach one of the most challenging aspects of machine learning. In addition to normalization, feature weighting is another strategy for handling the differing importance of features. One way to measure the dependency between features is the correlation coefficient, which shows the strength of the relationship between them. The integration of normalization with feature weighting in data transformation for classification has not been extensively studied. The goal is to improve the accuracy of classification methods by striking a balance between the normalization step and assigning greater importance to features with a strong relation to the class feature. To achieve this, we combine Min–Max normalization with feature weighting, increasing feature values based on their correlation coefficients with the class feature. This paper presents the proposed Correlation Coefficient with Min–Max Weighted (CCMMW) approach, in which the normalization of the data depends on their correlation with the class feature. Logistic regression, support vector machine, k-nearest neighbor, neural network, and naive Bayesian classifiers were used to evaluate the proposed method on twenty UCI Machine Learning Repository and Kaggle datasets with numerical values. The empirical results showed that the proposed CCMMW significantly improves the classification performance of support vector machine, logistic regression, and neural network classifiers on most datasets.
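
One way to picture the combination of Min–Max normalization and correlation-based weighting is sketched below. The abstract does not give the CCMMW formula, so the weighting rule used here (multiplying each normalized feature by 1 + |r|, where r is its Pearson correlation with the class feature) is an assumption, as are the function names `pearson` and `correlation_weighted_min_max`.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_weighted_min_max(columns, labels):
    """Min-max scale each feature column, then stretch it by 1 + |r|.

    Assumed rule for illustration only, not the published CCMMW formula.
    """
    weighted = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        normalized = [(v - lo) / span if span else 0.0 for v in col]
        weight = 1.0 + abs(pearson(col, labels))
        weighted.append([weight * v for v in normalized])
    return weighted
```

Under this assumed rule, a feature uncorrelated with the class stays in [0, 1], while a strongly correlated one is stretched towards [0, 2]. Distance-based and margin-based classifiers then give it more influence.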

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

33 pages, 2391 KB

Open Access Article

A Tourist-Based Framework for Developing Digital Marketing for Small and Medium-Sized Enterprises in the Tourism Sector in Saudi Arabia

by Rishaa Abdulaziz Alnajim and Bahjat Fakieh

Data 2023, 8(12), 179; https://doi.org/10.3390/data8120179 - 28 Nov 2023

Cited by 14 | Viewed by 6646

Abstract

Social media has become an essential tool for travel planning, with tourists increasingly using it to research destinations, book accommodation, and make travel arrangements. However, little is known about how tourists use social media for travel planning and what factors influence their intentions to use social media for this purpose. This study aims to understand tourists’ intentions to use social media for travel planning. Specifically, it investigates the factors influencing tourists’ intentions to use social media for planning travel to Saudi Arabia. It develops a machine learning (ML) classification model to assist Saudi tourism SMEs in creating effective digital marketing strategies for social media platforms. A survey was conducted with 573 tourists interested in visiting Saudi Arabia, using the Design Science Research (DSR) approach. The findings support the tourist-based theoretical framework, showing that perceived usefulness (PU), perceived ease of use (PEOU), satisfaction (SAT), marketing-generated content (MGC), and user-generated content (UGC) significantly impact tourists’ intentions to use social media for travel planning. Tourists’ characteristics and visit characteristics influenced their intentions to use MGC but not UGC. The tourist-based ML classification model, developed using the LinearSVC algorithm, achieved an accuracy of 99% when evaluated using the K-Fold Cross-Validation (KF-CV) technique. The findings of this study have several implications for Saudi tourism SMEs. First, the results suggest that SMEs should focus on developing social media content that is perceived as useful, easy to use, and satisfying. Second, the findings suggest that SMEs should focus on using MGC in their social media marketing campaigns. Third, the results suggest that SMEs should tailor their social media marketing campaigns to the characteristics of their target tourists.
This study contributes to the literature on tourism marketing and social media by providing a better understanding of how tourists use social media for travel planning. Saudi tourism SMEs can use the findings of this study to develop more effective digital marketing strategies for social media platforms.
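
The K-Fold Cross-Validation procedure mentioned above can be sketched in plain Python as follows. The helper names `k_fold_indices` and `cross_validate`, and the `fit_and_score` callback, are hypothetical; in the study the model fitted on each training split would be a LinearSVC classifier, which is not reproduced here.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


def cross_validate(n_samples, k, fit_and_score):
    """Average the score returned by fit_and_score(train_idx, test_idx)."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

Each sample is used for testing exactly once, so the averaged score is less dependent on a single lucky split. The sketch does not shuffle the data; ordered survey responses would normally be shuffled first.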

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

22 pages, 3120 KB

Open Access Article

The Best Whey Protein Powder Selection via VIKOR Based on Circular Intuitionistic Fuzzy Sets

by Elif Çaloğlu Büyükselçuk and Yiğit Can Sarı

Symmetry 2023, 15(7), 1313; https://doi.org/10.3390/sym15071313 - 27 Jun 2023

Cited by 13 | Viewed by 3097

Abstract

People try very hard to have a symmetrical, strong, and beautiful body. The human body needs a high intake of amino acids for muscle protein synthesis. Whey protein is a good choice because it contains all amino acids, increasing muscle protein synthesis and improving body shape when combined with resistance exercise. For this reason, those who do sports, especially professionals, use these products frequently. A large number of commercial whey protein powders are sold on the market, and individuals want to use the best one to get the maximum benefit. Intuitionistic fuzzy sets are used to minimize the negative effects of uncertainty and ambiguous information in the decision-making process on the solution. In this study, VIKOR based on circular intuitionistic fuzzy sets has been used to determine the best whey protein supplement. The evaluation criteria affecting the selection process have been determined in line with a comprehensive literature review and expert opinions, and the study has focused on solving the selection problem.
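
For orientation, the sketch below implements the classical VIKOR ranking on crisp (non-fuzzy) scores. It does not reproduce the circular intuitionistic fuzzy extension used in the paper, and the function name `vikor`, the `benefit` flags, and the strategy weight `v` are illustrative assumptions.

```python
def vikor(matrix, weights, benefit, v=0.5):
    """Classical VIKOR on crisp scores; a lower Q means a better alternative.

    matrix[i][j] is the score of alternative i on criterion j, and
    benefit[j] is True when a higher score on criterion j is better.
    """
    n_criteria = len(weights)
    best, worst = [], []
    for j in range(n_criteria):
        col = [row[j] for row in matrix]
        best.append(max(col) if benefit[j] else min(col))
        worst.append(min(col) if benefit[j] else max(col))

    s_vals, r_vals = [], []
    for row in matrix:
        terms = []
        for j in range(n_criteria):
            span = best[j] - worst[j]
            # Weighted, normalized distance from the best value on criterion j.
            terms.append(weights[j] * (best[j] - row[j]) / span if span else 0.0)
        s_vals.append(sum(terms))  # group utility
        r_vals.append(max(terms))  # individual regret

    def scaled(values):
        lo, hi = min(values), max(values)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in values]

    q_vals = [v * s + (1 - v) * r
              for s, r in zip(scaled(s_vals), scaled(r_vals))]
    return s_vals, r_vals, q_vals
```

With `v = 0.5` the compromise index Q weighs group utility and individual regret equally, and the alternative with the smallest Q is ranked first. The full method also checks acceptable-advantage and stability conditions before declaring a single winner, which the sketch leaves out.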

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

25 pages, 7323 KB

Open Access Article

Data Preprocessing and Neural Network Architecture Selection Algorithms in Cases of Limited Training Sets—On an Example of Diagnosing Alzheimer’s Disease

by Aleksandr Alekseev, Leonid Kozhemyakin, Vladislav Nikitin and Julia Bolshakova

Algorithms 2023, 16(5), 219; https://doi.org/10.3390/a16050219 - 25 Apr 2023

Cited by 3 | Viewed by 3948

Abstract

This paper aimed to increase the accuracy of an Alzheimer’s disease diagnostic function obtained in a previous study on applying decision roots to the diagnosis of Alzheimer’s disease. The obtained decision root is a discrete switching function of several variables, applied to aggregate a few indicators into one integrated assessment and represented as a superposition of a few functions of two variables. Magnetic susceptibility values of the basal veins and veins of the thalamus were used as indicators. Two categories of patients were used as function values. To increase accuracy, the use of artificial neural networks was suggested, but medical data are characteristically limited, so neural networks trained on limited datasets may be inefficient. The proposed solution is to preprocess the initial datasets to determine the parameters of the neural networks based on decision roots, because it is known that any decision root can be represented as an incompletely connected neural network with a cascade structure. No publicly available specialized software allows the user to set such a complex neural network structure, which is why the number of synaptic coefficients of the incompletely connected neural network was determined. This made it possible to predefine fully connected neural networks comparable in terms of the number of unknown parameters. Acceptable accuracy was obtained for one-layer and two-layer fully connected neural networks trained on limited training sets, using the diagnosis of Alzheimer’s disease as an example. Thus, the scientific hypothesis on preprocessing initial datasets and selecting neural network architecture using special methods and algorithms was confirmed.
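
Matching networks by their number of unknown parameters relies on a simple count, sketched below for a fully connected network. The function name `count_parameters` and the layer sizes in the example are assumptions; the paper's own architectures are not reproduced.

```python
def count_parameters(layer_sizes, use_bias=True):
    """Number of trainable weights in a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Every output unit has one weight per input, plus an optional bias.
        total += (n_in + (1 if use_bias else 0)) * n_out
    return total
```

For instance, a network with two inputs, one hidden layer of three units, and one output has (2 + 1) * 3 + (3 + 1) * 1 = 13 parameters. Comparing such counts is one way to pick a fully connected architecture of similar capacity to a given incompletely connected one.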

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

19 pages, 1771 KB

Open Access Article

Adaptive Kernel Graph Nonnegative Matrix Factorization

by Rui-Yu Li, Yu Guo and Bin Zhang

Information 2023, 14(4), 208; https://doi.org/10.3390/info14040208 - 29 Mar 2023

Cited by 3 | Viewed by 2388

Abstract

Nonnegative matrix factorization (NMF) is an efficient method for feature learning in the field of machine learning and data mining. To investigate the nonlinear characteristics of datasets, kernel-method-based NMF (KNMF) and its graph-regularized extensions have received much attention from researchers due to their promising performance. However, the graph similarity matrix of the existing methods is often predefined in the original space of the data and kept unchanged during the matrix-factorization procedure, which leads to non-optimal graphs. To address this problem, we propose a kernel-graph-learning-based, nonlinear, nonnegative matrix-factorization method, termed adaptive kernel graph nonnegative matrix factorization (AKGNMF). To automatically capture the manifold structure of the data in the nonlinear feature space, AKGNMF learns an adaptive similarity graph. We formulate a unified objective function in which global similarity graph learning is optimized jointly with the matrix decomposition process. A local graph Laplacian is further imposed on the learned feature subspace representation. The proposed method relies on both a factorization that respects geometric structure and the mapped high-dimensional subspace feature representations. In addition, an efficient iterative solution is derived to update all variables in the resultant objective problem in turn. Experiments on a synthetic dataset visually demonstrate the ability of AKGNMF to separate a nonlinear dataset with high clustering accuracy. Experiments on real-world datasets verify the effectiveness of AKGNMF in three aspects: clustering performance, parameter sensitivity, and convergence. Comprehensive experimental findings indicate that the proposed AKGNMF algorithm is effective and outperforms various classic and state-of-the-art methods.
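
As background for the method above, the following sketch shows plain NMF with the well-known multiplicative update rules for the Frobenius-norm objective. It is the basic linear building block only, not AKGNMF: there is no kernel mapping and no graph learning. The function name `nmf` and its parameters are illustrative assumptions, and the code needs NumPy.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random start so the updates never divide by zero.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The columns of `W` act as learned nonnegative basis features and `H` holds the coefficients of each sample. Kernel and graph-regularized variants such as the one proposed here change the objective, and therefore the update rules, while keeping this alternating structure.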

(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)
