Topic Editors

Dr. Sunil Jha
Adani Institute of Infrastructure Engineering, Adani University, Ahmedabad 382421, Gujarat, India
Department of Cognitive Science and Mathematical Modeling, University of Information Technology and Management, 35-225 Rzeszow, Poland
School of Information and Control, Nanjing University of Information Science and Technology, Nanjing 210044, China

Decision-Making and Data Mining for Sustainable Computing

Abstract submission deadline: closed (30 September 2024)
Manuscript submission deadline: closed (30 November 2024)
Viewed by: 19811

Topic Information

Dear Colleagues,

In recent years, statistical and machine learning (ML) approaches to artificial intelligence (AI) have been successfully applied to predictive tasks in several domains of science and engineering. ML algorithms are vital components of automated, accurate, and robust prediction systems built on the analysis of application-specific data. Improving the accuracy of statistical and ML approaches is crucial for predictive modeling in many domains, including healthcare, agriculture, and space. Developing advanced ML approaches and applying them to the analysis of experimental and simulated data remains a challenging area of research. With this objective, the present Issue invites researchers and academicians to submit novel and unpublished research on the current development of statistical and ML approaches to AI in science- and engineering-related predictive modeling applications. This Special Issue is the first in the fields of decision-making and data mining for sustainable computing, and it will cover a broad range of topics related to applications of ML approaches in the analysis of data, including but not limited to the following subtopics.

Dr. Sunil Jha
Dr. Malgorzata Rataj
Dr. Xiaorui Zhang
Topic Editors

Keywords

  • machine learning
  • data mining
  • predictive modeling
  • intelligent forecasting
  • sustainable computing
  • decision making

Participating Journals

Journal Name   Impact Factor   CiteScore   Launched Year   First Decision (median)   APC
Algorithms     1.8             4.1         2008            15 Days                   CHF 1600
Data           2.2             4.3         2016            27.7 Days                 CHF 1600
Information    2.4             6.9         2010            14.9 Days                 CHF 1600
Mathematics    2.3             4.0         2013            17.1 Days                 CHF 2600
Symmetry       2.2             5.4         2009            16.8 Days                 CHF 2400

Preprints.org is a multidisciplinary platform that provides a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to enjoy the benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea from being stolen with this time-stamped preprint article;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (8 papers)

23 pages, 7192 KiB  
Article
Data Decomposition Modeling Based on Improved Dung Beetle Optimization Algorithm for Wind Power Prediction
by Jiajian Ke and Tian Chen
Data 2024, 9(12), 146; https://doi.org/10.3390/data9120146 - 9 Dec 2024
Viewed by 424
Abstract
Accurate wind power forecasting is essential for maintaining the stability of a power system and enhancing scheduling efficiency in the power sector. To enhance prediction accuracy, this paper presents a hybrid wind power prediction model that integrates the improved complementary ensemble empirical mode decomposition (ICEEMDAN), the RIME optimization algorithm (RIME), sample entropy (SE), the improved dung beetle optimization (IDBO) algorithm, the bidirectional long short-term memory (BiLSTM) network, and multi-head attention (MHA). In this model, RIME is used to tune the parameters of ICEEMDAN, reducing data decomposition complexity and effectively capturing the original data information. The IDBO algorithm is then used to tune the hyperparameters of the MHA-BiLSTM model. The proposed RIME-ICEEMDAN-IDBO-MHA-BiLSTM model is compared with ten other models in ablation experiments to validate its performance. The experimental results show that the proposed model achieves mean absolute percentage error (MAPE) values of 5.2%, 6.3%, 8.3%, and 5.8% across four datasets, supporting its higher predictive accuracy.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)
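The MAPE figures quoted in the abstract above follow the standard mean absolute percentage error definition. A minimal sketch of that metric is given below; the function name `mape` and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Standard definition: 100 / n * sum(|(a_i - p_i) / a_i|).
    The function name and the error handling are illustrative choices.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an observed value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up observations and forecasts, for illustration only.
observed = [100.0, 200.0]
forecast = [90.0, 220.0]
error_percent = mape(observed, forecast)
```

Both made-up forecasts are off by 10% of the observed value, so `error_percent` comes out at about 10. Because the metric divides by the observed value, it is sensitive to periods of near-zero output.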

24 pages, 830 KiB  
Systematic Review
Evolving Strategies in Machine Learning: A Systematic Review of Concept Drift Detection
by Gurgen Hovakimyan and Jorge Miguel Bravo
Information 2024, 15(12), 786; https://doi.org/10.3390/info15120786 - 7 Dec 2024
Viewed by 735
Abstract
In this comprehensive literature review, we rigorously adhere to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for our process and reporting. This review employs an innovative method integrating the advanced natural language processing model T5 (Text-to-Text Transfer Transformer) to enhance the accuracy and efficiency of screening and data extraction processes. We assess strategies for handling concept drift in machine learning using high-impact publications from notable databases that were made accessible via the IEEE and ScienceDirect APIs. The chronological analysis covering the past two decades provides a historical perspective on methodological advancements, recognizing their strengths and weaknesses through citation metrics and rankings. Key findings highlight the effectiveness of diverse methodologies such as drift detection methods, window-based methods, unsupervised statistical methods, and neural network techniques. However, challenges remain, particularly with imbalanced data, computational efficiency, and the application of concept drift detection to non-tabular data like images. This review aims to trace the growth and evolution of concept drift mitigation strategies and provide a valuable resource that guides future research and deepens our understanding of this rapidly changing field.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

16 pages, 624 KiB  
Article
Flexible Techniques to Detect Typical Hidden Errors in Large Longitudinal Datasets
by Renato Bruni, Cinzia Daraio and Simone Di Leo
Symmetry 2024, 16(5), 529; https://doi.org/10.3390/sym16050529 - 28 Apr 2024
Viewed by 970
Abstract
The increasing availability of longitudinal data (repeated numerical observations of the same units at different times) requires the development of flexible techniques to automatically detect errors in such data. Besides standard types of errors, which can be treated with generic error correction techniques, large longitudinal datasets may present specific problems not easily traceable by the generic techniques. In particular, after applying those generic techniques, time series in the data may contain trends, natural fluctuations and possible surviving errors. To study the data evolution, one main issue is distinguishing those elusive errors from the rest, which should be kept as they are and not flattened or altered. This work responds to this need by identifying some types of elusive errors and by proposing a statistical-mathematical approach to capture their complexity that can be applied after the above generic techniques. The proposed approach is based on a system of indicators and works at the formal level by studying the differences between consecutive values of data series and the symmetries and asymmetries of these differences. It operates regardless of the specific meaning of the data and is thus applicable in a variety of contexts. We apply this approach to a relevant database of European Higher Education institutions (ETER) by analyzing two key variables: “Total academic staff” and “Total number of enrolled students”, which are two of the most important variables, often used in empirical analysis as a proxy for size, and are considered by policymakers at the European level. The results are very promising.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)
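The abstract above describes indicators built on differences between consecutive values and on the symmetry of those differences. The sketch below illustrates that general idea only and is not the authors' indicator system: it flags a point when a large jump is immediately followed by a jump of similar size in the opposite direction. The function names, the threshold, and the tolerance are hypothetical.

```python
def consecutive_differences(series):
    """First differences: series[i + 1] - series[i]."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def flag_symmetric_spikes(series, jump_threshold, symmetry_tolerance=0.25):
    """Return indices of points that look like isolated spikes.

    Hypothetical rule (an assumption, not taken from the paper): a point is
    flagged when the jump into it and the jump out of it are both at least
    `jump_threshold` in size, have opposite signs, and nearly cancel out.
    """
    diffs = consecutive_differences(series)
    flagged = []
    for i in range(len(diffs) - 1):
        jump_in, jump_out = diffs[i], diffs[i + 1]
        opposite = jump_in * jump_out < 0
        large = abs(jump_in) >= jump_threshold and abs(jump_out) >= jump_threshold
        # Near-symmetric pair: the residual is small relative to the larger jump.
        balanced = abs(jump_in + jump_out) <= symmetry_tolerance * max(abs(jump_in), abs(jump_out))
        if opposite and large and balanced:
            flagged.append(i + 1)  # index of the suspicious value in `series`
    return flagged


# Made-up yearly counts: one isolated spike versus a genuine level shift.
with_spike = [100, 102, 104, 160, 107, 109]
with_level_shift = [100, 101, 150, 151, 152]
spike_indices = flag_symmetric_spikes(with_spike, jump_threshold=20)
shift_indices = flag_symmetric_spikes(with_level_shift, jump_threshold=20)
```

In this toy data the spike at index 3 is flagged, while the level shift is left alone because its jump is not followed by a compensating one. That mirrors the abstract's goal of keeping trends and natural fluctuations unaltered.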

18 pages, 2117 KiB  
Article
A Novel Approach for Data Feature Weighting Using Correlation Coefficients and Min–Max Normalization
by Mohammed Shantal, Zalinda Othman and Azuraliza Abu Bakar
Symmetry 2023, 15(12), 2185; https://doi.org/10.3390/sym15122185 - 11 Dec 2023
Cited by 19 | Viewed by 3541
Abstract
In the realm of data analysis and machine learning, achieving an optimal balance of feature importance, known as feature weighting, plays a pivotal role, especially when considering the nuanced interplay between the symmetry of data distribution and the need to assign differential weights to individual features. Avoiding the dominance of large-scale features is also essential in data preparation, which makes choosing an effective normalization approach one of the most challenging aspects of machine learning. In addition to normalization, feature weighting is another strategy for handling the differing importance of features. One way to measure the dependency between features is the correlation coefficient, which indicates the strength of their relationship. The integration of normalization with feature weighting in data transformation for classification has not been extensively studied. The goal is to improve the accuracy of classification methods by striking a balance between the normalization step and assigning greater importance to features with a strong relation to the class feature. To achieve this, we combine Min–Max normalization with feature weighting, increasing feature values based on their correlation coefficients with the class feature. This paper presents the proposed Correlation Coefficient with Min–Max Weighted (CCMMW) approach, in which the normalization applied to the data depends on their correlation with the class feature. Logistic regression, support vector machine, k-nearest neighbor, neural network, and naive Bayes classifiers were used to evaluate the proposed method on twenty datasets with numerical values from the UCI Machine Learning Repository and Kaggle. The empirical results showed that the proposed CCMMW significantly improves the classification performance of support vector machine, logistic regression, and neural network classifiers on most datasets.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)
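The combination of Min–Max normalization and correlation-based weighting described above can be sketched as follows. This is not the paper's CCMMW formula, which the abstract does not give. As an explicit assumption, each feature is scaled to [0, 1] and then multiplied by the hypothetical factor 1 + |r|, where r is its Pearson correlation with the class feature. All names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def min_max(column):
    """Scale a column to [0, 1]; a constant column maps to all zeros."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0] * len(column)
    return [(value - low) / (high - low) for value in column]


def correlation_weighted_min_max(columns, target):
    """Min-Max scale each feature, then stretch it by 1 + |r| (assumed weight)."""
    weighted = []
    for column in columns:
        weight = 1.0 + abs(pearson(column, target))
        weighted.append([weight * value for value in min_max(column)])
    return weighted


# Made-up data: three feature columns and a binary class feature.
features = [[1, 2, 3, 4], [5, 5, 5, 5], [4, 1, 3, 2]]
labels = [0, 0, 1, 1]
scaled = correlation_weighted_min_max(features, labels)
```

With this assumed weight, a feature strongly correlated with the class spans a wider range than an uncorrelated one. Distance-based and margin-based classifiers then give it more influence, which is the effect the abstract describes.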

33 pages, 2391 KiB  
Article
A Tourist-Based Framework for Developing Digital Marketing for Small and Medium-Sized Enterprises in the Tourism Sector in Saudi Arabia
by Rishaa Abdulaziz Alnajim and Bahjat Fakieh
Data 2023, 8(12), 179; https://doi.org/10.3390/data8120179 - 28 Nov 2023
Cited by 4 | Viewed by 4031
Abstract
Social media has become an essential tool for travel planning, with tourists increasingly using it to research destinations, book accommodation, and make travel arrangements. However, little is known about how tourists use social media for travel planning and what factors influence their intentions to use social media for this purpose. This study aims to understand tourists’ intentions to use social media for travel planning. Specifically, it investigates the factors influencing tourists’ intentions to use social media for planning travel to Saudi Arabia. It develops a machine learning (ML) classification model to assist Saudi tourism SMEs in creating effective digital marketing strategies for social media platforms. A survey was conducted with 573 tourists interested in visiting Saudi Arabia, using the Design Science Research (DSR) approach. The findings support the tourist-based theoretical framework, showing that perceived usefulness (PU), perceived ease of use (PEOU), satisfaction (SAT), marketing-generated content (MGC), and user-generated content (UGC) significantly impact tourists’ intentions to use social media for travel planning. Tourists’ characteristics and visit characteristics influenced their intentions to use MGC but not UGC. The tourist-based ML classification model, developed using the LinearSVC algorithm, achieved an accuracy of 99% when evaluated using the K-Fold Cross-Validation (KF-CV) technique. The findings of this study have several implications for Saudi tourism SMEs. First, the results suggest that SMEs should focus on developing social media content that is perceived as useful, easy to use, and satisfying. Second, the findings suggest that SMEs should focus on using MGC in their social media marketing campaigns. Third, the results suggest that SMEs should tailor their social media marketing campaigns to the characteristics of their target tourists. This study contributes to the literature on tourism marketing and social media by providing a better understanding of how tourists use social media for travel planning. Saudi tourism SMEs can use the findings of this study to develop more effective digital marketing strategies for social media platforms.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

22 pages, 3120 KiB  
Article
The Best Whey Protein Powder Selection via VIKOR Based on Circular Intuitionistic Fuzzy Sets
by Elif Çaloğlu Büyükselçuk and Yiğit Can Sarı
Symmetry 2023, 15(7), 1313; https://doi.org/10.3390/sym15071313 - 27 Jun 2023
Cited by 7 | Viewed by 2320
Abstract
People try very hard to have a symmetrical, strong, and beautiful body. The human body needs a high intake of amino acids for muscle protein synthesis. Whey protein is a good choice because it contains all amino acids; combined with resistance exercise, it increases muscle protein synthesis and improves body shape. For this reason, those who do sports, especially professionals, often use these products. A large number of commercial whey protein powders are sold on the market, and individuals want to use the best one to get the greatest benefit. Intuitionistic fuzzy sets are used to minimize the negative effects that uncertainty and ambiguous information in the decision-making process have on the solution. In this study, VIKOR based on circular intuitionistic fuzzy sets is used to determine the best whey protein supplement. Drawing on a comprehensive literature review and expert opinions, the evaluation criteria affecting the selection process were determined, and the study focused on solving the selection problem.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

25 pages, 7323 KiB  
Article
Data Preprocessing and Neural Network Architecture Selection Algorithms in Cases of Limited Training Sets—On an Example of Diagnosing Alzheimer’s Disease
by Aleksandr Alekseev, Leonid Kozhemyakin, Vladislav Nikitin and Julia Bolshakova
Algorithms 2023, 16(5), 219; https://doi.org/10.3390/a16050219 - 25 Apr 2023
Cited by 1 | Viewed by 3304
Abstract
This paper aimed to increase the accuracy of an Alzheimer’s disease diagnosing function obtained in a previous study on the application of decision roots to the diagnosis of Alzheimer’s disease. The decision root obtained there is a discrete switching function of several variables that aggregates a few indicators into one integrated assessment and is presented as a superposition of a few functions of two variables. Magnetic susceptibility values of the basal veins and veins of the thalamus were used as indicators, and two categories of patients were used as function values. To increase accuracy, the use of artificial neural networks was suggested, but medical data are typically limited in quantity, so neural networks trained on limited datasets may be inefficient. The proposed solution is to preprocess the initial datasets in order to determine the parameters of the neural networks based on decision roots, because any decision root is known to be representable as an incompletely connected neural network with a cascade structure. No publicly available specialized software lets the user set such a complex neural network structure, so the number of synaptic coefficients of the incompletely connected neural network was determined instead. This made it possible to predefine fully connected neural networks comparable in terms of the number of unknown parameters. Acceptable accuracy was obtained with one-layer and two-layer fully connected neural networks trained on limited training sets, using the diagnosis of Alzheimer’s disease as an example. Thus, the scientific hypothesis on preprocessing initial datasets and selecting the neural network architecture using special methods and algorithms was confirmed.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)

19 pages, 1771 KiB  
Article
Adaptive Kernel Graph Nonnegative Matrix Factorization
by Rui-Yu Li, Yu Guo and Bin Zhang
Information 2023, 14(4), 208; https://doi.org/10.3390/info14040208 - 29 Mar 2023
Cited by 1 | Viewed by 1733
Abstract
Nonnegative matrix factorization (NMF) is an efficient method for feature learning in the field of machine learning and data mining. To investigate the nonlinear characteristics of datasets, kernel-method-based NMF (KNMF) and its graph-regularized extensions have received much attention from various researchers due to their promising performance. However, the graph similarity matrix of the existing methods is often predefined in the original space of data and kept unchanged during the matrix-factorization procedure, which leads to non-optimal graphs. To address this problem, we propose a kernel-graph-learning-based, nonlinear, nonnegative matrix-factorization method in this paper, termed adaptive kernel graph nonnegative matrix factorization (AKGNMF). In order to automatically capture the manifold structure of the data on the nonlinear feature space, AKGNMF learns an adaptive similarity graph. We formulate a unified objective function, in which global similarity graph learning is optimized jointly with the matrix decomposition process. A local graph Laplacian is further imposed on the learned feature subspace representation. The proposed method relies on both the factorization that respects geometric structure and the mapped high-dimensional subspace feature representations. In addition, an efficient iterative solution is derived to update all variables in the resultant objective problem in turn. Experiments on a synthetic dataset visually demonstrate the ability of AKGNMF to separate the nonlinear dataset with high clustering accuracy. Experiments on real-world datasets verify the effectiveness of AKGNMF in three aspects: clustering performance, parameter sensitivity, and convergence. Comprehensive experimental findings indicate that the proposed AKGNMF algorithm compares favorably with various classic and state-of-the-art methods.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)
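For readers unfamiliar with the baseline that AKGNMF extends, the sketch below shows plain NMF with the classic multiplicative update rules for the squared-error objective. It deliberately omits the kernel mapping, the adaptive graph learning, and the Laplacian regularization described in the abstract. The helper names, the initialization, and the sample matrix are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates (not the AKGNMF algorithm)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the updates stay nonnegative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Made-up nonnegative matrix of rank 2.
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
w_start, h_start = nmf(data, rank=2, iterations=0)
w_fit, h_fit = nmf(data, rank=2, iterations=200)
error_before = squared_error(data, w_start, h_start)
error_after = squared_error(data, w_fit, h_fit)
```

Running zero iterations with the same seed returns the random starting point, so `error_before` and `error_after` can be compared directly. The updates only multiply by nonnegative ratios, so both factors stay nonnegative throughout.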