Next Article in Journal
A Critical Appraisal of System-Reported Organ Dose (OD) Versus Manually Calculated Mean Glandular Dose (MGD) in Dubai’s Mammography Services
Next Article in Special Issue
Information Geometry and Manifold Learning: A Novel Framework for Analyzing Alzheimer’s Disease MRI Data
Previous Article in Journal
Spectral Differentiation of Hyperdense Non-Vascular and Vascular Renal Lesions Without Solid Components in Contrast-Enhanced Photon-Counting Detector CT Scans—A Pilot Study
Previous Article in Special Issue
Optimized Hybrid Deep Learning Framework for Early Detection of Alzheimer’s Disease Using Adaptive Weight Selection
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

An Evolutionary Federated Learning Approach to Diagnose Alzheimer’s Disease Under Uncertainty

1
Cybersecurity Laboratory, Luleå University of Technology, 97187 Luleå, Sweden
2
Department of Computer Science and Engineering, Rangamati Science and Technology University, Rangamati 4500, Bangladesh
3
Department of Computer Science and Engineering, East West University, Dhaka 1212, Bangladesh
*
Authors to whom correspondence should be addressed.
Diagnostics 2025, 15(1), 80; https://doi.org/10.3390/diagnostics15010080
Submission received: 24 November 2024 / Revised: 25 December 2024 / Accepted: 30 December 2024 / Published: 1 January 2025
(This article belongs to the Special Issue Artificial Intelligence in Alzheimer’s Disease Diagnosis)

Abstract

:
Background: Alzheimer’s disease (AD) leads to severe cognitive impairment and functional decline in patients, and its exact cause remains unknown. Early diagnosis of AD is imperative to enable timely interventions that can slow the progression of the disease. This research tackles the complexity and uncertainty of AD by employing a multimodal approach that integrates medical imaging and demographic data. Methods: To scale this system to larger environments, such as hospital settings, and to ensure the sustainability, security, and privacy of sensitive data, this research employs both deep learning and federated learning frameworks. MRI images are pre-processed and fed into a convolutional neural network (CNN), which generates a prediction file. This prediction file is then combined with demographic data and distributed among clients for local training. Training is conducted both locally and globally using a belief rule base (BRB), which effectively integrates various data sources into a comprehensive diagnostic model. Results: The aggregated data values from local training are collected on a central server. Various aggregation methods are evaluated to assess the performance of the federated learning model, with results indicating that FedAvg outperforms other methods, achieving a global accuracy of 99.9%. Conclusions: The BRB effectively manages the uncertainty associated with AD data, providing a robust framework for integrating and analyzing diverse information. This research not only advances AD diagnostics by integrating multimodal data but also underscores the potential of federated learning for scalable, privacy-preserving healthcare solutions.

1. Introduction

In this age, 50 million people are globally suffering because of the impact of Alzheimer’s disease, and by 2050, it is predicted that this number will increase three-fold [1]. It is difficult to understand the actual causes of Alzheimer’s; however, studies show that genetics, environment, and lifestyle can have a profound impact [2]. In addition, this disease directly impacts cognitive functions and gradually impairs the ability of the patient to perform daily activities. With time, the patient may become prone to memory loss and abnormalities in speech and encounter confusion with the time and place they are in. Early-onset signs and symptoms of this disease can begin before the age of 65, while late-onset typically begins after 65 [3]. As a result, it is necessary to diagnose Alzheimer’s early so that the quality of life can be improved, adequate treatment can be acquired, and the progression of the disease can be slowed down, allowing the patient to seek psychological support to cope with the emotional impact of the disease. To achieve this purpose, an Alzheimer’s diagnosis system needs to be developed. Since the actual cause of Alzheimer’s is not clear and remains uncertain, the system should be able to deal with information not only limited to MRI scans but also about the person’s age, lifestyle, and environment. This will, in turn, tackle the complexity of the disease. In this research, a multidisciplinary approach where machine learning, the  belief rule base system [4], and federated learning [5,6] are employed. The machine learning model, namely the convolutional neural network, is customized so that it can achieve optimal accuracy when training the MRI image dataset of patients [7]. Furthermore, to overcome the uncertainty in identifying the exact stage of the disease, a trained belief rule-based model is developed by using evidential reasoning and particle swarm optimization [8]. Particle swarm optimization is used because it excels in finding optimal or near-optimal accuracy in complex, multi-dimensional search spaces. However, focusing solely on these models will not suffice for large-scale early diagnosis of Alzheimer’s. Implementing such a system in hospitals requires a broader approach, particularly addressing data sensitivity issues. Many hospitals are reluctant to share data due to privacy concerns. This is where federated learning—a form of decentralized system—becomes crucial. Federated learning enables the development of a large-scale diagnostic system without compromising data privacy, as it allows hospitals to collaboratively train models without sharing raw data [9]. This approach not only values the issue of patient data sensitivity but also leverages diverse datasets to improve diagnostic accuracy, making it a viable solution for widespread implementation in healthcare facilities.
This research intends to develop an Alzheimer’s disease diagnostic system wherein deep learning (convolutional neural network), knowledge representation (belief rule base system) [10], and federated learning technologies will be integrated. This is done to address the challenges with data uncertainty, diagnostic accuracy, privacy preservation of patient data, and reduced resource utilization.

1.1. Research Objectives

To address the research aims, the following research objectives have been formulated:
  • Systematically clean and curate datasets, including medical images and demographic data, for accurate Alzheimer’s disease (AD) diagnosis.
  • Design a convolutional neural network (CNN) architecture tailored for processing and classifying Alzheimer’s medical imaging data.
  • Combine image data with demographic information to improve diagnostic accuracy.
  • Develop an Optimized trained belief rule base (BRB) to handle data uncertainties.
  • Incorporate the belief rule base (BRB) system in the federated learning framework.
  • Compare and evaluate the different (i.e., FedAvg, FedProx, and genetic algorithm) aggregators used in the federated learning model.

1.2. Research Questions

  • How to develop a pre-processing pipeline combining medical images and demographic data for enhanced Alzheimer’s disease diagnosis?
  • How to design a CNN architecture optimized for Alzheimer’s medical imaging data classification?
  • How to integrate image and demographic data to improve diagnostic accuracy?
  • How to establish a federated learning framework with a belief rule base to handle data uncertainties?
  • How does advanced federated learning for medical applications improve privacy-preserving diagnostic frameworks?
  • How can multimodal data handling and uncertainty management in machine learning for medical diagnostics improve accuracy and overcome uncertainty?

2. Related Work

Machine learning is transforming the healthcare system, offering new advancements and solutions. Alzheimer’s disease (AD), affecting millions worldwide, remains a critical area where these technologies are making a significant impact. It is essential to detect AD in its stages to slow down its progression through proper care and treatment. Consequently, there has been a rise in the use of artificial intelligence (AI) techniques [1,3] due to their capacity to analyze datasets and recognize complex patterns. This section will delve into research on AI-driven Alzheimer’s diagnosis focusing particularly on learning methods such as convolutional neural networks (CNNs) and the application of federated learning in this field. Several studies have provided insights into classifying Alzheimer’s disease using data types machine learning approaches and federated learning techniques [5].

2.1. Machine Learning to Classify AD Using MRI Images

The XAI framework in this study uses the mapping of occlusion sensitivity, emphasizing white matter hyperintensities (WMHs) in the image data [11]. The deep learning classifier, namely EfficientNet-B0, achieves an accuracy of 80.0%. This work [12] integrates image data with the XAI frameworks of Saliency Map and Layer-wise Relevance Propagation, analyzing MRI, 3D PET, biological markers, and assessments. This study will make use of a DL 3D CNN AD classifier. This research [12] incorporates image data and XAI frameworks such as Saliency Map and Layer-wise Relevance Propagation (LRP), analyzing MRI, 3D PET, biological markers, and assessments. A DL 3D CNN AD classifier is employed for this study. This study [13] involves image data and XAI frameworks utilizing decision trees (DTs). Significant features include demographic data, cognitive factors, and brain metabolism data. Classifiers include Bernoulli naive Bayes (NB), SVM, kNN, random forest (RF), AdaBoost, and gradient boosting (GBoost), achieving an accuracy of 91.0%. Significant features, including demographic data, cognitive factors, and brain metabolism data, are highlighted in this study [13], which utilizes decision trees (DTs) as the XAI framework. The classifiers, Bernoulli naive Bayes (NB), SVM, kNN, random forest (RF), AdaBoost, and gradient boosting (GBoost) collectively achieve an accuracy of 91.0%. In this research, demographic data, cognitive factors, and MRI image data are combined [13]. The data are passed through a decision tree (DT) as the XAI framework. Other classifiers, such as Bernoulli naive Bayes (NB), SVM, kNN, random forest (RF), AdaBoost, and gradient boosting (GBoost) collectively achieve an accuracy of 91.0%. In [14], an XAI framework was employed wherein a 3D ultrametric contour map is deployed alongside a 3D class activation map and 3D GRadCAM, all to analyze the image data. As we understand, there are significant features which can be accurately classified after using the 3D CNN deep learning model. Overall, and accuracy of 76.6% was achieved. This work employed sensitivity analysis and occlusion as XAI frameworks in the analysis of 3D image data. The accuracy of the DL 3D CNN classifier is 77.0% [15]. The paper by [16] classifies healthy controls versus Alzheimer’s disease subjects based on numeric data. LIME and SHAP are used as XAI frameworks, while some of the features identified include whole brain volume, years of education, and socio-economic status. Classifiers used include support vector machine (SVM), k-nearest neighbors (kNN), and multilayer perceptron. For the classification of HC versus AD, this study [17] leverages XAI frameworks such as HAM and PCR, focusing on salient AD-related features like cerebral cortex and hippocampus atrophy. The DL CNN classifier achieves an impressive accuracy of 95.4%. In [17], healthy controls (HCs) were compared to Alzheimer’s disease (AD) classification using XAI frameworks (i.e., HAM and PCR). These frameworks focus on features such as the cerebral cortex and hippocampus atrophy through the deep learning (DL) convolutional neural network (CNN), achieving an impressive accuracy of 95.4%. This study [18] focuses on classifying healthy controls (HCs), individuals with mild cognitive impairment (MCI), and Alzheimer’s disease (AD) using brain image data. It employs the XAI framework GNNExplainer to interpret predictions made by a deep learning graph neural network (GNN). Key features include brain region volume, cortical surface area, and cortical thickness. The GNN classifier achieves an accuracy of 53.5 ± 4.5%, providing insights into the important brain features associated with these conditions. Marwa et al. [19] utilized a deep neural network (DNN) pipeline on a dataset of 6400 MRI images to effectively identify various stages of Alzheimer’s disease, achieving a remarkable accuracy of 99.68%. Ghazal et al. [20] developed a classification model using transfer learning to categorize Alzheimer’s disease into Mild Demented, Moderate Demented, Non-Demented, and Very Mild Demented stages, employing AlexNet and obtaining a simulation accuracy of 91.7%. AlSaeed et al. [21] proposed a CNN-based model, ResNet50, for automatic feature extraction from MRI images to detect Alzheimer’s disease. Their model, tested against Softmax, SVM, and RF models, showed superior performance with an accuracy ranging from 85.7% to 99% on the MRI ADNI dataset. Hamdi et al. [22] introduced a CAD system based on a CNN model for Alzheimer’s detection using the MRI ADNI dataset, achieving 96% accuracy. Helaly et al. [23] employed both CNN and VGG-19 transfer learning models to classify Alzheimer’s stages with the ADNI dataset, with VGG-19 achieving an accuracy of 97%. Mohammed et al. [24] utilized AlexNet, ResNet-50, and hybrid models combining AlexNet+SVM and ResNet-50+SVM to diagnose Alzheimer’s using the OASIS dataset, with AlexNet+SVM achieving the highest accuracy of 94.8%. Pradhan et al. [25] compared VGG-19 and DenseNet architectures for Alzheimer’s classification, finding VGG-19 to perform better. Salehi et al. [26] implemented a CNN model to diagnose Alzheimer’s early using the MRI ADNI dataset, achieving a 99% accuracy. Suganthe, Ravi Chandaran et al. [27] employed deep CNN and VGG-16-based CNN models for Alzheimer’s detection using the ADNI dataset, both showing excellent accuracy. Hussain et al. [28] proposed a 12-layer CNN model for Alzheimer’s classification using the OASIS dataset, achieving 97.75% accuracy, outperforming other existing CNN models on this dataset. Basaia et al. [29] applied a CNN model to the ADNI dataset for Alzheimer’s detection, achieving high performance with both ADNI and combined ADNI+non-ADNI datasets. Ji et al. [30] proposed an ensemble learning approach using deep learning for early Alzheimer’s diagnosis with the MRI ADNI dataset, achieving up to 97.65% accuracy for AD/mild cognitive impairment stages. Islam et al. [31] proposed a CNN model for Alzheimer’s detection and classification, evaluated on the OASIS dataset. Their model is noted for its speed, lack of reliance on handcrafted features, and suitability for small datasets.
Table 1 briefly highlights the use of various machine learning models in the classification of Alzheimer’s disease. Overall, a deduction is made that convolutional models used by Yu et al. [17] and De et al. [12] can achieve a high accuracy when it comes to image classification tasks that involve MRI scans. Yu et al. further revealed an impressive accuracy of 95.4% in differentiating between a healthy control group and Alzheimer’s disease. Complex features, such as those found in MRI scans highlighting brain atrophy, are effectively extracted and learned by CNNs, making them ideal for medical imaging tasks. In contrast, traditional models, such as decision trees and support vector machines, are limited in handling complex tasks. Unlike CNN, the problem with a traditional model is that it requires feature engineering which can be computationally expensive, and training large amounts of data can be slow. Furthermore, CNN is open to adding features of explainability such as the use of GradCAM and Layer-wise Relevance Propagation (LRP) which can enhance the CNN’s ability in the process of decision-making by adding more transparency. Explainable AI can be a future direction for this current research. In conclusion, CNNs, according to the reviewed literature, have consistently outperformed traditional machine learning models when it comes to accuracy and scalability, particularly when applied to large and complex datasets.

2.2. Federated Learning-Based Solution for Building Alzheimer’s Diagnosis System

Khalil et al. [5] built a hardware acceleration approach to expedite the training and testing of their FL model. In this approach, the hardware accelerator is constructed using VHDL hardware description language and an Altera 10 GX FPGA. Simulation results indicate that the proposed methods achieved accuracies of 89% and sensitivities of 87% for early detection, outperforming other state-of-the-art algorithms in terms of training time. Moreover, the proposed algorithms exhibited a power consumption range of 35–39 mW, rendering them suitable for resource-constrained devices (see Table 2).
Mitrovska et al. [6] employed two algorithms, namely Federated Averaging (FedAvg) and Secure Aggregation (SecAgg) which were combined, and their performance was compared with a centralized ML model training. By simulating such diverse environments, the influence of demographic factors (such as sex, age, and diagnosis) and imbalanced data distributions was investigated. These simulations made it possible to assess the effect of statistical variations on ML models trained with FL, underscoring the significance of accounting for these differences in training models for AD detection. From the end result, it was demonstrated that, when FL is combined with SecAgg, it ensures privacy, as evidenced by simulated membership inference attacks showcasing its privacy guarantees.
Trivedi et al. [32] developed a streamlined federated deep learning framework for classifying Alzheimer’s diseases, ensuring data privacy using IID datasets in a client-server setup. Evaluations with single and multiple clients under traditional and federated learning scenarios revealed AlexNet as the best pre-trained model, achieving 98.53% accuracy. The framework was then tested with IID datasets using federated learning (FL).
Altalbe et al. [33] developed a deep neural network (DNN) within a federated learning framework to detect and diagnose brain disorders. The dataset used here is from the Kay Elemetrics voice disorder database, which was divided among two clients, to built separate training models. Overfitting was reduced through three review rounds by each client. The model identified brain disorders with an accuracy of 82.82%, preserving privacy and security.
Mandawkar et al. [34] engineered a federated learning-based Tawny Flamingo deep CNN classifier to classify Alzheimer’s disease. The classifier processes clinical data from multiple distributed sources and delivers high accuracy. Tuned using the Tawny Flamingo algorithm, the model achieved accuracies of 98.252% in K-fold analysis and 97.995% in training percentage-dependent analysis for detecting Alzheimer’s disease.
Castro et al. [35] deployed an Alzheimer’s diagnosis system by using MRI brain images wherein federated learning (FL) was integrated for biometric recognition validating image authentication. This approach aimed to protect privacy and prevent data poisoning attacks. Experiments conducted on the OASIS and ADNI datasets demonstrated that the system’s performance was comparable to that of a centralized ML system without privacy measures.
Qian et al. [9] introduced FeDeFo, which stands for a federal deep forest model utilized for calculating hippocampal volume using sMRI images to classify Alzheimer’s disease. This uses a federated learning framework to train a gradient-boosting decision tree (GBDT) model leveraging local client data to ensure data privacy. Furthermore, this approach addresses data discrepancies, by incorporating a deep forest model, combined with the federated GBDT to personalize the model for each client. Experiments demonstrated that this method effectively personalized the model while protecting data privacy, providing a new method for Alzheimer’s disease classification. Table 1 presents a key summary of studies in Alzheimer’s disease detection using FL.

2.3. Addressing the Existing Research Gaps

In summary, future studies should prioritize developing FL models which are capable of integrating data from multiple sources, as this will make the model more generalized. It is also necessary to explore the use of evolutionary algorithms in FL. These algorithms will not only help to optimize model parameters but also improve the performance in areas wherein data are complex and environments are distributed. As a result, methodologies incorporating these evolutionary algorithms should be built to handle data heterogeneity and uncertainty. This will ensure that FL frameworks maintain optimal accuracy and fairness across different client datasets. In scenarios wherein data are distributed across multiple servers, applying advanced privacy-preserving techniques, such as differential privacy and multi-party computation, it is crucial to ensure the security of sensitive patient data in an FL system. In addition, FL frameworks should showcase scale-able behavior to manage large-scale deployment. Furthermore, FL systems should be deployed in a clinical setting to carry out clinical trials in the real world to test their validity and applicability. In order to provide more evidence for effectiveness in real-world healthcare settings, there is also a requirement to perform more research on the aspect of expanding hardware acceleration in FL.
Hence, this research will attempt to address most of the research gaps by introducing evolutionary algorithms and optimization techniques to enhance the performance of FL framework.

3. Methods

Figure 1 illustrates the operational framework used to develop the system in this research. From the figure, it can be deduced that this framework is a hybrid system, as it combines two distinct methodologies: deep learning framework and federated learning framework [5,6]. In the first methodology, MRI images undergo pre-processing steps such as skull stripping and bias correction to improve image quality. After pre-processing the images, the dataset is passed as an input to the convolutional neural network (CNN), for the purpose of classification of Alzheimer’s disease (AD). Thus, the CNN is trained to classify Alzheimer’s disease into three categories: Alzheimer’s Disease (AD), Mild Cognitive Impairment (MCI), and Cognitively Normal (CN). Afterwards, the results from the classification are converted into a CSV file that contains the predicted classes for each subject.
This CSV file is then merged with demographic data received from the ADNI dataset mentioned in Section 3.1. This results in a multi-modal dataset that integrates both clinical and demographic information. The combined dataset is subsequently split equally into three parts, each corresponding to a different client in the federated learning framework. In this second methodology, the three clients can be potentially considered as three hospitals where the local models are trained using a belief rule base (BRB) system [36]. The BRB system is enhanced with particle swarm optimization (PSO) to optimize the parameters [37]. Each client trains its BRB model locally, and the resulting parameters are then sent to a central server. At the server, these local parameters are aggregated to update the global model (which uses the same BRBs). This federated learning process is iterative, where the central server sends the updated global model parameters back to the clients. The clients then use these updated parameters to refine their local models in the next round. This iterative process continues—typically for three iterations—until the global model reaches an optimal state, effectively capturing the combined knowledge from all local models and providing a robust predictive tool for Alzheimer’s disease classification.

3.1. Description of Dataset Applied in This Research

For this research, data were obtained from the Alzheimer’s Disease Neurology Initiative (ADNI) [38]. The ADNI dataset includes a variety of modalities, making it an ideal resource for researchers aiming to utilize it for the early detection of Alzheimer’s disease. This dataset comprises both demographic information and imaging data of patients. The fMRI dataset consists of images provided in DICOM and NIFTI formats.
DICOM images resemble a series of frames, similar to those found in videos, whereas NIfTI images provide a comprehensive representation of a specific part of the human body; in this case, the brain. These images include all slices captured by scanners or other imaging methods. The NIfTI format is converted to enable access to raw MRI images.
The dataset contains a total of 5182 images, categorized into three classes: “Alzheimer’s Disease (AD)”, “Mild Cognitive Impairment (MCI)”, and “Cognitively Normal”. After pre-processing the data, the dataset was divided into training and testing sets with an 80:20 split ratio.

3.2. Steps of MRI Dataset Preprocessing

The conversion of NIfTI files into raw images involves multiple steps that differ from standard image formats such as PNG or JPG. For this process, FSL software (Version v6 2018) is utilized, installed via the Windows Subsystem for Linux (WSL) and added to the PATH environment on a Windows system. The pre-processing pipeline consists of affine registration, skull stripping, and bias correction, each executed sequentially to prepare the MRI data for further analysis.
Affine Registration:
Affine registration determines the optimal transformation that aligns points in one image with corresponding points in another. This transformation includes translation, scaling, rotation, and shearing, offering greater alignment flexibility compared to rigid or similar transformations. The process begins by selecting a reference NIfTI file from the FSL/data/standard directory. Depending on the specifications of the MRI files, the appropriate MNI152 file is chosen. For example, if the MRI files are 1 mm T1 MRIs, the MNI152_T1_1mm.nii file is selected. The reference file is then placed in the atlas folder, and the register.py file in the pre-processing folder is updated. Executing the register.py script in the terminal generates the ADNIReg folder, which contains subdirectories (AD, CN, MCI) with the transformed NIfTI files.
Skull Stripping:
MRI scans often contain extraneous details, such as the skull, eyes, nose, and other facial features, which must be removed to isolate the brain region. Skull stripping eliminates this unnecessary data, ensuring that they do not interfere with the machine learning model’s training or accuracy. The bet tool from the FSL library is used for this purpose. The resulting files are saved in a newly created ADNIBrain folder while preserving the subdirectory structure (AD, CN, MCI).
Bias Correction:
N4 bias correction is applied to address intensity inconsistencies, also known as bias fields, present in medical images. These inconsistencies, caused by scanner settings or physical structures of patients, can compromise the accuracy of image analysis. By eliminating bias fields, the clarity and reliability of the images are improved. The Advanced Normalization Tools (ANTs) library is employed for this step. Since this transformation is computationally intensive, the Python script can be modified to process images in batches by adjusting the batch size accordingly.
By implementing these pre-processing steps, the MRI data are effectively prepared for further analysis, ensuring precision and efficiency in subsequent processing stages.

3.3. Deep Learning Framework

After the MRI images undergo pre-processing, the dataset is randomly divided into 80% training and 20% testing, to integrate into the deep learning model, namely the convolutional neural network. The architecture is demonstrated as follows:

Modified Convolutional Neural Network

Based on the specified CNN model architecture for diagnosing Alzheimer’s disease from imaging data, multiple convolutional layers are incorporated to extract features from the input images. Deep learning techniques are leveraged to capture complex patterns within the data. The model starts with a sequence of convolutional layers, each followed by max pooling and batch normalization. The convolutional layers apply filters to detect features such as edges and textures in the images, with these features becoming increasingly abstract as they progress through the layers. Max pooling layers reduce the spatial dimensions of the data, which decreases the computational burden and mitigates the risk of overfitting. Batch normalization is used to stabilize the learning process by normalizing activations, making the network more robust to variations in the input data distribution. This structured combination of layers enables the model to effectively learn and generalize to new imaging data.
Dense layers with dropout are also included in the model to further refine the features extracted by the convolutional layers into meaningful patterns suitable for classification (See Table 3). Dropout randomly deactivates a fraction of the neurons during training, preventing overfitting by ensuring that the model does not become excessively reliant on any specific feature. The final output layer employs a softmax activation function to generate probabilities for each of the four classes, providing confidence scores for the potential diagnoses. The Adam optimizer and Sparse Categorical Cross-Entropy loss function are utilized to facilitate efficient and precise learning by adjusting the model weights to minimize the error between the predicted and actual class labels.
Overall, this CNN architecture delivers a robust and accurate approach for diagnosing Alzheimer’s disease from imaging data. By leveraging advanced deep learning techniques, the model achieves high performance and reliability.

3.4. Federated Learning Architecture

Federated learning enables the decentralized training of large-scale models without accessing user data, thus ensuring privacy. Federated learning is classified into Horizontal Federated Learning (HFL), Vertical Federated Learning (VFL), and Federated Transfer Learning (TFL). In this research, HFL is adopted, enabling multiple clients to jointly train a global model while ensuring the privacy of their local data. The steps involved in the HFL framework are described as follows:

3.4.1. Data Preparation

Figure 2 illustrates how the data preparation process, essential for implementing a federated learning (FL) framework, is carried out for diagnosing Alzheimer’s disease. This process combines demographic information with CNN-generated predictions and distributes the resulting dataset across multiple clients while ensuring data privacy is maintained.
Step 1: Preparing and Matching Demographic Data
Initially, the demographic data such as patients’ ages and patient ID are extracted from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. The duplicated values are excluded in the CSV file. The CNN predictions—providing the probabilities for Alzheimer’s stages such as Alzheimer’s Disease (AD), Mild Cognitive Impairment (MCI), and Cognitively Normal (CN)—are gathered into another CSV file. The demographic data are then matched to the CNN predictions by using the Subject ID.
Step 2: Calculating the Crisp Value Using the Utility Function
Once the data have been matched, a “Crisp Value” is calculated for each patient. This is performed by applying a utility function, which combines the predicted probabilities of the Alzheimer’s stages with their corresponding utility scores. The utility function is as follows:
Expected Utility = P ( CN ) · U ( CN ) + P ( MCI ) · U ( MCI ) + P ( AD ) · U ( AD )
Here, P ( CN ) , P ( MCI ) , and  P ( AD ) are the predicted probabilities for Cognitively Normal, Mild Cognitive Impairment, and Alzheimer’s Disease, respectively. Meanwhile, U ( CN ) , U ( MCI ) , and  U ( AD ) represent the utility values for each stage. These utility values are derived from [39], with scores ranging from 0.2 for severe Alzheimer’s to 0.7 for mild cognitive impairment.
Step 3: Horizontal Partitioning of Data Across Clients
Once the Crisp Values are calculated and the data are enriched, the dataset is horizontally partitioned. This means that the data are split into subsets with identical columns but different rows. These subsets are distributed to various clients (Client 1, Client 2, Client 3), allowing them to train their models locally. By ensuring that raw data are never exchanged between clients or with a central server, this process effectively maintains patient privacy.
By integrating CNN predictions with demographic data and applying utility-based calculations, this system improves the accuracy and nuance of Alzheimer’s diagnosis. Horizontal partitioning ensures data privacy is preserved within a federated learning setup, making this approach both practical and secure for real-world healthcare applications.

3.4.2. Local Training with BRB and PSO

Each client independently trains a local model using the belief rule base (BRB) mechanism combined with particle swarm optimization (PSO). The belief rule base (BRB) incorporates uncertainty by assigning belief degrees to different possible outcomes based on input variables such as CNN values and age. These belief degrees represent the level of confidence in each classification, allowing the model to account for uncertainty in the data. The rules in the BRB assess the likelihood of different conditions (such as CN, MCI, AD) and help manage any uncertainty arising from incomplete or noisy data. In the training process, each client in the federated learning setup independently trains a local model using the BRB mechanism. The belief rule base parameters are optimized using particle swarm optimization (PSO), which helps the model better handle data uncertainties and improve classification performance. After local training, the parameters are shared with a central server, where they are aggregated and refined to create a global model. This ensures that the model learns from data across different clients while maintaining privacy. This combination allows the model to handle uncertainties in the data effectively and optimize the parameters for better performance.

3.4.3. Belief Rule Base (BRB) Mechanism

The BRB mechanism is central to our model. It involves defining rules that map input variables to belief degrees for classification outcomes (see Figure 3). Each rule R i is formulated to assess the probability of a patient’s condition based on their CNN value and age. The rules and their weights are applied to determine the likely classification outcome. Table 4 presents the initial rule base used in our framework, detailing the rules, their weights, and activation weights.
The rules in Table 4 serve as the foundation for the BRB framework, guiding the decision-making process by evaluating the likelihood of each classification outcome (CN, MCI, AD) based on the input variables (CNN_Value, Age).
The BRB mechanism is fundamental for managing uncertainties and variability in the data. It encompasses several key components [4,36]:
Rule Structure:
Each rule R i in the BRB maps input variables (e.g., CNN values, age) to belief degrees for possible outcomes (CN, MCI, AD):
R i : IF ( X 1 A i ) AND ( X 2 B i ) THEN ( β i 1 , β i 2 , β i 3 )
Belief Degrees and Rule Weights:
Belief degrees β i j quantify the confidence in each classification outcome given a rule R i . These degrees must sum to 1 for each rule:
j = 1 3 β i j = 1
Rule weights w i indicate the importance of each rule R i and are normalized such that
i = 1 N w i = 1
where N is the total number of rules.
Attribute Weights:
Attribute weights λ k reflect the significance of each input variable X k in influencing the belief degrees. They adjust the impact of each variable on the matching degree calculation.
Analytical Process of BRB:
The analytical process of the belief rule base (BRB) involves the following steps: input transformation, calculation of the matching degree, belief update, and output aggregation.

3.4.4. Local Parameter Optimization

The local model parameters are updated by minimizing the local objective function L i using PSO. The optimization process ensures that the model parameters converge to an optimal solution that reflects the local data characteristics [8,37]. The optimization process is defined as
θ i t = Optimize ( θ i t 1 , D i )
where θ i t represents the optimized parameters at iteration t, and  D i is the dataset for client i.

3.4.5. Parameter Sharing and Aggregation

After completing the local training, the clients transmit their updated BRB parameters to the central server. The gradients θ i t from each client are then aggregated to update the global model parameters θ G t + 1 .
θ G t + 1 = θ G t α θ i = 1 N θ i t
Here, α θ is the learning rate and N represents the number of clients. This aggregation process ensures that each client’s data influence the global model proportionally.

3.4.6. Global Model Update and PSO Optimization

The central server retrains the global model using the aggregated BRB parameters and further optimizes them using PSO. This iterative refinement of parameters helps to achieve a consistent global model across all clients. The updated global parameters are then sent back to the clients:
[ h ! ] θ i t + 1 θ G t + 1

3.4.7. Iterative Optimization with PSO

This iterative process, which involves local training, parameter sharing, and global model updating, is repeated for five iterations to ensure faster convergence and efficient training (see Figure 4). The detailed steps of the PSO optimization process are outlined in Algorithm 1.
Algorithm 1 Federated Learning with Particle Swarm Optimization (PSO) for Alzheimer’s Disease Diagnosis using FedAvg.
1:
Input:
  • Dataset: ’CNN_Value’, ’Age’, ’Group’
  • Clients: 3
  • Iterations: num_iterations
  • PSO params: swarmsize, maxiter
2:
Output: Optimized BRB parameters, classification results
3:
Load and split the dataset into train/test sets.
4:
Divide data into subsets for 3 clients.
5:
Scale features ’CNN_Value’ and ’Age’ using MinMaxScaler.
6:
Save processed data to CSV files.
7:
BRB Model: transform_inputvalue, high, low
8:
return high value high low , value low high low
calculate_degreevalue, ref_high, ref_low, beliefs
9:
( high , low ) t r a n s f o r m _ i n p u t value , ref_high, ref_low
10:
return ( beliefs [ 0 ] high × beliefs [ 1 ] low )
brb_modelparams, inputs
11:
Compute belief degrees and weights.
12:
Compute matching degrees and update beliefs.
13:
return aggregated output.
evaluate_modelparams, data
14:
return Mean Squared Error of predictions.
15:
Optimization:
16:
for iteration 1 to num_iterations do
17:
   for each client do
18:
     Run PSO for the client and obtain parameters.
19:
     Ensure belief degrees sum to 1.
20:
     Evaluate train/test MSE and accuracy.
21:
     Save client-specific results.
22:
   end for
23:
   Average parameters across clients.
24:
   Refine parameters with PSO.
25:
   Evaluate on full dataset.
26:
   Save final results.
27:
end for
28:
Output: Final parameters and evaluation results.
This framework demonstrates the integration of federated learning, the BRB mechanism, and PSO for effective Alzheimer’s disease diagnosis. The iterative process of local training, parameter sharing, and global model updating ensures efficient training and accurate diagnostic results. The detailed structure and mechanisms of the BRB ensure that the system can handle data variability and uncertainties effectively. This Algorithm 1 integrates federated learning (FL) with particle swarm optimization (PSO) to optimize a belief rule base (BRB) model aimed at diagnosing Alzheimer’s disease. The key steps are as follows:
First, the dataset, consisting of features such as “CNN_Value”, “Age”, and “Group”, is divided among three clients. Each client scales the data using a MinMaxScaler, processes it, and stores the results for use in the federated learning process.
In the BRB model, functions are defined to transform input values into belief degrees and calculate matching degrees, which are used to compute the overall belief for a diagnosis. The model’s performance is assessed using Mean Squared Error (MSE) to ensure accuracy.
During the optimization phase, each client independently applies PSO to tune the BRB parameters. The belief degrees are normalized, and the model’s accuracy is evaluated on training and testing data. After each iteration, the parameters from all clients are averaged using the FedAvg technique to create a global model.
The global model undergoes further refinement with PSO, and its performance is evaluated on the entire dataset. Finally, the algorithm outputs the optimized parameters and classification results, which can be used to diagnose Alzheimer’s disease more effectively.

4. Results

4.1. Libraries and Packages for System Implementation

The Alzheimer’s disease prediction system was developed using a comprehensive set of Python libraries, each serving specific roles in the process from data preparation to model deployment. The system started by converting MRI images from the NIFTI format to raw PNG files using libraries like Nibabel, NumPy, and PIL, enabling the images to be used as input for the convolutional neural network (CNN). The CNN, built with TensorFlow and Keras, processed these images to predict disease stages, and its outputs were structured into CSV files using Pandas. These predictions were then merged with demographic data, cleaned by removing redundant values, and a crisp value was calculated using NumPy to quantify the prediction outcomes.
The system also included a custom-built federated learning framework, which involved manual implementation of the trained belief rule-based (BRB) model and particle swarm optimization (PSO) for parameter tuning. The federated learning setup was established by creating secure connections between client nodes and a central server using libraries like Sockets, Paramiko, and Flask. This setup allowed local models to be trained independently on different datasets and then aggregated at the server to form a global model, ensuring data privacy while improving predictive accuracy. The entire process was managed and executed in Visual Studio Code, providing a robust environment for developing and refining the Alzheimer’s disease prediction system. The code constructed for the development of the Alzheimer’s Diagnostic System is available on ERROR: Failed to execute system command: GitHub.

4.2. Hyperparameter Setting

The hyperparameter settings for the Federated Averaging (FedAvg) algorithm in diagnosing Alzheimer’s disease are outlined in Table 5. Initially, the CNN prediction data combined with demographic data are divided among three clients. In each client, 80% and 20% of data are available for training and testing of the optimized BRB model. For optimization of BRB, particle swarm optimization (PSO) is used. The parameters in PSO have a swarm size of 50 and a maximum of 100 iterations, with parameter bounds ranging from 0 to 1 for each of the 38 parameters. A total of five iterations is performed in PSO.

4.3. Evaluation Metrics

Mean square error (MSE) is a critical metric used to evaluate the performance of models in detecting Alzheimer’s disease. MSE measures the average of the squares of the errors, which are the differences between the predicted outcomes and the actual outcomes. It provides a quantitative assessment of the model’s prediction accuracy, particularly in medical diagnosis contexts such as Alzheimer’s disease detection.
MSE = 1 n i = 1 n ( y i y ^ i ) 2
where:
  • n is the number of observations or patients in the dataset.
  • y i represents the actual diagnostic score or label for the i-th patient.
  • y ^ i denotes the predicted diagnostic score or label for the i-th patient made by the model.
In the context of Alzheimer’s disease detection, y i might correspond to the true diagnostic status of a patient (e.g., Mild Cognitive Impairment, or Alzheimer’s Disease), and y ^ i represents the model’s prediction. The MSE metric helps to quantify how closely the model’s predictions match the true diagnostic statuses.
A lower MSE value indicates that the model’s predictions are close to the actual diagnostic results, suggesting a higher level of accuracy in detecting Alzheimer’s disease. Achieving a low MSE is crucial for ensuring the reliability of the model in clinical settings, where accurate detection of Alzheimer’s disease is essential for timely intervention and treatment.

4.4. Performance of CNN for MRI Classification of Alzheimer’s

Figure 5 illustrates the loss and accuracy graph of the customised CNN used to train and test the MRI images extracted from the ADNI dataset. It is observed that the model reached an accuracy of approximately 98% and a loss of 0.01%. It can be deduced that there is little or no difference between the training and testing accuracy and loss. This means that the model is well fit.

4.5. Rule Base Overview for Federated Learning Clients and Servers

The tables below represent the outcome of the rule base for the three iterations. This section is divided into two Section 4.6 and Section 4.7: one for the clients and one for the server. Section 4.6 indicates the client-side rule-base as the input rule-base which takes into account the following parameters: the CNN prediction value represented as “CNN_Value”, the age of the person represented as “Age”, and the predicted Alzheimer’s level represented as “Group”. In contrast, Section 4.7 indicates the server-side rule base as the output rule base, which takes into account the optimized values obtained after training the BRB. The optimized parameters are the likelihood of the person being diagnosed with Alzheimer’s which is distributed in three levels: Cognitively Normal “CN”, Mild Cognitive Impairment “MCI”, and Alzheimer’s Disease “AD”. To find more information about the inheritance of the rule base, refer to the initial rule base depicted in Table 4 of Section 3.4.3.

4.6. Rule Base for Local Training Model on Client Side

In this section, the client-side rule base from three different aggregators, namely FedAvg, FedProx, and Genetic Algorithm, are demonstrated in the following ables. Each table (Table 6, Table 7, Table 8, Table 9, Table 10, Table 11, Table 12, Table 13, Table 14, Table 15, Table 16, Table 17, Table 18, Table 19, Table 20, Table 21, Table 22, Table 23, Table 24, Table 25, Table 26, Table 27, Table 28, Table 29, Table 30, Table 31 and Table 32) illustrates the type of aggregator used, the client, and the number of iterations.

4.7. Optimized Rule Base for Global Training Model on Server Side

In this section, the server-side rule base from three different aggregators, namely FedAvg, FedProx, and Genetic Algorithm, are demonstrated in the following tables. Each table (Table 33, Table 34, Table 35, Table 36, Table 37, Table 38, Table 39, Table 40 and Table 41) illustrates the type of aggregator used, the client, and the number of iterations.

4.8. Graphical Representation of Accuracy and Loss for the Clients and Servers

In this section, some illustrations display the accuracy and loss data, for clients and servers using aggregation methods like FedAvg, Genetic Algorithm, and FedProx. These visuals present a comparison of performance measures, by depicting how local accuracy/loss changes throughout the iterations.
  • Figure 6: The graph visualizes the Global Accuracy derived using the FedAvg aggregator in the three iterations. The x-axis represents the number of iterations, and the y-axis shows the accuracy achieved during global training.
  • Figure 7: This plot dictates the Local Client Accuracy from the FedAvg aggregator for each client across the three iterations. The accuracy for each client is plotted independently, demonstrating the performance consistency or variability among the clients.
  • Figure 8: This figure presents the Global Accuracy achieved using the Genetic Algorithm aggregator. The plot distinguishes the training versus testing accuracy over the three iterations, illustrating the effectiveness of the Genetic Algorithm in global optimization.
  • Figure 9: Similar to Figure 7, this plot shows the Local Accuracy for each client when using the Genetic Algorithm aggregator. The fluctuation in accuracy among clients across iterations is depicted.
  • Figure 10: This plot depicts the Global Loss during training and testing using the Genetic Algorithm aggregator. The y-axis represents the loss, and the x-axis shows the iterations. A reduction in loss over iterations indicates better convergence.
  • Figure 11: This figure shows the Local Loss for each client under the Genetic Algorithm aggregator. The figure aids in visualizing how individual clients are optimizing their loss functions.
  • Figure 12: This plot shows the Global Accuracy achieved using the FedProx aggregator. It illustrates the effectiveness of FedProx in improving the global model accuracy across iterations.
  • Figure 13: Finally, this plot presents the Local Accuracy for each client when using the FedProx aggregator. The comparison among clients is essential to understanding the federated learning model’s performance across distributed data sources.
These plots offer an analysis of the precision and error measurements when using approaches, in federated learning aiding in determining the best method suited to a specific situation.

4.9. Interpretation of Model Performance of Aggregators Used in Federated Learning

This study investigates how aggregation techniques, namely FedAvg, FedProx, and Genetic Algorithm, work within a federated learning (FL) setup to detect Alzheimer’s disease by analyzing images and demographic data. The findings detailed in Table 42 shed light on how each method performs on an overall scale.

4.9.1. FedAvg Aggregator

The FedAvg aggregator, which averages the model parameters from each client, demonstrated remarkable performance. At the global level, it achieved a nearly flawless testing accuracy of 0.99999 and a minimal mean squared error (MSE) of 0.00465. This impressive performance can be attributed to several factors inherent to the FedAvg algorithm:
  • Robust Aggregation of Model Updates: The FedAvg method involves averaging the model parameters from clients to reduce the impact of data differences, among them. By blending these updates, the overall model can effectively adapt to datasets resulting in improved performance.
  • Effective Handling of Data Heterogeneity: In federated learning setups, there is often a challenge, with IID (Independent and Identically Distributed) data, where various clients possess distinct data distributions. To address this, FedAvg tackles the problem by combining updates from all clients, allowing for a representation of data patterns and minimizing the chances of tailoring to any one client’s specific data.
  • Generalization Across Multiple Clients: The variety of data from clients helps the overall model perform effectively in situations. This is shown by the testing accuracy of the model, which is close to 1.0, proving its excellent performance, with different types of data inputs.
Despite its performance, there are clear differences, in accuracy levels across different clients. For instance, client 3 only managed to achieve a testing accuracy of 0.52830, underscoring the difficulties FedAvg may face in adapting to client nuances. This disparity indicates that, while FedAvg excels in building a model, it might struggle to account for the distinct traits present in individual client datasets. Moreover, from Table 43, it can be deduced that the average time required to run PSO is 7,721.84 s. Now, if the number of iteration alongside rules increases, the time required to run BRBs will also increase. So, this raises an issue of time complexity which will not be applicable for real-time applications which support large-scale datasets with large rule sets.

4.9.2. FedProx Aggregator

FedProx, a modified version of FedAvg that incorporates a term to address data differences efficiently, also exhibited impressive results. It reached a testing accuracy of 0.68944 with an MSE of 0.00687. The inclusion of the term in FedProx contributes to enhancing the training process in scenarios with diverse data.
The consistent performance outcomes among clients using FedProx imply that its regularization features offer benefits in maintaining performance stability across varying data distributions. For example, client 1 achieved a testing accuracy of 0.61111, notably surpassing the results seen with FedAvg, suggesting that FedProx might excel in managing types of data disparities.

4.9.3. Genetic Algorithm Aggregator

The Genetic Algorithm (GA) aggregator, which evolves model parameters based on a fitness function, demonstrated potential in adapting to client-specific data. For example, client 1 achieved a testing accuracy of 0.29629 with an MSE of 0.032451. However, the global accuracy was lower, with a testing accuracy of 0.43478 and an MSE of 0.00904.
The lower global accuracy implies that, while GA is adept at fine-tuning models for individual clients, it may require more extensive fine-tuning or a larger population size to reach the same level of generalization as FedAvg. Nonetheless, the adaptability of GA highlights its potential for scenarios wherein client data vary significantly, suggesting that, with further refinement, GA could become a more robust aggregator in FL settings.

4.10. Comparison with State-of-the-Art Methods

In times, there have been advancements in methods to improve the accuracy and reliability of diagnosing Alzheimer’s disease (see Table 44). These cutting-edge techniques leverage sophisticated machine learning models. However, our proposed method shows progress in accuracy and maintaining privacy setting, a benchmark in this field.
For example, Umme [1] and colleagues utilized DementiaNet with transfer learning to achieve a 97% accuracy on an MRI dataset with 6400 images. While this technique enhances performance on medical imaging tasks using trained models, it falls short in addressing data privacy concerns and uncertainties. Similarly, Raees et al. [40] combined support vector machines (SVMs) with deep neural networks (DNNs) to analyze data from 111 individuals in an MCI dataset achieving a 90% accuracy rate. While effective for datasets, this approach may require modifications to generalize to larger and more complex datasets. Buvaneswari et al. [41] experimented with SegNet and ResNet 101 on the ADNI dataset obtaining an accuracy of 96%. This method makes use of learning for feature extraction and segmentation, proving particularly effective for datasets like ADNI. However, there are still challenges when it comes to maintaining data privacy and handling uncertainties in the predictions. In a study by Saratxaga et al. [42] (2021), they created a model using ResNet 18 and BrainNet utilizing datasets from Kaggle. They found varying levels of accuracy ranging from 80% to 90% depending on the dataset used. While their method shows promise, the fluctuations in accuracy indicate a need for improvements to ensure consistency across datasets. Another study by Hu et al. [43] (2021) employed a convolutional neural network (CNN) trained on ADNI and NIFD datasets achieving an accuracy of 92%. CNNs are known for their effectiveness in image classification tasks as they can automatically learn features from raw image data. However, similar to other approaches, this method lacks mechanisms for safeguarding data privacy during the training process. Khalil et al. [5] (2023) investigated the use of hardware acceleration with VHDL and FPGA for diagnosing Alzheimer’s disease using simulation data. Despite reaching an accuracy of 89%, with a sensitivity rate of 87% and low power consumption (35–39 mW), this technique primarily emphasizes hardware efficiency rather than enhancing diagnostic accuracy or addressing concerns regarding data privacy. In their work, Mitrovska et al. [6] (2024) introduced a federated learning framework integrated with aggregation (SecAgg) and demographic simulations. The researchers focus on the importance of ensuring privacy in their work and analyze variations although they do not provide accuracy metrics. This underscores the increasing awareness of privacy in handling data. This also suggests that their method may need further refinement to achieve precise diagnostic results. In their study, Trivedi et al. [32] and colleagues applied federated deep learning using AlexNet on a distributed dataset achieving an accuracy rate of 98.53%. Their research stands out for its emphasis on safeguarding data privacy and demonstrating reliability across both multiple client settings. While this approach marks progress in learning, there is still room for enhancing the management of data distributions and addressing uncertainties. Altalbe et al. [33] developed a network within a federated learning framework for analyzing Kay Elemetrics voice disorder data reaching an accuracy level of 82.82%. Their strategy concentrates on mitigating overfitting and upholding privacy standards for medical applications. However, there is potential for enhancing accuracy, particularly when dealing with more varied datasets. Mandawkar et al. [34] utilized the Tawny Flamingo Deep CNN model on datasets achieving accuracies of 98.252% through Kfold validation and 97.995% during training sessions. Their methodology showcases the effectiveness of adjustments in attaining precision results; however, it falls short in addressing privacy concerns or uncertainties. Castro et al. [35] merged federated learning with biometric authentication and tested this on the OASIS and ADNI datasets. They found accuracy levels focused primarily on safeguarding privacy and preventing data manipulation, both essential aspects in federated learning settings.This strategy highlights the balance between security and performance, although there is room for enhancements in accuracy to improve its practical utility. On a similar note, Qian et al. [9] introduced FeDeFo, a combination of deep forest with federated learning applied to sMRI images. Their approach prioritizes model development while ensuring data privacy even though specific accuracy metrics were not disclosed. This indicates an emphasis on privacy protection and customized modeling, suggesting advancements in accuracy down the line.
In contrast, our research proposes a hybrid deep and federated learning methodology implemented on the ADNI and NIfTI files dataset. This approach achieved a 99.9% accuracy rate, significantly surpassing existing methods’ reported accuracies. Moreover, our model prioritizes data privacy and effectively manages uncertainties, offering a solution to the challenges associated with diagnosing Alzheimer’s disease. By combining the strengths of learning with the benefits of federated learning, our model not only establishes a new standard for diagnostic precision, but also addresses crucial concerns regarding data privacy and uncertainty management. This makes it a robust and dependable tool for world applications.

5. Conclusions and Future Research Directions

In this study, a method is proposed for diagnosing Alzheimer’s disease, addressing challenges such as data variability, privacy concerns, and uncertainty in datasets. CNNs analyze MRI scans, while federated learning ensures patient data privacy, and the BRB system manages diagnostic uncertainty. Experiments evaluating FedAvg, FedProx, and Genetic Algorithm aggregation methods revealed that FedAvg achieved near-perfect accuracy (99.99%) and minimal mean squared error (MSE), demonstrating strong generalization across datasets but limited performance for client-specific variations. FedProx delivered consistent results by effectively addressing client-specific data differences through regularization, while the Genetic Algorithm showed adaptability to diverse data distributions, requiring additional fine-tuning but highlighting its potential for optimizing client-specific parameters.
The proposed framework applies convolutional neural networks (CNNs) in Alzheimer’s disease classification. This heavily relies on access to labeled datasets. Even with the use of techniques like data augmentation and transfer learning, there is still a challenge in ensuring the model’s adaptability to unseen data.
In order to overcome this limitation, the FL framework was employed. This framework is beneficial in sensitive domains such as healthcare. Because it ensures that patient data remain localized on client devices, FL also minimizes the risk of privacy breaches. In the FL approach, these advantages are further enhanced by optimizing the model’s performance across heterogeneous client datasets to enable robust training under uncertainty without compromising data privacy.
However, it is important to recognize the limitations of this research. The implementation of federated learning (FL) which plays a role in safeguarding data privacy, has brought about challenges concerning communication model convergence. In scenarios wherein datasets vary widely, maintaining model performance across nodes has proven to be quite challenging. Despite the utilization of strategies like model averaging and communication-efficient algorithms, there is still a need for optimization within FL frameworks.
Regarding the belief rule base (BRB) system, known for its effectiveness in handling uncertainty, it poses complexity concerns, especially when combined with particle swarm optimization (PSO). This complexity could potentially hinder large-scale applications in real-time situations that require decision-making.
Several avenues for exploration are proposed for future to improve the efficiency and applicability of the framework in real-life situations.
To optimize training and inference times, hardware acceleration tools should be explored to address the complexity of federated learning and belief rule-based systems. These tools can significantly reduce computational overhead, making the framework more feasible for resource-constrained environments. Enhanced data security measures, such as privacy-preserving techniques and homomorphic encryption, should also be integrated to safeguard patient information beyond the privacy provided by federated learning.
Testing the scalability of the framework with diverse datasets is essential, along with extending its application to other neurodegenerative diseases or medical conditions requiring privacy and uncertainty management. This would highlight the framework’s adaptability and robustness across various healthcare contexts.
Practical considerations for large-scale deployment, including resource requirements, must be addressed. Collaborations with healthcare providers and regulatory agencies will be essential to ensure the system meets high diagnostic standards and integrates seamlessly into existing healthcare processes.
Future research should focus on incorporating additional data modalities, such as genetic information, lifestyle factors, and longitudinal data, to improve the precision of the diagnosis of Alzheimer’s disease. Expanding data sources could improve the model’s ability to detect patterns and relationships, enabling more precise and comprehensive diagnoses. Trials in real-world settings will be crucial to validate the framework’s efficacy, with collaboration from healthcare institutions to deploy and monitor its performance while gathering feedback from experts. Moreover, future work should explore cross-dataset performance evaluation to enhance generalization and extend the framework’s applicability across broader medical domains.

Author Contributions

Conceptualization, N.B. and K.A.; methodology, N.B. and K.A.; software, N.B. and K.A.; validation, N.B., T.M., R.U.I. and K.A.; formal analysis, N.B. and K.A.; data curation, N.B., T.M., R.U.I. and K.A.; writing—original draft preparation, N.B.; writing—review and editing, N.B., T.M., R.U.I. and K.A.; visualization, N.B. and T.M.; supervision, K.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

All necessary informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data used to support the findings of this study are available upon reasonable request to the corresponding author.

Acknowledgments

All praise and glory to Almighty Allah for giving me the endurance to pursue this research and complete my academic journey. We extend my deepest gratitude to my thesis supervisor, Karl Andersson, for his invaluable guidance, mentorship, and insightful feedback. His support was crucial to the success of this research. We would also like to express my sincere appreciation to the members of the GENIAL committee for their constructive critiques and suggestions that greatly enhanced the quality of this work. My gratitude goes to Jean-Philippe Georges, Program Coordinator, for always managing our requirements and addressing our challenges, Ah-Lian Kor from Leeds Beckett University, for always looking after us and supporting our academic journey, and Karan Mitra from Luleå University of Technology. Lastly, we wish to acknowledge my family and friends for their unwavering love, prayers, and encouragement, which have been my greatest source of strength throughout this journey. This work is dedicated to them.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Umme Habiba, S.; Debnath, T.; Islam, M.K.; Nahar, L.; Shahadat Hossain, M.; Basnin, N.; Andersson, K. Transfer Learning-Assisted DementiaNet: A Four Layer Deep CNN for Accurate Alzheimer’s Disease Detection from MRI Images. In Proceedings of the International Conference on Brain Informatics, Hoboken, NJ, USA, 1–3 August 2023; Springer: Cham, Switzerland, 2023; pp. 383–394. [Google Scholar]
  2. Mahmud, T.; Aziz, M.T.; Uddin, M.K.; Barua, K.; Rahman, T.; Sharmen, N.; Shamim Kaiser, M.; Sazzad Hossain, M.; Hossain, M.S.; Andersson, K. Ensemble Learning Approaches for Alzheimer’s Disease Classification in Brain Imaging Data. In Proceedings of the International Conference on Trends in Electronics and Health Informatics, Dhaka, Bangladesh, 20–21 December 2023; Springer: Cham, Switzerland, 2023; pp. 133–147. [Google Scholar]
  3. Mahmud, T.; Barua, K.; Barua, A.; Das, S.; Basnin, N.; Hossain, M.S.; Andersson, K.; Kaiser, M.S.; Sharmen, N. Exploring Deep Transfer Learning Ensemble for Improved Diagnosis and Classification of Alzheimer’s Disease. In Proceedings of the International Conference on Brain Informatics, Hoboken, NJ, USA, 1–3 August 2023; Springer: Cham, Switzerland, 2023; pp. 109–120. [Google Scholar]
  4. Hossain, M.S.; Rahaman, S.; Mustafa, R.; Andersson, K. A belief rule-based expert system to assess suspicion of acute coronary syndrome (ACS) under uncertainty. Soft Comput. 2018, 22, 7571–7586. [Google Scholar] [CrossRef]
  5. Khalil, K.; Khan Mamun, M.M.R.; Sherif, A.; Elsersy, M.S.; Imam, A.A.A.; Mahmoud, M.; Alsabaan, M. A federated learning model based on hardware acceleration for the early detection of alzheimer’s disease. Sensors 2023, 23, 8272. [Google Scholar] [CrossRef] [PubMed]
  6. Mitrovska, A.; Safari, P.; Ritter, K.; Shariati, B.; Fischer, J.K. Secure federated learning for Alzheimer’s disease detection. Front. Aging Neurosci. 2024, 16, 1324032. [Google Scholar] [CrossRef]
  7. Mahmud, T.; Barua, K.; Habiba, S.U.; Sharmen, N.; Hossain, M.S.; Andersson, K. An Explainable AI Paradigm for Alzheimer’s Diagnosis Using Deep Transfer Learning. Diagnostics 2024, 14, 345. [Google Scholar] [CrossRef] [PubMed]
  8. Beheshti, Z.; Shamsuddin, S.M.H.; Beheshti, E.; Yuhaniz, S.S. Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis. Soft Comput. 2014, 18, 2253–2270. [Google Scholar] [CrossRef]
  9. Qian, C.; Xiong, H.; Li, J. FeDeFo: A Personalized Federated Deep Forest Framework for Alzheimer’s Disease Diagnosis. In Proceedings of the 35th International Conference on Software Engineering and Knowledge Engineering, San Francisco, CA, USA, 1–10 July 2023. [Google Scholar]
  10. Hossain, M.S.; Ahmed, F.; Andersson, K. A belief rule based expert system to assess tuberculosis under uncertainty. J. Med. Syst. 2017, 41, 43. [Google Scholar] [CrossRef]
  11. Bordin, V.; Coluzzi, D.; Rivolta, M.W.; Baselli, G. Explainable AI points to white matter hyperintensities for Alzheimer’s disease identification: A preliminary study. In Proceedings of the 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, UK, 11–15 July 2022; IEEE: New York, NY, YSA, 2022; pp. 484–487. [Google Scholar]
  12. De Santi, L.A.; Pasini, E.; Santarelli, M.F.; Genovesi, D.; Positano, V. An Explainable Convolutional Neural Network for the Early Diagnosis of Alzheimer’s Disease from 18F-FDG PET. J. Digit. Imaging 2023, 36, 189–203. [Google Scholar] [CrossRef] [PubMed]
  13. García-Gutierrez, F.; Díaz-Álvarez, J.; Matias-Guiu, J.A.; Pytel, V.; Matías-Guiu, J.; Cabrera-Martín, M.N.; Ayala, J.L. GA-MADRID: Design and validation of a machine learning tool for the diagnosis of Alzheimer’s disease and frontotemporal dementia using genetic algorithms. Med. Biol. Eng. Comput. 2022, 60, 2737–2756. [Google Scholar] [CrossRef]
  14. Yang, C.; Rangarajan, A.; Ranka, S. Visual explanations from deep 3D convolutional neural networks for Alzheimer’s disease classification. In Proceedings of the AMIA Annual Symposium Proceedings, San Francisco, CA, USA, 3–7 November 2018; American Medical Informatics Association: Bethesda, MD, USA, 2018; Volume 2018, p. 1571. [Google Scholar]
  15. Rieke, J.; Eitel, F.; Weygandt, M.; Haynes, J.D.; Ritter, K. Visualizing convolutional networks for MRI-based diagnosis of Alzheimer’s disease. In Proceedings of the Understanding and Interpreting Machine Learning in Medical Image Computing Applications: First International Workshops, MLCN 2018, DLF 2018, and iMIMIC 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, 16–20 September 2018; Proceedings 1. Springer: Cham, Switzerland, 2018; pp. 24–31. [Google Scholar]
  16. Loveleen, G.; Mohan, B.; Shikhar, B.S.; Nz, J.; Shorfuzzaman, M.; Masud, M. Explanation-driven hci model to examine the mini-mental state for alzheimer’s disease. Acm Trans. Multimed. Comput. Commun. Appl. 2023, 20, 1–16. [Google Scholar] [CrossRef]
  17. Yu, L.; Xiang, W.; Fang, J.; Chen, Y.P.P.; Zhu, R. A novel explainable neural network for Alzheimer’s disease diagnosis. Pattern Recognit. 2022, 131, 108876. [Google Scholar] [CrossRef]
  18. Kim, M.; Kim, J.; Qu, J.; Huang, H.; Long, Q.; Sohn, K.A.; Kim, D.; Shen, L. Interpretable temporal graph neural network for prognostic prediction of Alzheimer’s disease using longitudinal neuroimaging data. In Proceedings of the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 9–12 December 2021; IEEE: New York, NY, YSA, 2021; pp. 1381–1384. [Google Scholar]
  19. Marwa, E.G.; Moustafa, H.E.D.; Khalifa, F.; Khater, H.; AbdElhalim, E. An MRI-based deep learning approach for accurate detection of Alzheimer’s disease. Alex. Eng. J. 2023, 63, 211–221. [Google Scholar]
  20. Ghazal, T.M.; Abbas, S.; Munir, S.; Khan, M.; Ahmad, M.; Issa, G.F.; Zahra, S.B.; Khan, M.A.; Hasan, M.K. Alzheimer Disease Detection Empowered with Transfer Learning. Comput. Mater. Contin. 2022, 70, 5005–5019. [Google Scholar] [CrossRef]
  21. AlSaeed, D.; Omar, S.F. Brain MRI analysis for Alzheimer’s disease diagnosis using CNN-based feature extraction and machine learning. Sensors 2022, 22, 2911. [Google Scholar] [CrossRef] [PubMed]
  22. Hamdi, M.; Bourouis, S.; Rastislav, K.; Mohmed, F. Evaluation of neuro images for the diagnosis of Alzheimer’s disease using deep learning neural network. Front. Public Health 2022, 10, 834032. [Google Scholar]
  23. Helaly, H.A.; Badawy, M.; Haikal, A.Y. Deep learning approach for early detection of Alzheimer’s disease. Cogn. Comput. 2021, 14, 1711–1727. [Google Scholar] [CrossRef]
  24. Mohammed, B.A.; Senan, E.M.; Rassem, T.H.; Makbol, N.M.; Alanazi, A.A.; Al-Mekhlafi, Z.G.; Almurayziq, T.S.; Ghaleb, F.A. Multi-method analysis of medical records and MRI images for early diagnosis of dementia and Alzheimer’s disease based on deep learning and hybrid methods. Electronics 2021, 10, 2860. [Google Scholar] [CrossRef]
  25. Pradhan, A.; Gige, J.; Eliazer, M. Detection of Alzheimer’s disease (AD) in MRI images using deep learning. Int. J. Eng. Res. Technol. 2021, 10. [Google Scholar]
  26. Salehi, A.W.; Baglat, P.; Sharma, B.B.; Gupta, G.; Upadhya, A. A CNN model: Earlier diagnosis and classification of Alzheimer disease using MRI. In Proceedings of the 2020 International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 10–12 September 2020; IEEE: New York, NY, YSA, 2020; pp. 156–161. [Google Scholar]
  27. Suganthe, R.C.; Latha, R.S.; Geetha, M.; Sreekanth, G.R. Diagnosis of Alzheimer’s disease from brain magnetic resonance imaging images using deep learning algorithms. Adv. Electr. Comput. Eng. 2020, 20, 57–64. [Google Scholar] [CrossRef]
  28. Hussain, E.; Hasan, M.; Hassan, S.Z.; Azmi, T.H.; Rahman, M.A.; Parvez, M.Z. Deep learning based binary classification for alzheimer’s disease detection using brain mri images. In Proceedings of the 2020 15th IEEE Conference on Industrial Electronics and Applications (ICIEA), Kristiansand, Norway, 9–13 November 2020; IEEE: New York, NY, YSA, 2020; pp. 1115–1120. [Google Scholar]
  29. Basaia, S.; Agosta, F.; Wagner, L.; Canu, E.; Magnani, G.; Santangelo, R.; Filippi, M.; The Alzheimer’s Disease Neuroimaging Initiative. Automated classification of Alzheimer’s disease and mild cognitive impairment using a single MRI and deep neural networks. Neuroimage Clin. 2019, 21, 101645. [Google Scholar] [CrossRef]
  30. Ji, H.; Liu, Z.; Yan, W.Q.; Klette, R. Early diagnosis of Alzheimer’s disease using deep learning. In Proceedings of the 2nd International Conference on Control and Computer Vision, Jeju, Republic of Korea, 15–18 June 2019; pp. 87–91. [Google Scholar]
  31. Islam, J.; Zhang, Y. A novel deep learning based multi-class classification method for Alzheimer’s disease detection using brain MRI data. In Proceedings of the Brain Informatics: International Conference, BI 2017, Beijing, China, 16–18 November 2017; Proceedings. Springer: Cham, Switzerland, 2017; pp. 213–222. [Google Scholar]
  32. Trivedi, N.K.; Jain, S.; Agarwal, S. Identifying and Categorizing Alzheimer’s Disease with Lightweight Federated Learning Using Identically Distributed Images. In Proceedings of the 2024 11th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 14–15 March 2024; IEEE: New York, NY, YSA, 2024; pp. 1–5. [Google Scholar]
  33. Altalbe, A.; Javed, A.R. Privacy Preserved Brain Disorder Diagnosis Using Federated Learning. Comput. Syst. Sci. Eng. 2023, 47, 2187–2200. [Google Scholar] [CrossRef]
  34. Mandawkar, U.; Diwan, T. Alzheimer disease classification using tawny flamingo based deep convolutional neural networks via federated learning. Imaging Sci. J. 2022, 70, 459–472. [Google Scholar] [CrossRef]
  35. Castro, F.; Impedovo, D.; Pirlo, G. A Federated Learning System with Biometric Medical Image Authentication for Alzheimer’s Diagnosis. In Proceedings of the International Conference on Pattern Recognition Applications and Methods (ICPRAM), Rome, Italy, 24–26 February 2024; pp. 951–960. [Google Scholar]
  36. Biswas, M.; Chowdhury, S.U.; Nahar, N.; Hossain, M.S.; Andersson, K. A belief rule base expert system for staging non-small cell lung cancer under uncertainty. In Proceedings of the 2019 IEEE International Conference on Biomedical Engineering, Computer and Information Technology for Health (BECITHCON), Dhaka, Bangladesh, 28–30 November 2019; IEEE: New York, NY, YSA, 2019; pp. 47–52. [Google Scholar]
  37. Mandal, A.K.; Sarma, P.K.D. Usage of particle swarm optimization in digital images selection for monkeypox virus prediction and diagnosis. Malays. J. Comput. Sci. 2024, 37, 124–139. [Google Scholar]
  38. Alzheimer’s Disease Neuroimaging Initiative (ADNI). Available online: https://adni.loni.usc.edu. (accessed on 25 February 2024).
  39. Wimo, A.; Jonsson, L.; Gustavsson, A.; McDaid, D.; Ersek, K.; Georges, J.; Gulácsi, L.; Karpati, K.; Kenigsberg, P.; Valtonen, H. Utility score changes in Alzheimer’s disease. Value Health 2009, 12, 295–305. [Google Scholar]
  40. Raees, P.M.; Thomas, V. Automated detection of Alzheimer’s Disease using Deep Learning in MRI. J. Phys. Conf. Ser. 2021, 1921, 012024. [Google Scholar] [CrossRef]
  41. Buvaneswari, P.; Gayathri, R. Deep learning-based segmentation in classification of Alzheimer’s disease. Arab. J. Sci. Eng. 2021, 46, 5373–5383. [Google Scholar] [CrossRef]
  42. Saratxaga, C.L.; Moya, I.; Picón, A.; Acosta, M.; Moreno-Fernandez-de Leceta, A.; Garrote, E.; Bereciartua-Perez, A. MRI deep learning-based solution for Alzheimer’s disease prediction. J. Pers. Med. 2021, 11, 902. [Google Scholar] [CrossRef] [PubMed]
  43. Hu, J.; Qing, Z.; Liu, R.; Zhang, X.; Lv, P.; Wang, M.; Wang, Y.; He, K.; Gao, Y.; Zhang, B. Deep learning-based classification and voxel-based visualization of frontotemporal dementia and Alzheimer’s disease. Front. Neurosci. 2021, 14, 626154. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Operational framework.
Figure 1. Operational framework.
Diagnostics 15 00080 g001
Figure 2. Data preparation.
Figure 2. Data preparation.
Diagnostics 15 00080 g002
Figure 3. Belief rule base system architecture.
Figure 3. Belief rule base system architecture.
Diagnostics 15 00080 g003
Figure 4. Trained belief rule base system.
Figure 4. Trained belief rule base system.
Diagnostics 15 00080 g004
Figure 5. CNN performance graph.
Figure 5. CNN performance graph.
Diagnostics 15 00080 g005
Figure 6. Global accuracy from FedAvg aggregator.
Figure 6. Global accuracy from FedAvg aggregator.
Diagnostics 15 00080 g006
Figure 7. Local client accuracy from FedAvg aggregator.
Figure 7. Local client accuracy from FedAvg aggregator.
Diagnostics 15 00080 g007
Figure 8. Global accuracy from Genetic Algorithm aggregator.
Figure 8. Global accuracy from Genetic Algorithm aggregator.
Diagnostics 15 00080 g008
Figure 9. Local accuracy from Genetic Algorithm aggregator.
Figure 9. Local accuracy from Genetic Algorithm aggregator.
Diagnostics 15 00080 g009
Figure 10. Global loss from Genetic Algorithm aggregator.
Figure 10. Global loss from Genetic Algorithm aggregator.
Diagnostics 15 00080 g010
Figure 11. Local loss from Genetic Algorithm aggregator.
Figure 11. Local loss from Genetic Algorithm aggregator.
Diagnostics 15 00080 g011
Figure 12. Global loss from FedProx aggregator.
Figure 12. Global loss from FedProx aggregator.
Diagnostics 15 00080 g012
Figure 13. Local loss from FedProx aggregator.
Figure 13. Local loss from FedProx aggregator.
Diagnostics 15 00080 g013
Table 1. Summary of key studies in CNN-based Alzheimer’s diagnosis.
Table 1. Summary of key studies in CNN-based Alzheimer’s diagnosis.
StudyData TypeMethodologyAccuracyKey FindingsLimitations
Marwa et al. [19]MRI ImagesDNN99.68%High accuracy for various stagesLimited generalizability
Ghazal et al. [20]MRI ImagesTransfer Learning (AlexNet)91.7%Effective for stage classificationPotential overfitting
AlSaeed et al. [21]MRI ImagesCNN (ResNet50)85.7–99%Superior performance in feature extractionNot evaluated on diverse datasets
Helaly et al. [23]MRI ImagesCNN, Transfer Learning97%VGG-19 model performed well in classificationLimited comparison with other DL models
Mohammed et al. [24]MRI ImagesCNN, Hybrid Models94.8%Hybrid models improved diagnostic accuracyRequires large datasets for training
Ji et al. [30]MRI ImagesEnsemble Learning97.65%Early diagnosis of AD/mild cognitive impairmentEnsemble methods can be computationally expensive
Islam et al. [31]MRI ImagesCNNNot SpecifiedFast AD detection suitable for small datasetsNot detailed in feature extraction
Table 2. Key summary of studies in Alzheimer’s disease detection using federated learning.
Table 2. Key summary of studies in Alzheimer’s disease detection using federated learning.
StudyData TypeMethodologyAccuracyKey FindingsLimitations
Khalil et al. [5]MRI Images, Simulation DataFederated Learning (VHDL + FPGA)89%Efficient hardware acceleration for FL with low power consumptionLimited to specific hardware settings
Mitrovska et al. [6]Demographic and Diagnosis DataFederated Averaging (FedAvg) + Secure Aggregation (SecAgg)Not SpecifiedEnhanced privacy guarantees with FL; effective under diverse demographicsPotential complexity in real-world deployment
Trivedi et al. [32]MRI ImagesFederated Deep Learning (AlexNet)98.53%High accuracy in AD classification with privacy-preserving setupRequires IID datasets
Altalbe et al. [33]Voice Disorder DataFederated Learning (DNN)82.82%Overfitting reduced through iterative training; good privacy preservationLower accuracy compared to other models
Mandawkar et al. [34]Clinical DataTawny Flamingo Deep CNN + Federated Learning97.995%High accuracy in AD detection; effective in distributed settingsComputationally intensive
Castro et al. [35]RGB MRI ImagesFederated Learning + Biometric RecognitionComparable to Centralized MLPrivacy preservation in AD diagnosis with biometric authenticationPerformance depends on biometric accuracy
Qian et al. [9]sMRI ImagesFederated Deep Forest (FeDeFo)Not SpecifiedPersonalized AD classification; effective handling of data discrepanciesComplexity in model personalization
Table 3. Modified CNN model architecture.
Table 3. Modified CNN model architecture.
Model ContentDetails
1st Convolution Layer 2DInput Size 128 × 128 × 3, 16 filters of kernel size 3 × 3, he_normal initializer, L2 (0.01), LeakyReLU (alpha = 0.001)
1st Max Pooling Layer 2DPool size 2 × 2
Batch NormalizationNormalizes activations
2nd Convolution Layer 2D32 filters of kernel size 3 × 3, he_normal initializer, L2 (0.01), LeakyReLU (alpha = 0.001)
2nd Max Pooling Layer 2DPool size 2 × 2
3rd Convolution Layer 2D128 filters of kernel size 3 × 3, he_normal initializer, L2 (0.01), LeakyReLU (alpha = 0.001)
3rd Max Pooling Layer 2DPool size 2 × 2
Batch NormalizationNormalizes activations
Flattening LayerConverts 2D matrix to 1D vector
Dense Layer128 units, he_normal initializer, L2 (0.01), LeakyReLU (alpha = 0.001)
Dropout LayerRandomly deactivates 25% neurons
Dense Layer64 units, LeakyReLU (alpha = 0.001)
Dropout LayerRandomly deactivates 25% neurons
Output Layer4 units, SoftMax activation
Optimization FunctionAdam optimizer
Loss FunctionSparse Categorical Crossentropy
MetricsAccuracy
Table 4. Initial rule base.
Table 4. Initial rule base.
Rule IDRule WeightIFTHEN
Group
Activation Weight
CNN_Value Age
R11.0LowYoung-OldCN0.36
R21.0LowMiddle-OldMCI0.40
R31.0LowOld-OldMCI0.22
R41.0MediumYoung-OldMCI0.18
R51.0MediumMiddle-OldAD0.45
R61.0MediumOld-OldAD0.30
R71.0HighYoung-OldAD0.55
R81.0HighMiddle-OldAD0.65
R91.0HighOld-OldAD0.80
Table 5. Hyperparameter setting for FedAvg.
Table 5. Hyperparameter setting for FedAvg.
HyperparameterValue
Train-test split80% training, 20% testing
Scaling methodMinMaxScaler
PSO parameters:Swarm size: 50
Maximum iterations: 100
Lower bounds: [0, 0, …, 0] (38 times)
Upper bounds: [1, 1, …, 1] (38 times)
Number of clients3
Iterations for federated learning5
Table 6. Rule base for FedAvg aggregator client 1: first iteration.
Table 6. Rule base for FedAvg aggregator client 1: first iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.00.16560.74990.0845
R20.71230.74040.20010.0595
R30.00.49540.21100.2936
R40.78820.00720.26190.7309
R50.98210.66430.00000.3357
R60.96500.56830.00000.4317
R70.99710.07770.02120.9011
R80.44320.34390.65610.0000
R90.05420.00000.72700.2730
Table 7. Rule base for FedAvg aggregator client 2: first iteration.
Table 7. Rule base for FedAvg aggregator client 2: first iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.75120.00260.67590.3215
R20.54430.00000.51410.4859
R30.06660.00020.64580.3540
R40.50610.68150.31850.0000
R50.83890.74460.22650.0289
R60.33140.98310.01690.0000
R71.00000.00590.34320.6510
R80.00230.00140.46480.5338
R90.15760.29980.19730.5029
Table 8. Rule base for FedAvg aggregator client 3: first iteration.
Table 8. Rule base for FedAvg aggregator client 3: first iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.62340.52890.00000.4711
R20.18750.52480.21840.2568
R31.00001.00000.00000.0000
R41.00000.23850.01770.7437
R50.34680.41170.45240.1359
R60.90860.00010.99760.0023
R71.00000.00560.35510.6393
R80.63260.00000.40970.5903
R90.97360.58910.31600.0949
Table 9. Rule base for FedAvg aggregator client 1: second iteration.
Table 9. Rule base for FedAvg aggregator client 1: second iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.28690.72230.27770.0000
R20.64180.75590.12180.1223
R30.42510.14360.36810.4883
R40.98710.00460.30040.6950
R50.42960.00000.48480.5152
R60.00220.20940.00000.7906
R70.38910.28480.30850.4067
R80.49430.43550.00000.5645
R90.48810.45100.37790.1711
Table 10. Rule base for FedAvg aggregator client 2: second iteration.
Table 10. Rule base for FedAvg aggregator client 2: second iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.00000.48240.40450.1132
R20.28660.00000.77470.2253
R30.86410.08420.91580.0000
R40.00070.70540.27710.0175
R50.45540.37320.62680.0000
R60.26300.60740.00000.3926
R70.58050.86760.03460.0978
R81.00000.00410.22270.7731
R90.17640.00060.30660.6928
Table 11. Rule base for FedAvg aggregator client 3: second iteration.
Table 11. Rule base for FedAvg aggregator client 3: second iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.09070.57850.34100.0805
R20.66330.00390.22940.7666
R30.33300.08410.00030.9156
R40.00000.40990.32010.2700
R50.41400.00900.26670.7243
R60.66520.00030.69180.3079
R70.94620.90700.02950.0634
R80.90360.45950.00000.5405
R90.00000.40540.13140.4632
Table 12. Rule base for FedAvg aggregator client 1: third iteration.
Table 12. Rule base for FedAvg aggregator client 1: third iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.99970.00170.63850.3598
R20.43240.49620.50380.0000
R30.57500.72450.00000.2755
R40.89200.10160.00500.8934
R50.19370.73260.11640.1509
R60.28510.00010.49760.5023
R70.06320.84960.12040.0300
R80.91090.71540.00000.2846
R90.00000.65650.16590.1776
Table 13. Rule base for FedAvg aggregator client 2: third iteration.
Table 13. Rule base for FedAvg aggregator client 2: third iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.09070.57850.34100.0805
R20.66330.00390.22940.7666
R30.33300.08420.00030.9156
R40.00000.40990.32010.2700
R50.41400.00900.26670.7243
R60.66520.00030.69180.3079
R70.94620.90700.02950.0634
R80.90360.45950.00000.5405
R90.00000.40540.13140.4632
Table 14. Rule base for FedAvg aggregator client 3: third iteration.
Table 14. Rule base for FedAvg aggregator client 3: third iteration.
Rule IDRule WeightCNN_ValueAgeGroup
R10.99590.00000.68760.3124
R20.54080.55970.00000.4403
R30.00000.37820.28890.3329
R41.00000.07280.00070.9265
R50.08630.71500.04050.2447
R60.73420.15960.00000.8404
R70.07960.00000.48750.5125
R80.00010.54860.45140.0000
R90.68530.00000.31890.6810
Table 15. FedProx aggregator client 1 rule base: first iteration.
Table 15. FedProx aggregator client 1 rule base: first iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.09560.32510.50280.1721
R20.24570.67780.00000.3222
R30.70220.00000.64960.3504
R40.99380.73860.21600.0453
R50.73750.00040.63800.3616
R60.00110.04870.07420.8771
R70.70260.00000.82070.1793
R80.12630.04120.95880.0000
R91.00000.73820.00000.2618
Table 16. FedProx aggregator client 2 rule base: first iteration.
Table 16. FedProx aggregator client 2 rule base: first iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.32510.50280.1721
R20.24570.67780.00000.3222
R30.70220.00000.64960.3504
R40.99380.73860.21600.0453
R50.73750.00040.63800.3616
R60.00110.04870.07420.8771
R70.70260.00000.82070.1793
R80.12630.04120.95880.0000
R91.00000.73820.00000.2618
Table 17. FedProx aggregator client 3 rule base: first iteration.
Table 17. FedProx aggregator client 3 rule base: first iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.32510.50280.1721
R20.24570.67780.00000.3222
R30.70220.00000.64960.3504
R40.99380.73860.21600.0453
R50.73750.00040.63800.3616
R60.00110.04870.07420.8771
R70.70260.00000.82070.1793
R80.12630.04120.95880.0000
R91.00000.73820.00000.2618
Table 18. FedProx aggregator client 1 rule base: second iteration.
Table 18. FedProx aggregator client 1 rule base: second iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.21230.70770.0800
R20.89530.00000.71880.2812
R30.46880.47430.50550.0201
R40.40040.39130.22150.3872
R50.53390.00010.33510.6648
R60.43590.87040.09030.0393
R70.97640.64740.05100.3016
R80.82330.00050.09810.9015
R90.03010.41950.57280.0077
Table 19. FedProx aggregator client 2 rule base: second iteration.
Table 19. FedProx aggregator client 2 rule base: second iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.21230.70770.0800
R20.89530.00000.71880.2812
R30.46880.47430.50550.0201
R40.40040.39130.22150.3872
R50.53390.00010.33510.6648
R60.43590.87040.09030.0393
R70.97640.64740.05100.3016
R80.82330.00050.09810.9015
R90.03010.41950.57280.0077
Table 20. FedProx aggregator client 3 rule base: second iteration.
Table 20. FedProx aggregator client 3 rule base: second iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.21230.70770.0800
R20.89530.00000.71880.2812
R30.46880.47430.50550.0201
R40.40040.39130.22150.3872
R50.53390.00010.33510.6648
R60.43590.87040.09030.0393
R70.97640.64740.05100.3016
R80.82330.00050.09810.9015
R90.03010.41950.57280.0077
Table 21. FedProx aggregator client 1 rule base: third iteration.
Table 21. FedProx aggregator client 1 rule base: third iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.04890.00000.15790.8421
R20.98980.79540.00000.2046
R30.29920.26400.73600.0000
R40.50010.69760.00000.3024
R50.97210.46440.00000.5356
R60.51060.00000.78670.2133
R71.00000.00160.50320.4952
R80.64590.77560.22440.0000
R90.76540.67470.01830.3070
Table 22. FedProx aggregator client 2 rule base: third iteration.
Table 22. FedProx aggregator client 2 rule base: third iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.04890.00000.15790.8421
R20.98980.79540.00000.2046
R30.29920.26400.73600.0000
R40.50010.69760.00000.3024
R50.97210.46440.00000.5356
R60.51060.00000.78670.2133
R71.00000.00160.50320.4952
R80.64590.77560.22440.0000
R90.76540.67470.01830.3070
Table 23. FedProx aggregator client 3 rule base: third iteration.
Table 23. FedProx aggregator client 3 rule base: third iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.04890.00000.15790.8421
R20.98980.79540.00000.2046
R30.29920.26400.73600.0000
R40.50010.69760.00000.3024
R50.97210.46440.00000.5356
R60.51060.00000.78670.2133
R71.00000.00160.50320.4952
R80.64590.77560.22440.0000
R90.76540.67470.01830.3070
Table 24. Rule base for Genetic Algorithm aggregator client 1: first iteration.
Table 24. Rule base for Genetic Algorithm aggregator client 1: first iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.64030.00000.3597
R20.93760.24200.00000.7580
R30.00000.07750.05060.8719
R40.75950.15600.00070.8433
R50.53550.88030.00560.1141
R60.53290.08330.91670.0000
R70.60660.00060.59280.4065
R80.34100.50070.49930.0000
R90.33540.00030.10790.8918
Table 25. Rule base for Genetic Algorithm aggregator client 2: first iteration.
Table 25. Rule base for Genetic Algorithm aggregator client 2: first iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.09160.07890.8295
R21.00000.59240.00000.4076
R30.00640.00000.66150.3385
R40.88790.00060.12020.8792
R50.23350.69080.30920.0000
R60.89530.00020.12520.8746
R70.49550.00250.47040.5271
R81.00000.65850.33930.0023
R90.72710.69950.00000.3005
Table 26. Rule base for Genetic Algorithm aggregator client 3: first iteration.
Table 26. Rule base for Genetic Algorithm aggregator client 3: first iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.27600.61810.00000.3819
R20.60660.00020.00000.9998
R30.00000.15050.58680.2627
R40.33600.00000.14870.8513
R50.06480.29810.70190.0000
R60.89480.00000.69120.3088
R70.57530.41260.58740.0000
R80.02950.54790.43350.0187
R90.29850.55490.00000.4451
Table 27. Rule base for Genetic Algorithm aggregator client 1: second iteration.
Table 27. Rule base for Genetic Algorithm aggregator client 1: second iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.00000.24090.61080.1483
R20.99160.00000.87240.1276
R30.99820.00020.92260.0772
R40.44230.00020.99880.0011
R50.97520.85390.00000.1461
R60.32860.42220.57780.0000
R70.86760.94550.04420.0103
R80.92920.00180.12990.8683
R90.25860.76500.00000.2349
Table 28. Rule base for Genetic Algorithm aggregator client 2: second iteration.
Table 28. Rule base for Genetic Algorithm aggregator client 2: second iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.72870.00100.99900.0000
R21.00000.00000.43660.5634
R31.00000.00001.00000.0000
R40.40410.00000.85180.1482
R50.16970.00000.00001.0000
R60.71890.31200.00000.6880
R70.00500.45480.39230.1529
R80.54960.00000.02690.9731
R90.04090.64710.00000.3529
Table 29. Rule base for Genetic Algorithm aggregator client 3: second iteration.
Table 29. Rule base for Genetic Algorithm aggregator client 3: second iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.64380.40910.59090.0000
R20.64350.00090.58230.4168
R30.40930.00000.82760.1724
R40.73550.08050.00000.9195
R50.48070.00110.35560.6434
R60.84710.68130.31870.0000
R70.81340.52220.36300.1147
R81.00000.00090.95730.0422
R90.62951.00000.00000.0000
Table 30. Rule base for Genetic Algorithm aggregator client 1: third iteration.
Table 30. Rule base for Genetic Algorithm aggregator client 1: third iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.02590.00000.43710.5629
R20.95480.00050.17980.8202
R30.25210.01750.98250.0000
R40.61340.81460.00640.1790
R50.04100.00060.53370.4657
R60.78050.00010.16450.8354
R70.00000.50330.06310.4336
R80.72420.10370.89630.0000
R90.34250.43870.00000.5613
Table 31. Rule base for Genetic Algorithm aggregator client 2: third iteration.
Table 31. Rule base for Genetic Algorithm aggregator client 2: third iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.98410.78330.03780.1799
R20.01860.26790.37140.3608
R30.42920.14140.66340.1951
R40.64490.09050.00000.9095
R50.59030.00490.98380.0113
R60.00000.00080.17890.8204
R70.08080.64890.24000.1112
R81.00000.00850.22460.7669
R90.73500.18100.00100.8180
Table 32. Rule base for Genetic Algorithm aggregator client 3: third iteration.
Table 32. Rule base for Genetic Algorithm aggregator client 3: third iteration.
Rule IDRule WeightCNN ValueAgeGroup
R10.53670.80860.00000.1914
R20.49200.00050.21760.7818
R30.21670.00020.18880.8111
R40.39230.61960.38040.0000
R50.99930.00510.21940.7755
R60.00000.38210.23850.3794
R70.37960.63760.28020.0822
R80.29420.00120.21270.7861
R90.00000.20580.02770.7665
Table 33. Optimized server for FedAvg aggregator rule base: first iteration.
Table 33. Optimized server for FedAvg aggregator rule base: first iteration.
Rule IDRule WeightCNMCIAD
R10.96120.00000.96890.0311
R20.33880.00000.47850.5215
R30.08500.86470.07470.0606
R40.08060.49470.50530.0000
R50.36470.00000.84420.1558
R60.42420.00000.07720.9228
R70.44840.71690.28210.0010
R80.00000.00000.67210.3279
R90.64110.00001.00000.0000
Table 34. Final optimized server for FedAvg aggregator rule base: second iteration.
Table 34. Final optimized server for FedAvg aggregator rule base: second iteration.
Rule IDRule WeightCNMCIAD
R10.99970.07580.00000.9242
R20.20770.85080.14920.0000
R30.30420.31630.68370.0000
R40.24390.51590.09410.3899
R50.56730.00070.64320.3561
R60.86710.63330.29250.0742
R70.92150.55570.44430.0000
R80.56690.00330.09420.9025
R90.74530.26290.73710.0000
Table 35. Optimized server for FedAvg aggregator rule base: third iteration.
Table 35. Optimized server for FedAvg aggregator rule base: third iteration.
Rule IDRule WeightCNMCIAD
R10.36290.44770.00000.5523
R20.00000.23640.56480.1988
R30.00000.29700.32870.3742
R40.73160.45220.54780.0000
R50.19700.52900.47100.0000
R60.60370.67600.32410.0000
R70.27790.00010.18160.8183
R80.53310.90310.04490.0521
R90.23060.69620.00000.3038
Table 36. Optimized server for FedProx aggregator rule base: first iteration.
Table 36. Optimized server for FedProx aggregator rule base: first iteration.
Rule IDRule WeightCNMCIAD
R10.09560.32510.50280.1721
R20.24570.67780.00000.3222
R30.70220.00000.64960.3504
R40.99380.73860.21600.0453
R50.73750.00040.63800.3616
R60.00110.04870.07420.8771
R70.70260.00000.82070.1793
R80.12630.04120.95880.0000
R91.00000.73820.00000.2618
Table 37. Optimized server for FedProx aggregator rule base: second iteration.
Table 37. Optimized server for FedProx aggregator rule base: second iteration.
Rule IDRule WeightCNMCIAD
R10.00000.21230.70770.0800
R20.89530.00000.71880.2812
R30.46880.47430.50550.0201
R40.40040.39130.22150.3872
R50.53390.00010.33510.6648
R60.43590.87040.09030.0393
R70.97640.64740.05100.3016
R80.82330.00050.09810.9015
R90.03010.41950.57280.0077
Table 38. Optimized server for FedProx aggregator rule base: third iteration.
Table 38. Optimized server for FedProx aggregator rule base: third iteration.
Rule IDRule WeightCNMCIAD
R10.04890.00000.15790.8421
R20.98980.79540.00000.2046
R30.29920.26400.73600.0000
R40.50010.69760.00000.3024
R50.97210.46440.00000.5356
R60.51060.00000.78670.2133
R71.00000.00160.50320.4952
R80.64590.77560.22440.0000
R90.76540.67470.01830.3070
Table 39. Optimized server for Genetic Algorithm aggregator rule base: first iteration.
Table 39. Optimized server for Genetic Algorithm aggregator rule base: first iteration.
Rule IDRule WeightCNMCIAD
R10.97620.00000.40920.5908
R20.46760.00000.71090.2891
R30.70430.00500.99500.0000
R40.50230.17340.82660.0000
R50.00000.04530.55960.3950
R60.00000.00000.36490.6351
R70.72690.00000.56590.4341
R80.98040.04340.95660.0000
R90.99750.79340.20530.0013
Table 40. Optimized server for Genetic Algorithm aggregator rule base: second iteration.
Table 40. Optimized server for Genetic Algorithm aggregator rule base: second iteration.
Rule IDRule WeightCNMCIAD
R10.20410.40600.48540.1087
R20.76010.00000.89640.1036
R30.63950.71900.22810.0529
R41.00000.60890.39030.0008
R50.03900.00000.08320.9168
R60.62070.00000.88470.1153
R70.39630.88160.00640.1120
R80.35210.00140.27060.7280
R90.04810.20130.60280.1959
Table 41. Optimized server for Genetic Algorithm aggregator rule base: third iteration.
Table 41. Optimized server for Genetic Algorithm aggregator rule base: third iteration.
Rule IDRule WeightCNMCIAD
R10.53670.00000.43710.5629
R20.49200.00050.17980.8197
R30.21670.01750.98250.0000
R40.39230.81460.00640.1790
R50.99930.00060.53470.4653
R60.00000.00010.16450.8354
R70.37960.50330.06310.4336
R80.29420.10370.89630.0000
R90.00000.43870.00000.5613
Table 42. Model performance after last iteration.
Table 42. Model performance after last iteration.
Aggregator
Client
TrainingTesting
MSE Accuracy MSE Accuracy
FedAvg AggregatorClient 10.0150880.404650.010720.48148
Client 20.023090.358140.018080.37037
Client 30.014040.598130.014050.52830
Global0.004090.998450.004650.99999
FedProx AggregatorClient 10.013910.595350.010400.61111
Client 20.037670.413950.037010.38889
Client 30.024910.373830.026830.18868
Global0.005920.736020.006870.68944
Genetic AlgorithmClient 10.0360760.311630.0324510.29629
Client 20.041060.353480.0378350.24074
Client 30.033310.364480.039150.24528
Global0.008330.510860.009040.43478
Table 43. PSO time analysis across iterations for FedAvg aggregator.
Table 43. PSO time analysis across iterations for FedAvg aggregator.
IterationClient 1 Time (s)Client 2 Time (s)Client 3 Time (s)Final PSO Time (s)
1253.88375.92296.97742.59
2251.42249.82252.57746.66
3279.37252.04251.23734.71
4259.25251.35249.78739.96
5249.15249.71250.31734.90
Total1293.071129.091300.863998.82
Table 44. Comparison with state-of-the-art methods (X indicates that the issue was not addressed).
Table 44. Comparison with state-of-the-art methods (X indicates that the issue was not addressed).
ReferenceMethodologyDatasetAccuracyPrivacy-preservingUncertainty Issue
Umme et al. [1]DementiaNet + Transfer LearningMRI Dataset (6400 Images)97%XX
Raees et al. [40]SVM + DNNMCI dataset (111 people)90%XX
Buvaneswari et al. [41]SegNet + ResNet-101ADNI dataset96%XX
Saratxaga et al. [42]ResNet 18 + BrainNetCollected From Kaggle80%, 90%XX
Hu et al. [43]CNNADNI and NIFD dataset92%XX
Khalil et al. [5]Hardware Acceleration (VHDL + FPGA)Simulation Data89%Achieved 87% sensitivity; Low power consumption (35–39 mW)X
Mitrovska et al. [6]FedAvg, Secure Aggregation (SecAgg)Demographic Simulations-Highlighted privacy guarantees; Evaluated statistical variationsX
Trivedi et al. [32]Federated Deep Learning + AlexNetIID Dataset98.53%Ensured data privacy; Tested with single/multiple clientsX
Altalbe et al. [33]DNN within Federated LearningKay Elemetrics voice disorder data82.82%Reduced overfitting; Preserved privacy and securityX
Mandawkar et al. [34]Tawny Flamingo Deep CNNClinical Data98.252% (K-fold), 97.995% (training)High accuracy; Tuned by Tawny Flamingo algorithmX
Castro et al. [35]Federated Learning + Biometric AuthenticationOASIS, ADNI datasetsComparableProtected privacy; Prevented data poisoningX
Qian et al. [9]FeDeFo (Deep Forest + Federated Learning)sMRI images-Personalized model while protecting data privacyX
Proposed ResearchHybrid Deep and Federated LearningADNI and NIfTI files dataset99.9%Ensured data privacy; Tested with multiple clientsSolved
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Basnin, N.; Mahmud, T.; Islam, R.U.; Andersson, K. An Evolutionary Federated Learning Approach to Diagnose Alzheimer’s Disease Under Uncertainty. Diagnostics 2025, 15, 80. https://doi.org/10.3390/diagnostics15010080

AMA Style

Basnin N, Mahmud T, Islam RU, Andersson K. An Evolutionary Federated Learning Approach to Diagnose Alzheimer’s Disease Under Uncertainty. Diagnostics. 2025; 15(1):80. https://doi.org/10.3390/diagnostics15010080

Chicago/Turabian Style

Basnin, Nanziba, Tanjim Mahmud, Raihan Ul Islam, and Karl Andersson. 2025. "An Evolutionary Federated Learning Approach to Diagnose Alzheimer’s Disease Under Uncertainty" Diagnostics 15, no. 1: 80. https://doi.org/10.3390/diagnostics15010080

APA Style

Basnin, N., Mahmud, T., Islam, R. U., & Andersson, K. (2025). An Evolutionary Federated Learning Approach to Diagnose Alzheimer’s Disease Under Uncertainty. Diagnostics, 15(1), 80. https://doi.org/10.3390/diagnostics15010080

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop