Article

A LeViT–EfficientNet-Based Feature Fusion Technique for Alzheimer’s Disease Diagnosis

by
Abdul Rahaman Wahab Sait
Department of Archives and Communication, Center of Documentation and Administrative Communication, King Faisal University, P.O. Box 400, Hofuf 31982, Al-Ahsa, Saudi Arabia
Appl. Sci. 2024, 14(9), 3879; https://doi.org/10.3390/app14093879
Submission received: 25 March 2024 / Revised: 25 April 2024 / Accepted: 29 April 2024 / Published: 30 April 2024
(This article belongs to the Special Issue Computational and Mathematical Methods for Neuroscience)

Abstract

Alzheimer’s disease (AD) is a progressive neurodegenerative condition. It causes cognitive impairment and memory loss in individuals. Healthcare professionals face challenges in detecting AD in its initial stages. In this study, the author proposed a novel integrated approach, combining LeViT, EfficientNet B7, and Dartbooster XGBoost (DXB) models to detect AD using magnetic resonance imaging (MRI). The proposed model leverages the strength of improved LeViT and EfficientNet B7 models in extracting high-level features capturing complex patterns associated with AD. A feature fusion technique was employed to select crucial features. The author fine-tuned the DXB using the Bayesian optimization hyperband (BOHB) algorithm to predict AD using the extracted features. Two public datasets were used in this study. The proposed model was trained using the Open Access Series of Imaging Studies (OASIS) Alzheimer’s dataset containing 86,390 MRI images. The Alzheimer’s dataset was used to evaluate the generalization capability of the proposed model. The proposed model obtained an average generalization accuracy of 99.8% with limited computational power. The findings highlighted the exceptional performance of the proposed model in predicting the multiple types of AD. The recommended integrated feature extraction approach has supported the proposed model to outperform the state-of-the-art AD detection models. The proposed model can assist healthcare professionals in offering customized treatment for individuals with AD. The effectiveness of the proposed model can be improved by generalizing it to diverse datasets.

1. Introduction

According to the World Health Organization, the total number of individuals aged 60 and older is expected to double by 2050, reaching approximately 2.1 billion people, 22% of the global population [1]. Alzheimer’s disease (AD) is a neurodegenerative condition that primarily affects the elderly population [2]. However, it may manifest in younger individuals. It is the primary cause of dementia. Mild cognitive impairment may occur in the initial stages of AD [2]; this is a transitional stage from normal functioning to AD in which an individual has moderate cognitive abnormalities [3]. The individuals may experience difficulties in performing their routine tasks [4]. They may face challenges in remembering recent events, names, and conversations. In addition, they may exhibit agitation and aggression. With an anticipated increase in AD cases, the disease has become one of the significant global concerns of the modern era. Despite massive efforts to find a cure, AD is still a non-preventable and irreversible form of dementia that impairs an individual’s daily life [5]. It is complicated and progressive, necessitating early discovery, diagnosis, therapy, and family support [6]. As the condition progresses, AD patients increasingly rely on their caretakers and require assistance with routine activities.
The primary etiology of AD remains unclear. However, genetics, environment, and lifestyle may contribute to AD [6]. Medical treatment and assistance can place a financial burden on individuals with AD and their families. Globally, governments, healthcare organizations, and research institutes are focusing on the development of practical approaches to address the challenges associated with AD [6]. Researchers investigate AD’s backgrounds, risk factors, prevention, and therapy to identify successful strategies to reduce its progression. Most cases of dementia are based on neurodegeneration caused by AD. Increasing evidence from neuropathological and neuroimaging studies shows that mixed etiologies cause many dementia cases, especially in people over 80 years [7]. There has been more variation in the findings regarding the prevalence of dementia and AD in populations older than 90 years compared to younger populations [7]. Healthy aging, dementia care, and caregiver assistance are being applied to improve AD patients’ quality of life [8].
Cognitive tests and assessments investigate memory, attention, language, reasoning, and problem-solving [8]. The Mini–Mental State Examination, Montreal Cognitive Assessment, and AD Assessment Scale–Cognitive Subscale are frequently used for AD detection [8]. Cognitive function may be assessed in greater detail with neuropsychological tests [9]. These tests demonstrate cognitive strengths and limitations and may distinguish AD from other forms of dementia [10]. A lumbar puncture may be performed to collect cerebrospinal fluid from the lower back [11,12,13,14]. Elevated beta-amyloid and tau proteins in cerebrospinal fluid may indicate AD pathology. In a few instances, genetic testing may be utilized to diagnose AD, particularly among individuals with a family history of AD [15]. An in-depth neuropsychological evaluation is a crucial component in the diagnosis of dementia [16]. It analyzes magnetic resonance imaging (MRI) scans for signs of regional brain atrophy and determines AD biomarkers using the cerebrospinal fluid biomarker profile. It evaluates individuals’ memory, attention, language, and emotional performance. Healthcare practitioners’ subjective interpretation of cognitive and neuropsychological testing can result in diagnostic discrepancies [16,17,18].
Several imaging modalities may reveal the brain’s structure and function, highlighting abnormalities associated with AD [18]. The diagnosis of AD relies on a wide variety of biomarkers, including genetic and biological data and neuroimaging techniques, such as MRI, amyloid positron emission tomography (PET), and diffusion tensor imaging [19,20,21]. Structural brain changes, including hippocampal shrinkage and other AD-related alterations, as well as malignancies and strokes, can be identified using MRI [22]. These changes can be used to determine brain abnormalities associated with mild cognitive impairment, which may indicate AD. PET imaging can identify AD’s beta-amyloid plaques and tau protein tangles in the brain. PET scans utilizing florbetapir, flutemetamol, or florbetaben may confirm AD [23]. Kuo et al. (2023) [24] discussed the significance of integrating neuropsychological assessment with neuroimaging in order to identify AD in its initial stages. Researchers can obtain valuable information on brain anatomical components from high-resolution MRI images. In addition, MRI images have been made available through public open access databases. These datasets are frequently updated, and researchers can utilize them to develop automated AD detection.
DL models can improve early detection, understand disease pathology, integrate image features, leverage large-scale datasets, and advance personalized medicine for individuals with AD [25]. These models can capture complex and high-dimensional patterns in medical imaging data, including MRI and PET, assisting in diagnosing and understanding AD [26]. By identifying biomarkers and subtypes of AD, DL-based models may enable individualized treatments [26]. DL algorithms can learn complex representations from massive data, providing improved precision and generalizable AD detection models [27,28,29,30,31]. Islam and Zhang (2017) [32] employed a multi-class classification model to detect AD. Hussain et al. (2020) [33] introduced a binary classification to distinguish individuals with and without AD using MRI data. Murugan et al. (2021) [34] proposed a DL model to predict AD and dementia. Raees and Thomas (2021) [35] used a Support Vector Machine and Deep Neural Network to detect AD using MRI. Mamun et al. (2022) [36] used a DL model for AD detection. Helaly et al. (2022) [37] proposed a DL model to predict AD in the early stages. Liu et al. (2022) [38] employed a three-dimensional deep convolutional neural network (CNN) to differentiate individuals with mild AD from those without AD. El-Latif et al. (2023) [39] pre-processed the MRI scans and improved the CNN model’s capability in identifying AD. However, the existing AD detection models demand high-performance graphical or tensor processing units and large-scale computing infrastructure for training and inference. An effective fine-tuning algorithm is required to find the optimal hyperparameters for optimal outcomes. Hyperparameter selection requires substantial testing and manual adjustment, which is time-consuming and computationally expensive. Overfitting or poor generalization may result from inadequate data for deep learning models.
Furthermore, researchers and practitioners with restricted computing resources may encounter challenges in model implementation. Existing AD detection approaches using MRI require human interpretation or essential feature extraction, limiting diagnosis accuracy and reliability. There is a demand for advanced and automated techniques to detect subtle disease-specific AD patterns. Transformer-based architectures and CNNs have produced promising results in medical image processing. The integration of transformers and CNNs can extract local and global spatial information from complex images. Combining these architectures may strengthen AD detection feature extraction frameworks. These features have motivated the author to build a hybrid transformer and CNN-based AD detection model. The contributions of the study are as follows:
A feature fusion-driven LeViT–EfficientNet B7-based feature extraction model to extract the crucial features of AD.
An enhanced Dartbooster XGBoost (DXB)-based AD detection model using a Bayesian optimization hyperband (BOHB) optimization algorithm.
The structure of the proposed study is organized as follows: The proposed methodology for detecting AD using MRI images is described in Section 2. Section 3 outlines the findings of the performance validation. The study’s contribution is discussed in Section 4. Lastly, Section 5 concludes the study by outlining the limitations and future direction.

2. Materials and Methods

The author introduced an integrated approach that combines a vision transformer (ViT), a CNN, and a gradient-boosting model. A ViT can capture global spatial relationships and long-range interdependence in images [40]. To identify AD anomalies in MRI scans, determining the spatial context of brain regions is crucial. Based on task relevance, a ViT utilizes self-attention mechanisms to rank image patches. The model’s interpretability enables researchers and clinicians to observe its regions of interest, allowing them to comprehend AD detection characteristics. A pre-trained ViT model can be fine-tuned on smaller MRI datasets for AD detection [40]. LeViT [40] is a ViT based on a hybrid neural network [37]. Using a transfer learning approach, a feature extraction model can be developed to extract crucial AD patterns and improve the generalization of AD detection. LeViT can be seamlessly integrated with a CNN to extract a diverse set of features. A CNN can recognize edges, textures, shapes, and structures in MRI images using multiple layers of convolutional and pooling operations. It can identify AD-related regional anomalies in MRI images using attention mechanisms and spatial pooling. EfficientNet B7 is a state-of-the-art CNN model with a compound scaling technique [41]. It is widely used for extracting features from medical images. The capability of LeViT and EfficientNet B7 in extracting intricate patterns motivated the author to employ a hybrid feature extraction approach. In addition, the author employed DXB, a gradient-boosting model, to identify the type of AD using the extracted features. Figure 1 reveals the proposed methodology for identifying AD using MRI images.

2.1. Dataset Acquisition

The Open Access Series of Imaging Studies (OASIS) Alzheimer’s dataset contains a cross-sectional collection of T1-weighted MRI scans of 416 subjects aged 18 to 96. The subjects include males and females. The dataset provides cognitive scores and the diagnosis status of individuals. The OASIS Alzheimer’s dataset is freely accessible through the repository [42]. The Alzheimer’s dataset consists of 5000 T1-weighted MRI images [43]. The images were categorized based on disease severity. The characteristics of the datasets are presented in Table 1.
The datasets were highly imbalanced. EfficientNet B7 and LeViT models may require considerable data augmentation to boost robustness and minimize overfitting. Qi et al. [44] proposed a data augmentation technique for brain MRI images. They applied generative adversarial networks to generate the synthetic images. Thus, the author employed the data augmentation technique [44] to overcome the limitation. In addition, traditional data augmentation techniques, including rotation, translation, scaling, flipping, gamma correction, shearing, and histogram equalization, were used in this study.
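For illustration, the snippet below sketches the traditional augmentation operations listed above using torchvision; the GAN-based augmentation of Qi et al. [44] is not reproduced, and the parameter values are illustrative assumptions rather than the study’s settings.

```python
# A minimal sketch of the traditional augmentation pipeline (rotation, translation,
# scaling, flipping, shearing, histogram equalization); parameter values are
# illustrative assumptions, not the study's exact settings.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=10),                          # rotation
    transforms.RandomAffine(degrees=0, translate=(0.05, 0.05),
                            scale=(0.9, 1.1), shear=5),             # translation, scaling, shearing
    transforms.RandomHorizontalFlip(p=0.5),                         # flipping
    transforms.RandomEqualize(p=0.25),                              # histogram equalization
    transforms.ColorJitter(brightness=0.1, contrast=0.1),           # rough stand-in for gamma correction
    transforms.ToTensor(),                                          # gamma itself is available via functional.adjust_gamma
])
```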

2.2. EfficientNet B7-Based Feature Extraction

EfficientNet B7 excels in image categorization [41]. It captures complex MRI characteristics and patterns for AD diagnosis using the depth, width, and resolution scaling features. It can handle massive amounts of MRI data with less computation cost. By revealing MRI image representations, EfficientNet B7’s hierarchical structure can facilitate model interpretation. Clinicians and researchers may use these representations to understand AD’s unique characteristics and provide personalized treatment. EfficientNet B7 may struggle to gain long-range relationships and contextual information in MRI images. This shortcoming may impair the model’s detection of AD symptoms. In order to improve the efficiency of the EfficientNet B7 model, the author employed an attention mechanism and mixed-precision training. Figure 2 highlights the recommended feature extraction model.
Using the EfficientNet B7 backbone, a feature extraction model was constructed. An attention mechanism was introduced to capture the long-range dependencies and contextual information. Residual connections were incorporated to overcome the vanishing gradients during the training phase.
Furthermore, the author employed mixed-precision training to accelerate the training and reduce memory consumption. Activation functions, gradients, and accumulation were performed in a single precision format to prevent numerical underflow or overflow challenges. In addition, a loss scaling factor was dynamically integrated into the loss function to address vanishing gradients.
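A minimal PyTorch sketch of mixed-precision training with dynamic loss scaling is given below, assuming a torchvision EfficientNet-B7 backbone (torchvision ≥ 0.13 API) and a toy batch in place of the MRI dataloader; the attention head and the study’s exact training settings are not reproduced.

```python
# Sketch of mixed-precision training with dynamic loss scaling (torch.cuda.amp);
# the model, batch, and hyperparameters are illustrative placeholders.
import torch
from torch.cuda.amp import autocast, GradScaler
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.efficientnet_b7(weights=None, num_classes=4).to(device)  # 4 AD severity classes
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()
scaler = GradScaler(enabled=(device == "cuda"))        # dynamic loss scaling factor

images = torch.randn(4, 3, 224, 224, device=device)    # toy batch standing in for an MRI loader
labels = torch.randint(0, 4, (4,), device=device)

optimizer.zero_grad()
with autocast(enabled=(device == "cuda")):              # half precision where numerically safe
    loss = criterion(model(images), labels)
scaler.scale(loss).backward()                           # scale the loss to avoid gradient underflow
scaler.step(optimizer)                                  # unscale gradients, then optimizer step
scaler.update()                                         # adjust the scaling factor dynamically
```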

2.3. LeViT-Based Feature Extraction

LeViT offers a powerful platform to handle a wide range of medical image processing tasks, including classification, object detection, and segmentation [40]. It demands fewer parameters compared to traditional CNN models. The self-attention mechanism can learn interpretable representations of the MRI images. The global context modeling technique captures holistic information associated with the MRI images. The patch extractor transforms the image shape from 224 × 224 × 3 into 250 × 14 × 14. A shrinking attention block is used to reduce the size of the activation maps. These features have motivated the author to employ LeViT to extract AD patterns from the MRI images. However, LeViT faces challenges in capturing fine-grained local details, affecting the ability to locate the smaller objects. To overcome this limitation and improve the performance of LeViT-based feature extraction, the author integrated spatial transformer networks (STNs) [45] with LeViT architecture. Initially, an STN is built to perform spatial transformation on the MRI images and extract features based on the region of interest. A feature extraction model is constructed using the LeViT backbone. The extracted features are passed through the LeViT in order to capture high-level representations of the spatially transformed features. Figure 3 highlights the enhanced LeViT model for the feature extraction.
A fully connected layer with the Softmax function is used to classify the features based on the severity. Equations (1) and (2) show the computational forms of STN and LeViT models.
$F = \mathrm{STN}(C, I)$ (1)
where $F$ is the image feature, $\mathrm{STN}(\cdot)$ is the spatial transformer network function, $C$ is the input channel, and $I$ is the image.
$F = \mathrm{LeViT}(C, C_l, F)$ (2)
where $F$ is the image feature, $C$ is the input channel, $C_l$ denotes the AD classes, and $\mathrm{LeViT}(\cdot)$ is the function implementing the LeViT model.
After fusing the features, the author normalized the features using feature-wise normalization to prevent numerical instability. Finally, a fully connected layer with the Softmax function was used to generate the outcome. The outcomes were stored as a vector.
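The following sketch illustrates how an STN can be chained with a LeViT backbone for feature extraction, as described above; the STN layout and the timm model name (levit_128) are assumptions made for illustration, not the exact architecture used in the study.

```python
# Hedged sketch: spatial transformer network (STN) followed by a LeViT backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F
import timm

class SimpleSTN(nn.Module):
    """Predicts an affine transform and resamples the input accordingly."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(3, 8, 7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, 5), nn.MaxPool2d(2), nn.ReLU(),
        )
        self.fc = nn.Sequential(nn.Linear(10 * 52 * 52, 32), nn.ReLU(), nn.Linear(32, 6))
        # initialize the regressor to the identity transform
        self.fc[-1].weight.data.zero_()
        self.fc[-1].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.fc(self.loc(x).flatten(1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)   # spatially transformed image

stn = SimpleSTN()
levit = timm.create_model("levit_128", pretrained=False, num_classes=0)  # num_classes=0 -> pooled features

x = torch.randn(2, 3, 224, 224)            # toy MRI batch
features = levit(stn(x))                   # spatial transformation, then LeViT representations
print(features.shape)
```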

2.4. Feature Fusion Layer

The author combined a fusion layer with LeViT to fuse the features using an element-wise addition approach. A dimension-matching process was used to identify the features with different dimensions. A reshape function was applied to reshape the feature maps into unique dimensions. Subsequently, element-wise addition combines the elements of EfficientNet B7 and LeViT. Equation (3) shows the mathematical form of feature fusion.
$\sum_{i=1}^{n} f_{\mathrm{fused}} = \sum_{i=1}^{n} f_{\mathrm{EfficientNet\,B7}} + \sum_{i=1}^{n} f_{\mathrm{LeViT}}$ (3)
where $n$ is the number of features, $f_{\mathrm{fused}}$ denotes the fused features, $f_{\mathrm{EfficientNet\,B7}}$ denotes the EfficientNet B7 features, and $f_{\mathrm{LeViT}}$ denotes the LeViT features.
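A minimal sketch of the dimension matching, element-wise addition, and feature-wise normalization described above is given below; the projection layers and feature sizes are illustrative assumptions.

```python
# Sketch of Equation (3): match dimensions, then fuse by element-wise addition.
import torch
import torch.nn as nn

f_efficientnet = torch.randn(8, 2560)      # e.g., pooled EfficientNet-B7 features (assumed size)
f_levit = torch.randn(8, 384)              # e.g., pooled LeViT features (assumed size)

# dimension matching: project both feature vectors to a common size before fusion
proj_eff = nn.Linear(2560, 512)
proj_levit = nn.Linear(384, 512)

f_fused = proj_eff(f_efficientnet) + proj_levit(f_levit)     # element-wise addition
f_fused = nn.functional.normalize(f_fused, dim=1)            # feature-wise normalization
print(f_fused.shape)                                          # torch.Size([8, 512])
```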

2.5. Dartbooster XGBoost-Based AD Detection

DXB is an enhanced version of the traditional XGBoost algorithm [46]. It uses dropout regularization, randomly dropping trees during training to prevent overfitting. Compared to existing gradient-boosting algorithms, DXB achieves considerable outcomes with limited computational power. In this study, the author employed a DXB model to predict the AD type using the extracted features. However, DXB may face challenges in maintaining the exploration–exploitation trade-off in a high-dimensional search space. In addition, it may struggle to scale to complex models due to the increased computational requirements. To overcome these limitations, the author employed the BOHB algorithm to fine-tune the model. The hyperband component allocates computational resources across candidate hyperparameter configurations. Bayesian optimization uses a probabilistic surrogate function to model the performance of the DXB hyperparameters. During the training phase, a resource budget for the hyperparameter search was initialized. A Gaussian process was updated with the observed performance data. Multiple rounds of optimization were performed until the computational resources were exhausted. Equations (4) and (5) show the mathematical forms of the BOHB and DXB hyperparameter tuning processes.
$\mathrm{BOHB} = \arg\max_{a \in A} a$ (4)
$O = \mathrm{BOHB}(\mathrm{DXB}(f), A)$ (5)
where $A$ is the number of hyperparameters, $a$ is the acquisition function that controls the selection of hyperparameters, $f$ is the feature, $\mathrm{BOHB}(\cdot)$ is the Bayesian optimization and hyperband function, $\mathrm{DXB}(\cdot)$ is the Dartbooster XGBoost function, and $O$ denotes the outcomes.
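The sketch below approximates the BOHB-style search for a DART-booster XGBoost classifier using Optuna’s TPE sampler (a Bayesian surrogate) with a Hyperband pruner; this is a stand-in for the BOHB implementation used in the study, and the synthetic data and search ranges are assumptions.

```python
# Hedged sketch: Bayesian-surrogate + Hyperband-style tuning of a DART XGBoost classifier.
import optuna
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# synthetic stand-in for the fused MRI features (four severity classes)
X, y = make_classification(n_samples=500, n_features=64, n_informative=16,
                           n_classes=4, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

def objective(trial):
    params = {
        "booster": "dart",                                                    # dropout-regularized boosting
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "gamma": trial.suggest_float("gamma", 0.0, 5.0),                      # minimum split loss
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "rate_drop": trial.suggest_float("rate_drop", 0.0, 0.3),              # DART dropout rate
    }
    model = xgb.XGBClassifier(**params, n_estimators=100, objective="multi:softprob")
    model.fit(X_tr, y_tr)
    return accuracy_score(y_val, model.predict(X_val))

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0),       # Bayesian surrogate
                            pruner=optuna.pruners.HyperbandPruner())          # Hyperband-style budgets
study.optimize(objective, n_trials=20)     # pruning takes effect when trials report intermediate values
print(study.best_params)
```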
Furthermore, the author included SHapley Additive exPlanations (SHAP) values in the DXB model to improve the model’s interpretability. The integration of SHAP values can assist healthcare professionals in gaining deeper insights into the model’s prediction.
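The snippet below sketches how SHAP values can be attached to a fitted XGBoost classifier with shap.TreeExplainer; the synthetic features stand in for the fused MRI features, and a default gbtree booster is used here because TreeExplainer support for the DART booster may vary across SHAP versions.

```python
# Hedged sketch of per-feature SHAP contributions for a gradient-boosted classifier.
import shap
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=300, n_features=32, n_informative=12,
                           n_classes=4, random_state=0)
model = xgb.XGBClassifier(n_estimators=50, objective="multi:softprob").fit(X, y)

explainer = shap.TreeExplainer(model)            # tree explainer for boosted-tree models
shap_values = explainer.shap_values(X)           # positive/negative contribution of each feature per class
shap.summary_plot(shap_values, X, show=False)    # global view of feature importance
```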

2.6. Performance Validation

The author validated the proposed model’s performance using widely applied evaluation metrics. Accuracy represents the overall correctness of the proposed model’s predictions. Specificity indicates the model’s ability to detect negative instances. Sensitivity measures the model’s capability of detecting positive classes. Precision indicates the proposed model’s capability to prevent false positives, whereas recall represents the model’s ability to identify positive instances. Cohen’s Kappa is used to assess the reliability and consistency of the model’s findings. In addition, the area under the receiver operating characteristic curve (AUROC) and the area under the precision–recall curve (AUPRC) are used to evaluate the effectiveness of the proposed AD detection model.
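These metrics can be computed as sketched below with scikit-learn for a binary case; for the four-class AD problem, macro-averaged or one-vs-rest variants would be used, and the labels and scores here are placeholders.

```python
# Sketch of the evaluation metrics on placeholder binary predictions.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             cohen_kappa_score, roc_auc_score,
                             average_precision_score, confusion_matrix)

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 0])
y_score = np.array([0.2, 0.8, 0.7, 0.4, 0.9, 0.1, 0.6, 0.3])   # predicted probability of the positive class
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "sensitivity (recall)": recall_score(y_true, y_pred),
    "specificity": tn / (tn + fp),                               # true-negative rate
    "precision": precision_score(y_true, y_pred),
    "kappa": cohen_kappa_score(y_true, y_pred),
    "AUROC": roc_auc_score(y_true, y_score),
    "AUPRC": average_precision_score(y_true, y_score),
}
print(metrics)
```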

3. Results

The performance evaluation of the proposed model was conducted using Windows 11 Pro, an Intel i9-12900K CPU, 16 GB RAM, an NVIDIA RTX 4090 GPU, and Python 3.8.0. Libraries including PyTorch 1.9, TensorFlow 2.11.0, Theano 1.0.5, and Keras 2.12.0 were used for model development. The OASIS Alzheimer’s dataset was divided into a training set (70%), a validation set (15%), and a test set (15%). The Alzheimer’s dataset (20%) was used to evaluate the generalization of the proposed AD detection model. Table 2 reveals the experimental settings for implementing the proposed AD detection model.
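A sketch of a stratified 70/15/15 split is shown below; X and y are placeholders for the image features and severity labels.

```python
# Sketch of the 70% / 15% / 15% train / validation / test split described above.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 64)                    # placeholder features
y = np.random.randint(0, 4, size=1000)          # four AD severity classes

X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30,
                                                  stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50,
                                                stratify=y_tmp, random_state=42)
# -> 70% train, 15% validation, 15% test
```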
The performance of the proposed AD detection during the training and validation phase is highlighted in Figure 4a,b. Compared to the training phase, there was a significant improvement in the validation phase. The recommended early-stopping strategies and regularization techniques have improved the model performance by monitoring the validation loss. The model has attained an optimal performance at the 77th epoch.
The findings of the performance validation using dataset 1 are outlined in Table 3. The recommended LeViT–EfficientNet B7 feature extraction has improved the prediction accuracy of the proposed model. In addition, the data augmentation has supported the model in identifying the critical patterns associated with AD.
Figure 5 presents the findings of a comparative analysis of the existing transformer and CNN backbones. The proposed model has outperformed the existing models by obtaining an optimal generalization accuracy of 99.8%. The recommended fine-tuning processes assisted the proposed model in addressing the overfitting, vanishing gradient, and amplification effects. Figure 6 highlights the computational loss of the AD detection models. The proposed model produced a minimal loss compared to the existing models.
Table 4 shows that the proposed model required fewer parameters and FLOPs than the existing backbones to deliver a remarkable outcome. The findings indicate that the model can be implemented in a resource-constrained healthcare environment. The BOHB algorithm has supported the proposed model in maintaining a trade-off between high generalization accuracy and limited computational resources.
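Parameter and FLOP counts such as those in Table 4 can be measured as sketched below; the thop profiler is an assumed choice (not necessarily the tool used in the study), and it reports multiply–accumulate operations (MACs), which some papers double when quoting FLOPs.

```python
# Hedged sketch of counting parameters and MACs/FLOPs for a backbone.
import torch
from torchvision import models
from thop import profile   # assumption: thop installed; other profilers (e.g., fvcore) also work

model = models.efficientnet_b7(weights=None, num_classes=4)
n_params = sum(p.numel() for p in model.parameters()) / 1e6          # parameters in millions
macs, _ = profile(model, inputs=(torch.randn(1, 3, 224, 224),))      # MACs for one forward pass
print(f"parameters: {n_params:.1f} M, MACs: {macs / 1e6:.0f} M")
```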
Table 5 highlights the findings of the reliability and consistency analysis. The proposed model has achieved excellent AUROC and AUPRC, indicating high discrimination in distinguishing the multiple classes of AD. High AUROC and AUPRC highlight the reliability of the proposed AD detection model. The proposed model achieved a low standard deviation (SD) and a narrow confidence interval (CI), indicating reliable and consistent outcomes. In addition, a smaller SD shows that the model’s performance is consistent across diverse data points. Clinicians can benefit from the model and reduce unnecessary medical interventions. The recommended feature extraction approach has produced highly discriminative features by capturing subtle patterns associated with AD. The suggested BOHB-based hyperparameter tuning has selected appropriate DXB parameters to prevent overfitting and enhance the model’s robustness.
Table 6 presents the performance of the AD detection models. The utilization of improved LeViT enhances the proposed model’s ability to detect long-range dependencies and spatial relationships associated with AD. The scaling coefficient of the EfficientNet B7 model enables the model to handle inherent complexities and variations in the MRI image resolutions.

4. Discussions

In this study, the author introduced an EfficientNet B7 and LeViT-based feature fusion technique for extracting key features from MRI images. The EfficientNet B7 model was improved by integrating the attention mechanism. In addition, the author trained the EfficientNet B7 model using mixed-precision training to reduce the computational cost. A fine-tuned DXB model was used to detect AD using the extracted features. The model was trained and tested using the OASIS Alzheimer’s dataset. A data augmentation technique was employed in order to provide adequate training to the model to learn intricate patterns of AD. The author generalized the model using the Alzheimer’s dataset.
Table 3 highlights the performance of the proposed AD detection model. The model produced an outstanding performance by achieving an accuracy of 98.9% and specificity of 98.7%. Table 4 and Table 5 reveal the findings of the comparative analysis using the existing backbones. Table 6 outlines the findings of the existing AD detection models. The proposed model has outperformed the existing AD detection models. It required less computational power to identify AD. The recommended feature fusion technique has supported the proposed model in delivering an optimal outcome. In addition, the suggested BOHB optimization has fine-tuned the parameters of the DXB model to make an effective decision with limited resources. The proposed model demonstrated remarkable performance with limited computational costs. Models with exceptional AUROC and AUPRC can assist healthcare professionals in diagnostic interpretation and treatment options.
The proposed AD detection model can empower clinicians to make effective decisions and offer personalized care to individuals. It holds promise for improving patient outcomes and advancing the understanding of AD symptoms in the earlier stages. By integrating computational approaches with clinical practice, this study enhanced AD detection using MRI images. The proposed model’s accuracy and efficiency have significant clinical implications. Effective AD detection enables physicians to diagnose, schedule, and track disease development. Reliable diagnostic techniques and timely intervention can enhance patient outcomes and quality of life. Moreover, scientific communities may benefit from the study findings to extend the research in medical imaging analysis and DL methods.
The author trained the proposed model using the OASIS dataset that covers the MRI images with biomarkers, including an individual’s age, sex, cognitive score, and diagnosis status. Researchers can gain insights into the underlying AD pathology and build effective diagnostic and therapeutic strategies. The proposed model allows researchers to identify critical biomarkers, including brain atrophy, cortical thickness changes, hippocampus alterations, white matter integrity alterations, and abnormalities in specific brain regions. Integrating SHAP values facilitates healthcare professionals to identify the significance of MRI biomarkers (features) associated with AD. The proposed model assigns a positive and negative SHAP value to each feature. Healthcare professionals can use SHAP values to understand the importance of features in AD prediction. For instance, a SHAP value of 0.7 related to brain atrophy feature indicates that higher activation in the brain atrophy region is associated with AD prediction. In contrast, a negative SHAP value is associated with a decreased likelihood of AD.
Raees and Thomas (2021) [35] employed AlexNet, Visual Geometry Group (VGG)-16, and ResNet-50 to extract features from MRI images. They used a Support Vector Machine to predict AD. The pre-trained CNN models may produce biased predictions, leading to false positives. The limited generalization ability has reduced the model’s performance in the context of AD prediction. The class imbalances have reduced the Support Vector Machine model’s capability of detecting AD. In addition, the lack of interpretability may cause challenges to healthcare professionals in understanding the results. The proposed AD detection model integrated the SHAP values in order to provide the results with interpretability. With the recommended feature extraction, it generated an exceptional outcome.
Mamun et al. (2022) [36] employed ResNet-101, DenseNet-121, and VGG-16 models to detect AD. These models achieved an average accuracy of 97.8%. VGG-16 required 138 M parameters to generate the outcome, leading to a computational cost considerably higher than that of the proposed AD model. The ResNet-101 architecture was complex, resulting in a high training time. It required additional computational power due to the residual connections. The DenseNet-121 model required substantial memory during the training phase. The dense connectivity pattern has reduced the ability to find AD patterns compared to the proposed model.
Helaly et al. (2022) [37] used VGG-19 to classify the AD classes using MRI images. They fine-tuned VGG-19 to improve the prediction accuracy. The fixed architecture of VGG-19 has reduced the model’s performance. The vanishing gradient problem has affected the model’s learning ability. The depth and complexity caused the model to produce results with a high computational cost. In addition, VGG-19 demanded substantial memory to store the intermediate results. In contrast, the proposed AD detection model has employed mixed-precision training to reduce the computational power. Moreover, the self-attention mechanism has supported the proposed model’s remarkable outcome.
Liu et al. (2022) [38] used FreeSurfer segmentation to locate AD patterns. They constructed a gradient-boosting classifier for detecting AD statuses. The processing time of FreeSurfer segmentation may vary depending on the hardware specification. The limited spatial resolution of MRI has reduced the performance of the model. In addition, augmented samples of 3D MRI were complex, limiting the effectiveness of the AD detection model. In contrast, the proposed AD model combined LeViT and EfficientNet B7 to improve prediction accuracy by capturing complex AD patterns.
El-Latif et al. (2023) [39] constructed a shallow CNN model to classify the AD types. They employed a 2D CNN for multi-class classification. The model comprised seven convolutional layers trained using the weights of a pre-trained model. It required extensive image pre-processing in order to maintain a considerable performance. The lack of generalization has reduced the model’s prediction accuracy. The model’s performance was low compared to the proposed model.
The author encountered challenges in managing and optimizing the feature extraction processes. The high-dimensional and heterogeneous MRI images caused challenges in extracting intricate AD patterns. However, the EfficientNet B7 and LeViT backbones were fine-tuned to overcome the image complexities. The high risk of overfitting due to integrating the LeViT and EfficientNet B7 models was reduced using regularization and effective data augmentation techniques. The author applied the mixed-precision training strategy to minimize the computational costs of the feature extraction.
The proposed AD detection model was generalized on two datasets. A rigorous validation and generalization test is essential in order to ensure the proposed model’s robustness and reliability across diverse populations. It can improve the model’s trustworthiness in a real-time environment. The integration of the proposed model into the clinical workflow may demand substantial validation, standardization, and flexible user interfaces. The variations in MRI images may influence the model’s robustness and generalization. Continuous monitoring and updating are essential in order to adapt to technical advancements and clinical guidelines. AD detection is challenging and requires coordination between computer scientists, neuroscientists, radiologists, and medical professionals. To enhance the model’s diagnostic accuracy, multiple data modalities, including PET, genetic information, and cerebrospinal fluid biomarkers can be explored. Investigating advanced data augmentation techniques can enhance the model’s robustness to variations in the image quality. The proposed AD prediction models can be improved through unique differences in risk factors, disease progression, and symptom presentation by incorporating language abilities, societal impact, and cognitive abilities as predictor variables. Researchers and clinicians can improve AD prediction, diagnosis, and treatment by combining these factors.

5. Conclusions

The study presented a novel approach, integrating the strengths of LeViT, EfficientNet B7, and the DXB model with the BOHB algorithm to identify different types of AD using MRI images. The proposed model achieved a remarkable accuracy of 99.8% and specificity of 99.8% with limited computational resources. The improved LeViT and EfficientNet B7 with attention mechanisms have produced critical features of AD. The BOHB algorithm has strengthened the DXB model to deliver a superior generalization capability compared to the existing models. The findings indicate that the proposed model can be deployed in healthcare and rehabilitation centers to diagnose AD. The lightweight nature of the proposed model can reduce the complexities in the model implementation. However, the author encountered challenges integrating STN with LeViT and fine-tuning the DXB model using the BOHB algorithm. Integrating multimodal data sources, including PET and genetic data, can unveil novel biomarkers of AD. In addition, enhancing the model’s interpretability can foster trust and understanding among clinicians and individuals with AD. Advanced data augmentation techniques can improve the proposed model’s generalization capability.

Funding

This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia (Grant No. GrantA035).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

OASIS Alzheimer’s dataset. https://www.kaggle.com/datasets/ninadaithal/imagesoasis, accessed on 21 March 2023. Alzheimer’s dataset. https://www.kaggle.com/datasets/tourist55/alzheimers-dataset-4-class-of-images, accessed on 25 March 2023.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Mehanna, A. Healthy Ageing: Reviewing the Challenges, Opportunities, and Efforts to Promote Health Among Old People. J. High Inst. Public Health 2022, 52, 45–52. [Google Scholar] [CrossRef]
  2. Ebrahimighahnavieh, M.A.; Luo, S.; Chiong, R. Deep learning to detect Alzheimer’s disease from neuroimaging: A systematic literature review. Comput. Methods Programs Biomed. 2020, 187, 105242. [Google Scholar] [CrossRef] [PubMed]
  3. Altinkaya, E.; Polat, K.; Barakli, B. Detection of Alzheimer’s disease and dementia states based on deep learning from MRI images: A comprehensive review. J. Inst. Electron. Comput. 2020, 1, 39–53. [Google Scholar]
  4. Al-Shoukry, S.; Rassem, T.H.; Makbol, N.M. Alzheimer’s diseases detection by using deep learning algorithms: A mini-review. IEEE Access 2020, 8, 77131–77141. [Google Scholar] [CrossRef]
  5. Kivipelto, M.; Mangialasche, F.; Ngandu, T. Lifestyle interventions to prevent cognitive impairment, dementia and Alzheimer disease. Nat. Rev. Neurol. 2018, 14, 653–666. [Google Scholar] [CrossRef] [PubMed]
  6. Arafa, D.A.; Moustafa, H.E.D.; Ali-Eldin, A.M.; Ali, H.A. Early detection of Alzheimer’s disease based on the state-of-the-art deep learning approach: A comprehensive survey. Multimed. Tools Appl. 2022, 81, 23735–23776. [Google Scholar] [CrossRef]
  7. Kuo, C.-Y.; Stachiv, I.; Nikolai, T. Association of late life depression, (non-)modifiable risk and protective factors with dementia and Alzheimer’s disease: Literature review on current evidences, preventive interventions and possible future trends in prevention and treatment of dementia. Int. J. Environ. Res. Public Health 2020, 17, 7475. [Google Scholar] [CrossRef] [PubMed]
  8. Khojaste-Sarakhsi, M.; Haghighi, S.S.; Ghomi, S.F.; Marchiori, E. Deep learning for Alzheimer’s disease diagnosis: A survey. Artif. Intell. Med. 2022, 130, 102332. [Google Scholar] [CrossRef]
  9. Mggdadi, E.; Al-Aiad, A.; Al-Ayyad, M.S.; Darabseh, A. Prediction Alzheimer’s Disease from MRI Images using Deep Learning. In Proceedings of the 12th International Conference on Information and Communication Systems (ICICS), Valencia, Spain, 24–26 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 120–125. [Google Scholar]
  10. Hamdi, M.; Bourouis, S.; Rastislav, K. Evaluation of neuro images for the diagnosis of Alzheimer’s disease using deep learning neural network. Front. Public Health 2022, 10, 834032. [Google Scholar]
  11. Balaji, P.; Chaurasia, M.A.; Bilfaqih, S.M.; Muniasamy, A.; Alsid, L.E.G. Hybridized deep learning approach for detecting Alzheimer’s disease. Biomedicines 2023, 11, 149. [Google Scholar] [CrossRef]
  12. Mehmood, A.; Yang, S.; Feng, Z.; Wang, M.; Ahmad, A.S.; Khan, R.; Maqsood, M.; Yaqub, M. A transfer learning approach for early diagnosis of Alzheimer’s disease on MRI images. Neuroscience 2021, 460, 43–52. [Google Scholar] [CrossRef] [PubMed]
  13. Saratxaga, C.L.; Moya, I.; Picón, A.; Acosta, M.; Moreno-Fernandez-de-Leceta, A.; Garrote, E.; Bereciartua-Perez, A. MRI deep learning-based solution for Alzheimer’s disease prediction. J. Pers. Med. 2021, 11, 902. [Google Scholar] [CrossRef]
  14. Salehi, A.W.; Baglat, P.; Gupta, G. Alzheimer’s disease diagnosis using deep learning techniques. Int. J. Eng. Adv. Technol. 2020, 9, 874–880. [Google Scholar] [CrossRef]
  15. Yamanakkanavar, N.; Choi, J.Y.; Lee, B. MRI segmentation and classification of human brain using deep learning for diagnosis of Alzheimer’s disease: A survey. Sensors 2020, 20, 3243. [Google Scholar] [CrossRef] [PubMed]
  16. Reul, S.; Lohmann, H.; Wiendl, H.; Duning, T.; Johnen, A. Can cognitive assessment really discriminate early stages of Alzheimer’s and behavioural variant frontotemporal dementia at initial clinical presentation? Alzheimer’s Res. Ther. 2017, 9, 61. [Google Scholar] [CrossRef] [PubMed]
  17. Acharya, U.R.; Fernandes, S.L.; WeiKoh, J.E.; Ciaccio, E.J.; Fabell, M.K.M.; Tanik, U.J.; Rajinikanth, V.; Yeong, C.H. Automated detection of Alzheimer’s disease using brain MRI images—A study with various feature extraction techniques. J. Med. Syst. 2019, 43, 302. [Google Scholar] [CrossRef] [PubMed]
  18. Bi, X.; Li, S.; Xiao, B.; Li, Y.; Wang, G.; Ma, X. Computer aided Alzheimer’s disease diagnosis by an unsupervised deep learning technology. Neurocomputing 2020, 392, 296–304. [Google Scholar] [CrossRef]
  19. Battineni, G.; Hossain, M.A.; Chintalapudi, N.; Traini, E.; Dhulipalla, V.R.; Ramasamy, M.; Amenta, F. Improved Alzheimer’s disease detection by MRI using multimodal machine learning algorithms. Diagnostics 2021, 11, 2103. [Google Scholar] [CrossRef]
  20. Li, H.; Habes, M.; Wolk, D.A.; Fan, Y. Alzheimer’s Disease Neuroimaging Initiative. A deep learning model for early prediction of Alzheimer’s disease dementia based on hippocampal magnetic resonance imaging data. Alzheimer’s Dement. 2019, 15, 1059–1070. [Google Scholar] [CrossRef]
  21. Balne, S.; Elumalai, A. Machine learning and deep learning algorithms used to diagnosis of Alzheimer’s. Mater. Today Proc. 2021, 47, 5151–5156. [Google Scholar] [CrossRef]
  22. Sathiyamoorthi, V.; Ilavarasi, A.K.; Murugeswari, K.; Ahmed, S.T.; Devi, B.A.; Kalipindi, M. A deep convolutional neural network based computer aided diagnosis system for the prediction of Alzheimer’s disease in MRI images. Measurement 2021, 171, 108838. [Google Scholar] [CrossRef]
  23. Ullah, Z.; Jamjoom, M. A Deep Learning for Alzheimer’s Stages Detection Using Brain Images. Comput. Mater. Contin. 2023, 74, 1457–1473. [Google Scholar] [CrossRef]
  24. Kuo, C.-Y.; Tseng, H.-Y.; Stachiv, I.; Tsai, C.-H.; Lai, Y.-C.; Nikolai, T. Combining Neuropsychological Assessment with Neuroimaging to Distinguish Early-Stage Alzheimer’s Disease from Frontotemporal Lobar Degeneration in Non-Western Tonal Native Language-Speaking Individuals Living in Taiwan: A Case Series. J. Clin. Med. 2023, 12, 1322. [Google Scholar] [CrossRef]
  25. Ghazal, T.M.; Abbas, S.; Munir, S.; Ahmad, M.; Issa, G.F.; Zahra, S.B.; Khan, M.A.; Hasan, M.K. Alzheimer Disease Detection Empowered with Transfer Learning. Comput. Mater. Contin. 2022, 70, 5005–5019. [Google Scholar] [CrossRef]
  26. Yagis, E.; De Herrera AG, S.; Citi, L. Convolutional autoencoder based deep learning approach for Alzheimer’s disease diagnosis using brain mri. In Proceedings of the 2021 IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS), Aveiro, Portugal, 7–9 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 486–491. [Google Scholar]
  27. Liu, J.; Li, M.; Luo, Y.; Yang, S.; Li, W.; Bi, Y. Alzheimer’s disease detection using depthwise separable convolutional neural networks. Comput. Methods Programs Biomed. 2021, 203, 106032. [Google Scholar] [CrossRef]
  28. Han, R.; Chen, C.P.; Liu, Z. A novel convolutional variation of broad learning system for Alzheimer’s disease diagnosis by using MRI images. IEEE Access 2020, 8, 214646–214657. [Google Scholar] [CrossRef]
  29. AlSaeed, D.; Omar, S.F. Brain MRI analysis for Alzheimer’s disease diagnosis using CNN-based feature extraction and machine learning. Sensors 2022, 22, 2911. [Google Scholar] [CrossRef]
  30. Tuan, T.A.; Pham, T.B.; Kim, J.Y.; Tavares, J.M.R.S. Alzheimer’s diagnosis using deep learning in segmenting and classifying 3D brain MR images. Int. J. Neurosci. 2022, 132, 689–698. [Google Scholar] [CrossRef]
  31. Shamrat, F.M.J.M.; Akter, S.; Azam, S.; Karim, A.; Ghosh, P.; Tasnim, Z.; Hasib, K.M.; De Boer, F.; Ahmed, K. AlzheimerNet: An effective deep learning based proposition for Alzheimer’s disease stages classification from functional brain changes in magnetic resonance images. IEEE Access 2023, 11, 16376–16395. [Google Scholar] [CrossRef]
  32. Islam, J.; Zhang, Y. A novel deep learning based multi-class classification method for Alzheimer’s disease detection using brain MRI data. In Proceedings of the Brain Informatics: International Conference 2017, BI 2017, Beijing, China, 16–18 November 2017; Springer International Publishing: Berlin/Heidelberg, Germany, 2017; pp. 213–222. [Google Scholar]
  33. Hussain, E.; Hasan, M.; Hassan, S.Z.; Azmi, T.H.; Rahman, M.A.; Parvez, M.Z. Deep learning based binary classification for Alzheimer’s disease detection using brain mri images. In Proceedings of the 2020 15th IEEE Conference on Industrial Electronics and Applications (ICIEA), Kristiansand, Norway, 9–13 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1115–1120. [Google Scholar]
  34. Murugan, S.; Venkatesan, C.; Sumithra, M.G.; Gao, X.Z.; Elakkiya, B.; Akila, M.; Manoharan, S. DEMNET: A deep learning model for early diagnosis of Alzheimer diseases and dementia from MR images. IEEE Access 2021, 9, 90319–90329. [Google Scholar] [CrossRef]
  35. Raees, P.M.; Thomas, V. Automated detection of Alzheimer’s Disease using Deep Learning in MRI. J. Phys. Conf. Ser. 2021, 1921, 012024. [Google Scholar] [CrossRef]
  36. Mamun, M.; Shawkat, S.B.; Ahammed, M.S.; Uddin, M.M.; Mahmud, M.I.; Islam, A.M. Deep Learning Based Model for Alzheimer’s Disease Detection Using Brain MRI Images. In Proceedings of the 2022 IEEE 13th Annual Ubiquitous Computing, Electronics Mobile Communication Conference (UEMCON), New York, NY, USA, 26–29 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 0510–0516. [Google Scholar]
  37. Helaly, H.A.; Badawy, M.; Haikal, A.Y. Deep learning approach for early detection of Alzheimer’s disease. Cogn. Comput. 2022, 14, 1711–1727. [Google Scholar] [CrossRef] [PubMed]
  38. Liu, S.; Masurkar, A.V.; Rusinek, H.; Chen, J.; Zhang, B.; Zhu, W.; Fernandez-Granda, C.; Razavian, N. Generalizable deep learning model for early Alzheimer’s disease detection from structural MRIs. Sci. Rep. 2022, 12, 17106. [Google Scholar] [CrossRef] [PubMed]
  39. El-Latif, A.A.A.; Chelloug, S.A.; Alabdulhafith, M.; Hammad, M. Accurate detection of Alzheimer’s disease using lightweight deep learning model on MRI data. Diagnostics 2023, 13, 1216. [Google Scholar] [CrossRef]
  40. LeViT Model. Available online: https://github.com/facebookresearch/LeViT (accessed on 2 May 2023).
  41. PyTorch. Available online: https://github.com/lukemelas/EfficientNet-PyTorch (accessed on 12 May 2023).
  42. OASIS Alzheimer’s Dataset. Available online: https://www.kaggle.com/datasets/ninadaithal/imagesoasis (accessed on 21 March 2023).
  43. Alzheimer’s Dataset. Available online: https://www.kaggle.com/datasets/tourist55/alzheimers-dataset-4-class-of-images (accessed on 25 March 2023).
  44. Qi, C.; Chen, J.; Xu, G.; Xu, Z.; Lukasiewicz, T.; Liu, Y. SAG-GAN: Semi-supervised attention-guided GANs for data augmentation on medical images. arXiv 2020, arXiv:2011.07534. [Google Scholar]
  45. Spatial Transformer Network Model. Available online: https://github.com/topics/spatial-transformer-network (accessed on 15 May 2023).
  46. XGBoost Model. Available online: https://github.com/dmlc/xgboost (accessed on 15 May 2023).
Figure 1. The Proposed AD Detection Methodology.
Figure 2. The recommended EfficientNet B7-Based Feature Extraction.
Figure 3. The Enhanced LeViT Model.
Figure 4. (a) Prediction Accuracy and (b) Loss.
Figure 5. The Comparative Analysis Outcomes.
Figure 6. Computational Loss.
Table 1. Dataset Characteristics.
Classes | OASIS Alzheimer’s Dataset | Alzheimer’s Dataset
Mild | 5002 | 896
Moderate | 488 | 64
Normal | 67,200 | 3200
Very mild | 13,700 | 2240
Table 2. Experimental Settings.
Model | Parameters | Values
LeViT | Image Size | 224 × 224 × 3
LeViT | Decay Factor | 0.1 every 10 epochs
LeViT | Initial Learning Rate | 0.001
LeViT | Batches | 43
LeViT | Epochs | 75
LeViT | Loss Function | Cross-Entropy
LeViT | Optimizer | Adam
LeViT | Fusion Layer | Element-wise addition
EfficientNet B7 | Image | 224 × 224 × 3
EfficientNet B7 | Optimizer | Adam
EfficientNet B7 | Loss Function | Cross-Entropy
EfficientNet B7 | Validation Loss Monitor | Early Stopping
EfficientNet B7 | Regularization | Dropout, L1, and L2
EfficientNet B7 | Convolutional Layers | 5
EfficientNet B7 | Activation Function | Softmax
DXB | Learning Rate | (η, [0, 1])
DXB | Minimum Split Loss | (γ, [0, ∞])
DXB | Maximum Tree Depth | ([0, ∞])
DXB | Optimizer | BOHB
Table 3. Outcomes of Performance Validation.
Classes | Accuracy | Specificity | Kappa | Precision | Recall | F1-Score
Mild | 99.8 | 99.9 | 97.5 | 99.3 | 99.5 | 99.4
Moderate | 99.9 | 99.8 | 96.8 | 98.6 | 99.4 | 99.0
Normal | 99.6 | 100 | 97.3 | 99.3 | 99.5 | 99.4
Very mild | 99.8 | 99.8 | 97.9 | 99.5 | 99.6 | 99.5
Table 4. Computational Configurations.
Model | Parameters (in Millions (M)) | FLOPs (in Millions (M)) | Testing Time (Seconds)
Proposed Model | 27 | 42 | 1.02
EfficientNet B7 | 39 | 53 | 2.15
SqueezeNet V1.1 | 46 | 59 | 1.23
MobileNet V3 | 47 | 61 | 2.08
SWIN Transformer | 52 | 59 | 1.36
LeViT | 37 | 45 | 1.56
Table 5. Reliability and Consistency Analysis.
Model | AUROC | AUPRC | SD | CI
Proposed Model | 0.99 | 0.97 | 0.0004 | [95.8–96.8]
EfficientNet B7 | 0.91 | 0.93 | 0.0005 | [95.1–97.5]
SqueezeNet V1.1 | 0.89 | 0.91 | 0.0007 | [94.8–95.9]
MobileNet V3 | 0.85 | 0.86 | 0.0011 | [96.1–97.7]
SWIN Transformer | 0.91 | 0.90 | 0.0006 | [95.7–96.9]
LeViT | 0.92 | 0.91 | 0.0007 | [96.1–96.9]
Table 6. Findings of Comparative Analysis.
Model | Accuracy | Specificity | Sensitivity | AUROC | AUPRC
Proposed Model | 99.8 | 99.8 | 99.4 | 0.99 | 0.97
Raees & Thomas (2021) [35] | 90.1 | 88.7 | 87.6 | 0.84 | 0.81
Mamun et al. (2022) [36] | 97.8 | 95.8 | 96.1 | 0.91 | 0.90
Helaly et al. (2022) [37] | 97.1 | 92.4 | 91.5 | 0.90 | 0.91
El-Latif et al. (2023) [39] | 95.9 | 91.5 | 92.3 | 0.91 | 0.88
Liu et al. (2022) [38] | 86.1 | 78.1 | 80.2 | 0.85 | 0.83
