1. Introduction
Dementia is a condition characterized by a range of symptoms, including memory impairment and difficulties in learning. It occurs due to the loss of brain cells caused by injury or other medical conditions. Among the several types of dementia, Alzheimer’s disease (AD) is the most prevalent. This neurodegenerative disease is characterized by the loss of neurons, particularly in the cortex. The loss is driven by amyloid protein plaques, which damage cells from the outside, and tangles of the protein tau, which destroy cells from within. Amyloid plaques consist of aggregates of beta-amyloid. When these plaques accumulate between neurons, they block neural signaling, impairing cognitive abilities such as memory. The plaques are also believed to initiate an immunological response that induces inflammation, leading to further cellular damage [
1]. The formation of tangles is believed to be triggered by extracellular beta-amyloid, which activates internal signaling pathways within the cell. These pathways drive the phosphorylation of the tau protein by kinases, altering its conformation so that it no longer supports microtubules. The detached tau accumulates into clumps, forming neurofibrillary tangles. Neurons with impaired microtubules and tangles cannot effectively transmit signals, ultimately leading to programmed cell death [
2,
3,
4]. The gyri undergo atrophy, resulting in their narrowing, whereas the sulci and ventricles experience dilation when the brain undergoes shrinkage due to cellular apoptosis [
2]. Most Alzheimer’s disease (AD) cases with onset before the age of 60 (early onset) or around the age of 85 (late onset) are classified as sporadic AD, which accounts for about 90% and 50% of these cases, respectively.
These occurrences are frequently influenced by environmental, behavioral, and genetic factors [
5]. The lack of transparency in the internal mechanisms of cutting-edge AI models makes them appear as black boxes, raising concerns regarding trust [
6,
Justifying each prediction helps to bridge this gap, as metrics like accuracy do not offer conclusive assessments of dependability in real-life situations. This occurs when a model is trained on static data, which may include instances that aid categorization in an experimental scenario but do not accurately reflect real-world circumstances. To improve the generalization of models, it is important to develop a deeper understanding of their behavior through interpretable explanations [
8,
9].
Deep learning applications in image processing span segmentation, classification, and detection tasks. For example, research on multi-source domain adaptation (MSDA) for medical image segmentation improves the performance of segmentation models on unseen datasets by leveraging multiple labeled datasets from different source domains. This addresses the challenge of domain shift, which often causes a drop in performance when a model is trained on one dataset and tested on another [
10,
11]. A proposed AI system integrates unmanned aerial systems (UASs) with computer vision based on the You Only Look Once (YOLO) framework to enable quick and accurate detection and removal of foreign object debris (FOD) on airport runways. The framework utilizes open-world recognition to identify both known and new types of debris, addresses the challenge of limited data through problem-specific data augmentation, and shows improved detection capabilities compared with traditional methods, enhancing runway safety [
12]. To segment the left atrium from 3D MRIs using semi-supervised learning, transformers were used for capturing global context, along with V-Net for detailed local feature extraction. Combining these networks improves accuracy in medical image segmentation tasks with limited labeled data. The framework extends Transformer capabilities to 3D data, coupled with a discriminator module to enhance segmentation results [
13].
ResNet-50, or Residual Network with 50 layers, is a deep convolutional neural network architecture belonging to the ResNet family. It was developed to overcome the challenges associated with training extremely deep neural networks. The key innovation in ResNet-50 involves the incorporation of residual blocks, designed to enable the training of very deep networks without encountering the vanishing gradient problem. These residual blocks include shortcut connections, facilitating the direct flow of gradients through the block and preventing significant degradation. The ResNet-50 architecture consists of 50 layers, encompassing convolutional, pooling, and fully connected layers. It incorporates three main types of blocks: identity blocks, convolutional blocks with a shortcut, and the bottleneck architecture. The bottleneck architecture is particularly noteworthy for reducing computational complexity while preserving representational power.
Mathematically, the residual block in ResNet-50 is expressed as $y = \mathcal{F}(x, \{W_i\}) + x$, where $x$ represents the input to the block, $y$ is the output, $\mathcal{F}(x, \{W_i\})$ denotes the residual mapping to be learned, and $\{W_i\}$ represents the block’s weights. The skip connection allows the input to bypass the residual mapping, facilitating the smooth flow of gradients during backpropagation. Each constituent layer applies an affine transformation, which for a single layer can be written as $z = Wx + b$, contributing to the overall effectiveness of ResNet-50 in training deep neural networks.
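The residual computation described above can be illustrated with a minimal, framework-free Python sketch (a toy two-layer residual mapping with hand-written linear algebra, not the actual ResNet-50 implementation):

```python
def relu_vec(v):
    return [max(0.0, x) for x in v]

def linear(v, w, b):
    """A fully connected layer: y_i = sum_j w[i][j] * v[j] + b[i]."""
    return [sum(wi[j] * v[j] for j in range(len(v))) + bi
            for wi, bi in zip(w, b)]

def residual_block(x, w1, b1, w2, b2):
    """Toy residual block: y = F(x, {W1, W2}) + x, where F is
    linear -> ReLU -> linear and the skip connection adds x back."""
    f = linear(relu_vec(linear(x, w1, b1)), w2, b2)
    return [fi + xi for fi, xi in zip(f, x)]

# With all weights zero, F(x) = 0 and the block reduces to the identity
# map, so gradients can always flow through the skip connection unchanged.
x = [1.0, -2.0, 3.0]
zeros_w = [[0.0] * 3 for _ in range(3)]
zeros_b = [0.0] * 3
print(residual_block(x, zeros_w, zeros_b, zeros_w, zeros_b))  # → [1.0, -2.0, 3.0]
```

The zero-weight case illustrates why residual connections ease the training of very deep networks: a block can default to passing its input through unchanged rather than having to learn an identity mapping from scratch.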
ResNet-50 has demonstrated impressive performance across various computer vision tasks, with a notable emphasis on image classification. It functions as a robust feature extractor after pre-training on extensive datasets such as ImageNet. This quality is particularly advantageous for transfer learning scenarios, especially when dealing with applications with constraints on labeled data. The architecture’s depth, combined with the incorporation of skip connections, enhances its capability to capture intricate hierarchical features. These attributes collectively contribute to ResNet-50 being widely favored and integrated into cutting-edge deep learning models [
14,
15]. In this study, we appended a global average pooling layer to down-sample the output of the ResNet-50 model, computing the average value of each feature map in the input tensor so that every channel is reduced to a single value:

$GAP_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} x_{i,j,c}$,

where $x_{i,j,c}$ represents the value at position $(i, j)$ in channel $c$ of the feature map, $H$ is the height of the feature map, and $W$ is its width. The summation is performed over all positions in the feature map. This operation reduces dimensionality before feeding the features into fully connected layers. We then created a dense (fully connected) layer with three units and a softmax activation function to convert raw scores into probability distributions over the classes. For a given vector $z$ of $k$ real numbers, the softmax function is computed element-wise by

$\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{k} e^{z_j}}$,

where $e$ is Euler’s number, $z_i$ is the $i$-th element of the input vector $z$, and the denominator is the sum of the exponentials of all elements in the vector. This function returns non-negative values that represent valid probabilities summing to 1. The predicted class is the one with the highest probability [
16]. The model was then compiled as a Keras model, specifying inputs from the predefined ResNet-50 model and outputs from the dense layer.
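The pooling-and-softmax head described above can be sketched without any deep learning framework (a toy example on a tiny hand-made feature map; the actual model applies Keras layers to ResNet-50 feature maps):

```python
import math

def global_average_pool(fmap):
    """fmap: H x W x C nested lists -> length-C vector of per-channel means."""
    H, W, C = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [sum(fmap[i][j][c] for i in range(H) for j in range(W)) / (H * W)
            for c in range(C)]

def softmax(z):
    m = max(z)                               # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# A 2 x 2 feature map with 2 channels collapses to one value per channel.
fmap = [[[1.0, 0.0], [3.0, 2.0]],
        [[5.0, 4.0], [7.0, 6.0]]]
pooled = global_average_pool(fmap)
probs = softmax(pooled)
print(pooled)  # → [4.0, 3.0]
```

The softmax output is non-negative and sums to 1, so the index of its largest entry can be read directly as the predicted class.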
The primary objectives of this study are as follows:
To develop a reliable and accurate classification model utilizing a deep transfer learning architecture.
To extract and represent deep features relevant for inclusion and exclusion during classification.
To enhance the interpretability of results by exploiting XAI techniques through visualization.
2. Related Works
Explainable artificial intelligence (XAI) has played a pivotal role in medical applications over recent decades, with particular emphasis on addressing the complexities inherent in diagnosing neurodegenerative diseases such as Alzheimer’s disease, vascular dementia, Parkinson’s disease, and other disorders related to cognitive decline. Classical machine learning (ML) methods, e.g., support vector machines and random forests, trade accuracy for interpretability, whereas deep learning (black-box) models trade interpretability for accuracy. The incorporation of XAI techniques in this domain is driven by the overarching goal of augmenting transparency, interpretability, and trust in machine learning models. This strategic integration aims to furnish clinicians with in-depth insights into the intricate decision-making processes of these black-box models. A review of the pertinent literature reveals a multitude of studies applying XAI methodologies to Alzheimer’s diagnosis, underscoring the substantive contributions within this rapidly evolving field.
Muthamil Sudar et al. [
17] endeavored to delineate the various stages of Alzheimer’s disease through the utilization of the layer-wise relevance propagation (LRP) method within the realm of explainable artificial intelligence (XAI), employing image data as input. Beyond LRP, the study incorporated additional algorithms, such as VGG-16 and CNN, with the goal of improving overall performance and achieving heightened batch accuracy. The primary focus of the project was to enable a thorough analysis of Alzheimer’s disease by leveraging XAI, accompanied by detailed feature explanations. The findings outlined in the article provide a lucid comprehension of Alzheimer’s analysis, reinforcing the results with elaborate explanations to bolster the trustworthiness and dependability of XAI.
The investigation by El-Sappagh, S. et al. [
18] highlights the intuitive significance of cognitive scores, including CDRSB and MMSE, in effectively identifying patients with Alzheimer’s disease (AD), a consensus validated by domain experts. However, when focusing on progression detection, the study reveals that the volumes of Hippocampus and MidTerp obtained from MRI images, coupled with FDG and SROI from PET images, exert notable influence. The study employs SHAP explainers to compute feature contributions of random forest (RF) models, as delineated in the Explainability Capabilities Section of the Material and Methods. A condensed overview of the explainer’s responsiveness to diverse feature values is presented for both the first and second layers of the research findings. However, this study employed SHAP on clinical variables that are unlikely to be interpretable by patients and entry-level practitioners, unlike the approach we propose.
In an investigation by Achraf Essemlali et al. [
19], an experiment employing explainable AI aimed to unravel the connectomic structure associated with Alzheimer’s disease (AD). Through the utilization of a CNN trained on the brain connectomes of ADNI patients, the researchers executed an ablation procedure to showcase that the manifestation of AD is not solely linked to a specific brain region but rather results from the cumulative effects across various cortical regions. The study underscored the entorhinal region as the most notable distinction between AD and normal control (NC) groups, while the hippocampus exhibited significance in the comparison between mild cognitive impairment (MCI) and NC. These findings align with established research methodologies such as voxel-based morphometry, cortical thickness, or functional connectomics in AD studies. The research signifies the potential of deep convolutional networks in providing intricate insights into the complexities of neurodegenerative diseases. However, the study emphasizes the necessity for a cautious interpretation of the saliency map, acknowledging that the correlation with neural net predictions may be influenced by variations in structural connectivity estimated from DW-MRI. Our study puts this notion into practice by applying a channel-wise attention mechanism to enhance the performance within Grad-CAM.
The study by Eduardo Nigri et al. [
20] introduces the swap test method as a novel approach to generate heatmaps, offering insight into the key brain regions indicating Alzheimer’s disease (AD) for improved interpretability by clinicians. Through axiomatic evaluation experiments, it is demonstrated that the swap test outperforms a conventional occlusion test in explaining AD diagnosis using MRI data. These findings suggest that the swap test has the potential to mitigate the inherent black box nature of deep neural networks commonly used in AD diagnosis, providing a valuable tool to enhance transparency and interpretability in the decision-making process. In our study, we employ XAI techniques with lower computational complexity yet similarly achieve results consistent with the medical literature.
Shangran Qiu et al. [
21] introduce a sophisticated deep learning pipeline that combines a fully convolutional network (FCN) with a multilayer perceptron (MLP) to directly predict Alzheimer’s disease status using MRI data or a blend of MRI and non-imaging data. The FCN produces high-resolution disease probability maps illustrating local cerebral morphology and Alzheimer’s risk. Leveraging these maps and non-imaging features like age, gender, and MMSE score, the MLP achieves accurate predictions across diverse cohorts. The FCN is specifically trained on randomly selected sub-volumes of MRI data, allowing for efficient processing without redundant decomposition of full-sized test images. The study underscores the interpretability of disease probability maps and their anatomical consistency, shedding light on structures most impacted by neuropathological changes in Alzheimer’s disease. Population-wide maps of the Matthews correlation coefficient contribute to identifying crucial regions for precise disease status predictions. Our approach, however, achieves high prediction performance with anatomical consistency as well, without the dependence on non-imaging data, which is prone to error if obtainable in practice.
In the context of predicting biological age (BA), I. Boscolo Galazzo et al. [
22] explore the valuable framework of the BA prediction paradigm, aiming to understand the underlying factors influencing an individual’s biological age and to characterize diverse aging trajectories. This paradigm not only offers insights into brain mechanisms but also provides a means to identify potential risks associated with cognitive aging and age-related brain disorders. The study emphasizes the promising potential of both ML and deep learning (DL) approaches, particularly in multimodal settings, and highlights the significance of investigating specific BA estimates derived from selective and regional ensembles of intrinsic disorder profiles (IDPs). Furthermore, there is an emphasis on the crucial role of explainable AI (XAI) in enhancing BA prediction, as it contributes to the interpretability of linear and latent variable models, providing user-friendly visualizations of essential features and supporting the application of complex deep models. Our study bridges the gap between complexity and interpretability cited by the authors using model-agnostic and model-specific approaches.
The recent study by Yousefzadeh et al. [
23] introduces a novel explainable AI framework called “Granular Neuron-level Explainer” (LAVA) aimed at assessing Alzheimer’s disease (AD) using retinal fundus images. LAVA delves into the intermediate layers of a CNN model to identify key neurons that play a significant role in distinguishing between various stages of AD, thus offering an interpretable diagnostic method. Leveraging data from the UK Biobank, the research demonstrates LAVA’s effectiveness in differentiating AD stages by analyzing retinal vascular features, suggesting that retinal imaging could be a valuable, non-invasive tool for early AD diagnosis. However, the study acknowledges the limitations of its small sample size and emphasizes the need for further research to validate these findings.
3. Materials and Methods
In this paper, the input data are sent to a deep learning model for multiclass classification. The resulting predicted output is evaluated using two explainable AI techniques (
Figure 1). The method we propose is based on a ResNet-50 network coupled with a channel-wise attention mechanism to perform classification. We created a local ResNet-50 model initialized with ImageNet weights, which we then trained internally on the MRI scans. This curbs the risk of inadvertent sharing of participant-level data while preserving the analytical advantage of our approach. To assess the model’s predictions, we first employ LIME with the quickshift segmentation method to highlight key features as superpixels. We use 150 perturbations, a quickshift kernel size of 70, a maximum distance of 200, and a ratio of 0.2. We generate superpixels with this set-up and analyze the output image. Second, we use Grad-CAM to create a superimposed image displaying a heat map based on the final feature map from the last convolutional layer. We apply channel-wise attention within Grad-CAM to enhance the quality of the three channels when generating the final jet heat map.
To conduct the experiments in this research, we used data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), disseminated by the Laboratory of Neuro Imaging at the University of Southern California. Samples can be seen in
Figure 2.
This study is based on the publicly available, large-scale ADNI dataset consisting of 10,346 sagittal brain MRI scans, categorized into three classes, as shown in
Table 1: normal cognition (NC), mild cognitive impairment (MCI), and Alzheimer’s disease (AD). Normal cognition is the label for patients who show no indicators of cognitive decline, such as memory loss or motor impairments, as evidenced by the lack of pathological changes in their brain scans. Patients with mild cognitive impairment show signs of forgetfulness, consistent with deterioration around the hippocampus. This progresses into moderate dementia, where patients begin to forget their personal history. As cognition continues to decline and the patient develops Alzheimer’s disease, more historical details are forgotten, accompanied by confusion due to neuronal damage throughout the cerebral cortex, as evidenced by the narrowing of the gyri coupled with the dilation of the sulci and ventricles. Severe dementia from this point requires supervision, as patients begin to forget their family members and require assistance with daily activities due to the development of motor symptoms; according to the global deterioration scale, this is the stage before death [
2,
3,
4].
These sets were then normalized through mean standardization and thereafter used to train a deep learning model based on ResNet-50. The positive outcome derived from the classification was used as input for the XAI models. The standardization is given by

$Z = \frac{X - \text{Mean}(X)}{\text{Standard Deviation}(X)}$,

where $Z$ represents the standardized value of the original variable $X$, $\text{Mean}(X)$ is the average of $X$, and $\text{Standard Deviation}(X)$ quantifies how much each data point differs from the dataset’s average. In this study, we implemented a channel-wise self-attention mechanism to improve feature representation in a convolutional neural network. This mechanism consists of an attention block. The input tensor is initially processed through a convolutional layer with a 1 × 1 kernel and ReLU activation to produce an intermediate feature map. This map then passes through another convolutional layer with a 1 × 1 kernel and sigmoid activation to generate attention weights. These weights are applied element-wise to the original input tensor, highlighting crucial channels and diminishing less important ones. The attention block is integrated twice within the network, after convolution and pooling layers, to dynamically modify channel significance at various stages, thereby enhancing feature extraction and classification accuracy. This methodology showcases the efficacy of channel-wise attention in boosting neural network performance. The mechanism is denoted as follows:

$M = \text{ReLU}(W_1 * F + b_1)$,

where the input feature map $F$ is a tensor representing the original input features extracted from the previous layers, with dimensions corresponding to the image’s height, width, and number of channels; the weight matrix $W_1$ of the first convolutional layer uses a 1 × 1 kernel to transform these input features, with learnable parameters adjusted during training; the bias vector $b_1$ adds bias terms to each channel output of the first convolutional layer; and the Rectified Linear Unit (ReLU) activation function introduces non-linearity by setting negative values to zero. The intermediate feature map $M$ results from this convolution and ReLU activation, serving as an intermediate representation of the input features.
The attention map

$A = \sigma(W_2 * M + b_2)$

is calculated by applying the sigmoid activation function $\sigma$ to the result of a second convolution. This involves the weight matrix $W_2$ of the second convolutional layer, which uses a 1 × 1 kernel to further process the intermediate feature map $M$, along with the bias vector $b_2$, which adds bias terms to each channel output. The sigmoid function scales these values to the range [0, 1], creating the attention weights in the attention map $A$, indicating each channel’s importance. The original input feature map $F$ is then element-wise multiplied with this attention map,

$F' = F \odot A$,

yielding the attended feature map $F'$, which enhances significant channels and suppresses less relevant ones [21, 22, 23, 24].
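As an illustration, the attention block above can be sketched in plain Python (a toy version operating on nested lists; the actual implementation uses 1 × 1 convolutional layers in a deep learning framework):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv1x1(t, w, b, act):
    """A 1 x 1 convolution mixes only channels, so per pixel it reduces
    to a linear map over the channel vector followed by an activation."""
    return [[[act(sum(w[c][k] * px[k] for k in range(len(px))) + b[c])
              for c in range(len(b))]
             for px in row]
            for row in t]

def channel_attention(fmap, w1, b1, w2, b2):
    """M = ReLU(W1*F + b1); A = sigmoid(W2*M + b2); return F ⊙ A."""
    m = conv1x1(fmap, w1, b1, lambda v: max(0.0, v))   # intermediate map
    a = conv1x1(m, w2, b2, sigmoid)                    # attention in (0, 1)
    return [[[f * g for f, g in zip(fp, ap)]           # element-wise product
             for fp, ap in zip(frow, arow)]
            for frow, arow in zip(fmap, a)]

# With a strongly positive bias the sigmoid saturates near 1 and the
# block passes the input through; a strongly negative bias suppresses it.
fmap = [[[2.0, 3.0]]]                      # a 1 x 1 image with 2 channels
eye = [[1.0, 0.0], [0.0, 1.0]]
zero_w = [[0.0, 0.0], [0.0, 0.0]]
keep = channel_attention(fmap, eye, [0.0, 0.0], zero_w, [50.0, 50.0])
drop = channel_attention(fmap, eye, [0.0, 0.0], zero_w, [-50.0, -50.0])
print(keep, drop)
```

Because the sigmoid weights are learned per channel, the trained block can keep some channels near their original values while attenuating others, which is precisely the selective emphasis the mechanism is designed to provide.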
4. Results
Figure 3 and
Figure 4 depict the training and prediction performance of our model in a visual format. Training accuracy is given by

$\text{Training Accuracy} = \frac{\text{Number of correct predictions on the training set}}{\text{Total number of training samples}}$,

and validation accuracy is defined analogously over the validation set:

$\text{Validation Accuracy} = \frac{\text{Number of correct predictions on the validation set}}{\text{Total number of validation samples}}$.

Figure 3.
Model accuracy against the count of epochs during training and validation.
Figure 4.
Confusion matrix for the pre-trained model.
The confusion matrix in
Figure 4 displays the network’s predictive performance on the three classes: NC (normal cognition), MCI (mild cognitive impairment), and AD (Alzheimer’s disease).
Figure 3 is a visualization of the deep learning model’s performance, where the highest training accuracy of 85% is displayed, calculated through Equation (5). The pre-trained backbone was kept frozen during training, leaving 0 of the 23,587,712 ResNet-50 parameters trainable.
Figure 4 visually represents the label predictions using the confusion matrix. One example from the test set produced a positive prediction for mild cognitive impairment (MCI), as shown in
Figure 5. This image served as the input for our XAI experiments.
In this research, we utilized local interpretable model-agnostic explanations (LIME) as a method for explainable AI. LIME was implemented to demonstrate how the classifier behaves in the vicinity of the predicted instance, focusing on local fidelity. The quickshift segmentation technique generates superpixels, with each superpixel represented by a binary vector. Quickshift estimates the density of each pixel with a Parzen window,

$P(x_i) = \sum_{j} \exp\!\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$,

where the feature vectors $x_i$ and $x_j$ represent pixel positions and intensities, $\sigma$ is a bandwidth parameter, and a ratio constant balances color similarity against spatial proximity to speed up computations. The algorithm detects image regions by repeatedly moving each pixel toward the density mode of its neighbors until convergence, using the distance measure above to assess the similarity between pixels.
A value of 1 in the vector indicates that the original superpixel is kept, whereas 0 indicates a greyed-out superpixel. The perturbed data points are assigned weights based on their closeness to the original example in order to train an interpretable model on the corresponding predictions. A binary matrix is created with perturbations as rows and superpixels as columns: an activated superpixel is represented by 1, a deactivated (off) superpixel by 0. The cosine distance is calculated between each randomly generated perturbation and the explained image. The distances are transformed into a numerical range of 0 to 1 using a kernel function, and the coefficients of the fitted model are then sorted to identify the superpixels with higher magnitudes. This process masks less significant superpixels and produces an image that highlights the most significant ones. The main equation for LIME is given as follows:

$\xi(x) = \arg\min_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$,

where $\xi(x)$ represents the local surrogate explanation, $f$ is the black-box model being explained, $g$ is the interpretable (surrogate) model drawn from a family $G$, $\pi_x$ is the perturbation distribution for instance $x$, $\Omega(g)$ is a complexity measure kept low to preserve interpretability, and $x'$ is the interpretable representation of $x$. Instances from the neighborhood around the instance $x$ are sampled from the distribution $\pi_x$: $z$ denotes a perturbed instance in the original feature space and $z'$ its binary interpretable representation. The perturbation distribution $\pi_x$ indicates how instances are sampled from the neighborhood. The difference between the predictions of the black-box model $f$ for a perturbed instance $z$ and the predictions of the interpretable model $g$ for its representation $z'$ enters the locality-weighted loss,

$\mathcal{L}(f, g, \pi_x) = \sum_{z, z'} \pi_x(z)\left(f(z) - g(z')\right)^2$

(Figure 6).
Given that the LIME methodology includes perturbing the instance $x$, sampling perturbed instances $z$ from the neighborhood, obtaining predictions from the black-box model $f$, and fitting an interpretable model $g$ to locally approximate $f$’s behavior, the interpretable representation $x'$ is utilized to emphasize significant features or regions represented by superpixels. The deactivation of these superpixels returns a greyed-out representation of the features excluded from the model’s prediction (
Figure 6). Conversely, the activation of superpixels returns a mapping of the feature regions that were relevant to the model’s prediction (
Figure 7).
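The perturbation-and-weighting logic described above can be sketched in plain Python. For brevity, this toy version replaces LIME’s weighted linear regression with a kernel-weighted mean difference per superpixel, and the black-box model `f` is a hypothetical stand-in:

```python
import math, random

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 1.0                   # an all-off mask is maximally distant
    return 1.0 - dot / (nu * nv)

def lime_importances(predict, n_superpixels, n_samples=150, kernel_width=0.25):
    """Sketch of LIME's locality weighting: sample binary masks, weight
    them by kernelized cosine distance to the unperturbed image, and
    score each superpixel by the weighted mean change it causes."""
    random.seed(0)
    ones = [1] * n_superpixels       # the original image: all superpixels on
    samples = [[random.randint(0, 1) for _ in range(n_superpixels)]
               for _ in range(n_samples)]
    weights = [math.exp(-(cosine_distance(z, ones) ** 2) / kernel_width ** 2)
               for z in samples]
    scores = [predict(z) for z in samples]
    wmean = lambda ws: sum(w * s for w, s in ws) / sum(w for w, _ in ws)
    imp = []
    for j in range(n_superpixels):
        on = [(w, s) for z, w, s in zip(samples, weights, scores) if z[j]]
        off = [(w, s) for z, w, s in zip(samples, weights, scores) if not z[j]]
        imp.append(wmean(on) - wmean(off))
    return imp

# Hypothetical black box that mostly depends on superpixel 0.
f = lambda z: 0.9 * z[0] + 0.05 * z[2]
imp = lime_importances(f, n_superpixels=4)
print(max(range(4), key=lambda j: imp[j]))  # → 0, the dominant superpixel
```

Sorting the resulting importances and masking the low-magnitude superpixels yields exactly the highlighted/greyed-out visualizations shown in Figures 6 and 7.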
Additionally, we employed gradient-weighted class activation mapping (Grad-CAM) to visually identify significant areas that contribute to the model’s prediction. A convolutional neural network analyzes the image and extracts features at various resolutions. The last layer of the network generates probability-based scores representing the classification of the image. The class score is calculated by

$y^c = \sum_{k} w_k^c A^k$,

where $y^c$ represents the score for class $c$, $w_k^c$ is the weight of the $k$-th feature map for class $c$, and $A^k$ is the $k$-th activation map.
The gradient of the predicted score $y^c$ with respect to the feature maps of the last CNN layer is computed to quantify the impact of each feature on the class score. The significance of each feature map is determined by taking the average of these gradients,

$\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A_{ij}^k}$,

where $y^c$ represents the score for class $c$ in the network, $A^k$ represents the activation map for the $k$-th feature in the final convolutional layer, $\frac{\partial y^c}{\partial A_{ij}^k}$ is the gradient of the score with respect to the activation at position $(i, j)$, and $Z$ is the number of spatial positions summed over in each activation map. These are the same gradients used during backpropagation to minimize loss and improve classification performance. The heat map is passed through the ReLU function so that only positive values contribute to the visualization, and the intensity of each pixel in the heat map corresponds to a spatial location in the image. Smoothing the weighted combination of activation maps softens sharp edges and yields the final interpretable visualization,

$L_{\text{Grad-CAM}}^c = \text{ReLU}\!\left(\sum_{k} \alpha_k^c A^k\right)$,

where ReLU, the rectified linear unit activation function, sets negative values to zero, $\sum_k$ denotes the summation over all activation maps, $\alpha_k^c$ represents the weight of the $k$-th activation map for class $c$, and $A^k$ is the $k$-th activation map. The final normalized heat map accentuates the areas upon which our deep learning algorithm relies to provide predictions [
8,
9,
25,
26,
27]. We employed a channel-wise self-attention mechanism to highlight significant features within each channel of the input tensor (
Figure 8). This mechanism was accomplished through two convolutional layers with (1, 1) kernels that generate an attention map, which is then element-wise multiplied with the input tensor. The initial convolution layer, followed by a ReLU activation, formed an intermediate representation, while the subsequent convolution layer with a sigmoid activation created the attention weights. These weights adjust the emphasis on different channels, enabling the model to enhance or reduce specific features, thereby improving the learning process and overall performance [
24,
28,
29,
30].
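The Grad-CAM computation summarized above, gradient-averaged weights followed by a ReLU over the weighted activation maps, can be sketched as a small framework-free example (the gradients here are supplied by hand rather than obtained via backpropagation):

```python
def grad_cam(activations, gradients):
    """Toy Grad-CAM: activations and gradients are lists of K feature
    maps, each an H x W nested list.  The weight alpha_k of map k is the
    global average of its gradients; the heat map is the ReLU of the
    weighted sum of activation maps."""
    K = len(activations)
    H, W = len(activations[0]), len(activations[0][0])
    Z = H * W
    alphas = [sum(gradients[k][i][j] for i in range(H) for j in range(W)) / Z
              for k in range(K)]
    return [[max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(K)))
             for j in range(W)]
            for i in range(H)]

acts = [[[1.0, 0.0], [0.0, 0.0]],       # map 0 fires top-left
        [[0.0, 0.0], [0.0, 1.0]]]       # map 1 fires bottom-right
grads = [[[1.0, 1.0], [1.0, 1.0]],      # class score rises with map 0
         [[-1.0, -1.0], [-1.0, -1.0]]]  # ...and falls with map 1
print(grad_cam(acts, grads))  # → [[1.0, 0.0], [0.0, 0.0]]
```

The ReLU discards the bottom-right region whose activations push the score down, so only evidence in favor of the class survives in the heat map; the channel-wise attention we add on top of this re-weights the maps before they are combined.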
5. Discussion
The challenges associated with diagnosing and predicting the progression of Alzheimer’s disease (AD) have been mitigated by artificial intelligence (AI), particularly in image data analysis. However, in contrast to various brain disorders, AD remains difficult to comprehend despite being relatively easy to categorize based on established features. This study utilized a glass-box methodology to illustrate that the deep learning model generated positive predictions corresponding to the specified characteristics. What makes this research unique is the application of the channel-wise attention mechanism not only to the deep learning model but also to the XAI model, as shown in
Figure 9, wherein the left shows the base Grad-CAM with a generalized heat map while the right shows the model with channel-wise attention applied, returning a well-defined heat map. The image on the left of
Figure 9 was created using traditional Grad-CAM and, in comparison with our attention-based approach, it is evident that the heat map generated with channel-wise attention highlights the region of interest more accurately, in line with the medical literature on MCI diagnosis: the hippocampal area shows early signs of degradation in cases of mild cognitive impairment. A few challenges were met during the conceptualization of this study. Deciding which techniques to experiment with was not straightforward, as we sought to avoid redundant output while having the methods supplement the study collectively. Since the quickshift method is highly sensitive, gradual hyperparameter tuning was required to obtain an acceptable kernel size of 70. Applying the attention mechanism improves the quality of the explanation on a more granular scale than traditional tuning, since each channel in the feature map is considered separately and the focus falls only on the most relevant channels. This approach verifies the classification while enhancing localization, and the scope of the methodology is not limited to MRI studies; our method can be employed for general imaging tasks such as object detection. The flexibility of this framework also allows for experimentation with multimodal data such as MRI and diffusion tensor imaging (DTI).
Our proposed method employs a hybrid model combining ResNet-50 with attention mechanisms and XAI techniques to enhance the interpretability of Alzheimer’s disease (AD) classifications. The use of channel-wise attention improves feature extraction, and the integration of XAI methods provides visual insights into model predictions, helping to identify regions of interest like the hippocampus, a critical area associated with Alzheimer’s disease. In contrast to previous studies, such as Muthamil Sudar et al. [
17] who used layer-wise relevance propagation (LRP) to explain AD progression, and El-Sappagh et al. [
18], who used SHAP with random forest for clinical data, the current study leverages deeper models and focuses more on the explainability in an image-centric context. Sudar et al. aimed at batch-level accuracy through models like VGG-16 and CNN, while the current study integrates attention mechanisms directly into the deep learning architecture to boost performance and interpretability. Furthermore, Jahan Sobhana et al. [
31] used multimodal data and random forest with SHAP for predicting multiple AD classes. The current research, by focusing solely on MRI data and applying advanced explainability techniques like Grad-CAM, provides a more targeted approach in visualizing critical brain areas contributing to the diagnosis, differentiating it from models that depend heavily on multimodal inputs. This proposed framework delivers a highly explainable AI model capable of identifying specific pathological changes linked to AD, bridging the gap between model accuracy and interpretability for clinical use.