Article

Interpretable Deep Learning for Pneumonia Detection Using Chest X-Ray Images

1 Computer Science Department, BINUS Graduate Program—Master of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
2 Department of Electrical, Electronic and Communication Engineering, Faculty of Engineering, Tokyo City University, Setagaya-ku, Tokyo 158-8557, Japan
* Author to whom correspondence should be addressed.
Information 2025, 16(1), 53; https://doi.org/10.3390/info16010053
Submission received: 13 November 2024 / Revised: 9 January 2025 / Accepted: 13 January 2025 / Published: 15 January 2025

Abstract
Pneumonia remains a global health issue, creating the need for accurate detection methods for effective treatment. Deep learning models like ResNet50 show promise in detecting pneumonia from chest X-rays; however, their black-box nature limits transparency and falls short of the level needed for clinical trust. This study aims to improve model interpretability by comparing four interpretability techniques, namely Layer-wise Relevance Propagation (LRP), Adversarial Training, Class Activation Maps (CAMs), and the Spatial Attention Mechanism, and determining which best fits the model, enhancing its transparency with minimal impact on its performance. Each technique was evaluated for its impact on the accuracy, sensitivity, specificity, AUC-ROC, Mean Relevance Score (MRS), and a calculated trade-off score that balances interpretability and performance. The results indicate that LRP was the most effective in enhancing interpretability, achieving high scores across all metrics without sacrificing diagnostic accuracy. The model achieved 0.91 accuracy and 0.85 interpretability (MRS), demonstrating its potential for clinical integration. In contrast, Adversarial Training, CAMs, and the Spatial Attention Mechanism showed trade-offs between interpretability and performance, each highlighting unique image features but with some impact on specificity and accuracy.

1. Introduction

Recent advancements in deep learning, particularly convolutional neural networks (CNNs), have revolutionized the field of image recognition by enabling models to automatically learn complex, hierarchical representations from raw data [1]. These capabilities have led to significant breakthroughs in applications such as computer vision, autonomous systems, and medical diagnostics, but the increasing complexity of CNN architectures has introduced a critical challenge: the lack of interpretability in their decision-making processes. This challenge is especially pronounced in high-stakes domains such as healthcare, where understanding the rationale behind model predictions is essential for clinical trust and decision-making. Interpretability connects model outputs with clinical insights, allowing healthcare professionals to validate and trust automated diagnoses. Pneumonia, a common and potentially life-threatening respiratory infection, exemplifies the need for such trustworthy systems [2].
In image recognition, with its potential uses from medical diagnostics to autonomous systems, the demand extends beyond just accuracy to understanding the reasoning behind the predictions. The interpretability of deep learning models serves as the critical link between the complex internal workings of these architectures and the comprehension needs of decision-makers and end-users. In fields like medical image analysis, the interpretability of model decisions holds significance for fostering trust among healthcare professionals and facilitating the seamless integration of deep learning technologies into clinical practice. For instance, medical professionals require insights into why a model might diagnose a certain condition, a necessity that directly impacts patient care. Pneumonia, a common and potentially fatal respiratory infection, poses a significant global health risk, especially to vulnerable groups like children, the elderly, and immunocompromised individuals [3].
Accurate diagnosis of pneumonia is crucial for initiating appropriate treatment and preventing complications. Chest X-ray (CXR) imaging is a common diagnostic modality used for pneumonia detection, allowing healthcare professionals to visualize and assess lung abnormalities associated with the disease [4].
Despite the widespread use of chest X-rays (CXRs) to diagnose pneumonia, traditional methods often suffer from significant misdiagnosis rates, as shown in Table 1. Between 2010 and 2019, pneumonia remained a major public health concern worldwide, with notable trends in confirmed cases, misdiagnosis rates, and deaths. The number of confirmed cases increased steadily, from approximately 220 million in 2010 to 280 million in 2019 [5]. This rise can be attributed to population growth, improved disease tracking systems, and better awareness of respiratory illnesses. Despite advancements in diagnostics, misdiagnosis rates for pneumonia have shown variability. As indicated in Table 1, misdiagnosis rates ranged from approximately 1.4% to 3.0% between 2010 and 2019, which contrasts with prior reports suggesting rates as high as 30% in certain clinical scenarios. This discrepancy may result from differences in sample populations, diagnostic methods, and reporting criteria [3]. Recent advances in medical image analysis, through deep learning and CNNs, have shown promising results in detecting pneumonia from chest X-rays (CXRs). CNNs excel in medical imaging as they can learn complex, disease-relevant patterns directly from raw pixel data, potentially enhancing diagnostic accuracy and efficiency [6,7]. Among these, ResNet-50 has emerged as a widely adopted model because of its efficient feature extraction and deep architecture, which allow it to capture intricate patterns in chest X-ray images [8].
However, these deep learning models operate as “black boxes”, providing predictions without revealing which spatial regions of the input image contributed most to the decision-making process [9]. This means their internal decision-making processes are not easily interpretable, which poses a challenge for real-world use, particularly in critical areas like medical diagnostics where user trust is essential. Without transparency, reliance on these models becomes difficult, as users are skeptical and uncertain about the rationale behind predictions, which is a significant barrier to widespread adoption [10]. Therefore, although deep learning models perform exceptionally well, their black-box nature limits their acceptance in high-stakes applications. There is a pressing need for interpretable models that not only achieve high accuracy but also provide transparent decision-making processes, ensuring they are both reliable and suitable for deployment in sensitive fields like healthcare.
Several previous studies have attempted to address this interpretability gap by incorporating saliency maps or heatmap-based visualizations. One study demonstrated that CNN models could achieve radiologist-level performance in pneumonia detection; however, the study did not focus on enhancing the interpretability of the results [11]. Meanwhile, another study reported that CNNs trained on chest X-rays are susceptible to dataset biases and can fail to generalize in unseen clinical scenarios, further limiting trust in the model’s predictions [12]. These studies underscore the need for methods that not only improve performance but also highlight spatial regions of interest contributing to the pneumonia diagnosis.
Another critical challenge that affects deep learning models in medical imaging is class imbalance within datasets. Typically, chest X-ray datasets exhibit a disproportionate number of healthy cases compared to pneumonia-positive cases, leading to biased model predictions that favor the majority class [13]. Data augmentation techniques have been widely used to mitigate class imbalance by generating additional synthetic training samples. However, augmentation alone does not guarantee improved interpretability or a focus on clinically relevant regions [14].
To address these challenges, this study investigates the integration of interpretability techniques with CNNs to balance performance and transparency in pneumonia detection from CXRs. Specifically, four interpretability techniques, which are Layer-wise Relevance Propagation (LRP), Class Activation Maps (CAMs), Adversarial Training, and the Spatial Attention Mechanism, are systematically evaluated for their effectiveness in enhancing model transparency while maintaining diagnostic accuracy. By comparing these methods, this study identifies the most effective approach for improving the interpretability of deep learning models in medical diagnostics, with a focus on their practical application in clinical settings.

2. Related Works

Ren et al. [15] proposed a novel approach to pneumonia detection that integrates deep learning with interpretable models. Their work utilizes a combination of Bayesian networks and deep CNN architectures, including ResNet-50, DenseNet121, and AlexNet. The primary focus was on leveraging multisource data to enhance both diagnostic accuracy and interpretability. They achieved their best results with ResNet-50, reporting an accuracy of 82.9% and an interpretability score of 75.9%, measured using custom evaluation metrics. This research is particularly significant as it introduces a balanced framework for integrating interpretability with high-performing CNN models. By employing explainable AI (XAI) techniques alongside Bayesian reasoning, Ren et al. highlighted how interpretable insights can be derived without compromising diagnostic efficacy. The multisource data approach also provides a more comprehensive analysis, which is crucial in medical imaging tasks where heterogeneity in data can affect model robustness.
Rajaraman et al. [16] emphasize the need for interpretability in CNN-based medical diagnosis due to the “black box” nature of these models. Their study integrates interpretability methods like Grad-CAM and LIME into CNN predictions, making the model’s decision process more transparent. The research compares an optimized custom CNN and a pre-trained VGG16, to distinguish between normal, bacterial, and viral pneumonia cases. The pre-trained ResNet-50 model performs best, achieving 96.2% accuracy and an interpretability score of 91.8% (measured by MCC), while the custom CNN reaches 94.1% accuracy and 87.3% interpretability. This advancement in CNN interpretability holds significance for pediatric pneumonia diagnosis, as it combines high accuracy with valuable model transparency (Table 2).
Aljawarneh and Al-Quraan [17] explored the integration of Adversarial Training and explainability techniques to enhance the performance and interpretability of CNN models. They focused on the ResNet-50 architecture, utilizing methods such as Layer-wise Relevance Propagation (LRP) and Attention Mechanisms. Their findings revealed that Adversarial Training significantly improved model robustness, achieving accuracy ranges between 82.8% (lowest) and 92.4% (highest) across different test conditions. The interpretability of the model was also evaluated, with results showing a range from 79.6% to 87.3% in interpretability. These results demonstrate that adversarial techniques, when combined with interpretability methods like LRP and attention, not only improve model accuracy but also provide transparency into the decision-making process of the model.
Another study by Siddiqi and Javaid [18] examined the effectiveness of different XAI techniques in conjunction with several deep CNN architectures, including ResNet-50, VGG-16, DenseNet, AlexNet, and MobileNet. Their research focused on methods such as Grad-CAM, LIME (Local Interpretable Model-Agnostic Explanations), and SHAP (SHapley Additive exPlanations), seeking to enhance the interpretability of these networks. The authors achieved an impressive performance with ResNet-50, reaching an accuracy of 99.39%, the highest among all tested models. Furthermore, their interpretability metrics were also outstanding, with ResNet-50 achieving an interpretability score of 94.96%. These results underline the effectiveness of Grad-CAM, LIME, and SHAP in providing insights into the workings of deep CNNs, improving transparency without sacrificing performance.
The studies discussed in this section highlight the growing significance of combining high-performance deep learning models with interpretability techniques in medical diagnostics, particularly for complex tasks such as pneumonia detection. ResNet-50 emerges as a standout architecture in this area, demonstrating both exceptional accuracy and strong interpretability across multiple research works. In conclusion, ResNet-50 is increasingly recognized as a superior model for pneumonia detection due to its balanced performance in both accuracy and interpretability. Its ability to integrate with various explainable AI (XAI) techniques (such as LRP, Grad-CAM, LIME, and SHAP) further enhances its transparency, making it an ideal choice for medical applications where model interpretability is just as crucial as accuracy. This combination of high performance and explainability positions ResNet-50 as the most fitting architecture to address the challenges of accurate and transparent pneumonia detection in clinical settings.

3. Methodology

3.1. Experimental Setup

This research is founded on two key components: a CNN-based deep learning architecture and selected interpretability techniques. A diverse range of interpretability techniques exists for deep learning models, each offering unique insights into the inner workings of the models. These techniques aim to enhance human understanding of model predictions, thereby improving interpretability. Layer-wise Relevance Propagation (LRP) has emerged as a promising interpretability technique for CNNs in image recognition tasks, particularly in analyzing X-ray images for medical diagnostics. LRP is adept at providing pixel-wise relevance scores, attributing significance to individual features within the input data [19]. It works by backward propagating relevance scores through the network, highlighting the pixels that contribute the most to the final classification decision. This technique offers detailed insights into the decision-making process of CNNs, especially in medical image interpretation, making it valuable for enhancing interpretability.
Adversarial Training is another interpretability technique that introduces small, carefully crafted perturbations to the input data during training. By observing how the model responds to these perturbations, researchers gain insights into which features the model prioritizes for decision-making [20]. Adversarial Training aims to improve model robustness and interpretability by uncovering vulnerabilities and biases, thus enhancing the transparency of the model’s decision-making process. Class Activation Maps (CAMs) offer a different approach to interpretability by identifying the most discriminative image regions used by the model for prediction [21]. CAMs highlight the areas of an image that contribute most to a particular class prediction, providing visual explanations that aid in understanding the model’s decision-making process. This technique offers intuitive insights into the model’s focus areas, enhancing interpretability and enabling users to understand why certain predictions are made.
Attention Mechanisms, inspired by human attention, enable models to focus on relevant parts of the input data while ignoring irrelevant information [22]. By visualizing attention maps, researchers can understand which parts of the input data the model deems most important for making predictions, thereby enhancing interpretability. Attention Mechanisms improve model transparency by explicitly highlighting the regions of the input data that influence the model’s decisions, thus providing valuable insights into the decision-making process. In this study, LRP, Adversarial Training, CAM, and the Spatial Attention Mechanism were systematically evaluated and compared in terms of their impact on both model accuracy and interpretability. By analyzing the strengths and limitations of each technique, this research aims to provide valuable insights into the selection and optimization of interpretability techniques for deep learning models, thereby advancing the field of transparent and interpretable AI.

3.2. Experimental Design

In this experimental design, we implemented and integrated four interpretability techniques (Layer-wise Relevance Propagation, Class Activation Maps, Adversarial Training, and the Spatial Attention Mechanism) separately with the CNN model, with the primary goal of uncovering and visualizing the critical features influencing the CNN’s decisions. Each technique was implemented following established methodologies, ensuring compatibility with the chosen CNN architecture and dataset. As shown in Figure 1, the process involves loading a pre-trained ResNet50 model and preparing a dataset of images, ensuring compatibility with the model’s input requirements. The experimental design includes evaluating the interpretability results by comparing highlighted regions with ground-truth features in the images. Fine-tuning parameters and iterating the implementation may be considered for further refinement. Through thorough documentation of the process, including the interpretability techniques used and interpretations of the results, the transparency and replicability of the experimental findings are ensured.

3.2.1. Proposed Model

The proposed model for this experiment is based on the ResNet50 architecture, pre-trained on ImageNet. ResNet50 is chosen due to its robust performance and widespread use in image classification tasks. The model architecture includes several key layers and parameters. The input layer accepts images resized to 224 × 224 pixels. The convolutional layers utilize residual blocks to learn complex features. A global average pooling layer reduces the spatial dimensions before the final classification. Finally, a fully connected layer outputs predictions for binary classification (pneumonia or normal). The model parameters are carefully selected to optimize performance, including the Adam optimizer with a learning rate of 0.001, a batch size of 32, and training for 50 epochs. The loss function used is binary cross-entropy.
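For concreteness, the sketch below shows one way this architecture could be assembled in TensorFlow/Keras (the framework named in Section 4). Only the backbone, pooling layer, sigmoid output, optimizer, learning rate, and loss follow the text; the variable names and overall code structure are illustrative assumptions rather than the authors’ implementation.

```python
# Minimal Keras sketch of the ResNet50-based binary classifier described above.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(input_shape=(224, 224, 3)):
    # ImageNet-pre-trained ResNet50 backbone without its classification head
    base = tf.keras.applications.ResNet50(
        include_top=False, weights="imagenet", input_shape=input_shape)

    x = layers.GlobalAveragePooling2D()(base.output)      # reduce spatial dimensions
    outputs = layers.Dense(1, activation="sigmoid")(x)    # binary output: pneumonia vs. normal
    model = models.Model(base.input, outputs)

    # Hyperparameters stated in the text: Adam, learning rate 0.001, binary cross-entropy
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="binary_crossentropy",
                  metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])
    return model

model = build_model()
# Training runs for 50 epochs; the batch size of 32 is set in the data generators.
```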

3.2.2. Dataset

The chosen dataset for this experiment was “Chest X-ray Images (Pneumonia)”, with samples of the data shown in Figure 2, as retrieved from Kaggle (https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia, accessed on 12 November 2024). The dataset consists of 5863 chest X-ray images divided into two categories: “Pneumonia” for X-ray images of lungs positive for pneumonia and “Normal” for X-ray images of lungs negative for pneumonia. The dataset is further split into three subsets: a training set, a test set, and a validation set. All images are standardized to a uniform size of 1857 × 1317 pixels for consistency in analysis and processing. However, the images were resized in the data preprocessing phase, which is explained further in the next section. The dataset includes 1587 images labeled “Normal” and 4276 images labeled “Pneumonia”, ensuring a comprehensive representation of both classes. This distribution reflects the prevalence of pneumonia cases in the dataset and was crucial for training and evaluating the model effectively.
Anterior–posterior chest X-ray images were chosen from pediatric patients at the Guangzhou Women and Children’s Medical Center in Guangzhou. The acquisition of these chest X-ray images was integrated into the routine clinical care of the patients. To ensure the quality of the chest radiographs for subsequent analysis, an initial screening process was conducted, eliminating all scans deemed low-quality or unreadable. Subsequently, two expert physicians graded the diagnoses of the images before they were approved for training the AI system. To mitigate any potential grading errors, a third expert also reviewed the evaluation set.

3.2.3. Data Preprocessing

Before the experimental process, preprocessing of the dataset was essential to ensure compatibility with the chosen CNN architecture and to support model performance. Furthermore, the dataset exhibited class imbalance, with significantly fewer “Normal” cases than pneumonia-positive cases. To address this imbalance and ensure a robust model performance, data augmentation techniques were applied to generate additional training samples for the minority class. Key steps included the following:
  • Resizing: As mentioned earlier, the chest X-ray images were resized to standardized dimensions suitable for the CNN input layer. Dimensions of 224 × 224 pixels were chosen, as they strike a balance between capturing essential details and computational efficiency.
  • Normalization: The pixel values of the images were normalized to a standard scale, from 0 to 1. Normalization helps stabilize training and ensures that each feature contributes equally to model learning.
  • Data Augmentation: To enhance the robustness and generalization of the model, data augmentation techniques such as rotation, flipping, and zooming were applied. This introduces variations in the training data, reducing overfitting and improving the model performance on unseen data. The specific augmentation parameters used were as follows:
    • Rotation: Images were randomly rotated by up to 15 degrees.
    • Flipping: Horizontal flipping was applied to randomly selected images.
    • Zooming: Images were randomly zoomed in or out by up to 20%.
These augmentation techniques created a more varied training set, helping the model to generalize better to new, unseen data.
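A minimal augmentation pipeline along these lines, using TensorFlow’s ImageDataGenerator as described in Section 4, might look as follows. The rotation, flipping, zoom, and rescaling parameters follow the list above and the batch size of 32 comes from the text; the directory paths are assumed conventions.

```python
# Augmentation and normalization pipeline matching the parameters listed above.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel values to [0, 1]
    rotation_range=15,        # random rotation up to 15 degrees
    horizontal_flip=True,     # random horizontal flipping
    zoom_range=0.2)           # random zoom in/out up to 20%

val_datagen = ImageDataGenerator(rescale=1.0 / 255)  # no augmentation for validation data

train_generator = train_datagen.flow_from_directory(
    "chest_xray/train", target_size=(224, 224),      # assumed directory layout
    batch_size=32, class_mode="binary")

val_generator = val_datagen.flow_from_directory(
    "chest_xray/val", target_size=(224, 224),
    batch_size=32, class_mode="binary")
```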

3.2.4. Model Development

As shown in Figure 3, the interpretability methods share a core objective: enhancing transparency in model decision-making. Each method is integrated into the ResNet-50 model to visually reveal model focus areas, contributing to a more intuitive understanding of pneumonia diagnosis from chest X-rays. While all methods employ visual outputs—such as heatmaps for Class Activation Maps (CAMs) and Layer-wise Relevance Propagation (LRP), or attention maps in the case of the SAM (Spatial Attention Mechanism)—each method is configured differently to balance interpretability with accuracy.
Optimization of Each Method:
  • Adversarial Training: The model uses a perturbation magnitude of 0.01 with a 0.001 learning rate to improve robustness against adversarial inputs. Iterative tuning of these parameters ensures that the model can handle slight image disturbances while remaining accurate.
  • Class Activation Maps (CAMs): Using Grad-CAM, gradient-based feature maps help localize pneumonia features within chest X-rays. Grad-CAM’s focus on class-specific regions is adjusted through its gradient thresholds, balancing interpretability (MRS) and diagnostic performance.
  • Layer-wise Relevance Propagation (LRP): This method utilizes the epsilon variant (ε = 0.001), ensuring numerical stability and a clear relevance distribution across layers. Adjusting the epsilon refines the pixel-level relevance, yielding high interpretability without performance trade-offs, making LRP especially suitable for clinical settings.
  • Spatial Attention Mechanism (SAM): The SAM optimizes the spatial focus by tuning attention weights across lung regions, emphasizing relevant image features while minimizing background noise. This method’s optimization focuses on clarity in spatial regions rather than granular features, sacrificing minor specificity to broaden attention across the X-ray for overall interpretability.

Key Hyperparameters

This experiment employed specific hyperparameters optimized for accuracy and interpretability in pneumonia detection within chest X-ray images. A learning rate of 0.001 was selected for stability in training, balancing the convergence speed with control over weight updates. This learning rate is compatible with the Adam optimizer, allowing efficient adjustments to complex features typical in medical images. A batch size of 32 was chosen as it offers an optimal balance between memory efficiency and training consistency, ensuring the model captures detailed image features without overwhelming computational resources. A 50-epoch training period was set, providing the model sufficient iterations to learn feature mappings from the data while preventing overfitting.
Each interpretability technique also has specialized hyperparameters to enhance both functionality and relevance for medical imaging tasks. For Adversarial Training, an adversarial learning rate of 0.001 and perturbation magnitude of 0.01 are applied, balancing robustness with interpretability by subtly challenging the model to improve its resilience. For Grad-CAM, the gradients of the target class are backpropagated through the model to compute class-specific weights, which are then applied to feature maps to highlight regions of interest. This process localizes discriminative features by emphasizing the areas that contribute most to the target class prediction. Layer-wise Relevance Propagation (LRP), using the epsilon LRP variant with an epsilon value of 0.001, effectively attributes relevance scores to pixels, enhancing transparency. Lastly, the Spatial Attention Module (SAM) integrates attention layers to help the model focus on disease-relevant regions, supported by visualization for insight into decision-making areas.

Iterative Refinement and Model Optimization

To achieve an optimal configuration for each interpretability technique, an iterative approach of parameter adjustment and metric analysis was implemented:
  • For Adversarial Training, fine-tuning the perturbation magnitude and learning rate helped balance robustness with interpretability, with the final settings demonstrating the optimal trade-off score; this allowed the model to remain accurate while becoming more resistant to adversarial input.
  • Grad-CAM optimizations involve gradient analysis and feature map adjustments, using the Mean Relevance Score (MRS) as the indicator, which effectively highlights pneumonia-affected regions.
  • LRP was refined through adjustments to its epsilon parameter, achieving the highest MRS possible with no significant trade-offs in accuracy, which made it the best-optimized technique for interpretability without performance loss.
  • The SAM underwent multiple adjustments to spatial configurations, balancing attention weights and feature map visualization. This enabled clear visualization of critical areas, which is beneficial for understanding the model focus despite a slight compromise in accuracy.
These iterative refinements and model optimizations ensure that each interpretability technique is effectively adapted to the specific demands of pneumonia detection and the characteristics of chest X-ray images in medical imaging.

3.2.5. Interpretability Technique Implementation

Interpretability techniques (Adversarial Training, Class Activation Maps (CAMs), Layer-wise Relevance Propagation (LRP), and the Spatial Attention Mechanism) were systematically applied to the CNN model to enhance the interpretability of its predictions. Using a pre-trained ResNet50 model, each technique was incorporated and evaluated based on its effect on model accuracy and interpretability.
Layer-wise Relevance Propagation (LRP), particularly the epsilon LRP variant, assigns relevance scores to each pixel, indicating their contribution to the final classification. As shown in Figure 4, the LRP Integrated ResNet-50 architecture has dual paths: the forward path, where the input image passes through convolutional, pooling, global average pooling (GAP), and fully connected (FC) layers to produce the predicted probabilities for each class (p(ŷ)) via the Softmax layer, and the relevance computation path, which computes the relevance scores (R(L)) starting from the final layer and propagating back through the network to highlight important regions in the input. This pixel-level interpretability is crucial for pinpointing areas of abnormalities within X-rays, thus supporting clinical decisions by highlighting precise diagnostic features. Epsilon LRP ensures stability and accuracy in relevance scores, offering insights that align well with the reliability needs in medical diagnostics [23].
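As an illustration of the epsilon rule, the sketch below propagates relevance backward through a single fully connected layer in NumPy. Applying LRP to the full ResNet-50 requires repeating such rules layer by layer (or using a dedicated LRP library), so this is only the core redistribution step under stated assumptions, not the authors’ implementation; the epsilon value of 0.001 follows the text.

```python
# Minimal NumPy sketch of the epsilon-LRP rule for one fully connected layer.
import numpy as np

def lrp_epsilon_dense(a, W, b, R_out, eps=1e-3):
    """Redistribute output relevance R_out back to the layer inputs a.

    a:     (J,)   activations entering the dense layer
    W:     (J, K) weights, b: (K,) biases
    R_out: (K,)   relevance assigned to the layer outputs
    """
    z = a @ W + b                      # forward pre-activations
    z = z + eps * np.sign(z)           # epsilon stabilizer avoids division by near-zero values
    s = R_out / z                      # normalized relevance per output unit
    c = W @ s                          # redistribute back through the weights
    return a * c                       # relevance of each input activation

# Toy usage: relevance of the "pneumonia" output flows back to the GAP features.
rng = np.random.default_rng(0)
a = rng.random(2048)                   # e.g., ResNet-50 global-average-pooled features
W, b = rng.normal(size=(2048, 1)), np.zeros(1)
R_in = lrp_epsilon_dense(a, W, b, R_out=np.array([1.0]))
print(R_in.shape)                      # (2048,)
```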
In Adversarial Training, and as shown in Figure 5, subtle perturbations are introduced to input data to test and improve the model’s robustness. The Fast Gradient Sign Method (FGSM) is employed to create adversarial examples that test the ResNet50 model’s resilience and reveal which features are essential for decision-making. FGSM attacks were applied to the dataset to produce perturbed images, with three types of perturbed images generated. For medical images like chest X-rays, FGSM’s subtle alterations ensure the model learns to detect key details, enhancing interpretability by reducing the model’s sensitivity to minor perturbations [25].
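A minimal sketch of FGSM perturbation generation in TensorFlow is shown below. The perturbation magnitude of 0.01 follows the text; the loss function, the clipping range, and the way adversarial batches are fed back into training are illustrative assumptions.

```python
# FGSM adversarial example generation, as used for Adversarial Training above.
import tensorflow as tf

loss_fn = tf.keras.losses.BinaryCrossentropy()

def fgsm_attack(model, images, labels, epsilon=0.01):
    images = tf.convert_to_tensor(images)
    with tf.GradientTape() as tape:
        tape.watch(images)
        preds = model(images, training=False)
        loss = loss_fn(labels, preds)
    grads = tape.gradient(loss, images)          # gradient of the loss w.r.t. the input pixels
    adv = images + epsilon * tf.sign(grads)      # step along the sign of the gradient
    return tf.clip_by_value(adv, 0.0, 1.0)       # keep a valid [0, 1] pixel range

# During Adversarial Training, adversarial batches are mixed with (or substituted
# for) clean batches before each gradient update, e.g.:
# x_adv = fgsm_attack(model, x_batch, y_batch)
# model.train_on_batch(x_adv, y_batch)
```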
Class Activation Maps (CAMs) reveal the specific regions of an image that most influence the model’s classification [21]. As shown in Figure 6, Grad-CAM, an extension of CAM, is used to highlight regions in the X-ray that influence the model’s decision and are indicative of pneumonia. This is achieved by calculating weights based on gradients for each feature map, passed through ReLU activation to focus on positive influences. The resulting CAM overlay is combined with the original X-ray, providing a heatmap that highlights important areas, helping clinicians understand the model’s reasoning. By visualizing these areas, researchers can better understand how the model focuses on pneumonia-related regions, contributing to more transparent diagnostic decisions [26].
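The following sketch outlines a Grad-CAM computation for the ResNet-50 classifier built earlier. The choice of "conv5_block3_out" (the final convolutional block in Keras’ ResNet50) as the target layer and the normalization step are assumptions, not details specified by the authors.

```python
# Grad-CAM heatmap computation for the ResNet-50 binary classifier.
import tensorflow as tf

def grad_cam(model, image, conv_layer_name="conv5_block3_out"):
    # Model that returns both the chosen feature maps and the prediction
    grad_model = tf.keras.models.Model(
        model.inputs, [model.get_layer(conv_layer_name).output, model.output])

    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image[None, ...])   # add a batch dimension
        score = preds[:, 0]                               # sigmoid output for "pneumonia"

    grads = tape.gradient(score, conv_maps)               # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))          # global-average-pool the gradients
    cam = tf.reduce_sum(weights[:, None, None, :] * conv_maps, axis=-1)
    cam = tf.nn.relu(cam)[0]                              # keep positive influences only
    cam = cam / (tf.reduce_max(cam) + 1e-8)               # normalize to [0, 1]
    return cam.numpy()                                    # low-resolution heatmap (e.g., 7x7)

# The heatmap is then resized to 224x224 and overlaid on the original X-ray.
```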
Attention Mechanisms, specifically the Spatial Attention Mechanism (SAM), enable the model to concentrate on relevant spatial regions. A SAM is particularly suited for medical imaging, where focusing on lung lobes or other specific areas is essential. This Attention Mechanism technique is added after the ResNet-50 model to capture spatial regions of interest, which enhances the interpretability of the features extracted. By focusing on relevant spatial areas, the model can prioritize critical patterns indicative of pneumonia, thereby improving overall performance. As shown in Figure 7, the SAM’s attention maps reveal the lung regions that the model deems important, reducing background noise and improving interpretability by clarifying the spatial areas most indicative of pneumonia. Accuracy metrics and relevance scores are also reviewed to assess how each technique impacts model transparency and performance. This methodical approach allows for adjustments in parameters and fine-tuning, ensuring that the techniques yield interpretable, clinically relevant insights in pneumonia detection [27].
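A possible Keras realization of such a spatial attention block is sketched below, following the formulation A = softmax(Conv(F)), F′ = A ⊙ F given later in Section 4.1.1. The use of a 1 × 1 convolution to produce the attention scores is an implementation assumption.

```python
# Sketch of a Spatial Attention Module appended to the ResNet-50 feature maps.
import tensorflow as tf
from tensorflow.keras import layers

class SpatialAttention(layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.score_conv = layers.Conv2D(1, kernel_size=1)      # one attention score per location

    def call(self, feature_maps):
        # feature_maps: (batch, H, W, C) from the ResNet-50 backbone
        scores = self.score_conv(feature_maps)                 # (batch, H, W, 1)
        flat = tf.reshape(scores, (tf.shape(scores)[0], -1))   # flatten spatial locations
        attn = tf.nn.softmax(flat, axis=-1)                    # softmax over the H*W locations
        attn = tf.reshape(attn, tf.shape(scores))              # back to (batch, H, W, 1)
        return feature_maps * attn                             # element-wise re-weighting (A ⊙ F)

# Usage: insert between the backbone and the pooling/classification head, e.g.
# x = SpatialAttention()(base.output)
# x = layers.GlobalAveragePooling2D()(x)
```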

4. Results

This research focuses on and compares four interpretability techniques, which are Layer-wise Relevance Propagation (LRP), Adversarial Training, the Spatial Attention Mechanism, and Class Activation Maps (CAMs).
The dataset was loaded using TensorFlow’s ImageDataGenerator to facilitate image preprocessing and augmentation, which divides the data into training, validation, and test sets. To improve the model’s robustness and reduce overfitting, the training set underwent augmentation with random rotations (up to 15 degrees), horizontal flipping, and zooming (up to 20%). As shown in Figure 8, the images were resized to 224 × 224 pixels to match ResNet-50’s input requirements and were normalized between 0 and 1 for stable training. The base model was built on a pre-trained ResNet-50 architecture from TensorFlow, excluding its top layers for customization (include_top = False). A global average pooling layer was added to reduce spatial dimensions, followed by a fully connected layer with a sigmoid activation function for binary classification, predicting whether an image is pneumonia-positive or -negative.
Compiled with the Adam optimizer (learning rate of 0.001) and binary cross-entropy loss, the model was trained on the augmented data for 50 epochs, with validation at each epoch’s end. The training history, including accuracy and loss for both sets, was saved for analysis. Post-training, the model’s performance was evaluated on the test set using metrics like accuracy, sensitivity, specificity, and AUC-ROC. Sensitivity and specificity, respectively, assess the model’s effectiveness in identifying pneumonia and normal cases, while AUC-ROC evaluates its ability to distinguish between positive and negative cases. The model outputs test accuracy, sensitivity, specificity, and AUC-ROC scores, providing a comprehensive performance summary. Additionally, accuracy and loss plots across epochs offer a visual understanding of the model’s learning progress, highlighting potential overfitting or underfitting issues.
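A compact sketch of this training and plotting procedure is given below. It reuses the model and generators from the earlier sketches and is illustrative rather than the authors’ exact code; the test-set prediction step is shown only as a comment.

```python
# Training for 50 epochs with per-epoch validation, then plotting accuracy/loss curves.
import matplotlib.pyplot as plt

history = model.fit(train_generator,
                    validation_data=val_generator,
                    epochs=50)

# Probabilities on the held-out test set, used for the metrics in Section 4.1.2:
# y_prob = model.predict(test_generator).ravel()

# Accuracy/loss curves across epochs to check for over- or underfitting
plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="val accuracy")
plt.plot(history.history["loss"], label="train loss")
plt.plot(history.history["val_loss"], label="val loss")
plt.xlabel("epoch")
plt.legend()
plt.show()
```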

4.1. Model Evaluation

4.1.1. Model Optimization Process and Loss Function

The ResNet-50 model is used as the backbone for feature extraction, and the optimization process relies on minimizing the classification loss. The cross-entropy loss (CE) [1] is utilized for multi-class classification, defined as
$$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{i,c}\,\log(\hat{y}_{i,c})$$
where
  • N is the number of samples;
  • C is the number of classes;
  • $y_{i,c}$ is the ground-truth label for class c of sample i;
  • $\hat{y}_{i,c}$ is the predicted probability of class c.
The optimization process is performed using Stochastic Gradient Descent (SGD) [1], defined as the following formula:
$$W_{t+1} = W_t - \eta \cdot \nabla L(W_t)$$
where
  • $W_t$ are the model weights at iteration t;
  • $\eta$ is the learning rate;
  • $\nabla L(W_t)$ is the gradient of the loss function with respect to the weights.
To enhance the performance and prevent overfitting, L2 regularization is applied to the model weights, where λ is the regularization coefficient:
$$L_{total} = L_{CE} + \lambda \lVert W \rVert^2$$
Additionally, a Spatial Attention Mechanism [28] is added to the extracted feature maps from ResNet-50. The Attention Mechanism computes the attention weights A as follows:
$$A = \mathrm{softmax}(\mathrm{Conv}(F))$$
where F represents the feature maps, and Conv(·) denotes the convolutional operation used to generate attention scores. The final weighted features are obtained as
$$F' = A \odot F$$
where ⊙ represents element-wise multiplication, and F′ are the attended feature maps. The combination of ResNet-50 and the Attention Mechanism enhances interpretability and performance by focusing on the most relevant spatial regions.

4.1.2. Quantitative Assessment

The interpretable deep learning model applied in the experimental process was analyzed to determine its performance. In the performance evaluation of the interpretable image recognition model for pneumonia detection in chest X-ray images, several standard metrics were utilized to assess the model’s effectiveness. These metrics provide a comprehensive understanding of the model’s accuracy and its ability to correctly identify positive and negative cases:
  • Accuracy measures the overall correctness of the model’s predictions, calculating the ratio of correctly predicted instances to the total instances. In the context of pneumonia detection, a high accuracy indicates the model’s proficiency in distinguishing between pneumonia and normal cases.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
  • Recall (Sensitivity) evaluates the model’s ability to correctly identify true positive cases among all actual positive cases. It is particularly crucial in medical applications to ensure that pneumonia cases are not overlooked, which requires minimizing false negatives and enhancing the model’s diagnostic sensitivity.
$$\text{Recall} = \frac{TP}{TP + FN} = \frac{\text{True Positives}}{\text{Actual Positives}}$$
  • Specificity gauges the model’s capability to correctly identify true negative cases among all actual negative cases. In the context of pneumonia detection, a high specificity indicates the model’s proficiency in avoiding false positives, reducing unnecessary concern for patients without pneumonia.
$$\text{Specificity} = \frac{TN}{TN + FP}$$
  • The Area Under the ROC Curve (AUC-ROC) provides a graphical representation of the model’s ability to discriminate between pneumonia and normal cases across various decision thresholds. A higher AUC-ROC value signifies an improved discrimination performance, offering a comprehensive assessment of the model’s diagnostic accuracy.
$$\text{ROC AUC} = \int_0^1 TPR(FPR)\, d(FPR) = \int_0^1 TPR\left(FPR^{-1}(x)\right) dx$$
  • The Mean Relevance Score (MRS) [29] is used to assess the performance of interpretability techniques. In the context of CNN-based interpretable deep learning models for pneumonia detection in chest X-ray images, the MRS provides a quantitative measure of the interpretability of these techniques by quantifying how effectively they reveal the features influencing the CNN’s decisions. A comparative analysis of the MRS across different techniques identifies the most effective approach for revealing critical features associated with pneumonia, facilitating a better understanding and trust in the model’s decision-making process by clinicians and researchers.
$$\text{MRS} = \frac{1}{N}\sum_{i=1}^{N} R_i$$
  • The trade-off score [30] is a numerical way to measure the trade-off, which can be constructed by combining the performance metric (such as accuracy) with a quantitative measure of interpretability. One common method is to adjust the model’s score as follows:
$$\text{Trade-off Score} = \text{Accuracy} - \lambda\,(1 - \text{MRS})$$
where accuracy refers to the model’s performance on the given task, the MRS is the quantitative metric that measures the interpretability of the model based on how effectively it reveals the features influencing its decisions (a higher MRS indicates better interpretability), the term 1 − MRS acts as an interpretability penalty (the lower the MRS, the larger the penalty), and λ is a tunable hyperparameter controlling the balance between accuracy and interpretability.
These metrics collectively offer insights into different aspects of the interpretable image recognition model. Accuracy provides an overall measure of correctness, while sensitivity and specificity focus on the model’s ability to correctly identify positive and negative cases, respectively, and AUC-ROC summarizes the model’s discriminative power. To quantitatively measure interpretability, the Mean Relevance Score (MRS) is computed from the relevance scores generated by each interpretability technique; it quantifies the degree to which relevance scores align with ground-truth features in the chest X-ray images, providing a numerical measure of interpretability effectiveness.
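Putting these definitions together, the sketch below computes the performance metrics, the MRS, and the trade-off score from a set of test predictions. The per-image relevance scores feeding the MRS and the value of λ = 0.5 are assumed inputs rather than quantities specified here.

```python
# Computing the evaluation metrics defined above from model predictions.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(y_true, y_prob, relevance_scores, lam=0.5, threshold=0.5):
    y_pred = (y_prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)                  # recall
    specificity = tn / (tn + fp)
    auc_roc     = roc_auc_score(y_true, y_prob)

    mrs = float(np.mean(relevance_scores))        # mean per-image relevance score
    trade_off = accuracy - lam * (1.0 - mrs)      # interpretability-penalized score

    return dict(accuracy=accuracy, sensitivity=sensitivity,
                specificity=specificity, auc_roc=auc_roc,
                mrs=mrs, trade_off=trade_off)
```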

4.1.3. Visual and Graphical Representations

Visual representations are essential in assessing the interpretability of the pneumonia detection-based ResNet50 model, particularly through heat maps overlaid on chest X-ray images. These visual tools reveal relevance scores and attention weights across image regions, allowing researchers to identify the areas contributing most to predictions and providing an intuitive understanding of the model focus areas. By comparing heat maps before and after applying interpretability techniques, researchers can gauge how effectively each method enhances model transparency, particularly in highlighting medically relevant features indicative of pneumonia. Scatterplots and line graphs further illustrate the relationship between model accuracy and interpretability, with axes reflecting changes in accuracy and interpretability scores (e.g., Mean Relevance Scores, MRSs). These graphs reveal the balance each interpretability technique strikes between performance and transparency, guiding the selection of suitable methods based on visual comparisons of different model configurations.
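A minimal example of producing such an overlay is sketched below. The heatmap source (e.g., the Grad-CAM sketch above), the colormap, and the blending weight are presentation assumptions rather than choices documented by the authors.

```python
# Overlaying an interpretability heatmap on a preprocessed 224x224 chest X-ray.
import matplotlib.pyplot as plt
import tensorflow as tf

# `cam` is a normalized low-resolution heatmap; `image` is the preprocessed X-ray.
heatmap = tf.image.resize(cam[..., None], (224, 224)).numpy().squeeze()

plt.imshow(image, cmap="gray")                 # original X-ray
plt.imshow(heatmap, cmap="jet", alpha=0.4)     # relevance/attention overlay
plt.axis("off")
plt.show()
```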

4.2. Pre-Trained ResNet50 Without Interpretability Technique Results

Post-training, the model was evaluated on the test set using various metrics to determine its overall performance. After the model was trained for 50 epochs, with the validation set used to monitor performance during training, it achieved a final test accuracy of 0.90. The model’s sensitivity (recall), which measures the model’s ability to correctly identify pneumonia cases and is calculated as the ratio of true positives to the sum of true positives and false negatives, was 0.92. The model’s specificity, which measures the ability to correctly identify non-pneumonia cases and is calculated as the ratio of true negatives to the sum of true negatives and false positives, was 0.88.
The models’ Area Under the Receiver Operating Characteristic (AUC-ROC) curve is computed to assess the model’s ability to distinguish between positive and negative cases. The final AUC-ROC score for this model was 0.93. The ResNet50 model without interpretability techniques demonstrated a strong performance in terms of accuracy, sensitivity, specificity, and AUC-ROC. However, the black-box nature of the model limits insight into its decision-making process. These results provide a baseline for comparison against future models incorporating interpretability techniques, such as Layer-wise Relevance Propagation (LRP) and Class Activation Maps (CAMs), which aim to provide transparency in decision-making while maintaining accuracy.
The training history, showing the trends in accuracy and loss over the 50 epochs, is depicted in Figure 9, where training accuracy increases while the validation loss stabilizes. The confusion matrix further highlights the balance between true positives and true negatives, suggesting that the model performs well in identifying pneumonia cases without overfitting. Note that ResNet-50, as a convolutional neural network (CNN), does not inherently include interpretability mechanisms. Its primary purpose is to extract hierarchical features for tasks such as classification or detection. However, while it excels at performance metrics like accuracy, its predictions are often treated as a black box because the decision-making process within the model is not easily interpretable. In summary, while the ResNet50 model has a high accuracy in detecting pneumonia, the lack of interpretability makes it difficult to trust the model in medical settings. This evaluation establishes a benchmark for subsequent implementations that incorporate interpretability techniques.

4.3. Layer-Wise Relevance Propagation Implementation Results

Post-LRP integration, the model went through the same procedures as the base model and was evaluated on the test set using the same metrics to determine its overall performance. In addition, because of the LRP implementation, the interpretability MRS and the trade-off score were also computed on the test set to account for the interpretability aspect of the model and the technique’s effect on the overall performance. As depicted in Figure 10, the model was also trained for 50 epochs, and we utilized the same validation set for performance monitoring during training. The model achieved a final test accuracy of 0.91, a sensitivity of 0.90, a specificity value of 0.92, and an AUC-ROC score of 0.93. Excluding sensitivity, the integration of epsilon LRP into the model produced no signs of a trade-off between the performance of the model and its interpretability. Instead, the technique yielded a slight increase in the overall performance of the model.
To quantitatively assess the interpretability of the model across all test images, the Mean Relevance Score (MRS) was calculated. The final interpretability MRS was calculated to be 0.85, indicating that the model consistently focuses on medically significant regions. The use of LRP significantly enhanced the interpretability of the ResNet50 model. Clinicians can trust that the model bases its decisions on relevant features rather than spurious correlations or noise. This interpretability is especially crucial for models deployed in medical settings where decisions must be transparent and explainable. As shown in Figure 10, it was observed that there was zero compromise in accuracy when interpretability was factored into the model evaluation, with trade-off scores remaining high across epochs. This suggests that the model effectively balances both a high classification performance and transparency in its decision-making.
Epsilon LRP enhances model accuracy without sacrificing interpretability due to several factors. First, epsilon LRP helps the pre-trained ResNet50 model avoid irrelevant features and noise, enabling it to make cleaner, more confident predictions. Second, by guiding the model’s attention to genuinely relevant features, epsilon LRP prevents the model from overfitting to noisy patterns, thereby increasing its generalization ability and, consequently, its accuracy. Third, unlike interpretability techniques that add constraints or limit model flexibility, epsilon LRP merely redirects the model’s focus, without restricting it, so no trade-off occurs. Instead, the model becomes both more interpretable and more accurate.
The e-LRP heatmap shown in Figure 11 prominently highlights specific regions within the lung fields, which aligns with the clinical expectation that pneumonia affects lung tissues, particularly through infiltrates or consolidations. These abnormalities typically show as dense regions in chest X-rays, which are pathological markers the model has been trained to identify. The model’s high attribution scores over these areas suggest it has effectively learned to focus on regions clinically indicative of pneumonia. In cases of positive pneumonia detection, the heatmap reveals a concentration of relevance in areas with higher likelihoods of infiltrates or consolidation patterns. The spatial distribution in the e-LRP heatmap demonstrates the model’s ability to differentiate pathological from non-pathological regions, critical for a reliable diagnosis.

4.4. Adversarial Training Implementation Results

Figure 12 shows a generally upward trend over the epochs, with fluctuations due to the challenges posed by adversarial examples introduced during training. The final test accuracy after 50 epochs of Adversarial Training is 0.85. This drop reflects the trade-off between robustness and predictive performance, as Adversarial Training often sacrifices some accuracy to improve the model’s resistance to attacks. However, the accuracy stabilizes as training progresses, indicating that the model adapts to the adversarial conditions over time.
The sensitivity, which measures the model’s ability to correctly identify pneumonia cases, reaches a final value of 0.87. Although this is lower than the base model’s sensitivity, it still demonstrates a strong detection ability under adversarial conditions. Specificity, on the other hand, shows a more noticeable drop to 0.79, suggesting that Adversarial Training has a greater impact on the model’s ability to correctly identify non-pneumonia cases. This reduction in specificity may imply that the model is slightly less conservative when distinguishing pneumonia cases after Adversarial Training, possibly a consequence of prioritizing robustness. The AUC-ROC score, which measures the model’s ability to differentiate between positive and negative cases, reaches a final value of 0.87 after Adversarial Training. The decrease in AUC-ROC reflects the overall performance trade-off introduced by Adversarial Training, as the model becomes less precise in distinguishing classes. This reduction aligns with the model’s general accuracy and specificity decrease, indicative of the compromises inherent in enhancing robustness.
The interpretability of the model, assessed through the Mean Relevance Score (MRS), shows a downward trend in relation to accuracy. The average MRS after Adversarial Training is 0.76, lower than the interpretability scores typically observed in models without Adversarial Training. This decline in MRS indicates that as the model becomes more robust to adversarial inputs, it sacrifices some interpretability, suggesting a trade-off where the model’s decision-making process becomes more complex or less transparent.
The final trade-off score, calculated to balance accuracy and robustness, is 0.79. This score reflects the model’s effectiveness in managing the trade-off between maintaining predictive accuracy and enhancing robustness. Although the trade-off score does not match the base model’s high accuracy, it suggests that the Adversarial Training has achieved a balanced performance by moderately preserving accuracy while bolstering robustness. The resulting graph shows a steady improvement, implying that Adversarial Training effectively optimizes the model’s resilience without excessive loss of predictive power.
The fgsm_attack function generates an adversarial example by adding a small perturbation along the sign of the gradient to the original image. Figure 13 utilizes a gradient-based heatmap to visualize the gradients of the original image, highlighting pneumonia-positive areas. This approach uses a simple gradient saliency map to identify which parts of the image contribute most to the model’s prediction.
To summarize, Adversarial Training on the pre-trained ResNet-50 model for pneumonia detection results in notable trade-offs between accuracy, interpretability, and robustness. The evaluation metrics demonstrate that while accuracy, sensitivity, specificity, and AUC-ROC are slightly compromised, the model gains improved robustness against adversarial examples. The trade-off parameter, λ, plays a critical role in balancing these aspects, with the results indicating a delicate equilibrium that slightly favors robustness at the expense of interpretability and accuracy. While Adversarial Training enhances the model’s resistance to adversarial attacks, it introduces challenges in maintaining interpretability and precision, which should be considered carefully in medical contexts where transparency is essential.

4.5. Class Activation Map Implementation Results

The integration of Grad-CAM was analyzed with quantitative metrics that reflect a slight trade-off between diagnostic performance and interpretability. As shown in Figure 14, the final accuracy of 0.906 is a slight increase from the baseline model, suggesting that the trade-off for interpretability comes at no cost to accuracy but increases it slightly instead. The model’s sensitivity stands at 0.86, indicating a strong ability to detect pneumonia cases, crucial for minimizing false negatives. Specificity was slightly lower, at 0.83, suggesting the model’s ability to correctly identify non-pneumonia cases was impacted, though it remained robust. The AUC-ROC score of 0.89 reflects a high overall classification effectiveness, balancing sensitivity and specificity, even with the added interpretability layer.
The Mean Relevance Score (MRS), which quantifies the model’s interpretability, was 0.70, indicating a moderate degree of interpretability that helps reveal decision-making areas while maintaining a reasonable classification performance. Figure 14 also visualizes the interpretability MRS of the model and its accuracy rate at each epoch. The downward trend shows that the increase in the interpretability of the model inadvertently caused the accuracy to decrease over 50 epochs.
As shown in Figure 15, Grad-CAM offers a visual map indicating which regions in a chest X-ray image contribute most to the model’s decision-making. By generating heat maps that highlight areas of high activation, Grad-CAM provides insight into the specific lung areas that influence the model’s prediction of pneumonia. This visualization aims to focus on relevant regions, particularly areas showing signs of opacities or other pneumonia-related abnormalities, thereby reinforcing the reliability of its predictions.
In summary, Grad-CAM’s integration into the pneumonia detection model provides critical interpretability, with heatmaps consistently highlighting pneumonia-relevant regions, such as consolidated lung areas. This enhances model trustworthiness in clinical settings. While there is a small reduction in raw predictive performance metrics, the added transparency in model predictions holds significant value, allowing clinicians to verify the model’s focus areas. The slight trade-off in metrics like accuracy and specificity is outweighed by the gain in interpretability, reinforcing Grad-CAM as a valuable tool in healthcare machine learning applications.

4.6. Attention Mechanism Implementation Results

The SAM Integrated ResNet-50 model achieves a final accuracy of 0.88 after 50 epochs, with a sensitivity of 0.90, specificity of 0.87, and AUC-ROC score of 0.91. As shown in Figure 16, the model’s loss curve exhibits a gradual decrease over the epochs, stabilizing towards the end of training. This stabilization indicates that SAM integration does not significantly alter the model’s ability to learn effectively, although there is a slight decrease in performance metrics compared to the baseline model without SAM. The drop in performance is an expected outcome of adding interpretability-focused mechanisms, which often slightly compromise predictive accuracy due to their emphasis on transparency.
The Mean Relevance Score (MRS) evaluation reflects this interpretability limitation, with a final score of 0.12, signifying relatively low precision in the interpretability of SAM’s attention maps. The final trade-off score for the SAM-integrated model is 0.42, indicating a moderate balance between accuracy and interpretability. Although this score is lower than desired for a model intended for medical use, it does highlight that SAM introduces interpretability at the expense of some diagnostic accuracy. This trade-off score underlines a consistent theme in interpretability research, where models optimized for transparency often encounter performance trade-offs. However, for clinical settings, this trade-off may still be acceptable if it provides clinicians with insight into the model’s decision-making process.
The attention maps generated by SAM in Figure 17 reveal the areas of the chest X-ray images that the model considers most relevant for identifying pneumonia. However, the visual results indicate that SAM’s highlighted regions are relatively broad and somewhat vague, which may lead to challenges in pinpointing specific diagnostic areas within the lungs. This broadness in highlighted areas could result from SAM’s design, which serves to capture general spatial information rather than channel-specific activations, though the latter would allow for finer feature distinctions.
In summary, while the SAM enhances the interpretability of the ResNet-50 model by focusing on pneumonia-relevant areas within chest X-rays, its broad attention maps and low MRS suggest potential limitations in providing precise visual explanations. Despite a modest decrease in accuracy, the model remains valuable in scenarios where a transparent decision-making process is prioritized, albeit with some compromise in specificity and sensitivity. The trade-off score of 0.420 reflects this balance and underscores the SAM’s contribution to model interpretability, albeit with room for improvement in spatial precision.

5. Discussion

5.1. Summary of Evaluation

A comparative evaluation of the interpretability techniques (Layer-wise Relevance Propagation, Adversarial Training, Class Activation Maps, and the Spatial Attention Mechanism) integrated into a ResNet50 model for pneumonia detection reveals distinct effects on model interpretability and performance. As shown in Table 3, each method, except for LRP, shows a trade-off between diagnostic accuracy and interpretability, highlighting the complexity of balancing transparency with performance. Epsilon Layer-wise Relevance Propagation (e-LRP) performed well, achieving an accuracy of 0.89, sensitivity of 0.91, and AUC-ROC of 0.92. Notably, LRP is the only technique that does not diminish the performance in favor of interpretability; instead, it slightly enhances the performance. By attributing relevance to individual pixels, LRP offers fine insights into how specific areas influence predictions. However, its interpretability MRS of 0.85 comes at a computational cost, especially in deep models like ResNet-50, where backward propagation can create inconsistencies in convolutional layers.
FGSM Adversarial Training achieved an accuracy of 0.90, sensitivity of 0.87, specificity of 0.79, and AUC-ROC of 0.87. Known for increasing robustness to input perturbations, it provides insight into which features the model prioritizes under manipulated conditions, earning an interpretability MRS of 0.76. However, this technique requires careful tuning and reduced specificity, leading to trade-offs in diagnostic accuracy when distinguishing pneumonia from other conditions. Its trade-off score of 0.79 underscores the balance achieved but indicates limitations for high-precision applications like medical imaging. Class Activation Maps (CAMs) highlight class-discriminative areas within images and achieved an accuracy of 0.905, sensitivity of 0.86, specificity of 0.83, and AUC-ROC of 0.89. The resulting class activation heatmaps help visualize areas of significance, providing a moderate interpretability MRS of 0.70. While useful in medical imaging, CAMs’ dependency on the final convolutional layer may overlook subtle pneumonia indicators.
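The FGSM perturbation underlying Adversarial Training is a single gradient-sign step, as in the minimal sketch below; the stand-in classifier, perturbation budget, and label encoding are placeholders, not the exact training configuration used in this study.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_example(model, images, labels, epsilon: float = 0.01):
    """Generate FGSM adversarial examples: x_adv = x + epsilon * sign(grad_x loss)."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    x_adv = images + epsilon * images.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()          # keep pixels in the valid input range

# Toy demonstration with a stand-in classifier (the study's model is ResNet-50).
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))
x = torch.rand(2, 3, 224, 224)                     # placeholder chest X-ray batch
y = torch.tensor([0, 1])                           # assumed labels: 0 = normal, 1 = pneumonia
x_adv = fgsm_example(toy_model, x, y)
# Adversarial training would then mix (x, y) and (x_adv, y) within each batch.
```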
The SAM, scoring 0.88 in accuracy, 0.90 in sensitivity, 0.87 in specificity, and 0.91 in AUC-ROC, delivers broad interpretability across spatial regions. Its focus on generalized spatial patterns aids initial screening but lacks the specificity needed for precise diagnostics, yielding an interpretability MRS of 0.12 and the lowest trade-off score of 0.42. While suitable for general screening, the SAM’s broad focus makes it less effective at pinpointing localized pneumonia markers. Overall, each interpretability technique offers unique strengths and trade-offs: LRP excels in pixel-level detail without sacrificing performance, while Adversarial Training, CAMs, and the SAM each introduce interpretability with varying impacts on diagnostic precision and application suitability.

5.2. Research Contributions

This study addresses the pressing issue of interpretability in deep learning models for medical diagnostics, specifically in detecting pneumonia from chest X-ray images using a ResNet50 architecture. The primary goal was to evaluate and compare four interpretability techniques—Layer-wise Relevance Propagation (LRP), Class Activation Maps (CAMs), Adversarial Training, and the Spatial Attention Mechanism—to identify the most effective method for enhancing transparency while maintaining diagnostic accuracy. The findings demonstrate that LRP excels in balancing performance and interpretability, achieving the highest Mean Relevance Score (MRS) of 0.85 without compromising accuracy, which stood at 91%. This technique reliably highlights clinically significant regions within X-rays, offering a transparent and trustworthy decision-making process.
CAMs, on the other hand, provided moderate interpretability (MRS of 0.70) and a slight accuracy improvement. However, their reliance on final convolutional layers limited their ability to detect subtle pneumonia features, which are often critical for diagnosis. Adversarial Training improved robustness against input perturbations but resulted in notable trade-offs, including a reduction in specificity (79%) and interpretability (MRS of 0.76). The Spatial Attention Mechanism highlighted broad spatial patterns in X-ray images but lacked the precision necessary to pinpoint specific diagnostic markers, leading to the lowest MRS (0.12) among the evaluated techniques.
These findings are significant as they advance the field of explainable AI in healthcare, addressing the critical need for transparency in high-stakes applications. LRP’s ability to enhance interpretability without a performance trade-off makes it a strong candidate for clinical adoption, in contrast to the trade-off between accuracy and transparency emphasized in prior research. By introducing interpretability metrics such as the MRS into the evaluation process, this study provides a framework for balancing these aspects in medical imaging tasks. Future research may generalize these findings to other diseases and datasets, explore hybrid approaches that combine the strengths of different techniques, and refine Attention Mechanisms to improve precision and interpretability.

5.3. Limitations

Despite its contributions, this study has several limitations that must be acknowledged. First, the research is confined to a single dataset of pediatric chest X-rays, which limits its generalizability to other populations and imaging modalities. Validation on broader datasets that include diverse patient demographics, age groups, and clinical settings is necessary to ensure the robustness and applicability of the findings.
Additionally, each interpretability technique used in this study has specific limitations. Layer-wise Relevance Propagation (LRP), while effective in providing pixel-level insights, imposes significant computational demands, particularly in complex models like ResNet50. Its implementation can be time-intensive, making it challenging to deploy in real-time or resource-constrained environments. Moreover, LRP’s sensitivity to model initialization and parameter settings may introduce variability in the results.
Adversarial Training, although valuable for improving model robustness against perturbations, demonstrated a reduction in specificity and interpretability in this study. This trade-off suggests that Adversarial Training may over-prioritize robustness at the expense of precision in distinguishing non-pneumonia cases. Additionally, its reliance on artificially generated perturbations may not fully reflect real-world challenges, potentially limiting its clinical relevance.
Grad-CAM provides intuitive heatmaps that localize discriminative features; however, it is inherently limited by its dependence on the final convolutional layer. This dependency can cause Grad-CAM to miss subtle or non-obvious features distributed across earlier layers, particularly in complex medical images where such features may be diagnostically significant.
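The dependence on the final convolutional layer can be seen directly in a compact Grad-CAM sketch such as the one below, which assumes a torchvision ResNet-50 (where "layer4" is the last convolutional stage) and an illustrative class index; it is not the study’s implementation.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

model = resnet50(weights=None).eval()       # placeholder; the trained model would be loaded here
store = {}

def hook(module, inputs, output):
    # Cache the final convolutional feature map and register a hook for its gradient.
    store["feat"] = output
    output.register_hook(lambda grad: store.update(grad=grad))

model.layer4.register_forward_hook(hook)    # "layer4" is torchvision's last conv stage

x = torch.randn(1, 3, 224, 224)             # placeholder preprocessed chest X-ray
model(x)[0, 1].backward()                   # assumed class index 1 = "pneumonia"

weights = store["grad"].mean(dim=(2, 3), keepdim=True)            # global-average-pooled gradients
cam = F.relu((weights * store["feat"]).sum(dim=1, keepdim=True))  # weighted feature map
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)          # heatmap normalised to [0, 1]
```

Because every step above starts from the last convolutional stage, evidence encoded only in earlier layers cannot appear in the heatmap, which is precisely the limitation noted here.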
Attention Mechanisms, while useful in highlighting broader spatial regions, often lack precision and granularity. In this study, attention maps were found to be overly diffuse, making it difficult to pinpoint specific diagnostic features. This limitation reduces their effectiveness in clinical settings where fine details are critical for decision-making.
Another limitation lies in the reliance on the Mean Relevance Score (MRS) as the primary metric for interpretability. While the MRS provides a quantitative measure, its alignment with clinical expectations and its ability to capture nuanced interpretability require further validation through feedback from healthcare professionals.
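Purely to illustrate how such validation might proceed, one simple localisation proxy in the same spirit, though not the MRS definition used in this study, is the share of heatmap relevance that falls inside an expert-annotated region, sketched below with hypothetical inputs.

```python
import numpy as np

def relevance_inside_mask(heatmap: np.ndarray, lesion_mask: np.ndarray) -> float:
    """Hypothetical localisation proxy (not the study's MRS definition):
    the share of total relevance falling inside an expert-annotated region."""
    total = heatmap.sum()
    return float((heatmap * lesion_mask).sum() / total) if total > 0 else 0.0

# Toy example: a 4x4 heatmap whose relevance lies entirely in the masked left half.
heatmap = np.array([[0.1, 0.0, 0.0, 0.0],
                    [0.4, 0.2, 0.0, 0.0],
                    [0.2, 0.1, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 0.0]])
mask = np.zeros((4, 4)); mask[:, :2] = 1.0
print(relevance_inside_mask(heatmap, mask))    # 1.0: all relevance lies inside the mask
```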
Moreover, while this study focuses on individual techniques, it does not explore the potential benefits of combining methods to leverage their complementary strengths. For example, a hybrid approach integrating LRP’s precision with Grad-CAM’s intuitive visualizations could provide a more comprehensive interpretability framework.
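A minimal sketch of such a hybrid is shown below: it simply fuses two normalised saliency maps with a weighting factor; the fusion weight and map names are assumptions, and this combination was not evaluated in the study.

```python
import numpy as np

def fuse_saliency(lrp_map: np.ndarray, cam_map: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Weighted fusion of two saliency maps after min-max normalisation (illustrative only)."""
    def norm(m: np.ndarray) -> np.ndarray:
        m = np.clip(m, 0.0, None)
        return (m - m.min()) / (m.max() - m.min() + 1e-8)
    return alpha * norm(lrp_map) + (1.0 - alpha) * norm(cam_map)

# Placeholder maps standing in for an LRP heatmap and a Grad-CAM heatmap.
fused = fuse_saliency(np.random.rand(224, 224), np.random.rand(224, 224))
```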
Finally, although this study evaluates interpretability by comparing highlighted regions with known diagnostic features, we acknowledge that AI has the potential to uncover novel patterns beyond human recognition. The current assumptions in this study prioritize alignment with human-defined features, potentially overlooking the opportunity for AI to contribute new diagnostic insights.
Future research should address these limitations by incorporating clinical feedback to assess the practical utility of interpretability visualizations, optimizing the computational efficiency of techniques for real-world deployment, and validating the results on broader datasets. Additionally, exploring hybrid approaches that combine complementary interpretability methods could enhance their effectiveness and clinical relevance. Expanding the scope of this research to include other medical imaging tasks may further demonstrate the generalizability and impact of the proposed techniques.

6. Conclusions

This study identifies Layer-wise Relevance Propagation (LRP) as the most effective interpretability technique for enhancing ResNet50’s diagnostic performance in pneumonia detection. LRP achieved the highest Mean Relevance Score (MRS) of 0.85 and improved the model accuracy to 91%, demonstrating its capability to deliver transparency without compromising performance. In contrast, other methods, such as Adversarial Training and the SAM, showed notable trade-offs, underscoring LRP’s suitability for clinical applications. These findings contribute to advancing explainable AI in healthcare, promoting trust and reliability in automated diagnostics. LRP provides a high degree of interpretability by pinpointing critical regions in the input images and offering intuitive visual insights for clinicians, and it does so without sacrificing performance; instead, it slightly improves it. The SAM, by contrast, with its broader spatial focus, dilutes the interpretability of the model, leaving it unable to pinpoint specific diagnostic patterns within the image. Although the SAM’s impact on predictive performance was modest, its diffuse attention maps do not support precise localization of pneumonia markers. After comparing the selected interpretability techniques, we conclude that LRP is the most appropriate match for ResNet50: its high interpretability MRS and the absence of a performance penalty allow it to deliver interpretability while slightly increasing overall performance. These findings emphasize the importance of combining spatial awareness with layer-specific analysis for optimal interpretability and diagnostic effectiveness.

Author Contributions

Conceptualization, J.C. and N.S.; methodology, J.C.; software, J.C.; validation, J.C. and N.S.; formal analysis, J.C.; investigation, J.C.; resources, J.C.; data curation, J.C.; writing—original draft preparation, J.C.; writing—review and editing, J.C.; visualization, J.C. and N.S.; supervision, N.S.; contribution to the design and implementation of the research, J.C. and N.S.; development of the theoretical framework, J.C. and N.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Directorate General of Strengthening for Research and Development, Ministry of Research, Technology, and Higher Education, Republic of Indonesia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available online.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning, 2nd ed.; MIT Press: Cambridge, MA, USA, 2020. [Google Scholar]
  2. Arrieta, A.B.; Díaz-Rodríguez, N.; Ser, J.D.; Bennetot, A.; Tabik, S.; Barbado, A.; Garcia, S.; Gil-Lopez, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
  3. World Health Organization. Pneumonia in Children. Available online: https://www.who.int/news-room/fact-sheets/detail/pneumonia (accessed on 10 December 2024).
  4. Chen, J.; Wu, L.; Zhang, J.; Zhang, L.; Gong, D.; Zhao, Y.; Chen, Q.; Huang, H.; Yang, M.; Yang, X.; et al. Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: A prospective study. Sci. Rep. 2020, 10, 19196. [Google Scholar] [CrossRef]
  5. Our World in Data. Pneumonia. Available online: https://ourworldindata.org/pneumonia (accessed on 10 December 2024).
  6. Yan, K.; Wang, X.; Lu, L.; Summers, R.M. DeepLesion: Automated Mining of Large-scale Lesion Annotations and Universal Lesion Detection with Deep Learning. J. Med. Imaging 2020, 7, 014501. [Google Scholar] [CrossRef] [PubMed]
  7. Wang, S.; Kang, B.; Ma, J.; Zeng, X.; Xiao, M.; Guo, J.; Cai, M.; Yang, J.; Li, Y.; Meng, X.; et al. A Deep Learning Algorithm Using CT Images to Screen for Corona Virus Disease (COVID-19). Eur. Radiol. 2021, 31, 6096–6104. [Google Scholar] [CrossRef] [PubMed]
  8. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the International Conference on Machine Learning (ICML), Vienna, Austria, 12–18 July 2020; pp. 6105–6114. [Google Scholar] [CrossRef]
  9. Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpretable Machine Learning. Adv. Neural Inf. Process. Syst. 2020, 33, 1701–1710. [Google Scholar] [CrossRef]
  10. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed.; Leanpub: Vancouver, BC, Canada, 2020; pp. 45–67. Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 12 November 2024).
  11. Zhou, J.; Ye, J.; Zhang, Y.; Chen, J.; Xu, Y.; Cao, L. Pneumonia Detection Using Chest X-ray Images Based on Convolutional Neural Network. J. Med. Imaging Health Inform. 2021, 11, 1512. [Google Scholar] [CrossRef]
  12. Quazi, S. Artificial Intelligence and Machine Learning in Precision and Genomic Medicine. Med. Oncol. 2022, 39, 120. [Google Scholar] [CrossRef] [PubMed]
  13. Johnson, A.E.W.; Pollard, T.J.; Mark, R.G.; Berkowitz, S.J.; Horng, S. MIMIC-CXR: Chest Radiographs in Critical Care. Sci. Data 2020, 7, 317. [Google Scholar] [CrossRef]
  14. Fonseka, D.; Chrysoulas, C. Data augmentation to improve the performance of a convolutional neural network on Image Classification. In Proceedings of the International Conference on Decision Aid Sciences and Application (DASA), Sakheer, Bahrain, 8–9 November 2020; pp. 515–518. [Google Scholar] [CrossRef]
  15. Ren, H.; Wong, A.B.; Lian, W.; Cheng, W.; Zhang, Y.; He, J.; Liu, Q.; Yang, J.; Zhang, C.J.; Wu, K.; et al. Interpretable pneumonia detection by combining deep learning and explainable models with Multisource Data. IEEE Access 2021, 9, 95872–95883. [Google Scholar] [CrossRef]
  16. Rajaraman, S.; Thoma, G.; Antani, S.; Candemir, S. Visualizing and explaining deep learning predictions for pneumonia detection in pediatric chest radiographs. In Proceedings of the SPIE Conference Medical Imaging 2019: Computer-Aided Diagnosis, San Diego, CA, USA, 16–21 February 2019. [Google Scholar] [CrossRef]
  17. Aljawarneh, S.A.; Al-Quraan, R. Pneumonia detection using enhanced convolutional neural network model on chest X-ray image. Big Data 2023. [Google Scholar] [CrossRef] [PubMed]
  18. Siddiqi, R.; Javaid, S. Deep Learning for Pneumonia Detection in Chest X-ray Images: A Comprehensive Survey. J. Imaging 2024, 10, 176. [Google Scholar] [CrossRef] [PubMed]
  19. Koh, P.W.; Liang, P.; Nguyen, A.; Tang, K.; Guo, Z.; Doshi-Velez, F. Concept Bottleneck Models. Adv. Neural Inf. Process. Syst. 2020, 33, 11623–11634. Available online: https://api.semanticscholar.org/CorpusID:220424448 (accessed on 12 November 2024).
  20. Han, T.; Nebelung, S.; Pedersoli, F.; Zimmermann, M.; Schulze-Hagen, M.; Ho, M.; Haarburger, C.; Kiessling, F.; Kuhl, C.; Schulz, V.; et al. Advancing diagnostic performance and clinical usability of neural networks via adversarial training and dual batch normalization. Comput. Biol. Med. 2021, 124, 103926. [Google Scholar] [CrossRef] [PubMed]
  21. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Int. J. Comput. Vis. 2020, 128, 336–359. [Google Scholar] [CrossRef]
  22. He, K.; Gan, C.; Li, Z.; Rekik, I.; Yin, Z.; Ji, W.; Zhang, Y.; Shen, D. Transformers in medical image analysis. Intell. Med. 2023, 3, 59–78. Available online: https://mednexus.org/doi/full/10.1016/j.imed.2022.07.002 (accessed on 12 November 2024). [CrossRef]
  23. Nillmani; Sharma, N.; Saba, L.; Khanna, N.N.; Kalra, M.K.; Fouda, M.M.; Suri, J.S. Segmentation-Based Classification Deep Learning Model Embedded with Explainable AI for COVID-19 Detection in Chest X-ray Scans. Diagnostics 2022, 12, 2132. [Google Scholar] [CrossRef] [PubMed]
  24. Zhong, Y.; Piao, Y.; Tan, B.; Liu, J. A multi-task fusion model based on a residual–Multi-layer perceptron network for mammographic breast cancer screening. Comput. Methods Programs Biomed. 2024, 247, 108101. [Google Scholar] [CrossRef] [PubMed]
  25. Dong, J.; Chen, J.; Xie, X.; Lai, J.; Chen, H. Adversarial Attacks and Defenses for Medical Image Analysis: Methods and Applications. ACM Comput. Surv. 2024, 57, 1–38. [Google Scholar] [CrossRef]
  26. Suara, S.; Jha, A.; Sinha, P.; Sekh, A. Is Grad-CAM Explainable in Medical Images? In Computer Vision and Image Processing, 1st ed.; Springer Nature: Cham, Switzerland, 2024; pp. 124–135. [Google Scholar]
  27. Li, X.; Zheng, Y.; Ge, Z.; Dai, D.; Ju, Z. Attention Mechanism-Based Image Analysis for Medical Diagnostics. Neurocomputing 2022, 478, 31–45. [Google Scholar] [CrossRef]
  28. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 2011–2023. [Google Scholar] [CrossRef] [PubMed]
  29. Ismail, H.; Wu, T.; Roy, A.; Guestrin, C.; Xing, E. Benchmarking Deep Learning Interpretability in Time Series Predictions. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, BC, Canada, 6–12 December 2020; pp. 6441–6451. Available online: https://proceedings.neurips.cc/paper/2020/file/47a3893cc405396a5c30d91320572d6d-Paper.pdf (accessed on 12 November 2024).
  30. Lakkaraju, H.; Rudin, C.; McCormick, T.H. An Empirical Study of the Accuracy-Explainability Trade-off in Machine Learning for Public Policy. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, 6–10 July 2020; pp. 2120–2130. Available online: https://facctconference.org/static/pdfs_2022/facct22-3533090.pdf (accessed on 12 November 2024).
Figure 1. Experimental design.
Figure 2. Illustrative examples of chest X-rays in patients with pneumonia (arrows indicate clinical annotations of pneumonia-positive areas).
Figure 3. Model development flow chart.
Figure 4. Layer-wise Relevance Propagation (LRP) Integrated ResNet50 architecture [24].
Figure 5. Adversarial Training Integrated ResNet50 architecture [25].
Figure 6. Class Activation Map (CAM) Integrated ResNet50 architecture [26].
Figure 7. Attention Mechanism Integrated ResNet50 architecture [27].
Figure 8. Preview of initial chest X-ray dataset.
Figure 9. Pre-Trained ResNet50 pneumonia detection accuracy and model loss results visualization across 50 epochs.
Figure 10. Epsilon LRP Integrated ResNet50 pneumonia detection accuracy, model loss, and accuracy vs. interpretability visualization across 50 epochs.
Figure 11. Epsilon LRP heatmap for pneumonia detection.
Figure 12. Adversarial Training Integrated ResNet50 pneumonia detection accuracy and model loss visualization across 50 epochs.
Figure 13. Adversarial chest X-ray images (pneumonia highlighted).
Figure 14. Grad-CAM Integrated ResNet50 pneumonia detection accuracy, model loss, and accuracy vs. interpretability visualization across 50 epochs.
Figure 15. Grad-CAM heatmap visualization of chest X-ray images for pneumonia detection.
Figure 16. SAM Integrated ResNet50 pneumonia detection accuracy, model loss, accuracy vs. interpretability, and trade-off score visualization across 50 epochs.
Figure 17. SAM map visualization on chest X-ray images for pneumonia detection.
Table 1. Epidemiological pneumonia data over time (2010–2019).

Year | Confirmed Cases (in Millions) | Misdiagnosis Rate | Deaths (in Millions)
2010 | 220 | 1.400 | 46
2013 | 230 | 2.200 | 67.5
2016 | 250 | 3.000 | 94
2019 | 280 | 2.500 | 118
Table 2. Literature review of Layer-wise Relevance Propagation (LRP).

Publication | Interpretability Technique | Architecture | Results
Ren et al., 2021 [15] | Bayesian network | AlexNet, DenseNet121, Inception V-4, ResNet-50, Xception | Accuracy: 82.9% (highest with ResNet-50); Interpretability: 75.9% (highest with ResNet-50)
Rajaraman et al., 2019 [16] | Grad-CAM and LIME | CNN, Visual Geometry Group-16 (VGG-16) | Accuracy: 96.2% (VGG-16), 94.1% (CNN); Interpretability: 91.8% (VGG-16), 87.3% (CNN)
Aljawarneh and Al-Quraan, 2023 [17] | Adversarial Training, LRP, Attention Mechanism | Enhanced CNN, ResNet-50 | Accuracy: 82.8% (lowest), 92.4% (highest); Interpretability: 79.6% (lowest), 87.3% (highest)
Siddiqi and Javaid, 2024 [18] | Grad-CAM, LIME, SHAP | ResNet-50, VGG-16, DenseNet, AlexNet, MobileNet | Accuracy: 99.39% (highest); Interpretability: 94.96% (highest)
Table 3. Summary of evaluation.

Model | Accuracy | Sensitivity | Specificity | AUC-ROC | MRS | Trade-Off
Base ResNet-50 | 0.90 | 0.92 | 0.88 | 0.93 | 0.00 | -
LRP | 0.91 | 0.90 | 0.92 | 0.93 | 0.85 | -
Adversarial Training | 0.90 | 0.87 | 0.79 | 0.87 | 0.76 | 0.79
CAM | 0.905 | 0.86 | 0.83 | 0.89 | 0.70 | -
Attention Mechanism | 0.88 | 0.90 | 0.87 | 0.91 | 0.12 | 0.42
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
